[Go] Abbreviations used in gene association files

Mike Cherry cherry at stanford.edu
Fri Jul 11 12:37:55 PDT 2008


There is interest in the GOC using standard abbreviations in our GAFs.  To
help you identify issues in your GAF I have created an abbreviation
checker.  It's in CVS at:

go/software/utilities/check-abbr-ga-file.pl

This script reports when an abbreviation is one of the defined
synonyms, and thus should be changed to the standard abbreviation.  For
example, the use of UniProt instead of the defined standard UniProtKB.
The script will also report when the case of your abbreviation does not
match the standard.  For example, INTERPRO instead of InterPro.
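
To make the two checks concrete, here is a rough sketch in Python.  It is
not the Perl script itself, and the table and function names are
hypothetical; only the UniProt/UniProtKB and INTERPRO/InterPro examples
come from this message.

```python
# Hypothetical table: standard abbreviation -> its defined synonyms.
# Only these two entries are taken from the examples in this message;
# a real check would load the full list of GO database abbreviations.
STANDARD_SYNONYMS = {
    "UniProtKB": ["UniProt"],
    "InterPro": [],
}


def check_abbr(abbr):
    """Classify an abbreviation and return (status, standard).

    status is one of:
      "ok"      - abbr is exactly a standard abbreviation
      "synonym" - abbr is a defined synonym; use `standard` instead
      "case"    - abbr matches a standard except for letter case
      "unknown" - abbr is not in the table at all
    """
    if abbr in STANDARD_SYNONYMS:
        return ("ok", abbr)

    # Exact match against a defined synonym.
    for standard, synonyms in STANDARD_SYNONYMS.items():
        if abbr in synonyms:
            return ("synonym", standard)

    # Same letters as a standard abbreviation, different case.
    lowered = abbr.lower()
    for standard in STANDARD_SYNONYMS:
        if lowered == standard.lower():
            return ("case", standard)

    # A synonym written in a different case is still reported as a synonym.
    for standard, synonyms in STANDARD_SYNONYMS.items():
        if lowered in (s.lower() for s in synonyms):
            return ("synonym", standard)

    return ("unknown", None)
```

With this sketch, check_abbr("UniProt") reports a synonym of UniProtKB and
check_abbr("INTERPRO") reports a case mismatch with InterPro.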

The GOC using the same abbreviations is good for the community: they
will see that we too can make a standard and use it.  It also helps
those who process our files, since they can easily find common IDs
without processing all the abbreviations themselves.  Use of a standard
also helps AmiGO work better: hyperlinking code can be simplified and
thus be more reliable.

Please use the above script to fix your project's GAF.  The
following filtered GAFs have abbreviation issues.  The majority of the
mismatches are in the WITH column, but there are also problems with
ASSIGNED_BY, DB and DB:REF.

goa_pdb
mgi
fb
gramene_oryza
wb
sgd
tair
GeneDB_Lmajor
GeneDB_Tbrucei
rgd
GeneDB_Pfalciparum
goa_human
reactome
goa_uniprot
goa_cow
goa_chicken
dictyBase
zfin
GeneDB_Spombe
tigr_Tbrucei_chr2
PAMGO_Mgrisea
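
For anyone who wants to look at the affected columns by hand, the sketch
below pulls the abbreviations out of the DB, DB:REF, WITH and ASSIGNED_BY
columns of a single GAF line.  It is only an illustration in Python, not
part of the checker; the names are hypothetical, and it assumes the
15-column tab-delimited GAF 1.0 layout with "|" separating multiple
entries in a column.

```python
# 0-based positions of the four columns in a 15-column GAF 1.0 line
# (assumption: DB is column 1, DB:Reference 6, With/From 8, Assigned_by 15).
COLUMNS = {"DB": 0, "DB:REF": 5, "WITH": 7, "ASSIGNED_BY": 14}


def abbreviations_in_line(line):
    """Return {column name: [abbreviations]} for one GAF annotation line."""
    fields = line.rstrip("\n").split("\t")
    found = {}
    for name, index in COLUMNS.items():
        value = fields[index] if index < len(fields) else ""
        prefixes = []
        for entry in value.split("|"):
            entry = entry.strip()
            if not entry:
                continue
            # DB and ASSIGNED_BY hold a bare abbreviation; DB:REF and WITH
            # hold PREFIX:ID, so keep only the text before the first colon.
            prefixes.append(entry.split(":", 1)[0])
        found[name] = prefixes
    return found
```

Comment lines, which begin with "!", are not handled here and should be
skipped before calling the function.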

-Mike



