[go] partitioning gene association files

Mike Cherry cherry at stanford.edu
Sun Jan 27 21:55:07 PST 2008


At the Princeton GOC meeting (our 18th) it was decided to partition
each GA file in two.  One file would contain all annotations with
non-IEA evidence; the other would contain all annotations with IEA
evidence.

We need to specify this a bit more.  I have a script that divides up  
the annotations.

Question:  Names of the resulting files?  At Princeton I recall it was
agreed that the file without IEA annotations would keep the name of
the current file, and that there would be a new file for just the IEA
annotations.  I didn't find the name in the minutes, but I recall
it was something long like  gene_association.XXX.iea_annotations.gz

For example:

current file:

   gene_association.mgi.gz

after partitioning happens:

   gene_association.mgi.gz  -- non-IEA annotations
   gene_association.mgi.iea_annotations.gz  -- IEA annotations
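
A minimal sketch of the naming and the split (not the script mentioned
above).  It assumes that the evidence code is the seventh tab-separated
column of a GA file and that comment lines start with "!".  The
function names partition_names and partition_lines are hypothetical,
chosen only for illustration.

```python
EVIDENCE_COLUMN = 6  # assumption: 0-based index of the evidence code column


def partition_names(ga_name):
    """Return (non-IEA file name, IEA file name) for a submitted GA file.

    The non-IEA file keeps the current name; the IEA file gets the
    ".iea_annotations" infix before ".gz".
    """
    if not ga_name.endswith(".gz"):
        raise ValueError("expected a .gz file name, got %r" % ga_name)
    stem = ga_name[: -len(".gz")]
    return ga_name, stem + ".iea_annotations.gz"


def partition_lines(lines):
    """Split GA lines into (comments, non-IEA annotations, IEA annotations)."""
    comments, non_iea, iea = [], [], []
    for line in lines:
        if line.startswith("!"):
            comments.append(line)  # header/comment line
            continue
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > EVIDENCE_COLUMN and fields[EVIDENCE_COLUMN] == "IEA":
            iea.append(line)
        else:
            non_iea.append(line)
    return comments, non_iea, iea
```

Lines too short to carry an evidence code are kept in the non-IEA
file here; a real script might prefer to reject them.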

Question: Would both files be created for all projects?  In some cases
all the current annotations are IEA.  There the xxx.gz file would have
no annotations, just a comment saying to check the other file.  Other
projects have no IEA annotations; there the xxx.iea_annotations.gz
file would have no annotations, just comments.  Most projects will
have annotations in both files.
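
One way the empty-file case could be handled, as a sketch only.  The
wording of the placeholder comment and the function names
partition_body and write_partition are assumptions made for
illustration; nothing here was decided at the meeting.

```python
import gzip


def partition_body(annotations, other_file):
    """Return the lines to write for one partition.

    If the partition has no annotations, return a single "!" comment
    line pointing the reader at the other file instead.
    """
    if annotations:
        return list(annotations)
    # Hypothetical wording for the placeholder comment.
    return ["! No annotations in this file; see %s\n" % other_file]


def write_partition(path, header, annotations, other_file):
    """Write header comments plus the partition body to a gzipped file."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        out.writelines(header)
        out.writelines(partition_body(annotations, other_file))
```

With this approach both files always exist for every project, so
anything downstream that expects the pair does not need a special
case for all-IEA or no-IEA projects.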

The submission of files would not change.  Each project would continue
to submit its GA file as is done now.

All this is about changing the processing of the submitted file: it
would be filtered and partitioned in one step.

We would need to announce this change and give ample notice, with the
change taking effect at least 2-3 months after the announcement.

-Mike



