[go] partitioning gene association files

Judith Blake jblake at informatics.jax.org
Mon Jan 28 06:29:55 PST 2008


Mike,
My sense was that this was to be for the GA files for reference genomes 
only. 

I am fine with your naming proposal.

Judy

Mike Cherry wrote:
> At the Princeton GOC meeting (our 18th) it was decided to partition 
> each GA file in two.  One file would contain all annotations with 
> non-IEA evidence, the other would contain all the annotations with IEA 
> evidence.
>
> We need to specify this a bit more.  I have a script that divides up 
> the annotations.
>
> Question:  Names of the resulting files?  At Princeton I recall it was 
> agreed that the file without IEA annotations would keep the name of 
> the current file.  Then there would be a new file for just the IEA 
> annotations.  I didn't find the name mentioned in the minutes, but I 
> recall it was something long like  gene_association.XXX.iea_annotations.gz
>
> For example:
>
> current file:
>
>   gene_association.mgi.gz
>
> after partitioning happens:
>
>   gene_association.mgi.gz  -- non-IEA annotations
>   gene_association.mgi.iea_annotations.gz  -- IEA annotations
>
> Question: Would both files be created for all projects?  In some cases 
> all the current annotations are IEA.  Here the xxx.gz file would have 
> no annotations, just a comment saying to check the other file.  For 
> other projects there are no IEA annotations; here the 
> xxx.iea_annotations.gz file would have no annotations, just comments.  
> Most projects will have annotations in both files.
>
> The submission of files would not change.  Each project would continue 
> to submit the GA file as is done now.
>
> All this is about changing the processing of the submitted file: it 
> would be filtered and partitioned in one step.
>
> We would need to announce this change and give ample notice, at 
> least 2-3 months after the announcement.
>
> -Mike
>
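
For illustration, the partitioning step described above could look roughly
like the following Python sketch. This is not Mike's script, which is not
shown in this thread. It assumes the tab-delimited GA file layout with the
evidence code in column 7 and comment lines starting with "!". The function
names (partition_lines, output_names, write_partition) and the wording of
the placeholder comment are made up for the example.

```python
import gzip

# Zero-based index of the evidence code (column 7 of a GA file line).
EVIDENCE_COLUMN = 6


def partition_lines(lines):
    """Split GA file lines into (comments, non_iea, iea) lists."""
    comments, non_iea, iea = [], [], []
    for line in lines:
        if line.startswith("!"):
            # Header/comment lines are kept so both outputs can carry them.
            comments.append(line)
            continue
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > EVIDENCE_COLUMN and fields[EVIDENCE_COLUMN] == "IEA":
            iea.append(line)
        else:
            # Anything that is not IEA (including short, malformed lines)
            # stays in the main file in this sketch.
            non_iea.append(line)
    return comments, non_iea, iea


def output_names(project):
    """Return (non-IEA name, IEA name) following the proposal in this thread."""
    return (
        f"gene_association.{project}.gz",
        f"gene_association.{project}.iea_annotations.gz",
    )


def write_partition(path, comments, annotations, other_name):
    """Write one gzipped output; an empty one gets a pointer comment."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        out.writelines(comments)
        if not annotations:
            # Hypothetical wording for the "check the other file" comment.
            out.write(f"! No annotations in this file; see {other_name}\n")
        out.writelines(annotations)
```

With a submitted file read into a list of lines, the two outputs for MGI
would be produced by calling partition_lines once, output_names("mgi") for
the two names, and write_partition once per output, passing the other
file's name so an empty file can point to it.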


