[go] partitioning gene association files
Mike Cherry
cherry at stanford.edu
Sun Jan 27 21:55:07 PST 2008
At the Princeton GOC meeting (our 18th) it was decided to partition
each GA file in two. One file would contain all annotations with
non-IEA evidence; the other would contain all annotations with IEA
evidence.
We need to specify this a bit more. I have a script that divides up
the annotations.
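A minimal sketch of what such a partitioning step could look like, in
Python. This is not the script mentioned above. It assumes the usual
tab-delimited GA layout, with the evidence code in column 7 and comment
lines starting with "!". The function name partition_annotations is
hypothetical.

```python
EVIDENCE_COLUMN = 6  # zero-based index of the evidence code (column 7 of a GA file)


def partition_annotations(lines):
    """Split GA file lines into (header, non_iea, iea) lists.

    Assumption: comment/header lines start with "!" and annotation
    lines are tab-delimited with the evidence code in column 7.
    """
    header, non_iea, iea = [], [], []
    for line in lines:
        if line.startswith("!"):
            header.append(line)
            continue
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > EVIDENCE_COLUMN and fields[EVIDENCE_COLUMN] == "IEA":
            iea.append(line)
        else:
            # Everything else, including short/malformed lines, stays
            # with the non-IEA annotations in this sketch.
            non_iea.append(line)
    return header, non_iea, iea
```

Keeping the header lines separate lets both output files carry the same
comments from the submitted file.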
Question: Names of the resulting files? At Princeton I recall it was
agreed that the file without IEA annotations would keep the name of
the current file. Then there would be a new file for just the IEA
annotations. I didn't find the name mentioned in the minutes, but I
recall it was something long like gene_association.XXX.iea_annotations.gz
For example:
current file:
gene_association.mgi.gz
after partitioning happens:
gene_association.mgi.gz -- non-IEA annotations
gene_association.mgi.iea_annotations.gz -- IEA annotations
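The naming rule in the example can be sketched as a small Python
helper. The ".iea_annotations" infix is the name recalled above, not a
confirmed decision, and the function name partitioned_names is
hypothetical.

```python
def partitioned_names(current_name):
    """Return (non_iea_name, iea_name) for a submitted GA file name.

    Assumption: file names end in ".gz", and the IEA file inserts
    ".iea_annotations" before that suffix, as recalled from Princeton.
    """
    suffix = ".gz"
    if not current_name.endswith(suffix):
        raise ValueError("expected a .gz file name: %r" % current_name)
    stem = current_name[:-len(suffix)]
    # The non-IEA file keeps the current name unchanged.
    return current_name, stem + ".iea_annotations" + suffix
```

For gene_association.mgi.gz this gives the two names listed above.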
Question: Both files would be created for all projects? In some cases
all the current annotations are IEA. Here the xxx.gz file would have
no annotations, just a comment saying to check the other file. For
other projects there are no IEA annotations. Here the
xxx.iea_annotations.gz file would have no annotations, just comments.
Most projects will have annotations in both files.
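One way the empty-file case could be handled, sketched in Python. The
wording of the placeholder comment and the function name
write_partition are assumptions; nothing has been agreed on what the
comment should say.

```python
import gzip


def write_partition(path, header, annotations, other_file):
    """Write one half of a partitioned GA file as gzip text.

    If this half has no annotations, write a "!" comment pointing at
    the other file instead (placeholder wording is an assumption).
    """
    with gzip.open(path, "wt", encoding="utf-8") as out:
        out.writelines(header)
        if annotations:
            out.writelines(annotations)
        else:
            out.write("! No annotations in this file; see %s\n" % other_file)
```

With this approach both files always exist for every project, so
anything downstream that expects either name will find it.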
The submission of files would not change. Each project would continue
to submit the ga file as is done now.
All this is about changing the processing of the submitted file: it
would be filtered and partitioned in one step.
We would need to announce this change and give ample notice, with the
change taking effect at least 2-3 months after the announcement.
-Mike