[Annotation] evidence code advice
Kara Dolinski
kara at genomics.princeton.edu
Wed Mar 26 13:59:11 PDT 2008
Hi,
The root of the problem, as I see it, is that we are mixing apples
and oranges with evidence codes. All but one of the evidence codes
indicate the type of evidence (experimental or computational) for a
GO annotation, but we have one oddball, IEA, that indicates not what
the evidence is, but rather how the annotation was done. We keep running into
variations of the same problem: we have some evidence (whether
experimental or computational) for a GO annotation, but also want to
indicate whether a curator looked at it or not.
My proposed (albeit radical) solution:
1. Remove IEA as an evidence code.
2. Create a new property for GO annotations (or add a new type of
   qualifier) that captures how the annotation was done: manual or
   automated.
3. Everything that is currently IEA would be given the 'automated'
   property/qualifier, and then would be given a new evidence code as
   appropriate (mostly a flavor of ISS I would assume).
4. There can be a rule that all 'automated' annotations that are a
   flavor of ISS must have a 'with' value.
This would allow us to use 'RCA' as appropriate: in some cases the
annotations would be 'manual', in others 'automated'. In Rama's case,
the annotations would be 'RCA' with an 'automated' qualifier.
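To make the proposal concrete, here is a rough sketch of how the evidence code and the manual/automated property could sit side by side on an annotation, with the 'with' rule applied only to automated ISS-type annotations. Everything in it is an assumption made for illustration: the Annotation class, the field names, the violations() helper, the gene names, the placeholder GO ID, and in particular the set of codes treated as "a flavor of ISS". It is not an existing GO tool or file format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: which codes count as "a flavor of ISS" is illustrative only.
ISS_FLAVORS = {"ISS", "ISA", "ISO", "ISM"}
METHODS = {"manual", "automated"}


@dataclass
class Annotation:
    gene: str                         # hypothetical gene identifier
    go_id: str                        # GO term ID (placeholder values below)
    evidence_code: str                # e.g. "RCA" or "ISS"; "IEA" is retired here
    method: str                       # how the annotation was done
    with_value: Optional[str] = None  # contents of the 'with' column, if any


def violations(ann: Annotation) -> List[str]:
    """Return the proposed rules that this annotation breaks."""
    problems = []
    if ann.method not in METHODS:
        problems.append("method must be 'manual' or 'automated'")
    if ann.evidence_code == "IEA":
        problems.append("IEA is not an evidence code under this proposal")
    # The 'with' requirement applies only to automated ISS-type annotations.
    if (ann.method == "automated"
            and ann.evidence_code in ISS_FLAVORS
            and not ann.with_value):
        problems.append("automated ISS-flavored annotation needs a 'with' value")
    return problems


# Rama's case: a bulk-loaded prediction, RCA evidence, no 'with' value.
rama_case = Annotation("GENE1", "GO:0000000", "RCA", "automated")
# A former IEA re-coded as ISS but still lacking a 'with' value.
former_iea = Annotation("GENE2", "GO:0000000", "ISS", "automated")

print(violations(rama_case))   # RCA is not subject to the 'with' rule
print(violations(former_iea))  # flagged for the missing 'with' value
```

The point of keeping the two fields separate is that a filter can ask "was this reviewed by a curator?" without needing to know anything about the kind of evidence behind the annotation.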
I realize the issues involved in making such a drastic change, so I
understand if we don't go there, but I do think that some approach
such as the one above is the best representation of the information
that we are trying to capture.
Cheers,
Kara
On Mar 26, 2008, at 4:30 PM, Rama Balakrishnan wrote:
>
> Hi All,
>
> SGD has come across a couple of computationally predicted GO
> annotation data sets for S. cerevisiae that we would like to add to
> our database. The GO annotations from these data sets are
> predictions based on multiple high-throughput data sets. The RCA
> evidence code came to mind, but according to the documentation,
> the annotations all have to be manually reviewed by a curator to
> use this evidence code. There are several hundred annotations of this
> kind, and it is not feasible for us to manually review them.
>
> Hence, we thought these annotations could be bulk loaded with the IEA
> evidence code. However, at the Jan 2007 (Cambridge) GO meeting, it
> was decided that the 'with' column information has to be filled in
> for all IEAs (else Mike's filtering script strips them out). But
> these GO annotations, being predictions based on multiple
> high-throughput data sets, don't have any information for the 'with'
> column. So, we are left with no choice.
>
> Which evidence code do people think should be used for these kinds
> of computational datasets when there is not an obvious "with"?
>
> Thanks for your input.
>
>
> Rama
>
>
> Rama Balakrishnan Ph.D
> Senior Scientific Curator
> Saccharomyces Genome Database
> Stanford University
> Stanford, CA 94305-5120
> Ph: 650.725.8956 Fax: 650.723.7016
> email: rama at genome.stanford.edu
> Website: http://www.yeastgenome.org
> SGD Wiki: http://wiki.yeastgenome.org