[Go] [go] annotations for refgenomes

Judith Blake jblake at informatics.jax.org
Fri Feb 29 11:46:13 PST 2008


I agree that we would gain by being explicit if we agree on the 
organization.

I think this should be finalized at the GOC meeting.

ISS now has subcodes, and I would insert these codes in between the 
experimental codes and ISS.  I bet it's a good guess that others would 
organize them differently.

(IMP|IDA|IPI|IGI|IEP) > (ISO)(ISM) > (ISA)(ISS) > (IC|NR|ND|NAS|TAS) > 
(IGC|RCA) > (IEA)
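
To make the counting rule concrete, here is a rough sketch of how a script could apply this ordering so that each gene product is counted once, under its most trusted evidence. It is only an illustration: the names EVIDENCE_TIERS, top_tier and count_by_top_tier are made up for this example, and reading "(ISO)(ISM)" and "(ISA)(ISS)" as one tier each is an assumption, not something settled in this thread.

```python
from collections import Counter

# Tiers from most to least trusted, copied from the ordering above.
# Assumption: "(ISO)(ISM)" and "(ISA)(ISS)" are each read as a single tier.
EVIDENCE_TIERS = [
    ("IMP", "IDA", "IPI", "IGI", "IEP"),
    ("ISO", "ISM"),
    ("ISA", "ISS"),
    ("IC", "NR", "ND", "NAS", "TAS"),
    ("IGC", "RCA"),
    ("IEA",),
]

# Map each evidence code to the index of its tier (0 = most trusted).
TIER_OF = {code: i for i, tier in enumerate(EVIDENCE_TIERS) for code in tier}


def top_tier(codes):
    """Return the most trusted tier index among codes, or None if no code is known."""
    ranks = [TIER_OF[c] for c in codes if c in TIER_OF]
    return min(ranks) if ranks else None


def count_by_top_tier(annotations):
    """Count each gene product once, under its most trusted tier.

    annotations is a hypothetical mapping: gene product id -> set of evidence codes.
    """
    counts = Counter()
    for codes in annotations.values():
        tier = top_tier(codes)
        if tier is not None:
            counts[tier] += 1
    return counts


# Made-up example data, not real annotation counts.
example = {
    "gp1": {"IDA", "ISS", "IEA"},
    "gp2": {"ISS", "IEA"},
    "gp3": {"IEA"},
}

for tier, n in sorted(count_by_top_tier(example).items()):
    print("|".join(EVIDENCE_TIERS[tier]), n)
```

With this reading, a gene product with IDA, ISS and IEA lands in the experimental tier only, and one with ISS and IEA lands in the ISS tier, which matches the rule Mike describes below. Keeping the tiers in a single list is what would let the ordering live in one shared place rather than in each script.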

Judy

Chris Mungall wrote:
>
> On Feb 28, 2008, at 7:58 AM, Mike Cherry wrote:
>
>> On Feb 28, 2008, at 3:47 AM, Judith Blake wrote:
>>
>>> Mike,
>>>
>>> I'm a little confused.  I don't understand how this is constructed.
>>> Does number of genes 'ISS' mean the number of genes annotated Only
>>> with ISS?  or are the numbers cumulative.  For mouse, for example,
>>> do you mean to say that ~ 8000 genes are annotated with IGC or RCA?
>>> or that ~ 1000 genes are annotated with these codes in addition to
>>> the other codes except IEA?
>>>
>>> Judy
>>>
>>
>> It's not only.  There is a hierarchy used.  If a gp had IDA, ISS and
>> IEA then only IDA is counted.  If ISS and IEA then ISS is counted.
>> Here is the order of evidence I use for the graphs:
>>
>> c
>
> The bar charts are great!
>
> Minor comments:
>
> 1) If the trust hierarchy Mike mentions above is useful beyond Mike's 
> teaching purposes (it seems that we agree it is) then it should be 
> explicitly encoded in ECO, or some supplemental ontology to ECO, so 
> that we don't have to hardcode this in every piece of software that 
> summarises annotations in a similar fashion.
>
> 2) We should adopt some consistent visuals or terminology across all 
> presentation aspects of GO to avoid confusion regarding whether 
> annotated entities are double counted or not.
>
> One option is to explicitly list codes and the "only" qualifier, for 
> all but the most trusted evidence:
>
> IEA only
> IGC, RCA or IEA only
> ...
> ISS, IC, ND, NR, NAS, TAS, IGC, RCA or IEA only
> IDA, IMP, IPI, IEP
>
> This is relatively unambiguous, but costs more screen real estate.
>
> Another option would be to include the '<' symbol in the key, and to 
> ensure we use this symbol consistently in the context of evidence. 
> This has the advantage of keeping the figure legend almost as compact 
> as in Mike's diagram, although it doesn't make the exclusion clear.
>
> Perhaps the evidence WG can get back on (1) and the WPWG on (2)?
>
>> I use the graph for a class I teach.  My point is to show how many
>> annotations exist and the general classes of experimental and
>> non-experimental evidence known for an organism's gene products.
>>
>> For the ND annotations I prefer not to include them.  For sure ND are
>> in the GA files but in my opinion they don't help the students
>> understand the research of the model organism.  This is about
>> community research not about the work of the MODs.  Annotations to the
>> root are "No Data".  My point is how much or how little
>> experimentation has been done on a particular organism.  ND shows the
>> work of the curators, and what the experimental community hasn't
>> done.  Also not all MODs have filled in annotations to the root for
>> all gene products.  In my slide before this graph I talk about how
>> many genes are in each organism.
>>
>> Because of evolution we have this beautiful connectedness in biology.
>> We work on the particular systems that are most appropriate, more
>> powerful, in exploring biology.
>>
>> -Mike
>>
>

