[Go] [go] annotations for refgenomes

Valerie Wood val at sanger.ac.uk
Thu Feb 28 01:25:35 PST 2008


Hi Mike,

The graphs are really great.
I also prefer to be able to see the ND annotations; these are crucial 
for indicating absence of data as opposed to absence of curation.

I have 2 suggestions which might make the graphs more informative.

i) separate out ND from IC, NAS, TAS, since ND is a root node annotation 
and also captures the fact that there is no available annotation
ii) show ND at the top of the bar (for the same reason)

(By the way, the legend for the first graph includes ND but the data 
doesn't, which is probably why this was initially confusing.)

Thanks

Val




Mike Cherry wrote:

>Sue,
>
>A graph with annotations to the root and ND annotations can be found at:
>
>   http://genetics.stanford.edu/~cherry/RefGenome-with-ND.pdf
>
>I put it there to reduce the size of this email.
>
>-Mike
>
>On Feb 27, 2008, at 10:32 PM, Sue Rhee wrote:
>
>>Great. Actually, it would be very interesting to include the  
>>unknowns in the graph. Since you color-code by the evidence codes,  
>>the NDs would be a separate color, no? I think knowing the extent of  
>>the unknown is quite useful.
>>
>>I would love to see last year's graph. I think these graphs would be  
>>very cool to have on the GO website at some point, given that the  
>>participating groups are OK with that.
>>
>>Cheers,
>>Sue
>>
>>Mike Cherry wrote:
>>
>>>Sue,
>>>
>>>Thanks, I'm glad you like the graph.  I have the graph from last year
>>>if anyone wants it.
>>>
>>>Good question about the filtering.  The filtering difference between
>>>the two graphs simply refers to the nightly filtering of gene
>>>association files.  This removes a large list of potential "errors",
>>>including badly formatted information, syntax errors, and use of
>>>obsolete GOIDs.  The normal filtering.  It's good that there isn't
>>>much difference between the two graphs.
>>>
>>>For both graphs I remove all the annotations to the roots: GO:0008150,
>>>GO:0003674, GO:0005575.  Then count the number of IDs in column 2.
>>>For TAIR there are 20,500 unique IDs for Function annotations with the
>>>root annotations excluded.
>>>
>>>-Mike
>>>
>>>
>>>On Feb 27, 2008, at 6:25 PM, Sue Rhee wrote:
>>>
>>>>Hi Mike,
>>>>
>>>>It's a beautiful graph. Thanks for sharing. I have a couple of
>>>>questions though. What is being filtered out? It seems to me the two
>>>>graphs are very similar. Also I'm a little confused about the total
>>>>number of genes. In TAIR, we have 24465 (for function), 23534 (for
>>>>process), and 22038 (for component) protein-encoding genes annotated
>>>>with GO, but the graphs seem to be showing different numbers.
>>>>
>>>>Sue
>>>>
>>>>Mike Cherry wrote:
>>>>
>>>>>Graphs I made last week.  One is from the submitted GA file and one
>>>>>from the filtered GA file.
>>>>>
>>>>>-Mike
>>>>>
>>>>-- 
>>>>Sue Rhee
>>>>Staff Scientist
>>>>Carnegie Institution, Department of Plant Biology
>>>>260 Panama Street, Stanford, CA 94305
>>>>Tel: (650) 325-1521 x251
>>>>Fax: (650) 325-6857
>>>>
>>>_______________________________________________
>>>Go mailing list
>>>Go at geneontology.org
>>>http://fafner.stanford.edu/mailman/listinfo/go
>>>
>
>
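For anyone who wants to reproduce the numbers, the counting Mike describes in the quoted thread (drop annotations to the three root terms, then count the unique IDs in column 2) could be sketched roughly as below. This is only an illustration in Python, not the script used to make the graphs. It assumes the usual tab-delimited gene association file layout, with comment lines starting with "!", the object ID in column 2, the GO ID in column 5 and the aspect (F, P or C) in column 9. The function name count_annotated_ids is made up for this example.

```python
from collections import defaultdict

# The three root terms listed in the thread:
# biological_process, molecular_function, cellular_component.
ROOT_TERMS = {"GO:0008150", "GO:0003674", "GO:0005575"}


def count_annotated_ids(lines):
    """Count unique column-2 IDs per aspect, ignoring root annotations.

    `lines` is any iterable of gene association file lines
    (for example an open file handle). Hypothetical helper.
    """
    ids_by_aspect = defaultdict(set)
    for line in lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # malformed row; the nightly filtering would drop these
        object_id, go_id, aspect = cols[1], cols[4], cols[8]
        # Annotations to a root term (the ND case) are excluded.
        if go_id in ROOT_TERMS:
            continue
        ids_by_aspect[aspect].add(object_id)
    return {aspect: len(ids) for aspect, ids in ids_by_aspect.items()}
```

Called on an open file, for example `count_annotated_ids(open(path))` with `path` pointing at a GA file, it returns a dictionary keyed by aspect. To show ND separately, as suggested above, the root annotations would need to be tallied in their own category instead of being skipped.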



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

