[Go] growth in manual GO annotation

Doug howe dhowe at cs.uoregon.edu
Thu Apr 10 13:03:59 PDT 2008


Agreed.

Judith Blake wrote:
> Doug,
> I would check with Emily about the species and also the evidence codes 
> for those species.  There are several mechanisms that I can think of 
> where annotations would be other than IEA.
>
> Judy
>
> Doug howe wrote:
>> Wow, that's way more species than I would have predicted. Thanks, 
>> Chris.
>>
>> Chris Mungall wrote:
>>  
>>> On Apr 10, 2008, at 10:08 AM, Doug howe wrote:
>>>
>>>    
>>>> Oops, let me be clearer. I'm looking for:
>>>>
>>>> 1.  The number of distinct gene products (across all species) 
>>>> annotated using NON-IEA, NON-ND evidence on 1/1 of each year from 
>>>> 2002-2008.
>>>>       
>>> SELECT count(DISTINCT gene_product_id) AS num_gps
>>> FROM association INNER JOIN evidence ON 
>>> (evidence.association_id=association.id)
>>> WHERE code != 'IEA' AND code != 'ND';
>>>
>>> go_old_20030101
>>> num_gps
>>> 42746
>>>
>>> go_old_20040101
>>> num_gps
>>> 99116
>>>
>>> go_old_20050101
>>> num_gps
>>> 144635
>>>
>>> go_old_20060101
>>> num_gps
>>> 136734
>>>
>>> go_old_20070101
>>> num_gps
>>> 140370
>>>
>>> go_old_20080101
>>> num_gps
>>> 192535
>>>
>>>
>>>    
>>>> 2.  The number of distinct species with any NON-IEA, NON-ND GO 
>>>> annotation on 1/1 of each year from 2002-2008.
>>>>       
>>> SELECT count(DISTINCT species_id) AS num_species
>>> FROM gene_product
>>>  INNER JOIN association ON 
>>> (gene_product.id=association.gene_product_id)
>>>  INNER JOIN evidence ON (evidence.association_id=association.id)
>>> WHERE code != 'IEA' AND code != 'ND';
>>>
>>> go_old_20030101
>>> num_species
>>> 207
>>>
>>> go_old_20040101
>>> num_species
>>> 375
>>>
>>> go_old_20050101
>>> num_species
>>> 533
>>>
>>> go_old_20060101
>>> num_species
>>> 638
>>>
>>> go_old_20070101
>>> num_species
>>> 884
>>>
>>> go_old_20080101
>>> num_species
>>> 930
>>>
>>> (yep, these numbers are correct; there are a lot of non-MOD 
>>> annotations to GO)
>>>
>>>    
>>>> Doug howe wrote:
>>>>      
>>>>> Thanks Chris those are very useful numbers.  If you don't mind 
>>>>> running two more queries, it won't be necessary to open the older 
>>>>> stuff to Goose.
>>>>> I'd be interested to see:
>>>>> 1.  The number of distinct gene products (across all species) 
>>>>> annotated on 1/1 of each year from 2002-2008.
>>>>> 2.  The number of distinct species with any GO annotation on 1/1 
>>>>> of each year from 2002-2008.
>>>>>
>>>>> -Thanks!
>>>>> -Doug
>>>>>
>>>>>
>>>>> Chris Mungall wrote:
>>>>>
>>>>>        
>>>>>> On Apr 8, 2008, at 9:48 AM, Doug howe wrote:
>>>>>>
>>>>>>
>>>>>>          
>>>>>>> Does anyone have, or know how to get, historical stats on the number of
>>>>>>> GO annotations that have been contributed to the GOC over time? I'm
>>>>>>> looking for the number of non-IEA, non-ND GO annotations that existed
>>>>>>> for each year from 2002-2008.
>>>>>>>
>>>>>>> Midori provided me with the following numbers of GO terms for that
>>>>>>> period if anyone is interested:
>>>>>>> date        total    obsolete
>>>>>>> 1/1/2002    10305    152
>>>>>>> 1/1/2003    13339    383
>>>>>>> 1/1/2004    16771    725
>>>>>>> 1/1/2005    18219    969
>>>>>>> 1/1/2006    20348    992
>>>>>>> 1/1/2007    22928    1011
>>>>>>> 1/1/2008    25758    1137
>>>>>>>
>>>>>>>             
>>>>>> We have historical go dbs mirrored here - we can open these to 
>>>>>> GOOSE if you like, or you can just request queries.
>>>>>>
>>>>>> This is what you're after:
>>>>>>
>>>>>> SELECT count(*) AS num_annots
>>>>>> FROM association INNER JOIN evidence ON 
>>>>>> (evidence.association_id=association.id)
>>>>>> WHERE code != 'IEA' AND code != 'ND';
>>>>>>
>>>>>> go_old_20030101
>>>>>> num_annots
>>>>>> 133699
>>>>>>
>>>>>> go_old_20040101
>>>>>> num_annots
>>>>>> 386339
>>>>>>
>>>>>> go_old_20050101
>>>>>> num_annots
>>>>>> 416224
>>>>>>
>>>>>> go_old_20060101
>>>>>> num_annots
>>>>>> 469107
>>>>>>
>>>>>> go_old_20070101
>>>>>> num_annots
>>>>>> 489402
>>>>>>
>>>>>> go_old_20080101
>>>>>> num_annots
>>>>>> 580052
>>>>>>
>>>>>>
>>>>>> This one may also be informative: the number of terms used 
>>>>>> directly in annotations (all):
>>>>>>
>>>>>> SELECT count(DISTINCT term_id) AS num_terms_used_directly
>>>>>> FROM association;
>>>>>>
>>>>>> go_old_20030101
>>>>>> num_terms_used_directly
>>>>>> 7116
>>>>>>
>>>>>> go_old_20040101
>>>>>> num_terms_used_directly
>>>>>> 9008
>>>>>>
>>>>>> go_old_20050101
>>>>>> num_terms_used_directly
>>>>>> 10134
>>>>>>
>>>>>> go_old_20060101
>>>>>> num_terms_used_directly
>>>>>> 11113
>>>>>>
>>>>>> go_old_20070101
>>>>>> num_terms_used_directly
>>>>>> 12340
>>>>>>
>>>>>> go_old_20080101
>>>>>> num_terms_used_directly
>>>>>> 13812
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>          
>>>>>>> -Doug
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Go mailing list
>>>>>>> Go at geneontology.org
>>>>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>>>>
>>>>>>>
>>>>>>>             
>
>

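For anyone who wants to try the queries quoted above without access to the historical mirrors, here is a minimal sketch that runs them against a toy in-memory SQLite database. It is an illustration only: the three tables contain just the columns the queries reference (the real GO database schema has more), the sample rows are invented, and SQLite stands in for whatever server hosts the go_old_* snapshots. The `scalar` helper and the `MANUAL` constant are names made up for this example.

```python
import sqlite3

# Toy stand-in for a single snapshot such as go_old_20080101.
# Only the columns used by the queries in this thread are modelled,
# and every row below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene_product (id INTEGER PRIMARY KEY, species_id INTEGER);
CREATE TABLE association (
    id INTEGER PRIMARY KEY, term_id INTEGER, gene_product_id INTEGER);
CREATE TABLE evidence (
    id INTEGER PRIMARY KEY, association_id INTEGER, code TEXT);

INSERT INTO gene_product VALUES (1, 10), (2, 10), (3, 20);
INSERT INTO association VALUES
    (1, 100, 1), (2, 101, 1), (3, 100, 2), (4, 102, 3);
INSERT INTO evidence VALUES
    (1, 1, 'IDA'), (2, 1, 'IMP'), (3, 2, 'IEA'), (4, 3, 'ND'), (5, 4, 'ISS');
""")

# The evidence filter used throughout the thread.
MANUAL = "code != 'IEA' AND code != 'ND'"


def scalar(sql):
    """Run a query that yields one row with one column; return that value."""
    return conn.execute(sql).fetchone()[0]


# Rows of the association/evidence join that pass the filter.
num_annots = scalar(f"""
    SELECT count(*) FROM association
    INNER JOIN evidence ON (evidence.association_id = association.id)
    WHERE {MANUAL}""")

# Distinct gene products with at least one such row.
num_gps = scalar(f"""
    SELECT count(DISTINCT gene_product_id) FROM association
    INNER JOIN evidence ON (evidence.association_id = association.id)
    WHERE {MANUAL}""")

# Distinct species with at least one such row.
num_species = scalar(f"""
    SELECT count(DISTINCT species_id) FROM gene_product
    INNER JOIN association ON (gene_product.id = association.gene_product_id)
    INNER JOIN evidence ON (evidence.association_id = association.id)
    WHERE {MANUAL}""")

# Terms used directly in any association, with no evidence filter.
num_terms_used_directly = scalar(
    "SELECT count(DISTINCT term_id) FROM association")

print(num_annots, num_gps, num_species, num_terms_used_directly)
```

With these made-up rows the script prints `3 2 2 3`. The first number is worth a second look: association 1 has two qualifying evidence rows (IDA and IMP), so `count(*)` over the join counts it twice. If the real snapshots contain associations with more than one evidence row, num_annots is a count of association/evidence pairs rather than of distinct associations.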

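Since the thread is about growth, here is a small sketch that turns the counts reported above into year-over-year changes. The numbers are copied from the query results in this thread and keyed by the year of the go_old_YYYY0101 snapshot; the function names `yearly_change` and `fold_change` are invented for this example.

```python
# Counts copied from the query results quoted in this thread, keyed by the
# year of the go_old_YYYY0101 snapshot they came from.
num_annots = {2003: 133699, 2004: 386339, 2005: 416224,
              2006: 469107, 2007: 489402, 2008: 580052}
num_gps = {2003: 42746, 2004: 99116, 2005: 144635,
           2006: 136734, 2007: 140370, 2008: 192535}
num_species = {2003: 207, 2004: 375, 2005: 533,
               2006: 638, 2007: 884, 2008: 930}


def yearly_change(series):
    """Map each year (except the first) to (absolute change, percent change)
    relative to the previous snapshot in the series."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev],
               100.0 * (series[year] - series[prev]) / series[prev])
        for prev, year in zip(years, years[1:])
    }


def fold_change(series):
    """Ratio of the latest snapshot to the earliest one."""
    years = sorted(series)
    return series[years[-1]] / series[years[0]]


for label, series in [("annotations", num_annots),
                      ("gene products", num_gps),
                      ("species", num_species)]:
    print(label, "fold change:", round(fold_change(series), 2))
    for year, (delta, pct) in yearly_change(series).items():
        print(f"  {year}: {delta:+d} ({pct:+.1f}%)")
```

One thing this makes easy to spot: the count of distinct gene products drops between the 2005 and 2006 snapshots (144635 to 136734) even though annotations and species both rise over the same interval. Nothing in the thread explains the dip, so it may be worth checking before the numbers are used.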