[Go] growth in manual GO annotation

Judith Blake jblake at informatics.jax.org
Thu Apr 10 13:02:33 PDT 2008


Doug,
I would check with Emily about the species and also the evidence codes 
for those species.  There are several mechanisms that I can think of 
where annotations would be other than IEA.
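
A rough sketch of that check, in case it helps: the species query from the thread, grouped by species and evidence code instead of collapsed into one count. It uses only the tables and columns that appear in Chris's queries; the in-memory SQLite database, the sample rows, and the names evidence_breakdown and demo_connection are assumptions for illustration, not part of the GO database or its tooling.

```python
import sqlite3

# Toy stand-in for a GO database: only the tables and columns used by the
# queries in this thread. All rows below are invented for illustration.
SCHEMA = """
CREATE TABLE gene_product (id INTEGER PRIMARY KEY, species_id INTEGER);
CREATE TABLE association (id INTEGER PRIMARY KEY, gene_product_id INTEGER,
                          term_id INTEGER);
CREATE TABLE evidence (id INTEGER PRIMARY KEY, association_id INTEGER,
                       code TEXT);
"""

# Same joins and filter as the num_species query, but grouped so each
# (species, evidence code) pair gets its own row.
BREAKDOWN = """
SELECT gene_product.species_id, evidence.code, count(*) AS num_evidence
FROM gene_product
 INNER JOIN association ON (gene_product.id=association.gene_product_id)
 INNER JOIN evidence ON (evidence.association_id=association.id)
WHERE code != 'IEA' AND code != 'ND'
GROUP BY gene_product.species_id, evidence.code
ORDER BY gene_product.species_id, evidence.code;
"""


def evidence_breakdown(conn):
    """Return (species_id, code, num_evidence) rows for non-IEA, non-ND evidence."""
    return conn.execute(BREAKDOWN).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO gene_product VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20)])
    conn.executemany("INSERT INTO association VALUES (?, ?, ?)",
                     [(1, 1, 100), (2, 2, 100), (3, 3, 200)])
    conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)",
                     [(1, 1, "IDA"), (2, 1, "IEA"), (3, 2, "ISS"),
                      (4, 3, "ND"), (5, 3, "TAS")])
    return conn


if __name__ == "__main__":
    for row in evidence_breakdown(demo_connection()):
        print(row)
```

Run against one of the go_old_* snapshots, the same SELECT should show which evidence codes account for the unexpected species.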

Judy

Doug howe wrote:
> WOW...that's way more species than I would have predicted....thanks Chris.
>
> Chris Mungall wrote:
>   
>> On Apr 10, 2008, at 10:08 AM, Doug howe wrote:
>>
>>     
>>> Oops...let me be more clear. I'm looking for:
>>>
>>> 1.  The number of distinct gene products (across all species) 
>>> annotated using NON-IEA, NON-ND evidence on 1/1 of each year from 
>>> 2002-2008.
>>>       
>> SELECT count(DISTINCT gene_product_id) AS num_gps
>> FROM association INNER JOIN evidence ON 
>> (evidence.association_id=association.id)
>> WHERE code != 'IEA' AND code != 'ND';
>>
>> go_old_20030101
>> num_gps
>> 42746
>>
>> go_old_20040101
>> num_gps
>> 99116
>>
>> go_old_20050101
>> num_gps
>> 144635
>>
>> go_old_20060101
>> num_gps
>> 136734
>>
>> go_old_20070101
>> num_gps
>> 140370
>>
>> go_old_20080101
>> num_gps
>> 192535
>>
>>
>>     
>>> 2.  The number of distinct species with any NON-IEA, NON-ND GO 
>>> annotation on 1/1 of each year from 2002-2008.
>>>       
>> SELECT count(DISTINCT species_id) AS num_species
>> FROM gene_product
>>  INNER JOIN association ON (gene_product.id=association.gene_product_id)
>>  INNER JOIN evidence ON (evidence.association_id=association.id)
>> WHERE code != 'IEA' AND code != 'ND';
>>
>> go_old_20030101
>> num_species
>> 207
>>
>> go_old_20040101
>> num_species
>> 375
>>
>> go_old_20050101
>> num_species
>> 533
>>
>> go_old_20060101
>> num_species
>> 638
>>
>> go_old_20070101
>> num_species
>> 884
>>
>> go_old_20080101
>> num_species
>> 930
>>
>> (yep, these numbers are correct; there are a lot of non-MOD annotations 
>> to GO)
>>
>>     
>>> Doug howe wrote:
>>>       
>>>> Thanks Chris those are very useful numbers.  If you don't mind 
>>>> running two more queries, it won't be necessary to open the older 
>>>> stuff to Goose.
>>>> I'd be interested to see:
>>>> 1.  The number of distinct gene products (across all species) 
>>>> annotated on 1/1 of each year from 2002-2008.
>>>> 2.  The number of distinct species with any GO annotation on 1/1 of 
>>>> each year from 2002-2008.
>>>>
>>>> -Thanks!
>>>> -Doug
>>>>
>>>>
>>>> Chris Mungall wrote:
>>>>
>>>>         
>>>>> On Apr 8, 2008, at 9:48 AM, Doug howe wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> Does anyone have, or know how to get, historical stats on the 
>>>>>> number of
>>>>>> GO annotations that have been contributed to the GOC over time?  I'm
>>>>>> looking for the number of non-IEA, non-ND GO annotations that existed
>>>>>> for each year from 2002-2008.
>>>>>>
>>>>>> Midori provided me with the following numbers of GO terms for that
>>>>>> period if anyone is interested:
>>>>>> date            total       obsolete
>>>>>> 1/1/2002    10305    152
>>>>>> 1/1/2003    13339    383
>>>>>> 1/1/2004    16771    725
>>>>>> 1/1/2005    18219    969
>>>>>> 1/1/2006    20348    992
>>>>>> 1/1/2007    22928    1011
>>>>>> 1/1/2008    25758    1137
>>>>>>
>>>>>>             
>>>>> We have historical go dbs mirrored here - we can open these to 
>>>>> GOOSE if you like, or you can just request queries.
>>>>>
>>>>> This is what you're after:
>>>>>
>>>>> SELECT count(*) AS num_annots
>>>>> FROM association INNER JOIN evidence ON 
>>>>> (evidence.association_id=association.id)
>>>>> WHERE code != 'IEA' AND code != 'ND';
>>>>>
>>>>> go_old_20030101
>>>>> num_annots
>>>>> 133699
>>>>>
>>>>> go_old_20040101
>>>>> num_annots
>>>>> 386339
>>>>>
>>>>> go_old_20050101
>>>>> num_annots
>>>>> 416224
>>>>>
>>>>> go_old_20060101
>>>>> num_annots
>>>>> 469107
>>>>>
>>>>> go_old_20070101
>>>>> num_annots
>>>>> 489402
>>>>>
>>>>> go_old_20080101
>>>>> num_annots
>>>>> 580052
>>>>>
>>>>>
>>>>> This one may also be informative: the number of terms used directly 
>>>>> in annotations (all):
>>>>>
>>>>> SELECT count(DISTINCT term_id) AS num_terms_used_directly
>>>>> FROM association;
>>>>>
>>>>> go_old_20030101
>>>>> num_terms_used_directly
>>>>> 7116
>>>>>
>>>>> go_old_20040101
>>>>> num_terms_used_directly
>>>>> 9008
>>>>>
>>>>> go_old_20050101
>>>>> num_terms_used_directly
>>>>> 10134
>>>>>
>>>>> go_old_20060101
>>>>> num_terms_used_directly
>>>>> 11113
>>>>>
>>>>> go_old_20070101
>>>>> num_terms_used_directly
>>>>> 12340
>>>>>
>>>>> go_old_20080101
>>>>> num_terms_used_directly
>>>>> 13812
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> -Doug
>>>>>>
>>>>>> _______________________________________________
>>>>>> Go mailing list
>>>>>> Go at geneontology.org
>>>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>>>
>>>>>>
>>>>>>             
>>>>
>>>>
>>>>         
>   
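
For anyone who wants the year-to-year change rather than the raw totals, here is a small sketch that works it out from the numbers quoted above. The counts are copied from the go_old_* results in this thread; the names YEARS, COUNTS, yearly_change and report are made up for this example.

```python
# Snapshot counts copied from the query results quoted in this thread
# (databases go_old_20030101 .. go_old_20080101; non-IEA, non-ND only).
YEARS = [2003, 2004, 2005, 2006, 2007, 2008]
COUNTS = {
    "num_annots": [133699, 386339, 416224, 469107, 489402, 580052],
    "num_gps": [42746, 99116, 144635, 136734, 140370, 192535],
    "num_species": [207, 375, 533, 638, 884, 930],
}


def yearly_change(values):
    """Return the difference between each snapshot and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def report(counts=COUNTS, years=YEARS):
    """Build one text line per metric listing the change into each year."""
    lines = []
    for name, values in counts.items():
        deltas = yearly_change(values)
        parts = [f"{year}: {delta:+d}" for year, delta in zip(years[1:], deltas)]
        lines.append(f"{name:12s} " + "  ".join(parts))
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Of the three series, num_gps is the only one that drops between snapshots (2005 to 2006); the other two increase every year.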



