[Go] growth in manual GO annotation

Mike Cherry cherry at stanford.edu
Fri Apr 11 16:50:16 PDT 2008


Doug,

Here are two tables for all projects by year: one counts the species
in each project's filtered gene association file, and one counts the
gene products.  IEA and ND annotations were excluded before counting.
The numbers should be similar to those Chris got from the GODB; this
is just a breakdown by project.  I got these by processing the gene
association files, so they can differ a little from what was
successfully loaded into GODB.

Note there were no gene association files on January 1, 2002.  On
1/1/2003 there was only the TIGR gene index file.  Most projects did
not submit a file until sometime during 2004.  A null cell means the
file did not exist at that time; a zero means there were no
non-IEA/non-ND annotations.  All numbers are from January 1 of the
year indicated.
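
For anyone who wants to reproduce counts like these from a gene
association file, the filtering described above can be sketched roughly
as follows.  This is only an illustration, not the script actually
used: the function name is made up, and it assumes the standard
tab-delimited layout (evidence code in column 7, taxon in column 13)
and treats DB plus DB_Object_ID as the identity of a gene product.

```python
# Evidence codes dropped before counting, as described above.
EXCLUDED_EVIDENCE = {"IEA", "ND"}


def count_manual_annotations(lines):
    """Return (distinct gene products, distinct taxa) over non-IEA, non-ND rows.

    Hypothetical helper. Assumes tab-delimited gene association rows with
    column 1 = DB, column 2 = DB_Object_ID, column 7 = evidence code,
    column 13 = taxon.
    """
    gene_products = set()
    taxa = set()
    for line in lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 13:
            continue  # malformed row; a real script would probably report it
        if cols[6] in EXCLUDED_EVIDENCE:
            continue
        gene_products.add((cols[0], cols[1]))
        # The taxon column may hold "taxon:A|taxon:B"; keep the first entry,
        # which is the species of the annotated gene product.
        taxa.add(cols[12].split("|")[0])
    return len(gene_products), len(taxa)
```

Calling it on an open file, e.g. count_manual_annotations(open(path))
with path pointing at an uncompressed gene association file, gives the
two counts for that project's file.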

-Mike

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ga-year-geneproducts.xls
Type: application/octet-stream
Size: 1613 bytes
Desc: not available
Url : http://fafner.stanford.edu/pipermail/go/attachments/20080411/747a62fd/attachment.obj 
-------------- next part --------------



-------------- next part --------------
A non-text attachment was scrubbed...
Name: ga-year-species.xls
Type: application/octet-stream
Size: 1240 bytes
Desc: not available
Url : http://fafner.stanford.edu/pipermail/go/attachments/20080411/747a62fd/attachment-0001.obj 
-------------- next part --------------




P.S.  Has anyone won email bingo yet?

On Apr 10, 2008, at 12:28 PM, Doug howe wrote:
> WOW...that's way more species than I would have predicted....thanks  
> Chris.
>
> Chris Mungall wrote:
>>
>> On Apr 10, 2008, at 10:08 AM, Doug howe wrote:
>>
>>> Oops...let me be more clear...I'm looking for:
>>>
>>> 1.  The number of distinct gene products (across all species)
>>> annotated using NON-IEA, NON-ND evidence on 1/1 of each year from
>>> 2002-2008.
>>
>> SELECT count(DISTINCT gene_product_id) AS num_gps
>> FROM association INNER JOIN evidence ON
>> (evidence.association_id=association.id)
>> WHERE code != 'IEA' AND code != 'ND';
>>
>> go_old_20030101
>> num_gps
>> 42746
>>
>> go_old_20040101
>> num_gps
>> 99116
>>
>> go_old_20050101
>> num_gps
>> 144635
>>
>> go_old_20060101
>> num_gps
>> 136734
>>
>> go_old_20070101
>> num_gps
>> 140370
>>
>> go_old_20080101
>> num_gps
>> 192535
>>
>>
>>> 2.  The number of distinct species with any NON-IEA, NON-ND GO
>>> annotation on 1/1 of each year from 2002-2008.
>>
>> SELECT count(DISTINCT species_id) AS num_species
>> FROM gene_product
>> INNER JOIN association ON  
>> (gene_product.id=association.gene_product_id)
>> INNER JOIN evidence ON (evidence.association_id=association.id)
>> WHERE code != 'IEA' AND code != 'ND';
>> go_old_20030101
>> num_species
>> 207
>>
>> go_old_20040101
>> num_species
>> 375
>>
>> go_old_20050101
>> num_species
>> 533
>>
>> go_old_20060101
>> num_species
>> 638
>>
>> go_old_20070101
>> num_species
>> 884
>>
>> go_old_20080101
>> num_species
>> 930
>>
>> (yep, these numbers are correct; there are a lot of non-MOD
>> annotations to GO)
>>
>>>
>>> Doug howe wrote:
>>>> Thanks Chris those are very useful numbers.  If you don't mind
>>>> running two more queries, it won't be necessary to open the older
>>>> stuff to Goose.
>>>> I'd be interested to see:
>>>> 1.  The number of distinct gene products (across all species)
>>>> annotated on 1/1 of each year from 2002-2008.
>>>> 2.  The number of distinct species with any GO annotation on 1/1 of
>>>> each year from 2002-2008.
>>>>
>>>> -Thanks!
>>>> -Doug
>>>>
>>>>
>>>> Chris Mungall wrote:
>>>>
>>>>> On Apr 8, 2008, at 9:48 AM, Doug howe wrote:
>>>>>
>>>>>
>>>>>> Does anyone have, or know how to get, historical stats on the
>>>>>> number of GO annotations that have been contributed to the GOC
>>>>>> over time?  I'm looking for the number of non-IEA, non-ND GO
>>>>>> annotations that existed for each year from 2002-2008.
>>>>>>
>>>>>> Midori provided me with the following numbers of GO terms for
>>>>>> that period if anyone is interested:
>>>>>> date        total    obsolete
>>>>>> 1/1/2002    10305    152
>>>>>> 1/1/2003    13339    383
>>>>>> 1/1/2004    16771    725
>>>>>> 1/1/2005    18219    969
>>>>>> 1/1/2006    20348    992
>>>>>> 1/1/2007    22928    1011
>>>>>> 1/1/2008    25758    1137
>>>>>>
>>>>> We have historical go dbs mirrored here - we can open these to
>>>>> GOOSE if you like, or you can just request queries.
>>>>>
>>>>> This is what you're after:
>>>>>
>>>>> SELECT count(*) AS num_annots
>>>>> FROM association INNER JOIN evidence ON
>>>>> (evidence.association_id=association.id)
>>>>> WHERE code != 'IEA' AND code != 'ND';
>>>>> go_old_20030101
>>>>> num_annots
>>>>> 133699
>>>>>
>>>>> go_old_20040101
>>>>> num_annots
>>>>> 386339
>>>>>
>>>>> go_old_20050101
>>>>> num_annots
>>>>> 416224
>>>>>
>>>>> go_old_20060101
>>>>> num_annots
>>>>> 469107
>>>>>
>>>>> go_old_20070101
>>>>> num_annots
>>>>> 489402
>>>>>
>>>>> go_old_20080101
>>>>> num_annots
>>>>> 580052
>>>>>
>>>>>
>>>>> This one may also be informative: the number of terms used
>>>>> directly in annotations (all):
>>>>>
>>>>> SELECT count(DISTINCT term_id) AS num_terms_used_directly
>>>>> FROM association;
>>>>> go_old_20030101
>>>>> num_terms_used_directly
>>>>> 7116
>>>>>
>>>>> go_old_20040101
>>>>> num_terms_used_directly
>>>>> 9008
>>>>>
>>>>> go_old_20050101
>>>>> num_terms_used_directly
>>>>> 10134
>>>>>
>>>>> go_old_20060101
>>>>> num_terms_used_directly
>>>>> 11113
>>>>>
>>>>> go_old_20070101
>>>>> num_terms_used_directly
>>>>> 12340
>>>>>
>>>>> go_old_20080101
>>>>> num_terms_used_directly
>>>>> 13812
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -Doug
>>>>>>
>>>>>> _______________________________________________
>>>>>> Go mailing list
>>>>>> Go at geneontology.org
>>>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>>>
>>>>>>
>>>
>>


