[Go] growth in manual GO annotation
Doug howe
dhowe at cs.uoregon.edu
Thu Apr 10 13:03:59 PDT 2008
Agreed.
Judith Blake wrote:
> Doug,
> I would check with Emily about the species and also the evidence codes
> for those species. There are several mechanisms that I can think of
> where annotations would be other than IEA.
>
> Judy
>
> Doug howe wrote:
>> Wow, that's way more species than I would have predicted. Thanks,
>> Chris.
>>
>> Chris Mungall wrote:
>>
>>> On Apr 10, 2008, at 10:08 AM, Doug howe wrote:
>>>
>>>
>>>> Oops, let me be more clear. I'm looking for:
>>>>
>>>> 1. The number of distinct gene products (across all species)
>>>> annotated using NON-IEA, NON-ND evidence on 1/1 of each year from
>>>> 2002-2008.
>>>>
>>> SELECT count(DISTINCT gene_product_id) AS num_gps
>>> FROM association INNER JOIN evidence ON
>>> (evidence.association_id=association.id)
>>> WHERE code != 'IEA' AND code != 'ND';
>>>
>>> go_old_20030101
>>> num_gps
>>> 42746
>>>
>>> go_old_20040101
>>> num_gps
>>> 99116
>>>
>>> go_old_20050101
>>> num_gps
>>> 144635
>>>
>>> go_old_20060101
>>> num_gps
>>> 136734
>>>
>>> go_old_20070101
>>> num_gps
>>> 140370
>>>
>>> go_old_20080101
>>> num_gps
>>> 192535
>>>
>>>
>>>
>>>> 2. The number of distinct species with any NON-IEA, NON-ND GO
>>>> annotation on 1/1 of each year from 2002-2008.
>>>>
>>> SELECT count(DISTINCT species_id) AS num_species
>>> FROM gene_product
>>> INNER JOIN association ON
>>> (gene_product.id=association.gene_product_id)
>>> INNER JOIN evidence ON (evidence.association_id=association.id)
>>> WHERE code != 'IEA' AND code != 'ND';
>>> go_old_20030101
>>> num_species
>>> 207
>>>
>>> go_old_20040101
>>> num_species
>>> 375
>>>
>>> go_old_20050101
>>> num_species
>>> 533
>>>
>>> go_old_20060101
>>> num_species
>>> 638
>>>
>>> go_old_20070101
>>> num_species
>>> 884
>>>
>>> go_old_20080101
>>> num_species
>>> 930
>>>
>>> (yep, these numbers are correct; there are a lot of non-MOD
>>> annotations to GO)
>>>
>>>
>>>> Doug howe wrote:
>>>>
>>>>> Thanks, Chris, those are very useful numbers. If you don't mind
>>>>> running two more queries, it won't be necessary to open the older
>>>>> databases to GOOSE.
>>>>> I'd be interested to see:
>>>>> 1. The number of distinct gene products (across all species)
>>>>> annotated on 1/1 of each year from 2002-2008.
>>>>> 2. The number of distinct species with any GO annotation on 1/1
>>>>> of each year from 2002-2008.
>>>>>
>>>>> -Thanks!
>>>>> -Doug
>>>>>
>>>>>
>>>>> Chris Mungall wrote:
>>>>>
>>>>>
>>>>>> On Apr 8, 2008, at 9:48 AM, Doug howe wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Does anyone have, or know how to get, historical stats on the
>>>>>>> number of GO annotations that have been contributed to the GOC
>>>>>>> over time? I'm looking for the number of non-IEA, non-ND GO
>>>>>>> annotations that existed for each year from 2002-2008.
>>>>>>>
>>>>>>> Midori provided me with the following numbers of GO terms for that
>>>>>>> period if anyone is interested:
>>>>>>> date       total   obsolete
>>>>>>> 1/1/2002   10305   152
>>>>>>> 1/1/2003   13339   383
>>>>>>> 1/1/2004   16771   725
>>>>>>> 1/1/2005   18219   969
>>>>>>> 1/1/2006   20348   992
>>>>>>> 1/1/2007   22928   1011
>>>>>>> 1/1/2008   25758   1137
>>>>>>>
>>>>>>>
>>>>>> We have historical go dbs mirrored here - we can open these to
>>>>>> GOOSE if you like, or you can just request queries.
>>>>>>
>>>>>> This is what you're after:
>>>>>>
>>>>>> SELECT count(*) AS num_annots
>>>>>> FROM association INNER JOIN evidence ON
>>>>>> (evidence.association_id=association.id)
>>>>>> WHERE code != 'IEA' AND code != 'ND';
>>>>>> go_old_20030101
>>>>>> num_annots
>>>>>> 133699
>>>>>>
>>>>>> go_old_20040101
>>>>>> num_annots
>>>>>> 386339
>>>>>>
>>>>>> go_old_20050101
>>>>>> num_annots
>>>>>> 416224
>>>>>>
>>>>>> go_old_20060101
>>>>>> num_annots
>>>>>> 469107
>>>>>>
>>>>>> go_old_20070101
>>>>>> num_annots
>>>>>> 489402
>>>>>>
>>>>>> go_old_20080101
>>>>>> num_annots
>>>>>> 580052
>>>>>>
>>>>>>
>>>>>> This one may also be informative: the number of terms used
>>>>>> directly in annotations (all):
>>>>>>
>>>>>> SELECT count(DISTINCT term_id) AS num_terms_used_directly
>>>>>> FROM association;
>>>>>> go_old_20030101
>>>>>> num_terms_used_directly
>>>>>> 7116
>>>>>>
>>>>>> go_old_20040101
>>>>>> num_terms_used_directly
>>>>>> 9008
>>>>>>
>>>>>> go_old_20050101
>>>>>> num_terms_used_directly
>>>>>> 10134
>>>>>>
>>>>>> go_old_20060101
>>>>>> num_terms_used_directly
>>>>>> 11113
>>>>>>
>>>>>> go_old_20070101
>>>>>> num_terms_used_directly
>>>>>> 12340
>>>>>>
>>>>>> go_old_20080101
>>>>>> num_terms_used_directly
>>>>>> 13812
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -Doug
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Go mailing list
>>>>>>> Go at geneontology.org
>>>>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>>>>
>>>>>>>
>>>>>>>
>
>
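Following up on Judy's point about checking evidence codes per species, the
species query above could be extended to group by species and evidence code.
What follows is only a minimal sketch, not a query anyone in this thread ran.
It assumes the same association, evidence and gene_product tables and columns
that Chris's queries reference, and it runs against a toy in-memory SQLite
database whose rows are invented purely for illustration. The names
build_toy_db and evidence_breakdown are hypothetical helpers.

```python
import sqlite3

# Toy stand-in for the GO database tables referenced in the queries above.
# Only the columns those queries use are modelled; every row is invented.
SCHEMA = """
CREATE TABLE gene_product (id INTEGER PRIMARY KEY, species_id INTEGER);
CREATE TABLE association  (id INTEGER PRIMARY KEY, gene_product_id INTEGER,
                           term_id INTEGER);
CREATE TABLE evidence     (id INTEGER PRIMARY KEY, association_id INTEGER,
                           code TEXT);
"""

# Distinct gene products per species and evidence code, excluding IEA and ND.
BREAKDOWN = """
SELECT gene_product.species_id AS species_id,
       evidence.code AS code,
       count(DISTINCT association.gene_product_id) AS num_gps
FROM gene_product
INNER JOIN association ON (gene_product.id = association.gene_product_id)
INNER JOIN evidence ON (evidence.association_id = association.id)
WHERE evidence.code != 'IEA' AND evidence.code != 'ND'
GROUP BY gene_product.species_id, evidence.code
ORDER BY gene_product.species_id, evidence.code;
"""


def build_toy_db():
    """Create an in-memory database with a handful of made-up annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Gene products 1 and 2 belong to species 10, gene product 3 to species 20.
    conn.executemany("INSERT INTO gene_product VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20)])
    conn.executemany("INSERT INTO association VALUES (?, ?, ?)",
                     [(1, 1, 100), (2, 1, 101), (3, 2, 100), (4, 3, 102)])
    # Association 3 has only IEA evidence; association 4 has ISS and ND.
    conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)",
                     [(1, 1, "IDA"), (2, 2, "IDA"), (3, 3, "IEA"),
                      (4, 4, "ISS"), (5, 4, "ND")])
    return conn


def evidence_breakdown(conn):
    """Return (species_id, code, num_gps) rows for non-IEA, non-ND evidence."""
    return conn.execute(BREAKDOWN).fetchall()


if __name__ == "__main__":
    for species_id, code, num_gps in evidence_breakdown(build_toy_db()):
        print(species_id, code, num_gps)
```

Run against one of the real go_old_* snapshots instead of the toy data, the
same SELECT would show which evidence codes account for each species in the
num_species counts, provided the snapshot's schema matches these assumptions.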