[go] total genes/gene products annotated versus genome size

Harold Drabkin hjd at informatics.jax.org
Thu Dec 20 12:25:17 PST 2007


I think it was meant not for the annotation (which we don't do), but for 
the total gene count when figuring the % of genes annotated. I think we 
should include only "annotatable" genes in the denominator; pseudogenes 
would not be included.
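
To make the denominator point concrete, here is a rough sketch in Python. The gene IDs, the type labels ("protein_coding", "pseudogene") and the function name are all made up for illustration; they are not taken from any database's actual files.

```python
from typing import Iterable, Mapping


def percent_annotated(
    gene_types: Mapping[str, str],
    annotated_ids: Iterable[str],
    excluded_types: frozenset = frozenset({"pseudogene"}),
) -> float:
    """Percent of "annotatable" genes that carry at least one annotation.

    Genes whose type is in excluded_types are dropped from the denominator,
    and any annotations attached to them are ignored in the numerator.
    """
    annotatable = {g for g, t in gene_types.items() if t not in excluded_types}
    if not annotatable:
        raise ValueError("no annotatable genes to use as the denominator")
    annotated = annotatable & set(annotated_ids)
    return 100.0 * len(annotated) / len(annotatable)


# Hypothetical toy genome: three protein coding genes and one pseudogene.
example_types = {
    "G1": "protein_coding",
    "G2": "protein_coding",
    "G3": "pseudogene",
    "G4": "protein_coding",
}
# G3 is listed here only to show that it is ignored.
example_annotated = ["G1", "G2", "G3"]

example_percent = percent_annotated(example_types, example_annotated)
```

With these toy numbers the denominator is 3 rather than 4, so two annotated genes out of three annotatable ones are counted.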
h

Karen Christie wrote:
> On the subject of pseudogenes, I thought it was agreed some time ago 
> that no one should be making GO annotations to pseudogenes because by 
> definition they are not producing gene products.
>
> -Karen
>
>
> On Thu, 20 Dec 2007, Shimoyama, Mary wrote:
>
>> I agree with Judy on this point.  We do not annotate to pseudogenes, so
>> it is a distortion to include these numbers. As Judy indicated for
>> mouse, at this point there are no figures for functional RNAs in
>> rat in which we could have any confidence.  I believe we
>> should stick with protein coding genes as the number we use.
>>
>> Mary Shimoyama
>> Program Manager
>> Rat Genome Database
>> Human and Molecular Genetics Center
>> Medical College of Wisconsin
>> shimoyama at mcw.edu
>> Tel: 414-456-7505
>> Fax: 414-456-6595
>> http://rgd.mcw.edu
>>
>> -----Original Message-----
>> From: owner-go at genome.stanford.edu [mailto:owner-go at genome.stanford.edu]
>> On Behalf Of Valerie Wood
>> Sent: Thursday, December 20, 2007 8:28 AM
>> To: Judith Blake
>> Cc: gene ontology; Sue Rhee
>> Subject: Re: Re: [go] total genes/gene products annotated versus genome
>> size
>>
>>
>>> Also, I would not favor combining protein coding genes, pseudogenes, and
>>> functional RNAs in the same set. For one thing, we should not have any
>>> annotations to pseudogenes. That was decided some time back. Also, for
>>> mouse at least, while we have good algorithms for identifying protein
>>> coding genes, we have little or no confidence in any representation of
>>> numbers of functional RNAs. So this would lead to a distortion of
>>> comparison between genomes.
>>>
>>>
>>
>>
>> Hi Judy,
>>
>> I agree that it is best not to combine these.
>> However, using the GO annotation it is not possible to distinguish
>> between annotations to proteins and to non-coding RNAs, because the type
>> column in the association file only holds values such as gene or transcript.
>>
>> I think we discussed allowing any SO ID. Was this ratified,
>> and if so, does anyone use it?
>>
>> I think it would be really useful to be able to query GO based on
>> 'features that code for proteins' only.
>> Otherwise you get artificially inflated categories (for example,
>> translation, which has all the tRNAs annotated).
>>
>> I only manage to make the distinction for pombe and cerevisiae by
>> filtering the association files based on the IDs, because I know which
>> ID types belong to which features.
>>
>>
>> Val
>>
>>
>>
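
As a rough sketch of the filtering Val describes, the Python below keeps only association-file rows that look like protein-coding features. It assumes a tab-delimited association file with the object ID in column 2 and the object type in column 12; the IDs, the "protein" and "tRNA" type values, and the ID-prefix rule in the example are invented for illustration and do not reflect any database's real ID scheme.

```python
from typing import Callable, Iterable, Iterator, List, Optional


def filter_associations(
    lines: Iterable[str],
    keep_types: frozenset = frozenset({"protein"}),
    id_is_protein_coding: Optional[Callable[[str], bool]] = None,
) -> Iterator[List[str]]:
    """Yield association rows (as column lists) for protein-coding features.

    If id_is_protein_coding is given, it decides based on the object ID
    (column 2). Otherwise the object type (column 12) must be in keep_types.
    """
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # malformed row; ignore rather than guess
        object_id, object_type = cols[1], cols[11]
        if id_is_protein_coding is not None:
            keep = id_is_protein_coding(object_id)
        else:
            keep = object_type in keep_types
        if keep:
            yield cols


def make_row(object_id: str, object_type: str) -> str:
    """Build one hypothetical 15-column association line for the example."""
    cols = ["DB", object_id, "sym", "", "GO:0006412", "PMID:1", "IDA", "",
            "P", "", "", object_type, "taxon:1", "20071220", "DB"]
    return "\t".join(cols) + "\n"


sample_lines = [
    "!hypothetical header line\n",
    make_row("PC001", "protein"),
    make_row("TR001", "tRNA"),
    make_row("PC002", "gene"),
]

# Filter on the type column alone: only the row typed "protein" survives.
by_type = [row[1] for row in filter_associations(sample_lines)]

# Filter on a made-up ID convention ("PC" prefix means protein coding).
by_id = [
    row[1]
    for row in filter_associations(
        sample_lines, id_is_protein_coding=lambda i: i.startswith("PC")
    )
]
```

The second call mirrors the ID-based workaround: when the type column says only "gene", the type filter misses PC002, while a per-database ID rule can still pick it up.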



