[go] total genes/gene products annotated versus genome size

Shimoyama, Mary shimoyama at mcw.edu
Thu Dec 20 06:45:18 PST 2007


I agree with Judy on this point.  We do not annotate to pseudogenes, so
it is a distortion to include these numbers. As Judy indicated for
mouse, at this point there are no figures for functional RNAs in rat
in which we could have any confidence.  I believe we should stick with
protein-coding genes as the number we use.

Mary Shimoyama
Program Manager
Rat Genome Database
Human and Molecular Genetics Center
Medical College of Wisconsin
shimoyama at mcw.edu
Tel: 414-456-7505
Fax: 414-456-6595
http://rgd.mcw.edu
 
-----Original Message-----
From: owner-go at genome.stanford.edu [mailto:owner-go at genome.stanford.edu]
On Behalf Of Valerie Wood
Sent: Thursday, December 20, 2007 8:28 AM
To: Judith Blake
Cc: gene ontology; Sue Rhee
Subject: Re: Re: [go] total genes/gene products annotated versus genome
size


> Also, I would not favor combining protein coding genes, pseudogenes, and
> functional RNAs in the same set. For one thing, we should not have any
> annotations to pseudogenes. That was decided some time back. Also, for
> mouse at least, while we have good algorithms for identifying protein
> coding genes, we have little or no confidence in any representation of
> numbers of functional RNAs. So this would lead to a distortion of
> comparison between genomes.
> 
>


Hi Judy,

I agree that it is best not to combine these.
However, using the GO annotation alone it is not possible to distinguish
annotations to proteins from annotations to non-coding RNAs, because the
object type column in the association file only holds values such as
gene, transcript etc.

I think we discussed allowing any SO ID. Was this ratified, and if so,
does anyone use it?

I think it would be really useful to be able to query GO based on
'features that code for proteins' only.
Otherwise you get artificially inflated categories (for example for
translation which has all the tRNAs annotated).

I only manage to make the distinction for pombe and cerevisiae by
filtering the association files based on the IDs, because I know which
ID types belong to which features.
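
The ID-based filtering described above could look roughly like the
following Python sketch. It assumes a tab-separated association file with
the object ID in column 2 and the GO ID in column 5, and it assumes a
made-up ID scheme (PROT... for protein-coding features, TRNA... for tRNAs);
the pattern, the helper names and the sample rows are all hypothetical and
would need to be replaced with the real ID conventions of each database.
Note that every sample row has "gene" as its object type, so that column
alone cannot separate the two kinds of feature.

```python
import re

# Hypothetical ID scheme, for illustration only: protein-coding features
# are "PROT" + digits and tRNAs are "TRNA" + digits.
PROTEIN_ID = re.compile(r"^PROT\d+$")


def rows(lines):
    """Yield the tab-separated columns of each non-comment association line."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header comments and blank lines
        yield line.rstrip("\n").split("\t")


def annotated_ids(lines, go_id, id_pattern=None):
    """Return the distinct object IDs (column 2) annotated to go_id (column 5).

    If id_pattern is given, only IDs matching it are kept.
    """
    ids = set()
    for cols in rows(lines):
        if len(cols) < 5 or cols[4] != go_id:
            continue
        if id_pattern is not None and not id_pattern.match(cols[1]):
            continue
        ids.add(cols[1])
    return ids


def make_line(object_id, go_id):
    """Build one made-up 15-column association line (object type is 'gene')."""
    cols = ["ExampleDB", object_id, object_id.lower(), "", go_id, "REF:0",
            "IEA", "", "P", "", "", "gene", "taxon:0", "20071220", "ExampleDB"]
    return "\t".join(cols) + "\n"


# Made-up sample data: two proteins and three tRNAs annotated to
# translation (GO:0006412), plus one unrelated annotation.
SAMPLE = ["!example association file\n"] + [
    make_line("PROT0001", "GO:0006412"),
    make_line("PROT0002", "GO:0006412"),
    make_line("TRNA0001", "GO:0006412"),
    make_line("TRNA0002", "GO:0006412"),
    make_line("TRNA0003", "GO:0006412"),
    make_line("PROT0003", "GO:0008150"),
]

all_features = annotated_ids(SAMPLE, "GO:0006412")
protein_only = annotated_ids(SAMPLE, "GO:0006412", PROTEIN_ID)
print(len(all_features), len(protein_only))
```

With the sample rows, the unfiltered count for translation includes the
tRNAs, while the filtered count covers only the features whose IDs match
the protein pattern.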


Val





