[go] total genes/gene products annotated versus genome size

Sue Rhee rhee at acoma.Stanford.EDU
Wed Dec 19 23:35:44 PST 2007


Dear GO consortium,

It looks like different GOC members are submitting annotations for
different kinds of objects (genes, proteins, transcripts), and the total
number of those objects in the genome is not included in the association
files. Also, some databases (e.g. TAIR) are submitting 'genes' that look
more like transcripts. It is therefore difficult for me (or anyone else)
to generate a simple table like the one below. Michael suggested that I
use genes rather than gene products for this table, but I am having
trouble doing this by querying the GO database.

If the organism(s) for which you submit annotation files are included
in the following table, would you kindly send me the following three
numbers?

1. total number of genes (including protein-coding, RNA and pseudogenes)
in the genome as of October 2007
2. total number of genes annotated with GO as of October 2007
3. total number of genes annotated with evidence codes IDA, IMP, IGI,
IPI, IEP as of October 2007
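
For numbers 2 and 3, here is a minimal sketch of one way to get the counts from an association file. It assumes the usual tab-delimited gene association layout, with the database in column 1, the object ID in column 2 and the evidence code in column 7, and comment lines starting with "!". The function name and the sample IDs are hypothetical, made up for illustration. Note that it counts whatever objects the file lists, so it gives gene counts only if the file is gene-based.

```python
# Evidence codes treated as experimental, as listed in item 3 above.
EXPERIMENTAL_CODES = {"IDA", "IMP", "IGI", "IPI", "IEP"}


def count_annotated_objects(lines):
    """Return (objects with any GO annotation, objects with an experimental one).

    Assumes tab-delimited association lines: column 1 = database,
    column 2 = object ID, column 7 = evidence code.
    """
    annotated = set()
    experimental = set()
    for line in lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed line; ignored in this sketch
        obj = (cols[0], cols[1])
        annotated.add(obj)
        if cols[6] in EXPERIMENTAL_CODES:
            experimental.add(obj)
    return len(annotated), len(experimental)


# Hypothetical sample lines (IDs and references are invented).
sample = [
    "!gaf-version: 1.0",
    "\t".join(["XDB", "G0001", "abc1", "", "GO:0005634", "XDB:ref1", "IDA"]),
    "\t".join(["XDB", "G0001", "abc1", "", "GO:0003677", "XDB:ref2", "IEA"]),
    "\t".join(["XDB", "G0002", "abc2", "", "GO:0005737", "XDB:ref3", "ISS"]),
]

total_annotated, total_experimental = count_annotated_objects(sample)
print(total_annotated, total_experimental)
```

Number 1, the total number of genes in the genome, cannot be derived this way because the association files list only annotated objects.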

Thanks much,
Sue

species (NCBI taxon ID)               experimental  total         % expt        annotated     total gene    % annotated^b % known in
                                      annotations   annotations   annotations   gene products products^a                  genome^c

baker’s yeast (4932)                  23993         36746         65.3%         6476          7137          90.7%         59.2%
fission yeast (4896)                  12343         33385         37.0%         5243          5463          96.0%         35.5%
fruit fly (7227)                      14148         20303         69.7%         10581         30971         34.2%         23.8%
worm (6239)                           27472         68594         40.1%         12534         28866         43.4%         17.4%
Candida albicans (5476)               3413          5326          64.1%         1262          6344          19.9%         12.7%
arabidopsis (3702)                    14060         103850        13.5%         34683         42929         80.8%         10.9%
mouse (10090)                         28540         133743        21.3%         18052         35466         50.9%         10.9%
human^d (9606)                        12437         166419        7.5%          33760         37106         91.0%         6.8%
slime mold (44689)                    2691          30299         8.9%          4328          6729          64.3%         5.7%
Pseudomonas aeruginosa PAO1 (208964)  1123          7377          15.2%         1519          5670          26.8%         4.1%
rat^d (10116)                         12986         135246        9.6%          11606         37106         31.3%         3.0%
zebrafish (7955)                      4204          70340         6.0%          13194         27532         47.9%         2.9%
Plasmodium falciparum (5833)          196           12026         1.6%          3165          5595          56.6%         0.9%
Trypanosoma brucei (5691)             438           19006         2.3%          3921          10966         35.8%         0.8%
rice (39947)                          265           49582         0.5%          37548         58587         64.1%         0.3%
cow^d (9913)                          278           85951         0.3%          22727         42836         53.1%         0.2%
chicken^d (9031)                      179           55498         0.3%          16067         33566         47.9%         0.2%


-- 
Sue Rhee
Staff Scientist
Carnegie Institution, Department of Plant Biology
260 Panama Street, Stanford, CA 94305
Phone: (650) 325-1521 x251
Fax: (650) 325-6857
