From midori at ebi.ac.uk Tue Jan 2 08:09:02 2007
From: midori at ebi.ac.uk (Midori Harris)
Date: Tue, 2 Jan 2007 16:09:02 +0000 (GMT)
Subject: GO_REFS - proposed merges-keyword
In-Reply-To: <1165333233.2713.16.camel@paul.gen.cam.ac.uk>
References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk>
Message-ID:

OK, I've come up with some drafts. Please let me know if anything about these
should be changed, or if any existing GO_REFs should not be included in the
merges (merged entries have one or more 'alt_id' entries, corresponding to
the ids of entries merged in).

If annotators agree that these are OK, we can have the merges done in time
for the consortium meeting.

thanks,
midori

go_ref_id: GO_REF:0000002
alt_id: GO_REF:0000007
alt_id: GO_REF:0000014
alt_id: GO_REF:0000016
alt_id: GO_REF:0000017
title: Gene Ontology annotation in FlyBase through association of InterPro
records with GO terms.
authors: DDB, FB, MGI, UniProt, ZFIN curators
year: 2001
external_accession: MGI:2152098
external_accession: J:72247
external_accession: ZFIN:ZDB-PUB-020724-1
external_accession: FBrf0174215
external_accession: DDB:10157
abstract: Transitive assignment of GO terms based on InterPro classification.
For any database entry (representing a protein or protein-coding gene) that
has been annotated with one or more InterPro domains, the corresponding GO
terms are obtained from a translation table of InterPro entries to GO terms
(interpro2go) generated manually by the InterPro team at EBI.
comment: Formerly GOA:interpro. Note that GO annotations based on
InterPro-to-GO transitive assignment may undergo subsequent filtering, e.g.
to remove annotations redundant with manual curation; consult documentation
from the annotation providers for further information.

go_ref_id: GO_REF:0000003
alt_id: GO_REF:0000005
title: Gene Ontology annotation based on Enzyme Commission mapping
authors: UniProt curators; MGI curators
year: 2001
external_accession: MGI:2152096
external_accession: J:72245
citation: Genomics 74:121-128
abstract: Transitive assignment using Enzyme Commission identifiers. This
method is used for any database entry, such as a protein record in Swiss-Prot
or TrEMBL, that has had an Enzyme Commission number assigned. The
corresponding GO term is determined using the EC cross-references in the GO
molecular function ontology. Also see Hill et al., Genomics (2001)
74:121-128.
comment: Formerly GOA:spec.

go_ref_id: GO_REF:0000004
alt_id: GO_REF:0000009
alt_id: GO_REF:0000013
title: Gene Ontology annotation based on Swiss-Prot keyword mapping.
authors: Swiss-Prot/TrEMBL curators
year: 2000
external_accession: MGI:1354194
external_accession: J:60000
external_accession: ZFIN:ZDB-PUB-020723-1
abstract: Transitive assignment using Swiss-Prot keywords. This method is
used for any database record that has one or more Swiss-Prot keywords
assigned. Each keyword is mapped to the corresponding GO term in the spkw2go
file, which was originally constructed manually by MGI curators and is now
maintained by the GOA team at EBI.
comment: Formerly GOA:spkw.

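[Editorial illustration, not part of the original message.] The merge convention described in the draft (each surviving entry lists the ids merged into it as 'alt_id' lines) amounts to a lookup from a retired id to the id that replaces it. The following is a minimal Python sketch of that idea, assuming blank-line-separated "tag: value" stanzas as in the draft; the function names `parse_stanzas`, `build_merge_map` and `resolve` are hypothetical and do not come from any GO tooling. Only the id lines of the three draft stanzas are reproduced.

```python
from collections import defaultdict

# Id lines only, copied from the three draft stanzas in the message.
DRAFT = """\
go_ref_id: GO_REF:0000002
alt_id: GO_REF:0000007
alt_id: GO_REF:0000014
alt_id: GO_REF:0000016
alt_id: GO_REF:0000017

go_ref_id: GO_REF:0000003
alt_id: GO_REF:0000005

go_ref_id: GO_REF:0000004
alt_id: GO_REF:0000009
alt_id: GO_REF:0000013
"""


def parse_stanzas(text):
    """Split blank-line-separated 'tag: value' stanzas into dicts of value lists."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        record = defaultdict(list)
        for line in block.splitlines():
            tag, _, value = line.partition(": ")
            record[tag].append(value.strip())
        stanzas.append(dict(record))
    return stanzas


def build_merge_map(stanzas):
    """Map every id (primary or alt_id) to the primary id of its stanza."""
    merged = {}
    for record in stanzas:
        primary = record["go_ref_id"][0]
        merged[primary] = primary
        for alt in record.get("alt_id", []):
            merged[alt] = primary
    return merged


MERGE_MAP = build_merge_map(parse_stanzas(DRAFT))


def resolve(go_ref_id):
    """Return the current id for a possibly merged GO_REF id (unknown ids pass through)."""
    return MERGE_MAP.get(go_ref_id, go_ref_id)
```

With such a map, an annotation file that still cites GO_REF:0000014 could be checked or rewritten by calling `resolve` on its reference column; ids that were never merged are returned unchanged.
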
From midori at ebi.ac.uk Tue Jan 2 08:11:59 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Tue, 2 Jan 2007 16:11:59 +0000 (GMT) Subject: GO_REFS - proposed merges-keyword In-Reply-To: References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> Message-ID: oops, Ev noticed that the title for the interpro2go one was wrong. It should be: go_ref_id: GO_REF:0000002 title: Gene Ontology annotation through association of InterPro records with GO terms. m On Tue, 2 Jan 2007, Midori Harris wrote: > OK, I've come up with some drafts. Please let me know if anything about these > should be changed, or if any existing GO_REFs should not be included in the > merges (merged entries have one or more 'alt_id' entries, corresponding to > the ids of entries merged in). > > If annotators agree that these are OK, we can have the merges done in time > for the consortium meeting. > > thanks, > midori > > go_ref_id: GO_REF:0000002 > alt_id: GO_REF:0000007 > alt_id: GO_REF:0000014 > alt_id: GO_REF:0000016 > alt_id: GO_REF:0000017 > title: Gene Ontology annotation in FlyBase through association of InterPro > records with GO terms. > authors: DDB, FB, MGI, UniProt, ZFIN curators > year: 2001 > external_accession: MGI:2152098 > external_accession: J:72247 > external_accession: ZFIN:ZDB-PUB-020724-1 > external_accession: FBrf0174215 > external_accession: DDB:10157 > abstract: Transitive assignment of GO terms based on InterPro classification. 
> For any database entry (representing a protein or protein-coding gene) thah > has been annotated with one or more InterPro domains, The corresponding GO > terms are obtained from a translation table of InterPro entries to GO terms > (interpro2go) generated manually by the InterPro team at EBI. > comment: Formerly GOA:interpro. Note that GO annotations based on > InterPro-to-GO transitive assignment may undergo subsequent filtering, e.g. > to remove annotations redundant with manual curation; consult documentation > from the annotation providers for further information. > > go_ref_id: GO_REF:0000003 > alt_id: GO_REF:0000005 > title: Gene Ontology annotation based on Enzyme Commission mapping > authors: UniProt curators; MGI curators > year: 2001 > external_accession: MGI:2152096 > external_accession: J:72245 > citation: Genomics 74:121-128 > abstract: Transitive assignment using Enzyme Commission identifiers. This > method is used for any database entry, such as a protein record in Swiss-Prot > or TrEMBL, that has had an Enzyme Commission number assigned. The > corresponding GO term is determined using the EC cross-references in the GO > molecular function ontology. Also see Hill et al., Genomics (2001) > 74:121-128. > comment: Formerly GOA:spec. > > go_ref_id: GO_REF:0000004 > alt_id: GO_REF:0000009 > alt_id: GO_REF:0000013 > title: Gene Ontology annotation based on Swiss-Prot keyword mapping. > authors: Swiss-Prot/TrEMBL curators > year: 2000 > external_accession: MGI:1354194 > external_accession: J:60000 > external_accession: ZFIN:ZDB-PUB-020723-1 > abstract: Transitive assignment using Swiss-Prot keywords. This method is > used for any database record that has one or more Swiss-Prot keywords > assigned. Each keyword is mapped to the corresponding GO term in the spkw2go > file, which was originally constructed manually by MGI curators and is now > maintained by the GOA team at EBI. > comment: Formerly GOA:spkw. 
> > > From camon at ebi.ac.uk Tue Jan 2 08:16:12 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Tue, 02 Jan 2007 16:16:12 +0000 Subject: GO_REFS - proposed merges-keyword References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> Message-ID: <459A854C.9070207@ebi.ac.uk> Hi, Im ok with this...can we have authors GOA Curators, instead of UniProt Curators and Swiss-Prot/TrEMBL curators.. already reported removing FlyBase from first title... should we provide a link to the mapping file on the GO or GOA ftp sites thanks Evelyn Midori Harris wrote: > OK, I've come up with some drafts. Please let me know if anything about > these should be changed, or if any existing GO_REFs should not be > included in the merges (merged entries have one or more 'alt_id' > entries, corresponding to the ids of entries merged in). > > If annotators agree that these are OK, we can have the merges done in > time for the consortium meeting. > > thanks, > midori > > go_ref_id: GO_REF:0000002 > alt_id: GO_REF:0000007 > alt_id: GO_REF:0000014 > alt_id: GO_REF:0000016 > alt_id: GO_REF:0000017 > title: Gene Ontology annotation in FlyBase through association of > InterPro records with GO terms. > authors: DDB, FB, MGI, UniProt, ZFIN curators > year: 2001 > external_accession: MGI:2152098 > external_accession: J:72247 > external_accession: ZFIN:ZDB-PUB-020724-1 > external_accession: FBrf0174215 > external_accession: DDB:10157 > abstract: Transitive assignment of GO terms based on InterPro > classification. 
For any database entry (representing a protein or > protein-coding gene) thah has been annotated with one or more InterPro > domains, The corresponding GO terms are obtained from a translation > table of InterPro entries to GO terms (interpro2go) generated manually > by the InterPro team at EBI. > comment: Formerly GOA:interpro. Note that GO annotations based on > InterPro-to-GO transitive assignment may undergo subsequent filtering, > e.g. to remove annotations redundant with manual curation; consult > documentation from the annotation providers for further information. > > go_ref_id: GO_REF:0000003 > alt_id: GO_REF:0000005 > title: Gene Ontology annotation based on Enzyme Commission mapping > authors: UniProt curators; MGI curators > year: 2001 > external_accession: MGI:2152096 > external_accession: J:72245 > citation: Genomics 74:121-128 > abstract: Transitive assignment using Enzyme Commission identifiers. > This method is used for any database entry, such as a protein record in > Swiss-Prot or TrEMBL, that has had an Enzyme Commission number assigned. > The corresponding GO term is determined using the EC cross-references in > the GO molecular function ontology. Also see Hill et al., Genomics > (2001) 74:121-128. > comment: Formerly GOA:spec. > > go_ref_id: GO_REF:0000004 > alt_id: GO_REF:0000009 > alt_id: GO_REF:0000013 > title: Gene Ontology annotation based on Swiss-Prot keyword mapping. > authors: Swiss-Prot/TrEMBL curators > year: 2000 > external_accession: MGI:1354194 > external_accession: J:60000 > external_accession: ZFIN:ZDB-PUB-020723-1 > abstract: Transitive assignment using Swiss-Prot keywords. This method > is used for any database record that has one or more Swiss-Prot keywords > assigned. Each keyword is mapped to the corresponding GO term in the > spkw2go file, which was originally constructed manually by MGI curators > and is now maintained by the GOA team at EBI. > comment: Formerly GOA:spkw. 
> -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From midori at ebi.ac.uk Tue Jan 2 08:18:13 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Tue, 2 Jan 2007 16:18:13 +0000 (GMT) Subject: GO_REFS - proposed merges-keyword In-Reply-To: <459A854C.9070207@ebi.ac.uk> References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> <459A854C.9070207@ebi.ac.uk> Message-ID: > > Im ok with this...can we have authors GOA Curators, instead of UniProt > Curators and Swiss-Prot/TrEMBL curators.. OK > > already reported removing FlyBase from first title... and already fixed :) > > should we provide a link to the mapping file on the GO or GOA ftp sites you mean put a URL in the abstract? m From dhowe at cs.uoregon.edu Tue Jan 2 08:48:26 2007 From: dhowe at cs.uoregon.edu (Doug howe) Date: Tue, 02 Jan 2007 08:48:26 -0800 Subject: GO_REFS - proposed merges-keyword In-Reply-To: References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> Message-ID: <459A8CDA.8080204@cs.uoregon.edu> These look fine to me. You could also add ZDB-PUB-031118-3 as an external_accession for the ec2go ref. -Doug Midori Harris wrote: > oops, Ev noticed that the title for the interpro2go one was wrong. 
It > should be: > > go_ref_id: GO_REF:0000002 > title: Gene Ontology annotation through association of InterPro > records with GO terms. > > m > > On Tue, 2 Jan 2007, Midori Harris wrote: > >> OK, I've come up with some drafts. Please let me know if anything >> about these should be changed, or if any existing GO_REFs should not >> be included in the merges (merged entries have one or more 'alt_id' >> entries, corresponding to the ids of entries merged in). >> >> If annotators agree that these are OK, we can have the merges done in >> time for the consortium meeting. >> >> thanks, >> midori >> >> go_ref_id: GO_REF:0000002 >> alt_id: GO_REF:0000007 >> alt_id: GO_REF:0000014 >> alt_id: GO_REF:0000016 >> alt_id: GO_REF:0000017 >> title: Gene Ontology annotation in FlyBase through association of >> InterPro records with GO terms. >> authors: DDB, FB, MGI, UniProt, ZFIN curators >> year: 2001 >> external_accession: MGI:2152098 >> external_accession: J:72247 >> external_accession: ZFIN:ZDB-PUB-020724-1 >> external_accession: FBrf0174215 >> external_accession: DDB:10157 >> abstract: Transitive assignment of GO terms based on InterPro >> classification. For any database entry (representing a protein or >> protein-coding gene) thah has been annotated with one or more >> InterPro domains, The corresponding GO terms are obtained from a >> translation table of InterPro entries to GO terms (interpro2go) >> generated manually by the InterPro team at EBI. >> comment: Formerly GOA:interpro. Note that GO annotations based on >> InterPro-to-GO transitive assignment may undergo subsequent >> filtering, e.g. to remove annotations redundant with manual curation; >> consult documentation from the annotation providers for further >> information. 
>> >> go_ref_id: GO_REF:0000003 >> alt_id: GO_REF:0000005 >> title: Gene Ontology annotation based on Enzyme Commission mapping >> authors: UniProt curators; MGI curators >> year: 2001 >> external_accession: MGI:2152096 >> external_accession: J:72245 >> citation: Genomics 74:121-128 >> abstract: Transitive assignment using Enzyme Commission identifiers. >> This method is used for any database entry, such as a protein record >> in Swiss-Prot or TrEMBL, that has had an Enzyme Commission number >> assigned. The corresponding GO term is determined using the EC >> cross-references in the GO molecular function ontology. Also see Hill >> et al., Genomics (2001) 74:121-128. >> comment: Formerly GOA:spec. >> >> go_ref_id: GO_REF:0000004 >> alt_id: GO_REF:0000009 >> alt_id: GO_REF:0000013 >> title: Gene Ontology annotation based on Swiss-Prot keyword mapping. >> authors: Swiss-Prot/TrEMBL curators >> year: 2000 >> external_accession: MGI:1354194 >> external_accession: J:60000 >> external_accession: ZFIN:ZDB-PUB-020723-1 >> abstract: Transitive assignment using Swiss-Prot keywords. This >> method is used for any database record that has one or more >> Swiss-Prot keywords assigned. Each keyword is mapped to the >> corresponding GO term in the spkw2go file, which was originally >> constructed manually by MGI curators and is now maintained by the GOA >> team at EBI. >> comment: Formerly GOA:spkw. 
>> >> >> > From jblake at informatics.jax.org Tue Jan 2 09:12:46 2007 From: jblake at informatics.jax.org (Judith Blake) Date: Tue, 02 Jan 2007 12:12:46 -0500 Subject: GO_REFS - proposed merges-keyword In-Reply-To: References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> <459A854C.9070207@ebi.ac.uk> Message-ID: <459A928E.2030102@informatics.jax.org> looks good from here judy Midori Harris wrote: >> >> Im ok with this...can we have authors GOA Curators, instead of >> UniProt Curators and Swiss-Prot/TrEMBL curators.. > > OK > >> >> already reported removing FlyBase from first title... > > and already fixed :) > >> >> should we provide a link to the mapping file on the GO or GOA ftp sites > > you mean put a URL in the abstract? > > m From midori at ebi.ac.uk Tue Jan 2 16:00:08 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Wed, 3 Jan 2007 00:00:08 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701030000.l03008u1168031@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070103/9540122d/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070103/9540122d/attachment.pl From sart2 at gen.cam.ac.uk Wed Jan 3 08:06:11 2007 From: sart2 at gen.cam.ac.uk (Susan Tweedie) Date: Wed, 03 Jan 2007 16:06:11 +0000 Subject: GO_REFS - proposed merges-keyword In-Reply-To: <459A928E.2030102@informatics.jax.org> References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> <459A854C.9070207@ebi.ac.uk> <459A928E.2030102@informatics.jax.org> Message-ID: <1167840371.5829.18.camel@paul.gen.cam.ac.uk> yes, fine for me too. Susan On Tue, 2007-01-02 at 12:12 -0500, Judith Blake wrote: > looks good from here > judy > > Midori Harris wrote: > >> > >> Im ok with this...can we have authors GOA Curators, instead of > >> UniProt Curators and Swiss-Prot/TrEMBL curators.. > > > > OK > > > >> > >> already reported removing FlyBase from first title... > > > > and already fixed :) > > > >> > >> should we provide a link to the mapping file on the GO or GOA ftp sites > > > > you mean put a URL in the abstract? 
> > > > m > > From midori at ebi.ac.uk Wed Jan 3 08:57:01 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Wed, 3 Jan 2007 16:57:01 +0000 (GMT) Subject: GO_REFS - proposed merges-keyword In-Reply-To: <1167840371.5829.18.camel@paul.gen.cam.ac.uk> References: <451BFD6F.8060503@informatics.jax.org> <1159523862.29566.4.camel@paul> <451CFCEF.4040306@ebi.ac.uk> <45212559.70305@informatics.jax.org> <452128E0.1060808@ebi.ac.uk> <456DC75A.8040501@cs.uoregon.edu> <45755C2B.70800@ebi.ac.uk> <1165331522.2713.9.camel@paul.gen.cam.ac.uk> <457590C3.2040700@ebi.ac.uk> <1165333233.2713.16.camel@paul.gen.cam.ac.uk> <459A854C.9070207@ebi.ac.uk> <459A928E.2030102@informatics.jax.org> <1167840371.5829.18.camel@paul.gen.cam.ac.uk> Message-ID: OK, I've heard several yeses and no objections, soooo ... I will make the changes at the end of this week unless anyone objects by then. m From midori at ebi.ac.uk Thu Jan 4 16:00:10 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Fri, 5 Jan 2007 00:00:10 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701050000.l0500Ah1236981@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070105/5ebab844/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070105/5ebab844/attachment.pl From midori at ebi.ac.uk Thu Jan 11 16:00:07 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Fri, 12 Jan 2007 00:00:07 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701120000.l0C007X1484747@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070112/33a1cb90/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070112/33a1cb90/attachment.pl From edimmer at ebi.ac.uk Mon Jan 15 08:07:42 2007 From: edimmer at ebi.ac.uk (Emily Dimmer) Date: Mon, 15 Jan 2007 16:07:42 +0000 Subject: GO annotation question for protein complex terms. Message-ID: <45ABA6CE.1000506@ebi.ac.uk> Hi, I'm annotating a beta subunit of calcium channels. This subunit is an 'accessory' subunit which determines the formation, processing, regulation and function of the channel (PMID: 8530407). Unlike the alpha subunits, the betas do not possess membrane-spanning domains, but are localized on the cytoplasmic side of the membrane and are attached to an alpha subunit. I've annotated the beta subunit to 'voltage-gated calcium channel complex' (GO:0005891), however this term is a child of 'integral to membrane'. Should I be concerned that the parentage of channel term is not correct for my subunit, and ask for the calcium channel complex term to have the less granular term 'part of plasma membrane' as a parent instead? (I've also annotated the beta subunit to: 'internal side of plasma membrane' (GO:0009898)). I'm concerned as if someone were to use a GO slim which included the 'integral to membrane' GO term, my protein would get mapped to this location - which would be incorrect. Thanks, Emily -- ************************************ Emily Dimmer GOA and IntAct Database Curator EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD, U.K. Tel: +44 1223 494654 Fax: +44 1223 494468 email: edimmer at ebi.ac.uk ************************************ From val at sanger.ac.uk Mon Jan 15 08:17:39 2007 From: val at sanger.ac.uk (Valerie Wood) Date: Mon, 15 Jan 2007 16:17:39 +0000 Subject: GO annotation question for protein complex terms. 
In-Reply-To: <45ABA6CE.1000506@ebi.ac.uk> References: <45ABA6CE.1000506@ebi.ac.uk> Message-ID: <45ABA923.9060404@sanger.ac.uk> I would have the complex term moved from under integral to membrane, and re-annotate the individual subunits which are 'membrane integral' back to this term. my 2p val Emily Dimmer wrote: > Hi, > > I'm annotating a beta subunit of calcium channels. This subunit is an > 'accessory' subunit which determines the formation, processing, > regulation and function of the channel (PMID: 8530407). > Unlike the alpha subunits, the betas do not possess membrane-spanning > domains, but are localized on the cytoplasmic side of the membrane and > are attached to an alpha subunit. > > I've annotated the beta subunit to 'voltage-gated calcium channel > complex' (GO:0005891), however this term is a child of 'integral to > membrane'. Should I be concerned that the parentage of channel term is > not correct for my subunit, and ask for the calcium channel complex > term to have the less granular term 'part of plasma membrane' as a > parent instead? (I've also annotated the beta subunit to: 'internal > side of plasma membrane' (GO:0009898)). > I'm concerned as if someone were to use a GO slim which included the > 'integral to membrane' GO term, my protein would get mapped to this > location - which would be incorrect. > Thanks, > Emily > -- --------------------------------------------------------------------------- Valerie Wood Tel: 01223 496909 S. pombe Genome Project Fax: 01223 494919 Wellcome Trust Sanger Institute email: val at sanger.ac.uk Wellcome Trust Genome Campus http://www.genedb.org/genedb/pombe Hinxton, Cambridge, CB10 1HH http://www.sanger.ac.uk/Projects/S_pombe From camon at ebi.ac.uk Mon Jan 15 08:25:15 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Mon, 15 Jan 2007 16:25:15 +0000 Subject: GO annotation question for protein complex terms. 
References: <45ABA6CE.1000506@ebi.ac.uk> <45ABA923.9060404@sanger.ac.uk> Message-ID: <45ABAAEB.1050907@ebi.ac.uk> Hi Emily I agree with Val...well spotted better to have the complex term moved. Ev Valerie Wood wrote: > > > I would have the complex term moved from under integral to membrane, and > re-annotate the individual subunits which are 'membrane integral' back > to this term. > > my 2p > val > > Emily Dimmer wrote: > >> Hi, >> >> I'm annotating a beta subunit of calcium channels. This subunit is an >> 'accessory' subunit which determines the formation, processing, >> regulation and function of the channel (PMID: 8530407). >> Unlike the alpha subunits, the betas do not possess membrane-spanning >> domains, but are localized on the cytoplasmic side of the membrane and >> are attached to an alpha subunit. >> >> I've annotated the beta subunit to 'voltage-gated calcium channel >> complex' (GO:0005891), however this term is a child of 'integral to >> membrane'. Should I be concerned that the parentage of channel term is >> not correct for my subunit, and ask for the calcium channel complex >> term to have the less granular term 'part of plasma membrane' as a >> parent instead? (I've also annotated the beta subunit to: 'internal >> side of plasma membrane' (GO:0009898)). >> I'm concerned as if someone were to use a GO slim which included the >> 'integral to membrane' GO term, my protein would get mapped to this >> location - which would be incorrect. >> Thanks, >> Emily >> > > -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From dph at informatics.jax.org Mon Jan 15 08:28:57 2007 From: dph at informatics.jax.org (David Hill) Date: Mon, 15 Jan 2007 11:28:57 -0500 Subject: GO annotation question for protein complex terms. 
In-Reply-To: <45ABAAEB.1050907@ebi.ac.uk> References: <45ABA6CE.1000506@ebi.ac.uk> <45ABA923.9060404@sanger.ac.uk> <45ABAAEB.1050907@ebi.ac.uk> Message-ID: <45ABABC9.9020707@informatics.jax.org> I agree too. David Evelyn Camon wrote: > Hi Emily > > I agree with Val...well spotted better to have the complex term moved. > > Ev > > Valerie Wood wrote: >> >> >> I would have the complex term moved from under integral to membrane, >> and re-annotate the individual subunits which are 'membrane integral' >> back to this term. >> >> my 2p >> val >> >> Emily Dimmer wrote: >> >>> Hi, >>> >>> I'm annotating a beta subunit of calcium channels. This subunit is >>> an 'accessory' subunit which determines the formation, processing, >>> regulation and function of the channel (PMID: 8530407). >>> Unlike the alpha subunits, the betas do not possess >>> membrane-spanning domains, but are localized on the cytoplasmic side >>> of the membrane and are attached to an alpha subunit. >>> >>> I've annotated the beta subunit to 'voltage-gated calcium channel >>> complex' (GO:0005891), however this term is a child of 'integral to >>> membrane'. Should I be concerned that the parentage of channel term >>> is not correct for my subunit, and ask for the calcium channel >>> complex term to have the less granular term 'part of plasma >>> membrane' as a parent instead? (I've also annotated the beta subunit >>> to: 'internal side of plasma membrane' (GO:0009898)). >>> I'm concerned as if someone were to use a GO slim which included the >>> 'integral to membrane' GO term, my protein would get mapped to this >>> location - which would be incorrect. >>> Thanks, >>> Emily >>> >> >> > > From kchris at genome.Stanford.EDU Tue Jan 16 10:11:18 2007 From: kchris at genome.Stanford.EDU (Karen Christie) Date: Tue, 16 Jan 2007 10:11:18 -0800 (PST) Subject: GO annotation question for protein complex terms. 
In-Reply-To: <45ABABC9.9020707@informatics.jax.org> References: <45ABA6CE.1000506@ebi.ac.uk> <45ABA923.9060404@sanger.ac.uk> <45ABAAEB.1050907@ebi.ac.uk> <45ABABC9.9020707@informatics.jax.org> Message-ID: I agree that the term 'voltage-gated calcium channel complex' (GO:0005891) does need to be moved up to avoid the true path violation (TPV). However, I was wondering if it would be worth adding some child terms to distinguish a core portion versus accessory subunits, so that the core portion could be given 'integral to membrane' parentage directly. -Karen On Mon, 15 Jan 2007, David Hill wrote: > I agree too. > > David > > Evelyn Camon wrote: >> Hi Emily >> >> I agree with Val...well spotted better to have the complex term moved. >> >> Ev >> >> Valerie Wood wrote: >>> >>> >>> I would have the complex term moved from under integral to membrane, and >>> re-annotate the individual subunits which are 'membrane integral' back to >>> this term. >>> >>> my 2p >>> val >>> >>> Emily Dimmer wrote: >>> >>>> Hi, >>>> >>>> I'm annotating a beta subunit of calcium channels. This subunit is an >>>> 'accessory' subunit which determines the formation, processing, >>>> regulation and function of the channel (PMID: 8530407). >>>> Unlike the alpha subunits, the betas do not possess membrane-spanning >>>> domains, but are localized on the cytoplasmic side of the membrane and >>>> are attached to an alpha subunit. >>>> >>>> I've annotated the beta subunit to 'voltage-gated calcium channel >>>> complex' (GO:0005891), however this term is a child of 'integral to >>>> membrane'. Should I be concerned that the parentage of channel term is >>>> not correct for my subunit, and ask for the calcium channel complex term >>>> to have the less granular term 'part of plasma membrane' as a parent >>>> instead? (I've also annotated the beta subunit to: 'internal side of >>>> plasma membrane' (GO:0009898)). 
>>>> I'm concerned as if someone were to use a GO slim which included the >>>> 'integral to membrane' GO term, my protein would get mapped to this >>>> location - which would be incorrect. >>>> Thanks, >>>> Emily >>>> >>> >>> >> >> > > From val at sanger.ac.uk Tue Jan 16 11:33:33 2007 From: val at sanger.ac.uk (Valerie Wood) Date: Tue, 16 Jan 2007 19:33:33 UT Subject: GO annotation question for protein complex terms. Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070116/248ba7f0/attachment.pl From midori at ebi.ac.uk Wed Jan 17 16:00:07 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Thu, 18 Jan 2007 00:00:07 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701180000.l0I007Y1205248@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070118/37673edb/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070118/37673edb/attachment.pl From dhowe at cs.uoregon.edu Thu Jan 18 15:38:08 2007 From: dhowe at cs.uoregon.edu (Doug howe) Date: Thu, 18 Jan 2007 15:38:08 -0800 Subject: GO vs Phenotype curation Message-ID: <45B004E0.2000900@cs.uoregon.edu> ZFIN curators have the ability to curate GO and phenotype (using PATO) data from publications, and responsible for extracting all data from each paper they curate. Not surprisingly, GO annotations derived from mutant phenotypes are frequently associated with a corresponding phenotype annotations. I was wondering if anyone had any thoughts about the overlap between these two curation areas. For other groups that curate both data types (using PATO or not), how do you handle the overlap? Are IMP GO annotations and Phenotype annotations as redundant as they at first seem? 
Are there any objective criteria for determining what should be a GO annotation vs. what should be a phenotype annotation vs. what should be both?
Has anyone given any thought to how to reduce redundancy of curation effort in this area?

Any discussion welcomed.

-Doug

Doug Howe
ZFIN Scientific Curator

From pj37 at cornell.edu Fri Jan 19 08:08:42 2007
From: pj37 at cornell.edu (Pankaj Jaiswal)
Date: Fri, 19 Jan 2007 11:08:42 -0500
Subject: [Obo-phenotype] GO vs Phenotype curation
In-Reply-To: <45B004E0.2000900@cs.uoregon.edu>
References: <45B004E0.2000900@cs.uoregon.edu>
Message-ID: <45B0ED0A.4020002@cornell.edu>

Hi Doug,

Here is a perspective from Gramene database curators.

Take for example two statements,
Gene-A
#1 -has enzyme activity-X
#2 -mutant/variant form (allele/variant) results in dysfunctional liver

In both cases we are suggesting an observable feature for a gene. #1 is for a functional allele and #2 is for a variant allele. But both statements are in general describing the properties of the gene, which we call phenotype, meaning a feature/trait observed in a set of growth environments with treatments for a given genotype (stock).

In case #2 it was phenotyped at the gross level (anatomy/morphology), which may include growth and development.

For #1 the majority won't agree, but our experts call it a 'molecular phenotype', because we are attributing a property to a gene/allele by doing some evaluations/assays.

So if I instantiate the two cases we will get something like

Object-1 | Object-2 | entity | Attribute | value | evidence (citation) | evidence code

#1
gene-A/allele-1 | protein-product | enzyme-X | activity | present* | PMID:xxxxx | IDA, IMP
gene-A/allele-2 | protein-product | enzyme-X | activity | absent* | PMID:xxxxx | IDA, IMP

* Depending on the genotype the values can differ, e.g. abnormal/normal/increase/decrease/inhibited etc.
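The row layout above can be sketched as a small record type. This is only an illustration under stated assumptions, not Gramene's actual schema: the class name PhenotypeRow, its field names, and the helper values_by_allele are hypothetical, and the data values are the placeholder ones from the rows above (with the '*' footnote marker dropped).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PhenotypeRow:
    """One allele-centric annotation row (hypothetical field names)."""
    object_1: str                    # gene/allele being annotated
    object_2: str                    # e.g. protein-product
    entity: str                      # ontology term or placeholder
    attribute: str                   # e.g. activity
    value: str                       # e.g. present, absent
    citation: str                    # evidence (citation)
    evidence_codes: Tuple[str, ...]  # GO evidence codes


# The two example rows for statement #1.
ROWS: List[PhenotypeRow] = [
    PhenotypeRow("gene-A/allele-1", "protein-product", "enzyme-X",
                 "activity", "present", "PMID:xxxxx", ("IDA", "IMP")),
    PhenotypeRow("gene-A/allele-2", "protein-product", "enzyme-X",
                 "activity", "absent", "PMID:xxxxx", ("IDA", "IMP")),
]


def values_by_allele(rows: List[PhenotypeRow], entity: str,
                     attribute: str) -> Dict[str, str]:
    """Map each allele to the value recorded for one entity/attribute pair."""
    return {r.object_1: r.value
            for r in rows
            if r.entity == entity and r.attribute == attribute}
```

Calling values_by_allele(ROWS, "enzyme-X", "activity") groups the values per allele, which is the allele-centric view the rows are meant to support.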
Similarly
#2
gene-A/allele-1 | anatomical-part | liver | function | normal* | PMID:xxxxx | IDA, IMP
gene-A/allele-2 | anatomical-part | liver | function | abnormal* | PMID:xxxxx | IDA, IMP

In #2 the entity was an anatomy term and in #1 it was a GO function term. So ideally the same database structure can hold the phenotype information independent of the ontology type. It's a different matter how the ontology associations are shared or displayed, but the same annotation/curation, made more thorough by the phenotype description strategy described above, will work for both the existing GO annotations put into the GO database and the PATO-type annotations one wants. In our understanding the detailed annotation must be allele/stock/variant centric and not gene centric.

I hope this helps.

Pankaj

Doug howe wrote:
> ZFIN curators have the ability to curate GO and phenotype (using PATO)
> data from publications, and are responsible for extracting all data from
> each paper they curate. Not surprisingly, GO annotations derived from
> mutant phenotypes are frequently associated with corresponding
> phenotype annotations. I was wondering if anyone had any thoughts about
> the overlap between these two curation areas.
>
> For other groups that curate both data types (using PATO or not), how do
> you handle the overlap?
> Are IMP GO annotations and Phenotype annotations as redundant as they at
> first seem?
> Are there any objective criteria for determining what should be a GO
> annotation vs. what should be a phenotype annotation vs. what should be
> both?
> Has anyone given any thought to how to reduce redundancy of curation
> effort in this area?
>
> Any discussion welcomed.
>
> -Doug
>
> Doug Howe
> ZFIN Scientific Curator
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash.
Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Obo-phenotype mailing list > Obo-phenotype at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/obo-phenotype > > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. +1-607-255-3103 / 4199 fax: +1-607-255-6683 From ma11 at gen.cam.ac.uk Sat Jan 20 03:58:16 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics)) Date: Sat, 20 Jan 2007 11:58:16 +0000 Subject: [Obo-phenotype] GO vs Phenotype curation Message-ID: Doug I am not sure I see this as a major problem. Let us imagine you have a mutation in gene A which lacks the dorsal fin. You _might_ deduce from this (or an author might) that it is appropriate to annotate the corresponding gene as having the GO process fin morphogenesis with an IMP evidence code. That annotation is to the protein (the gene being a proxy for this). But then the particular _allele_ studied would have a PATO annotation entity: fin; quality: absent [or whatever, I am just giving the general idea] These annotations do not seem to be redundant to me Michael From dhowe at cs.uoregon.edu Mon Jan 22 08:37:07 2007 From: dhowe at cs.uoregon.edu (Doug howe) Date: Mon, 22 Jan 2007 08:37:07 -0800 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: References: Message-ID: <45B4E833.6000007@cs.uoregon.edu> Michael, Consider an alternate example that uses a GO term as the entity in the phenotype annotation: A mutation in gene A results in cyclopia. One could annotate gene A with the GO term 'eye development' by IMP and the allele of gene A with a phenotype annotation like entity: eye development + quality:abnormal No? 
-Doug Michael Ashburner (Genetics) wrote: > Doug > > I am not sure I see this as a major problem. > > Let us imagine you have a mutation in gene A which lacks the dorsal fin. > You _might_ deduce from this (or an author might) that it is appropriate to > annotate the corresponding gene as having the GO process > > fin morphogenesis > > with an IMP evidence code. > > That annotation is to the protein (the gene being a proxy for this). > > But then the particular _allele_ studied would have a PATO annotation > > entity: fin; quality: absent [or whatever, I am just giving the general idea] > > These annotations do not seem to be redundant to me > > Michael > > From pj37 at cornell.edu Mon Jan 22 08:48:41 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Mon, 22 Jan 2007 11:48:41 -0500 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: <45B0ED0A.4020002@cornell.edu> References: <45B004E0.2000900@cs.uoregon.edu> <45B0ED0A.4020002@cornell.edu> Message-ID: <45B4EAE9.2060106@cornell.edu> Just a thought that I wanted to add to my previous mail. The current GO annotations and the phenotype annotations are redundant. At least to our understanding. Pankaj Pankaj Jaiswal wrote: > Hi Doug, > > Here is a perspective from Gramene database curators. > > Take for example two statements, > Gene-A > #1 -has enzyme activity-X > #2 -mutant/variant form (allele/variant) results in dysfunctional liver > In both the case we are suggesting an observable feature for a gene. IN > #1 is for a functional allele and in #2 is for a variant allele. But > both the statements are in general describing the properties of the > gene, which we call phenotype. Means an feature/trait observed in a set > of growth environments with treatments for a given genotype (stock). > > In case of #2 it was the phenotyped at gross level (anatomy/morphology) > and may include growth and development. > > In #1 majority won't agree but our experts call it 'molecular > phenotype'. 
Because we are attributing a property to a gene/allele by > doing some evaluations/assays. > So if I instantiate the two case we will get something like > > Object-1 | Object-2 | entity | Attribute | value | evidence (citation) > | evidence code > #1 > gene-A/allele-1 | protein-product | enzyme-X | activity | present* | > PMID:xxxxx | IDA, IMP > gene-A/allele-2 | protein-product | enzyme-X | activity | absent* | > PMID:xxxxx | IDA, IMP > > * Depending on the genotype the values can differ like > abnormal/normal/increase/decrease/inhibited etc. > > Similarly > #2 > gene-A/allele-1 | anatomical-part | liver | function | normal* | > PMID:xxxxx | IDA, IMP > gene-A/allele-2 | anatomical-part | liver | function | abnormal* | > PMID:xxxxx | IDA, IMP > > In #2 the entity was anatomy term and in #1 it was a GO function term. > So ideally the same database structure can hold the phenotype > information independent of the ontology type. Its a different matter how > the ontology associations are shared or displayed, but definitely the > same annotation/curation which is more thorough based on the phenotype > description strategy as described above will work for both the existing > GO annotations put into GO-database and the PATO type annotation one > wants. In our understanding the detail annotation must be > allele/stock/variant centric and not gene centric. > > I hope this helps. > > Pankaj > > > > > Doug howe wrote: > >> ZFIN curators have the ability to curate GO and phenotype (using PATO) >> data from publications, and responsible for extracting all data from >> each paper they curate. Not surprisingly, GO annotations derived from >> mutant phenotypes are frequently associated with a corresponding >> phenotype annotations. I was wondering if anyone had any thoughts about >> the overlap between these two curation areas. >> >> For other groups that curate both data types (using PATO or not), how do >> you handle the overlap? 
>> Are IMP GO annotations and Phenotype annotations as redundant as they at >> first seem? >> Are there any objective criteria for determining what should be a GO >> annotation vs. what should be a phenotype annotation vs. what should be >> both? >> Has anyone cast any thought on how to reduce redundancy of curation >> effort in this area? >> >> Any discussion welcomed. >> >> -Doug >> >> Doug Howe >> ZFIN Scientific Curator >> >> ------------------------------------------------------------------------- >> Take Surveys. Earn Cash. Influence the Future of IT >> Join SourceForge.net's Techsay panel and you'll get the chance to share your >> opinions on IT & business topics through brief surveys - and earn cash >> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV >> _______________________________________________ >> Obo-phenotype mailing list >> Obo-phenotype at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/obo-phenotype >> >> >> > > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. +1-607-255-3103 / 4199 fax: +1-607-255-6683 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070122/f6c8b6dc/attachment.html From pj37 at cornell.edu Mon Jan 22 09:26:14 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Mon, 22 Jan 2007 12:26:14 -0500 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: <45B4E833.6000007@cs.uoregon.edu> References: <45B4E833.6000007@cs.uoregon.edu> Message-ID: <45B4F3B6.1070808@cornell.edu> Sorry for jumping in. The GO and phenotype annotations are essentially the same thing. Only difference is that the way annotations are associated/assembled in the current GO association tables, except for the quality aspect that is new and comes from PATO. In the following example entity is the 'GO_process: eye development'. e.g. 
object: allele-A | GO_process: eye development | Quality: abnormal |Code: IMP | Evidence: PMID-xxxxx The same annotation gets percolated to the gene as well because of its lineage. -Pankaj Doug howe wrote: > Michael, > Consider an alternate example that uses a GO term as the entity in > the phenotype annotation: > > A mutation in gene A results in cyclopia. > One could annotate gene A with the GO term 'eye development' by IMP > and the allele of gene A with a phenotype annotation like entity: eye > development + quality:abnormal > > No? > > -Doug > > Michael Ashburner (Genetics) wrote: > >> Doug >> >> I am not sure I see this as a major problem. >> >> Let us imagine you have a mutation in gene A which lacks the dorsal fin. >> You _might_ deduce from this (or an author might) that it is appropriate to >> annotate the corresponding gene as having the GO process >> >> fin morphogenesis >> >> with an IMP evidence code. >> >> That annotation is to the protein (the gene being a proxy for this). >> >> But then the particular _allele_ studied would have a PATO annotation >> >> entity: fin; quality: absent [or whatever, I am just giving the general idea] >> >> These annotations do not seem to be redundant to me >> >> Michael >> >> >> > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Obo-phenotype mailing list > Obo-phenotype at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/obo-phenotype > > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. 
+1-607-255-3103 / 4199 fax: +1-607-255-6683 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070122/670ccea6/attachment.html From dhowe at cs.uoregon.edu Mon Jan 22 11:42:08 2007 From: dhowe at cs.uoregon.edu (Doug howe) Date: Mon, 22 Jan 2007 11:42:08 -0800 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: <45B4F3B6.1070808@cornell.edu> References: <45B4E833.6000007@cs.uoregon.edu> <45B4F3B6.1070808@cornell.edu> Message-ID: <45B51390.2080502@cs.uoregon.edu> To be clear: I view this largely as a redundancy of doing curatorial work. Not a redundancy of biological knowledge space per se. GO annotation applies to the normal functions, processes, components of wild type gene products. Phenotype describes the characteristics of a specific mutant condition. When a mutant characteristic is used to imply a GO annotation by IMP there is some overlap between making the phenotype annotations and making the GO annotations. -Doug Pankaj Jaiswal wrote: > Sorry for jumping in. The GO and phenotype annotations are essentially > the same thing. Only difference is that the way annotations are > associated/assembled in the current GO association tables, except for > the quality aspect that is new and comes from PATO. In the following > example entity is the 'GO_process: eye development'. > > e.g. > object: allele-A | GO_process: eye development | Quality: abnormal > |Code: IMP | Evidence: PMID-xxxxx > The same annotation gets percolated to the gene as well because of its > lineage. > > -Pankaj > > Doug howe wrote: >> Michael, >> Consider an alternate example that uses a GO term as the entity in >> the phenotype annotation: >> >> A mutation in gene A results in cyclopia. >> One could annotate gene A with the GO term 'eye development' by IMP >> and the allele of gene A with a phenotype annotation like entity: eye >> development + quality:abnormal >> >> No? 
>> >> -Doug >> >> Michael Ashburner (Genetics) wrote: >> >>> Doug >>> >>> I am not sure I see this as a major problem. >>> >>> Let us imagine you have a mutation in gene A which lacks the dorsal fin. >>> You _might_ deduce from this (or an author might) that it is appropriate to >>> annotate the corresponding gene as having the GO process >>> >>> fin morphogenesis >>> >>> with an IMP evidence code. >>> >>> That annotation is to the protein (the gene being a proxy for this). >>> >>> But then the particular _allele_ studied would have a PATO annotation >>> >>> entity: fin; quality: absent [or whatever, I am just giving the general idea] >>> >>> These annotations do not seem to be redundant to me >>> >>> Michael >>> >>> >>> >> >> ------------------------------------------------------------------------- >> Take Surveys. Earn Cash. Influence the Future of IT >> Join SourceForge.net's Techsay panel and you'll get the chance to share your >> opinions on IT & business topics through brief surveys - and earn cash >> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV >> _______________________________________________ >> Obo-phenotype mailing list >> Obo-phenotype at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/obo-phenotype >> >> > > -- > Pankaj Jaiswal > G-15, Bradfield Hall > Dept. of Plant Breeding and Genetics > Cornell University > Ithaca, NY-14853, USA > > Ph. +1-607-255-3103 / 4199 > fax: +1-607-255-6683 From cjm at fruitfly.org Mon Jan 22 12:11:35 2007 From: cjm at fruitfly.org (Chris Mungall) Date: Mon, 22 Jan 2007 12:11:35 -0800 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: <45B4F3B6.1070808@cornell.edu> References: <45B4E833.6000007@cs.uoregon.edu> <45B4F3B6.1070808@cornell.edu> Message-ID: <07CA518A-75F9-40C1-A4B8-CD137A5660C0@fruitfly.org> I think we will have various kinds of overlap and redundancy. 
I think it will be valuable to capture it all (though tools like phenote will of course be able to help curators and suggest annotations based on rules and statistical associations). This isn't any different from GO (eg transcription => nucleus), although due to the nature of phenotype annotation we will have a wider variety of granularities and perspectives. For any genotype that has an allele for where encodes : IF E=

AND

is_a biological_process AND the phenotype is abnormal THEN the wild type gene product participates_in

(note: this rule may be correct, but it may not be complete; it may be possible to use the Q to make a more specific GO annotation) IF E= AND is_a anatomical_entity (including whole cells) AND the phenotype is abnormal THEN the wild type gene product participates in

WHERE

= development_of

The cyclops example is interesting. If a cyclops has a single perfectly functioning good eye would we really say:

E= GO:vertebrate_eye_development[*] Tag= abnormal

Surely the eye development executed as normally as can be expected, it just had the bad luck to be executed as a part_of a larger assemblage of processes such as cranio-facial development? (no suitable term in GO). Perhaps we could also say the overall facial development program execution was abnormal in that it had only one eye development process instance as sub-part, rather than two. I'm guessing curators would find that a strange way to annotate.

Of course, the real subpart of the facial development process is likely to be some upstream process involved in laying down some kind of left-right patterning in a precursor (I'm guessing - don't know much about Cyclopeanism). But this inference may not be supported by the data in the paper. In fact there may not be any developmental data at all; in which case, why not annotate:

E= FMA:Face Q=PATO:having_a_single_part E2= FMA:Eye

(we need to fix this part of PATO - see also http://www.bioontology.org/wiki/index.php/PATO:Absent for why it's better to state it this way than having E= Eye - this is similar to the spermatocytes devoid of asters example)

Here you're stating exactly what you are seeing. Then at a later time someone may do a different experiment with the same gene (or the orthologous gene in another species - or an upstream gene.....) and observe at a more detailed level a disruption in development during an earlier stage of facial development - resulting in both a phenotype annotation and a GO annotation of the wildtype. Which doesn't make the first phenotype annotation redundant - it's all fantastic data for seeding the kinds of analysis that are very run of the mill for GO right now but will be a fair bit more complicated for phenotype annotation.
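The two inference rules stated earlier in this message could be sketched roughly as below. This is a minimal illustration under assumptions, not an existing tool or Phenote feature: it assumes the phenotype entity E is the term each rule tests, it writes the inferred process for an anatomical entity as development_of(E), and the names PhenotypeAnnotation, entity_kind and infer_go_process are hypothetical. As noted for the first rule, the output may be correct without being complete.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PhenotypeAnnotation:
    """A phenotype annotation on an allele (hypothetical structure)."""
    gene: str          # gene whose wild type product is being inferred about
    allele: str        # the mutant allele that was studied
    entity: str        # E: the ontology term the phenotype is about
    entity_kind: str   # "biological_process" or "anatomical_entity"
    abnormal: bool     # True if the phenotype is abnormal


def infer_go_process(p: PhenotypeAnnotation) -> Optional[Tuple[str, str, str]]:
    """Suggest a process annotation for the wild type gene product, or None."""
    if not p.abnormal:
        # Both rules only fire on abnormal phenotypes.
        return None
    if p.entity_kind == "biological_process":
        # Rule 1: the wild type product participates_in the process E.
        return (p.gene, "participates_in", p.entity)
    if p.entity_kind == "anatomical_entity":
        # Rule 2: the wild type product participates in development_of E.
        return (p.gene, "participates_in", f"development_of({p.entity})")
    return None
```

A suggestion produced this way would still need a curator's judgement; the cyclops case above is one where the first rule would fire even though the eye development itself may have run normally.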
So basically - err on the side of redundancy and when we have a decent amount of data lets try some rule mining to see what the best ways of transferring annotations across are.. Cheers Chris On Jan 22, 2007, at 9:26 AM, Pankaj Jaiswal wrote: > Sorry for jumping in. The GO and phenotype annotations are > essentially the same thing. Only difference is that the way > annotations are associated/assembled in the current GO association > tables, except for the quality aspect that is new and comes from > PATO. In the following example entity is the 'GO_process: eye > development'. > > e.g. > object: allele-A | GO_process: eye development | Quality: abnormal | > Code: IMP | Evidence: PMID-xxxxx > The same annotation gets percolated to the gene as well because of > its lineage. > > -Pankaj > > Doug howe wrote: >> Michael, Consider an alternate example that uses a GO term as the >> entity in the phenotype annotation: A mutation in gene A results >> in cyclopia. One could annotate gene A with the GO term 'eye >> development' by IMP and the allele of gene A with a phenotype >> annotation like entity: eye development + quality:abnormal No? - >> Doug Michael Ashburner (Genetics) wrote: >>> Doug I am not sure I see this as a major problem. Let us imagine >>> you have a mutation in gene A which lacks the dorsal fin. You >>> _might_ deduce from this (or an author might) that it is >>> appropriate to annotate the corresponding gene as having the GO >>> process fin morphogenesis with an IMP evidence code. That >>> annotation is to the protein (the gene being a proxy for this). >>> But then the particular _allele_ studied would have a PATO >>> annotation entity: fin; quality: absent [or whatever, I am just >>> giving the general idea] These annotations do not seem to be >>> redundant to me Michael >> --------------------------------------------------------------------- >> ---- Take Surveys. Earn Cash. 
Influence the Future of IT Join >> SourceForge.net's Techsay panel and you'll get the chance to share >> your opinions on IT & business topics through brief surveys - and >> earn cash http://www.techsay.com/default.php? >> page=join.php&p=sourceforge&CID=DEVDEV >> _______________________________________________ Obo-phenotype >> mailing list Obo-phenotype at lists.sourceforge.net https:// >> lists.sourceforge.net/lists/listinfo/obo-phenotype > > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and > Genetics Cornell University Ithaca, NY-14853, USA Ph. > +1-607-255-3103 / 4199 fax: +1-607-255-6683 > ---------------------------------------------------------------------- > --- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to > share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php? > page=join.php&p=sourceforge&CID=DEVDEV________________________________ > _______________ > Obo-phenotype mailing list > Obo-phenotype at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/obo-phenotype From pj37 at cornell.edu Mon Jan 22 12:43:37 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Mon, 22 Jan 2007 15:43:37 -0500 Subject: [Obo-phenotype] GO vs Phenotype curation In-Reply-To: <45B51390.2080502@cs.uoregon.edu> References: <45B4E833.6000007@cs.uoregon.edu> <45B4F3B6.1070808@cornell.edu> <45B51390.2080502@cs.uoregon.edu> Message-ID: <45B521F9.9060500@cornell.edu> Doug howe wrote: > To be clear: I view this largely as a redundancy of doing curatorial > work. Not a redundancy of biological knowledge space per se. ! > GO > annotation applies to the normal functions, processes, components of > wild type gene products. This is a tricky situation. 
The moment we say 'wild type normal' we are referring to the so-called wild type allele, or the allele that was sequenced (from the sequenced stock/accession) as part of the genome sequencing effort. Often we have found that major phenotype genes/loci are missing from these sequenced genomes. Let's say, in the case of rice, the genome was sequenced from the Nipponbare variety, but it lacks the functional genes (loci) that provide many types of disease resistance. In that case, if one has to curate these loci (CDSs) in the genome to 'response to pathogen', a biological process term, we cannot, because it's not true for the loci from the sequenced genome. However, we do want to associate these loci with disease resistance. That can be done if I have read the functional allele paper, which will tell me that the real functional locus is the resistant form that comes from a different variety.

I may be overstating, but it's very superficial to say what is a wild type. It is driven by the experiment. In the example I gave above it's not clear, which is why use of a reference set is recommended. Also, a gene in the true sense is just a placeholder for all the phenotypes (at macro/micro levels), including GO annotations, at a particular locus of a genome. Whether a gene product has a function or not is genotype dependent, i.e. it depends on the allele or the genetic background.

> Phenotype describes the characteristics of a
> specific mutant condition. When a mutant characteristic is used to
> imply a GO annotation by IMP there is some overlap between making the
> phenotype annotations and making the GO annotations.
>

Agreed. In order to associate many GO biological_process term entities based on the abnormal eye development example, I am sure one won't be able to ascertain the function of the so-called wild type (the reference allele) unless one has looked at the non-functional/variant form/allele.
That's why my suggestion (in agreement with you) was, in order to reduce the curational load and avoid redundancy we need a curation strategy that is driven by gene products with reference to their variant/mutant/allele forms and their source stock/germplasm accession and not by their generic counterpart 'gene'. -Pankaj > -Doug > > Pankaj Jaiswal wrote: > >> Sorry for jumping in. The GO and phenotype annotations are essentially >> the same thing. Only difference is that the way annotations are >> associated/assembled in the current GO association tables, except for >> the quality aspect that is new and comes from PATO. In the following >> example entity is the 'GO_process: eye development'. >> >> e.g. >> object: allele-A | GO_process: eye development | Quality: abnormal >> |Code: IMP | Evidence: PMID-xxxxx >> The same annotation gets percolated to the gene as well because of its >> lineage. >> >> -Pankaj >> >> Doug howe wrote: >> >>> Michael, >>> Consider an alternate example that uses a GO term as the entity in >>> the phenotype annotation: >>> >>> A mutation in gene A results in cyclopia. >>> One could annotate gene A with the GO term 'eye development' by IMP >>> and the allele of gene A with a phenotype annotation like entity: eye >>> development + quality:abnormal >>> >>> No? >>> >>> -Doug >>> >>> Michael Ashburner (Genetics) wrote: >>> >>> >>>> Doug >>>> >>>> I am not sure I see this as a major problem. >>>> >>>> Let us imagine you have a mutation in gene A which lacks the dorsal fin. >>>> You _might_ deduce from this (or an author might) that it is appropriate to >>>> annotate the corresponding gene as having the GO process >>>> >>>> fin morphogenesis >>>> >>>> with an IMP evidence code. >>>> >>>> That annotation is to the protein (the gene being a proxy for this). 
>>>> >>>> But then the particular _allele_ studied would have a PATO annotation >>>> >>>> entity: fin; quality: absent [or whatever, I am just giving the general idea] >>>> >>>> These annotations do not seem to be redundant to me >>>> >>>> Michael >>>> >>>> >>>> >>>> >>> ------------------------------------------------------------------------- >>> Take Surveys. Earn Cash. Influence the Future of IT >>> Join SourceForge.net's Techsay panel and you'll get the chance to share your >>> opinions on IT & business topics through brief surveys - and earn cash >>> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV >>> _______________________________________________ >>> Obo-phenotype mailing list >>> Obo-phenotype at lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/obo-phenotype >>> >>> >>> >> -- >> Pankaj Jaiswal >> G-15, Bradfield Hall >> Dept. of Plant Breeding and Genetics >> Cornell University >> Ithaca, NY-14853, USA >> >> Ph. +1-607-255-3103 / 4199 >> fax: +1-607-255-6683 >> > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Obo-phenotype mailing list > Obo-phenotype at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/obo-phenotype > > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. +1-607-255-3103 / 4199 fax: +1-607-255-6683 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070122/fb180777/attachment.html From midori at ebi.ac.uk Mon Jan 22 16:00:07 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Tue, 23 Jan 2007 00:00:07 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701230000.l0N007t1103530@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070123/75b3481c/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070123/75b3481c/attachment.pl From jane at ebi.ac.uk Thu Jan 25 03:59:29 2007 From: jane at ebi.ac.uk (Jane Lomax) Date: Thu, 25 Jan 2007 11:59:29 +0000 Subject: GO annotation question for protein complex terms. In-Reply-To: References: Message-ID: <2687E1EA-E81D-4506-B830-9EC745254BBE@ebi.ac.uk> Has this been added to SF? Or has someone implemented? Jane On 16 Jan 2007, at 19:33, Valerie Wood wrote: > That would be better (it would take the onus off the curator to > make the appropriate membrane annotations) > Val > > > Karen Christie wrote: >> I agree that the term 'voltage-gated calcium channel complex' (GO: >> 0005891) >> does need to be moved up to avoid the TPV. >> >> However, I was wondering if it would be worth adding some child >> terms to >> distinguish a core portion versus accessory subunits, so that the >> core >> portion could be given 'integral to membrane' parentage directly. >> >> -Karen >> >> >> >> On Mon, 15 Jan 2007, David Hill wrote: >> >>> I agree too. >>> >>> David >>> >>> Evelyn Camon wrote: >>>> Hi Emily >>>> >>>> I agree with Val...well spotted better to have the complex term >>>> moved. 
>>>> >>>> Ev >>>> >>>> Valerie Wood wrote: >>>>> >>>>> >>>>> I would have the complex term moved from under integral to >>>>> membrane, and >>>>> re-annotate the individual subunits which are 'membrane >>>>> integral' back to >>>>> this term. >>>>> >>>>> my 2p >>>>> val >>>>> >>>>> Emily Dimmer wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I'm annotating a beta subunit of calcium channels. This >>>>>> subunit is an >>>>>> 'accessory' subunit which determines the formation, processing, >>>>>> regulation and function of the channel (PMID: 8530407). >>>>>> Unlike the alpha subunits, the betas do not possess membrane- >>>>>> spanning >>>>>> domains, but are localized on the cytoplasmic side of the >>>>>> membrane and >>>>>> are attached to an alpha subunit. >>>>>> >>>>>> I've annotated the beta subunit to 'voltage-gated calcium channel >>>>>> complex' (GO:0005891), however this term is a child of >>>>>> 'integral to >>>>>> membrane'. Should I be concerned that the parentage of channel >>>>>> term is >>>>>> not correct for my subunit, and ask for the calcium channel >>>>>> complex term >>>>>> to have the less granular term 'part of plasma membrane' as a >>>>>> parent >>>>>> instead? (I've also annotated the beta subunit to: 'internal >>>>>> side of >>>>>> plasma membrane' (GO:0009898)). >>>>>> I'm concerned as if someone were to use a GO slim which >>>>>> included the >>>>>> 'integral to membrane' GO term, my protein would get mapped to >>>>>> this >>>>>> location - which would be incorrect. >>>>>> Thanks, >>>>>> Emily >>>>>> >>>>> >>>>> >>>> >>>> >>> >>> > From kchris at genome.Stanford.EDU Thu Jan 25 10:58:12 2007 From: kchris at genome.Stanford.EDU (Karen Christie) Date: Thu, 25 Jan 2007 10:58:12 -0800 (PST) Subject: GO annotation question for protein complex terms. 
In-Reply-To: <2687E1EA-E81D-4506-B830-9EC745254BBE@ebi.ac.uk> References: <2687E1EA-E81D-4506-B830-9EC745254BBE@ebi.ac.uk> Message-ID: I posted the same comment I sent on email to the SF item, admittedly after it had been closed, but assumed that the person who made the original change would get the email and act on it. Should I have unclosed the SF item, or made a separate item? -Karen On Thu, 25 Jan 2007, Jane Lomax wrote: > Has this been added to SF? Or has someone implemented? > > Jane > > > On 16 Jan 2007, at 19:33, Valerie Wood wrote: > >> That would be better (it would take the onus off the curator to make the >> appropriate membrane annotations) >> Val >> >> >> Karen Christie wrote: >>> I agree that the term 'voltage-gated calcium channel complex' (GO:0005891) >>> does need to be moved up to avoid the TPV. >>> >>> However, I was wondering if it would be worth adding some child terms to >>> distinguish a core portion versus accessory subunits, so that the core >>> portion could be given 'integral to membrane' parentage directly. >>> >>> -Karen >>> >>> >>> >>> On Mon, 15 Jan 2007, David Hill wrote: >>> >>>> I agree too. >>>> >>>> David >>>> >>>> Evelyn Camon wrote: >>>>> Hi Emily >>>>> >>>>> I agree with Val...well spotted better to have the complex term moved. >>>>> >>>>> Ev >>>>> >>>>> Valerie Wood wrote: >>>>>> >>>>>> >>>>>> I would have the complex term moved from under integral to membrane, >>>>>> and >>>>>> re-annotate the individual subunits which are 'membrane integral' back >>>>>> to >>>>>> this term. >>>>>> >>>>>> my 2p >>>>>> val >>>>>> >>>>>> Emily Dimmer wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I'm annotating a beta subunit of calcium channels. This subunit is an >>>>>>> 'accessory' subunit which determines the formation, processing, >>>>>>> regulation and function of the channel (PMID: 8530407). 
>>>>>>> Unlike the alpha subunits, the betas do not possess membrane-spanning >>>>>>> domains, but are localized on the cytoplasmic side of the membrane and >>>>>>> are attached to an alpha subunit. >>>>>>> >>>>>>> I've annotated the beta subunit to 'voltage-gated calcium channel >>>>>>> complex' (GO:0005891), however this term is a child of 'integral to >>>>>>> membrane'. Should I be concerned that the parentage of channel term is >>>>>>> not correct for my subunit, and ask for the calcium channel complex >>>>>>> term >>>>>>> to have the less granular term 'part of plasma membrane' as a parent >>>>>>> instead? (I've also annotated the beta subunit to: 'internal side of >>>>>>> plasma membrane' (GO:0009898)). >>>>>>> I'm concerned as if someone were to use a GO slim which included the >>>>>>> 'integral to membrane' GO term, my protein would get mapped to this >>>>>>> location - which would be incorrect. >>>>>>> Thanks, >>>>>>> Emily >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> > From midori at ebi.ac.uk Fri Jan 26 02:08:56 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Fri, 26 Jan 2007 10:08:56 +0000 (GMT) Subject: GO annotation question for protein complex terms. In-Reply-To: References: <2687E1EA-E81D-4506-B830-9EC745254BBE@ebi.ac.uk> Message-ID: It's not a big deal, but it will help to reopen the item -- it's all too easy to forget/overlook still-closed items. m On Thu, 25 Jan 2007, Karen Christie wrote: > I posted the same comment I sent on email to the SF item, admittedly after it > had been closed, but assumed that the person who made the original change > would get the email and act on it. Should I have unclosed the SF item, or > made a separate item? > > -Karen > > > On Thu, 25 Jan 2007, Jane Lomax wrote: > >> Has this been added to SF? Or has someone implemented? 
>> >> Jane >> >> >> On 16 Jan 2007, at 19:33, Valerie Wood wrote: >> >>> That would be better (it would take the onus off the curator to make the >>> appropriate membrane annotations) >>> Val >>> >>> >>> Karen Christie wrote: >>>> I agree that the term 'voltage-gated calcium channel complex' >>>> (GO:0005891) >>>> does need to be moved up to avoid the TPV. >>>> >>>> However, I was wondering if it would be worth adding some child terms to >>>> distinguish a core portion versus accessory subunits, so that the core >>>> portion could be given 'integral to membrane' parentage directly. >>>> >>>> -Karen >>>> >>>> >>>> >>>> On Mon, 15 Jan 2007, David Hill wrote: >>>> >>>>> I agree too. >>>>> >>>>> David >>>>> >>>>> Evelyn Camon wrote: >>>>>> Hi Emily >>>>>> >>>>>> I agree with Val...well spotted better to have the complex term moved. >>>>>> >>>>>> Ev >>>>>> >>>>>> Valerie Wood wrote: >>>>>>> >>>>>>> >>>>>>> I would have the complex term moved from under integral to membrane, >>>>>>> and >>>>>>> re-annotate the individual subunits which are 'membrane integral' back >>>>>>> to >>>>>>> this term. >>>>>>> >>>>>>> my 2p >>>>>>> val >>>>>>> >>>>>>> Emily Dimmer wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm annotating a beta subunit of calcium channels. This subunit is an >>>>>>>> 'accessory' subunit which determines the formation, processing, >>>>>>>> regulation and function of the channel (PMID: 8530407). >>>>>>>> Unlike the alpha subunits, the betas do not possess membrane-spanning >>>>>>>> domains, but are localized on the cytoplasmic side of the membrane >>>>>>>> and >>>>>>>> are attached to an alpha subunit. >>>>>>>> >>>>>>>> I've annotated the beta subunit to 'voltage-gated calcium channel >>>>>>>> complex' (GO:0005891), however this term is a child of 'integral to >>>>>>>> membrane'. 
Should I be concerned that the parentage of channel term >>>>>>>> is >>>>>>>> not correct for my subunit, and ask for the calcium channel complex >>>>>>>> term >>>>>>>> to have the less granular term 'part of plasma membrane' as a parent >>>>>>>> instead? (I've also annotated the beta subunit to: 'internal side of >>>>>>>> plasma membrane' (GO:0009898)). >>>>>>>> I'm concerned as if someone were to use a GO slim which included the >>>>>>>> 'integral to membrane' GO term, my protein would get mapped to this >>>>>>>> location - which would be incorrect. >>>>>>>> Thanks, >>>>>>>> Emily >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >> > From kchris at genome.Stanford.EDU Fri Jan 26 13:43:40 2007 From: kchris at genome.Stanford.EDU (Karen Christie) Date: Fri, 26 Jan 2007 13:43:40 -0800 (PST) Subject: GO annotation question for protein complex terms. In-Reply-To: References: <2687E1EA-E81D-4506-B830-9EC745254BBE@ebi.ac.uk> Message-ID: OK, item is reopened. -Karen On Fri, 26 Jan 2007, Midori Harris wrote: > It's not a big deal, but it will help to reopen the item -- it's all too easy > to forget/overlook still-closed items. > > m > > On Thu, 25 Jan 2007, Karen Christie wrote: > >> I posted the same comment I sent on email to the SF item, admittedly after >> it had been closed, but assumed that the person who made the original >> change would get the email and act on it. Should I have unclosed the SF >> item, or made a separate item? >> >> -Karen >> >> >> On Thu, 25 Jan 2007, Jane Lomax wrote: >> >>> Has this been added to SF? Or has someone implemented? >>> >>> Jane >>> >>> >>> On 16 Jan 2007, at 19:33, Valerie Wood wrote: >>> >>>> That would be better (it would take the onus off the curator to make the >>>> appropriate membrane annotations) >>>> Val >>>> >>>> >>>> Karen Christie wrote: >>>>> I agree that the term 'voltage-gated calcium channel complex' >>>>> (GO:0005891) >>>>> does need to be moved up to avoid the TPV. 
>>>>> >>>>> However, I was wondering if it would be worth adding some child terms to >>>>> distinguish a core portion versus accessory subunits, so that the core >>>>> portion could be given 'integral to membrane' parentage directly. >>>>> >>>>> -Karen >>>>> >>>>> >>>>> >>>>> On Mon, 15 Jan 2007, David Hill wrote: >>>>> >>>>>> I agree too. >>>>>> >>>>>> David >>>>>> >>>>>> Evelyn Camon wrote: >>>>>>> Hi Emily >>>>>>> >>>>>>> I agree with Val...well spotted better to have the complex term moved. >>>>>>> >>>>>>> Ev >>>>>>> >>>>>>> Valerie Wood wrote: >>>>>>>> >>>>>>>> >>>>>>>> I would have the complex term moved from under integral to membrane, >>>>>>>> and >>>>>>>> re-annotate the individual subunits which are 'membrane integral' >>>>>>>> back to >>>>>>>> this term. >>>>>>>> >>>>>>>> my 2p >>>>>>>> val >>>>>>>> >>>>>>>> Emily Dimmer wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm annotating a beta subunit of calcium channels. This subunit is >>>>>>>>> an >>>>>>>>> 'accessory' subunit which determines the formation, processing, >>>>>>>>> regulation and function of the channel (PMID: 8530407). >>>>>>>>> Unlike the alpha subunits, the betas do not possess >>>>>>>>> membrane-spanning >>>>>>>>> domains, but are localized on the cytoplasmic side of the membrane >>>>>>>>> and >>>>>>>>> are attached to an alpha subunit. >>>>>>>>> >>>>>>>>> I've annotated the beta subunit to 'voltage-gated calcium channel >>>>>>>>> complex' (GO:0005891), however this term is a child of 'integral to >>>>>>>>> membrane'. Should I be concerned that the parentage of channel term >>>>>>>>> is >>>>>>>>> not correct for my subunit, and ask for the calcium channel complex >>>>>>>>> term >>>>>>>>> to have the less granular term 'part of plasma membrane' as a parent >>>>>>>>> instead? (I've also annotated the beta subunit to: 'internal side of >>>>>>>>> plasma membrane' (GO:0009898)). 
>>>>>>>>> I'm concerned as if someone were to use a GO slim which included the >>>>>>>>> 'integral to membrane' GO term, my protein would get mapped to this >>>>>>>>> location - which would be incorrect. >>>>>>>>> Thanks, >>>>>>>>> Emily >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>> >> > From midori at ebi.ac.uk Fri Jan 26 16:00:04 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Sat, 27 Jan 2007 00:00:04 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701270000.l0R005v1315061@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070127/655a7226/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070127/655a7226/attachment.pl From val at sanger.ac.uk Mon Jan 29 02:18:56 2007 From: val at sanger.ac.uk (Valerie Wood) Date: Mon, 29 Jan 2007 10:18:56 +0000 Subject: EMP70 family question Message-ID: <45BDCA10.8050205@sanger.ac.uk> I notice that quite a few members of this family are annotated to transporter activity, but I can't find any experimental evidence? ISS S. cerevisiae (EMP70) fly rat RCA mouse in dicty it is annotated as a TM receptor (TAS) I can't track any experimental evidence for the 'transporter activity' which isn't based on speculation because of the presence of TM domains. Anybody got any data for this? Thanks Val -- --------------------------------------------------------------------------- Valerie Wood Tel: 01223 496909 S.
pombe Genome Project Fax: 01223 494919 Wellcome Trust Sanger Institute email: val at sanger.ac.uk Wellcome Trust Genome Campus http://www.genedb.org/genedb/pombe Hinxton, Cambridge, CB10 1HH http://www.sanger.ac.uk/Projects/S_pombe From sart2 at gen.cam.ac.uk Mon Jan 29 03:16:50 2007 From: sart2 at gen.cam.ac.uk (Susan Tweedie) Date: Mon, 29 Jan 2007 11:16:50 +0000 Subject: EMP70 family question In-Reply-To: <45BDCA10.8050205@sanger.ac.uk> References: <45BDCA10.8050205@sanger.ac.uk> Message-ID: <1170069410.4663.16.camel@paul.gen.cam.ac.uk> Haven't looked thoroughly but can see we have old transporter activity ISS annotations for CG10590 and CG7364 to S. cerevisiae (EMP70) which is itself an ISS annotation without a with. I'll take out our annotations unless anyone else finds expt evidence. Can't find any experimental characterisation of the fly proteins. Susan On Mon, 2007-01-29 at 10:18 +0000, Valerie Wood wrote: > > I notice that quite a few members of this family are annotated to > transporter activity, but I can't find any experimental evidence? > > ISS > S. cerevisiae (EMP70) > fly > rat > > RCA > mouse > > in dicty it is annotated as a TM receptor (TAS) > > I can't track any experimental evidence for the 'transporter activity' which isn't based on speculation because of the presence of TM domains. > > Anybody got any data for this? > > Thanks > > Val > From rama at genome.Stanford.EDU Mon Jan 29 10:38:28 2007 From: rama at genome.Stanford.EDU (Rama Balakrishnan) Date: Mon, 29 Jan 2007 10:38:28 -0800 Subject: EMP70 family question In-Reply-To: <1170069410.4663.16.camel@paul.gen.cam.ac.uk> References: <45BDCA10.8050205@sanger.ac.uk> <1170069410.4663.16.camel@paul.gen.cam.ac.uk> Message-ID: <14C86AB8-7581-446C-9812-27B062469474@genome.stanford.edu> I agree. EMP70 being a transporter is a speculation in the paper we have used to make the annotation. I will update the yeast annotations.
Rama On Jan 29, 2007, at 3:16 AM, Susan Tweedie wrote: > Haven't looked thoroughly but can see we have old transporter activity > ISS annotations for CG10590 and CG7364 to S. cerevisiae (EMP70) > which is > itself an ISS annotation without a with. I'll take out our annotations > unless anyone else finds expt evidence. Can't find any experimental > characterisation of the fly proteins. > > Susan > > On Mon, 2007-01-29 at 10:18 +0000, Valerie Wood wrote: >> >> I notice that quite a few members of this family are annotated to >> transporter activity, but I can't find any experimental evidence? >> >> ISS >> S. cerevisiae (EMP70) >> fly >> rat >> >> RCA >> mouse >> >> in dicty it is annotated as a TM receptor (TAS) >> >> I can't track any experimental evidence for the 'transporter >> activity' which isn't based on speculation because of the >> presence of TM domains. >> >> Anybody got any data for this? >> >> Thanks >> >> Val >> From midori at ebi.ac.uk Mon Jan 29 16:00:09 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Tue, 30 Jan 2007 00:00:09 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701300000.l0U00911404560@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070130/53489cd6/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070130/53489cd6/attachment.pl From pj37 at cornell.edu Tue Jan 30 12:40:04 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Tue, 30 Jan 2007 15:40:04 -0500 Subject: use of Uniprot accession Vs GenBank Accession in With column Message-ID: <45BFAD24.9020706@cornell.edu> Hi Everyone, I know it is an accepted SOP to include either the Uniprot accession number or the individual database's own gene/protein ID in the WITH column of the association tables.
However, in practice it seems to be too much work to find the Uniprot entry, because DDBJ and GenBank often do not Xref each other using the Uniprot accession. The best alternative is to use the GenBank accession number, which I see almost all the databases, including Uniprot, DDBJ, EMBL, PIR etc., use to cross-refer. It is also the most suitable ID for finding the particular nucleotide/protein accession that we are looking for using the same query, no matter which db is queried. I hope you would consider my request to adopt the GenBank accession number, unless there is a better option. Thanks Pankaj -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. +1-607-255-3103 / 4199 fax: +1-607-255-6683 From kchris at genome.Stanford.EDU Tue Jan 30 13:42:21 2007 From: kchris at genome.Stanford.EDU (Karen Christie) Date: Tue, 30 Jan 2007 13:42:21 -0800 (PST) Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: <45BFAD24.9020706@cornell.edu> References: <45BFAD24.9020706@cornell.edu> Message-ID: Hi Pankaj, GenBank IDs are already allowed in the with column. The main requirement is that the abbreviation (or namespace) for the source of the ID be included in the GO.xrf_abbs file. There is already an entry for IDs coming from GenBank/DDBJ/EMBL, so these IDs are already permissible.
-Karen abbreviation: EMBL database: International Nucleotide Sequence Database Collaboration, comprising EMBL-EBI International Nucleotide Sequence Data Library (EMBL-Bank), DNA DataBank of Japan (DDBJ), and NCBI GenBank object: Sequence accession number example_id: EMBL:AA816246 example_id: DDBJ:AA816246 example_id: GB:AA816246 synonym: DDBJ synonym: GB synonym: GenBank generic_url: http://www.ebi.ac.uk/embl/ generic_url: http://www.ddbj.nig.ac.jp/ generic_url: http://www.ncbi.nlm.nih.gov/Genbank/ On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: > Hi Everyone, > > I know it is an accepted SOP to include either the Uniprot accession number > or the individual database's own gene/protein ID in the WITH column of the > association tables. > > > However while doing it it seems that it is too much of the work to find out > what is the Uniprot entry, because often the DDBJ and GenBank do not Xref > each other using the Uniprot accession. However the best alternative is to > use the GenBank's Accession number. Which I see that almost all the databases > including Uniprot, DDBJ, EMBL, PIR etc. use it to cross refer. It is also the > most suitable ID used to find the particular nucleotide/protein accession > that we are looking for using the same query, no matter which db is queried. > > I hope you would consider my request by adopting the GenBank's accession > number, unless there is a better option. > > Thanks > Pankaj > -- > Pankaj Jaiswal > G-15, Bradfield Hall > Dept. of Plant Breeding and Genetics > Cornell University > Ithaca, NY-14853, USA > > Ph. 
+1-607-255-3103 / 4199 > fax: +1-607-255-6683 > From rama at genome.Stanford.EDU Tue Jan 30 13:41:27 2007 From: rama at genome.Stanford.EDU (Rama Balakrishnan) Date: Tue, 30 Jan 2007 13:41:27 -0800 Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: <45BFAD24.9020706@cornell.edu> References: <45BFAD24.9020706@cornell.edu> Message-ID: <8CBAB3E4-6B2C-49BF-86E2-5146AEBA9F3C@genome.stanford.edu> Pankaj, We don't explicitly say that we shouldn't use Genbank IDs for the with column in the GO documentation. There might be other issues with using GenBank ids which I am not aware of. Thanks, Rama On Jan 30, 2007, at 12:40 PM, Pankaj Jaiswal wrote: > Hi Everyone, > > I know it is an accepted SOP to include either the Uniprot > accession number or the individual database's own gene/protein ID > in the WITH column of the association tables. > > > However while doing it it seems that it is too much of the work to > find out what is the Uniprot entry, because often the DDBJ and > GenBank do not Xref each other using the Uniprot accession. However > the best alternative is to use the GenBank's Accession number. > Which I see that almost all the databases including Uniprot, DDBJ, > EMBL, PIR etc. use it to cross refer. It is also the most suitable > ID used to find the particular nucleotide/protein accession that we > are looking for using the same query, no matter which db is queried. > > I hope you would consider my request by adopting the GenBank's > accession number, unless there is a better option. > > Thanks > Pankaj > -- > Pankaj Jaiswal > G-15, Bradfield Hall > Dept. of Plant Breeding and Genetics > Cornell University > Ithaca, NY-14853, USA > > Ph. 
+1-607-255-3103 / 4199 > fax: +1-607-255-6683 From pj37 at cornell.edu Tue Jan 30 13:48:37 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Tue, 30 Jan 2007 16:48:37 -0500 Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: References: <45BFAD24.9020706@cornell.edu> Message-ID: <45BFBD35.4000805@cornell.edu> Got it. We will use the GB one. BTW GenBank ID is different than the GenBank Accession. GenBank ID is the ID exclusive for the GenBank database entry. One GB accession can have mappings to several GenBank IDs. Pankaj Karen Christie wrote: > Hi Pankaj, > > GenBank IDs are already allowed in the with column. The main requirement > is that the abbreviation (or namespace) for the source of the ID be > included in the GO.xrf_abbs file. There is already an entry for IDs > coming from GenBank/DDBJ/EMBL, so these IDs are already permissable. > > -Karen > > > abbreviation: EMBL > database: International Nucleotide Sequence Database Collaboration, > comprising EMBL-EBI International Nucleotide Sequence Data Library > (EMBL-Bank), DNA DataBank of Japan (DDBJ), and NCBI GenBank > object: Sequence accession number > example_id: EMBL:AA816246 > example_id: DDBJ:AA816246 > example_id: GB:AA816246 > synonym: DDBJ > synonym: GB > synonym: GenBank > generic_url: http://www.ebi.ac.uk/embl/ > generic_url: http://www.ddbj.nig.ac.jp/ > generic_url: http://www.ncbi.nlm.nih.gov/Genbank/ > > > On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: > >> Hi Everyone, >> >> I know it is an accepted SOP to include either the Uniprot accession >> number or the individual database's own gene/protein ID in the WITH >> column of the association tables. >> >> >> However while doing it it seems that it is too much of the work to >> find out what is the Uniprot entry, because often the DDBJ and GenBank >> do not Xref each other using the Uniprot accession. However the best >> alternative is to use the GenBank's Accession number. 
Which I see that >> almost all the databases including Uniprot, DDBJ, EMBL, PIR etc. use >> it to cross refer. It is also the most suitable ID used to find the >> particular nucleotide/protein accession that we are looking for using >> the same query, no matter which db is queried. >> >> I hope you would consider my request by adopting the GenBank's >> accession number, unless there is a better option. >> >> Thanks >> Pankaj >> -- >> Pankaj Jaiswal >> G-15, Bradfield Hall >> Dept. of Plant Breeding and Genetics >> Cornell University >> Ithaca, NY-14853, USA >> >> Ph. +1-607-255-3103 / 4199 >> fax: +1-607-255-6683 >> > -- Pankaj Jaiswal G-15, Bradfield Hall Dept. of Plant Breeding and Genetics Cornell University Ithaca, NY-14853, USA Ph. +1-607-255-3103 / 4199 fax: +1-607-255-6683 From kchris at genome.Stanford.EDU Tue Jan 30 13:55:06 2007 From: kchris at genome.Stanford.EDU (Karen Christie) Date: Tue, 30 Jan 2007 13:55:06 -0800 (PST) Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: <45BFBD35.4000805@cornell.edu> References: <45BFAD24.9020706@cornell.edu> <45BFBD35.4000805@cornell.edu> Message-ID: Note that the abbreviation selected by GO for the IDs for GenBank, DDBJ, and EMBL is EMBL, so that's the namespace that needs to be used in the gene_association files for GO. -Karen On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: > Got it. We will use the GB one. > > BTW GenBank ID is different than the GenBank Accession. GenBank ID is the ID > exclusive for the GenBank database entry. One GB accession can have mappings > to several GenBank IDs. > > Pankaj > > Karen Christie wrote: >> Hi Pankaj, >> >> GenBank IDs are already allowed in the with column. The main requirement is >> that the abbreviation (or namespace) for the source of the ID be included >> in the GO.xrf_abbs file. There is already an entry for IDs coming from >> GenBank/DDBJ/EMBL, so these IDs are already permissable. 
>> >> -Karen >> >> >> abbreviation: EMBL >> database: International Nucleotide Sequence Database Collaboration, >> comprising EMBL-EBI International Nucleotide Sequence Data Library >> (EMBL-Bank), DNA DataBank of Japan (DDBJ), and NCBI GenBank >> object: Sequence accession number >> example_id: EMBL:AA816246 >> example_id: DDBJ:AA816246 >> example_id: GB:AA816246 >> synonym: DDBJ >> synonym: GB >> synonym: GenBank >> generic_url: http://www.ebi.ac.uk/embl/ >> generic_url: http://www.ddbj.nig.ac.jp/ >> generic_url: http://www.ncbi.nlm.nih.gov/Genbank/ >> >> >> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: >> >>> Hi Everyone, >>> >>> I know it is an accepted SOP to include either the Uniprot accession >>> number or the individual database's own gene/protein ID in the WITH column >>> of the association tables. >>> >>> >>> However while doing it it seems that it is too much of the work to find >>> out what is the Uniprot entry, because often the DDBJ and GenBank do not >>> Xref each other using the Uniprot accession. However the best alternative >>> is to use the GenBank's Accession number. Which I see that almost all the >>> databases including Uniprot, DDBJ, EMBL, PIR etc. use it to cross refer. >>> It is also the most suitable ID used to find the particular >>> nucleotide/protein accession that we are looking for using the same query, >>> no matter which db is queried. >>> >>> I hope you would consider my request by adopting the GenBank's accession >>> number, unless there is a better option. >>> >>> Thanks >>> Pankaj >>> -- >>> Pankaj Jaiswal >>> G-15, Bradfield Hall >>> Dept. of Plant Breeding and Genetics >>> Cornell University >>> Ithaca, NY-14853, USA >>> >>> Ph. +1-607-255-3103 / 4199 >>> fax: +1-607-255-6683 >>> >> > > -- > Pankaj Jaiswal > G-15, Bradfield Hall > Dept. of Plant Breeding and Genetics > Cornell University > Ithaca, NY-14853, USA > > Ph. 
+1-607-255-3103 / 4199 > fax: +1-607-255-6683 > From pj37 at cornell.edu Tue Jan 30 13:56:48 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Tue, 30 Jan 2007 16:56:48 -0500 Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: References: <45BFAD24.9020706@cornell.edu> <45BFBD35.4000805@cornell.edu> Message-ID: <45BFBF20.9070203@cornell.edu> Got it. Thanks Pankaj Karen Christie wrote: > Note that the abbreviation selected by GO for the IDs for GenBank, DDBJ, > and EMBL is EMBL, so that's the namespace that needs to be used in the > gene_association files for GO. > > -Karen > From midori at ebi.ac.uk Tue Jan 30 16:00:07 2007 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Wed, 31 Jan 2007 00:00:07 UT Subject: SourceForge Annotation Tracker Update Message-ID: <200701310000.l0V007k1456370@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20070131/dbb18fef/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20070131/dbb18fef/attachment.pl From midori at ebi.ac.uk Wed Jan 31 01:49:52 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Wed, 31 Jan 2007 09:49:52 +0000 (GMT) Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: References: <45BFAD24.9020706@cornell.edu> <45BFBD35.4000805@cornell.edu> Message-ID: Actually, GB or GenBank would also be acceptable, because they're listed as synonyms in GO.xrf_abbs (the filtering script allows anything in the 'abbreviation' or 'synonym' fields). m On Tue, 30 Jan 2007, Karen Christie wrote: > Note that the abbreviation selected by GO for the IDs for GenBank, DDBJ, and > EMBL is EMBL, so that's the namespace that needs to be used in the > gene_association files for GO. > > -Karen > > > On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: > >> Got it.
We will use the GB one. >> >> BTW GenBank ID is different than the GenBank Accession. GenBank ID is the >> ID exclusive for the GenBank database entry. One GB accession can have >> mappings to several GenBank IDs. >> >> Pankaj >> >> Karen Christie wrote: >>> Hi Pankaj, >>> >>> GenBank IDs are already allowed in the with column. The main requirement >>> is that the abbreviation (or namespace) for the source of the ID be >>> included in the GO.xrf_abbs file. There is already an entry for IDs coming >>> from GenBank/DDBJ/EMBL, so these IDs are already permissable. >>> >>> -Karen >>> >>> >>> abbreviation: EMBL >>> database: International Nucleotide Sequence Database Collaboration, >>> comprising EMBL-EBI International Nucleotide Sequence Data Library >>> (EMBL-Bank), DNA DataBank of Japan (DDBJ), and NCBI GenBank >>> object: Sequence accession number >>> example_id: EMBL:AA816246 >>> example_id: DDBJ:AA816246 >>> example_id: GB:AA816246 >>> synonym: DDBJ >>> synonym: GB >>> synonym: GenBank >>> generic_url: http://www.ebi.ac.uk/embl/ >>> generic_url: http://www.ddbj.nig.ac.jp/ >>> generic_url: http://www.ncbi.nlm.nih.gov/Genbank/ >>> >>> >>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: >>> >>>> Hi Everyone, >>>> >>>> I know it is an accepted SOP to include either the Uniprot accession >>>> number or the individual database's own gene/protein ID in the WITH >>>> column of the association tables. >>>> >>>> >>>> However while doing it it seems that it is too much of the work to find >>>> out what is the Uniprot entry, because often the DDBJ and GenBank do not >>>> Xref each other using the Uniprot accession. However the best alternative >>>> is to use the GenBank's Accession number. Which I see that almost all the >>>> databases including Uniprot, DDBJ, EMBL, PIR etc. use it to cross refer. >>>> It is also the most suitable ID used to find the particular >>>> nucleotide/protein accession that we are looking for using the same >>>> query, no matter which db is queried. 
>>>> >>>> I hope you would consider my request by adopting the GenBank's accession >>>> number, unless there is a better option. >>>> >>>> Thanks >>>> Pankaj >>>> -- >>>> Pankaj Jaiswal >>>> G-15, Bradfield Hall >>>> Dept. of Plant Breeding and Genetics >>>> Cornell University >>>> Ithaca, NY-14853, USA >>>> >>>> Ph. +1-607-255-3103 / 4199 >>>> fax: +1-607-255-6683 >>>> >>> >> >> -- >> Pankaj Jaiswal >> G-15, Bradfield Hall >> Dept. of Plant Breeding and Genetics >> Cornell University >> Ithaca, NY-14853, USA >> >> Ph. +1-607-255-3103 / 4199 >> fax: +1-607-255-6683 >> > From edimmer at ebi.ac.uk Wed Jan 31 03:07:02 2007 From: edimmer at ebi.ac.uk (Emily Dimmer) Date: Wed, 31 Jan 2007 11:07:02 +0000 Subject: use of Uniprot accession Vs GenBank Accession in With column In-Reply-To: References: <45BFAD24.9020706@cornell.edu> <45BFBD35.4000805@cornell.edu> Message-ID: <45C07856.1040807@ebi.ac.uk> Hi, Just a quick note, GenBank Accessions are exactly the same as EMBL accessions. All EMBL accessions are cross-referenced in UniProt. Therefore if you *did* want to find a UniProtKB accession, you should be able just to enter the GenBank accession into the UniProt website (or search via SRS etc...) and it will bring up the equivalent UniProt entry (I do realize that for some groups there is an issue of a UniProtKB accession not yet existing for an equivalent GenBank accession). Cheers, Emily Midori Harris wrote: > Actually, GB or GenBank would also be acceptable, because they're > listed as synonyms in GO.xrf_abbs (the filtering script allows > anything in the 'abbreviation' or 'synonym' fields). > > m > > On Tue, 30 Jan 2007, Karen Christie wrote: > >> Note that the abbreviation selected by GO for the IDs for GenBank, >> DDBJ, and EMBL is EMBL, so that's the namespace that needs to be used >> in the gene_association files for GO. >> >> -Karen >> >> >> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: >> >>> Got it. We will use the GB one.
>>> >>> BTW GenBank ID is different than the GenBank Accession. GenBank ID >>> is the ID exclusive for the GenBank database entry. One GB accession >>> can have mappings to several GenBank IDs. >>> >>> Pankaj >>> >>> Karen Christie wrote: >>> >>>> Hi Pankaj, >>>> >>>> GenBank IDs are already allowed in the with column. The main >>>> requirement is that the abbreviation (or namespace) for the source >>>> of the ID be included in the GO.xrf_abbs file. There is already an >>>> entry for IDs coming from GenBank/DDBJ/EMBL, so these IDs are >>>> already permissable. >>>> >>>> -Karen >>>> >>>> >>>> abbreviation: EMBL >>>> database: International Nucleotide Sequence Database Collaboration, >>>> comprising EMBL-EBI International Nucleotide Sequence Data Library >>>> (EMBL-Bank), DNA DataBank of Japan (DDBJ), and NCBI GenBank >>>> object: Sequence accession number >>>> example_id: EMBL:AA816246 >>>> example_id: DDBJ:AA816246 >>>> example_id: GB:AA816246 >>>> synonym: DDBJ >>>> synonym: GB >>>> synonym: GenBank >>>> generic_url: http://www.ebi.ac.uk/embl/ >>>> generic_url: http://www.ddbj.nig.ac.jp/ >>>> generic_url: http://www.ncbi.nlm.nih.gov/Genbank/ >>>> >>>> >>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote: >>>> >>>>> Hi Everyone, >>>>> >>>>> I know it is an accepted SOP to include either the Uniprot >>>>> accession number or the individual database's own gene/protein ID >>>>> in the WITH column of the association tables. >>>>> >>>>> >>>>> However while doing it it seems that it is too much of the work to >>>>> find out what is the Uniprot entry, because often the DDBJ and >>>>> GenBank do not Xref each other using the Uniprot accession. >>>>> However the best alternative is to use the GenBank's Accession >>>>> number. Which I see that almost all the databases including >>>>> Uniprot, DDBJ, EMBL, PIR etc. use it to cross refer. 
>>>>> It is also the most suitable ID used to find the particular
>>>>> nucleotide/protein accession that we are looking for using the
>>>>> same query, no matter which db is queried.
>>>>>
>>>>> I hope you would consider my request by adopting the GenBank's
>>>>> accession number, unless there is a better option.
>>>>>
>>>>> Thanks
>>>>> Pankaj
>>>>> --
>>>>> Pankaj Jaiswal
>>>>> G-15, Bradfield Hall
>>>>> Dept. of Plant Breeding and Genetics
>>>>> Cornell University
>>>>> Ithaca, NY-14853, USA
>>>>>
>>>>> Ph. +1-607-255-3103 / 4199
>>>>> fax: +1-607-255-6683
>>>>>
>>>>
>>>
>>> --
>>> Pankaj Jaiswal
>>> G-15, Bradfield Hall
>>> Dept. of Plant Breeding and Genetics
>>> Cornell University
>>> Ithaca, NY-14853, USA
>>>
>>> Ph. +1-607-255-3103 / 4199
>>> fax: +1-607-255-6683
>>>
>>

--
************************************
Emily Dimmer
GOA and IntAct Database Curator
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD, U.K.
Tel: +44 1223 494654
Fax: +44 1223 494468
email: edimmer at ebi.ac.uk
************************************

From val at sanger.ac.uk Wed Jan 31 03:17:45 2007
From: val at sanger.ac.uk (Valerie Wood)
Date: Wed, 31 Jan 2007 11:17:45 +0000
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To: <45C07856.1040807@ebi.ac.uk>
References: <45BFAD24.9020706@cornell.edu> <45BFBD35.4000805@cornell.edu> <45C07856.1040807@ebi.ac.uk>
Message-ID: <45C07AD9.2090504@sanger.ac.uk>

Emily Dimmer wrote:

> Hi,
>
> Just a quick note, GenBank Accessions are exactly the same as EMBL
> accessions. All EMBL accessions are cross-referenced in UniProt.
> Therefore if you *did* want to find a UniProtKB accession, you should
> be able just to enter the GenBank accession into the UniProt website
> (or search via SRS etc...) and it will bring up the equivalent UniProt
> entry (I do realize that for some groups there is an issue of a
> UniProtKB accession not yet existing for an equivalent GenBank
> accession).
>

In the cases where there is no Uniprot ID, it may be a problem to refer
to the Genbank/EMBL accession number as this will often be a cosmid or
contig and contain multiple CDS- in these cases you can't refer to the
gene/protein uniquely with an EMBL ID. Presumably though, for the cases
where there is no Swiss-Prot/Trembl ID then the likelihood that you
would be using this as a dbxref in the with column for an ISS is very
small (I have never come across one).

Can't we all agree to track down the Uniprot ID (which is relatively
straightforward), or in cases where there isn't one, contact Uniprot to
work out why?

Val

> Cheers,
> Emily
>
> Midori Harris wrote:
>
>> Actually, GB or GenBank would also be acceptable, because they're
>> listed as synonyms in GO.xrf_abbs (the filtering script allows
>> anything in the 'abbreviation' or 'synonym' fields).
>>
>> m
>>
>> On Tue, 30 Jan 2007, Karen Christie wrote:
>>
>>> Note that the abbreviation selected by GO for the IDs for GenBank,
>>> DDBJ, and EMBL is EMBL, so that's the namespace that needs to be
>>> used in the gene_association files for GO.
>>>
>>> -Karen
>>>
>>>
>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>
>>>> Got it. We will use the GB one.
>>>>
>>>> BTW GenBank ID is different than the GenBank Accession. GenBank ID
>>>> is the ID exclusive for the GenBank database entry. One GB
>>>> accession can have mappings to several GenBank IDs.
>>>>
>>>> Pankaj
>>>>
>>>> Karen Christie wrote:
>>>>
>>>>> Hi Pankaj,
>>>>>
>>>>> GenBank IDs are already allowed in the with column. The main
>>>>> requirement is that the abbreviation (or namespace) for the source
>>>>> of the ID be included in the GO.xrf_abbs file. There is already an
>>>>> entry for IDs coming from GenBank/DDBJ/EMBL, so these IDs are
>>>>> already permissible.
>>>>>
>>>>> -Karen
>>>>>
>>>>>
>>>>> abbreviation: EMBL
>>>>> database: International Nucleotide Sequence Database
>>>>> Collaboration, comprising EMBL-EBI International Nucleotide
>>>>> Sequence Data Library (EMBL-Bank), DNA DataBank of Japan (DDBJ),
>>>>> and NCBI GenBank
>>>>> object: Sequence accession number
>>>>> example_id: EMBL:AA816246
>>>>> example_id: DDBJ:AA816246
>>>>> example_id: GB:AA816246
>>>>> synonym: DDBJ
>>>>> synonym: GB
>>>>> synonym: GenBank
>>>>> generic_url: http://www.ebi.ac.uk/embl/
>>>>> generic_url: http://www.ddbj.nig.ac.jp/
>>>>> generic_url: http://www.ncbi.nlm.nih.gov/Genbank/
>>>>>
>>>>>
>>>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>>>
>>>>>> Hi Everyone,
>>>>>>
>>>>>> I know it is an accepted SOP to include either the Uniprot
>>>>>> accession number or the individual database's own gene/protein ID
>>>>>> in the WITH column of the association tables.
>>>>>>
>>>>>>
>>>>>> However while doing it it seems that it is too much of the work
>>>>>> to find out what is the Uniprot entry, because often the DDBJ and
>>>>>> GenBank do not Xref each other using the Uniprot accession.
>>>>>> However the best alternative is to use the GenBank's Accession
>>>>>> number. Which I see that almost all the databases including
>>>>>> Uniprot, DDBJ, EMBL, PIR etc. use it to cross refer. It is also
>>>>>> the most suitable ID used to find the particular
>>>>>> nucleotide/protein accession that we are looking for using the
>>>>>> same query, no matter which db is queried.
>>>>>>
>>>>>> I hope you would consider my request by adopting the GenBank's
>>>>>> accession number, unless there is a better option.
>>>>>>
>>>>>> Thanks
>>>>>> Pankaj
>>>>>> --
>>>>>> Pankaj Jaiswal
>>>>>> G-15, Bradfield Hall
>>>>>> Dept. of Plant Breeding and Genetics
>>>>>> Cornell University
>>>>>> Ithaca, NY-14853, USA
>>>>>>
>>>>>> Ph. +1-607-255-3103 / 4199
>>>>>> fax: +1-607-255-6683
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Pankaj Jaiswal
>>>> G-15, Bradfield Hall
>>>> Dept. of Plant Breeding and Genetics
>>>> Cornell University
>>>> Ithaca, NY-14853, USA
>>>>
>>>> Ph. +1-607-255-3103 / 4199
>>>> fax: +1-607-255-6683
>>>>
>>>
>
>

--
---------------------------------------------------------------------------
Valerie Wood                        Tel: 01223 496909
S. pombe Genome Project             Fax: 01223 494919
Wellcome Trust Sanger Institute     email: val at sanger.ac.uk
Wellcome Trust Genome Campus        http://www.genedb.org/genedb/pombe
Hinxton, Cambridge, CB10 1HH        http://www.sanger.ac.uk/Projects/S_pombe

From ma11 at gen.cam.ac.uk Wed Jan 31 03:35:29 2007
From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics))
Date: Wed, 31 Jan 2007 11:35:29 +0000 (GMT)
Subject: use of Uniprot accession Vs GenBank Accession in With column
Message-ID:

All

Am I being thick or not ? It seems as if the obvious object to refer
to, if Uniprot ID is not available, is the PID contained within GenBank
EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
and gets over the problem that Val points to:
'it may be a problem to refer
to the Genbank/EMBL accession number as this will often be a cosmid or
contig and contain multiple CDS- in these cases you can't refer to the
gene/protein uniquely with an EMBL ID.'

Michael

From val at sanger.ac.uk Wed Jan 31 04:04:09 2007
From: val at sanger.ac.uk (Valerie Wood)
Date: Wed, 31 Jan 2007 12:04:09 +0000
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To:
References:
Message-ID: <45C085B9.80100@sanger.ac.uk>

That might be what you would expect but in my experience these can
change (and disappear) over time, although I'm sure all of the old IDs
are available in some archive. When TremblNew existed and things moved
into Trembl, the old IDs became untraceable (this may have changed now
Tremblnew no longer exists?).

I just checked a couple of sequences which were revised at some point
and they have IDs with a .1 extension implying that new PIDs were
created for these.
From a recent conversation I think these may be
lost unless the original submitter puts the IDs into their EMBL entry
(which I don't, does anybody else?). Also for various other reasons
(sequence merges in contig boundaries). Is this right? Anybody at EBI
know?

Val

Michael Ashburner (Genetics) wrote:

>All
>
>Am I being thick or not ? It seems as if the obvious object to refer
>to, if Uniprot ID is not available, is the PID contained within GenBank
>EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
>and gets over the problem that Val points to:
>'it may be a problem to refer
>to the Genbank/EMBL accession number as this will often be a cosmid or
>contig and contain multiple CDS- in these cases you can't refer to the
>gene/protein uniquely with an EMBL ID.'
>
>Michael
>
>
>
>

--
---------------------------------------------------------------------------
Valerie Wood                        Tel: 01223 496909
S. pombe Genome Project             Fax: 01223 494919
Wellcome Trust Sanger Institute     email: val at sanger.ac.uk
Wellcome Trust Genome Campus        http://www.genedb.org/genedb/pombe
Hinxton, Cambridge, CB10 1HH        http://www.sanger.ac.uk/Projects/S_pombe

From val at sanger.ac.uk Wed Jan 31 04:15:12 2007
From: val at sanger.ac.uk (Valerie Wood)
Date: Wed, 31 Jan 2007 12:15:12 +0000
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To: <45C085B9.80100@sanger.ac.uk>
References: <45C085B9.80100@sanger.ac.uk>
Message-ID: <45C08850.4080202@sanger.ac.uk>

This was the reply I got from Tamara to this exact question a couple of
weeks ago, which makes me think these IDs are not tracked when new EMBL
entries replace old ones (i.e. you submit larger contigs). These are
exactly the entries these IDs would be required for.

.....

Tamara Kulikova, EMBL database
kulikova at ebi.ac.uk

> iii) As we do not track the protein IDs can we confirm that these
> will automatically be remapped?
We will have to do the first loading for you (the protein_ids migrate
between entries in this case and there is no automatic procedure in
place for it) so we will track the protein_ids manually once. After that
you will be able to submit the entries and the usual automatic tracking
will work.

Val

Valerie Wood wrote:

>
> That might be what you would expect but in my experience these can
> change (and disappear) over time, although I'm sure all of the old IDs
> are available in some archive. When TremblNew existed and things moved
> into Trembl, the old IDs became untraceable (this may have changed now
> Tremblnew no longer exists?).
>
> I just checked a couple of sequences which were revised at some point
> and they have IDs with a .1 extension implying that new PIDs were
> created for these. From a recent conversation I think these may be
> lost unless the original submitter puts the IDs into their EMBL entry
> (which I don't, does anybody else?). Also for various other reasons
> (sequence merges in contig boundaries). Is this right? Anybody at EBI
> know?
>
> Val
>
> Michael Ashburner (Genetics) wrote:
>
>> All
>>
>> Am I being thick or not ? It seems as if the obvious object to refer
>> to, if Uniprot ID is not available, is the PID contained within GenBank
>> EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
>> and gets over the problem that Val points to:
>> 'it may be a problem to refer to the Genbank/EMBL accession number as
>> this will often be a cosmid or contig and contain multiple CDS- in
>> these cases you can't refer to the gene/protein uniquely with an
>> EMBL ID.'
>>
>> Michael
>>
>>
>>
>>
>
>

--
---------------------------------------------------------------------------
Valerie Wood                        Tel: 01223 496909
S. pombe Genome Project             Fax: 01223 494919
Wellcome Trust Sanger Institute     email: val at sanger.ac.uk
Wellcome Trust Genome Campus        http://www.genedb.org/genedb/pombe
Hinxton, Cambridge, CB10 1HH        http://www.sanger.ac.uk/Projects/S_pombe

From pj37 at cornell.edu Wed Jan 31 06:20:38 2007
From: pj37 at cornell.edu (Pankaj Jaiswal)
Date: Wed, 31 Jan 2007 09:20:38 -0500
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To:
References:
Message-ID: <45C0A5B6.5050905@cornell.edu>

Hi,

PID is the same as GB/EMBL/DDBJ accession number
e.g.
/protein_id="AAT37941.1"
referred in nucleotide entry
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=47717898
is the same as accession number in
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=AAT37941
ACCESSION AAT37941
VERSION AAT37941.1 GI:47717899

The version is fine, that refers to any new updates in the entry and
they are all tracked. However, in most cases it is not that significant.

The problem I raised is also because it is a rare occurrence a citation
refers to Uniprot accessions. Almost always they refer to GB/EMBL/DDBJ
accessions. In that case a curator has to go and find out the possible
Uniprot accession as Emily has suggested. This I think is extra
curational load. There are other problems as well cited in this mail
thread as well. So my suggestion is to adopt a universal system to always
refer by an EMBL/GB/DDBJ accession number in the association files and
some magic script should be able to link back to all the respective dbs
and not just one source.

On the other hand we should encourage the GB to provide Xrefs to the
Uniprot accessions also. I have seen them in unigenes/genes/genomes but
not always in protein and nucleotide dbs.

-Pankaj

Michael Ashburner (Genetics) wrote:
> All
>
> Am I being thick or not ? It seems as if the obvious object to refer
> to, if Uniprot ID is not available, is the PID contained within GenBank
> EMBL records. This is shared between GB, EMBL and DDBJ.
> It is versioned and gets over the problem that Val points to:
> 'it may be a problem to refer
> to the Genbank/EMBL accession number as this will often be a cosmid or
> contig and contain multiple CDS- in these cases you can't refer to the
> gene/protein uniquely with an EMBL ID.'
>
> Michael
>
>

From val at sanger.ac.uk Wed Jan 31 06:26:33 2007
From: val at sanger.ac.uk (Valerie Wood)
Date: Wed, 31 Jan 2007 14:26:33 +0000
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To: <45C0A5B6.5050905@cornell.edu>
References: <45C0A5B6.5050905@cornell.edu>
Message-ID: <45C0A719.8090205@sanger.ac.uk>

Hi Pankaj,

This is not always the case: it cannot hold when your EMBL entry
contains multiple proteins.

Val

Pankaj Jaiswal wrote:

> Hi,
>
> PID is the same as GB/EMBL/DDBJ accession number
> e.g.
> /protein_id="AAT37941.1"
> referred in nucleotide entry
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=47717898
> is the same as accession number in
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=AAT37941
> ACCESSION AAT37941
> VERSION AAT37941.1 GI:47717899
>
> The version is fine, that refers to any new updates in the entry and
> they are all tracked. However, in most cases it is not that significant.
>
> The problem I raised is also because it is a rare occurrence a
> citation refers to Uniprot accessions. Almost always they refer to
> GB/EMBL/DDBJ accessions. In that case a curator has to go and find out
> the possible Uniprot accession as Emily has suggested. This I think is
> extra curational load. There are other problems as well cited in this
> mail thread as well. So my suggestion is to adopt a universal system to
> always refer by an EMBL/GB/DDBJ accession number in the association
> files and some magic script should be able to link back to all the
> respective dbs and not just one source.
>
> On the other hand we should encourage the GB to provide Xrefs to the
> Uniprot accessions also.
> I have seen them in unigenes/genes/genomes but not always in protein
> and nucleotide dbs.
>
> -Pankaj
>
> Michael Ashburner (Genetics) wrote:
>
>> All
>>
>> Am I being thick or not ? It seems as if the obvious object to refer
>> to, if Uniprot ID is not available, is the PID contained within GenBank
>> EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
>> and gets over the problem that Val points to:
>> 'it may be a problem to refer to the Genbank/EMBL accession number as
>> this will often be a cosmid or contig and contain multiple CDS- in
>> these cases you can't refer to the gene/protein uniquely with an EMBL
>> ID.'
>>
>> Michael
>>
>>
>
>

--
---------------------------------------------------------------------------
Valerie Wood                        Tel: 01223 496909
S. pombe Genome Project             Fax: 01223 494919
Wellcome Trust Sanger Institute     email: val at sanger.ac.uk
Wellcome Trust Genome Campus        http://www.genedb.org/genedb/pombe
Hinxton, Cambridge, CB10 1HH        http://www.sanger.ac.uk/Projects/S_pombe

From ma11 at gen.cam.ac.uk Wed Jan 31 06:47:57 2007
From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics))
Date: Wed, 31 Jan 2007 14:47:57 +0000 (GMT)
Subject: use of Uniprot accession Vs GenBank Accession in With column
Message-ID:

Pankaj

PID IS NOT the same as the GB/EMBL/DDBJ accession number.
You are correct in that the PID is this:

> /protein_id="AAT37941.1"

but the corresponding GB/EMBL/DDBJ accession number is, in the case you cite
ACCESSION AY607689

Any single ACCESSION can have 0, 1 or >1 PID. The last is especially
true for genomic sequences.

As has been pointed out to you if you look up an ACCESSION number
at the EBI you will see the UniProt xlink.
So if you go to www.ebi.ac.uk and paste AY607689 into the top page
query box you will get this record, which includes these data:

FT                   /codon_start=1
FT                   /product="low temperature-induced low molecular weight
FT                   integral membrane protein LTI6a"
FT                   /note="OsLti6a"
FT                   /db_xref="GOA:Q8H5T6"
FT                   /db_xref="InterPro:IPR000612"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
FT                   /protein_id="AAT37941.1"
FT                   /translation="MADSTATCIDIILAIILPPLGVFFKFGCGIEFWICLLLTFFGYLP
FT                   GIIYAVWVITK"

Michael

> Date: Wed, 31 Jan 2007 09:20:38 -0500
> From: Pankaj Jaiswal
> To: "Michael Ashburner (Genetics)"
> CC: go at genome.stanford.edu, val at sanger.ac.uk, edimmer at ebi.ac.uk, midori at ebi.ac.uk, kchris at genome.stanford.edu, annotation at genome.stanford.edu
> Subject: Re: use of Uniprot accession Vs GenBank Accession in With column
>
> Hi,
>
> PID is the same as GB/EMBL/DDBJ accession number
> e.g.
> /protein_id="AAT37941.1"
> referred in nucleotide entry
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=47717898
> is the same as accession number in
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=AAT37941
> ACCESSION AAT37941
> VERSION AAT37941.1 GI:47717899
>
> The version is fine, that refers to any new updates in the entry and
> they are all tracked. However, in most cases it is not that significant.
>
> The problem I raised is also because it is a rare occurrence a citation
> refers to Uniprot accessions. Almost always they refer to GB/EMBL/DDBJ
> accessions.
> In that case a curator has to go and find out the possible
> Uniprot accession as Emily has suggested. This I think is extra
> curational load. There are other problems as well cited in this mail
> thread as well. So my suggestion is to adopt a universal system to always
> refer by an EMBL/GB/DDBJ accession number in the association files and
> some magic script should be able to link back to all the respective dbs
> and not just one source.
>
> On the other hand we should encourage the GB to provide Xrefs to the
> Uniprot accessions also. I have seen them in unigenes/genes/genomes but
> not always in protein and nucleotide dbs.
>
> -Pankaj
>
> Michael Ashburner (Genetics) wrote:
> > All
> >
> > Am I being thick or not ? It seems as if the obvious object to refer
> > to, if Uniprot ID is not available, is the PID contained within GenBank
> > EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
> > and gets over the problem that Val points to:
> > 'it may be a problem to refer
> > to the Genbank/EMBL accession number as this will often be a cosmid or
> > contig and contain multiple CDS- in these cases you can't refer to the
> > gene/protein uniquely with an EMBL ID.'
> >
> > Michael
> >
> >
>

From pj37 at cornell.edu Wed Jan 31 07:06:57 2007
From: pj37 at cornell.edu (Pankaj Jaiswal)
Date: Wed, 31 Jan 2007 10:06:57 -0500
Subject: use of Uniprot accession Vs GenBank Accession in With column
In-Reply-To:
References:
Message-ID: <45C0B091.3020701@cornell.edu>

Michael

In my example the PID and the accession of the protein entry in GB are
the same. The ACCESSION AY607689 is from the nucleotide db in GB and I
agree that a nucleotide entry can have multiple PIDs/protein accessions.

-Pankaj

Michael Ashburner (Genetics) wrote:
> Pankaj
>
> PID IS NOT the same as the GB/EMBL/DDBJ accession number.
> You are correct in that the PID is this:
>
>>/protein_id="AAT37941.1"
>
> but the corresponding GB/EMBL/DDBJ accession number is, in the case you cite
> ACCESSION AY607689
>
>
> Any single ACCESSION can have 0, 1 or >1 PID. The last is especially
> true for genomic sequences.
>
> As has been pointed out to you if you look up an ACCESSION number
> at the EBI you will see the UniProt xlink. So if you go to
> www.ebi.ac.uk and paste in AY607689 into the top page query box
> you will get this record which includes these data:
>
> FT                   /codon_start=1
> FT                   /product="low temperature-induced low molecular weight
> FT                   integral membrane protein LTI6a"
> FT                   /note="OsLti6a"
> FT                   /db_xref="GOA:Q8H5T6"
> FT                   /db_xref="InterPro:IPR000612"
> FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
> FT                   /protein_id="AAT37941.1"
> FT                   /translation="MADSTATCIDIILAIILPPLGVFFKFGCGIEFWICLLLTFFGYLP
> FT                   GIIYAVWVITK"
>
>
> Michael
>
>
>>Date: Wed, 31 Jan 2007 09:20:38 -0500
>>From: Pankaj Jaiswal
>>To: "Michael Ashburner (Genetics)"
>>CC: go at genome.stanford.edu, val at sanger.ac.uk, edimmer at ebi.ac.uk, midori at ebi.ac.uk, kchris at genome.stanford.edu, annotation at genome.stanford.edu
>>Subject: Re: use of Uniprot accession Vs GenBank Accession in With column
>>
>>Hi,
>>
>>PID is the same as GB/EMBL/DDBJ accession number
>>e.g.
>>/protein_id="AAT37941.1"
>>referred in nucleotide entry
>>http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=47717898
>>is the same as accession number in
>>http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=AAT37941
>>ACCESSION AAT37941
>>VERSION AAT37941.1 GI:47717899
>>
>>The version is fine, that refers to any new updates in the entry and
>>they are all tracked. However, in most cases it is not that significant.
>>
>>The problem I raised is also because it is a rare occurrence a citation
>>refers to Uniprot accessions. Almost always they refer to GB/EMBL/DDBJ
>>accessions. In that case a curator has to go and find out the possible
>>Uniprot accession as Emily has suggested. This I think is extra
>>curational load. There are other problems as well cited in this mail
>>thread as well. So my suggestion is to adopt a universal system to always
>>refer by an EMBL/GB/DDBJ accession number in the association files and
>>some magic script should be able to link back to all the respective dbs
>>and not just one source.
>>
>>On the other hand we should encourage the GB to provide Xrefs to the
>>Uniprot accessions also. I have seen them in unigenes/genes/genomes but
>>not always in protein and nucleotide dbs.
>>
>>-Pankaj
>>
>>Michael Ashburner (Genetics) wrote:
>>
>>>All
>>>
>>>Am I being thick or not ? It seems as if the obvious object to refer
>>>to, if Uniprot ID is not available, is the PID contained within GenBank
>>>EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
>>>and gets over the problem that Val points to:
>>>'it may be a problem to refer
>>>to the Genbank/EMBL accession number as this will often be a cosmid or
>>>contig and contain multiple CDS- in these cases you can't refer to the
>>>gene/protein uniquely with an EMBL ID.'
>>>
>>>Michael
>>>
>>>
>>
>
>

From ma11 at gen.cam.ac.uk Wed Jan 31 07:53:09 2007
From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics))
Date: Wed, 31 Jan 2007 15:53:09 +0000 (GMT)
Subject: use of Uniprot accession Vs GenBank Accession in With column
Message-ID:

Pankaj

You said:
'>>PID is the same as GB/EMBL/DDBJ accession number'

THIS IS NOT true, as I explained. The PID is indeed the _protein_
accession number.

end of conversation :=)

M

From dhowe at cs.uoregon.edu Wed Jan 31 08:10:00 2007
From: dh