From suzi at berkeleybop.org Sun Apr 1 11:41:04 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Sun, 1 Apr 2007 11:41:04 -0700 Subject: Fwd: mitotic spindles References: <460BEA48.4070609@acoma.stanford.edu> Message-ID: Jason, Do you happen to know someone you could recommend for editing GO terms for mitotic spindle formation? -S Begin forwarded message: > From: Tanya Berardini > Date: March 29, 2007 9:33:12 AM PDT > To: J Clark > Cc: go list > Subject: Re: mitotic spindles > Reply-To: tberardi at acoma.Stanford.EDU > > An expert on spindle formation in general would be great. Meiotic > and mitotic. > > Thanks, > > Tanya > > > J Clark wrote: >> Hi, >> Does anybody out there know a lot about mitotic spindles? We'd >> like to have a mitotic spindle expert join a conference call >> sometime soon to help with some work on mitotic spindle sensu terms. >> Thanks, >> Jen > > -- > ---------------------------------------------------------------------- > -------------------- > Tanya Berardini, Ph.D. tberardi at acoma.stanford.edu > The Arabidopsis Information Resource FAX: (650) 325-6857 > Carnegie Institution of Washington Tel: (650) 325-1521 ext. 325 > Department of Plant Biology URL: http://arabidopsis.org/ > 260 Panama St. > Stanford, CA 94305 > ---------------------------------------------------------------------- > -------------------- > From midori at ebi.ac.uk Wed Apr 4 05:47:43 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Wed, 4 Apr 2007 13:47:43 +0100 (BST) Subject: What have the ontology developers been up to lately? Message-ID: Dear GOers, We have heard recently from several of you that it's not as easy as it should be to keep abreast of what is going on in the world of ontology development. To address this need, we plan to send an email to this list every month, with some highlights of recent and forthcoming work, and links to additional details. Please let us know how it works, and what else you'd like to see. 
Most importantly, if you have expertise in any of the areas that we are planning to work on, please let us know. - Is there a mailing list for the entire ontology development group? No, we haven't created a new list for content developers. Our reasons are: 1) The group of people who are interested in large-scale or otherwise important changes would be essentially the same as the existing GO list membership. 2) For smaller changes and nitty-gritty details, there isn't one group of people. Instead, there are several small groups whose members we have called upon because we know they have specific biological expertise. These small groups may only stay together until a particular set of tasks is complete. 3) We all have tasks that go beyond ontology development. Many of us are MOD annotators, and others are involved in other GOC groups. We don't need long e-mail threads discussing issues that we don't know, and may not even care, about. - So how *do* people find out what ontology developers are doing? We make lots of information on past, current, and planned ontology development activity available on the internal wiki; some topics are also represented on the public wiki. 
This includes: * Monthly reports summarizing ontology development * Detailed notes on large-scale efforts * A rough timeline for future content-related work All this good stuff is available at (or linked to): http://gocwiki.geneontology.org/index.php/Ontology_Development The most recent monthly report, for March 2007, is at: http://gocwiki.geneontology.org/index.php/Mar2007_GO_Ed_report Some highlights from March: * Collaboration with a group at Harvard & MIT doing "GO Engineering" based on information theory (http://gocwiki.geneontology.org/index.php/Collaboration_with_MIT_GO-Engineering) * Progress on renaming "sensu" terms to be more descriptive and less confusing (http://gocwiki.geneontology.org/index.php/Sensu_Main_Page) * Reorganization of transporter terms in the molecular function ontology (http://wiki.geneontology.org/index.php?title=Transporters) In April, we'll continue working on all of the above, and we plan to resume efforts to create links between function and process terms. Finally, remember that all the gory details of small- and medium-scale changes are available in the SourceForge Curator Requests tracker, and don't hesitate to contact us if you want to help out with ontology work in a particular area, or if you have any comments or questions about what's going on. Sincerely, Midori, David, and the GO Editors From ma11 at gen.cam.ac.uk Fri Apr 6 07:32:17 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics)) Date: Fri, 6 Apr 2007 15:32:17 +0100 Subject: What have the ontology developers been up to lately? Message-ID: Thanks guys. M From ma11 at gen.cam.ac.uk Sat Apr 7 03:48:08 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics)) Date: Sat, 7 Apr 2007 11:48:08 +0100 Subject: Of interest Message-ID: Do not ask how I found this on Easter Saturday. 
Yes, I know, I should get a life: http://www.esi-topics.com/fmf/maps/march2007-map.html Michael From rama at genome.Stanford.EDU Sun Apr 8 14:04:39 2007 From: rama at genome.Stanford.EDU (Rama Balakrishnan) Date: Sun, 8 Apr 2007 14:04:39 -0700 Subject: missing GOIDs again Message-ID: The following GOIDs are missing from the obo file. 33100-NuA3 histone acetyltransferase complex 33062-Rhp55-Rhp57 complex 32979-protein insertion into mitochondrial membrane from inner side 32977-membrane insertase activity Thanks, Rama From rama at genome.Stanford.EDU Sun Apr 8 14:42:13 2007 From: rama at genome.Stanford.EDU (Rama Balakrishnan) Date: Sun, 8 Apr 2007 14:42:13 -0700 Subject: SORRY-Re: missing GOIDs again In-Reply-To: References: Message-ID: <05A9DD4F-09B6-4502-B20F-194C049953C9@genome.stanford.edu> I am extremely sorry. I was looking at the wrong version of the obo file. Everything is okay. Sorry again. Rama On Apr 8, 2007, at 2:04 PM, Rama Balakrishnan wrote: > > The following GOIDs are missing from the obo file. > > 33100-NuA3 histone acetyltransferase complex > 33062-Rhp55-Rhp57 complex > 32979-protein insertion into mitochondrial membrane from inner side > 32977-membrane insertase activity > > Thanks, > > Rama From suzi at berkeleybop.org Sun Apr 8 15:20:12 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Sun, 8 Apr 2007 15:20:12 -0700 Subject: What have the ontology developers been up to lately? In-Reply-To: References: Message-ID: <79F6CFB1-417E-4E3A-8BCC-A1E1FC87272E@berkeleybop.org> Hi Midori, Thanks for this. Extremely useful as always. BTW, I noticed that the links to the individual reports from this page: http://gocwiki.geneontology.org/index.php/ GO_Editorial_Office_reports#2006 had not had any reports added since Oct '06. I added the missing links to the individual monthly reports, but thought that you would want to know for this month and beyond. 
cheers, S On Apr 4, 2007, at 5:47 AM, Midori Harris wrote: > Dear GOers, > > We have heard recently from several of you that it's not as easy as > it should be to keep abreast of what is going on in the world of > ontology development. To address this need, we plan to send an > email to this list every month, with some highlights of recent and > forthcoming work, and links to additional details. Please let us > know how it works, and what else you'd like to see. Most > importantly, if you have expertise in any of the areas that we are > planning to work on, please let us know. > > - Is there a mailing list for the entire ontology development group? > > No, we haven't created a new list for content developers. Our > reasons are: 1) The group of people who are interested in large- > scale or otherwise important changes would be essentially the same > as the existing GO list membership. 2) For smaller changes and > nitty-gritty details, there isn't one group of people. Instead, > there are several small groups whose members we have called upon > because we know they have specific biological expertise. These > small groups may only stay together until a particular set of tasks > is complete. 3) We all have tasks that go beyond ontology > development. Many of us are MOD annotators, and others are involved > in other GOC groups. We don't need long e-mail threads discussing > issues that we don't know, and may not even care, about. > > - So how *do* people find out what ontology developers are doing? > > We make lots of information on past, current, and planned ontology > development activity available on the internal wiki; some topics > are also represented on the public wiki. 
This includes: > > * Monthly reports summarizing ontology development > * Detailed notes on large-scale efforts > * A rough timeline for future content-related work > > All this good stuff is available at (or linked to): > > http://gocwiki.geneontology.org/index.php/Ontology_Development > > The most recent monthly report, for March 2007, is at: > > http://gocwiki.geneontology.org/index.php/Mar2007_GO_Ed_report > > Some highlights from March: > > * Collaboration with a group at Harvard & MIT doing "GO > Engineering" based on information theory (http:// > gocwiki.geneontology.org/index.php/Collaboration_with_MIT_GO- > Engineering) > > * Progress on renaming "sensu" terms to be more descriptive and > less confusing (http://gocwiki.geneontology.org/index.php/ > Sensu_Main_Page) > > * Reorganization of transporter terms in the molecular function > ontology (http://wiki.geneontology.org/index.php?title=Transporters) > > In April, we'll continue working on all of the above, and we plan > to resume efforts to create links between function and process terms. > > Finally, remember that all the gory details of small- and medium- > scale changes are available in the SourceForge Curator Requests > tracker, and don't hesitate to contact us if you want to help out > with ontology work in a particular area, or if you have any > comments or questions about what's going on. > > Sincerely, > Midori, David, and the GO Editors > > From suzi at berkeleybop.org Sun Apr 8 15:26:26 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Sun, 8 Apr 2007 15:26:26 -0700 Subject: Of interest In-Reply-To: References: Message-ID: nifty. if it is still around, might be something to include in the next newsletter. -S On Apr 7, 2007, at 3:48 AM, Michael Ashburner (Genetics) wrote: > > Do not ask how found this on Easter Saturday. 
Yes, I know, I should > get a life: > > http://www.esi-topics.com/fmf/maps/march2007-map.html > > Michael > From midori at ebi.ac.uk Mon Apr 9 02:53:21 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Mon, 9 Apr 2007 10:53:21 +0100 (BST) Subject: What have the ontology developers been up to lately? In-Reply-To: <79F6CFB1-417E-4E3A-8BCC-A1E1FC87272E@berkeleybop.org> References: <79F6CFB1-417E-4E3A-8BCC-A1E1FC87272E@berkeleybop.org> Message-ID: Hi, As is explained in the blurb on the main ontology development page, the report format and perspective changed between October and November 2006, with the different versions listed in the two pages linked from the main page. There should not, therefore, be any links to reports after October 2006 on this page: http://gocwiki.geneontology.org/index.php/GO_Editorial_Office_reports because reports from November 2006 onwards are linked from this page: http://gocwiki.geneontology.org/index.php/Combined_Ontology_Development_and_GO_Editorial_Office_reports I have replaced the spurious links on the first page with a single link to the second page. midori On Sun, 8 Apr 2007, Suzanna Lewis wrote: > Hi Midori, > > Thanks for this. Extremely useful as always. > > BTW, I noticed that the links to the individual reports from this page: > > http://gocwiki.geneontology.org/index.php/GO_Editorial_Office_reports#2006 > > had not had any reports added since Oct '06. I added the missing links to the > individual monthly reports, but thought that you would want to know for this > month and beyond. > > cheers, S > From ma11 at gen.cam.ac.uk Mon Apr 9 05:07:10 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics)) Date: Mon, 9 Apr 2007 13:07:10 +0100 Subject: GO on SF Message-ID: I see that GO is the top (of 950) Bioinformatics project on SF and ranks 124 (out of >180,000) of ALL SF projects (GMOD is rank 330 of all SF projects). Not bad, folks! 
M From ma11 at gen.cam.ac.uk Mon Apr 9 07:53:17 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner) Date: Mon, 9 Apr 2007 15:53:17 +0100 Subject: Database Registrary In-Reply-To: References: Message-ID: Colleagues Sorry this took so long. I attach a pretty colored spreadsheet of the database abbreviations used by the GO, UniProt and INSD (i.e. GenBank, DDBJ and EMBL). A: name of resource B: object referred to within a resource C: blank D: GO abbreviation E: UniProt abbreviation F: INSD abbreviation G,H: blank I: URL of resource J: suggestions for edits to db_xref abbreviation. Those in red are the same in all three sources. Those in blue are used by only two sources and are the same in both. Those in yellow are used by only one source. Those in green are used by >1 source and differ. Some principles. Where a resource produces >1 database or has >1 class of object identifier I use an A_B syntax, where A is the resource and B is the class of object within that resource; several of us already use that syntax, e.g. "BDGP_EST" or "CGD_LOCUS". In Column J I have made _suggestions_ as to the changes we could make that would bring us all into agreement in those cases where we share a db_xref. There may be errors in column B, which was quite hard to do consistently. I regard this as a rough draft and would like to have feedback on the general idea and on details. I also now have a set of abbreviations from ArrayExpress but have not incorporated them. Also Peter, I never heard from the BioPAX guys, can you chase this for me? Best - Michael -------------- next part -------------- A non-text attachment was scrubbed... 
Name: db-reg.xls Type: application/octet-stream Size: 64000 bytes Desc: not available Url : http://fafner.stanford.edu/pipermail/go/attachments/20070409/2b7c88f9/attachment.obj -------------- next part -------------- From suzi at berkeleybop.org Mon Apr 9 12:02:37 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Mon, 9 Apr 2007 12:02:37 -0700 Subject: What have the ontology developers been up to lately? In-Reply-To: References: <79F6CFB1-417E-4E3A-8BCC-A1E1FC87272E@berkeleybop.org> Message-ID: <838BDF6A-AE05-42CC-823F-C0DAAB6A9CC2@berkeleybop.org> Now, I'm even more confused. If I 1. start from the main gocwiki main/home page. 2. click on "ontology development" on the left menu 3. click on "Reports" in the content box at the top of that page 4. Ahh... The fact that it is just dates that are different, rather than the type content is very confusing. - On Apr 9, 2007, at 2:53 AM, Midori Harris wrote: > Hi, > > As is explained in the blurb on the main ontology development page, > the report format and perspective changed between October and > November 2006, with the different versions listed in the two pages > linked from the main page. > > There should not, therefore, be any links to reports after October > 2006 on this page: > > http://gocwiki.geneontology.org/index.php/ > GO_Editorial_Office_reports > > because reports from November 2006 onwards are linked to this page: > > http://gocwiki.geneontology.org/index.php/ > Combined_Ontology_Development_and_GO_Editorial_Office_reports > > I have replaced the spurious links on the first page with a single > link to the second page. > > midori > > On Sun, 8 Apr 2007, Suzanna Lewis wrote: > >> Hi Midori, >> >> Thanks for this. Extremely useful as always. >> >> BTW, I noticed that the links to the individual reports from this >> page: >> >> http://gocwiki.geneontology.org/index.php/ >> GO_Editorial_Office_reports#2006 >> >> had not had any reports added since Oct '06. 
I added the missing >> links to the individual monthly reports, but thought that you >> would want to know for this month and beyond. >> >> cheers, S >> > From suzi at berkeleybop.org Mon Apr 9 12:51:19 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Mon, 9 Apr 2007 12:51:19 -0700 Subject: Position for immune system reactome curator at EBI References: <005901c7760a$838fd320$c34416ac@bernardf8608a5> Message-ID: <58BB21AE-D05F-4E31-8C86-533176E3EA51@berkeleybop.org> Hi all, The Reactome group is searching for a new curator to work on describing pathways involving the immune system. More information can be found here: http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/ 07032.html Apologies, but we do circulate this position to the widest number of people possible who might be interested. Best regards, Suzanna From cjm at fruitfly.org Mon Apr 9 18:27:08 2007 From: cjm at fruitfly.org (Chris Mungall) Date: Mon, 9 Apr 2007 18:27:08 -0700 Subject: Database Registrary In-Reply-To: References: Message-ID: <3F11CC4C-5A9C-4237-90E2-8E7DBAB7B2B1@fruitfly.org> Thanks Michael! I have a few questions Scope: Do we need to define the scope of the shared registry? This may seem an odd question - I think most of us have an implicit understanding of the kinds of databases it's appropriate to include. However, data is becoming ever more interlinked, and some of us are starting to think about databases outside the traditional bio-realm, including sources of clinical and biomedical data, and geographic databases e.g. for recording sources of environmental samples. Abbreviation Scheme: All prefixes indicate the name of the organisation or database that is in charge of the ID-space, with the exception of 'taxon' and 'LocusID'. I think it would be good to encourage a general pattern that encourages prefixes that unambiguously identify the organisation, but I'm happy for there to be exceptions here. 
It seems the abbrev scheme can combine 3 pieces of information - (i) the organisation or database in charge of the ID-space (or the sub- database within that organisation), (ii) the type of entity represented in that ID-space (e.g. gene vs publication), and (iii) what kind of identifier it is - sometimes entities are unambiguously identified by both a numeric identifier scheme and a more human- friendly symbol. For example, the MOD for tetrahymena has: TGD - ID-space for gene/locus accessions/identifiers TGD_Locus - ID-space for gene/locus symbols TGD_Ref - ID-space for reference/publication identifiers (contrast with FlyBase, which uses a single ID-space) Is it a good thing to encourage this kind of overloading? It is perhaps not such a concern so long as consistent conventions are followed - such as use of the underscore to subdivide an ID-space. Would it be overly prescriptive to formalize these conventions as a set of recommendations for database providers? One reason why this is a concern is that local identifiers often get separated from their context - this generally doesn't cause a problem for humans, but it could do for machines. Technical: We don't explicitly state any syntactic rules for the abbreviations, but there are probably some implicit rules, such as having no whitespace. Would it be a good idea to properly specify this? It will make machine processing and acting on identifiers more robust. There would be some advantages in making the syntax conform to XML namespace tokens: http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Nmtoken The main effect on existing abbreviations would be excluding the use of '/', as used in INSD. Of course, we don't have to go with the nmtoken scheme, but I think it is good to have some kind of convention. Cheers Chris On Apr 9, 2007, at 7:53 AM, Michael Ashburner wrote: > > Colleagues > > Sorry this took so long. I attach a pretty colored spreadsheat of > the database abbreviations used by > the GO, UniProt and INSD (i.e. 
Genbank, DDBJ and EMBL). > > A: name of resource > B: object referred to within a resource > C: blank > D: GO abbreviation > E: UniProt abbreviation > F: INSD abbreviation > G,H blank > I: url of resource > J: suggestions for edits to db_xref abbreviation. > > Those is RED are the same in all three sources. > Those in blue are used by only two sources and are the same in both > Those in yellow are used by only one source > Those in green are used by >1 source and differ. > > Some principles. > > Where are recourse produces >1 database or has >1 class of object > identifier I > use an A_B syntax, where A is the resource and B is the class of > object within that resource; several > of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". > > In Column J I have made _suggestions_ as to the changes we could > make that would bring us > all into agreement in those cases where we share a db_xref. > > There may be errors in column B, which was quite hard to do > consistently. > > I regard this as a rough draft and would like to have feedback on > the general idea and > on details. I also now have a set of abbreviations from > ArrayExpress but have not incorportated > them. Also Peter, I never heard from the BioPax guys, can you chase > this for me ? > > Best - Michael > > From camon at ebi.ac.uk Tue Apr 10 05:00:19 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Tue, 10 Apr 2007 13:00:19 +0100 Subject: New GOA Baby Boy Message-ID: <461B7C53.3030707@ebi.ac.uk> Hi Everyone, We would just like to publicly congratulate Rachael and husband Darren for the safe arrival of their new baby boy (Sam Gabriel Huntley) on Thursday night. Great news. 
The GOA Team -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From aji at ebi.ac.uk Tue Apr 10 05:33:49 2007 From: aji at ebi.ac.uk (Amelia Ireland) Date: Tue, 10 Apr 2007 13:33:49 +0100 (BST) Subject: New GOA Baby Boy In-Reply-To: <461B7C53.3030707@ebi.ac.uk> Message-ID: Back in Gotham City, Evelyn Camon wrote: >Hi Everyone, > >We would just like to publicly congratulate Rachael and husband Darren >for the safe arrival of their new baby boy (Sam Gabriel Huntley) on >Thursday night. Is there any indication yet as to which aspects of the GOA project Sam will be working on? ;) Congratulations to Rachael and Darren! -- Amelia Ireland GO Editorial Office, European Bioinformatics Institute, UK. Carbon neutral driving: http://www.targetneutral.com/TONIC/index.jsp From camon at ebi.ac.uk Tue Apr 10 05:41:13 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Tue, 10 Apr 2007 13:41:13 +0100 Subject: New GOA Baby Boy References: Message-ID: <461B85E9.8020902@ebi.ac.uk> He will be pressing buttons like the rest of us..at least thats what my daughter says Evelyn Amelia Ireland wrote: > Back in Gotham City, Evelyn Camon wrote: > > >>Hi Everyone, >> >>We would just like to publicly congratulate Rachael and husband Darren >>for the safe arrival of their new baby boy (Sam Gabriel Huntley) on >>Thursday night. > > > Is there any indication yet as to which aspects of the GOA project Sam > will be working on? ;) > > Congratulations to Rachael and Darren! 
> -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From erika at cribi.unipd.it Tue Apr 10 05:40:03 2007 From: erika at cribi.unipd.it (Erika Feltrin) Date: Tue, 10 Apr 2007 14:40:03 +0200 Subject: New GOA Baby Boy In-Reply-To: <461B7C53.3030707@ebi.ac.uk> References: <461B7C53.3030707@ebi.ac.uk> Message-ID: <1176208803.10671.70.camel@sophie.cribi.unipd.it> Congratulations to Rachael and Darren! We are all waiting for a photo :-) Erika On Tue, 10/04/2007 at 13.00 +0100, Evelyn Camon wrote: > Hi Everyone, > > We would just like to publicly congratulate Rachael and husband Darren > for the safe arrival of their new baby boy (Sam Gabriel Huntley) on > Thursday night. > > Great news. > > The GOA Team -- Erika Feltrin PhD student University of Padova Tel: +39 049 827 6165 Fax: +39 049 827 6159 Life is what happens to you while you're busy making other plans (J. Lennon) From suzi at berkeleybop.org Tue Apr 10 10:23:34 2007 From: suzi at berkeleybop.org (Suzanna Lewis) Date: Tue, 10 Apr 2007 10:23:34 -0700 Subject: Database Registrary In-Reply-To: <461BA78F.5090906@ebi.ac.uk> References: <461B9302.5010007@ebi.ac.uk> <461BA78F.5090906@ebi.ac.uk> Message-ID: <0E11E802-4A62-4EEA-B717-26B23B64C72A@berkeleybop.org> Do we also have (or can we reach some) agreement on the issues that Chris posed regarding the prefixes? 1. Overloading of the prefix, yea or nay (see his post for the discussion) For example, the MOD for Tetrahymena has: TGD - ID-space for gene/locus accessions/identifiers TGD_Locus - ID-space for gene/locus symbols TGD_Ref - ID-space for reference/publication identifiers 2. No whitespace (that is, directly specify this to make it simpler for computers to use them) 3. 
Exclude certain characters like '/' to make it easier to use the prefixes inside XML docs -S On Apr 10, 2007, at 8:04 AM, apweiler wrote: > I agree with you completly, Jim. > > Thanks for the clarification. > > Rolf > > > Ostell, Jim (NIH/NLM/NCBI) [E] wrote: > >> Based on your response, Rolf, I am not completely sure you >> understood my >> suggestion. It has two parts: >> >> 1) Usage by the databases: We all agree to use the same case. >> 2) Semantic Meaning of dbxrefs: semantic meaning is case insensitive. >> >> So (1) implies that all databases would produce records with >> "\dbxref=MixedCase". >> >> (2) implies that if someone queried our database with either >> "MIXEDCASE", "mixedcase", or "MixedCase", we would return the single >> record "MixedCase". >> >> (2) also implies that we would NOT use "MixedCase" to mean one thing, >> "MIXEDCASE" to mean something different, and "mixedcase" to mean a >> third >> thing. >> >> So, after my attempt to clarify, do you still object? >> >> Jim >> >> >>> -----Original Message----- >>> From: apweiler [mailto:rolf.apweiler at gmail.com] >>> Sent: Tuesday, April 10, 2007 9:37 AM >>> To: Michael Ashburner >>> Cc: Amos Bairoch; Ostell, Jim (NIH/NLM/NCBI) [E]; GO LIST; >>> >> parkinson at ebi.ac.uk; Peter >> >>> D'Eustachio; jane at ebi.ac.uk; Midori Harris; Suzanna Lewis; Rolf >>> >> Apweiler >> >>> Subject: Re: Database Registrary >>> >>> Hi Michael et al, >>> I attach a Excel spreadsheet with some remarks by me in column M. I >>> >> also >> >>> think that the TIGR abbreviations deserve some more thinking. I >>> am fine >>> with all other suggestions you made. And finally my view on Jim >>> >> Ostell's >> >>> remark about the use of all UPPERCASE or MixedCase: I believe we >>> should >>> all use the same casing. I care about uniprotkb vs UNIPROTKB vs >>> >> UniProtKB. >> >>> Cheers >>> >>> Rolf >>> >>> >>> >>> >>>> Colleagues >>>> >>>> Sorry this took so long. 
I attach a pretty colored spreadsheat >>>> of the >>>> database abbreviations used by >>>> the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). >>>> >>>> A: name of resource >>>> B: object referred to within a resource >>>> C: blank >>>> D: GO abbreviation >>>> E: UniProt abbreviation >>>> F: INSD abbreviation >>>> G,H blank >>>> I: url of resource >>>> J: suggestions for edits to db_xref abbreviation. >>>> >>>> Those is RED are the same in all three sources. >>>> Those in blue are used by only two sources and are the same in both >>>> Those in yellow are used by only one source >>>> Those in green are used by >1 source and differ. >>>> >>>> Some principles. >>>> >>>> Where are recourse produces >1 database or has >1 class of object >>>> identifier I >>>> use an A_B syntax, where A is the resource and B is the class of >>>> object within that resource; several >>>> of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". >>>> >>>> In Column J I have made _suggestions_ as to the changes we could >>>> make >>>> that would bring us >>>> all into agreement in those cases where we share a db_xref. >>>> >>>> There may be errors in column B, which was quite hard to do >>>> consistently. >>>> >>>> I regard this as a rough draft and would like to have feedback >>>> on the >>>> general idea and >>>> on details. I also now have a set of abbreviations from >>>> ArrayExpress >>>> but have not incorportated >>>> them. Also Peter, I never heard from the BioPax guys, can you chase >>>> this for me ? >>>> >>>> Best - Michael >>>> >>>> >> >> > > From ma11 at gen.cam.ac.uk Tue Apr 10 13:46:14 2007 From: ma11 at gen.cam.ac.uk (Michael Ashburner (Genetics)) Date: Tue, 10 Apr 2007 21:46:14 +0100 Subject: Database Registrary Message-ID: I am sure they do ! I will write this all up in about 10 days, just off for a week in Cyprus. Thanks for all the input. 
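The prefix rules discussed in this thread (no whitespace, no '/', one agreed casing with case-insensitive matching) can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything the GOC, UniProt or INSD has adopted: the `PrefixRegistry` class and `is_valid_prefix` function are hypothetical names, and the character set is an ASCII-only approximation of the XML Nmtoken production that Chris mentions, with ':' left out on the assumption that it separates the prefix from the local identifier.

```python
import re

# Assumption: ASCII-only approximation of XML Nmtoken characters
# (letters, digits, '.', '_', '-'). ':' is excluded here because it is
# assumed to separate the prefix from the local ID in a db_xref.
# Whitespace and '/' are therefore rejected.
_PREFIX_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_prefix(prefix):
    """Return True if the prefix uses only the allowed characters."""
    return _PREFIX_RE.fullmatch(prefix) is not None


class PrefixRegistry:
    """Hypothetical registry: one agreed casing, case-insensitive lookup."""

    def __init__(self):
        self._canonical = {}  # lower-cased prefix -> agreed casing

    def register(self, prefix):
        if not is_valid_prefix(prefix):
            raise ValueError(f"invalid db_xref prefix: {prefix!r}")
        key = prefix.lower()
        existing = self._canonical.get(key)
        if existing is not None and existing != prefix:
            # Two prefixes that differ only by case may not mean different things.
            raise ValueError(f"{prefix!r} clashes with {existing!r}")
        self._canonical[key] = prefix

    def resolve(self, prefix):
        """Return the agreed casing for any casing of a known prefix, else None."""
        return self._canonical.get(prefix.lower())


registry = PrefixRegistry()
registry.register("UniProtKB")
registry.register("TGD_LOCUS")
print(registry.resolve("UNIPROTKB"))
print(is_valid_prefix("A/B"))
```

Rejecting a second registration that differs only by case is one way to enforce Jim's second point, that "MixedCase", "MIXEDCASE" and "mixedcase" never mean three different things.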
Michael From midori at ebi.ac.uk Thu Apr 12 09:02:22 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Thu, 12 Apr 2007 17:02:22 +0100 (BST) Subject: Alert: proposal to obsolete one cellular component term (no annotations affected) Message-ID: The proposal has been made to obsolete peroxisome targeting signal receptor complex ; GO:0005781 This term is not used in annotations or mappings, so it will be made obsolete in one week (instead of two). The reason for this proposal is that this term represents a single polypeptide rather than a complex; all known peroxisome targeting signal receptors are monomeric. SourceForge link: https://sourceforge.net/tracker/index.php?func=detail&aid=1699380&group_id=36855&atid=440764 From cherry at stanford.edu Sat Apr 14 09:45:32 2007 From: cherry at stanford.edu (Mike Cherry) Date: Sat, 14 Apr 2007 09:45:32 -0700 Subject: Database Registrary In-Reply-To: <0E11E802-4A62-4EEA-B717-26B23B64C72A@berkeleybop.org> References: <461B9302.5010007@ebi.ac.uk> <461BA78F.5090906@ebi.ac.uk> <0E11E802-4A62-4EEA-B717-26B23B64C72A@berkeleybop.org> Message-ID: <01B1A675-CC3F-4A70-8EE8-765C2B09F3D5@stanford.edu> The multiple prefixes were created by the GOC to provide a trivial mapping between IDs and the correct URL for hyperlinks. TGD_LOCUS (or SGD_LOCUS and CGD_LOCUS) are only used in the gene association file (they are not used in any other venue by these projects) simply so AmiGO hyperlinks to the correct web page. We can deal with the hyperlinks for those in AmiGO in other ways, primarily because we know the object class from the field used in the association files. A more difficult task is requiring NCBI, NCBI_GP, NCBI_NM and NCBI_NP to all be NCBI. The accession numbers for the latter two are easy as they have a built-in prefix. However, the accession numbers used with NCBI and NCBI_GP are more difficult to tell apart. For sure no whitespace; it just makes all this so much easier for computers and for humans checking the data files. 
We also should think ahead and limit the characters to those that will not cause problems for XML and other formats. -Mike Based on column A of the spreadsheet the following resources have multiple rows. The number indicates the number of rows with the same name in column A of the spreadsheet. This includes those with overloaded prefixes and those where unification of the prefix is needed.
5 Gramene
4 NCBI Clusters of Orthologous Groups
3 The University of Minnesota Biocatalysis/Biodegradation Database
3 The EcoGene Database of Escherichia coli Sequence and Function
3 Tetrahymena Genome Database
3 Saccharomyces Genome Database
3 H-invitational database
3 DictyBase
3 Candida Genome Database
3 American Type Culture Collection database
3 AGRICultural OnLine Access
2 WormBase, database of nematode biology
2 The Pharmacogenetics and Pharmacogenomics Knowledge Base
2 The Institute for Genomic Research
2 National Center for Biotechnology Information, Bethesda - RefSeq
2 National Center for Biotechnology Information, Bethesda
2 Mouse Genome Informatics
2 Maize Genome Database
2 MEROPS - the Peptidase Database
2 Kyoto Encyclopedia of Genes and Genomes
2 IMGT
2 Gene Ontology
2 FlyBase
2 Bacillus subtilis database
On Apr 10, 2007, at 10:23 AM, Suzanna Lewis wrote: > Do we also have (or can we reach some) agreement on the issues that > Chris posed regard the prefixes? > > 1. Overloading of the prefix, yea or nay (see his post for the > discussion) > For example, the MOD for tetrahymena has: > TGD - ID-space for gene/locus accessions/identifiers > TGD_Locus - ID-space for gene/locus symbols > TGD_Ref - ID-space for reference/publication identifiers > > 2. No whitespace (that is, directly specify this to make it simpler > for computers to use them) > > 3. Exclude certain characters like '/' to make it easier to use the > prefixes inside XML docs > > -S > >>>> >>>>> Colleagues >>>>> >>>>> Sorry this took so long. 
I attach a pretty colored spreadsheat >>>>> of the >>>>> database abbreviations used by >>>>> the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). >>>>> >>>>> A: name of resource >>>>> B: object referred to within a resource >>>>> C: blank >>>>> D: GO abbreviation >>>>> E: UniProt abbreviation >>>>> F: INSD abbreviation >>>>> G,H blank >>>>> I: url of resource >>>>> J: suggestions for edits to db_xref abbreviation. >>>>> >>>>> Those is RED are the same in all three sources. >>>>> Those in blue are used by only two sources and are the same in >>>>> both >>>>> Those in yellow are used by only one source >>>>> Those in green are used by >1 source and differ. >>>>> >>>>> Some principles. >>>>> >>>>> Where are recourse produces >1 database or has >1 class of object >>>>> identifier I >>>>> use an A_B syntax, where A is the resource and B is the class of >>>>> object within that resource; several >>>>> of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". >>>>> >>>>> In Column J I have made _suggestions_ as to the changes we >>>>> could make >>>>> that would bring us >>>>> all into agreement in those cases where we share a db_xref. >>>>> >>>>> There may be errors in column B, which was quite hard to do >>>>> consistently. >>>>> >>>>> I regard this as a rough draft and would like to have feedback >>>>> on the >>>>> general idea and >>>>> on details. I also now have a set of abbreviations from >>>>> ArrayExpress >>>>> but have not incorportated >>>>> them. Also Peter, I never heard from the BioPax guys, can you >>>>> chase >>>>> this for me ? 
>>>>> >>>>> Best - Michael >>>>> From eurie at genome.Stanford.EDU Mon Apr 16 09:56:38 2007 From: eurie at genome.Stanford.EDU (Eurie Hong) Date: Mon, 16 Apr 2007 09:56:38 -0700 Subject: April 11th manager's minutes Message-ID: <5B71EC22-6427-4977-9EFE-65B3ECD96AB7@genome.stanford.edu> The minutes from Wednesday's GO Managers conference call are now on the internal wiki: http://gocwiki.geneontology.org/index.php/Managers_11Apr07 If you would like a particular issue to be discussed at the next managers' call, please contact the relevant manager(s): Reference Genomes: Rex User Advocacy: Jane and Eurie Content Development: Midori and David Annotation Outreach: Jennifer Software: Chris For general management and budget issues, contact the GO PIs . Thanks, eurie From ostell at ncbi.nlm.nih.gov Tue Apr 10 12:41:44 2007 From: ostell at ncbi.nlm.nih.gov (Ostell, Jim (NIH/NLM/NCBI) [E]) Date: Tue, 10 Apr 2007 15:41:44 -0400 Subject: Database Registrary In-Reply-To: <0E11E802-4A62-4EEA-B717-26B23B64C72A@berkeleybop.org> References: <461B9302.5010007@ebi.ac.uk> <461BA78F.5090906@ebi.ac.uk> <0E11E802-4A62-4EEA-B717-26B23B64C72A@berkeleybop.org> Message-ID: 1) Yeah 2) Yeah 3) Yeah (to quote the Beatles, who must also agree with Chris). Jim Ostell >-----Original Message----- >From: Suzanna Lewis [mailto:suzi at berkeleybop.org] >Sent: Tuesday, April 10, 2007 1:24 PM >To: apweiler at ebi.ac.uk >Cc: Ostell, Jim (NIH/NLM/NCBI) [E]; Michael Ashburner; Amos Bairoch; GO LIST; >parkinson at ebi.ac.uk; Peter D'Eustachio; jane at ebi.ac.uk; Midori Harris >Subject: Re: Database Registrary > >Do we also have (or can we reach some) agreement on the issues that >Chris posed regard the prefixes? > >1. Overloading of the prefix, yea or nay (see his post for the >discussion) >For example, the MOD for tetrahymena has: >TGD - ID-space for gene/locus accessions/identifiers >TGD_Locus - ID-space for gene/locus symbols >TGD_Ref - ID-space for reference/publication identifiers > >2. 
No whitespace (that is, directly specify this to make it simpler >for computers to use them) > >3. Exclude certain characters like '/' to make it easier to use the >prefixes inside XML docs > >-S > >On Apr 10, 2007, at 8:04 AM, apweiler wrote: > >> I agree with you completly, Jim. >> >> Thanks for the clarification. >> >> Rolf >> >> >> Ostell, Jim (NIH/NLM/NCBI) [E] wrote: >> >>> Based on your response, Rolf, I am not completely sure you >>> understood my >>> suggestion. It has two parts: >>> >>> 1) Usage by the databases: We all agree to use the same case. >>> 2) Semantic Meaning of dbxrefs: semantic meaning is case insensitive. >>> >>> So (1) implies that all databases would produce records with >>> "\dbxref=MixedCase". >>> >>> (2) implies that if someone queried our database with either >>> "MIXEDCASE", "mixedcase", or "MixedCase", we would return the single >>> record "MixedCase". >>> >>> (2) also implies that we would NOT use "MixedCase" to mean one thing, >>> "MIXEDCASE" to mean something different, and "mixedcase" to mean a >>> third >>> thing. >>> >>> So, after my attempt to clarify, do you still object? >>> >>> Jim >>> >>> >>>> -----Original Message----- >>>> From: apweiler [mailto:rolf.apweiler at gmail.com] >>>> Sent: Tuesday, April 10, 2007 9:37 AM >>>> To: Michael Ashburner >>>> Cc: Amos Bairoch; Ostell, Jim (NIH/NLM/NCBI) [E]; GO LIST; >>>> >>> parkinson at ebi.ac.uk; Peter >>> >>>> D'Eustachio; jane at ebi.ac.uk; Midori Harris; Suzanna Lewis; Rolf >>>> >>> Apweiler >>> >>>> Subject: Re: Database Registrary >>>> >>>> Hi Michael et al, >>>> I attach a Excel spreadsheet with some remarks by me in column M. I >>>> >>> also >>> >>>> think that the TIGR abbreviations deserve some more thinking. I >>>> am fine >>>> with all other suggestions you made. And finally my view on Jim >>>> >>> Ostell's >>> >>>> remark about the use of all UPPERCASE or MixedCase: I believe we >>>> should >>>> all use the same casing. 
I care about uniprotkb vs UNIPROTKB vs >>>> >>> UniProtKB. >>> >>>> Cheers >>>> >>>> Rolf >>>> >>>> >>>> >>>> >>>>> Colleagues >>>>> >>>>> Sorry this took so long. I attach a pretty colored spreadsheat >>>>> of the >>>>> database abbreviations used by >>>>> the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). >>>>> >>>>> A: name of resource >>>>> B: object referred to within a resource >>>>> C: blank >>>>> D: GO abbreviation >>>>> E: UniProt abbreviation >>>>> F: INSD abbreviation >>>>> G,H blank >>>>> I: url of resource >>>>> J: suggestions for edits to db_xref abbreviation. >>>>> >>>>> Those is RED are the same in all three sources. >>>>> Those in blue are used by only two sources and are the same in both >>>>> Those in yellow are used by only one source >>>>> Those in green are used by >1 source and differ. >>>>> >>>>> Some principles. >>>>> >>>>> Where are recourse produces >1 database or has >1 class of object >>>>> identifier I >>>>> use an A_B syntax, where A is the resource and B is the class of >>>>> object within that resource; several >>>>> of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". >>>>> >>>>> In Column J I have made _suggestions_ as to the changes we could >>>>> make >>>>> that would bring us >>>>> all into agreement in those cases where we share a db_xref. >>>>> >>>>> There may be errors in column B, which was quite hard to do >>>>> consistently. >>>>> >>>>> I regard this as a rough draft and would like to have feedback >>>>> on the >>>>> general idea and >>>>> on details. I also now have a set of abbreviations from >>>>> ArrayExpress >>>>> but have not incorportated >>>>> them. Also Peter, I never heard from the BioPax guys, can you chase >>>>> this for me ? 
>>>>> >>>>> Best - Michael >>>>> >>>>> >>> >>> >> >> From ostell at ncbi.nlm.nih.gov Tue Apr 10 07:57:21 2007 From: ostell at ncbi.nlm.nih.gov (Ostell, Jim (NIH/NLM/NCBI) [E]) Date: Tue, 10 Apr 2007 10:57:21 -0400 Subject: Database Registrary In-Reply-To: <461B9302.5010007@ebi.ac.uk> References: <461B9302.5010007@ebi.ac.uk> Message-ID: Based on your response, Rolf, I am not completely sure you understood my suggestion. It has two parts: 1) Usage by the databases: We all agree to use the same case. 2) Semantic Meaning of dbxrefs: semantic meaning is case insensitive. So (1) implies that all databases would produce records with "\dbxref=MixedCase". (2) implies that if someone queried our database with either "MIXEDCASE", "mixedcase", or "MixedCase", we would return the single record "MixedCase". (2) also implies that we would NOT use "MixedCase" to mean one thing, "MIXEDCASE" to mean something different, and "mixedcase" to mean a third thing. So, after my attempt to clarify, do you still object? Jim >-----Original Message----- >From: apweiler [mailto:rolf.apweiler at gmail.com] >Sent: Tuesday, April 10, 2007 9:37 AM >To: Michael Ashburner >Cc: Amos Bairoch; Ostell, Jim (NIH/NLM/NCBI) [E]; GO LIST; parkinson at ebi.ac.uk; Peter >D'Eustachio; jane at ebi.ac.uk; Midori Harris; Suzanna Lewis; Rolf Apweiler >Subject: Re: Database Registrary > >Hi Michael et al, >I attach a Excel spreadsheet with some remarks by me in column M. I also >think that the TIGR abbreviations deserve some more thinking. I am fine >with all other suggestions you made. And finally my view on Jim Ostell's >remark about the use of all UPPERCASE or MixedCase: I believe we should >all use the same casing. I care about uniprotkb vs UNIPROTKB vs UniProtKB. > >Cheers > >Rolf > > > >> >> Colleagues >> >> Sorry this took so long. I attach a pretty colored spreadsheat of the >> database abbreviations used by >> the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). 
>> >> A: name of resource >> B: object referred to within a resource >> C: blank >> D: GO abbreviation >> E: UniProt abbreviation >> F: INSD abbreviation >> G,H blank >> I: url of resource >> J: suggestions for edits to db_xref abbreviation. >> >> Those is RED are the same in all three sources. >> Those in blue are used by only two sources and are the same in both >> Those in yellow are used by only one source >> Those in green are used by >1 source and differ. >> >> Some principles. >> >> Where are recourse produces >1 database or has >1 class of object >> identifier I >> use an A_B syntax, where A is the resource and B is the class of >> object within that resource; several >> of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". >> >> In Column J I have made _suggestions_ as to the changes we could make >> that would bring us >> all into agreement in those cases where we share a db_xref. >> >> There may be errors in column B, which was quite hard to do >> consistently. >> >> I regard this as a rough draft and would like to have feedback on the >> general idea and >> on details. I also now have a set of abbreviations from ArrayExpress >> but have not incorportated >> them. Also Peter, I never heard from the BioPax guys, can you chase >> this for me ? >> >> Best - Michael >> From rolf.apweiler at gmail.com Tue Apr 10 06:37:06 2007 From: rolf.apweiler at gmail.com (apweiler) Date: Tue, 10 Apr 2007 14:37:06 +0100 Subject: Database Registrary In-Reply-To: References: Message-ID: <461B9302.5010007@ebi.ac.uk> Hi Michael et al, I attach a Excel spreadsheet with some remarks by me in column M. I also think that the TIGR abbreviations deserve some more thinking. I am fine with all other suggestions you made. And finally my view on Jim Ostell's remark about the use of all UPPERCASE or MixedCase: I believe we should all use the same casing. I care about uniprotkb vs UNIPROTKB vs UniProtKB. Cheers Rolf > > Colleagues > > Sorry this took so long. 
I attach a pretty colored spreadsheat of the > database abbreviations used by > the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). > > A: name of resource > B: object referred to within a resource > C: blank > D: GO abbreviation > E: UniProt abbreviation > F: INSD abbreviation > G,H blank > I: url of resource > J: suggestions for edits to db_xref abbreviation. > > Those is RED are the same in all three sources. > Those in blue are used by only two sources and are the same in both > Those in yellow are used by only one source > Those in green are used by >1 source and differ. > > Some principles. > > Where are recourse produces >1 database or has >1 class of object > identifier I > use an A_B syntax, where A is the resource and B is the class of > object within that resource; several > of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". > > In Column J I have made _suggestions_ as to the changes we could make > that would bring us > all into agreement in those cases where we share a db_xref. > > There may be errors in column B, which was quite hard to do > consistently. > > I regard this as a rough draft and would like to have feedback on the > general idea and > on details. I also now have a set of abbreviations from ArrayExpress > but have not incorportated > them. Also Peter, I never heard from the BioPax guys, can you chase > this for me ? > > Best - Michael > -------------- next part -------------- A non-text attachment was scrubbed... Name: db-reg.xls Type: application/vnd.ms-excel Size: 61440 bytes Desc: not available Url : http://fafner.stanford.edu/pipermail/go/attachments/20070410/03585370/attachment.xls
From rolf.apweiler at gmail.com Tue Apr 10 08:04:47 2007 From: rolf.apweiler at gmail.com (apweiler) Date: Tue, 10 Apr 2007 16:04:47 +0100 Subject: Database Registrary In-Reply-To: References: <461B9302.5010007@ebi.ac.uk> Message-ID: <461BA78F.5090906@ebi.ac.uk> I agree with you completly, Jim. Thanks for the clarification. Rolf Ostell, Jim (NIH/NLM/NCBI) [E] wrote: >Based on your response, Rolf, I am not completely sure you understood my >suggestion. It has two parts: > >1) Usage by the databases: We all agree to use the same case. >2) Semantic Meaning of dbxrefs: semantic meaning is case insensitive.
> >So (1) implies that all databases would produce records with >"\dbxref=MixedCase". > >(2) implies that if someone queried our database with either >"MIXEDCASE", "mixedcase", or "MixedCase", we would return the single >record "MixedCase". > >(2) also implies that we would NOT use "MixedCase" to mean one thing, >"MIXEDCASE" to mean something different, and "mixedcase" to mean a third >thing. > >So, after my attempt to clarify, do you still object? > > Jim > > > >>-----Original Message----- >>From: apweiler [mailto:rolf.apweiler at gmail.com] >>Sent: Tuesday, April 10, 2007 9:37 AM >>To: Michael Ashburner >>Cc: Amos Bairoch; Ostell, Jim (NIH/NLM/NCBI) [E]; GO LIST; >> >> >parkinson at ebi.ac.uk; Peter > > >>D'Eustachio; jane at ebi.ac.uk; Midori Harris; Suzanna Lewis; Rolf >> >> >Apweiler > > >>Subject: Re: Database Registrary >> >>Hi Michael et al, >>I attach a Excel spreadsheet with some remarks by me in column M. I >> >> >also > > >>think that the TIGR abbreviations deserve some more thinking. I am fine >>with all other suggestions you made. And finally my view on Jim >> >> >Ostell's > > >>remark about the use of all UPPERCASE or MixedCase: I believe we should >>all use the same casing. I care about uniprotkb vs UNIPROTKB vs >> >> >UniProtKB. > > >>Cheers >> >>Rolf >> >> >> >> >> >>>Colleagues >>> >>>Sorry this took so long. I attach a pretty colored spreadsheat of the >>>database abbreviations used by >>>the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). >>> >>>A: name of resource >>>B: object referred to within a resource >>>C: blank >>>D: GO abbreviation >>>E: UniProt abbreviation >>>F: INSD abbreviation >>>G,H blank >>>I: url of resource >>>J: suggestions for edits to db_xref abbreviation. >>> >>>Those is RED are the same in all three sources. >>>Those in blue are used by only two sources and are the same in both >>>Those in yellow are used by only one source >>>Those in green are used by >1 source and differ. >>> >>>Some principles. 
>>> >>>Where are recourse produces >1 database or has >1 class of object >>>identifier I >>>use an A_B syntax, where A is the resource and B is the class of >>>object within that resource; several >>>of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". >>> >>>In Column J I have made _suggestions_ as to the changes we could make >>>that would bring us >>>all into agreement in those cases where we share a db_xref. >>> >>>There may be errors in column B, which was quite hard to do >>>consistently. >>> >>>I regard this as a rough draft and would like to have feedback on the >>>general idea and >>>on details. I also now have a set of abbreviations from ArrayExpress >>>but have not incorportated >>>them. Also Peter, I never heard from the BioPax guys, can you chase >>>this for me ? >>> >>>Best - Michael >>> >>> >>> > > > From ostell at ncbi.nlm.nih.gov Mon Apr 9 08:23:02 2007 From: ostell at ncbi.nlm.nih.gov (Ostell, Jim (NIH/NLM/NCBI) [E]) Date: Mon, 9 Apr 2007 11:23:02 -0400 Subject: Database Registrary In-Reply-To: References: Message-ID: Hi Michael, I have cc'd Ilene Mizrachi above, who is the GenBank Coordinator here. She will be our point person on this. I forwarded her the spreadsheet. There are remarkably few that need to be changed. I am pleasantly surprised by that. Several of them appear to only differ in the use of all UPPERCASE or MixedCase. From the point of view of matching as valid xrefs, I would suggest that case should not matter. Allowing case independent matching means that less people's local datafiles will become invalid through this exercise. It also means that when determining if a NEW dbxref can be allowed, it has to not match any exisiting xref independent of case. This is a good thing. We really don't want to allow "This" and "THIS" to mean two different dbxrefs. Does anyone disagree with that? Assuming we can agree that matching is case independent, nonetheless, recommended usage would be for everyone to use the same case. 
It just makes things more legible to humans, and there is no reason not to. Jim >-----Original Message----- >From: Michael Ashburner [mailto:ma11 at gen.cam.ac.uk] >Sent: Monday, April 09, 2007 10:53 AM >To: Michael Ashburner (Genetics) >Cc: Rolf Apweiler; Amos Bairoch; Ostell, Jim (NIH/NLM/NCBI) [E]; GO LIST; >parkinson at ebi.ac.uk; Peter D'Eustachio; jane at ebi.ac.uk; Midori Harris; Suzanna Lewis >Subject: Re: Database Registrary > > >Colleagues > >Sorry this took so long. I attach a pretty colored spreadsheat of the >database abbreviations used by >the GO, UniProt and INSD (i.e. Genbank, DDBJ and EMBL). > >A: name of resource >B: object referred to within a resource >C: blank >D: GO abbreviation >E: UniProt abbreviation >F: INSD abbreviation >G,H blank >I: url of resource >J: suggestions for edits to db_xref abbreviation. > >Those is RED are the same in all three sources. >Those in blue are used by only two sources and are the same in both >Those in yellow are used by only one source >Those in green are used by >1 source and differ. > >Some principles. > >Where are recourse produces >1 database or has >1 class of object >identifier I >use an A_B syntax, where A is the resource and B is the class of >object within that resource; several >of us already use that syntax, eg "BDGP_EST" or "CGD_LOCUS". > >In Column J I have made _suggestions_ as to the changes we could make >that would bring us >all into agreement in those cases where we share a db_xref. > >There may be errors in column B, which was quite hard to do >consistently. > >I regard this as a rough draft and would like to have feedback on the >general idea and >on details. I also now have a set of abbreviations from ArrayExpress >but have not incorportated >them. Also Peter, I never heard from the BioPax guys, can you chase >this for me ? 
> >Best - Michael From ewijaya at i2r.a-star.edu.sg Sun Apr 15 20:56:13 2007 From: ewijaya at i2r.a-star.edu.sg (Edward WIJAYA) Date: Mon, 16 Apr 2007 11:56:13 +0800 Subject: Extracting Gene Names from GO ID Message-ID: <4622F3DD.8060208@i2r.a-star.edu.sg> Hi, Given a particular GO ID e.g. GO:0000016, is there a way to extract all their related "gene's names"? Thanks and hope to hear from you again. -- Edward ------------ Institute For Infocomm Research - Disclaimer ------------- This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its contents to any other person. Thank you. -------------------------------------------------------- From qdong at genome.Stanford.EDU Mon Apr 16 14:22:08 2007 From: qdong at genome.Stanford.EDU (Stan Dong) Date: Mon, 16 Apr 2007 14:22:08 -0700 (PDT) Subject: Extracting Gene Names from GO ID In-Reply-To: <4622F3DD.8060208@i2r.a-star.edu.sg> References: <4622F3DD.8060208@i2r.a-star.edu.sg> Message-ID: You may want to take a look at this page. http://amigo.geneontology.org/dev/sql/doc/example-queries.html Especially 'finding every fly gene product ...' item, not sure if what you want though. -Stan On Mon, 16 Apr 2007, Edward WIJAYA wrote: > > Hi, > > Given a particular GO ID e.g. GO:0000016, > is there a way to extract all their related "gene's names"? > > Thanks and hope to hear from you again. > -- > Edward > > > ------------ Institute For Infocomm Research - Disclaimer ------------- > This email is confidential and may be privileged. If you are not the > intended recipient, please delete it and notify us immediately. Please do not > copy or use it for any purpose, or disclose its contents to any other person. > Thank you. 
> -------------------------------------------------------- > From dbarrell at ebi.ac.uk Tue Apr 17 07:34:58 2007 From: dbarrell at ebi.ac.uk (Daniel Barrell) Date: Tue, 17 Apr 2007 15:34:58 +0100 Subject: announcement of public MySQL mirror Message-ID: <4624DB12.4070704@ebi.ac.uk> Hi, Before I sent this out I just wanted to check that everyone is ok with contact information etc: ---- We are pleased to announce a public GO MySQL mirror at the EBI. This offers a remote connection to a regularly updated mirror of the GO schema including all IEA data. The database automatically updates itself 2:30am GMT every Sunday which means that it will not be available for a short time; therefore you should have a guaranteed connection Monday to Saturday. Connection details are: user: go_select password: amigo host: mysql.ebi.ac.uk port: 4085 Example connection from command line: $ mysql -hmysql.ebi.ac.uk -ugo_select -pamigo -P4085 For example queries please see the following pages: http://amigo.geneontology.org/dev/sql/doc/example-queries.html http://wiki.geneontology.org/index.php/Example_Queries If you have any problem with the service please contact gohelp at genome.stanford.edu. ---- I'll send it to go_friends and go_database. Cheers Dan -- Daniel Barrell EMBL - The EBI Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SD Phone: +44 (0)1223 492551 Email: dbarrell at ebi.ac.uk From jane at ebi.ac.uk Tue Apr 17 07:43:17 2007 From: jane at ebi.ac.uk (Jane Lomax) Date: Tue, 17 Apr 2007 15:43:17 +0100 Subject: announcement of public MySQL mirror In-Reply-To: <4624DB12.4070704@ebi.ac.uk> References: <4624DB12.4070704@ebi.ac.uk> Message-ID: <462E7FC0-6B71-49FD-ADED-00A4AE9D4786@ebi.ac.uk> Looks good to me Dan. We should add to the website, and we could also put an announcement in the next newsletter... 
Jane On 17 Apr 2007, at 15:34, Daniel Barrell wrote: > Hi, > > Before I sent this out I just wanted to check that everyone is ok > with contact information etc: > > ---- > > We are pleased to announce a public GO MySQL mirror at the EBI. > This offers a remote connection to a regularly updated mirror of > the GO schema including all IEA data. The database automatically > updates itself 2:30am GMT every Sunday which means that it will not > be available for a short time; therefore you should have a > guaranteed connection Monday to Saturday. Connection details are: > > user: go_select > password: amigo > host: mysql.ebi.ac.uk > port: 4085 > > Example connection from command line: > > $ mysql -hmysql.ebi.ac.uk -ugo_select -pamigo -P4085 > > For example queries please see the following pages: > > http://amigo.geneontology.org/dev/sql/doc/example-queries.html > http://wiki.geneontology.org/index.php/Example_Queries > > If you have any problem with the service please contact > gohelp at genome.stanford.edu. > > ---- > > I'll send it to go_friends and go_database. > > Cheers > > Dan > > -- > > Daniel Barrell > EMBL - The EBI > Wellcome Trust Genome Campus > Hinxton, Cambridge CB10 1SD > Phone: +44 (0)1223 492551 > Email: dbarrell at ebi.ac.uk From cjm at fruitfly.org Tue Apr 17 09:36:30 2007 From: cjm at fruitfly.org (Chris Mungall) Date: Tue, 17 Apr 2007 09:36:30 -0700 Subject: announcement of public MySQL mirror In-Reply-To: <462E7FC0-6B71-49FD-ADED-00A4AE9D4786@ebi.ac.uk> References: <4624DB12.4070704@ebi.ac.uk> <462E7FC0-6B71-49FD-ADED-00A4AE9D4786@ebi.ac.uk> Message-ID: <319C7BFB-AE5A-4BA9-9971-62B06A7BEB19@fruitfly.org> Sounds good - but perhaps we should wait till we have reorganised the database docs before announcing to the general public (real soon now!) - in the meantime we in the GOC could try hammering it for a while. Dan - is this running on the same server as any other database like ensembl/mart? May be cool to do cross-queries across them... 
On Apr 17, 2007, at 7:43 AM, Jane Lomax wrote: > Looks good to me Dan. We should add to the website, and we could > also put an announcement in the next newsletter... > > Jane > > On 17 Apr 2007, at 15:34, Daniel Barrell wrote: > >> Hi, >> >> Before I sent this out I just wanted to check that everyone is ok >> with contact information etc: >> >> ---- >> >> We are pleased to announce a public GO MySQL mirror at the EBI. >> This offers a remote connection to a regularly updated mirror of >> the GO schema including all IEA data. The database automatically >> updates itself 2:30am GMT every Sunday which means that it will >> not be available for a short time; therefore you should have a >> guaranteed connection Monday to Saturday. Connection details are: >> >> user: go_select >> password: amigo >> host: mysql.ebi.ac.uk >> port: 4085 >> >> Example connection from command line: >> >> $ mysql -hmysql.ebi.ac.uk -ugo_select -pamigo -P4085 >> >> For example queries please see the following pages: >> >> http://amigo.geneontology.org/dev/sql/doc/example-queries.html >> http://wiki.geneontology.org/index.php/Example_Queries >> >> If you have any problem with the service please contact >> gohelp at genome.stanford.edu. >> >> ---- >> >> I'll send it to go_friends and go_database. >> >> Cheers >> >> Dan >> >> -- >> >> Daniel Barrell >> EMBL - The EBI >> Wellcome Trust Genome Campus >> Hinxton, Cambridge CB10 1SD >> Phone: +44 (0)1223 492551 >> Email: dbarrell at ebi.ac.uk > > From hitz at genome.Stanford.EDU Tue Apr 17 11:22:37 2007 From: hitz at genome.Stanford.EDU (Benjamin Hitz) Date: Tue, 17 Apr 2007 11:22:37 -0700 Subject: Clarification of Redundancy checks in GAFs / godb Message-ID: From the Jan. GOC meeting minutes. 23. Action item for Mike(?): Change documentation to make clear that annotation file column 1 gives the database from which the identifier in column 2 is drawn, NOT (as it currently says) the id of the submitting authority. 24. 
Action item for Daniel: Form working group to help groups that are having problems resolving redundancy issues.
I ran across a case in the WormBase file where air-2 has both a WB identifier and a UniProt (association via IntAct) identifier, even though they are clearly the same entity.
... many air-2 assocs from WB removed ...
WB WBGene00000099 air-2 GO:0007109 WB:WBPaper00004303|PMID:10983970 IMP P let-603|stu-7|cyk-6|NM_059313|1G515 gene taxon: 6239 20041019 WB
WB WBGene00000099 air-2 GO:0004674 WB:WBPaper00004303|PMID:10983970 NAS F let-603|stu-7|cyk-6|NM_059313|1G515 gene taxon: 6239 20041019 WB
UniProt O01427 AIR2_CAEEL GO:0005515 PMID: 14704431 IPI UniProt:Q9GSQ0 F air-2, stu-7, B0207.4: Aurora/Ipl1-related protein kinase 2 protein taxon: 6239 20070403 IntAct
Actually, one is the "gene" and one is the "protein". If you do a search on air-2 in AmiGO, you actually find _3_ gene products, because WormBase also has 3 cellular component associations to the protein, AIR-2.
http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?action=query&view=query&query=air-2&search_constraint=gp
I don't mean to pick on nematodes here; this is just a random example I was having trouble with. I assume there are similar cases (of identical gene/gene-products where 1 is from UniProt and 1 from the source MOD). I don't know if other GO sources "mix" DB_Object types (protein/gene) for things that are derived from the same gene. I could imagine cases where you want to distinguish the function of a transcript from that of a protein, but I don't think any of the above qualify. There is no way for the database to tell that these are all actually the same thing (air-2, AIR-2, AIR2_CAEEL) - but I think they should be.
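The redundancy described above can be made concrete with a minimal Python sketch. It assumes the 15-column, tab-separated gene association layout; the sample rows are rebuilt from the excerpt above, so the positions of the empty columns (and the split between name and synonym in the UniProt row) are assumptions. The names parse_gaf_line and group_by_object are hypothetical helpers, not part of any GO tool.

```python
from collections import defaultdict

# Column names for the 15-column, tab-separated gene association layout.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence", "with_from", "aspect", "db_object_name",
    "synonym", "db_object_type", "taxon", "date", "assigned_by",
]


def parse_gaf_line(line):
    """Split one association line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(
            f"expected {len(GAF_COLUMNS)} columns, got {len(fields)}"
        )
    return dict(zip(GAF_COLUMNS, fields))


def group_by_object(lines):
    """Group GO IDs by (column 1, column 2), the only key a loader sees."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and "!" comment lines
        row = parse_gaf_line(line)
        groups[(row["db"], row["db_object_id"])].append(row["go_id"])
    return dict(groups)


# Rows rebuilt from the excerpt; empty-column positions are assumptions.
SAMPLE = [
    "\t".join([
        "WB", "WBGene00000099", "air-2", "", "GO:0007109",
        "WB:WBPaper00004303|PMID:10983970", "IMP", "", "P", "",
        "let-603|stu-7|cyk-6|NM_059313|1G515", "gene", "taxon:6239",
        "20041019", "WB",
    ]),
    "\t".join([
        "WB", "WBGene00000099", "air-2", "", "GO:0004674",
        "WB:WBPaper00004303|PMID:10983970", "NAS", "", "F", "",
        "let-603|stu-7|cyk-6|NM_059313|1G515", "gene", "taxon:6239",
        "20041019", "WB",
    ]),
    "\t".join([
        "UniProt", "O01427", "AIR2_CAEEL", "", "GO:0005515",
        "PMID:14704431", "IPI", "UniProt:Q9GSQ0", "F",
        "air-2, stu-7, B0207.4: Aurora/Ipl1-related protein kinase 2", "",
        "protein", "taxon:6239", "20070403", "IntAct",
    ]),
]

for key, go_ids in group_by_object(SAMPLE).items():
    print(key, go_ids)
```

Keyed on columns 1 and 2 alone, the gene-level WB rows and the protein-level UniProt row fall into separate groups, and nothing in the rows themselves links the two identifiers.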
Ben -- Ben Hitz Senior Scientific Programmer ** Saccharomyces Genome Database ** GO Consortium Stanford University ** hitz at genome.stanford.edu From pj37 at cornell.edu Tue Apr 17 12:30:00 2007 From: pj37 at cornell.edu (Pankaj Jaiswal) Date: Tue, 17 Apr 2007 15:30:00 -0400 Subject: Clarification of Redundancy checks in GAFs / godb In-Reply-To: References: Message-ID: <46252038.7030304@cornell.edu> Benjamin Hitz wrote: > > From the Jan. GOC meeting minutes. > > 23. Action item for Mike(?): Change documentation to make clear that > annotation file column 1 gives the database from which the identifier > in column 2 is drawn, NOT (as it currently says) the id of the > submitting authority. > 24. Action item for Daniel: Form working group to help groups that are > having problems resolving redundancy issues. > > I ran across a case in the wormbase file where air-2 has both a wb > identifier and a uniprot (association via intAct) identifier, even > though they are clearly the same entity. > > ... many air-2 assocs from WB removed ... > WB WBGene00000099 air-2 GO:0007109 > WB:WBPaper00004303|PMID:10983970 IMP P > let-603|stu-7|cyk-6|NM_059313|1G515 gene taxon: 6239 > 20041019 WB > WB WBGene00000099 air-2 GO:0004674 > WB:WBPaper00004303|PMID:10983970 NAS F > let-603|stu-7|cyk-6|NM_059313|1G515 gene taxon: 6239 > 20041019 WB > UniProt O01427 AIR2_CAEEL GO:0005515 PMID: 14704431 > IPI UniProt:Q9GSQ0 F air-2, stu-7, B0207.4: > Aurora/Ipl1-related protein kinase 2 protein taxon: > 6239 20070403 IntAct > > Actually, one is the "gene" and one is the "protein". > > If you do a search on air-2 in amigo, you actually find _3_ gene > products, because wormbase also has 3 cellular component associations > to the protein, AIR-2. > > http://amigo.geneontology.org/cgi-bin/amigo/go.cgi? > action=query&view=query&query=air-2&search_constraint=gp > > I don't mean to pick on nematodes here, this is just a random example I > was having trouble with. 
I assume there are similar cases (of > identical gene/gene-products where 1 is from uniprot and 1 from source > mod). I don't know if other GO sources "mix" DB_Object types > (protein/gene) for things are are derived from the same gene. I could > imagine cases where you want to distinguish the function of a > transcript from protein, but I don't think any of the above qualify. > > There is no way for the database to tell that these are all actually > the same thing (air-2, AIR-2, AIR2_CAEEL) - but I think they should be. > I see your point and would love to see a mechanism where we can say that the different object types are related. Sometime last year I did raise the similar question on how to track the objects, gene, transcript and protein and their relationships, if all of them have been annotated for GO with different object types. At that time my suggestion was to add to the association files the gene_ID, Gene_symbol and Gene_name from which the gene products (the objects) are derived and have the GO annotations. This would probably solve the problem since gene would be the only common link among annotations of the gene products (transcript and protein). However, implementing such a thing may require a lot of work and change of strategy on part of the contributor dbs. Also the gene info in association files can stay optional, as some of the dbs may not have the gene info such as Uniprot. Pankaj From ranjana at caltech.edu Tue Apr 17 12:51:57 2007 From: ranjana at caltech.edu (ranjana) Date: Tue, 17 Apr 2007 12:51:57 -0700 Subject: Clarification of Redundancy checks in GAFs / godb In-Reply-To: References: Message-ID: <4625255D.1090706@caltech.edu> Hi Benjamin, I feel obliged to speak up because you have chosen the WormBase file :-) for this example. 
I know it's hard from a programming perspective to know that all the entities annotated relate to the same gene, but biologically all of the examples you cite are different annotations with different evidence codes, so in that way they are *not* redundant. Also since the GO allows you to annotate to the gene product --transcript or protein, that's what we do here at WormBase (since we have valid identifiers for these in WormBase). That's why for a genetic experiment where information is derived from the mutant we are annotating the 'gene' and for cellular component where we are looking at where the protein is expressed we annotate to the 'protein'. The *entities* annotated to are different though they are related to the same gene. I think we've had this discussion in various flavors, but as you point out the problem still remains--how can the database tell that they all relate to the same gene? Cheers Ranjana Benjamin Hitz wrote: > > From the Jan. GOC meeting minutes. > > 23. Action item for Mike(?): Change documentation to make clear that > annotation file column 1 gives the database from which the identifier > in column 2 is drawn, NOT (as it currently says) the id of the > submitting authority. > 24. Action item for Daniel: Form working group to help groups that are > having problems resolving redundancy issues. > > I ran across a case in the wormbase file where air-2 has both a wb > identifier and a uniprot (association via intAct) identifier, even > though they are clearly the same entity. > > ... many air-2 assocs from WB removed ... 
> WB WBGene00000099 air-2 GO:0007109 > WB:WBPaper00004303|PMID:10983970 IMP > P let-603|stu-7|cyk-6|NM_059313|1G515 gene > taxon:6239 20041019 WB > WB WBGene00000099 air-2 GO:0004674 > WB:WBPaper00004303|PMID:10983970 NAS > F let-603|stu-7|cyk-6|NM_059313|1G515 gene > taxon:6239 20041019 WB > UniProt O01427 AIR2_CAEEL GO:0005515 > PMID:14704431 IPI UniProt:Q9GSQ0 F air-2, stu-7, B0207.4: > Aurora/Ipl1-related protein kinase 2 protein > taxon:6239 20070403 IntAct > > Actually, one is the "gene" and one is the "protein". > > If you do a search on air-2 in amigo, you actually find _3_ gene > products, because wormbase also has 3 cellular component associations > to the protein, AIR-2. > > http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?action=query&view=query&query=air-2&search_constraint=gp > > > I don't mean to pick on nematodes here, this is just a random example > I was having trouble with. I assume there are similar cases (of > identical gene/gene-products where 1 is from uniprot and 1 from source > mod). I don't know if other GO sources "mix" DB_Object types > (protein/gene) for things are are derived from the same gene. I could > imagine cases where you want to distinguish the function of a > transcript from protein, but I don't think any of the above qualify. > > There is no way for the database to tell that these are all actually > the same thing (air-2, AIR-2, AIR2_CAEEL) - but I think they should be. 
> > Ben > -- > Ben Hitz > Senior Scientific Programmer ** Saccharomyces Genome Database ** GO > Consortium > Stanford University ** hitz at genome.stanford.edu > > > From hitz at genome.Stanford.EDU Tue Apr 17 15:03:36 2007 From: hitz at genome.Stanford.EDU (Benjamin Hitz) Date: Tue, 17 Apr 2007 15:03:36 -0700 Subject: Clarification of Redundancy checks in GAFs / godb In-Reply-To: <4625255D.1090706@caltech.edu> References: <4625255D.1090706@caltech.edu> Message-ID: On Apr 17, 2007, at 12:51 PM, ranjana wrote: > Hi Benjamin, > I feel obliged to speak up because you have chosen the WormBase > file :-) for this example. I know its hard from a programming > perspective to know that all the entities annotated relate to the > same gene, but biologically all of the examples you cite are > different annotations with different evidence codes, so in that way > they are *not* redundant. > Also since the GO allows you to annotate to the gene product -- > transcript or protein, thats what we do here at WormBase (since we > have valid identifiers for these in WormBase). Thats why for a > genetic experiment where information is derived from the mutant > we are annotating the 'gene' and for cellular component where we > are looking at where the protein is expressed we annotate to the > 'protein'. The *entities* annotated to are different though they > are related to the same gene. > > I think we've had this discussion in various flavors, but as you > point out the problem still remains--how can the database tell that > they all relate to the same gene. > Ranjana - There are two issues, and I probably shouldn't have mixed them. First, it's true that annotating different entities (genes, transcripts, proteins) is intended and allowed. The problem (as you mentioned) is that there is no standard for this or mapping between them. So a user sees 2 different entities when they would expect 1. 
The other issue is that wormbase (and also RGD) incorporates associations from Uniprot that refer to the EXACT same entity (protein for AIR2). Note that it's not the term-gene_product that is the same, just the gene product. In an ideal world we could magically catch all these "identities" and load a single gene product. We are not there yet, so I think it's on the mods to handle this mapping (since they would know it better than anyone) Ben > > Benjamin Hitz wrote: >> >> From the Jan. GOC meeting minutes. >> >> 23. Action item for Mike(?): Change documentation to make clear >> that annotation file column 1 gives the database from which the >> identifier in column 2 is drawn, NOT (as it currently says) the id >> of the submitting authority. >> 24. Action item for Daniel: Form working group to help groups that >> are having problems resolving redundancy issues. >> >> I ran across a case in the wormbase file where air-2 has both a wb >> identifier and a uniprot (association via intAct) identifier, even >> though they are clearly the same entity. >> >> ... many air-2 assocs from WB removed ... >> WB WBGene00000099 air-2 GO:0007109 >> WB:WBPaper00004303|PMID:10983970 IMP >> P let-603|stu-7|cyk-6|NM_059313|1G515 gene >> taxon:6239 20041019 WB >> WB WBGene00000099 air-2 GO:0004674 >> WB:WBPaper00004303|PMID:10983970 NAS >> F let-603|stu-7|cyk-6|NM_059313|1G515 gene >> taxon:6239 20041019 WB >> UniProt O01427 AIR2_CAEEL GO:0005515 PMID: >> 14704431 IPI UniProt:Q9GSQ0 F air-2, stu-7, B0207.4: >> Aurora/Ipl1-related protein kinase 2 protein >> taxon:6239 20070403 IntAct >> >> Actually, one is the "gene" and one is the "protein". >> >> If you do a search on air-2 in amigo, you actually find _3_ gene >> products, because wormbase also has 3 cellular component >> associations to the protein, AIR-2. >> >> http://amigo.geneontology.org/cgi-bin/amigo/go.cgi? 
>> action=query&view=query&query=air-2&search_constraint=gp >> >> I don't mean to pick on nematodes here, this is just a random >> example I was having trouble with. I assume there are similar >> cases (of identical gene/gene-products where 1 is from uniprot and >> 1 from source mod). I don't know if other GO sources "mix" >> DB_Object types (protein/gene) for things are are derived from the >> same gene. I could imagine cases where you want to distinguish >> the function of a transcript from protein, but I don't think any >> of the above qualify. >> >> There is no way for the database to tell that these are all >> actually the same thing (air-2, AIR-2, AIR2_CAEEL) - but I think >> they should be. >> >> Ben >> -- >> Ben Hitz >> Senior Scientific Programmer ** Saccharomyces Genome Database ** >> GO Consortium >> Stanford University ** hitz at genome.stanford.edu >> >> >> -- Ben Hitz Senior Scientific Programmer ** Saccharomyces Genome Database ** GO Consortium Stanford University ** hitz at genome.stanford.edu From dbarrell at ebi.ac.uk Wed Apr 18 05:16:51 2007 From: dbarrell at ebi.ac.uk (Daniel Barrell) Date: Wed, 18 Apr 2007 13:16:51 +0100 Subject: announcement of public MySQL mirror In-Reply-To: <319C7BFB-AE5A-4BA9-9971-62B06A7BEB19@fruitfly.org> References: <4624DB12.4070704@ebi.ac.uk> <462E7FC0-6B71-49FD-ADED-00A4AE9D4786@ebi.ac.uk> <319C7BFB-AE5A-4BA9-9971-62B06A7BEB19@fruitfly.org> Message-ID: <46260C33.3040105@ebi.ac.uk> Chris Mungall wrote: > > Sounds good - but perhaps we should wait till we have reorganised the > database docs before announcing to the general public (real soon now!) - > in the meantime we in the GOC could try hammering it for a while. > > Dan - is this running on the same server as any other database like > ensembl/mart? May be cool to do cross-queries across them... 
That would be a really nice idea but unfortunately the Ensembl DBs are now on two separate machines at the Sanger: ensembldb.ensembl.org martdb.ensembl.org I think arranging this might mean we have to jump through many political hoops but if I ever manage to get an account at Sanger I'd be better armed to investigate this. In the meantime you can always create 2 database connections in your code and let the program do the joining/analysis. Cheers Dan -- Daniel Barrell EMBL - The EBI Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SD Phone: +44 (0)1223 492551 Email: dbarrell at ebi.ac.uk From midori at ebi.ac.uk Wed Apr 18 07:10:45 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Wed, 18 Apr 2007 15:10:45 +0100 (BST) Subject: Alert: proposal to obsolete one cellular component term that affects existing annotation Message-ID: The proposal has been made to obsolete complement component C2 complex ; GO:0005603 At present there is a single annotation to this term: UniProt:P06681 CO2_HUMAN PMID:6199794 TAS This term is not used in any external2go mappings, and is not used in any GO slim set maintained within the OBO flat file. The reason for this proposal is that prior to cleavage, complement component C2 is a single polypeptide rather than a complex, and after cleavage the products do not remain physically associated; there is thus no known biological entity corresponding to "complement C2 complex". SourceForge link: https://sourceforge.net/tracker/index.php?func=detail&aid=1702912&group_id=36855&atid=440764 The two-week comment period ends on Wednesday, May 2, 2007. From camon at ebi.ac.uk Wed Apr 18 07:39:08 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Wed, 18 Apr 2007 15:39:08 +0100 Subject: Alert: proposal to obsolete one cellular component term that affects existing annotation References: Message-ID: <46262D8C.4040003@ebi.ac.uk> Hi That was an old Proteome Inc. annotation which I have now removed so ok to go ahead with an obsoletion from me.. 
Evelyn Midori Harris wrote: > The proposal has been made to obsolete > > complement component C2 complex ; GO:0005603 > > At present there is a single annotation to this term: > > UniProt:P06681 CO2_HUMAN PMID:6199794 TAS > > This term is not used in any external2go mappings, and is not used in > any GO slim set maintained within the OBO flat file. > > The reason for this proposal is that prior to cleavage, complement > component C2 is a single polypeptide rather than a complex, and after > cleavage the products do not remain physically associated; there is thus > no known biological entitiy corresponding to "complement C2 complex". > > SourceForge link: > https://sourceforge.net/tracker/index.php?func=detail&aid=1702912&group_id=36855&atid=440764 > > > The two-week comment period ends on Wednesday, May 2, 2007. -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From midori at ebi.ac.uk Wed Apr 18 07:39:38 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Wed, 18 Apr 2007 15:39:38 +0100 (BST) Subject: Alert: proposal to obsolete one cellular component term that affects existing annotation In-Reply-To: <46262D8C.4040003@ebi.ac.uk> References: <46262D8C.4040003@ebi.ac.uk> Message-ID: Thanks -- that was the only annotation, so unless anyone else squawks, out it goes. m On Wed, 18 Apr 2007, Evelyn Camon wrote: > Hi > > That was an old Proteome Inc. annotation which I have now removed so ok to go > ahead with an obsoletion from me.. > > Evelyn > > Midori Harris wrote: >> The proposal has been made to obsolete >> >> complement component C2 complex ; GO:0005603 >> >> At present there is a single annotation to this term: >> >> UniProt:P06681 CO2_HUMAN PMID:6199794 TAS >> >> This term is not used in any external2go mappings, and is not used in any >> GO slim set maintained within the OBO flat file. 
>> >> The reason for this proposal is that prior to cleavage, complement >> component C2 is a single polypeptide rather than a complex, and after >> cleavage the products do not remain physically associated; there is thus no >> known biological entitiy corresponding to "complement C2 complex". >> >> SourceForge link: >> https://sourceforge.net/tracker/index.php?func=detail&aid=1702912&group_id=36855&atid=440764 >> >> The two-week comment period ends on Wednesday, May 2, 2007. > > > From NLWashington at lbl.gov Tue Apr 3 14:07:34 2007 From: NLWashington at lbl.gov (Nicole Washington) Date: Tue, 3 Apr 2007 14:07:34 -0700 (PDT) Subject: Phenote Users Group - 1st meeting 4/10, 8:30am PDT Message-ID: <50710.131.243.195.76.1175634454.squirrel@webmail.inslomo.net> Hello Phenote User, Our first Phenote User's Meeting will be next Tuesday, 4/10, at 8:30am PDT. For this first meeting, we will conduct it over the phone, have some introductions, and discuss what it is we want to get out of this group. Please join us next Tuesday. Conference line phone numbers: US: 877-208-9784 UK: 001 206 315 8580 Code is 53029# Looking forward to our first event. Nicole Washington and Mark Gibson ************************************************** Nicole Washington, Ph.D. National Center for Biomedical Ontology Lawrence Berkeley National Labs Division of Life Sciences 1 Cyclotron Rd. Mail Stop 64-121 Berkeley, CA 94720 510.486.6836 (office) 510.486.6798 (fax) From cjm at fruitfly.org Wed Apr 18 17:45:12 2007 From: cjm at fruitfly.org (Chris Mungall) Date: Wed, 18 Apr 2007 17:45:12 -0700 Subject: How is the improvement of central nervous system terms in GO? 
In-Reply-To: References: <1174645686.624.90.camel@sophie.cribi.unipd.it> <4603B1A8.9050509@sanger.ac.uk> Message-ID: <839F8832-3490-4B0A-A366-8773EE472192@fruitfly.org> This is a good solution, but it only answers the "how have annotations improved" part, not the "how has the ontology improved" part (at least not directly - we may see more specific annotations due to more specific terms being added). I think these statistics and others like them will be useful when it comes time to produce reports for renewal and so on. There are two broad approaches that are complementary here: 1) create queryable mysql database instances from the archives 2) use file based methods. Approach 1 is a higher initial investment and will take db admin time, but is in general a better approach, especially for comparing all annotations, not just from one GA file. Approach 2 requires no db-admin time. This is ideal for, say, using GO::TermFinder (which is a very nice perl library as well as web tool) to compare term-enrichment p-values for gene sets for particular GA files at arbitrary points in the history of GO. However, it's generally less scalable to the entire annotation set, and harder to do general queries. To figure out if it's worth investing in 1, we have to decide: What time scales are we interested in here? i.e. - starting from when, and what time intervals - yearly? monthly? Also - more importantly - what are the kinds of questions we are trying to answer? A valid answer here is that we don't entirely know yet. - Which terms have the highest rate of increase/decrease in annotation in a time interval? [both directly and via transitivity] - Does refining a term (adding children) have an effect on annotation? - How does the information content of the nodes in the DAG vary over time? What branches are most/least prone to change? - What happens if we perform the same term enrichment analysis on "classic" gene sets varying both the ontology version and the annotation set version? 
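As a rough illustration of the first question (annotation counts "both directly and via transitivity"), here is a minimal file-based Python sketch. It is not go-perl or go-db-perl code. The parent map, the snapshot lists, and the gene names geneA/geneB/geneC are all made up for illustration, and the direct edge from GO:0021954 to GO:0007417 is a deliberate simplification of whatever path actually connects them in the ontology. In practice the parent map would be built from an archived ontology file and the (gene, term) pairs from an archived GA file, and which relationship types to follow is a choice you would have to make.

```python
from collections import defaultdict


def ancestors(term, parents, cache):
    """Return every term reachable from `term` by following parent links."""
    if term not in cache:
        found = set()
        for parent in parents.get(term, ()):
            found.add(parent)
            found |= ancestors(parent, parents, cache)
        cache[term] = found
    return cache[term]


def annotation_counts(parents, annotations):
    """Return {term: (direct gene count, gene count including descendants)}."""
    direct = defaultdict(set)
    inherited = defaultdict(set)
    cache = {}
    for gene, term in annotations:
        direct[term].add(gene)
        inherited[term].add(gene)
        # Propagate the gene to every ancestor of the annotated term.
        for ancestor in ancestors(term, parents, cache):
            inherited[ancestor].add(gene)
    return {term: (len(direct[term]), len(inherited[term])) for term in inherited}


# Toy ontology: child term -> list of parent terms (simplified, see note above).
PARENTS = {"GO:0021954": ["GO:0007417"]}

# Two hypothetical annotation snapshots, e.g. from different archive dates.
SNAPSHOT_OLD = [("geneA", "GO:0007417")]
SNAPSHOT_NEW = [
    ("geneA", "GO:0007417"),
    ("geneB", "GO:0021954"),
    ("geneC", "GO:0021954"),
]

old_counts = annotation_counts(PARENTS, SNAPSHOT_OLD)
new_counts = annotation_counts(PARENTS, SNAPSHOT_NEW)
```

Running the same function over each archived snapshot and comparing the resulting dictionaries gives a per-term change over time. In the toy data, the direct count for GO:0007417 stays at one gene while the count including descendants rises from one to three, which is the difference Erika was asking about.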
On Mar 23, 2007, at 7:06 AM, John C. Matese wrote: > > Hi Erica, > > While what Valerie suggests may work for you, GO Term Finder is > primarily designed to look for enrichment of GO annotations within > a list of gene products. For queries like yours, where your are > actually doing the converse (using a list of GO *terms* as the > query), you may want to try the generic GO Term Mapper tool > instead, found at: > > http://go.princeton.edu/cgi-bin/GOTermMapper > > Disclaimer: the downside is that this tool may not tell you what > the annotated genes are, but just reports the numbers (unless > perhaps if you upload a list of all genes, in basic input 1, > below). Perhaps that is all you need, though ("central nervous > system development ( GO:0007417 ) 197 out of 19479 annotated genes"). > > If so, try the following protocol: > Basic Inputs section > 1) leave empty (or upload the entire list of gene-product > identifiers) > 2) Process > 3) choose the appropriate organism/source (don't worry about the > terms, we'll respecify those in the advanced section) > > Advanced Options section > 1) Enter your targeted GOID(s): > for example: > GO:0007417 > GO:0021556 > GO:0021956 > GO:0021626 > etc. > 2) upload your versioned gene association file (2004, 2005, 2006, > etc.) > > > Let us know if that works well enough for you, > > John > > PS: I think AmiGO is a much better/faster tool for these questions, > but I suppose it doesn't answer these temporal/progress queries. > > > On Mar 23, 2007, at 6:53 AM, Valerie Wood wrote: > >> Hi Erica, >> >> I think you can do this using the generic GO Term Finder >> http://go.princeton.edu/cgi-bin/GOTermFinder >> >> There are options to >> "Upload a custom gene association file" >> so you could upload the annotation file for your genes of interest >> for each time point of interest. >> >> Setting p-value cutoff appears to give all associated terms >> (although you would need to check this). >> >> Would that work? 
>> >> Val >> >> Erika Feltrin wrote: >> >>> Dear all, >>> >>> I am trying to find a solution for a tricky problem, so if >>> anybody can >>> suggest me the best solution I will appreciate your help :-) >>> >>> I need to find the improvement in GO terms and GO annotation about >>> central nervous system biology. I would need also to know how >>> many genes >>> have been annotated to each GO term and all his children. >>> For GO terms, I decided to check the number of terms in the GO >>> file from >>> the 2001 to 2006 year by year and then make a statistics. >>> In 2001 we know that GO was splitted in three files and >>> unfortunately I >>> tried to load them in oboedit choosing GO flat file adapter >>> but it did not work. I have already put a BUG report on SF. >>> As an alternative, I decide to follow an other approach and use >>> the database and go-perl/go-db-perl so you can get the >>> annotation counts. >>> >>> The problem is that in the annotation files genes are annotated >>> only to >>> the last child term. For instance, we know that in the GOA files >>> genes >>> annotated to central nervous system neuron development; GO: >>> 0021954 are >>> not annotated also to the term central nervous system development >>> GO:0007417. >>> >>> So, my question is: do you know if already exist a script or >>> something >>> else that help me, starting from a GO term, to find all GO child >>> terms >>> and all genes annotated to each terms? Does anybody already do >>> something >>> similar? >>> >>> >>> Thanks a lot >>> Erika >>> >> >> >> -- >> --------------------------------------------------------------------- >> ------ >> Valerie Wood Tel: 01223 496909 >> S. 
pombe Genome Project Fax: 01223 494919 Wellcome >> Trust Sanger Institute email: val at sanger.ac.uk >> Wellcome Trust Genome Campus http://www.genedb.org/genedb/pombe >> Hinxton, Cambridge, CB10 1HH http://www.sanger.ac.uk/Projects/ >> S_pombe > > From erika at cribi.unipd.it Thu Apr 19 00:38:47 2007 From: erika at cribi.unipd.it (Erika Feltrin) Date: Thu, 19 Apr 2007 09:38:47 +0200 Subject: How is the improvement of central nervous system terms in GO? In-Reply-To: <839F8832-3490-4B0A-A366-8773EE472192@fruitfly.org> References: <1174645686.624.90.camel@sophie.cribi.unipd.it> <4603B1A8.9050509@sanger.ac.uk> <839F8832-3490-4B0A-A366-8773EE472192@fruitfly.org> Message-ID: <1176968329.10671.510.camel@sophie.cribi.unipd.it> Hi Chris, I have solved my problem about "how has the ontology improved". I downloaded the GO files from 2001 to 2007 and made some searches for nervous system specific terms. Then I repeated the same searches for each GO file and compared the results. I can send you the results if you are interested. Now I would like to find the genes associated with these terms in the different years. The only thing I can think of is to use the GOA annotation releases in the archive folder. Then, I tried to do what John Matese suggested but I had some problems using GO Term Finder. At the step "upload your versioned gene association file" (2004, 2005, 2006, etc.) I chose to upload the GOA annotation file, but it is bigger than 100Mb and so I cannot use it. > Also - more importantly what are the kinds of questions we are trying > to answer? A valid answer here is that we don't entirely know yet. Since my PhD project is funded also by a pharmaceutical company that studies neurodegenerative disorders, they are interested in the improvement of gene annotation to these terms. >There are two broad approaches that are complementary here: > > 1) create queryable mysql database instances from the archives > 2) use file based methods. 
> > 1 is a higher initial investment and will take db admin time, but is > in general a better approach, especially for comparing all > annotations, not just from one GA file. Your suggestion is very interesting. I understand the problem about db admin time, so it is important to find a solution that allows me to do this task myself instead of asking for your time, since I know you are all busy. Solution 2 could be fine but there is still the problem of file size. Cheers Erika 18/04/2007 alle 17.45 -0700, Chris Mungall ha scritto: > This is a good solution, but it only answers the "how have > annotations improved" part, not "how has the ontology improved" part > (at least not directly - we may see more specific annotations due to > more specific terms being added) > > I think these statistics and others like them will be useful when it > comes time to produce reports for renewal and so on. > > There are two broad approaches that are complementary here: > > 1) create queryable mysql database instances from the archives > 2) use file based methods. > > 1 is a higher initial investment and will take db admin time, but is > in general a better approach, especially for comparing all > annotations, not just from one GA file. > > 2 requires no db-admin time. this is ideal for say, using > GO::TermFinder (which is a v nice perl library as well as web tool) > to compare term-enrichment-p-values for gene sets for particular GA > files at arbitrary points in the history of GO. However, it's > generally less scalable to the entire annotation set, and harder to > do general queries. > > To figure out if it's worth investing in 1, we have to decide: > > What time scales are we interested in here? ie - starting from when, > and what time intervals - yearly? monthly? > > Also - more importantly what are the kinds of questions we are trying > to answer? A valid answer here is that we don't entirely know yet. 
> > - What terms has the highest rate of increase/decrease in annotation > in a time interval? [both directly and via transitivity] > > - does refining a term (adding children) have an effect on annotation? > > - how does the information content of the nodes in the DAG vary over > time? What branches are most/least prone to change? > > - what happens if we perform the same term enrichment analysis on > "classic" gene sets varying both the ontology version and the > annotation set version? > > On Mar 23, 2007, at 7:06 AM, John C. Matese wrote: > > > > > Hi Erica, > > > > While what Valerie suggests may work for you, GO Term Finder is > > primarily designed to look for enrichment of GO annotations within > > a list of gene products. For queries like yours, where your are > > actually doing the converse (using a list of GO *terms* as the > > query), you may want to try the generic GO Term Mapper tool > > instead, found at: > > > > http://go.princeton.edu/cgi-bin/GOTermMapper > > > > Disclaimer: the downside is that this tool may not tell you what > > the annotated genes are, but just reports the numbers (unless > > perhaps if you upload a list of all genes, in basic input 1, > > below). Perhaps that is all you need, though ("central nervous > > system development ( GO:0007417 ) 197 out of 19479 annotated genes"). > > > > If so, try the following protocol: > > Basic Inputs section > > 1) leave empty (or upload the entire list of gene-product > > identifiers) > > 2) Process > > 3) choose the appropriate organism/source (don't worry about the > > terms, we'll respecify those in the advanced section) > > > > Advanced Options section > > 1) Enter your targeted GOID(s): > > for example: > > GO:0007417 > > GO:0021556 > > GO:0021956 > > GO:0021626 > > etc. > > 2) upload your versioned gene association file (2004, 2005, 2006, > > etc.) 
> > > > > > Let us know if that works well enough for you, > > > > John > > > > PS: I think AmiGO is a much better/faster tool for these questions, > > but I suppose it doesn't answer these temporal/progress queries. > > > > > > On Mar 23, 2007, at 6:53 AM, Valerie Wood wrote: > > > >> Hi Erica, > >> > >> I think you can do this using the generic GO Term Finder > >> http://go.princeton.edu/cgi-bin/GOTermFinder > >> > >> There are options to > >> "Upload a custom gene association file" > >> so you could upload the annotation file for your genes of interest > >> for each time point of interest. > >> > >> Setting p-value cutoff appears to give all associated terms > >> (although you would need to check this). > >> > >> Would that work? > >> > >> Val > >> > >> Erika Feltrin wrote: > >> > >>> Dear all, > >>> > >>> I am trying to find a solution for a tricky problem, so if > >>> anybody can > >>> suggest me the best solution I will appreciate your help :-) > >>> > >>> I need to find the improvement in GO terms and GO annotation about > >>> central nervous system biology. I would need also to know how > >>> many genes > >>> have been annotated to each GO term and all his children. > >>> For GO terms, I decided to check the number of terms in the GO > >>> file from > >>> the 2001 to 2006 year by year and then make a statistics. > >>> In 2001 we know that GO was splitted in three files and > >>> unfortunately I > >>> tried to load them in oboedit choosing GO flat file adapter > >>> but it did not work. I have already put a BUG report on SF. > >>> As an alternative, I decide to follow an other approach and use > >>> the database and go-perl/go-db-perl so you can get the > >>> annotation counts. > >>> > >>> The problem is that in the annotation files genes are annotated > >>> only to > >>> the last child term. 
For instance, we know that in the GOA files > >>> genes > >>> annotated to central nervous system neuron development; GO: > >>> 0021954 are > >>> not annotated also to the term central nervous system development > >>> GO:0007417. > >>> > >>> So, my question is: do you know if already exist a script or > >>> something > >>> else that help me, starting from a GO term, to find all GO child > >>> terms > >>> and all genes annotated to each terms? Does anybody already do > >>> something > >>> similar? > >>> > >>> > >>> Thanks a lot > >>> Erika > >>> > >> > >> > >> -- > >> --------------------------------------------------------------------- > >> ------ > >> Valerie Wood Tel: 01223 496909 > >> S. pombe Genome Project Fax: 01223 494919 Wellcome > >> Trust Sanger Institute email: val at sanger.ac.uk > >> Wellcome Trust Genome Campus http://www.genedb.org/genedb/pombe > >> Hinxton, Cambridge, CB10 1HH http://www.sanger.ac.uk/Projects/ > >> S_pombe > > > > -- Erika Feltrin PhD student University of Padova Tel: +39 049 827 6165 Fax: +39 049 827 6159 La vita è quello che ti capita mentre stai facendo altri progetti (J. Lennon) From jane at ebi.ac.uk Thu Apr 19 03:23:53 2007 From: jane at ebi.ac.uk (Jane Lomax) Date: Thu, 19 Apr 2007 11:23:53 +0100 (BST) Subject: May newsletter Message-ID: Hi all - it's time to start putting together the May newsletter. There are already quite a few items on the wiki, could everyone review these please, to check that they're still relevant: http://gocwiki.geneontology.org/index.php/Future_newsletter_items I've put names next to a few of them for the contact people - just edit these if I've got it wrong. And of course add anything else you would like to see included. 
thanks, the newsletter WG Dr Jane Lomax GO Editorial Office EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridgeshire, UK CB10 1SD p: +44 1223 492516 f: +44 1223 494468 From rama at genome.Stanford.EDU Thu Apr 19 10:22:41 2007 From: rama at genome.Stanford.EDU (Rama Balakrishnan) Date: Thu, 19 Apr 2007 10:22:41 -0700 Subject: May newsletter In-Reply-To: References: Message-ID: <04B9C3E3-9FC5-406B-B496-CC82AD14D697@genome.stanford.edu> Jane, We did not mention moving AmiGO to the new servers in the last newsletter. I will add that item to the Wiki. * Changes in the with column: use of pipes (|) and commas (,) (evidence code committee) This item has to be discussed again at the next GOC meeting. Shelved for a future future newsletter. Thanks for the list. Rama On Apr 19, 2007, at 3:23 AM, Jane Lomax wrote: > Hi all - it's time to start putting together the May newsletter. > There are already quite a few items on the wiki, could everyone > review these please, to check that they're still relevant: > > http://gocwiki.geneontology.org/index.php/Future_newsletter_items > > I've put names next to a few of them for the contact people - just > edit these if I've got it wrong. And of course add anything else > you would like to see included. > > thanks, > > the newsletter WG > > > > Dr Jane Lomax > GO Editorial Office > EMBL-EBI > Wellcome Trust Genome Campus > Hinxton > Cambridgeshire, UK > CB10 1SD > > p: +44 1223 492516 > f: +44 1223 494468 From jane at ebi.ac.uk Thu Apr 19 11:20:42 2007 From: jane at ebi.ac.uk (Jane Lomax) Date: Thu, 19 Apr 2007 19:20:42 +0100 (BST) Subject: May newsletter In-Reply-To: <04B9C3E3-9FC5-406B-B496-CC82AD14D697@genome.stanford.edu> References: <04B9C3E3-9FC5-406B-B496-CC82AD14D697@genome.stanford.edu> Message-ID: Great - thanks Rama. 
I'll make a new section on the wiki for items postponed for future newsletters, jane On Thu, 19 Apr 2007, Rama Balakrishnan wrote: > Jane, > > We did not mention moving AmiGO to the new servers in the last newsletter. I > will add that item to the Wiki. > > * Changes in the with column: use of pipes (|) and commas (,) (evidence code > committee) > > This item has to be discussed again at the next GOC meeting. Shelved for a > future future newsletter. > > Thanks for the list. > > Rama > > On Apr 19, 2007, at 3:23 AM, Jane Lomax wrote: > >> Hi all - it's time to start putting together the May newsletter. There are >> already quite a few items on the wiki, could everyone review these please, >> to check that they're still relevant: >> >> http://gocwiki.geneontology.org/index.php/Future_newsletter_items >> >> I've put names next to a few of them for the contact people - just edit >> these if I've got it wrong. And of course add anything else you would like >> to see included. >> >> thanks, >> >> the newsletter WG >> >> >> >> Dr Jane Lomax >> GO Editorial Office >> EMBL-EBI >> Wellcome Trust Genome Campus >> Hinxton >> Cambridgeshire, UK >> CB10 1SD >> >> p: +44 1223 492516 >> f: +44 1223 494468 > Dr Jane Lomax GO Editorial Office EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridgeshire, UK CB10 1SD p: +44 1223 492516 f: +44 1223 494468 From NLWashington at lbl.gov Thu Apr 19 16:08:21 2007 From: NLWashington at lbl.gov (Nicole Washington) Date: Thu, 19 Apr 2007 16:08:21 -0700 (PDT) Subject: Phenote Working Group - Wiki & Minutes Message-ID: <54242.131.243.195.247.1177024101.squirrel@webmail.inslomo.net> Hi Phenote Users, Note: this message is being sent to you via a potentially interested mailing list. To avoid bombarding uninterested parties with these emails, this is the last time I will send this information to this list. In the future, I will only email the obo-phenote mailing list. 
If you are not yet subscribed to obo-phenote and wish to keep up-to-date with all things Phenote, please visit https://lists.sourceforge.net/lists/listinfo/obo-phenote We had our first working group meeting last week with 11 attendees! Please note that, as discussed at the meeting, future Phenote Working Group meetings will be held on the FIRST TUESDAY OF EVERY MONTH at 8AM (Pacific time). Therefore, our next meeting will be on Tuesday, MAY 1, 2007. Please join us! In an effort to keep everyone in the loop, I've set up a public Wiki where I'll be posting all relevant working group information, including agendas and minutes. http://www.bioontology.org/wiki/index.php/Phenote:Main_Page There is also a link to the working group wiki from http://www.phenote.org. Please use this as a forum for your ideas as well. Nicole ************************************************** Nicole Washington, Ph.D. National Center for Biomedical Ontology Lawrence Berkeley National Labs Division of Life Sciences 1 Cyclotron Rd. Mail Stop 64-121 Berkeley, CA 94720 510.486.6836 (office) 510.486.6798 (fax) From midori at ebi.ac.uk Tue Apr 24 02:00:15 2007 From: midori at ebi.ac.uk (Midori Harris) Date: Tue, 24 Apr 2007 10:00:15 +0100 (BST) Subject: new OBO-Edit version in use Message-ID: Dear GOers, This is particularly important for those of you who edit GO and commit revisions of the gene_ontology_edit.obo file: We are now using the recently released OBO-Edit 1.100 for all edits. Please use this version from now on for any changes you make to GO. If you don't already have OBO-Edit 1.100, you can download it from the usual place: https://sourceforge.net/project/showfiles.php?group_id=36855 If you have any questions or problems, write to the OBO-Edit working group mailing list: geneontology-oboedit-working-group at lists.sourceforge.net Thanks, and enjoy the new OBO-Edit! 
Midori on behalf of the OBO-Edit working group ---------- Forwarded message ---------- Date: Thu, 19 Apr 2007 22:06:49 -0700 From: Mike Cherry OBO-EDIT v1.1 is now in use to create the old flat files, OBO v1.0 and for the GO-DIFF. -Mike From jane at ebi.ac.uk Wed Apr 25 04:42:44 2007 From: jane at ebi.ac.uk (Jane Lomax) Date: Wed, 25 Apr 2007 12:42:44 +0100 Subject: enzyme question Message-ID: <072400EF-475C-4231-A2DE-E14DA4CC02E0@ebi.ac.uk> Hi - can anyone with an interest in bacterial enzymes, or just how we represent enzymes in general take a look at SF please? https://sourceforge.net/tracker/index.php?func=detail&aid=1707310&group_id=36855&atid=440764 thanks, Jane From jclark at ebi.ac.uk Wed Apr 25 06:57:13 2007 From: jclark at ebi.ac.uk (J Clark) Date: Wed, 25 Apr 2007 14:57:13 +0100 Subject: Alert: Proposal to obsolete sodium:hydrogen antiporter regulator activity: term that impacts existing annotation Message-ID: <462F5E39.7060209@ebi.ac.uk> The proposal has been made to obsolete GO:0017155: sodium:hydrogen antiporter regulator activity. There exist today annotations to this term as follows (data from AmiGO): * UniProt: 1 object * FB: 1 object The reason for this proposal is that the term represents a process but is in the function ontology. On obsoletion a direct replacement will be made in the process ontology. UNLESS OBJECTIONS ARE RECEIVED BY 9th May WE WILL ASSUME THAT YOU AGREE TO THIS CHANGE. The change will initially be made in the draft transport file, and will be committed to the live GO repository at a later date. 
Thanks, Jennifer -- Gene Ontology Consortium EMBL-European Bioinformatics Institute From sart2 at gen.cam.ac.uk Wed Apr 25 07:13:46 2007 From: sart2 at gen.cam.ac.uk (Susan Tweedie) Date: Wed, 25 Apr 2007 15:13:46 +0100 Subject: Alert: Proposal to obsolete sodium:hydrogen antiporter regulator activity: term that impacts existing annotation In-Reply-To: <462F5E39.7060209@ebi.ac.uk> References: <462F5E39.7060209@ebi.ac.uk> Message-ID: <1177510426.11195.26.camel@paul.gen.cam.ac.uk> I've no objection. Susan On Wed, 2007-04-25 at 14:57 +0100, J Clark wrote: > The proposal has been made to obsolete GO:0017155: sodium:hydrogen > antiporter regulator activity. > > There exist today annotations to this term as follows (data from AmiGO): > > * UniProt: 1 object > * FB: 1 object > > The reason for this proposal is that the term represents a process but > is in the function ontology. On obsoletion a direct replacement will be > made in the process ontology. > > UNLESS OBJECTIONS ARE RECEIVED BY 9th May WE WILL ASSUME THAT YOU AGREE > TO THIS CHANGE. > The change will initially be made in the draft transport file, and will > be committed to the live GO repository at a later date. > > Thanks, > > Jennifer > -- --------------------------------- Susan Tweedie FlyBase GO curator Department of Genetics University of Cambridge Downing Street Cambridge CB2 3EH UK email: s.tweedie at gen.cam.ac.uk phone: +44 [0]1223 333963 fax:+ 44 [0]1223 333992 From camon at ebi.ac.uk Fri Apr 27 03:40:14 2007 From: camon at ebi.ac.uk (Evelyn Camon) Date: Fri, 27 Apr 2007 11:40:14 +0100 Subject: GOA - change of coordinator Message-ID: <4631D30E.80506@ebi.ac.uk> Hi, I have reached my 9 years at EBI and so sadly have to leave. I will still be working for GOA from a new location in Northern Ireland for another 6 months and so can still be contacted on my EBI e-mail address. I will forward my new address and phone number to those that are interested at a later date. From Monday Dr. 
Emily Dimmer will be GOA Coordinator and should be contacted on the main GOA issues. Good luck Emily :-)) Thanks for everything everybody, I've enjoyed every minute. Evelyn -- Evelyn Camon GOA Curator Senior Scientific Curator European Bioinformatics Institute E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From jblake at informatics.jax.org Fri Apr 27 05:29:55 2007 From: jblake at informatics.jax.org (Judith Blake) Date: Fri, 27 Apr 2007 08:29:55 -0400 Subject: GOA - change of coordinator In-Reply-To: <4631D30E.80506@ebi.ac.uk> References: <4631D30E.80506@ebi.ac.uk> Message-ID: <4631ECC3.3090207@informatics.jax.org> Evelyn, You have been a great coordinator and leader for the GOA project and you will be missed. I want to thank you for all the passion and work you put into developing GOA as a comprehensive functional annotation stream for UniProt, and for your collaborations with the mouse and rat curation teams in developing a co-curation effort for experimentally-based GO annotations. I hope we can continue to work with you on these or related efforts in the future. I wish you and your family all the best in your new adventures. Judy Evelyn Camon wrote: > Hi, > > I have reached my 9 years at EBI and so sadly have to leave. > I will still be working for GOA from a new location in Northern > Ireland for another 6 months and so can still be contacted on my EBI > e-mail address. I will forward my new address and phone number to > those that are interested at a later date. From Monday Dr. Emily > Dimmer will be GOA Coordinator and should be contacted on the main GOA > issues. Good luck Emily :-)) > > Thanks for everything everybody, I've enjoyed every minute. > > Evelyn >