From midori at ebi.ac.uk Tue Mar 4 22:00:08 2008
From: midori at ebi.ac.uk (midori at ebi.ac.uk)
Date: Wed, 5 Mar 2008 06:00:08 UT
Subject: [annotation] SourceForge Annotation Tracker Update
Message-ID: <200803050600.m2560861075453@mozart.ebi.ac.uk>

An HTML attachment was scrubbed...
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080305/2d9471fd/attachment.html
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080305/2d9471fd/attachment.pl

From cherry at stanford.edu Mon Mar 17 09:57:09 2008
From: cherry at stanford.edu (Mike Cherry)
Date: Mon, 17 Mar 2008 09:57:09 -0700
Subject: First Curator Call
Message-ID: <9AB0B288-312D-434C-B360-1C9AB83911F8@stanford.edu>

In light of spring/Easter vacations I'd like to hold our first teleconference this Thursday (March 20), noon-1PM Eastern (9-10AM Pacific and 4-5PM UK). I know not everyone can make that time. It seems that Tuesday from noon-1PM might be a generally good time for the future. Anyway, this first call will be more organizational in nature than future calls.

Please call in to the following numbers with the listed access code.

US: 1-866-365-4406 access number: 7237541
UK: 08004960580 access number: 7237541

I look forward to our discussions.

-Mike

From cherry at stanford.edu Fri Mar 14 08:00:04 2008
From: cherry at stanford.edu (Mike Cherry)
Date: Fri, 14 Mar 2008 08:00:04 -0700
Subject: Calling all Curators
Message-ID: <29DEEF3E-C3B9-4488-A68A-73DAB1287FF8@stanford.edu>

This is to all scientific curators at all model organism databases (MODs). I would like to begin a monthly teleconference, which could also include a video conference later this year. There are several forms of interaction between individual curators and between MODs already; however, a regular, frequent forum is needed to continue making progress.
In particular, we need to continually work to reduce variability between the annotations created by different MODs, and by different curators. This is NOT just for the GOC and those who are normally involved in GOC conferences. This is for all MOD curators. The MODs are becoming closer in their procedures as a result of the discussions occurring as part of the GOC. However, there is a lot more that can be done to unify all curators at all the MODs. Thus this call will not be a GOC call on picking the best GO term -- that may happen a little, but the intent is to include broader curation activities. While this meeting will be started by the Gene Ontology Consortium, I want this to grow to include all scientific curators who create manual annotations. This conference call is not about making software that does annotation, but would include discussions of software to assist curators. This is not a conference discussion of software practices; there are enough venues for these discussions.

The video part will be a future addition, as I feel that these meetings will be more productive if we can see each other: we are discussing the human activity of curation, and a discussion works better face to face. I am very interested in having a video component as I believe this will help the meetings provide an even better connection. We will be able to use WebEx to share slides and web screens. This monthly conference will start as a conference call and with time we can work out the technical details for the video and/or Skype. As you'll see below, we could have more than 50 people on these calls.

The biggest issue is scheduling. I would like to propose that two regular times be scheduled, and that we switch monthly between them. It's obvious that it is difficult to get California and the UK on a call at the same time during normal business hours. Requiring folks, at least initially, to call in from home is a big negative. Thus I have created a Doodle poll.
Please ignore the dates; rather, use this as just days of the week. All times in the poll are stated in Eastern time. The result of this poll will be to identify two different times that would work for the most people. This assumes that most will only connect once every two months.

http://www.doodle.ch/eambfifz3ip5v2sv

While a core number of the participants will be from the GOC, I want the group to include a diverse range of curators. Other topics would include sequence annotation (both DNA and protein), phenotype annotation, and of course general topics appropriate for any curation group, for example how you prioritize the literature and how we can work with journals in a more productive manner. The intention is for ALL MOD curators to be involved, not just the one or two people who currently represent a group at the GOC. I feel it will be a failure if this is just for the usual crowd. If this can get a diverse group interacting then perhaps we will need to have additional regular calls.

The goal is to foster interactions among the community of curators that exists around the world. GO already has mechanisms for communication that work, but could always be better. There are many others who are not connected, either because they don't belong to the GO club (some people feel this way) or because they personally don't connect with GO since their project already has others participating in the GOC.

We already have a start on a broader group. The Annotation email list has 106 addresses, of which about 45 are addresses of people who have attended a GOC meeting. That leaves 61 addresses that are either not connected to GOC or belong to other curators at GOC projects who have not attended one of the Consortium meetings. After each teleconf there will be a brief summary sent out (not sure which lists to include yet) and that email may get even more people interacting and interested in the next call.
Because scientific curators are professional staff members, we need to raise the requirement for interactions, with the goals of being more efficient, having less variability between the annotations, and having more understanding of what others do. Interacting with other groups to promote an exchange of information will result in more effective operations and more standardized results. This goes beyond GOC and the MODs, as the curation profession needs to work together and decrease the amount of isolation that I believe exists. I certainly do not believe that this is done on purpose, but rather that the interaction has not been made simple. For sure there are lots of interactions already, but I believe there could be many more. The GOC is a great example of how things change when people talk to each other.

The hope is that curators from projects funded by NIH, NSF, USDA, DOE, BBSRC, EBI, ... and industry will come together at least every two months. Curators already, by definition, have a lot of interaction with the communities they serve. We need to add a reward system where they are also judged by the amount of interaction with other curators. Some already do a lot of this; many don't do as much as they should -- we can help this happen. I'm thinking of contacting the PIs of the MODs and strongly encouraging their staff to participate.

Please fill out the Doodle poll (URL above).

I propose that the annotation at geneontology.org list be used for this new interaction. That list already has many curators subscribed, plus it is not a very heavily used list at the moment.

After a week or two I'll get back to you with the time of our first conference.

-Mike
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080314/6a9acde4/attachment.html

From cherry at stanford.edu Thu Mar 20 10:57:09 2008
From: cherry at stanford.edu (Mike Cherry)
Date: Thu, 20 Mar 2008 10:57:09 -0700
Subject: [Annotation] recording of first Curators call
Message-ID:

More info on what was discussed will be sent out next week.

http://www.geneontology.org/meeting/curators/Curators-20080320.mp3

Cheers,
Mike

From tberardi at acoma.stanford.edu Thu Mar 20 20:34:16 2008
From: tberardi at acoma.stanford.edu (Tanya Berardini)
Date: Thu, 20 Mar 2008 20:34:16 -0700
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com>
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com>
Message-ID: <8e22ab960803202034u3ce76359l724dd312fd697552@mail.gmail.com>

For the TAIR/Plant Physiology collaboration, TAIR curators will do the mapping from the terms used by the authors to controlled vocabulary terms. Plant Phys does not intend to publish this data as part of the article in the journal, nor do we at TAIR plan to put this data up without passing it through a curator. I would think that any data for other organisms collected in a similar fashion would pass through a similar curation process by the relevant model organism/biological database.

Tanya

On 3/20/08, Alan Ruttenberg wrote:
>
> If there is no control on the vocabulary, is it intended that someone
> on the journal staff curate these, or are you suggesting that they be
> published, as is, and then picked up by external curators.
> -Alan
>
>
> On Mar 20, 2008, at 4:41 PM, Judith Blake wrote:
>
> > Hi all,
> >
> > Here's a mock-up of what we are starting to present to some
> > journals. I would appreciate any feedback. These are just some
> > screen shots with no commentary, but you can see where this is
> > going I think.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080320/9f00ec56/attachment-0001.html

From jblake at informatics.jax.org Fri Mar 21 04:08:01 2008
From: jblake at informatics.jax.org (Judith Blake)
Date: Fri, 21 Mar 2008 07:08:01 -0400
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com>
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com>
Message-ID: <47E39711.8000308@informatics.jax.org>

Alan,

We do not expect authors to understand or be able to easily pick the right term from a controlled vocabulary such as the GO. Their colloquial terminology will, we think, be pretty easily interpreted by trained curators who are biologists skilled in the use of bio-ontologies.

The primary and highly useful aspect of this approach is the indexing of the paper by taxon ID and geneID. This is a royal hassle for curators to have to do and something the authors should have at hand.

Judy

Alan Ruttenberg wrote:
> If there is no control on the vocabulary, is it intended that someone
> on the journal staff curate these, or are you suggesting that they be
> published, as is, and then picked up by external curators.
> -Alan
>
> On Mar 20, 2008, at 4:41 PM, Judith Blake wrote:
>
>> Hi all,
>>
>> Here's a mock-up of what we are starting to present to some
>> journals. I would appreciate any feedback. These are just some
>> screen shots with no commentary, but you can see where this is going
>> I think. It would be melded into the publication process somewhere.
>> One thing to notice is that there is no requirement for choosing any
>> particular term for the annotation. It just asks for what the
>> author would propose. The idea is that it would be up to the curator
>> to use this as appropriate in the curation of the paper particularly
>> as a clue to what the author is thinking.
>>
>> One thing of course is that as 'author submission' forms, there would
>> be no pubmed ID yet. Also, we may swap the GO taxon table for a MOD
>> taxon table...really meant to reflect the most common taxonIDs for
>> the work published in that journal.
>>
>> notice that if you want to add data for more than one gene, you click
>> at the bottom and a new annotation table opens from the same form.
>> You can choose for a different taxon here if you want to.
>>
>> I would welcome any comments.
>>
>> Judy
>> _______________________________________________
>> Biocurator mailing list
>> Biocurator at tairgroup.org
>> http://mailman.tairgroup.org/mailman/listinfo/biocurator
>

From midori at ebi.ac.uk Fri Mar 21 11:00:08 2008
From: midori at ebi.ac.uk (midori at ebi.ac.uk)
Date: Fri, 21 Mar 2008 18:00:08 UT
Subject: [Annotation] SourceForge Annotation Tracker Update
Message-ID: <200803211800.m2LI08l1287633@mozart.ebi.ac.uk>

An HTML attachment was scrubbed...
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080321/21264e20/attachment.html
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080321/21264e20/attachment.pl

From simont at hmgc.mcw.edu Fri Mar 21 11:53:04 2008
From: simont at hmgc.mcw.edu (Twigger, Simon (MCW))
Date: Fri, 21 Mar 2008 13:53:04 -0500
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <47E39711.8000308@informatics.jax.org>
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com> <47E39711.8000308@informatics.jax.org>
Message-ID:

Hi Judy,

For the list of functions/process, etc. and the method description, do these go in pairs? In other words I might put 'kinase' in box 1 and then 'enzyme assay' in the Method box 1 to indicate that I thought this gene was a kinase because I did an enzyme assay?
If this is the case, it might be worth making it more explicit that the two go hand in hand; otherwise people might enter a whole list of things in the function/process boxes and nothing in the methods, or the function keyword might not match up with the right method, etc.

Again, assuming the boxes go in pairs, you could rearrange these boxes as sentences to help people enter data, and perhaps allow them to specify whether they are entering a function/process/location/interacting partner using a drop-down menu. This would make subsequent parsing easier and might help the user enter more appropriate information.

For example it might look something like this (the drop-down menu options are separated by a pipe '|'):

"A [function this gene has | process this gene is involved in | location this gene is found in | protein this gene interacts with] is [insert function/process/partner here] and the method I used in my paper to determine this is [Insert method here]"

You could create more rows as desired if people wanted to add more facts. It's a bit more wordy than the two boxes you have now, but it might help the researchers know what was expected and enter relevant information.

Is location meant to be cellular location, or can it be any location (subcellular, organ, tissue, cell type, etc.)? I would imagine location means different things to different scientists, so we'd need to be more precise as to what was expected or anticipate a wide variety of input.

Simon

--

Simon N. Twigger, Ph.D.
Assistant Professor, Department of Physiology
Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, USA
tel: 414-456-8802
fax: 414-456-6516
AIM/iChat: simontatmcw

On Mar 21, 2008, at 6:08 AM, Judith Blake wrote:

>
> Alan,
>
> We do not expect authors to understand or be able to easily pick the
> right term from a controlled vocabulary such as the GO.
From jblake at informatics.jax.org Fri Mar 21 17:25:59 2008
From: jblake at informatics.jax.org (Judith Blake)
Date: Fri, 21 Mar 2008 20:25:59 -0400
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To:
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com> <47E39711.8000308@informatics.jax.org>
Message-ID: <47E45217.5020507@informatics.jax.org>

Simon,

This is a good suggestion. I too was wondering how to make it more explicit. I will incorporate somehow.

Judy

Twigger, Simon (MCW) wrote:
> Hi Judy,
>
> For the list of functions/process, etc. and the method description, do
> these go in pairs? In other words I might put 'kinase' in box 1 and
> then 'enzyme assay' in the Method box 1 to indicate that I thought
> this gene was a kinase because I did an enzyme assay? If this is the
> case, it might be worth making it more explicit that the two go hand
> in hand, otherwise people might enter a whole list of things in the
> function/process boxes and nothing in the methods, or the function
> keyword might not match up with the right method, etc.
From midori at ebi.ac.uk Sat Mar 22 11:00:07 2008
From: midori at ebi.ac.uk (midori at ebi.ac.uk)
Date: Sat, 22 Mar 2008 18:00:07 UT
Subject: [Annotation] SourceForge Annotation Tracker Update
Message-ID: <200803221800.m2MI07A1082673@mozart.ebi.ac.uk>

An HTML attachment was scrubbed...
URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080322/aa01c4dd/attachment.html
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080322/aa01c4dd/attachment.pl

From jblake at informatics.jax.org Thu Mar 20 13:41:14 2008
From: jblake at informatics.jax.org (Judith Blake)
Date: Thu, 20 Mar 2008 16:41:14 -0400
Subject: [Annotation] possible data submission at publication model
In-Reply-To:
References:
Message-ID: <47E2CBEA.408@informatics.jax.org>

Hi all,

Here's a mock-up of what we are starting to present to some journals. I would appreciate any feedback. These are just some screen shots with no commentary, but you can see where this is going, I think. It would be melded into the publication process somewhere. One thing to notice is that there is no requirement for choosing any particular term for the annotation. It just asks for what the author would propose. The idea is that it would be up to the curator to use this as appropriate in the curation of the paper, particularly as a clue to what the author is thinking.

One thing of course is that as 'author submission' forms, there would be no pubmed ID yet.
Also, we may swap the GO taxon table for a MOD taxon table...really meant to reflect the most common taxonIDs for the work published in that journal.

Notice that if you want to add data for more than one gene, you click at the bottom and a new annotation table opens from the same form. You can choose a different taxon here if you want to.

I would welcome any comments.

Judy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: JournalMetadataSubmission.ppt
Type: application/vnd.ms-powerpoint
Size: 1432064 bytes
Desc: not available
Url : http://fafner.stanford.edu/pipermail/annotation/attachments/20080320/d4b93d8e/attachment-0001.ppt

From alanruttenberg at gmail.com Thu Mar 20 18:08:01 2008
From: alanruttenberg at gmail.com (Alan Ruttenberg)
Date: Thu, 20 Mar 2008 21:08:01 -0400
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <47E2CBEA.408@informatics.jax.org>
References: <47E2CBEA.408@informatics.jax.org>
Message-ID: <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com>

If there is no control on the vocabulary, is it intended that someone on the journal staff curate these, or are you suggesting that they be published, as is, and then picked up by external curators?

-Alan

On Mar 20, 2008, at 4:41 PM, Judith Blake wrote:

> Hi all,
>
> Here's a mock-up of what we are starting to present to some
> journals. I would appreciate any feedback. These are just some
> screen shots with no commentary, but you can see where this is
> going I think. It would be melded into the publication process
> somewhere. One thing to notice is that there is no requirement for
> choosing any particular term for the annotation. It just asks for
> what the author would propose. The idea is that it would be up to
> the curator to use this as appropriate in the curation of the paper
> particularly as a clue to what the author is thinking.
>
> One thing of course is that as 'author submission' forms, there
> would be no pubmed ID yet. Also, we may swap the GO taxon table
> for a MOD taxon table...really meant to reflect the most common
> taxonIDs for the work published in that journal.
>
> notice that if you want to add data for more than one gene, you
> click at the bottom and a new annotation table opens from the same
> form. You can choose for a different taxon here if you want to.
>
> I would welcome any comments.
>
> Judy

From alanruttenberg at gmail.com Fri Mar 21 05:03:15 2008
From: alanruttenberg at gmail.com (Alan Ruttenberg)
Date: Fri, 21 Mar 2008 08:03:15 -0400
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <47E39711.8000308@informatics.jax.org>
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com> <47E39711.8000308@informatics.jax.org>
Message-ID:

Thanks Tanya and Judy for clarifying. I think this approach makes very practical sense: minimal hassle for the author, and trained people doing the curation. It would be a great deal for the publishers, I think. Now that I understand the approach, I'll think some more about possibilities.

One thing that comes to mind is that there may be different curator groups that could curate different aspects of the paper, working from the same source. For instance, OBI is interested in protocols and reagents, Science Commons is interested in materials that might be made available to other researchers, and there was mention of PDB in an earlier response. So you can imagine multiple curation groups feeding off the same information.

BTW, although the Pubmed id may not exist, the DOI may have been allocated already and this would serve as an adequate identifier and can be later mapped to the Pubmed id.
It would be interesting to know whether it is available from the publishers at the stage when the form filling would be done. If so, perhaps they could supply the form to authors with that already filled out.

Another thought is that you might be able to use Phenote for the data collection, as it is quite configurable and will give term completion for any sections where it is reasonable to offer a controlled vocabulary.

-Alan

On Mar 21, 2008, at 7:08 AM, Judith Blake wrote:

>
> Alan,
>
> We do not expect authors to understand or be able to easily pick
> the right term from a controlled vocabulary such as the GO. Their
> colloquial terminology will, we think, be pretty easily interpreted
> by trained curators who are biologists skilled in the use of
> bio-ontologies.
>
> The primary and highly useful aspect of this approach is the
> indexing of the paper by taxon ID and geneID. This is a royal
> hassle for curators to have to do and something the authors should
> have at hand.
> Judy
>
>
> Alan Ruttenberg wrote:
>> If there is no control on the vocabulary, is it intended that
>> someone on the journal staff curate these, or are you suggesting
>> that they be published, as is, and then picked up by external
>> curators.
>> -Alan
>>
>> On Mar 20, 2008, at 4:41 PM, Judith Blake wrote:
>>
>>> Hi all,
>>>
>>> Here's a mock-up of what we are starting to present to some
>>> journals. I would appreciate any feedback. These are just some
>>> screen shots with no commentary, but you can see where this is
>>> going I think. It would be melded into the publication process
>>> somewhere. One thing to notice is that there is no requirement
>>> for choosing any particular term for the annotation. It just
>>> asks for what the author would propose. The idea is that it
>>> would be up to the curator to use this as appropriate in the
>>> curation of the paper particularly as a clue to what the author
>>> is thinking.
>>>
>>> One thing of course is that as 'author submission' forms, there
>>> would be no pubmed ID yet. Also, we may swap the GO taxon table
>>> for a MOD taxon table...really meant to reflect the most common
>>> taxonIDs for the work published in that journal.
>>>
>>> notice that if you want to add data for more than one gene, you
>>> click at the bottom and a new annotation table opens from the
>>> same form. You can choose for a different taxon here if you want
>>> to.
>>>
>>> I would welcome any comments.
>>>
>>> Judy

From lukasz at mbi.ucla.edu Sat Mar 22 09:43:10 2008
From: lukasz at mbi.ucla.edu (Lukasz Salwinski)
Date: Sat, 22 Mar 2008 09:43:10 -0700
Subject: [Annotation] [Biocurator] possible data submission at publication model
In-Reply-To: <47E39711.8000308@informatics.jax.org>
References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com> <47E39711.8000308@informatics.jax.org>
Message-ID: <47E5371E.5020509@mbi.ucla.edu>

Judith Blake wrote:
> Alan,
>
> We do not expect authors to understand or be able to easily pick the
> right term from a controlled vocabulary such as the GO. Their
> colloquial terminology will, we think, be pretty easily interpreted by
> trained curators who are biologists skilled in the use of bio-ontologies.
>
> The primary and highly useful aspect of this approach is the indexing of
> the paper by taxon ID and geneID. This is a royal hassle for curators
> to have to do and something the authors should have at hand.
>
> Judy

Exactly. Taxon and gene/protein/etc. id would be the most useful outcome. I would think about all the annotation more as an incentive for the authors, so that they feel like they provide something useful...
I wouldn't mind sifting through mostly useless annotation terms as long as I'd have a guarantee each and every paper comes with a complete and true list of what exactly proteins/genes the paper is about - going through third level of 'as described in...' that ends with 'The authors would like to thank ... for providing ... plasmid' is no fun :o[ lukasz -- ------------------------------------------------------------------------- Lukasz Salwinski PHONE: 310-825-1402 UCLA-DOE Institute for Genomics & Proteomics FAX: 310-206-3914 UCLA, Los Angeles EMAIL: lukasz at mbi.ucla.edu ------------------------------------------------------------------------- From midori at ebi.ac.uk Tue Mar 25 11:00:07 2008 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Tue, 25 Mar 2008 18:00:07 UT Subject: [Annotation] SourceForge Annotation Tracker Update Message-ID: <200803251800.m2PI07O1349401@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080325/b57a09a6/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080325/b57a09a6/attachment.pl From rama at genome.stanford.edu Wed Mar 26 10:12:56 2008 From: rama at genome.stanford.edu (Rama Balakrishnan) Date: Wed, 26 Mar 2008 10:12:56 -0700 Subject: [Annotation] posting about TAIRs data submission form In-Reply-To: <47E5371E.5020509@mbi.ucla.edu> References: <47E2CBEA.408@informatics.jax.org> <3C5D2432-36D1-46B4-9451-206B38C4DC6E@gmail.com> <47E39711.8000308@informatics.jax.org> <47E5371E.5020509@mbi.ucla.edu> Message-ID: http://www.openhelix.com/blog/?p=237 rama From rama at genome.stanford.edu Wed Mar 26 13:30:01 2008 From: rama at genome.stanford.edu (Rama Balakrishnan) Date: Wed, 26 Mar 2008 13:30:01 -0700 Subject: [Annotation] evidence code advice Message-ID: <7EA8D90D-C57F-4F76-A060-3D28A470865D@genome.stanford.edu> Hi All, SGD has come across a couple of computationally predicted GO annotation data sets for S. cerevisiae that we would like to add to our database. The GO annotations from these data sets are predictions based on multiple high-throughput data sets. The RCA evidence code came to mind, but according to the documentation, the annotations all have to be manually reviewed by a curator to use this evidence code. There are several hundred annotations of this kind and it is not feasible for us to manually review them. Hence, we thought these annotations could be bulk loaded with the IEA evidence code. However, in the Jan 2007 (Cambridge) GO meeting, it was decided that the 'with' column information has to be filled in for all IEAs (else Mike's filtering script strips them out). But these GO annotations, being predictions based on multiple high-throughput data sets, don't have any information for the 'with' column. So, we are left with no choice. Which evidence code do people think should be used for these kinds of computational datasets when there is not an obvious "with"? Thanks for your input.
Rama +-----o--o --------------------------------------------------------------- o-o Rama Balakrishnan Ph.D O Senior Scientific Curator o-o Saccharomyces Genome Database o---o Stanford University o----o Stanford, CA 94305-5120 O-----O Ph: 650.725.8956 Fax: 650.723.7016 0--o email: rama at genome.stanford.edu O Website: http://www.yeastgenome.org o-o SGD Wiki- http://wiki.yeastgenome.org +- o---o ----------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080326/bddcbef9/attachment-0001.html From kara at genomics.princeton.edu Wed Mar 26 13:59:11 2008 From: kara at genomics.princeton.edu (Kara Dolinski) Date: Wed, 26 Mar 2008 16:59:11 -0400 Subject: [Annotation] evidence code advice In-Reply-To: <7EA8D90D-C57F-4F76-A060-3D28A470865D@genome.stanford.edu> References: <7EA8D90D-C57F-4F76-A060-3D28A470865D@genome.stanford.edu> Message-ID: <26C47A9C-74CD-4033-BE4E-086D6015713D@genomics.princeton.edu> Hi, The root of the problem, as I see it, is that we are mixing apples and oranges with evidence codes. All but one of the evidence codes indicate the type of experimental evidence for a GO annotation, but we have one oddball, IEA, that indicates not what the experiment is, but rather how the annotation was done. We keep running into variations of the same problem: we have some evidence (whether experimental or computational) for a GO annotation, but also want to indicate whether a curator looked at it or not. My proposed (albeit radical) solution: Remove IEA as an evidence code. Create a new property for GO annotations (or add a new type of qualifier) that captures how the annotation was done: manual or automated. Everything that is currently IEA would be given the 'automated' property/qualifier, and then would be given a new evidence code as appropriate (mostly a flavor of ISS I would assume). 
There can be a rule that all 'automated' annotations that are a flavor of ISS must have a 'with' value. This would allow us to use 'RCA' as appropriate, in some cases they'd be 'manual', in others, they'd be 'automated'. In Rama's case, the annotations would be 'RCA' with an 'automated' qualifier. I realize the issues involved in making such a drastic change, so I understand if we don't go there, but I do think that some approach such as the one above is the best representation of the information that we are trying to capture. Cheers, Kara On Mar 26, 2008, at 4:30 PM, Rama Balakrishnan wrote: > > Hi All, > > SGD has come across couple of computationally predicted GO > annotation data sets for S. cerevisiae that we would like to add to > our database. The GO annotations from these data sets are > predictions based on multiple high-throughput data sets. RCA > evidence code came to our minds but according to the documentation, > the annotations all have to be manually reviewed by a curator to > use this evidence. There are several 100 annotations of this kind > and it is not feasible for us to manually review these annotations. > > Hence, we thought these annotations can be bulk loaded with IEA > evidence code. However, in the Jan 2007 (Cambridge) GO meeting, it > was decided that the 'with' column information has to be filled in > for all IEAs (else Mike's filtering script strips them out). But > these GO annotations being predictions based on multiple high- > throughput data sets, don't have any information for the with > column. So, we are left with no choice. > > Which evidence code do people think should be used for these kinds > of computational datasets when there is not an obvious "with"? > > Thanks for your input. 
> > > Rama > > > +-----o--o > --------------------------------------------------------------- > o-o Rama Balakrishnan Ph.D > O Senior Scientific Curator > o-o Saccharomyces Genome Database > o---o Stanford University > o----o Stanford, CA 94305-5120 > O-----O Ph: 650.725.8956 Fax: 650.723.7016 > 0--o email: rama at genome.stanford.edu > O Website: http://www.yeastgenome.org > o-o SGD Wiki- http://wiki.yeastgenome.org > +- o---o > ----------------------------------------------------------------- > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080326/5addb7c6/attachment.html From jblake at informatics.jax.org Wed Mar 26 21:51:52 2008 From: jblake at informatics.jax.org (Judith Blake) Date: Thu, 27 Mar 2008 00:51:52 -0400 Subject: [Annotation] evidence code advice In-Reply-To: <26C47A9C-74CD-4033-BE4E-086D6015713D@genomics.princeton.edu> References: <7EA8D90D-C57F-4F76-A060-3D28A470865D@genome.stanford.edu> <26C47A9C-74CD-4033-BE4E-086D6015713D@genomics.princeton.edu> Message-ID: <47EB27E8.4050603@informatics.jax.org> Hi Kara. I like having a new view on this. something to think about judy Kara Dolinski wrote: > Hi, > > The root of the problem, as I see it, is that we are mixing apples and > oranges with evidence codes. All but one of the evidence codes > indicate the type of experimental evidence for a GO annotation, but we > have one oddball, IEA, that indicates not what the experiment is, but > rather how the annotation was done. We keep running into variations > of the same problem: we have some evidence (whether experimental or > computational) for a GO annotation, but also want to indicate whether > a curator looked at it or not. > > My proposed (albeit radical) solution: > > Remove IEA as an evidence code. > > Create a new property for GO annotations (or add a new type of > qualifier) that captures how the annotation was done: manual or > automated. 
> > Everything that is currently IEA would be given the 'automated' > property/qualifier, and then would be given a new evidence code as > appropriate (mostly a flavor of ISS I would assume). > There can be a rule that all 'automated' annotations that are a flavor > of ISS must have a 'with' value. > > This would allow us to use 'RCA' as appropriate, in some cases they'd > be 'manual', in others, they'd be 'automated'. In Rama's case, the > annotations would be 'RCA' with an 'automated' qualifier. > > I realize the issues involved in making such a drastic change, so I > understand if we don't go there, but I do think that some approach > such as the one above is the best representation of the information > that we are trying to capture. > > Cheers, > Kara > > On Mar 26, 2008, at 4:30 PM, Rama Balakrishnan wrote: > >> >> Hi All, >> >> SGD has come across couple of computationally predicted GO annotation >> data sets for S. cerevisiae that we would like to add to our >> database. The GO annotations from these data sets are predictions >> based on multiple high-throughput data sets. RCA evidence code came >> to our minds but according to the documentation, the annotations all >> have to be manually reviewed by a curator to use this evidence. There >> are several 100 annotations of this kind and it is not feasible for >> us to manually review these annotations. >> >> Hence, we thought these annotations can be bulk loaded with IEA >> evidence code. However, in the Jan 2007 (Cambridge) GO meeting, it >> was decided that the 'with' column information has to be filled in >> for all IEAs (else Mike's filtering script strips them out). But >> these GO annotations being predictions based on multiple >> high-throughput data sets, don't have any information for the with >> column. So, we are left with no choice. >> >> Which evidence code do people think should be used for these kinds of >> computational datasets when there is not an obvious "with"? >> >> Thanks for your input. 
>> >> >> Rama >> >> >> +-----o--o >> --------------------------------------------------------------- >> o-o Rama Balakrishnan Ph.D >> O Senior Scientific Curator >> o-o Saccharomyces Genome Database >> o---o Stanford University >> o----o Stanford, CA 94305-5120 >> O-----O Ph: 650.725.8956 Fax: 650.723.7016 >> 0--o email: rama at genome.stanford.edu >> >> O Website: http://www.yeastgenome.org >> o-o SGD Wiki- http://wiki.yeastgenome.org >> +- o---o >> ----------------------------------------------------------------- >> >> >> >> >> >> >> > > ------------------------------------------------------------------------ > > _______________________________________________ > Annotation mailing list > Annotation at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/annotation > From midori at ebi.ac.uk Thu Mar 27 11:00:08 2008 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Thu, 27 Mar 2008 18:00:08 UT Subject: [Annotation] SourceForge Annotation Tracker Update Message-ID: <200803271800.m2RI08T1332739@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080327/60c7fbb8/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080327/60c7fbb8/attachment.pl From dhowe at cs.uoregon.edu Fri Mar 28 10:23:15 2008 From: dhowe at cs.uoregon.edu (Doug howe) Date: Fri, 28 Mar 2008 10:23:15 -0700 Subject: [Annotation] obsolete/secondary IDs in translation files Message-ID: <47ED2983.4090309@cs.uoregon.edu> How are obsolete/secondary GO IDs that may creep into the interpro2go, ec2go, and spkw2go translation files located and corrected? 
-Doug From jdeegan at ebi.ac.uk Fri Mar 28 10:27:16 2008 From: jdeegan at ebi.ac.uk (Jennifer Deegan (nee Clark)) Date: Fri, 28 Mar 2008 17:27:16 +0000 Subject: [Annotation] obsolete/secondary IDs in translation files In-Reply-To: <47ED2983.4090309@cs.uoregon.edu> References: <47ED2983.4090309@cs.uoregon.edu> Message-ID: <47ED2A74.2020905@ebi.ac.uk> Hi Doug, The GOA people just left this minute, but as I understand it these are all done here by GOA or InterPro staff. Jen Doug howe wrote: >How are obsolete/secondary GO IDs that may creep into the interpro2go, >ec2go, and spkw2go translation files located and corrected? >-Doug >_______________________________________________ >Annotation mailing list >Annotation at geneontology.org >http://fafner.stanford.edu/mailman/listinfo/annotation > > From dhowe at cs.uoregon.edu Fri Mar 28 10:31:13 2008 From: dhowe at cs.uoregon.edu (Doug howe) Date: Fri, 28 Mar 2008 10:31:13 -0700 Subject: [Annotation] obsolete/secondary IDs in translation files In-Reply-To: <47ED2A74.2020905@ebi.ac.uk> References: <47ED2983.4090309@cs.uoregon.edu> <47ED2A74.2020905@ebi.ac.uk> Message-ID: <47ED2B61.1070201@cs.uoregon.edu> Is this done on a continual basis, monthly, or sporadically? Just curious. When we apply the translations here I get reports of these bad translations. I assume they get the same type of report to facilitate corrections? At the moment the spkw2go and ec2go error reports only contain a handful of bad translations, but the ip2go translation error report appears rather meaty... -Doug Jennifer Deegan (nee Clark) wrote: > Hi Doug, > > The GOA people just left this minute, but as I understand it these are > all done here by GOA or InterPro staff. > > Jen > > Doug howe wrote: > >> How are obsolete/secondary GO IDs that may creep into the >> interpro2go, ec2go, and spkw2go translation files located and corrected?
>> -Doug >> _______________________________________________ >> Annotation mailing list >> Annotation at geneontology.org >> http://fafner.stanford.edu/mailman/listinfo/annotation >> >> > From midori at ebi.ac.uk Fri Mar 28 11:00:07 2008 From: midori at ebi.ac.uk (midori at ebi.ac.uk) Date: Fri, 28 Mar 2008 18:00:07 UT Subject: [Annotation] SourceForge Annotation Tracker Update Message-ID: <200803281800.m2SI0771065784@mozart.ebi.ac.uk> An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080328/c9d11c2a/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://fafner.stanford.edu/pipermail/annotation/attachments/20080328/c9d11c2a/attachment.pl From jdeegan at ebi.ac.uk Fri Mar 28 12:37:21 2008 From: jdeegan at ebi.ac.uk (Jennifer Deegan (nee Clark)) Date: Fri, 28 Mar 2008 19:37:21 +0000 Subject: [Annotation] obsolete/secondary IDs in translation files In-Reply-To: <47ED2B61.1070201@cs.uoregon.edu> References: <47ED2983.4090309@cs.uoregon.edu> <47ED2A74.2020905@ebi.ac.uk> <47ED2B61.1070201@cs.uoregon.edu> Message-ID: <47ED48F1.4020901@ebi.ac.uk> Hi Doug, I'm pretty certain that they just did a big training session last week to get more people working on interpro2go. Should be right as rain soon. Emily will be able to give you the correct details, but she travelling this weekend so will not see mail until Monday. Jen Doug howe wrote: > Is this done on a continual basis, or monthly, or sporadic? Just > curious. When we apply the translations here I get reports of these > bad translations. I assume they get the same type of thing to > facilitate corrections? At the moment the spkw2go and ec2go error > report only contains a handfull of bad translations, but the ip2go > translation error report appears rather meaty... 
> -Doug > > Jennifer Deegan (nee Clark) wrote: > >> Hi Doug, >> >> The GOA people just left this minute, but as I understand it these >> are all done here by GOA or InterPro staff. >> >> Jen >> >> Doug howe wrote: >> >>> How are obsolete/secondary GO IDs that may creep into the >>> interpro2go, ec2go, and spkw2go translation files located and >>> corrected? >>> -Doug >>> _______________________________________________ >>> Annotation mailing list >>> Annotation at geneontology.org >>> http://fafner.stanford.edu/mailman/listinfo/annotation >>> >>> >> From dhowe at cs.uoregon.edu Fri Mar 28 12:54:05 2008 From: dhowe at cs.uoregon.edu (Doug howe) Date: Fri, 28 Mar 2008 12:54:05 -0700 Subject: [Annotation] obsolete/secondary IDs in translation files In-Reply-To: <47ED48F1.4020901@ebi.ac.uk> References: <47ED2983.4090309@cs.uoregon.edu> <47ED2A74.2020905@ebi.ac.uk> <47ED2B61.1070201@cs.uoregon.edu> <47ED48F1.4020901@ebi.ac.uk> Message-ID: <47ED4CDD.80704@cs.uoregon.edu> No worries. Have a great weekend. -Doug Jennifer Deegan (nee Clark) wrote: > Hi Doug, > > I'm pretty certain that they just did a big training session last week > to get more people working on interpro2go. Should be right as rain soon. > Emily will be able to give you the correct details, but she travelling > this weekend so will not see mail until Monday. > > Jen > > > Doug howe wrote: > >> Is this done on a continual basis, or monthly, or sporadic? Just >> curious. When we apply the translations here I get reports of these >> bad translations. I assume they get the same type of thing to >> facilitate corrections? At the moment the spkw2go and ec2go error >> report only contains a handfull of bad translations, but the ip2go >> translation error report appears rather meaty... >> -Doug >> >> Jennifer Deegan (nee Clark) wrote: >> >>> Hi Doug, >>> >>> The GOA people just left this minute, but as I understand it these >>> are all done here by GOA or InterPro staff. 
>>> >>> Jen >>> >>> Doug howe wrote: >>> >>>> How are obsolete/secondary GO IDs that may creep into the >>>> interpro2go, ec2go, and spkw2go translation files located and >>>> corrected? >>>> -Doug >>>> _______________________________________________ >>>> Annotation mailing list >>>> Annotation at geneontology.org >>>> http://fafner.stanford.edu/mailman/listinfo/annotation >>>> >>>> >>> > From edimmer at ebi.ac.uk Mon Mar 31 01:44:10 2008 From: edimmer at ebi.ac.uk (E Dimmer) Date: Mon, 31 Mar 2008 09:44:10 +0100 Subject: [Annotation] obsolete/secondary IDs in translation files In-Reply-To: <47ED2B61.1070201@cs.uoregon.edu> References: <47ED2983.4090309@cs.uoregon.edu> <47ED2A74.2020905@ebi.ac.uk> <47ED2B61.1070201@cs.uoregon.edu> Message-ID: <47F0A45A.6080803@ebi.ac.uk> Hi Doug, We (GOA) correct the SPKW2GO mapping on a monthly basis; Dan sends an error report just before our release (I believe that Amelia looks after the EC2GO mapping and so will have a different system). InterPro receives a similar report from Dan for InterPro2GO. However, as Jen mentioned, the group is currently reorganising how they will deal with both the sanity checks and SourceForge queries, so they may have a bit of a backlog. I'll look into it. Cheers, Emily Doug howe wrote: > Is this done on a continual basis, monthly, or sporadically? Just > curious. When we apply the translations here I get reports of these bad > translations. I assume they get the same type of report to facilitate > corrections? At the moment the spkw2go and ec2go error reports only > contain a handful of bad translations, but the ip2go translation error > report appears rather meaty... > -Doug > > Jennifer Deegan (nee Clark) wrote: > >> Hi Doug, >> >> The GOA people just left this minute, but as I understand it these are >> all done here by GOA or InterPro staff.
>> >> Jen >> >> Doug howe wrote: >> >> >>> How are obsolete/secondary GO IDs that may creep into the >>> interpro2go, ec2go, and spkw2go translation files located and corrected? >>> -Doug >>> _______________________________________________ >>> Annotation mailing list >>> Annotation at geneontology.org >>> http://fafner.stanford.edu/mailman/listinfo/annotation >>> >>> >>> > _______________________________________________ > Annotation mailing list > Annotation at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/annotation > -- Do you need any additional GO annotation resources? Which proteins would you like annotated with GO? Let us know in the GOA User Survey, available at: http://www.ebi.ac.uk/GOA/contactus.html ------------------------------------------------------------------ Emily Dimmer Ph.D. GOA Coordinator EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD, U.K. Tel: +44 1223 494654 Fax: +44 1223 494468 email: edimmer at ebi.ac.uk URL: http://www.ebi.ac.uk/goa From pfey at northwestern.edu Mon Mar 31 10:59:55 2008 From: pfey at northwestern.edu (Petra Fey) Date: Mon, 31 Mar 2008 12:59:55 -0500 Subject: [Annotation] biocurator society Message-ID: <4E00C5F6-5C9B-4427-AD07-83C14C7AA879@northwestern.edu> Dear Biocurators, We have recently asked for your input about the formation of a biocurator group in association with an existing organization. Amongst those who answered there was no clear agreement regarding which group we would most benefit from, and most people seemed to prefer an independent society, with the caveat that it's a lot of work and most of us are not familiar with the procedures that would be necessary. We have continued to explore what would be the best solution. Recently, we have met with the SwissProt group in Geneva and they have been highly supportive of the idea of an independent society. They propose to contribute resources and help in establishing the society. 
With such support, the possibility of establishing a stand-alone society now looks more feasible. We'd like to ask for your input once again. Please answer this very simple survey considering the kind of society you wish to see established. Your immediate participation is very much appreciated. Please pass this message along to colleagues you think would be interested. Thank you! http://www.surveymonkey.com/s.aspx?sm=V3dhsWNvWMo4Zus9FGjSDg_3d_3d Petra on behalf of the planning committee -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/annotation/attachments/20080331/52b0a354/attachment.html From suzi at fruitfly.org Sun Mar 30 15:38:32 2008 From: suzi at fruitfly.org (Suzanna Lewis) Date: Sun, 30 Mar 2008 15:38:32 -0700 Subject: [Annotation] evidence code advice In-Reply-To: <26C47A9C-74CD-4033-BE4E-086D6015713D@genomics.princeton.edu> References: <7EA8D90D-C57F-4F76-A060-3D28A470865D@genome.stanford.edu> <26C47A9C-74CD-4033-BE4E-086D6015713D@genomics.princeton.edu> Message-ID: <3CC10808-17BB-45BF-9963-B8075045E3B8@fruitfly.org> This is very much along the lines that I've been trying to foster (remember the meeting in Cambridge at Jesus College). The idea was a bit-code (or bar-code) for evidence codes, with each bit acting as a flag for a different piece of information: not only automated/manual, but also large-scale/small-scale, and other characteristics of the evidence. As Kara (and many others) have said, there is quite a bit of overloading of multiple pieces of information in the current evidence codes. It would be nice one day to see these separated into different constituent bits of information. -S p.s. I thought that IEA did not -require- the 'with' column. p.p.s Was the decision tree a step in this direction? On Mar 26, 2008, at 1:59 PM, Kara Dolinski wrote: > Hi, > > The root of the problem, as I see it, is that we are mixing apples > and oranges with evidence codes.
All but one of the evidence codes > indicate the type of experimental evidence for a GO annotation, but > we have one oddball, IEA, that indicates not what the experiment is, > but rather how the annotation was done. We keep running into > variations of the same problem: we have some evidence (whether > experimental or computational) for a GO annotation, but also want to > indicate whether a curator looked at it or not. > > My proposed (albeit radical) solution: > > Remove IEA as an evidence code. > > Create a new property for GO annotations (or add a new type of > qualifier) that captures how the annotation was done: manual or > automated. > > Everything that is currently IEA would be given the 'automated' > property/qualifier, and then would be given a new evidence code as > appropriate (mostly a flavor of ISS I would assume). > There can be a rule that all 'automated' annotations that are a > flavor of ISS must have a 'with' value. > > This would allow us to use 'RCA' as appropriate, in some cases > they'd be 'manual', in others, they'd be 'automated'. In Rama's > case, the annotations would be 'RCA' with an 'automated' qualifier. > > I realize the issues involved in making such a drastic change, so I > understand if we don't go there, but I do think that some approach > such as the one above is the best representation of the information > that we are trying to capture. > > Cheers, > Kara > > On Mar 26, 2008, at 4:30 PM, Rama Balakrishnan wrote: > >> >> Hi All, >> >> SGD has come across couple of computationally predicted GO >> annotation data sets for S. cerevisiae that we would like to add to >> our database. The GO annotations from these data sets are >> predictions based on multiple high-throughput data sets. RCA >> evidence code came to our minds but according to the documentation, >> the annotations all have to be manually reviewed by a curator to >> use this evidence. 
There are several 100 annotations of this kind >> and it is not feasible for us to manually review these annotations. >> >> Hence, we thought these annotations can be bulk loaded with IEA >> evidence code. However, in the Jan 2007 (Cambridge) GO meeting, it >> was decided that the 'with' column information has to be filled in >> for all IEAs (else Mike's filtering script strips them out). But >> these GO annotations being predictions based on multiple high- >> throughput data sets, don't have any information for the with >> column. So, we are left with no choice. >> >> Which evidence code do people think should be used for these kinds >> of computational datasets when there is not an obvious "with"? >> >> Thanks for your input. >> >> >> Rama >> >> >> +-----o--o >> --------------------------------------------------------------- >> o-o Rama Balakrishnan Ph.D >> O Senior Scientific Curator >> o-o Saccharomyces Genome Database >> o---o Stanford University >> o----o Stanford, CA 94305-5120 >> O-----O Ph: 650.725.8956 Fax: 650.723.7016 >> 0--o email: rama at genome.stanford.edu >> O Website: http://www.yeastgenome.org >> o-o SGD Wiki- http://wiki.yeastgenome.org >> +- o---o >> ----------------------------------------------------------------- >> >> >> >> >> >> >> > > _______________________________________________ > Annotation mailing list > Annotation at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/annotation
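For readers following the thread, the separation Kara proposes and the bit-code Suzanna describes can be sketched roughly as below. This is only an illustration under stated assumptions, not anything agreed on the list: the names EvidenceFlag, Annotation, ISS_FLAVORS and violates_with_rule are hypothetical, the identifiers are placeholders, and which codes count as "a flavor of ISS" is left as a one-element set because the thread does not say.

```python
from dataclasses import dataclass
from enum import Flag, auto


class EvidenceFlag(Flag):
    """Hypothetical bit flags: one bit per independent property of the evidence."""
    AUTOMATED = auto()    # unset means a curator made or reviewed the annotation
    LARGE_SCALE = auto()  # unset means small-scale evidence


# Assumption: the thread does not list the "flavors of ISS", so only ISS itself is used.
ISS_FLAVORS = frozenset({"ISS"})


@dataclass(frozen=True)
class Annotation:
    gene_id: str                              # placeholder identifier
    go_id: str                                # placeholder identifier
    evidence_code: str                        # kind of evidence, e.g. "RCA" or "ISS"
    flags: EvidenceFlag = EvidenceFlag(0)     # how the annotation was made
    with_value: str = ""                      # contents of the 'with' column, if any


def violates_with_rule(a: Annotation) -> bool:
    """Proposed rule: an automated ISS-flavored annotation must have a 'with' value."""
    is_automated = bool(a.flags & EvidenceFlag.AUTOMATED)
    return is_automated and a.evidence_code in ISS_FLAVORS and not a.with_value


# Rama's case under this scheme: RCA evidence, flagged as automated and large-scale.
predicted = Annotation(
    "GENE_PLACEHOLDER", "GO:PLACEHOLDER", "RCA",
    EvidenceFlag.AUTOMATED | EvidenceFlag.LARGE_SCALE,
)
```

Under this sketch the evidence code says only what kind of evidence supports the annotation, while the flags carry whether a curator looked at it, so Rama's predictions would not trip the 'with' rule that applies to automated ISS annotations.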