From dbarrell at ebi.ac.uk Thu Apr 5 07:58:39 2007
From: dbarrell at ebi.ac.uk (Daniel Barrell)
Date: Thu, 05 Apr 2007 15:58:39 +0100
Subject: April 2007 GOA release
In-Reply-To: <42690FC8.8010409@ebi.ac.uk>
References: <42690FC8.8010409@ebi.ac.uk>
Message-ID: <46150E9F.4090704@ebi.ac.uk>

GOA releases: April 2007
===========================

GOA (GO Annotation at EBI) is a project run by the European Bioinformatics Institute that aims to provide assignments of gene products to the Gene Ontology (GO) resource.

The data can be obtained via:

EBI FTP: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/
EBI SRS: http://srs.ebi.ac.uk. Search GOA data library

The data will be made available at the following sites as soon as a disk issue is fixed, probably next week (another email notice will be sent):

GO FTP: ftp://ftp.geneontology.org/pub/go/gene-associations/
GO CVS: http://www.geneontology.org/GO.CVS.help.html

For further information read: http://www.ebi.ac.uk/GOA or contact goa at ebi.ac.uk.

GOA News Update:
*****************

The Ensembl Compara pipeline uses manual GO annotations from GOA's January release in combination with ortholog data from the current (43) Ensembl release.

Regards
The UniProt GOA Team

--
This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project.
Problems with the list? E-mail: owner-gofriends at geneontology.org
Subscribing send "subscribe" to gofriends-request at geneontology.org
Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org
Web: http://www.geneontology.org/

From asidhu at biomap.org Sat Apr 7 13:32:41 2007
From: asidhu at biomap.org (Amandeep S.
Sidhu)
Date: Sun, 08 Apr 2007 06:32:41 +1000
Subject: Call for Chapters: Biomedical Data and Applications by Springer
Message-ID: <1175977961.4617ffe99472f@mail.opentransfer.com>

***********************************************************
Call for Book Chapters
Biomedical Data and Applications
http://biomed.biomap.org/
***********************************************************

Editors: Amandeep S. Sidhu, Tharam S. Dillon, Elizabeth Chang
Digital Ecosystems and Business Intelligence, Curtin University of Technology, Perth, Australia

Volume: Studies in Computational Intelligence Series
Publisher: Springer
http://www.springer.com/series/7092/

************************************************************
Deadline: 15 June 2007
************************************************************

Introduction:

The goal of this book is to cover biomedical data and applications, identifying new issues and directions for future research in the biomedical domain. The book will become a useful guide for researchers, practitioners, and graduate-level students interested in learning about state-of-the-art developments in biomedical data management, data-intensive bioinformatics systems, and other miscellaneous biological database applications. The content of this book is at an introductory and medium technical level.

*************************************************************
Topics:

Chapters in the book will broadly address the following areas:

* Conceptual Models for Biological and Medical Data
* Data Representation and Visualization
* Biomedical Databases and Data Integration
* Biomedical Data Analysis and Interoperability
* Biological Query Processing, Query Optimization, and Information Retrieval
* Biomedical Ontologies and taxonomies
* Ontology-driven Biomedical Systems
* Computational Methods for Microarray analysis, Protein and RNA structure prediction
* Feature selection and pattern discovery in biological data
* Modelling Biological Pathways
* Health Care Information Systems
* Electronic Health Records
* Biomedical Data Privacy and Security
* Clinical Assessment and Patient Diagnosis
* Disease Control and Prevention
* Data mining applications in bioinformatics, biomedicine, health care and other biomedical domain areas

****************************************************************
Publication:

The book will be published in Springer's Studies in Computational Intelligence series in late 2007. The Studies in Computational Intelligence series publishes new developments and advances in the various areas of computational intelligence.

****************************************************************
Submission Instructions:

Papers should be original and should not have been submitted for publication or published elsewhere. All papers are to be submitted electronically (PDF or MS Word format preferred) by email to: Amandeep.Sidhu at cbs.curtin.edu.ai

For details on how to prepare the manuscripts and style files refer to:
http://www.springer.com/west/home/new+&+forthcoming+titles+(default)?SGWID=4-40356-69-173623546-0&detailsPage=contentItemPage&contentItemId=149813&CIPageCounter=CI_FOR_AUTHORS_AND_EDITORS_PAGE1

****************************************************************
Important Dates:

June 15, 2007: Last day to email a letter of intent (including a tentative title, short summary of proposed topic, and a list of authors)
July 31, 2007: Deadline to submit a full book chapter paper
August 30, 2007: Notification of acceptance with referee comments and revision comments
September 30, 2007: Deadline to submit final version of the accepted chapters in camera-ready Springer format

****************************************************************
Editors:

Amandeep S. Sidhu, Prof. Tharam S. Dillon, Prof. Elizabeth Chang
Digital Ecosystems & Business Intelligence Institute, Curtin University of Technology, Australia
http://debii.curtin.edu.au/
****************************************************************

From asidhu at biomap.org Sun Apr 8 02:48:10 2007
From: asidhu at biomap.org (Amandeep S.
Sidhu)
Date: Sun, 08 Apr 2007 19:48:10 +1000
Subject: CFP: 2007 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2007)
Message-ID: <1176025690.4618ba5a4611b@mail.opentransfer.com>

2007 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2007)
http://www.cis.drexel.edu/faculty/thu/bibm/index.php.htm
San Jose, CA, USA
Nov 2-4, 2007

IEEE BIBM 2007 will provide a general forum for disseminating the latest research in bioinformatics and biomedicine. It is a multidisciplinary conference that brings together academic and industrial scientists from computer science, biology, chemistry, medicine, mathematics and statistics. It will exchange research results and address open issues in all aspects of bioinformatics and biomedicine and provide a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, simulation, ontology and other computational methods, as applied to life science problems, with emphasis on applications in high throughput data-rich areas in biology and biomedical engineering. IEEE BIBM 2007 intends to attract a balanced combination of computer scientists, biologists, biomedical engineers, chemists, data analysts, and statisticians.

Topics of Interest:

We are soliciting high quality original research papers (including significant works-in-progress) in any aspect of bioinformatics and biomedicine. New computational techniques and methods in machine learning; data mining; text analysis; pattern recognition; knowledge representation; databases; data modeling; combinatorics; stochastic modeling; string and graph algorithms; linguistic methods; robotics; constraint satisfaction; data visualization; parallel computation; data integration; modeling and simulation and their application in the life science domain are especially encouraged. Relevant topics include, but are not limited to:

* Biological Data Mining
* High Performance Bio-computing
* Genomics, Comparative Genomics
* Biological Databases & Data Integration
* Biological Data Visualization
* Evolution and Phylogeny
* Molecular sequence analysis
* Healthcare Informatics
* Recognition of genes and regulatory elements
* Molecular evolution
* Proteomics, Protein structure, Computational proteomics
* Gene networks, Computational genetics
* Combinatorial libraries and drug design
* Computational systems biology
* Microarray design and data analysis
* Biomedical Ontologies
* Sequence Analysis
* Structural and functional genomics
* Synthetic Biological Systems
* Pathways, Networks, Systems Biology
* Text Mining & Information Extraction
* Transcriptomics
* BioMedical Devices with Embedded Computers
* Signal and Image Processing in BioMedicine
* BioMedical Image Segmentation and Compression
* BioMedical Intelligence and Data Warehousing
* BioMedical Databases & Information Systems, BioMedical Information
* Multimedia Biomedical Databases
* Intelligent Biomedical Knowledge Discovery
* Modeling signalling networks
* Temporal and spatial mining of clinical data including medical images

SUBMISSION REQUIREMENTS:

Please submit a full length paper (not exceeding 5000 words) through the online submission system (we will use the same cyber chair system as the IEEE AI/IAT 2007 and IEEE GrC 2007 conference). Electronic submissions (in PDF or PostScript format) are required. Selected participants will be asked to submit their revised papers in a format to be specified at the time of acceptance. The best papers selected from IEEE BIBM 07 will be published as a special issue in related IEEE Transactions.

IMPORTANT DATES:

Paper submission: June 10, 2007
Notification: Aug 10, 2007
Camera-ready due: Aug 30, 2007

MEETING ORGANIZATION:

GENERAL CO-CHAIRS:
Prof. Yi Pan (Department of Computer Science, Georgia State University, USA)
Prof. Carmelina Ruggiero (Editor-in-chief of IEEE Trans. on NanoBioscience, University of Genoa, Italy)
Prof. Aydin Tozeren (Director, Center for Integrated Bioinformatics, Drexel University, USA)

PROGRAM CO-CHAIRS:
Prof. Xiaohua Hu (College of Information Science & Technology, Drexel University, USA)
Prof. Zoran Obradovic (Center for Information Science and Technology, Temple University, USA)
Prof. Ion Mandoiu (Computer Science & Engineering Department, University of Connecticut, USA)

WORKSHOP CO-CHAIRS:
Jinyan Li, Institute for Infocomm Research, Singapore
Xue-Wen Chen, Univ. of Kansas, USA
Tharam S. Dillon, Curtin University of Technology, Australia

TUTORIAL CHAIR:
Sun Kim, Indiana University, USA

LOCAL ACCOMMODATION CO-CHAIRS:
David Scot Taylor, San José State University, USA
Tom Qi Zhang, Google, USA

PUBLICITY CO-CHAIRS:
Amandeep S. Sidhu, Curtin University of Technology, Australia
Shuigeng Zhou, Fudan University, China

From camon at ebi.ac.uk Wed Apr 11 05:03:03 2007
From: camon at ebi.ac.uk (Evelyn Camon)
Date: Wed, 11 Apr 2007 13:03:03 +0100
Subject: Position for immune system Reactome Curator at EBI
Message-ID: <461CCE77.4050004@ebi.ac.uk>

Job description: Reactome (http://www.reactome.org) is a pathway knowledgebase that focuses on human molecular biology. It integrates experimentally determined information from literature sources to produce molecular networks suitable for computational analysis. As part of our effort to study pathways of direct clinical significance, a wide-ranging literature review process is currently underway, geared towards the network analysis of human immune-related pathways.
This post is an opportunity to build and study an important pathway resource while working with leading experts in signaling biology, and an experienced in-house team of Biologists and Bioinformaticians. Qualifications and experience: The ideal candidate should hold a Masters/PhD in Biology and have experience in Signaling and Molecular Biology or Immunology. Experience in any of the following areas will be considered as an advantage: bioinformatics, computer programming, and database/spreadsheet management. The Reactome Curator should be able to demonstrate good interpersonal and communication skills as well as attention to detail. An excellent command of written and spoken English is a requirement. Contract: A contract of 3 years will be offered to the successful candidate. This can be renewed, depending on circumstances at the time of the review. Closing date: 29 April 2007 (EMBL advert: http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html) -- Evelyn Camon GOA Coordinator Senior Scientific Curator European Bioinformatics Institute Tel:01223-494465 Fax:01223-494468 E-mail: camon at ebi.ac.uk URL: http://www.ebi.ac.uk/goa -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From lange at ipk-gatersleben.de Wed Apr 11 08:10:04 2007 From: lange at ipk-gatersleben.de (M. Lange) Date: Wed, 11 Apr 2007 17:10:04 +0200 Subject: Postdoc position for WebService development for semantic data and tool integration Message-ID: <461CFA4C.4050508@ipk-gatersleben.de> Job description: The Leibniz Institute of Plant Genetics and Crop Plant Research (http://www.ipk-gatersleben.de) is a leading institute in crop plant research. 
This results in several internationally published databases and software tools. These data and software form an important bioinformatics resource for the scientific community. The positive impact of IPK bioinformatics research can be increased by supporting Web service repositories like BioMoby or BioCase. The final goal is to embed the institute into international bioinformatics resource networks. Examples of such networks are the "Virtual Plant Information Network" (VPIN), the PlaNet-Network or the international initiative "Global Biodiversity Information Facility" (GBIF). Thus, this scholarship focuses on the concrete implementation of such a service and should support the service work of the IPK bioinformatics groups. This service must be implemented using standardized computer science and bioinformatics methods, such as XML, Web Services, ontologies, the Semantic Web and database integration.

Specific Requirements of the candidate: The project's goals require a Bioinformatics Postdoc with special experience in

- database concepts and technologies
- SQL
- Web-service techniques
- biological databases
- XML processing
- Java and Perl programming languages
- UNIX operating systems

Contract: A contract of 1 year will be offered to the successful candidate.

Matthias Lange
Group Leader IT
Leibniz Institute of Plant Genetics and Crop Plant Research
Tel: +49 39482 5693
Fax: +49 39482 5139
E-mail: lange at ipk-gatersleben.de
URL: http://www.ipk-gatersleben.de

From asidhu at biomap.org Wed Apr 11 12:14:08 2007
From: asidhu at biomap.org (Amandeep S.
Sidhu)
Date: Thu, 12 Apr 2007 05:14:08 +1000
Subject: CFP: 2007 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2007)
Message-ID: <1176318848.461d3380f168c@mail.opentransfer.com>

2007 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2007)
(co-located with IEEE/WCI/ACM WI-IAT 07 and IEEE GrC 07)
http://www.cis.drexel.edu/faculty/thu/bibm/index.php.htm
San Jose, CA, USA
Nov 2-4, 2007

IEEE BIBM 2007 will provide a general forum for disseminating the latest research in bioinformatics and biomedicine. It is a multidisciplinary conference that brings together academic and industrial scientists from computer science, biology, chemistry, medicine, mathematics and statistics. It will exchange research results and address open issues in all aspects of bioinformatics and biomedicine and provide a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, simulation, ontology and other computational methods, as applied to life science problems, with emphasis on applications in high throughput data-rich areas in biology and biomedical engineering. IEEE BIBM 2007 intends to attract a balanced combination of computer scientists, biologists, biomedical engineers, chemists, data analysts, and statisticians.

Topics of Interest:

We are soliciting high quality original research papers (including significant works-in-progress) in any aspect of bioinformatics and biomedicine. New computational techniques and methods in machine learning; data mining; text analysis; pattern recognition; knowledge representation; databases; data modeling; combinatorics; stochastic modeling; string and graph algorithms; linguistic methods; robotics; constraint satisfaction; data visualization; parallel computation; data integration; modeling and simulation and their application in the life science domain are especially encouraged. Relevant topics include, but are not limited to:

* Biological Data Mining
* High Performance Bio-computing
* Genomics, Comparative Genomics
* Biological Databases & Data Integration
* Biological Data Visualization
* Evolution and Phylogeny
* Molecular sequence analysis
* Healthcare Informatics
* Recognition of genes and regulatory elements
* Molecular evolution
* Proteomics, Protein structure, Computational proteomics
* Gene networks, Computational genetics
* Combinatorial libraries and drug design
* Computational systems biology
* Microarray design and data analysis
* BioOntologies
* Sequence Analysis
* Structural and functional genomics
* Synthetic Biological Systems
* Pathways, Networks, Systems Biology
* Text Mining & Information Extraction
* Transcriptomics
* BioMedical Devices with Embedded Computers
* Signal and Image Processing in BioMedicine
* BioMedical Image Segmentation and Compression
* BioMedical Intelligence and Data Warehousing
* BioMedical Databases & Information Systems, BioMedical Information
* Multimedia Biomedical Databases
* Intelligent Biomedical Knowledge Discovery
* Modeling signalling networks
* Temporal and spatial mining of clinical data including medical images

MEETING ORGANIZATION:

GENERAL CO-CHAIRS:

Prof. Yi Pan
Department of Computer Science
Georgia State University
34 Peachtree Street, Suite 1450
Atlanta, GA 30302-4110, USA

Prof. Carmelina Ruggiero
Editor-in-chief of IEEE Trans. on NanoBioscience
Professor of Bioengineering
DIST University of Genoa
Via Opera Pia 13
Genoa, 16145, Italy

Prof. Aydin Tozeren
Distinguished Professor and Director, Center for Integrated Bioinformatics,
School of Biomedical Engineering, Science & Health Systems
Drexel University, Philadelphia, PA 19104, USA

PROGRAM CO-CHAIRS:

Prof. Xiaohua Hu
College of Information Science & Technology
Drexel University
Philadelphia, PA 19104 USA

Prof. Zoran Obradovic
Center for Information Science and Technology,
Temple University,
303 Wachman Hall (038-24), 1805 N. Broad St.,
Philadelphia, PA 19122, USA

Prof. Ion Mandoiu
Computer Science & Engineering Department
University of Connecticut
371 Fairfield Way, Unit 2155
Storrs, CT 06269-2155
Office: ITEB 261

WORKSHOP CHAIRS:
Jinyan Li, Institute for Infocomm Research, Singapore
Xue-Wen Chen, Univ. of Kansas, USA
Tharam S. Dillon, Curtin University of Technology, Australia

TUTORIAL CHAIR:
Sun Kim, Indiana University, USA

LOCAL ACCOMMODATION CO-CHAIRS:
David Scot Taylor, San José State University, USA
Tom Qi Zhang, Google, USA

PUBLICITY CO-CHAIRS:
Amandeep S. Sidhu, Curtin University of Technology, Australia
Shuigeng Zhou, Fudan University, China

IEEE BIBM CO-CHAIRS/DIRECTORS:
Xiaohua Hu, Drexel University, USA
Yi Pan, Georgia State University, USA

IEEE BIBM STEERING COMMITTEE:
Xiaohua Hu, Drexel University, USA
Vipin Kumar, Univ. of Minnesota, USA
Michael Ng, Hong Kong Baptist University, China
Zoran Obradovic, Temple University, USA
Yi Pan, Georgia State University, USA
Jose Principe, Univ. of Florida, USA
Carmelina Ruggiero, DIST University of Genoa, Italy
Niilo Saranummi, VTT Information Technology, Finland
Shusaku Tsumoto, Shimane University, Japan

SUBMISSION REQUIREMENTS:

Please submit a full length paper (not exceeding 5000 words) through the online submission system (we will use the same cyber chair system as the IEEE/WCI-ACM WI-IAT 2007 and IEEE GrC 2007 conference). Electronic submissions (in PDF or PostScript format) are required. Selected participants will be asked to submit their revised papers in a format to be specified at the time of acceptance. The best papers selected from IEEE BIBM 07 will be published as a special issue in related IEEE Transactions and other journals such as IJDMB, IJBRA etc.

IMPORTANT DATES:

Paper submission: June 10, 2007
Notification: August 1, 2007
Camera-ready due: August 20, 2007

--
This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?
E-mail: owner-gofriends at geneontology.org
Subscribing send "subscribe" to gofriends-request at geneontology.org
Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org
Web: http://www.geneontology.org/

From taagi2 at hotmail.com Fri Apr 13 10:47:25 2007
From: taagi2 at hotmail.com (Geeta Joshi-Tope)
Date: Fri, 13 Apr 2007 13:47:25 -0400
Subject: Position for immune system Reactome Curator at EBI
In-Reply-To: <461CCE77.4050004@ebi.ac.uk>
Message-ID:

Hi Evelyn,

I am curious whether GO or yourself are now in charge of Reactome at EBI, or Reactome in general. I don't remember if we met when I was the Editor-in-Chief of Reactome (back until May 25, 2005).

cheers
Geeta
***************
Geeta Joshi-Tope

--
This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?
E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From camon at ebi.ac.uk Fri Apr 13 12:38:57 2007 From: camon at ebi.ac.uk (camon at ebi.ac.uk) Date: Fri, 13 Apr 2007 20:38:57 +0100 (BST) Subject: Position for immune system Reactome Curator at EBI In-Reply-To: References: <461CCE77.4050004@ebi.ac.uk> Message-ID: <40642.80.47.134.3.1176493137.squirrel@webmail.ebi.ac.uk> Dear Geeta Just to clarify. I simply passed on this position as a favour to Bernard de Bono who works for Reactome. I coordinate the GOA project at the EBI. Reactome and GO and GOA are separate databases. Ewan Birney heads Reactome(at EBI), GO is headed by Michael Ashburner, Judith Blake, Suzi Lewis and Mike Cherry, GOA is headed by Rolf Apweiler(at EBI). If you have any queries regarding this position they should be addressed not to me but to the EMBL personnel website. http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html E-mail: application at embl.de I don't think we met, maybe next time. Kind regards Evelyn Camon > > I am curious whether GO or yourslef are now in charge of Reactome at EBI, > or > Reactome in general. I don't remember if we met when I was the > Editor-in-Chief of Reactome (back until May 25, 2005). > > cheers > Geeta > *************** > Geeta Joshi-Tope > > >>From: Evelyn Camon >>To: gofriends at genome.stanford.edu, angenmap , >> "camon >> Evelyn Camon" >>Subject: Position for immune system Reactome Curator at EBI >>Date: Wed, 11 Apr 2007 13:03:03 +0100 >> >>Job description: Reactome (http://www.reactome.org) is a pathway >>knowledgebase that focuses on human molecular biology. It integrates >>experimentally determined information from literature sources to produce >>molecular networks suitable for computational analysis. 
As part of our >>effort to study pathways of direct clinical significance, a wide-ranging >>literature review process is currently underway, geared towards the network >>analysis of human immune-related pathways. This post is an opportunity to >>build and study an important pathway resource while working with leading >>experts in signaling biology, and an experienced in-house team of >>Biologists and Bioinformaticians. >> >>Qualifications and experience: The ideal candidate should hold a >>Masters/PhD in Biology and have experience in Signaling and Molecular >>Biology or Immunology. Experience in any of the following areas will be >>considered as an advantage: bioinformatics, computer programming, and >>database/spreadsheet management. The Reactome Curator should be able to >>demonstrate good interpersonal and communication skills as well as >>attention to detail. An excellent command of written and spoken English is >>a requirement. >> >>Contract: A contract of 3 years will be offered to the successful >>candidate. This can be renewed, depending on circumstances at the time of >>the review. >> >>Closing date: 29 April 2007 (EMBL advert: >>http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html) >>-- >>Evelyn Camon >>GOA Coordinator >>Senior Scientific Curator >>European Bioinformatics Institute >>Tel:01223-494465 >>Fax:01223-494468 >>E-mail: camon at ebi.ac.uk >>URL: http://www.ebi.ac.uk/goa
From taagi2 at hotmail.com Fri Apr 13 10:47:32 2007 From: taagi2 at hotmail.com (Geeta Joshi-Tope) Date: 13 Apr 2007 17:47:32 -0000 Subject: Position for immune system Reactome Curator at EBI Message-ID: <20070414035207.2585.qmail@nagrp.ansci.iastate.edu> Hi Evelyn, I am curious whether GO or yourself are now in charge of Reactome at EBI, or Reactome in general. I don't remember if we met when I was the Editor-in-Chief of Reactome (back until May 25, 2005). cheers Geeta *************** Geeta Joshi-Tope >From: Evelyn Camon >To: gofriends at genome.stanford.edu, angenmap , > "camon >> Evelyn Camon" >Subject: Position for immune system Reactome Curator at EBI >Date: Wed, 11 Apr 2007 13:03:03 +0100 > >Job description: Reactome (http://www.reactome.org) is a pathway >knowledgebase that focuses on human molecular biology. It integrates >experimentally determined information from literature sources to produce >molecular networks suitable for computational analysis.
As part of our >effort to study pathways of direct clinical significance, a wide-ranging >literature review process is currently underway, geared towards the network >analysis of human immune-related pathways. This post is an opportunity to >build and study an important pathway resource while working with leading >experts in signaling biology, and an experienced in-house team of >Biologists and Bioinformaticians. > >Qualifications and experience: The ideal candidate should hold a >Masters/PhD in Biology and have experience in Signaling and Molecular >Biology or Immunology. Experience in any of the following areas will be >considered as an advantage: bioinformatics, computer programming, and >database/spreadsheet management. The Reactome Curator should be able to >demonstrate good interpersonal and communication skills as well as >attention to detail. An excellent command of written and spoken English is >a requirement. > >Contract: A contract of 3 years will be offered to the successful >candidate. This can be renewed, depending on circumstances at the time of >the review. > >Closing date: 29 April 2007 (EMBL advert: >http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html) >-- >Evelyn Camon >GOA Coordinator >Senior Scientific Curator >European Bioinformatics Institute >Tel:01223-494465 >Fax:01223-494468 >E-mail: camon at ebi.ac.uk >URL: http://www.ebi.ac.uk/goa -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? 
E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From taagi2 at hotmail.com Tue Apr 17 08:09:31 2007 From: taagi2 at hotmail.com (Geeta Joshi-Tope) Date: Tue, 17 Apr 2007 11:09:31 -0400 Subject: Position for immune system Reactome Curator at EBI In-Reply-To: <40642.80.47.134.3.1176493137.squirrel@webmail.ebi.ac.uk> Message-ID: Dear Evelyn, Yes, of course I knew Bernard. As I said, I oversaw and managed the curation and curators, respectively, until ~4 PM, May 25, 2005. I was just surprised to see a post from GO rather than from project personnel. Thanks for the clarification. regards Geeta >From: camon at ebi.ac.uk >To: "Geeta Joshi-Tope" >CC: camon at ebi.ac.uk, gofriends at genome.stanford.edu, >angenmap at animalgenome.org, geetatoep at gmail.com >Subject: RE: Position for immune system Reactome Curator at EBI >Date: Fri, 13 Apr 2007 20:38:57 +0100 (BST) > >Dear Geeta > >Just to clarify. I simply passed on this position as a favour to Bernard >de Bono who works for Reactome. I coordinate the GOA project at the EBI. >Reactome and GO and GOA are separate databases. Ewan Birney heads >Reactome(at EBI), GO is headed by Michael Ashburner, Judith Blake, Suzi >Lewis and Mike Cherry, GOA is headed by Rolf Apweiler(at EBI). >If you have any queries regarding this position they should be addressed >not to me but to the EMBL personnel website. >http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html >E-mail: application at embl.de > >I don't think we met, maybe next time. > >Kind regards > >Evelyn Camon > > > > > I am curious whether GO or yourself are now in charge of Reactome at >EBI, > > or > > Reactome in general. I don't remember if we met when I was the > > Editor-in-Chief of Reactome (back until May 25, 2005).
> > > > cheers > > Geeta > > *************** > > Geeta Joshi-Tope > > > > > >>From: Evelyn Camon > >>To: gofriends at genome.stanford.edu, angenmap , > >> "camon >> Evelyn Camon" > >>Subject: Position for immune system Reactome Curator at EBI > >>Date: Wed, 11 Apr 2007 13:03:03 +0100 > >> > >>Job description: Reactome (http://www.reactome.org) is a pathway > >>knowledgebase that focuses on human molecular biology. It integrates > >>experimentally determined information from literature sources to produce > >>molecular networks suitable for computational analysis. As part of our > >>effort to study pathways of direct clinical significance, a wide-ranging > >>literature review process is currently underway, geared towards the > >> network > >>analysis of human immune-related pathways. This post is an opportunity >to > >>build and study an important pathway resource while working with leading > >>experts in signaling biology, and an experienced in-house team of > >>Biologists and Bioinformaticians. > >> > >>Qualifications and experience: The ideal candidate should hold a > >>Masters/PhD in Biology and have experience in Signaling and Molecular > >>Biology or Immunology. Experience in any of the following areas will be > >>considered as an advantage: bioinformatics, computer programming, and > >>database/spreadsheet management. The Reactome Curator should be able to > >>demonstrate good interpersonal and communication skills as well as > >>attention to detail. An excellent command of written and spoken English > >> is > >>a requirement. > >> > >>Contract: A contract of 3 years will be offered to the successful > >>candidate. This can be renewed, depending on circumstances at the time >of > >>the review. 
> >> > >>Closing date: 29 April 2007 (EMBL advert: > >>http://www-db.embl.de/jss/servlet/de.embl.bk.emblGroups.JobsPage/07032.html) > >>-- > >>Evelyn Camon > >>GOA Coordinator > >>Senior Scientific Curator > >>European Bioinformatics Institute > >>Tel:01223-494465 > >>Fax:01223-494468 > >>E-mail: camon at ebi.ac.uk > >>URL: http://www.ebi.ac.uk/goa > >> > >> > >>-- > >>This message is from the GOFriends moderated mailing list. A list of > >>public > >>announcements and discussion of the Gene Ontology (GO) project. > >>Problems with the list? E-mail: > >> owner-gofriends at geneontology.org > >>Subscribing send "subscribe" to > >> gofriends-request at geneontology.org > >>Unsubscribing send "unsubscribe" to > >> gofriends-request at geneontology.org > >>Web: http://www.geneontology.org/ > > > > _________________________________________________________________ > > Mortgage refinance is Hot. *Terms. Get a 5.375%* fix rate. Check savings > > >https://www2.nextag.com/goto.jsp?product=100000035&url=%2fst.jsp&tm=y&search=mortgage_text_links_88_h2bbb&disc=y&vers=925&s=4056&p=5117 > > > > > > -- > > This message is from the GOFriends moderated mailing list. A list of > > public > > announcements and discussion of the Gene Ontology (GO) project. > > Problems with the list? E-mail: >owner-gofriends at geneontology.org > > Subscribing send "subscribe" to >gofriends-request at geneontology.org > > Unsubscribing send "unsubscribe" to >gofriends-request at geneontology.org > > Web: http://www.geneontology.org/ > > > > _________________________________________________________________ Download Messenger. Join the i?m Initiative. Help make a difference today. http://im.live.com/messenger/im/home/?source=TAGHM_APR07 -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? 
E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From tobias at sfsu.edu Tue Apr 17 12:35:08 2007 From: tobias at sfsu.edu (Tobias Sayre) Date: Tue, 17 Apr 2007 12:35:08 -0700 Subject: Quantifying Specificity of GO Terms Message-ID: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> Dear GO Friends, I am working on a project that involves curation of protein data that includes GO terms, and it would be very helpful if I had some numerical quantification of the specificity of each term. It is possible to manually examine each term to determine this specificity, but because there is a large amount of data, I would like to automate the process. I understand that there is no reliable way to do this simply using the level in the DAG hierarchy, but I am wondering if any of you might have a work-around. Thanks, Tobias Sayre -------------- next part -------------- An HTML attachment was scrubbed... URL: http://fafner.stanford.edu/pipermail/gofriends/attachments/20070417/af36407b/attachment.html From alanruttenberg at gmail.com Tue Apr 17 17:17:01 2007 From: alanruttenberg at gmail.com (Alan Ruttenberg) Date: Tue, 17 Apr 2007 20:17:01 -0400 Subject: Quantifying Specificity of GO Terms In-Reply-To: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> Message-ID: <3993900A-08AE-47BF-8D83-76B622E9D597@gmail.com> Try Google: gene ontology information content Best, Alan On Apr 17, 2007, at 3:35 PM, Tobias Sayre wrote: > Dear GO Friends, > > I am working on a project that involves curation of protein data > that includes GO terms, and it would be very helpful if I had some > numerical quantification of the specificity of each term. 
It is > possible to manually examine each term to determine this > specificity, but because there is a large amount of data, I would > like to automate the process. I understand that there is no > reliable way to do this simply using the level in the DAG > hierarchy, but I am wondering if any of you might have a work-around. > > Thanks, > > Tobias Sayre -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From froth at hms.harvard.edu Tue Apr 17 18:16:23 2007 From: froth at hms.harvard.edu (Fritz Roth) Date: Tue, 17 Apr 2007 21:16:23 -0400 Subject: Quantifying Specificity of GO Terms In-Reply-To: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.co m> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> Message-ID: <6.2.3.4.2.20070417211059.04730ba0@email.med.harvard.edu> My favorite measure of GO term specificity (or 'vagueness' if you prefer) is the fraction of genes currently annotated with that term. Fritz Roth At 03:35 PM 4/17/2007, Tobias Sayre wrote: >Dear GO Friends, > >I am working on a project that involves curation of protein data >that includes GO terms, and it would be very helpful if I had some >numerical quantification of the specificity of each term. It is >possible to manually examine each term to determine this >specificity, but because there is a large amount of data, I would >like to automate the process. I understand that there is no >reliable way to do this simply using the level in the DAG hierarchy, >but I am wondering if any of you might have a work-around. > >Thanks, > >Tobias Sayre ____________________________________________________ Frederick P. 'Fritz' Roth, Asst. 
Professor, Harvard Medical School / BCMP Dept. 250 Longwood Avenue, SGMB-322, Boston, MA 02115 phone:(617) 432-3551 mailto:fritz_roth at hms.harvard.edu fax: (617) 432-3557 http://llama.med.harvard.edu ____________________________________________________ -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From sorin at wayne.edu Tue Apr 17 19:43:33 2007 From: sorin at wayne.edu (Sorin Draghici) Date: Tue, 17 Apr 2007 22:43:33 -0400 Subject: Quantifying Specificity of GO Terms In-Reply-To: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> Message-ID: <462585D5.3070607@wayne.edu> Tobias, How exactly would you define the specificity of a GO term? If we had an exact definition, we could possibly write a piece of software that could do it. Sorin Tobias Sayre wrote: > Dear GO Friends, > > I am working on a project that involves curation of protein data that > includes GO terms, and it would be very helpful if I had some > numerical quantification of the specificity of each term. It is > possible to manually examine each term to determine this specificity, > but because there is a large amount of data, I would like to automate > the process. I understand that there is no reliable way to do this > simply using the level in the DAG hierarchy, but I am wondering if any > of you might have a work-around. > > Thanks, > > Tobias Sayre -- Sorin Draghici, Ph.D. Director of the Bioinformatics Core, Karmanos Cancer Institute Associate Professor Tel: (313) 577-5484 Dept. 
of Computer Science Fax: (313) 577-6868 Wayne State University 5143 Cass Ave, Room 431 State Hall, Detroit, MI, 48202 WWW: http://vortex.cs.wayne.edu/Sorin/ (personal) WWW: http://vortex.cs.wayne.edu/Projects.html (lab) Check out my recent book: Data Analysis Tools for Microarrays: http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C3154&parent_id=&pc= -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From pshannon at systemsbiology.org Tue Apr 17 21:36:23 2007 From: pshannon at systemsbiology.org (Paul Shannon) Date: Tue, 17 Apr 2007 21:36:23 -0700 Subject: Quantifying Specificity of GO Terms In-Reply-To: <462585D5.3070607@wayne.edu> (message from Sorin Draghici on Tue, 17 Apr 2007 22:43:33 -0400) References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> Message-ID: <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> The Bioconductor project has, I believe, a fine solution to this problem -- though please forgive me if I have misconstrued things. The relevant packages (see below) use the Hypergeometric distribution to calculate a p-value for the enrichment of any GO node for the genes in question. I typically map proteins to GeneID's as the first step in my analysis. If this sounds like it addresses your problem, you may wish to take a look at http://bioconductor.org/packages/1.9/bioc/html/GOstats.html and http://bioconductor.org/packages/1.9/bioc/html/Category.html Each of these web pages contains a 'vignette' in a pdf file which makes for a good introduction to the methods. 
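For readers who want to see what the hypergeometric enrichment calculation mentioned above amounts to, here is a minimal Python sketch. It is not the GOstats or Category API (those are R packages); the function name and its parameters are hypothetical, and it assumes a one-sided test, i.e. the probability of seeing at least the observed number of term-annotated genes in the selected gene set.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background (hypothetical parameter name)
    annotated:  background genes annotated to the GO term
    sample:     number of genes in the selected set
    hits:       selected genes that are annotated to the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    if not (0 <= hits <= min(annotated, sample)):
        raise ValueError("hits must lie in [0, min(annotated, sample)]")

    # Number of ways to draw `sample` genes from the background.
    total = comb(population, sample)
    # Sum the probabilities of drawing k annotated genes for every k >= hits.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, min(annotated, sample) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 3 annotated to the term,
    # 4 genes selected, 2 of which carry the annotation.
    print(hypergeom_enrichment_pvalue(10, 3, 4, 2))
```

With these toy numbers the tail sum is 70 out of 210 possible draws, so the p-value is about 0.33. Note that this quantifies enrichment of a term in a particular gene set, which is a different question from the intrinsic specificity of the term.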
Though originally conceived in the context of microarrays, I use these packages quite fruitfully with proteomics data. - Paul > > I am working on a project that involves curation of protein data that > > includes GO terms, and it would be very helpful if I had some > > numerical quantification of the specificity of each term. It is > > possible to manually examine each term to determine this specificity, > > but because there is a large amount of data, I would like to automate > > the process. I understand that there is no reliable way to do this > > simply using the level in the DAG hierarchy, but I am wondering if any > > of you might have a work-around.
From Waclaw.Marcin.Kusnierczyk at idi.ntnu.no Wed Apr 18 00:59:21 2007 From: Waclaw.Marcin.Kusnierczyk at idi.ntnu.no (Waclaw Kusnierczyk) Date: Wed, 18 Apr 2007 09:59:21 +0200 Subject: Quantifying Specificity of GO Terms In-Reply-To: <6.2.3.4.2.20070417211059.04730ba0@email.med.harvard.edu> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <6.2.3.4.2.20070417211059.04730ba0@email.med.harvard.edu> Message-ID: <4625CFD9.4070707@idi.ntnu.no> See [1] for an information-theoretic approach. [1] Alterovitz, G., Xiang, M., Mohan, M., and Ramoni, M.F. (2007). GO PaD: the Gene Ontology Partition Database. Nucleic Acids Res, 35 (Database issue), D322-D327. vQ Fritz Roth wrote: > My favorite measure of GO term specificity (or 'vagueness' if you > prefer) is the fraction of genes currently annotated with that term.
> > Fritz Roth > > At 03:35 PM 4/17/2007, Tobias Sayre wrote: >> Dear GO Friends, >> >> I am working on a project that involves curation of protein data that >> includes GO terms, and it would be very helpful if I had some >> numerical quantification of the specificity of each term. It is >> possible to manually examine each term to determine this specificity, >> but because there is a large amount of data, I would like to automate >> the process. I understand that there is no reliable way to do this >> simply using the level in the DAG hierarchy, but I am wondering if any >> of you might have a work-around. >> >> Thanks, >> >> Tobias Sayre > > ____________________________________________________ > Frederick P. 'Fritz' Roth, Asst. Professor, > Harvard Medical School / BCMP Dept. > 250 Longwood Avenue, SGMB-322, Boston, MA 02115 > phone:(617) 432-3551 mailto:fritz_roth at hms.harvard.edu > fax: (617) 432-3557 http://llama.med.harvard.edu > ____________________________________________________ > > -- > This message is from the GOFriends moderated mailing list. A list of > public > announcements and discussion of the Gene Ontology (GO) project. > Problems with the list? E-mail: owner-gofriends at geneontology.org > Subscribing send "subscribe" to gofriends-request at geneontology.org > Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org > Web: http://www.geneontology.org/ -- Wacek Kusnierczyk ------------------------------------------------------ Department of Information and Computer Science (IDI) Norwegian University of Science and Technology (NTNU) Sem Saelandsv. 7-9 7027 Trondheim Norway tel. 0047 73591875 fax 0047 73594466 ------------------------------------------------------ -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? 
E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From Waclaw.Marcin.Kusnierczyk at idi.ntnu.no Wed Apr 18 01:12:34 2007 From: Waclaw.Marcin.Kusnierczyk at idi.ntnu.no (Waclaw Kusnierczyk) Date: Wed, 18 Apr 2007 10:12:34 +0200 Subject: Quantifying Specificity of GO Terms In-Reply-To: <462585D5.3070607@wayne.edu> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> Message-ID: <4625D2F2.1020300@idi.ntnu.no> This is exactly the problem. Coding is the least painful part. Both the length of the path from a node to the root and the count of annotations say more about how well explored and interesting the particular part of the GO is than about any sort of specificity of the GO term. One possibility that I've been advocating for a while is to measure specificity in terms of a GO term's correspondence with the taxon/taxa in the taxonomy of species, for which organisms the term may be sensibly used to talk about. This would not, of course, solve those of your problems that are not related to the classification of organisms, but might help solve others. vQ Sorin Draghici wrote: > Tobias, > > How exactly would you define the specificity of a GO term? If we had an > exact definition, we could possibly write a piece of software that could > do it. > > Sorin > > Tobias Sayre wrote: >> Dear GO Friends, >> >> I am working on a project that involves curation of protein data that >> includes GO terms, and it would be very helpful if I had some >> numerical quantification of the specificity of each term. It is >> possible to manually examine each term to determine this specificity, >> but because there is a large amount of data, I would like to automate >> the process. 
I understand that there is no reliable way to do this >> simply using the level in the DAG hierarchy, but I am wondering if any >> of you might have a work-around. >> >> Thanks, >> >> Tobias Sayre > -- Wacek Kusnierczyk ------------------------------------------------------ Department of Information and Computer Science (IDI) Norwegian University of Science and Technology (NTNU) Sem Saelandsv. 7-9 7027 Trondheim Norway tel. 0047 73591875 fax 0047 73594466 ------------------------------------------------------ -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From jrichter at berkeleybop.org Tue Apr 17 22:44:01 2007 From: jrichter at berkeleybop.org (John Day-Richter) Date: Tue, 17 Apr 2007 23:44:01 -0600 Subject: OBO-Edit 1.100 Official Release Message-ID: <4625B021.9070505@berkeleybop.org> The official release of OBO-Edit 1.100 is now available at http://sourceforge.net/projects/geneontology See the README file for a list of the many bug fixes and new features in this revision. Many thanks to the OBO-Edit Working Group for the hundreds of hours they spent testing the OBO-Edit 1.1 beta releases! Thanks to them, we now have an official, stable release! -John -- This message is from the GOFriends moderated mailing list. A list of public announcements and discussion of the Gene Ontology (GO) project. Problems with the list? 
E-mail: owner-gofriends at geneontology.org Subscribing send "subscribe" to gofriends-request at geneontology.org Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org Web: http://www.geneontology.org/ From cowel001 at mc.duke.edu Wed Apr 4 10:11:52 2007 From: cowel001 at mc.duke.edu (Lindsay G Cowell) Date: Wed, 4 Apr 2007 13:11:52 -0400 Subject: postdoc/curator position available Message-ID: The Duke University Center of Computational Immunology ( http://www.dulci.org) seeks applicants to work on two biocuration projects, one focused on immunology and one focused on infectious diseases. The approach taken under both projects is the curation of knowledge from the primary literature and public databases followed by the formal representation of that knowledge for automated reasoning. In its early stages, the infectious disease project will focus on host-pathogen interaction in the context of Staphylococcus aureus infection. The next stage will be coverage of host-pathogen interaction in the context of Mycobacterium tuberculosis infection. Successful candidates will have a recent PhD in Immunology, Microbiology, or Biology and experience with a variety of molecular and cell biology experimental techniques. Experience with biocuration, ontology development, text mining and information retrieval, knowledge representation, or automated reasoning is an added strength, but we are prepared to provide training in those areas. Please submit an application package consisting of a letter of interest containing contact information and citizenship status, a curriculum vitae and a one-page statement of research interests. Please arrange to have three letters of reference sent. The application package and reference letters should be submitted electronically, as PDF files, to Lindsay Cowell lgcowell at duke.edu. The positions will remain open until filled ------------------------------------- Lindsay G. 
Cowell Assistant Professor Department of Biostatistics and Bioinformatics Duke University Medical Center (919) 681-6226
From qdong at genome.Stanford.EDU Wed Apr 18 10:01:25 2007 From: qdong at genome.Stanford.EDU (Stan Dong) Date: Wed, 18 Apr 2007 10:01:25 -0700 Subject: Quantifying Specificity of GO Terms In-Reply-To: <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> Message-ID: Another tool is GO-TermFinder by Gavin Sherlock. I believe there is interest in incorporating this tool into AmiGO. http://search.cpan.org/dist/GO-TermFinder/ SGD has been using it, and our users have been very satisfied with it. You may check the SGD page to get some sense of a use case. http://db.yeastgenome.org/cgi-bin/GO/goTermFinder -Stan On Apr 17, 2007, at 9:36 PM, Paul Shannon wrote: > The Bioconductor project has, I believe, a fine solution to this > problem -- though > please forgive me if I have misconstrued things. The relevant > packages (see > below) use the Hypergeometric distribution to calculate a p-value > for the > enrichment of any GO node for the genes in question. I typically > map proteins > to GeneID's as the first step in my analysis. > > If this sounds like it addresses your problem, you may wish to take > a look at > > http://bioconductor.org/packages/1.9/bioc/html/GOstats.html and > http://bioconductor.org/packages/1.9/bioc/html/Category.html > > Each of these web pages contains a 'vignette' in a pdf file which > makes for > a good introduction to the methods. > > Though originally conceived in the context of microarrays, I use > these packages > quite fruitfully with proteomics data.
> > - Paul > > >>> I am working on a project that involves curation of protein data >>> that >>> includes GO terms, and it would be very helpful if I had some >>> numerical quantification of the specificity of each term. It is >>> possible to manually examine each term to determine this >>> specificity, >>> but because there is a large amount of data, I would like to >>> automate >>> the process. I understand that there is no reliable way to do this >>> simply using the level in the DAG hierarchy, but I am wondering >>> if any >>> of you might have a work-around.
From sorin at wayne.edu Wed Apr 18 11:40:55 2007 From: sorin at wayne.edu (Sorin Draghici) Date: Wed, 18 Apr 2007 14:40:55 -0400 Subject: Quantifying Specificity of GO Terms In-Reply-To: References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> Message-ID: <46266637.5040904@wayne.edu> Hi, There seems to be a fair amount of confusion here. There are about 20 tools that can calculate a statistical significance value for a GO term given a set of differentially expressed genes.
This is a well-known problem that was defined about 4-5 years ago, see for instance: http://vortex.cs.wayne.edu/papers/genomics.pdf and http://vortex.cs.wayne.edu/papers/Onto-Express_V2_proof.pdf GO-TermFinder, GOstats and the others mentioned in the recent emails are all tools from the same category, tools that address the problem defined above. If anybody is interested in this problem, 17 of these tools have been recently reviewed in: http://vortex.cs.wayne.edu/papers/Ontological_analysis.pdf. The GO tools page includes pointers to many if not all such tools. The question at hand here is how to quantify the specificity of a given term. This is independent of any experiment and any set of differentially regulated genes and has to do with the structure of the GO and the position of the given term in the DAG. For instance, "regulation of apoptosis through extracellular signals" is more specific than "regulation of apoptosis" or "apoptosis". The problem is how to numerically quantify this specificity. To my knowledge, there are no tools of any kind that would even remotely provide any quantitative assessment of this specificity. Any answers or thoughts on this issue would be very valuable. Regards, Sorin Stan Dong wrote: > Another tool is the GO-TermFinder by Gavin Sherlock. I believe there > is interest for Amigo to incorporate this tool. > > http://search.cpan.org/dist/GO-TermFinder/ > > SGD has been using it with great satisfaction from our users. You may > check the SGD page to get some sense of a use case. > > http://db.yeastgenome.org/cgi-bin/GO/goTermFinder > > -Stan > > On Apr 17, 2007, at 9:36 PM, Paul Shannon wrote: > >> The Bioconductor project has, I believe, a fine solution to this >> problem -- though >> please forgive me if I have misconstrued things. The relevant >> packages (see >> below) use the Hypergeometric distribution to calculate a p-value for >> the >> enrichment of any GO node for the genes in question.
I typically map >> proteins >> to GeneID's as the first step in my analysis. >> >> If this sounds like it addresses your problem, you may wish to take a >> look at >> >> http://bioconductor.org/packages/1.9/bioc/html/GOstats.html and >> http://bioconductor.org/packages/1.9/bioc/html/Category.html >> >> Each of these web pages contains a 'vignette' in a pdf file which >> makes for >> a good introduction to the methods. >> >> Though orginally conceived in the context of microarrays, I use these >> packages >> quite fruitfully with proteomics data. >> >> - Paul >> >> >>>> I am working on a project that involves curation of protein data that >>>> includes GO terms, and it would be very helpful if I had some >>>> numerical quantification of the specificity of each term. It is >>>> possible to manually examine each term to determine this specificity, >>>> but because there is a large amount of data, I would like to automate >>>> the process. I understand that there is no reliable way to do this >>>> simply using the level in the DAG hierarchy, but I am wondering if any >>>> of you might have a work-around. >> >> -- >> This message is from the GOFriends moderated mailing list. A list of >> public >> announcements and discussion of the Gene Ontology (GO) project. >> Problems with the list? E-mail: >> owner-gofriends at geneontology.org >> Subscribing send "subscribe" to >> gofriends-request at geneontology.org >> Unsubscribing send "unsubscribe" to >> gofriends-request at geneontology.org >> Web: http://www.geneontology.org/ > > > -- > This message is from the GOFriends moderated mailing list. A list of > public > announcements and discussion of the Gene Ontology (GO) project. > Problems with the list? E-mail: > owner-gofriends at geneontology.org > Subscribing send "subscribe" to > gofriends-request at geneontology.org > Unsubscribing send "unsubscribe" to > gofriends-request at geneontology.org > Web: http://www.geneontology.org/ > -- Sorin Draghici, Ph.D. 
Director of the Bioinformatics Core, Karmanos Cancer Institute
Associate Professor            Tel: (313) 577-5484
Dept. of Computer Science      Fax: (313) 577-6868
Wayne State University
5143 Cass Ave, Room 431 State Hall, Detroit, MI, 48202
WWW: http://vortex.cs.wayne.edu/Sorin/ (personal)
WWW: http://vortex.cs.wayne.edu/Projects.html (lab)

Check out my recent book: Data Analysis Tools for Microarrays:
http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C3154&parent_id=&pc=

From andreas.schlicker at mpi-sb.mpg.de Thu Apr 19 01:17:11 2007
From: andreas.schlicker at mpi-sb.mpg.de (Andreas Schlicker)
Date: Thu, 19 Apr 2007 10:17:11 +0200
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <46266637.5040904@wayne.edu>
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> <46266637.5040904@wayne.edu>
Message-ID: <46272587.8030601@mpi-sb.mpg.de>

Hi,

Sorin Draghici wrote:
>
> ...
> The question at hand here is how to quantify the specificity of a given
> term. This is independent of any experiment and any set of
> differentially regulated genes and has to do with the structure of the
> GO and the position of the given term in the DAG. For instance,
> "regulation of apoptosis through extracellular signals" is more specific
> than "regulation of apoptosis" or "apoptosis". The problem is how to
> numerically quantify this specificity. To my knowledge, there is no
> tools of any kind that would even remotely provide any quantitative
> assessment of this specificity. Any answers or thoughts on this issue
> would be very valuable.

We have developed a measure of similarity between GO terms that is based on the information content of a GO term [1]. The information content is based on the frequency of a term in UniProt, and can be used to quantify the specificity of a GO term. A high value corresponds to high specificity: such terms are annotated less frequently and usually lie deeper in the graph. We have such a table for the August 2006 release of GO in our database.

[1] Schlicker A, Domingues FS, Rahnenfuehrer J, Lengauer T. A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics 2006, 7:302 (http://www.biomedcentral.com/1471-2105/7/302)

Kind regards,
Andreas

--
Andreas Schlicker, M.Sc.
Max-Planck-Institute for Informatics
Department 3: Computational Biology and Applied Algorithmics
Stuhlsatzenhausweg 85
66123 Saarbruecken
Germany

Phone: +49 681 9325 321
Fax: +49 681 9325 399
Homepage: http://www.mpi-inf.mpg.de/~schlandi

From jane at ebi.ac.uk Thu Apr 19 02:14:23 2007
From: jane at ebi.ac.uk (Jane Lomax)
Date: Thu, 19 Apr 2007 10:14:23 +0100 (BST)
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <46272587.8030601@mpi-sb.mpg.de>
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <200704180436.l3I4aNVh013718@atlas.systemsbiology.net> <46266637.5040904@wayne.edu> <46272587.8030601@mpi-sb.mpg.de>
Message-ID:

Just fyi - this is the reply I originally sent to Tobias:

Hi Tobias -

I'm afraid there isn't currently any reliable way to quantify the specificity of a GO term. Distance from the root is not proportional to the specificity of the term, as different parts of the graph have different depths (this is due to differences in the levels of research for different biological fields, and how much different parts of the graph itself have been worked on). Distance from the nearest leaf node might be more meaningful than distance from root, but again it's not precise because some areas of the graph have many very specific terms.

Sorry I couldn't be more help - it might be worth you sending a mail to gofriends at geneontology.org to see how others have handled this problem - there are lots of tool developers subscribed to that list.

We are currently undertaking a project to try and standardize the depth vs. specificity a bit more in collaboration with some researchers at MIT, but it's a long-term project.

many thanks,

Jane Lomax

Dr Jane Lomax
GO Editorial Office
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridgeshire, UK
CB10 1SD

p: +44 1223 492516
f: +44 1223 494468

From sherlock at genome.Stanford.EDU Thu Apr 19 07:59:09 2007
From: sherlock at genome.Stanford.EDU (Gavin Sherlock)
Date: Thu, 19 Apr 2007 07:59:09 -0700
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <4625D2F2.1020300@idi.ntnu.no>
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <4625D2F2.1020300@idi.ntnu.no>
Message-ID:

One possibility that occurred to me this morning, based on a user request wanting to eliminate certain GO terms from being tested in GO::TermFinder, is that you could calculate the best possible p-value that could be generated for a GO node, based on the hypergeometric distribution. E.g. if a node has 20 annotations, then the best possible p-value you could generate would be based on observing all 20 of those annotations in your list of interesting genes. Likewise for something that has 5,000 annotations. I suspect, though haven't tested, that for the non-specific GO terms, the best possible p-value that could be generated would be non-significant - i.e. no matter how many observations of it you make in your list of interesting genes, that node might never achieve significance. Such results could of course be easily ranked.

Cheers,
Gavin

On Apr 18, 2007, at 1:12 AM, Waclaw Kusnierczyk wrote:

> This is exactly the problem. Coding is the least painful part.
> Both the length of the path from a node to the root and the count
> of annotations say more about how well explored and interesting the
> particular part of the GO is than about any sort of specificity of
> the GO term.
>
> One possibility that I've been advocating for a while is to measure
> specificity in terms of a GO term's correspondence with the taxon/taxa
> in the taxonomy of species, for which organisms the term may
> be sensibly used to talk about. This would not, of course, solve
> those of your problems that are not related to the classification
> of organisms, but might help solve others.
>
> vQ
>
> Sorin Draghici wrote:
>> Tobias,
>> How exactly would you define the specificity of a GO term? If we
>> had an exact definition, we could possibly write a piece of
>> software that could do it.
>> Sorin
>
> --
> Wacek Kusnierczyk
>
> ------------------------------------------------------
> Department of Information and Computer Science (IDI)
> Norwegian University of Science and Technology (NTNU)
> Sem Saelandsv. 7-9
> 7027 Trondheim
> Norway
>
> tel. 0047 73591875
> fax 0047 73594466
> ------------------------------------------------------

From phillip.lord at newcastle.ac.uk Thu Apr 19 04:08:36 2007
From: phillip.lord at newcastle.ac.uk (Phillip Lord)
Date: Thu, 19 Apr 2007 12:08:36 +0100
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> (Tobias Sayre's message of "Tue\, 17 Apr 2007 12\:35\:08 -0700")
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com>
Message-ID:

>>>>> "TS" == Tobias Sayre writes:

  TS> Dear GO Friends,
  TS> I am working on a project that involves curation of protein data
  TS> that includes GO terms, and it would be very helpful if I had
  TS> some numerical quantification of the specificity of each term.
  TS> It is possible to manually examine each term to determine this
  TS> specificity, but because there is a large amount of data, I
  TS> would like to automate the process. I understand that there is
  TS> no reliable way to do this simply using the level in the DAG
  TS> hierarchy, but I am wondering if any of you might have a
  TS> work-around.

There are basically two ways: information content or GO structure.

Information content works fine, but depends on having a corpus. This exists for GO, of course, but it's hard to determine what corpus you should use. So, if you are comparing GO terms for proteins between human and yeast, should you use SGD? Or Swissprot?

Structure-based techniques are myriad and probably more common. They tend to be less computationally intensive, because they only need a structure, while information content needs a structure and a corpus. Some are based on "level", but this is not great. GO is a DAG and not a tree, and so doesn't really have levels. To my mind, level-based approaches are treating the DAG nature of GO as an embarrassment rather than a feature. Not all structure-based techniques are level-based though.

If I may be so bold, and express my untried, unproven and generally dubious opinion here, my own feeling is that, in practice, it doesn't actually matter that much. Most measures of specificity give a result which looks sort of correct. The parts of GO with highest information content *tend* to be the "deepest" in terms of maximum level and vice versa.

In all the papers I have read on specificity (or, more generally, similarity, on which there have been more papers, but which is highly related), authors have tested against some gold standard, or applied the measure to a specific application. In my papers, I used sequence similarity, for instance. And as far as I can tell, there is no real clear winner; different authors showed that different measures were better for different things.

Wow, talking about sitting on the fence!

Phil

From sherlock at genome.Stanford.EDU Thu Apr 19 08:30:18 2007
From: sherlock at genome.Stanford.EDU (Gavin Sherlock)
Date: Thu, 19 Apr 2007 08:30:18 -0700
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <462788B7.6010900@wayne.edu>
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com> <462585D5.3070607@wayne.edu> <4625D2F2.1020300@idi.ntnu.no> <462788B7.6010900@wayne.edu>
Message-ID: <0ECBF720-0E05-4195-B5FE-B0145FC9A138@genome.stanford.edu>

Hi Sorin,

Indeed, it would be dependent on the annotations, which are of course a reflection of both our knowledge and ignorance. In terms of removing nodes from consideration when finding enriched GO nodes, it should work fine, because that itself is dependent on annotations. In terms of describing a node as specific or not, it will only do that in terms of annotations - it is thus not perfect, but I'm not sure whether specificity is an intrinsic property of a node itself, or a property only in the context of the annotations to it and all other nodes. I think I favor the latter - it simply means that our notion of specificity would evolve over time as we chip away at our ignorance.

Cheers,
Gavin

On Apr 19, 2007, at 8:20 AM, Sorin Draghici wrote:

> Gavin,
>
> This is a very interesting idea but it seems to me that the results
> would still be dependent on the current state of annotations, not
> on the intrinsic concepts captured by the terms. I would like
> something that would tell me that "boiling water in a brown kettle"
> is more specific than "boiling water in a kettle" which in turn is
> more specific than "boiling water", even if there are no genes
> known to boil water at this time. What do you think?
>
> Sorin

From arubio at ceit.es Fri Apr 20 00:34:21 2007
From: arubio at ceit.es (Rubio, Angel)
Date: Fri, 20 Apr 2007 09:34:21 +0200
Subject: Quantifying Specificity of GO Terms
In-Reply-To:
Message-ID: <7A107805A3E03C4383EE831B1C11244D17975D421F@MAILCLUSTER.tecnun.es>

Some years ago, my group compared the correlation between gene expression and different versions of semantic similarity. We found that the Resnik similarity measure (already used by Dr. Lord to compare sequence and function) outperformed other corpus-based measures for all three categories (BP, MF and CC). Indeed, in our case these other measures (Lin and Jiang) did not perform well at all. The Resnik similarity measure is easy to evaluate:

Resnik(GeneProduct1, GeneProduct2) = -log(ni/nt)

where
ni: number of gene products in the corpus annotated for the common ancestor of the annotations of a pair of gene products (it seems a sort of tongue twister!).
nt: total number of gene products.

I hope that it helps.
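
The formula above can be tried out on a toy example. What follows is only a minimal sketch under stated assumptions: the small is-a graph, the gene products gp1..gp4 and all function names are made up for illustration and are not real GO data or any tool's API. It also assumes the usual reading of Resnik's measure, in which "the common ancestor" is the most informative one (the shared ancestor with the highest -log(ni/nt)), and it takes the best-scoring term pair when comparing two gene products, which is one possible convention among several.

```python
import math

# Hypothetical toy is-a DAG (child -> parents); not real GO identifiers.
PARENTS = {
    "root": set(),
    "apoptosis": {"root"},
    "regulation of apoptosis": {"apoptosis"},
    "metabolism": {"root"},
}

# Hypothetical corpus: gene product -> terms it is directly annotated to.
ANNOTATIONS = {
    "gp1": {"regulation of apoptosis"},
    "gp2": {"apoptosis"},
    "gp3": {"metabolism"},
    "gp4": {"metabolism"},
}


def ancestors(term):
    """Return the term itself plus every ancestor reachable via is-a."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """log(nt/ni) = -log(ni/nt); ni counts gene products annotated to the
    term or to any of its descendants."""
    nt = len(ANNOTATIONS)
    ni = sum(
        1
        for terms in ANNOTATIONS.values()
        if any(term in ancestors(t) for t in terms)
    )
    if ni == 0:
        return math.inf  # term never used in this corpus
    return math.log(nt / ni)


def resnik_terms(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


def resnik_gene_products(gp_a, gp_b):
    """Assumed convention: best-scoring pair of annotated terms."""
    return max(
        resnik_terms(a, b)
        for a in ANNOTATIONS[gp_a]
        for b in ANNOTATIONS[gp_b]
    )


if __name__ == "__main__":
    print(resnik_gene_products("gp1", "gp2"))  # share "apoptosis"
    print(resnik_gene_products("gp1", "gp3"))  # share only the root
```

The same information_content function, applied to a single term, is also a direct (annotation-dependent) specificity score of the kind asked for at the start of this thread.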

From phillip.lord at newcastle.ac.uk Fri Apr 20 06:31:59 2007
From: phillip.lord at newcastle.ac.uk (Phillip Lord)
Date: Fri, 20 Apr 2007 14:31:59 +0100
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <7A107805A3E03C4383EE831B1C11244D17975D421F@MAILCLUSTER.tecnun.es> (Angel Rubio's message of "Fri\, 20 Apr 2007 09\:34\:21 +0200")
References: <7A107805A3E03C4383EE831B1C11244D17975D421F@MAILCLUSTER.tecnun.es>
Message-ID:

Probably because Lin and Jiang are both normalised measures, while Resnik is not. My over-riding suspicion has been that this distinction is more important than anything else.
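
To make the normalised/unnormalised distinction concrete, here is a small sketch using the textbook forms of the three measures, written in terms of information content (IC): Resnik is the IC of the most informative common ancestor (MICA), Lin divides twice that by the sum of the two terms' ICs, and Jiang-Conrath is a distance, IC(a) + IC(b) - 2*IC(MICA). The IC values below are invented for illustration, not taken from any real corpus, and the function names are hypothetical.

```python
# Hypothetical information-content values; not derived from real GO data.
IC = {
    "apoptosis": 0.7,
    "regulation of apoptosis": 1.4,
}


def resnik_sim(ic_mica):
    """Resnik: IC of the most informative common ancestor, unbounded."""
    return ic_mica


def lin_sim(ic_a, ic_b, ic_mica):
    """Lin: 2 * IC(MICA) / (IC(a) + IC(b)), which lies in [0, 1]."""
    denom = ic_a + ic_b
    return 2.0 * ic_mica / denom if denom > 0 else 0.0


def jiang_conrath_dist(ic_a, ic_b, ic_mica):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(MICA)."""
    return ic_a + ic_b - 2.0 * ic_mica


def self_scores(term):
    """Compare a term with itself; its MICA is then the term itself."""
    ic = IC[term]
    return resnik_sim(ic), lin_sim(ic, ic, ic), jiang_conrath_dist(ic, ic, ic)


if __name__ == "__main__":
    for term in IC:
        print(term, self_scores(term))
```

Comparing a term with itself shows the difference: the Resnik score still depends on how specific the term is, whereas Lin and Jiang-Conrath give the same answer for every term, so only Resnik carries the term's specificity through into the final score.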

From tobias at sfsu.edu Thu Apr 19 19:35:14 2007
From: tobias at sfsu.edu (Tobias Sayre)
Date: Thu, 19 Apr 2007 19:35:14 -0700
Subject: Quantifying Specificity of GO Terms
In-Reply-To:
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com>
Message-ID: <7e833d7d0704191935lb95daa1o881b52f4139db1e@mail.gmail.com>

GO Friends,

Thank you so much for all of your answers to my question. I will probably use a technique that measures how often a given GO term is used relative to all GO terms.
Thanks again,
Tobias

On 4/19/07, Phillip Lord wrote:
>
> >>>>> "TS" == Tobias Sayre writes:
>
> TS> Dear GO Friends,
>
> TS> I am working on a project that involves curation of protein data
> TS> that includes GO terms, and it would be very helpful if I had
> TS> some numerical quantification of the specificity of each term.
> TS> It is possible to manually examine each term to determine this
> TS> specificity, but because there is a large amount of data, I
> TS> would like to automate the process. I understand that there is
> TS> no reliable way to do this simply using the level in the DAG
> TS> hierarchy, but I am wondering if any of you might have a
> TS> work-around.
>
>
> There are basically two ways. Information content or GO structure.
>
> Information content works fine, but depends on having a corpus. This
> exists for GO, of course, but it's hard to determine what corpus you
> should use. So, if you are comparing GO terms for proteins between
> human and yeast, should you use SGD? Or Swissprot?
>
> Structure based techniques are myriad and probably more common. They
> tend to be less computationally intensive, because they only need a
> structure, while information content needs a structure and a
> corpus. Some are based on "level", but this is not great. GO is a DAG
> and not a tree, and so doesn't really have levels. To my mind level
> based approaches are treating the DAG nature of GO as an embarrassment
> rather than a feature. Not all structure based techniques are level
> based though.
>
>
> If I may be so bold, and express my untried, unproven and generally
> dubious opinion here, my own feeling is that, in practice, it doesn't
> actually matter that much. Most measures of specificity give a result
> which looks sort of correct. The parts of GO with highest information
> content *tend* to be the "deepest" in terms of maximum level and vice
> versa.
>
> In all the papers I have read on specificity (or, more generally,
> similarity of which there have been more papers, but which is highly
> related), authors have tested against some gold standard, or applied
> to a specific application. In my papers, I used sequence similarity,
> for instance. And as far as I can tell, there is no real clear
> winner; different authors showed that different measures were better
> for different things.
>
> Wow, talking about sitting on the fence!
>
> Phil
>

From sletovsky at aol.com Fri Apr 20 10:04:45 2007
From: sletovsky at aol.com (Stan Letovsky)
Date: Fri, 20 Apr 2007 13:04:45 -0400
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <46266637.5040904@wayne.edu>
Message-ID:

This is a rather trivial answer, and may be irrelevant, depending on what was meant by "specificity" in the original request. But when I have needed to distinguish between very general and very specific terms in the past, I have simply relied on the term frequency (or "annotation probability") of each term in the genome of interest. Granted, this is annotation/knowledge-dependent and species-dependent, but it was a pretty useful method of distinguishing broad terms like "Metabolism" from specific ones like "s-methyltransferase activity". You first need to compute the transitive closure over the ISA hierarchy, so that each term also counts the annotations made to its descendants; term frequencies then decrease monotonically down any chain from the root. It's a more objective metric than depth.

Many people doing function prediction with GO terms will report precision/recall etc. without specifying whether they are predicting general terms like "metabolism" or more specific ones. The former predictions are not very useful. The rarer the term, the more interesting the prediction.
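The term-frequency idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the `parents` map, the gene names and the term names are all hypothetical toy data, and the -log transform is just one common way to turn an annotation probability into a specificity score.

```python
import math
from collections import defaultdict


def ancestors(term, parents):
    """Return the term plus all of its ancestors, following parent links in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def term_specificity(parents, annotations):
    """Map each term to log(total genes / genes annotated to it or any descendant)."""
    genes_per_term = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            # Transitive closure: a gene annotated to a term also counts
            # towards every ancestor of that term.
            for anc in ancestors(term, parents):
                genes_per_term[anc].add(gene)
    n_genes = len(annotations)
    return {t: math.log(n_genes / len(g)) for t, g in genes_per_term.items()}


# Hypothetical toy hierarchy (child -> list of parents) and annotations.
parents = {
    "metabolism": ["biological_process"],
    "methylation": ["metabolism"],
    "transport": ["biological_process"],
}
annotations = {
    "geneA": ["methylation"],
    "geneB": ["metabolism"],
    "geneC": ["transport"],
    "geneD": ["methylation", "transport"],
}

specificity = term_specificity(parents, annotations)
for term, score in sorted(specificity.items(), key=lambda item: item[1]):
    print(f"{term}: {score:.3f}")
```

Because every gene counted for a term is also counted for its ancestors, the score can only stay equal or grow along any chain leading away from the root, which is the monotonic behaviour described above.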
Cheers,
-Stan

-----Original Message-----
From: owner-gofriends at genome.stanford.edu [mailto:owner-gofriends at genome.stanford.edu] On Behalf Of Sorin Draghici
Sent: Wednesday, April 18, 2007 2:41 PM
To: Stan Dong
Cc: Paul Shannon; tobias at sfsu.edu; gofriends at genome.stanford.edu
Subject: Re: Quantifying Specificity of GO Terms

Hi,

There seems to be a fair amount of confusion here. There are about 20 tools that are able to calculate a statistical significance value for a GO term given a set of differentially expressed genes. This is a very well known problem that was defined about 4-5 years ago, see for instance:
http://vortex.cs.wayne.edu/papers/genomics.pdf
and
http://vortex.cs.wayne.edu/papers/Onto-Express_V2_proof.pdf

GO-TermFinder, GOStats and the others mentioned in the recent emails are all tools from the same category, tools that address the problem defined above. If anybody is interested in this problem, 17 of these tools have been recently reviewed in: http://vortex.cs.wayne.edu/papers/Ontological_analysis.pdf. The GO tools page includes pointers to many if not all such tools.

The question at hand here is how to quantify the specificity of a given term. This is independent of any experiment and any set of differentially regulated genes and has to do with the structure of the GO and the position of the given term in the DAG. For instance, "regulation of apoptosis through extracellular signals" is more specific than "regulation of apoptosis" or "apoptosis". The problem is how to numerically quantify this specificity. To my knowledge, there are no tools of any kind that would even remotely provide any quantitative assessment of this specificity. Any answers or thoughts on this issue would be very valuable.

Regards,
Sorin

Stan Dong wrote:
> Another tool is the GO-TermFinder by Gavin Sherlock. I believe there
> is interest for AmiGO to incorporate this tool.
>
> http://search.cpan.org/dist/GO-TermFinder/
>
> SGD has been using it with great satisfaction from our users. You may
> check the SGD page to get some sense of a use case.
>
> http://db.yeastgenome.org/cgi-bin/GO/goTermFinder
>
> -Stan
>
> On Apr 17, 2007, at 9:36 PM, Paul Shannon wrote:
>
>> The Bioconductor project has, I believe, a fine solution to this
>> problem -- though please forgive me if I have misconstrued things.
>> The relevant packages (see below) use the Hypergeometric distribution
>> to calculate a p-value for the enrichment of any GO node for the
>> genes in question. I typically map proteins to GeneID's as the first
>> step in my analysis.
>>
>> If this sounds like it addresses your problem, you may wish to take a
>> look at
>>
>> http://bioconductor.org/packages/1.9/bioc/html/GOstats.html and
>> http://bioconductor.org/packages/1.9/bioc/html/Category.html
>>
>> Each of these web pages contains a 'vignette' in a pdf file which
>> makes for a good introduction to the methods.
>>
>> Though originally conceived in the context of microarrays, I use these
>> packages quite fruitfully with proteomics data.
>>
>> - Paul
>>
>>
>>>> I am working on a project that involves curation of protein data that
>>>> includes GO terms, and it would be very helpful if I had some
>>>> numerical quantification of the specificity of each term. It is
>>>> possible to manually examine each term to determine this specificity,
>>>> but because there is a large amount of data, I would like to automate
>>>> the process. I understand that there is no reliable way to do this
>>>> simply using the level in the DAG hierarchy, but I am wondering if any
>>>> of you might have a work-around.
--
Sorin Draghici, Ph.D.
Director of the Bioinformatics Core, Karmanos Cancer Institute
Associate Professor
Dept. of Computer Science
Wayne State University
5143 Cass Ave, Room 431 State Hall, Detroit, MI, 48202
Tel: (313) 577-5484
Fax: (313) 577-6868
WWW: http://vortex.cs.wayne.edu/Sorin/ (personal)
WWW: http://vortex.cs.wayne.edu/Projects.html (lab)

Check out my recent book: Data Analysis Tools for Microarrays:
http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C3154&parent_id=&pc=
From n.mitsakakis at utoronto.ca Fri Apr 20 11:06:06 2007
From: n.mitsakakis at utoronto.ca (Nicholas Mitsakakis)
Date: Fri, 20 Apr 2007 14:06:06 -0400
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <7A107805A3E03C4383EE831B1C11244D17975D421F@MAILCLUSTER.tecnun.es>
References: <7A107805A3E03C4383EE831B1C11244D17975D421F@MAILCLUSTER.tecnun.es>
Message-ID: <4629010E.2080402@utoronto.ca>

Angel,

I think I came across your work on comparing correlation values from gene expression data and different semantic similarities. I wonder if you and your group tried using partial correlations instead. Let me know if you know anything about this.

Thanks,
Nicholas

Rubio, Angel wrote:

>Some years ago, my group compared the correlation between gene expression and different versions of semantic similarity. We found that the Resnik similarity measure (already used by Dr. Lord comparing sequence and functions) outperformed other measures also based on a corpus for the three categories (BP, MF and CC).
>Indeed, in our case these other measures (Lin and Jiang) did not perform well at all.
>Resnik similarity measure is easy to evaluate:
>
>Resnik(GeneProduct1, GeneProduct2) = -log(ni/nt)
>
>Where
>ni: number of gene products in the corpus annotated for the common ancestor of the annotations of a pair of gene products (it seems a sort of tongue twister!).
>nt: total number of gene products.
>
>I expect that it helps.
>
>-----Original Message-----
>From: owner-gofriends at genome.stanford.edu [mailto:owner-gofriends at genome.stanford.edu] On Behalf Of Phillip Lord
>Sent: Thursday, April 19, 2007 1:09 PM
>To: tobias at sfsu.edu
>Cc: gofriends at genome.stanford.edu
>Subject: Re: Quantifying Specificity of GO Terms
>
>
>>>>>> "TS" == Tobias Sayre writes:
>
> TS> Dear GO Friends,
>
> TS> I am working on a project that involves curation of protein data
> TS> that includes GO terms, and it would be very helpful if I had
> TS> some numerical quantification of the specificity of each term.
> TS> It is possible to manually examine each term to determine this
> TS> specificity, but because there is a large amount of data, I
> TS> would like to automate the process. I understand that there is
> TS> no reliable way to do this simply using the level in the DAG
> TS> hierarchy, but I am wondering if any of you might have a
> TS> work-around.
>
>
>There are basically two ways. Information content or GO structure.
>
>Information content works fine, but depends on having a corpus. This
>exists for GO, of course, but it's hard to determine what corpus you
>should use. So, if you are comparing GO terms for proteins between
>human and yeast, should you use SGD? Or Swissprot?
>
>Structure based techniques are myriad and probably more common. They
>tend to be less computationally intensive, because they only need a
>structure, while information content needs a structure and a
>corpus. Some are based on "level", but this is not great. GO is a DAG
>and not a tree, and so doesn't really have levels. To my mind level
>based approaches are treating the DAG nature of GO as an embarrassment
>rather than a feature. Not all structure based techniques are level
>based though.
>
>
>If I may be so bold, and express my untried, unproven and generally
>dubious opinion here, my own feeling is that, in practice, it doesn't
>actually matter that much.
Most measures of specificity give a result
>which looks sort of correct. The parts of GO with highest information
>content *tend* to be the "deepest" in terms of maximum level and vice
>versa.
>
>In all the papers I have read on specificity (or, more generally,
>similarity of which there have been more papers, but which is highly
>related), authors have tested against some gold standard, or applied
>to a specific application. In my papers, I used sequence similarity,
>for instance. And as far as I can tell, there is no real clear
>winner; different authors showed that different measures were better
>for different things.
>
>Wow, talking about sitting on the fence!
>
>Phil
>

--
Nicholas Mitsakakis
PhD Candidate - Biostatistics
Department of Public Health Sciences
University of Toronto
From paul at bioinformatics.ubc.ca Fri Apr 20 10:52:50 2007
From: paul at bioinformatics.ubc.ca (Paul Pavlidis)
Date: Fri, 20 Apr 2007 10:52:50 -0700
Subject: Quantifying Specificity of GO Terms
Message-ID: <4628FDF2.1060609@bioinformatics.ubc.ca>

Just to throw our hat into the ring, we experimented with both the Resnik measure and a simple 'GO term overlap' measure and found that they are highly correlated. Measuring term overlap (the number of GO terms shared by two genes, including all parent terms) does not depend on knowing term-use frequencies and is very fast to compute.

The term overlap measure is used and briefly described here:
http://www.genome.org/cgi/content/full/14/6/1085

As for whether it is "more accurate", I don't know (I'm interested in hearing any opinions), but as I believe was pointed out, the depth of terms in the hierarchy is actually not a bad indication of their specificity.

Paul

Paul Pavlidis, PhD
Assistant Professor of Psychiatry
UBC Bioinformatics Centre (UBiC)
177 Michael Smith Laboratories
2185 East Mall
University of British Columbia
Vancouver BC V6T1Z4
voice: 604 827 4157
fax: 604 608 2964
paul at bioinformatics.ubc.ca
http://bioinformatics.ubc.ca/pavlidis/

Phillip Lord wrote:
>
> Probably because Lin and Jiang are both normalised measures, while
> Resnik is not. My over-riding suspicion has been that this distinction
> is more important than anything else.
>
>>>>>> "RA" == Rubio, Angel writes:
>
> RA> Some years ago, my group compared the correlation between gene
> RA> expression and different versions of semantic similarity. We
> RA> found that the Resnik similarity measure (already used by
> RA> Dr.
Lord comparing sequence and functions) outperformed other
> RA> measures also based on a corpus for the three categories (BP, MF
> RA> and CC). Indeed, in our case these other measures (Lin and
> RA> Jiang) did not perform well at all. Resnik similarity measure
> RA> is easy to evaluate:
>
> RA> Resnik(GeneProduct1, GeneProduct2) = -log(ni/nt)
>
> RA> Where ni: number of gene products in the corpus annotated for
> RA> the common ancestor of the annotations of a pair of gene
> RA> products (it seems a sort of tongue twister!). nt: total number
> RA> of gene products.
>
> RA> I expect that it helps.
>

From fcouto at di.fc.ul.pt Tue Apr 24 07:44:02 2007
From: fcouto at di.fc.ul.pt (Francisco Couto)
Date: Tue, 24 Apr 2007 15:44:02 +0100
Subject: Quantifying Specificity of GO Terms
In-Reply-To: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com>
References: <7e833d7d0704171235l798c81ddg213bfbad4bf666d5@mail.gmail.com>
Message-ID: <462E17B2.8080804@di.fc.ul.pt>

Hi Tobias,

you can try our web-tools:
http://xldb.fc.ul.pt/rebil/tools/ssm/
http://xldb.fc.ul.pt/biotools/proteinon/
that give you the information content of each GO term.
Cheers,
Francisco

Tobias Sayre wrote:
> Dear GO Friends,
>
> I am working on a project that involves curation of protein data that
> includes GO terms, and it would be very helpful if I had some
> numerical quantification of the specificity of each term. It is
> possible to manually examine each term to determine this specificity,
> but because there is a large amount of data, I would like to automate
> the process. I understand that there is no reliable way to do this
> simply using the level in the DAG hierarchy, but I am wondering if any
> of you might have a work-around.
>
> Thanks,
>
> Tobias Sayre

From Julien.Roux at unil.ch Tue Apr 24 09:33:54 2007
From: Julien.Roux at unil.ch (Julien Roux)
Date: Tue, 24 Apr 2007 18:33:54 +0200
Subject: GO development
Message-ID: <462E3172.3090102@unil.ch>

Hello everybody

I was looking at the expression in zebrafish of developmental genes in GO, annotated as (or under) multicellular organismal development, GO:0007275, biological process.
Surprisingly, I find that, compared to the rest of zebrafish genes, they are less expressed at the beginning of development, expressed at similar levels from segmentation stages to larval stages, and then less expressed in juvenile and adult.
I would expect these genes to be overexpressed during the development process...
What do you think of that?
What is classified as "development" in GO?
Thanks for your help and ideas
Julien

--
Julien Roux, PhD student
http://www.unil.ch/dee/page22707.html
Department of Ecology and Evolution
Biophore, University of Lausanne, 1015 Lausanne, Switzerland
tel: +41 21 692 4221
fax: +41 21 692 4165

From dph at informatics.jax.org Tue Apr 24 10:43:20 2007
From: dph at informatics.jax.org (David Hill)
Date: Tue, 24 Apr 2007 13:43:20 -0400
Subject: GO development
In-Reply-To: <462E3172.3090102@unil.ch>
References: <462E3172.3090102@unil.ch>
Message-ID: <462E41B8.2000000@informatics.jax.org>

Julien,

This is a good question. In the past couple of weeks I was making some gene sets for someone and have noticed something interesting with mouse genes as well. If I search for mouse genes that are associated with GO:0032502 'developmental process' there are 2846 genes. If I search for genes associated with GO:0007275 'multicellular organism development' there are 1963 genes. Clearly, since the mouse is a multicellular organism, all mouse genes that function in a developmental process also function in multicellular organism development. When I examine this, I have noticed that the discrepancy comes from two sources:

1) Annotations that are made to generic terms like 'cell death' and 'cell differentiation'. In this case, we cannot link this term to the multicellular term because cell death or cell differentiation do not always happen in a multicellular organism. For the cases of cell differentiation, it is now our policy to use the cell type ontology to create GO terms that describe the differentiation of a cell type.
Cell death is a bit more tricky because it is often tested in 'generic' cells in culture. We could create terms to describe cell death in every type of cell but I don't think that would be a practical approach.

2) An incompleteness of part_of relationships in the graph. In particular, the differentiation terms for some cell types that are only found in multicellular organisms (for example fat cell differentiation; GO:0045444) are not part_of multicellular organism development. One of my plans is to clean these up, but when I do, I'd like to work in conjunction with the anatomy and cell type ontology groups since the precise placement in the graph would require a knowledge of which cell types are always found in which anatomical structures. I have cleaned some of these up when I am focusing on a certain area of the graph, but it is not as easy to place them in exactly the right place as it first seems.

I am planning to create a term called 'cell differentiation involved in multicellular organism development', make this term a part_of child of 'multicellular organism development' and an is_a child of 'cell differentiation', then place all the correct types of cell differentiation as is_a children of this term. It would solve part of the problem of not having gene products grouped correctly under 'multicellular organism development' but the part_of problem will still exist.

I also think that if a gene product from a multicellular organism is annotated to one of the more generic development terms, it should be co-annotated to 'multicellular organism development'. We would need to discuss what evidence code this would get. I suspect it would be IC.

For now, I would suggest that if you want all annotations for genes involved in the development of zebrafish, you should search for all genes that are annotated to 'developmental process' (GO:0032502) or its children.
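As a rough illustration of the suggested search, the sketch below collects a term together with everything beneath it and then gathers the genes annotated anywhere in that set. The `children` map, the gene names and the edges between terms are a simplified, hypothetical toy graph, not the actual GO structure or any real annotation file.

```python
def descendants(term, children):
    """Return the term plus every term reachable through child links (is_a or part_of)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def genes_under(term, children, annotations):
    """Return the genes annotated to the term or to any of its descendants."""
    terms = descendants(term, children)
    return {gene for gene, annotated in annotations.items() if terms & set(annotated)}


# Hypothetical toy graph (parent -> list of children). In this toy version,
# 'cell differentiation' is NOT placed under 'multicellular organism development'.
children = {
    "developmental process": ["multicellular organism development", "cell differentiation"],
    "cell differentiation": ["fat cell differentiation"],
}
annotations = {
    "gene1": ["multicellular organism development"],
    "gene2": ["cell differentiation"],
    "gene3": ["fat cell differentiation"],
    "gene4": ["transport"],
}

print(sorted(genes_under("multicellular organism development", children, annotations)))
print(sorted(genes_under("developmental process", children, annotations)))
```

In the toy data, searching under 'multicellular organism development' finds only gene1, while searching under 'developmental process' also picks up gene2 and gene3, which mirrors the kind of discrepancy described above.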
David

Julien Roux wrote:
> Hello everybody
>
> I was looking at the expression in zebrafish of developmental genes in
> GO, annotated as (or under) multicellular organismal development,
> GO:0007275, biological process.
> Surprisingly I find that compared to the rest of zebrafish genes, they
> are less expressed at the beginning of development, expressed at
> similar levels from segmentation stages to larval stages and then less
> expressed in juvenile and adult.
> I would expect these genes to be overexpressed during the development
> process...
> What do you think of that?
> What is classified as "development" in GO?
>
> Thanks for your help and ideas
> Julien
>

From pj37 at cornell.edu Tue Apr 24 10:59:35 2007
From: pj37 at cornell.edu (Pankaj Jaiswal)
Date: Tue, 24 Apr 2007 13:59:35 -0400
Subject: GO development
In-Reply-To: <462E41B8.2000000@informatics.jax.org>
References: <462E3172.3090102@unil.ch> <462E41B8.2000000@informatics.jax.org>
Message-ID: <462E4587.2000609@cornell.edu>

David Hill wrote:
>
> I am planning to create a term called 'cell differentia