From cjm at berkeleybop.org Wed Aug 5 11:35:48 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Wed, 5 Aug 2009 11:35:48 -0700
Subject: [Ontology-editors] cellular component vs cellular location
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
Message-ID: <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>

I'd like to briefly revisit this issue.

We have a request for a term "Fully spanning the plasma membrane":
https://sourceforge.net/tracker/index.php?func=detail&aid=2831884&group_id=36855&atid=440764

This is a location rather than a cellular component. The determining
factor would be whether instances of the type would have mass. A
nucleus instance has mass, but it's not clear what a "Fully spanning
the plasma membrane" instance is.

This is also the case for the intrinsic/extrinsic terms. See the email
from Alan below, which I guess never made it onto the annotation list
(I guess we should probably set up a public GO ontology discussion
list distinct from GO friends to stop people spamming the wrong lists?)

Of course these terms are useful and I'm not suggesting getting rid of
current annotations. But it is important to be clear what the terms in
CC are. This is especially important as other groups start using GO
terms in cross-products.

The options as I see it are:

[1] Interpret all GO CC terms as locations.

Thus GO:0005634 ! nucleus would be interpreted as "located in the
nucleus". Note that this is not the current interpretation as far as I
see it; an instance of GO:0005634 is an instance of a nucleus. When we
have an association between a gene product and a nucleus then we
interpret this as the gene product being localized to the nucleus.

[2] Introduce a new high level term "cell location".

The is_a hierarchy would be something (roughly) like

  cell location
  membrane location
  extrinsic
  spanning
  intrinsic
  fully spanning
  fully spanning plasma membrane

(it would be more complex with dual parentage as we have the cross
product between membrane type and spatial qualifier)

There would be additional contained_in/part_of links following the
current structure, so query results, enrichment etc would remain
roughly the same.

[3] Use spatial qualifiers in annotation

Here we would actually obsolete the locational terms, and replace them
with annotation qualifiers:

 * extrinsic, intrinsic: membranes only
 * overlaps
 * fully contained by
 * fully spanning

[4] Keep things as they actually are and not worry about giving a
coherent explanation as to what a cell component is.

I am against [1] for reasons I can expand on. I am also against [4]
but partially resigned to it. I prefer [3] to [2].

This is also related to the host terms, but I think this is best dealt
with separately.

Begin forwarded message:

> From: Barry Smith
> Date: August 1, 2009 6:58:05 AM PDT
> To: Alan Ruttenberg, Suzanna Lewis, bfo-discuss at googlegroups.com
> Cc: annotation at genome.stanford.edu, huntley at ebi.ac.uk
> Subject: [bfo-discuss] Re: Example of one of the problems in
> cellular component
> Reply-To: bfo-discuss at googlegroups.com
>
>>> ---------- Forwarded message ----------
>>> From: Rachael Huntley
>>> Date: Fri, Jul 31, 2009 at 5:50 AM
>>> Subject: [Annotation] GPI-anchored proteins
>>> To: annotation at genome.stanford.edu
>>>
>>> Hi all,
>>>
>>> I'm after some advice. I'm a little confused about these two terms,
>>> with respect to GPI-anchored proteins:
>>>
>>> GO:0031224 intrinsic to membrane - Located in a membrane such that
>>> some covalently attached portion of the gene product, for example
>>> part of a peptide sequence or some other covalently attached moiety
>>> such as a GPI anchor, spans or is embedded in one or both leaflets
>>> of the membrane. Note that proteins intrinsic to membranes cannot be
>>> removed without disrupting the membrane, e.g. by detergent.
>>>
>>> GO:0019898 extrinsic to membrane - Loosely bound to one surface of a
>>> membrane, but not integrated into the hydrophobic region. Note that
>>> proteins extrinsic to membranes can be removed by treatments that do
>>> not disrupt the membrane, such as salt solutions.
>>> This term can be used instead of these obsolete terms: GO:0015025
>>> GPI-anchored membrane-bound receptor (consider GO:0019898)
>>>
>>> Both mention GPI anchor, the first (intrinsic to membrane) in the
>>> definition and the second as a suggestion to use extrinsic to
>>> membrane instead of the obsolete GO:0015025 GPI-anchored
>>> membrane-bound receptor.
>>>
>>> I don't know much about GPI-anchored proteins, but from what I can
>>> gather they can be extracted by detergent-solubilizing a membrane
>>> (PMID:19374451) which would suggest use of the term GO:0031224
>>> intrinsic to membrane. However, the GPI-anchor can be disrupted by
>>> phospholipase C, thus releasing the associated protein, which would
>>> suggest use of the term GO:0019898 extrinsic to membrane.
>>>
>>> Additionally, GO:0031224 intrinsic to membrane has the child
>>> GO:0031225 anchored to membrane (Def: Tethered to a membrane by a
>>> covalently attached anchor, such as a lipid moiety, that is embedded
>>> in the membrane. When used to describe a protein, indicates that
>>> none of the peptide sequence is embedded in the membrane.) which
>>> would be a term I would use for GPI-anchored proteins.
>>>
>>> Can anyone suggest whether GPI-anchored proteins should be annotated
>>> to extrinsic or intrinsic to membrane? Either way, it looks as
>>> though the ontology could be refined in this area.
>>>
>>> Thanks for your help.
>>>
>>> Rachael.
>>>
>>> --
>>> GOA and IntAct Curator
>>> European Bioinformatics Institute
>>> Wellcome Trust Genome Campus
>>> Hinxton
>>> Cambridge, CB10 1SD
>>> UK
>>>
>>> Tel: 01223 492515
>>> Fax: 01223 494468
>
>> At 09:27 AM 7/31/2009, Alan Ruttenberg wrote:
>> Nice discussion. I'll note, however, that an adjective can't be a
>> place. Taj Mahal, versus Taj Mahalish. Or "above the Taj Mahal" (a
>> location) versus "floating above the Taj Mahal" (a relation to a
>> location).
>>
>> Cellular components are either places that things can be located in,
>> or substances that are part_of cells.
>> And GO:0031225: anchored to membrane
>>
>> Is neither of these - it is a state of affairs or disposition or
>> something.
>>
>> -Alan
>
> I agree with Alan that there are cellular component terms which need
> cleaning up. However, I note that, according to BFO, objects can have
> both other objects and also holes (cavities) as parts. Thus for
> instance your gut and your nostrils are parts of you. This is one of
> the reasons why it is wrong to see organisms as sums of molecules,
> for example.
> BS
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google
> Groups "BFO Discuss" group.
> To post to this group, send email to bfo-discuss at googlegroups.com
> To unsubscribe from this group, send email to bfo-discuss+unsubscribe at googlegroups.com
> For more options, visit this group at http://groups.google.com/group/bfo-discuss?hl=en
> -~----------~----~----~----~------~----~------~--~---

From adiehl at informatics.jax.org Wed Aug 5 11:42:18 2009
From: adiehl at informatics.jax.org (Alexander Diehl)
Date: Wed, 05 Aug 2009 14:42:18 -0400
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
Message-ID: <4A79D28A.1060606@informatics.jax.org>

For what it's worth, I have long described CC as "localization relative
to a cell" to account for extracellular space, lumenal spaces, etc. I
never really thought about the need for mass.

-- Alex

--
Alexander D. Diehl, Ph.D.
Senior Scientific Curator
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, ME 04609

email: adiehl at informatics.jax.org
work: +1 (207) 288-6427
fax: +1 (207) 288-6131

From hjd at informatics.jax.org Wed Aug 5 11:49:59 2009
From: hjd at informatics.jax.org (Harold Drabkin)
Date: Wed, 05 Aug 2009 14:49:59 -0400
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
Message-ID: <4A79D457.3010104@informatics.jax.org>

This is a good point. So when we annotate, we are saying something is
in or at the component. Example: a protein is in the plasma membrane.
But if we use a term integral_to_plasma_membrane as an annotation, we
are now adding a "how" the protein is "in/at" the plasma membrane. It's
still in the membrane. It might need a new relationship to link
integral_to_plasma_membrane with "plasma membrane".

hd

From midori at ebi.ac.uk Thu Aug 6 02:37:44 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Thu, 6 Aug 2009 10:37:44 +0100 (BST)
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <4A79D457.3010104@informatics.jax.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org>
Message-ID:

For the integral/intrinsic/extrinsic to membrane terms, it doesn't
really matter whether we interpret CC as locations or "stuff" that has
mass.
Even if the CC terms were "located in X" rather than the thing "X",
terms like "intrinsic to membrane" describe spatial relations -- it's
"how" they're located there, exactly as Harold says.

I don't want to do [1], not least because it wouldn't solve the
intrinsic-to-membrane problem anyway. I also agree that an annotation
makes a "located-in" (or on, or at) statement, so the ontology doesn't
have to. I don't like [2] at all; it sounds like a lot of work for a
rather small gain. My vote is to work towards [3], and put up with [4]
until we have [3] deployed.

m

From jane at ebi.ac.uk Thu Aug 6 03:43:29 2009
From: jane at ebi.ac.uk (Jane Lomax)
Date: Thu, 06 Aug 2009 11:43:29 +0100
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org>
Message-ID: <4A7AB3D1.3080107@ebi.ac.uk>

I would also favour 3, not least because it also has
the potential to solve the host [component] issue. I imagine it would be fairly straightforward to automatically transfer existing annotations from e.g. extrinsic to membrane to be gp -> qual [extrinsic] -> GO:membrane. An obvious downside to this is that it would become more complicated to retrieve e.g. all extrinsic membrane proteins as the query would now involve info in the annotation as well as the GO term. Tools would need to be more sophisticated to handle this. That's not necessarily a deal-breaker though. We'd also need to make sure the qualifiers had accessible definitions somewhere because the GO term definition would not be sufficient to make the annotation. Shall we put this on the agenda for the GOC meeting? Jane Midori Harris wrote: > For the integral/intrinsic/extrinsic to membrane terms, it doesn't > really matter whether we interpret CC as locations or "stuff" that has > mass. Even if the CC terms were "located in X" rather than the thing > "X", terms like "intrinsic to membrane" describe spatial relations -- > it's "how" they're located there, exactly as Harold says. > > I don't want to do [1], not least because it wouldn't solve the > intrinsic-to-membrane problem anyway. I also agree that an annotation > makes a "located-in" (or on, or at) statement, so the ontology doesn't > have to. I don't like [2] at all; it sounds like a lot of work for a > rather small gain. My vote is to work towards [3], and put up with [4] > until we have [3] deployed. > > m > > On Wed, 5 Aug 2009, Harold Drabkin wrote: > >> This is a good point. So when we annotate, we are saying something is >> in or at the component. Example: a protein is in the plasma >> membrane. But if we use a term integral_to_plasma _membrane as an >> annotation, we are now adding a "how" the protein is "in/at" the >> plasma membrane. It's still in the membrane. 
>> It might need a new relationship to link
>> integral_to_plasma_membrane with "plasma membrane"
>>
>> hd

--
Dr Jane Lomax
GO Editorial Office
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridgeshire, UK
CB10 1SD
p: +44 1223 492516
f: +44 1223 494468

From midori at ebi.ac.uk Thu Aug 6 03:46:15 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Thu, 6 Aug 2009 11:46:15 +0100 (BST)
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <4A7AB3D1.3080107@ebi.ac.uk>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
 <4A79D457.3010104@informatics.jax.org>
 <4A7AB3D1.3080107@ebi.ac.uk>
Message-ID:

Yes, please do put it on the agenda.

m

On Thu, 6 Aug 2009, Jane Lomax wrote:

> I would also favour 3, not least because it also has the potential to solve
> the host [component] issue.
>
> I imagine it would be fairly straightforward to automatically transfer
> existing annotations from e.g. extrinsic to membrane to be gp -> qual
> [extrinsic] -> GO:membrane.
>
> An obvious downside to this is that it would become more complicated to
> retrieve e.g. all extrinsic membrane proteins as the query would now involve
> info in the annotation as well as the GO term. Tools would need to be more
> sophisticated to handle this. That's not necessarily a deal-breaker though.
> We'd also need to make sure the qualifiers had accessible definitions
> somewhere because the GO term definition would not be sufficient to make the
> annotation.
>
> Shall we put this on the agenda for the GOC meeting?
>
> Jane

From midori at ebi.ac.uk Thu Aug 6 03:50:34 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Thu, 6 Aug 2009 11:50:34 +0100 (BST)
Subject: [Ontology-editors] lists (Re: cellular component vs cellular location)
In-Reply-To: <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
Message-ID:

> See the email from
> Alan below which I guess never made it onto the annotation list (I guess we
> should probably set up a public GO ontology discussion list distinct from GO
> friends to stop people spamming the wrong lists?)

Is there any reason why Alan and similarly inclined others shouldn't send
their emails to the ontology-editors list? They'll be moderated because the
sender isn't a list member, but they'll get through with only a short delay
(and we get to minimize list proliferation).
(I got the impression that Alan sent his email to Barry, and then Barry cc'd bfo-discuss and GO annotation in his reply; Barry's email reached annotation after moderation.) m From jane at ebi.ac.uk Thu Aug 6 05:12:19 2009 From: jane at ebi.ac.uk (Jane Lomax) Date: Thu, 06 Aug 2009 13:12:19 +0100 Subject: [Ontology-editors] cellular component vs cellular location In-Reply-To: References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk> Message-ID: <4A7AC8A3.9040907@ebi.ac.uk> Okay - done. Chris, I put it down as yours... Jane Midori Harris wrote: > Yes, please do put it on the agenda. > > m > > On Thu, 6 Aug 2009, Jane Lomax wrote: > >> I would also favour 3, not least because it also has the potential to >> solve the host [component] issue. >> >> I imagine it would be fairly straightforward to automatically >> transfer existing annotations from e.g. extrinsic to membrane to be >> gp -> qual [extrinsic] -> GO:membrane. >> >> An obvious downside to this is that it would become more complicated >> to retrieve e.g. all extrinsic membrane proteins as the query would >> now involve info in the annotation as well as the GO term. Tools >> would need to be more sophisticated to handle this. That's not >> necessarily a deal-breaker though. We'd also need to make sure the >> qualifiers had accessible definitions somewhere because the GO term >> definition would not be sufficient to make the annotation. >> >> Shall we put this on the agenda for the GOC meeting? >> >> Jane >> >> Midori Harris wrote: >>> For the integral/intrinsic/extrinsic to membrane terms, it doesn't >>> really matter whether we interpret CC as locations or "stuff" that >>> has mass. Even if the CC terms were "located in X" rather than the >>> thing "X", terms like "intrinsic to membrane" describe spatial >>> relations -- it's "how" they're located there, exactly as Harold says. 
>>> >>> I don't want to do [1], not least because it wouldn't solve the >>> intrinsic-to-membrane problem anyway. I also agree that an >>> annotation makes a "located-in" (or on, or at) statement, so the >>> ontology doesn't have to. I don't like [2] at all; it sounds like a >>> lot of work for a rather small gain. My vote is to work towards [3], >>> and put up with [4] until we have [3] deployed. >>> >>> m >>> >>> On Wed, 5 Aug 2009, Harold Drabkin wrote: >>> >>>> This is a good point. So when we annotate, we are saying something >>>> is in or at the component. Example: a protein is in the plasma >>>> membrane. But if we use a term integral_to_plasma _membrane as an >>>> annotation, we are now adding a "how" the protein is "in/at" the >>>> plasma membrane. It's still in the membrane. It might need a new >>>> relationship to link integral_to_plasma_membrane with "plasma >>>> membrane" >>>> >>>> hd >>>> >>>> Chris Mungall wrote: >>>>> >>>>> I'd like to briefly revisit this issue >>>>> >>>>> We have a request for a term "Fully spanning the plasma membrane" >>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2831884&group_id=36855&atid=440764 >>>>> This a location rather than a cellular component. The determining >>>>> factor would be whether instances of the type would have mass. A >>>>> nucleus instance has mass, but it's not clear what a "Fully >>>>> spanning the plasma membrane" instance is. >>>>> >>>>> This is also the case for the intrinsic/extrinsic terms. See the >>>>> email from Alan below which I guess never made it onto the >>>>> annotation list (I guess we should probably set up a public GO >>>>> ontology discussion list distinct from GO friends to stop people >>>>> spamming the wrong lists?) >>>>> >>>>> Of course these terms are useful and I'm not suggesting getting >>>>> rid of current annotations. But it is important to be clear what >>>>> the terms in CC are. 
This is especially important as other groups >>>>> start using GO terms in cross-products. >>>>> >>>>> The options as I see it are: >>>>> >>>>> [1] Interpret all GO CC terms as locations. >>>>> >>>>> Thus GO:0005634 ! nucleus would be interpreted as "located in the >>>>> nucleus". Note that this is not the current interpretation as far >>>>> as I see it; an instance of GO:0005634 is an instance of a >>>>> nucleus. When we have an association between a gene product and a >>>>> nucleus then we interpret this as the gene product being localized >>>>> to the nucleus. >>>>> >>>>> [2] Introduce a new high level term "cell location". >>>>> >>>>> The is_a hierarchy would be something (roughly) like >>>>> >>>>> cell location >>>>> membrane location >>>>> extrinsic >>>>> spanning >>>>> intrinsic >>>>> fully spanning >>>>> full spanning plasma membrane >>>>> >>>>> (it would be more complex with dual parentage as we have the cross >>>>> product between membrane type and spatial qualifier) >>>>> >>>>> There would be additional contained_in/part_of links following the >>>>> current structure, so query results, enrichment etc would remain >>>>> roughly the same >>>>> >>>>> [3] Use spatial qualifiers in annotation >>>>> >>>>> Here we would actually obsolete the locational terms, and replace >>>>> them with annotation qualifiers >>>>> >>>>> * extrinsic, intrinsic: membranes only >>>>> * overlaps >>>>> * fully contained by >>>>> * fully spanning >>>>> >>>>> [4] Keep things as they actually are and not worry about giving a >>>>> coherent explanation as to what a cell component is. >>>>> >>>>> I am against [1] for reasons I can expand on. I am also against >>>>> [4] but partially resigned to it. 
I prefer [3] to [2] >>>>> >>>>> This is also related to the host terms as well, but I think >>>>> this is best dealt with separately >>>>> >>>>> Begin forwarded message: >>>>> >>>>>> From: Barry Smith >>>>>> Date: August 1, 2009 6:58:05 AM PDT >>>>>> To: Alan Ruttenberg , Suzanna Lewis >>>>>> , bfo-discuss at googlegroups.com >>>>>> Cc: annotation at genome.stanford.edu, huntley at ebi.ac.uk >>>>>> Subject: [bfo-discuss] Re: Example of one of the problems in >>>>>> cellular component >>>>>> Reply-To: bfo-discuss at googlegroups.com >>>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ---------- Forwarded message ---------- >>>>>>>> From: Rachael Huntley >>>>>>>> Date: Fri, Jul 31, 2009 at 5:50 AM >>>>>>>> Subject: [Annotation] GPI-anchored proteins >>>>>>>> To: annotation at genome.stanford.edu >>>>>>>> >>>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I'm after some advice. I'm a little confused about these two >>>>>>>> terms, >>>>>>>> with respect to GPI-anchored proteins; >>>>>>>> >>>>>>>> GO:0031224 intrinsic to membrane - Located in a membrane such that >>>>>>>> some covalently attached portion of the gene product, for >>>>>>>> example part >>>>>>>> of a peptide sequence or some other covalently attached moiety >>>>>>>> such as >>>>>>>> a GPI anchor, spans or is embedded in one or both leaflets of the >>>>>>>> membrane. Note that proteins intrinsic to membranes cannot be >>>>>>>> removed >>>>>>>> without disrupting the membrane, e.g. by detergent. >>>>>>>> >>>>>>>> GO:0019898 extrinsic to membrane - Loosely bound to one surface >>>>>>>> of a >>>>>>>> membrane, but not integrated into the hydrophobic region. Note >>>>>>>> that >>>>>>>> proteins extrinsic to membranes can be removed by treatments >>>>>>>> that do >>>>>>>> not disrupt the membrane, such as salt solutions. 
>>>>>>>> This term can be used instead of these obsolete terms: GO:0015025 >>>>>>>> GPI-anchored membrane-bound receptor (consider GO:0019898) >>>>>>>> >>>>>>>> Both mention GPI anchor, the first (intrinsic to membrane) in the >>>>>>>> definition and the second as a suggestion to use extrinsic to >>>>>>>> membrane >>>>>>>> instead of the obsolete GO:0015025 GPI-anchored membrane-bound >>>>>>>> receptor >>>>>>>> >>>>>>>> I don't know much about GPI-anchored proteins, but from what I can >>>>>>>> gather they can be extracted by detergent-solubilizing a membrane >>>>>>>> (PMID:19374451) which would suggest use of the term GO:0031224 >>>>>>>> intrinsic to membrane. However, the GPI-anchor can be disrupted by >>>>>>>> phospholipase C, thus releasing the associated protein, which >>>>>>>> would >>>>>>>> suggest use of the term GO:0019898 extrinsic to membrane. >>>>>>>> >>>>>>>> Additionally, GO:0031224 intrinsic to membrane has the child >>>>>>>> GO:0031225 anchored to membrane (Def: Tethered to a membrane by a >>>>>>>> covalently attached anchor, such as a lipid moiety, that is >>>>>>>> embedded >>>>>>>> in the membrane. When used to describe a protein, indicates >>>>>>>> that none >>>>>>>> of the peptide sequence is embedded in the membrane.) which >>>>>>>> would be a >>>>>>>> term I would use for GPI-anchored proteins. >>>>>>>> >>>>>>>> Can anyone suggest whether GPI-anchored proteins should be >>>>>>>> annotated >>>>>>>> to extrinsic or intrinsic to membrane. Either way, it looks as >>>>>>>> though >>>>>>>> the ontology could be refined in this area. >>>>>>>> >>>>>>>> Thanks for your help. >>>>>>>> >>>>>>>> Rachael. >>>>>>>> >>>>>>>> -- >>>>>>>> GOA and IntAct Curator >>>>>>>> European Bioinformatics Institute >>>>>>>> Welcome Trust Genome Campus >>>>>>>> Hinxton >>>>>>>> Cambridge, CB10 1SD >>>>>>>> UK >>>>>>>> >>>>>>>> Tel: 01223 492515 >>>>>>>> Fax: 01223 494468 >>>>>> >>>>>>> At 09:27 AM 7/31/2009, Alan Ruttenberg wrote: >>>>>>> Nice discussion. 
I'll note, however, that an adjective can't be a >>>>>>> place. Taj Mahal, versus Tah Mahalish. Or "above the Taj Mahal" (a >>>>>>> location) versus "floating above the Taj Mahal" (a relation to a >>>>>>> location). >>>>>>> >>>>>>> Cellular components are either places that things can be located >>>>>>> in, >>>>>>> or substances that are part_of cells. >>>>>>> And GO:0031225: anchored to membrane >>>>>>> >>>>>>> Is neither of these - it is a state of affairs or disposition or >>>>>>> something. >>>>>>> >>>>>>> -Alan >>>>>> >>>>>> I agree with Alan that there are cellular component terms which need >>>>>> cleaning up. However, I note that, according to BFO, objects can >>>>>> have >>>>>> both other objects and also holes (cavities) as parts. Thus for >>>>>> instance your gut and your nostrils are parts of you. This is one of >>>>>> the reasons why it is wrong to see organisms as sums of >>>>>> molecules, for example. >>>>>> BS >>>>>> BS >>>>>> >>>>>> >>>>>> --~--~---------~--~----~------------~-------~--~----~ >>>>>> You received this message because you are subscribed to the >>>>>> Google Groups "BFO Discuss" group. 
>>>>>> To post to this group, send email to bfo-discuss at googlegroups.com
>>>>>> To unsubscribe from this group, send email to
>>>>>> bfo-discuss+unsubscribe at googlegroups.com
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/bfo-discuss?hl=en
>>>>>> -~----------~----~----~----~------~----~------~--~---
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Ontology-editors mailing list
>>>>> Ontology-editors at geneontology.org
>>>>> http://fafner.stanford.edu/mailman/listinfo/ontology-editors
>>
>>

--
Dr Jane Lomax
GO Editorial Office
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridgeshire, UK
CB10 1SD

p: +44 1223 492516
f: +44 1223 494468

From cjm at berkeleybop.org Thu Aug 6 08:57:21 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Thu, 6 Aug 2009 08:57:21 -0700
Subject: [Ontology-editors] lists (Re: cellular component vs cellular location)
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
Message-ID:

On Aug 6, 2009, at 3:50 AM, Midori Harris wrote:

>> See the email from Alan below which I guess never made it onto the
>> annotation list (I guess we should probably set up a public GO
>> ontology discussion list distinct from GO friends to stop people
>> spamming the wrong lists?)
>
> Is there any reason why Alan and similarly inclined others shouldn't
> send their emails to the ontology-editors list?
> They'll be moderated because the sender isn't a list member, but
> they'll get through with only a short delay (and we get to minimize
> list proliferation).

I think it was just because the original thread on GPI anchors was on
the annotation list

> (I got the impression that Alan sent his email to Barry, and then
> Barry cc'd bfo-discuss and GO annotation in his reply; Barry's email
> reached annotation after moderation.)
>
> m
>

From cjm at berkeleybop.org Thu Aug 6 09:10:02 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Thu, 6 Aug 2009 09:10:02 -0700
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <4A7AB3D1.3080107@ebi.ac.uk>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
 <4A79D457.3010104@informatics.jax.org>
 <4A7AB3D1.3080107@ebi.ac.uk>
Message-ID:

On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote:

> I would also favour 3, not least because it also has the potential
> to solve the host [component] issue.
>
> I imagine it would be fairly straightforward to automatically
> transfer existing annotations from e.g. extrinsic to membrane to be
> gp -> qual [extrinsic] -> GO:membrane.

Yep, that's the easy part..

>
> An obvious downside to this is that it would become more complicated
> to retrieve e.g. all extrinsic membrane proteins as the query would
> now involve info in the annotation as well as the GO term. Tools
> would need to be more sophisticated to handle this.

..that's the harder part. The extra level of sophistication is not
that high, but getting 50 developers to reengineer their enrichment
tools...

> That's not necessarily a deal-breaker though. We'd also need to make
> sure the qualifiers had accessible definitions somewhere because the
> GO term definition would not be sufficient to make the annotation.

These could live as relations declared in the main obo file.

>
> Shall we put this on the agenda for the GOC meeting?

yep

I wonder if it's worth soliciting people for other similar qualifiers
that may be useful? That may be too big a can of worms

>
> Jane
>
> Midori Harris wrote:
>> For the integral/intrinsic/extrinsic to membrane terms, it doesn't
>> really matter whether we interpret CC as locations or "stuff" that
>> has mass. Even if the CC terms were "located in X" rather than the
>> thing "X", terms like "intrinsic to membrane" describe spatial
>> relations -- it's "how" they're located there, exactly as Harold
>> says.
>>
>> I don't want to do [1], not least because it wouldn't solve the
>> intrinsic-to-membrane problem anyway. I also agree that an
>> annotation makes a "located-in" (or on, or at) statement, so the
>> ontology doesn't have to. I don't like [2] at all; it sounds like a
>> lot of work for a rather small gain. My vote is to work towards
>> [3], and put up with [4] until we have [3] deployed.
>>
>> m
>>
>> On Wed, 5 Aug 2009, Harold Drabkin wrote:
>>
>>> This is a good point. So when we annotate, we are saying something
>>> is in or at the component. Example: a protein is in the plasma
>>> membrane. But if we use a term integral_to_plasma_membrane as an
>>> annotation, we are now adding a "how" the protein is "in/at" the
>>> plasma membrane. It's still in the membrane. It might need a new
>>> relationship to link integral_to_plasma_membrane with "plasma
>>> membrane"
>>>
>>> hd
>>>
>>> Chris Mungall wrote:
>>>>
>>>> I'd like to briefly revisit this issue
>>>>
>>>> We have a request for a term "Fully spanning the plasma membrane"
>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2831884&group_id=36855&atid=440764
>>>> This is a location rather than a cellular component. The determining
>>>> factor would be whether instances of the type would have mass. A
>>>> nucleus instance has mass, but it's not clear what a "Fully
>>>> spanning the plasma membrane" instance is.
>>>>
>>>> This is also the case for the intrinsic/extrinsic terms.

From kchris at genome.stanford.edu Thu Aug 6 09:45:24 2009
From: kchris at genome.stanford.edu (Karen Christie)
Date: Thu, 6 Aug 2009 09:45:24 -0700 (PDT)
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
 <4A79D457.3010104@informatics.jax.org>
 <4A7AB3D1.3080107@ebi.ac.uk>
Message-ID:

Personally, I'm not fond of idea #3 at all. We already find the
qualifiers confusing and adding more will just increase the potential
problems with accurate and consistent use of qualifiers in annotation.

In light of the multiple other places where we are embracing the
explosion of precomposed terms, I don't see the rationale to
post-compose here by combining a term from the mini ontology of
qualifiers with a cellular component term.

-Karen

On Thu, 6 Aug 2009, Chris Mungall wrote:

>
> On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote:
>
>> I would also favour 3, not least because it also has the potential to solve
>> the host [component] issue.
>>
>> I imagine it would be fairly straightforward to automatically transfer
>> existing annotations from e.g. extrinsic to membrane to be gp -> qual
>> [extrinsic] -> GO:membrane.
>
> Yep, that's the easy part..
>
>>
>> An obvious downside to this is that it would become more complicated to
>> retrieve e.g. all extrinsic membrane proteins as the query would now
>> involve info in the annotation as well as the GO term. Tools would need to
>> be more sophisticated to handle this.
>
> ..that's the harder part. The extra level of sophistication is not that high,
> but getting 50 developers to reengineer their enrichment tools...
>
>> That's not necessarily a deal-breaker though. We'd also need to make sure
>> the qualifiers had accessible definitions somewhere because the GO term
>> definition would not be sufficient to make the annotation.
>
> These could live as relations declared in the main obo file.
>
>>
>> Shall we put this on the agenda for the GOC meeting?
>
> yep
>
> I wonder if it's worth soliciting people for other similar qualifiers that
> may be useful? That may be too big a can of worms
>
>>
>> Jane
>>
>> Midori Harris wrote:
>>> For the integral/intrinsic/extrinsic to membrane terms, it doesn't really
>>> matter whether we interpret CC as locations or "stuff" that has mass. Even
>>> if the CC terms were "located in X" rather than the thing "X", terms like
>>> "intrinsic to membrane" describe spatial relations -- it's "how" they're
>>> located there, exactly as Harold says.
>>>
>>> I don't want to do [1], not least because it wouldn't solve the
>>> intrinsic-to-membrane problem anyway. I also agree that an annotation
>>> makes a "located-in" (or on, or at) statement, so the ontology doesn't
>>> have to.

From cjm at berkeleybop.org Thu Aug 6 11:15:47 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Thu, 6 Aug 2009 11:15:47 -0700
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
 <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
 <4A79D457.3010104@informatics.jax.org>
 <4A7AB3D1.3080107@ebi.ac.uk>
Message-ID: <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org>

I'm not at all worried by the additional terms, there aren't that many.
The problem is the mixing of locations with cell parts in an ad hoc way

I agree that many existing qualifiers are confusing and are used
inconsistently. But it doesn't follow that any future qualifiers will
be confusing or lead to inconsistency. If these qualifiers are
well-defined with a clearly specified semantics it may have the
opposite effect.

On Aug 6, 2009, at 9:45 AM, Karen Christie wrote:

> Personally, I'm not fond of idea #3 at all. We already find the
> qualifiers confusing and adding more will just increase the
> potential problems with accurate and consistent use of qualifiers in
> annotation.
>
> In light of the multiple other places where we are embracing the
> explosion of precomposed terms, I don't see the rationale to
> post-compose here by combining a term from the mini ontology of
> qualifiers with a cellular component term.
>
> -Karen
>
>
> On Thu, 6 Aug 2009, Chris Mungall wrote:
>
>>
>> On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote:
>>
>>> I would also favour 3, not least because it also has the potential
>>> to solve the host [component] issue.
>>> I imagine it would be fairly straightforward to automatically
>>> transfer existing annotations from e.g. extrinsic to membrane to
>>> be gp -> qual [extrinsic] -> GO:membrane.
>>
>> Yep, that's the easy part..
>>
>>> An obvious downside to this is that it would become more
>>> complicated to retrieve e.g. all extrinsic membrane proteins as
>>> the query would now involve info in the annotation as well as the
>>> GO term. Tools would need to be more sophisticated to handle this.
>>
>> ..that's the harder part. The extra level of sophistication is not
>> that high, but getting 50 developers to reengineer their enrichment
>> tools...
>>
>>> That's not necessarily a deal-breaker though. We'd also need to
>>> make sure the qualifiers had accessible definitions somewhere
>>> because the GO term definition would not be sufficient to make the
>>> annotation.
>> >> These could live as relations declared in the main obo file. >> >>> Shall we put this on the agenda for the GOC meeting? >> >> yep >> >> I wonder if it's worth soliciting people for other similar >> qualifiers that may be useful? That may be too big a can of worms >> >>> Jane >>> Midori Harris wrote: >>>> For the integral/intrinsic/extrinsic to membrane terms, it >>>> doesn't really matter whether we interpret CC as locations or >>>> "stuff" that has mass. Even if the CC terms were "located in X" >>>> rather than the thing "X", terms like "intrinsic to membrane" >>>> describe spatial relations -- it's "how" they're located there, >>>> exactly as Harold says. >>>> I don't want to do [1], not least because it wouldn't solve the >>>> intrinsic-to-membrane problem anyway. I also agree that an >>>> annotation makes a "located-in" (or on, or at) statement, so the >>>> ontology doesn't have to. I don't like [2] at all; it sounds like >>>> a lot of work for a rather small gain. My vote is to work towards >>>> [3], and put up with [4] until we have [3] deployed. >>>> m >>>> On Wed, 5 Aug 2009, Harold Drabkin wrote: >>>>> This is a good point. So when we annotate, we are saying >>>>> something is in or at the component. Example: a protein is in >>>>> the plasma membrane. But if we use a term integral_to_plasma >>>>> _membrane as an annotation, we are now adding a "how" the >>>>> protein is "in/at" the plasma membrane. It's still in the >>>>> membrane. It might need a new relationship to link >>>>> integral_to_plasma_membrane with "plasma membrane" >>>>> hd >>>>> Chris Mungall wrote: >>>>>> I'd like to briefly revisit this issue >>>>>> We have a request for a term "Fully spanning the plasma membrane" >>>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2831884&group_id=36855&atid=440764 >>>>>> This a location rather than a cellular component. The >>>>>> determining factor would be whether instances of the type would >>>>>> have mass. 
A nucleus instance has mass, but it's not clear what >>>>>> a "Fully spanning the plasma membrane" instance is. >>>>>> This is also the case for the intrinsic/extrinsic terms. See >>>>>> the email from Alan below which I guess never made it onto the >>>>>> annotation list (I guess we should probably set up a public GO >>>>>> ontology discussion list distinct from GO friends to stop >>>>>> people spamming the wrong lists?) >>>>>> Of course these terms are useful and I'm not suggesting getting >>>>>> rid of current annotations. But it is important to be clear >>>>>> what the terms in CC are. This is especially important as other >>>>>> groups start using GO terms in cross-products. >>>>>> The options as I see it are: >>>>>> [1] Interpret all GO CC terms as locations. >>>>>> Thus GO:0005634 ! nucleus would be interpreted as "located in >>>>>> the nucleus". Note that this is not the current interpretation >>>>>> as far as I see it; an instance of GO:0005634 is an instance of >>>>>> a nucleus. When we have an association between a gene product >>>>>> and a nucleus then we interpret this as the gene product being >>>>>> localized to the nucleus. >>>>>> [2] Introduce a new high level term "cell location". 
>>>>>> The is_a hierarchy would be something (roughly) like >>>>>> >>>>>> cell location >>>>>> membrane location >>>>>> extrinsic >>>>>> spanning >>>>>> intrinsic >>>>>> fully spanning >>>>>> full spanning plasma membrane >>>>>> (it would be more complex with dual parentage as we have the >>>>>> cross product between membrane type and spatial qualifier) >>>>>> There would be additional contained_in/part_of links following >>>>>> the current structure, so query results, enrichment etc would >>>>>> remain roughly the same >>>>>> [3] Use spatial qualifiers in annotation >>>>>> Here we would actually obsolete the locational terms, and >>>>>> replace them with annotation qualifiers >>>>>> >>>>>> * extrinsic, intrinsic: membranes only >>>>>> * overlaps >>>>>> * fully contained by >>>>>> * fully spanning >>>>>> [4] Keep things as they actually are and not worry about giving >>>>>> a coherent explanation as to what a cell component is. >>>>>> I am against [1] for reasons I can expand on. I am also against >>>>>> [4] but partially resigned to it. I prefer [3] to [2] >>>>>> This is also related to the host terms as well, but I >>>>>> think this is best dealt with separately >>>>>> Begin forwarded message: >>>>>>> From: Barry Smith >>>>>>> Date: August 1, 2009 6:58:05 AM PDT >>>>>>> To: Alan Ruttenberg , Suzanna Lewis >>>>>> >, bfo-discuss at googlegroups.com >>>>>>> Cc: annotation at genome.stanford.edu, huntley at ebi.ac.uk >>>>>>> Subject: [bfo-discuss] Re: Example of one of the problems in >>>>>>> cellular component >>>>>>> Reply-To: bfo-discuss at googlegroups.com >>>>>>>>> ---------- Forwarded message ---------- >>>>>>>>> From: Rachael Huntley >>>>>>>>> Date: Fri, Jul 31, 2009 at 5:50 AM >>>>>>>>> Subject: [Annotation] GPI-anchored proteins >>>>>>>>> To: annotation at genome.stanford.edu >>>>>>>>> Hi all, >>>>>>>>> I'm after some advice. 
I'm a little confused about these two
>>>>>>>>> terms, with respect to GPI-anchored proteins;
>>>>>>>>>
>>>>>>>>> GO:0031224 intrinsic to membrane - Located in a membrane such that
>>>>>>>>> some covalently attached portion of the gene product, for example
>>>>>>>>> part of a peptide sequence or some other covalently attached moiety
>>>>>>>>> such as a GPI anchor, spans or is embedded in one or both leaflets
>>>>>>>>> of the membrane. Note that proteins intrinsic to membranes cannot be
>>>>>>>>> removed without disrupting the membrane, e.g. by detergent.
>>>>>>>>>
>>>>>>>>> GO:0019898 extrinsic to membrane - Loosely bound to one surface of a
>>>>>>>>> membrane, but not integrated into the hydrophobic region. Note that
>>>>>>>>> proteins extrinsic to membranes can be removed by treatments that do
>>>>>>>>> not disrupt the membrane, such as salt solutions.
>>>>>>>>> This term can be used instead of these obsolete terms: GO:0015025
>>>>>>>>> GPI-anchored membrane-bound receptor (consider GO:0019898)
>>>>>>>>>
>>>>>>>>> Both mention GPI anchor, the first (intrinsic to membrane) in the
>>>>>>>>> definition and the second as a suggestion to use extrinsic to
>>>>>>>>> membrane instead of the obsolete GO:0015025 GPI-anchored
>>>>>>>>> membrane-bound receptor
>>>>>>>>>
>>>>>>>>> I don't know much about GPI-anchored proteins, but from what I can
>>>>>>>>> gather they can be extracted by detergent-solubilizing a membrane
>>>>>>>>> (PMID:19374451) which would suggest use of the term GO:0031224
>>>>>>>>> intrinsic to membrane. However, the GPI-anchor can be disrupted by
>>>>>>>>> phospholipase C, thus releasing the associated protein, which would
>>>>>>>>> suggest use of the term GO:0019898 extrinsic to membrane.
>>>>>>>>>
>>>>>>>>> Additionally, GO:0031224 intrinsic to membrane has the child
>>>>>>>>> GO:0031225 anchored to membrane (Def: Tethered to a membrane by a
>>>>>>>>> covalently attached anchor, such as a lipid moiety, that is embedded
>>>>>>>>> in the membrane. When used to describe a protein, indicates that
>>>>>>>>> none of the peptide sequence is embedded in the membrane.) which
>>>>>>>>> would be a term I would use for GPI-anchored proteins.
>>>>>>>>>
>>>>>>>>> Can anyone suggest whether GPI-anchored proteins should be annotated
>>>>>>>>> to extrinsic or intrinsic to membrane. Either way, it looks as
>>>>>>>>> though the ontology could be refined in this area.
>>>>>>>>>
>>>>>>>>> Thanks for your help.
>>>>>>>>>
>>>>>>>>> Rachael.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> GOA and IntAct Curator
>>>>>>>>> European Bioinformatics Institute
>>>>>>>>> Wellcome Trust Genome Campus
>>>>>>>>> Hinxton
>>>>>>>>> Cambridge, CB10 1SD
>>>>>>>>> UK
>>>>>>>>>
>>>>>>>>> Tel: 01223 492515
>>>>>>>>> Fax: 01223 494468
>>>>>>>>
>>>>>>>> At 09:27 AM 7/31/2009, Alan Ruttenberg wrote:
>>>>>>>> Nice discussion. I'll note, however, that an adjective can't be a
>>>>>>>> place. Taj Mahal, versus Taj Mahalish. Or "above the Taj Mahal" (a
>>>>>>>> location) versus "floating above the Taj Mahal" (a relation to a
>>>>>>>> location).
>>>>>>>>
>>>>>>>> Cellular components are either places that things can be located in,
>>>>>>>> or substances that are part_of cells.
>>>>>>>> And GO:0031225: anchored to membrane
>>>>>>>>
>>>>>>>> Is neither of these - it is a state of affairs or disposition
>>>>>>>> or something.
>>>>>>>>
>>>>>>>> -Alan
>>>>>>>
>>>>>>> I agree with Alan that there are cellular component terms which need
>>>>>>> cleaning up. However, I note that, according to BFO, objects can have
>>>>>>> both other objects and also holes (cavities) as parts. Thus for
>>>>>>> instance your gut and your nostrils are parts of you.
This is one of
>>>>>>> the reasons why it is wrong to see organisms as sums of molecules,
>>>>>>> for example.
>>>>>>> BS
>>>>>>>
>>>>>>> --~--~---------~--~----~------------~-------~--~----~
>>>>>>> You received this message because you are subscribed to the
>>>>>>> Google Groups "BFO Discuss" group.
>>>>>>> To post to this group, send email to bfo-discuss at googlegroups.com
>>>>>>> To unsubscribe from this group, send email to bfo-discuss+unsubscribe at googlegroups.com
>>>>>>> For more options, visit this group at http://groups.google.com/group/bfo-discuss?hl=en
>>>>>>> -~----------~----~----~----~------~----~------~--~---
>>>>>> _______________________________________________
>>>>>> Ontology-editors mailing list
>>>>>> Ontology-editors at geneontology.org
>>>>>> http://fafner.stanford.edu/mailman/listinfo/ontology-editors
>>> --
>>> Dr Jane Lomax
>>> GO Editorial Office
>>> EMBL-EBI
>>> Wellcome Trust Genome Campus
>>> Hinxton
>>> Cambridgeshire, UK
>>> CB10 1SD
>>> p: +44 1223 492516
>>> f: +44 1223 494468

From adiehl at informatics.jax.org Thu Aug 6 11:22:50 2009
From: adiehl at informatics.jax.org (Alexander Diehl)
Date: Thu, 06 Aug 2009 14:22:50 -0400
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
<4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk>
Message-ID: <4A7B1F7A.2000703@informatics.jax.org>

I second Karen on this. Introducing new qualifiers necessitates new
software development for the MODs and tool developers, which many don't
have time for, and will slow down and complicate the annotation process
for little benefit to the end users over the qualifiers implicit in the
term names already. With infinite time and resources, this would be
great, but it's not a realistic approach given the pressures and
requirements on MODs and annotators already.

-- Alex

Karen Christie wrote:
> Personally, I'm not fond of idea #3 at all. We already find the
> qualifiers confusing and adding more will just increase the potential
> problems with accurate and consistent use of qualifiers in annotation.
>
> In light of the multiple other places where we are embracing the
> explosion of precomposed terms, I don't see the rationale to
> post-compose here by combining a term from the mini ontology of
> qualifiers with a cellular component term.
>
> -Karen
>
>
> On Thu, 6 Aug 2009, Chris Mungall wrote:
>
>>
>> On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote:
>>
>>> I would also favour 3, not least because it also has the potential
>>> to solve the host [component] issue.
>>>
>>> I imagine it would be fairly straightforward to automatically
>>> transfer existing annotations from e.g. extrinsic to membrane to be
>>> gp -> qual [extrinsic] -> GO:membrane.
>>
>> Yep, that's the easy part..
>>
>>>
>>> An obvious downside to this is that it would become more complicated
>>> to retrieve e.g. all extrinsic membrane proteins as the query would
>>> now involve info in the annotation as well as the GO term. Tools
>>> would need to be more sophisticated to handle this.
>>
>> ..that's the harder part. The extra level of sophistication is not
>> that high, but getting 50 developers to reengineer their enrichment
>> tools...
>>
>>> That's not necessarily a deal-breaker though. We'd also need to make
>>> sure the qualifiers had accessible definitions somewhere because the
>>> GO term definition would not be sufficient to make the annotation.
>>
>> These could live as relations declared in the main obo file.
>>
>>>
>>> Shall we put this on the agenda for the GOC meeting?
>>
>> yep
>>
>> I wonder if it's worth soliciting people for other similar qualifiers
>> that may be useful? That may be too big a can of worms
>>
>>>
>>> Jane
>>>
>>> Midori Harris wrote:
>>>> For the integral/intrinsic/extrinsic to membrane terms, it doesn't
>>>> really matter whether we interpret CC as locations or "stuff" that
>>>> has mass. Even if the CC terms were "located in X" rather than the
>>>> thing "X", terms like "intrinsic to membrane" describe spatial
>>>> relations -- it's "how" they're located there, exactly as Harold says.
>>>>
>>>> I don't want to do [1], not least because it wouldn't solve the
>>>> intrinsic-to-membrane problem anyway. I also agree that an
>>>> annotation makes a "located-in" (or on, or at) statement, so the
>>>> ontology doesn't have to. I don't like [2] at all; it sounds like a
>>>> lot of work for a rather small gain. My vote is to work towards
>>>> [3], and put up with [4] until we have [3] deployed.
>>>>
>>>> m
>>>>
>>>> On Wed, 5 Aug 2009, Harold Drabkin wrote:
>>>>
>>>>> This is a good point. So when we annotate, we are saying something
>>>>> is in or at the component. Example: a protein is in the plasma
>>>>> membrane. But if we use a term integral_to_plasma_membrane as an
>>>>> annotation, we are now adding a "how" the protein is "in/at" the
>>>>> plasma membrane. It's still in the membrane. It might need a new
>>>>> relationship to link integral_to_plasma_membrane with "plasma
>>>>> membrane"
>>>>>
>>>>> hd
>>>
>>> --
>>> Dr Jane Lomax
>>> GO Editorial Office
>>> EMBL-EBI
>>> Wellcome Trust Genome Campus
>>> Hinxton
>>> Cambridgeshire, UK
>>> CB10 1SD
>>>
>>> p: +44 1223 492516
>>> f: +44 1223 494468

--
Alexander D. Diehl, Ph.D.
Senior Scientific Curator
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, ME 04609
email: adiehl at informatics.jax.org
work: +1 (207) 288-6427
fax: +1 (207) 288-6131

From kchris at genome.stanford.edu Fri Aug 7 16:19:18 2009
From: kchris at genome.stanford.edu (Karen Christie)
Date: Fri, 7 Aug 2009 16:19:18 -0700 (PDT)
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk> <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org>
Message-ID:

In some future GO version 2, I think I would prefer a more modular
system where you can do annotations by combining terms from multiple
ontologies rather than having to instantiate huge numbers of terms in
order to distinguish things like cytoplasmic translation vs
mitochondrial translation, or the terms for function process links, or
3'-end processing of snoRNA, snRNA, tRNA, etc.

However, I think the proposal to create a new qualifier system just for
the purpose of a subset of the CC terms would be a really fundamental
change in the way annotations are made. Right now, the existing
qualifiers basically modify the annotation, whether a gene is really
part of a complex or just associated with it. But with your proposal
#3, it seems like you would have to use a qualifier in order to
generate a "term" equivalent to what exists now.

In terms of the effects on annotation process, need for curator
training, as well as the things Alex mentioned about programming time
to modify software, I think this falls into the lots of work for very
little gain category.

While I can see your point that CC is a hodgepodge, I don't think this
is the only issue in CC and I don't think this is a good way to fix it.
At the moment, I vote for option 4, leave CC as is.

-Karen

On Thu, 6 Aug 2009, Chris Mungall wrote:

>
> I'm not at all worried by the additional terms, there aren't that many. The
> problem is the mixing of locations with cell parts in an ad-hoc way
>
> I agree that many existing qualifiers are confusing and are used
> inconsistently. But it doesn't follow that any future qualifiers will be
> confusing or lead to inconsistency. If these qualifiers are well-defined with
> a clearly specified semantics it may have the opposite effect.

From cjm at berkeleybop.org Fri Aug 7 16:54:38 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Fri, 7 Aug 2009 16:54:38 -0700
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu>
<1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk> <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org> Message-ID: <535CD753-ACE9-43B5-8FC3-3A524EEB2322@berkeleybop.org> On Aug 7, 2009, at 4:19 PM, Karen Christie wrote: > In some future GO version 2, I think I would prefer a more modular > system where you can do annotations by combining terms from multiple > ontologies rather than having to instantiate huge numbers of terms > in order to distinguish things like cytoplasmic translation vs > mitochondrial translation, or the terms for function process links, > or 3'-end processing of snoRNA, snRNA, tRNA, etc. Everyone has had the expressivity to do this for some time: http://wiki.geneontology.org/index.php/Annotation_Cross_Products But I would argue GO should always retain 'mitochondrial translation' as a pre-composed term, because the mt aspect is non-accidental. > However, I think the proposal to create a new qualifier system just > for the purpose of a subset of the CC terms would be a really > fundamental change in the way annotations are made. Right now, the > existing qualifiers basically modify the annotation, whether a gene > is really part of a complex or just associated with it. But with > your proposal #3, it seems like you would have to use a qualifier in > order to generate a "term" equivalent to what exists now. I would argue that there are 3 distinct things that should not be confused: [1] the gene product [2] the function/process/component [3] the relationship between [1] and [2]. For historic reasons, [3] has never been explicitly stated but is left implicit, except in non-default scenarios which is where the existing qualifiers come in. I believe there are strong benefits to clearly separating [1], [2] and [3], and for being explicit about [3] and having formal semantics for [3]. 
These benefits include clearer annotations, more expressive power, cleaner, easier-to-maintain ontologies, and better integration between GO and other resources.

At the moment terms such as "fully spanning the plasma membrane", "integral to membrane", "host cell" confuse [2] and [3].

> In terms of the effects on annotation process, need for curator
> training, as well as the things Alex mentioned about programming
> time to modify software, I think this falls into the lots of work
> for very little gain category.
>
> While I can see your point that CC is a hodgepodge, I don't think
> this is the only issue in CC and I don't think this is a good way to
> fix it. At the moment, I vote for option 4, leave CC as is.

I think we need to plan for the future. I think that given the current resistance we are unlikely to move beyond [4] for a number of years. But that does not mean we can't have a discussion about improvements on current hodgepodges and how to get there.

>
> -Karen
>
>
> On Thu, 6 Aug 2009, Chris Mungall wrote:
>
>>
>> I'm not at all worried by the additional terms, there aren't that
>> many. The problem is the mixing of locations with cell parts in an
>> ad-hoc way
>>
>> I agree that many existing qualifiers are confusing and are used
>> inconsistently. But it doesn't follow that any future qualifiers
>> will be confusing or lead to inconsistency. If these qualifiers are
>> well-defined with a clearly specified semantics it may have the
>> opposite effect.
>>
>> On Aug 6, 2009, at 9:45 AM, Karen Christie wrote:
>>
>>> Personally, I'm not fond of idea #3 at all. We already find the
>>> qualifiers confusing and adding more will just increase the
>>> potential problems with accurate and consistent use of qualifiers
>>> in annotation.
>>> In light of the multiple other places where we are embracing the >>> explosion of precomposed terms, I don't see the rationale to post- >>> compose here by combining a term from the mini ontology of >>> qualifiers with a cellular component term. >>> -Karen >>> On Thu, 6 Aug 2009, Chris Mungall wrote: >>>> On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote: >>>>> I would also favour 3, not least because it also has the >>>>> potential to solve the host [component] issue. >>>>> I imagine it would be fairly straightforward to automatically >>>>> transfer existing annotations from e.g. extrinsic to membrane to >>>>> be gp -> qual [extrinsic] -> GO:membrane. >>>> Yep, that's the easy part.. >>>>> An obvious downside to this is that it would become more >>>>> complicated to retrieve e.g. all extrinsic membrane proteins as >>>>> the query would now involve info in the annotation as well as >>>>> the GO term. Tools would need to be more sophisticated to handle >>>>> this. >>>> ..that's the harder part. The extra level of sophistication is >>>> not that high, but getting 50 developers to reengineer their >>>> enrichment tools... >>>>> That's not necessarily a deal-breaker though. We'd also need to >>>>> make sure the qualifiers had accessible definitions somewhere >>>>> because the GO term definition would not be sufficient to make >>>>> the annotation. >>>> These could live as relations declared in the main obo file. >>>>> Shall we put this on the agenda for the GOC meeting? >>>> yep >>>> I wonder if it's worth soliciting people for other similar >>>> qualifiers that may be useful? That may be too big a can of worms >>>>> Jane >>>>> Midori Harris wrote: >>>>>> For the integral/intrinsic/extrinsic to membrane terms, it >>>>>> doesn't really matter whether we interpret CC as locations or >>>>>> "stuff" that has mass. 
Even if the CC terms were "located in X" >>>>>> rather than the thing "X", terms like "intrinsic to membrane" >>>>>> describe spatial relations -- it's "how" they're located there, >>>>>> exactly as Harold says. >>>>>> I don't want to do [1], not least because it wouldn't solve the >>>>>> intrinsic-to-membrane problem anyway. I also agree that an >>>>>> annotation makes a "located-in" (or on, or at) statement, so >>>>>> the ontology doesn't have to. I don't like [2] at all; it >>>>>> sounds like a lot of work for a rather small gain. My vote is >>>>>> to work towards [3], and put up with [4] until we have [3] >>>>>> deployed. >>>>>> m >>>>>> On Wed, 5 Aug 2009, Harold Drabkin wrote: >>>>>>> This is a good point. So when we annotate, we are saying >>>>>>> something is in or at the component. Example: a protein is in >>>>>>> the plasma membrane. But if we use a term integral_to_plasma >>>>>>> _membrane as an annotation, we are now adding a "how" the >>>>>>> protein is "in/at" the plasma membrane. It's still in the >>>>>>> membrane. It might need a new relationship to link >>>>>>> integral_to_plasma_membrane with "plasma membrane" >>>>>>> hd >>>>>>> Chris Mungall wrote: >>>>>>>> I'd like to briefly revisit this issue >>>>>>>> We have a request for a term "Fully spanning the plasma >>>>>>>> membrane" >>>>>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2831884&group_id=36855&atid=440764 >>>>>>>> This a location rather than a cellular component. The >>>>>>>> determining factor would be whether instances of the type >>>>>>>> would have mass. A nucleus instance has mass, but it's not >>>>>>>> clear what a "Fully spanning the plasma membrane" instance is. >>>>>>>> This is also the case for the intrinsic/extrinsic terms. 
See >>>>>>>> the email from Alan below which I guess never made it onto >>>>>>>> the annotation list (I guess we should probably set up a >>>>>>>> public GO ontology discussion list distinct from GO friends >>>>>>>> to stop people spamming the wrong lists?) >>>>>>>> Of course these terms are useful and I'm not suggesting >>>>>>>> getting rid of current annotations. But it is important to be >>>>>>>> clear what the terms in CC are. This is especially important >>>>>>>> as other groups start using GO terms in cross-products. >>>>>>>> The options as I see it are: >>>>>>>> [1] Interpret all GO CC terms as locations. >>>>>>>> Thus GO:0005634 ! nucleus would be interpreted as "located in >>>>>>>> the nucleus". Note that this is not the current >>>>>>>> interpretation as far as I see it; an instance of GO:0005634 >>>>>>>> is an instance of a nucleus. When we have an association >>>>>>>> between a gene product and a nucleus then we interpret this >>>>>>>> as the gene product being localized to the nucleus. >>>>>>>> [2] Introduce a new high level term "cell location". 
>>>>>>>> The is_a hierarchy would be something (roughly) like >>>>>>>> cell location >>>>>>>> membrane location >>>>>>>> extrinsic >>>>>>>> spanning >>>>>>>> intrinsic >>>>>>>> fully spanning >>>>>>>> full spanning plasma membrane >>>>>>>> (it would be more complex with dual parentage as we have the >>>>>>>> cross product between membrane type and spatial qualifier) >>>>>>>> There would be additional contained_in/part_of links >>>>>>>> following the current structure, so query results, enrichment >>>>>>>> etc would remain roughly the same >>>>>>>> [3] Use spatial qualifiers in annotation >>>>>>>> Here we would actually obsolete the locational terms, and >>>>>>>> replace them with annotation qualifiers >>>>>>>> * extrinsic, intrinsic: membranes only >>>>>>>> * overlaps >>>>>>>> * fully contained by >>>>>>>> * fully spanning >>>>>>>> [4] Keep things as they actually are and not worry about >>>>>>>> giving a coherent explanation as to what a cell component is. >>>>>>>> I am against [1] for reasons I can expand on. I am also >>>>>>>> against [4] but partially resigned to it. I prefer [3] to [2] >>>>>>>> This is also related to the host terms as well, but I >>>>>>>> think this is best dealt with separately >>>>>>>> Begin forwarded message: >>>>>>>>> From: Barry Smith >>>>>>>>> Date: August 1, 2009 6:58:05 AM PDT >>>>>>>>> To: Alan Ruttenberg , Suzanna >>>>>>>>> Lewis , bfo-discuss at googlegroups.com >>>>>>>>> Cc: annotation at genome.stanford.edu, huntley at ebi.ac.uk >>>>>>>>> Subject: [bfo-discuss] Re: Example of one of the problems in >>>>>>>>> cellular component >>>>>>>>> Reply-To: bfo-discuss at googlegroups.com >>>>>>>>>>> ---------- Forwarded message ---------- >>>>>>>>>>> From: Rachael Huntley >>>>>>>>>>> Date: Fri, Jul 31, 2009 at 5:50 AM >>>>>>>>>>> Subject: [Annotation] GPI-anchored proteins >>>>>>>>>>> To: annotation at genome.stanford.edu >>>>>>>>>>> Hi all, >>>>>>>>>>> I'm after some advice. 
I'm a little confused about these >>>>>>>>>>> two terms, >>>>>>>>>>> with respect to GPI-anchored proteins; >>>>>>>>>>> GO:0031224 intrinsic to membrane - Located in a membrane >>>>>>>>>>> such that >>>>>>>>>>> some covalently attached portion of the gene product, for >>>>>>>>>>> example part >>>>>>>>>>> of a peptide sequence or some other covalently attached >>>>>>>>>>> moiety such as >>>>>>>>>>> a GPI anchor, spans or is embedded in one or both leaflets >>>>>>>>>>> of the >>>>>>>>>>> membrane. Note that proteins intrinsic to membranes cannot >>>>>>>>>>> be removed >>>>>>>>>>> without disrupting the membrane, e.g. by detergent. >>>>>>>>>>> GO:0019898 extrinsic to membrane - Loosely bound to one >>>>>>>>>>> surface of a >>>>>>>>>>> membrane, but not integrated into the hydrophobic region. >>>>>>>>>>> Note that >>>>>>>>>>> proteins extrinsic to membranes can be removed by >>>>>>>>>>> treatments that do >>>>>>>>>>> not disrupt the membrane, such as salt solutions. >>>>>>>>>>> This term can be used instead of these obsolete terms: GO: >>>>>>>>>>> 0015025 >>>>>>>>>>> GPI-anchored membrane-bound receptor (consider GO:0019898) >>>>>>>>>>> Both mention GPI anchor, the first (intrinsic to membrane) >>>>>>>>>>> in the >>>>>>>>>>> definition and the second as a suggestion to use extrinsic >>>>>>>>>>> to membrane >>>>>>>>>>> instead of the obsolete GO:0015025 GPI-anchored membrane- >>>>>>>>>>> bound >>>>>>>>>>> receptor >>>>>>>>>>> I don't know much about GPI-anchored proteins, but from >>>>>>>>>>> what I can >>>>>>>>>>> gather they can be extracted by detergent-solubilizing a >>>>>>>>>>> membrane >>>>>>>>>>> (PMID:19374451) which would suggest use of the term GO: >>>>>>>>>>> 0031224 >>>>>>>>>>> intrinsic to membrane. However, the GPI-anchor can be >>>>>>>>>>> disrupted by >>>>>>>>>>> phospholipase C, thus releasing the associated protein, >>>>>>>>>>> which would >>>>>>>>>>> suggest use of the term GO:0019898 extrinsic to membrane. 
>>>>>>>>>>> Additionally, GO:0031224 intrinsic to membrane has the child >>>>>>>>>>> GO:0031225 anchored to membrane (Def: Tethered to a >>>>>>>>>>> membrane by a >>>>>>>>>>> covalently attached anchor, such as a lipid moiety, that >>>>>>>>>>> is embedded >>>>>>>>>>> in the membrane. When used to describe a protein, >>>>>>>>>>> indicates that none >>>>>>>>>>> of the peptide sequence is embedded in the membrane.) >>>>>>>>>>> which would be a >>>>>>>>>>> term I would use for GPI-anchored proteins. >>>>>>>>>>> Can anyone suggest whether GPI-anchored proteins should be >>>>>>>>>>> annotated >>>>>>>>>>> to extrinsic or intrinsic to membrane. Either way, it >>>>>>>>>>> looks as though >>>>>>>>>>> the ontology could be refined in this area. >>>>>>>>>>> Thanks for your help. >>>>>>>>>>> Rachael. >>>>>>>>>>> -- >>>>>>>>>>> GOA and IntAct Curator >>>>>>>>>>> European Bioinformatics Institute >>>>>>>>>>> Welcome Trust Genome Campus >>>>>>>>>>> Hinxton >>>>>>>>>>> Cambridge, CB10 1SD >>>>>>>>>>> UK >>>>>>>>>>> Tel: 01223 492515 >>>>>>>>>>> Fax: 01223 494468 >>>>>>>>>> At 09:27 AM 7/31/2009, Alan Ruttenberg wrote: >>>>>>>>>> Nice discussion. I'll note, however, that an adjective >>>>>>>>>> can't be a >>>>>>>>>> place. Taj Mahal, versus Tah Mahalish. Or "above the Taj >>>>>>>>>> Mahal" (a >>>>>>>>>> location) versus "floating above the Taj Mahal" (a relation >>>>>>>>>> to a >>>>>>>>>> location). >>>>>>>>>> Cellular components are either places that things can be >>>>>>>>>> located in, >>>>>>>>>> or substances that are part_of cells. >>>>>>>>>> And GO:0031225: anchored to membrane >>>>>>>>>> Is neither of these - it is a state of affairs or >>>>>>>>>> disposition or something. >>>>>>>>>> -Alan >>>>>>>>> I agree with Alan that there are cellular component terms >>>>>>>>> which need >>>>>>>>> cleaning up. However, I note that, according to BFO, objects >>>>>>>>> can have >>>>>>>>> both other objects and also holes (cavities) as parts. 
Thus for
>>>>>>>>> instance your gut and your nostrils are parts of you. This is one of
>>>>>>>>> the reasons why it is wrong to see organisms as sums of
>>>>>>>>> molecules, for example.
>>>>>>>>> BS
>>>>>>>>> BS
>>>>> --
>>>>> Dr Jane Lomax
>>>>> GO Editorial Office
>>>>> EMBL-EBI
>>>>> Wellcome Trust Genome Campus
>>>>> Hinxton
>>>>> Cambridgeshire, UK
>>>>> CB10 1SD
>>>>> p: +44 1223 492516
>>>>> f: +44 1223 494468
>>
>

From cjm at berkeleybop.org Wed Aug 12 11:52:18 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Wed, 12 Aug 2009 11:52:18 -0700
Subject: [Ontology-editors] disjointness check added to
check-obo-for-standard-release.pl
Message-ID:

This means that any disjoint_from violation will halt the publishing pipeline, and the ontology will not make it any further than the editors file.

From kchris at genome.stanford.edu Wed Aug 12 14:36:27 2009
From: kchris at genome.stanford.edu (Karen Christie)
Date: Wed, 12 Aug 2009 14:36:27 -0700 (PDT)
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To: <535CD753-ACE9-43B5-8FC3-3A524EEB2322@berkeleybop.org>
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org> <4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk> <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org> <535CD753-ACE9-43B5-8FC3-3A524EEB2322@berkeleybop.org>
Message-ID:

On Fri, 7 Aug 2009, Chris Mungall wrote:

comments inserted inline

>
> On Aug 7, 2009, at 4:19 PM, Karen Christie wrote:
>
> > In some future GO version 2, I think I would prefer a more modular system
> > where you can do annotations by combining terms from multiple ontologies
> > rather than having to instantiate huge numbers of terms in order to
> > distinguish things like cytoplasmic translation vs mitochondrial
> > translation, or the terms for function process links, or 3'-end processing
> > of snoRNA, snRNA, tRNA, etc.
>
> Everyone has had the expressivity to do this for some time:
> http://wiki.geneontology.org/index.php/Annotation_Cross_Products

Just because it is allowed by the gene-association file format does NOT mean that "everyone" can do this. Many groups curate into databases, not into flat files, so my actual ability to utilize this is determined by when our group has time to redesign the database to store it and write new code for both curator and user interfaces. Considering that we have little or no need to make annotations this way versus just asking for a new term, spending person-hours on the database and software changes required to implement this is not a priority at this time.
However, I was NOT talking about slapping an optional modular system on top of our existing system which functions primarily by instantiating new terms, even when they represent cross-products. SGD's phenotype annotations are done by combining terms from multiple smaller ontologies. Thus, sometimes I wonder if the functional annotations made with GO might be better accomplished by combining terms from multiple ontologies to make an annotation and NOT instantiating huge numbers of cross-product terms. In such a system, you might only need to represent translation in an ontology of processes, and to annotate you'd combine a process term with a location term at the time of annotation, instead of duplicating part or all of the translation process terms multiple times, for cytoplasmic, mitochondrial, chloroplast, and whatever else comes up.

Before I move on to other parts of this discussion, let me make it clear that I am not advocating we go in that direction now. While there are clearly some issues with the existing system in terms of managing the size of GO, there would be other issues in other systems. I just sometimes wonder if there might be a better way than instantiating 7 terms to represent the various different RNA substrates of "polyadenylation-dependent RNA catabolism", where basically there is one process and it acts on all types of RNA in the nucleus, but the only way to be able to see that that process acts on the various types of RNA a biologist might be interested in is to instantiate a bunch of separate terms.

> But I would argue GO should always retain 'mitochondrial translation' as a
> pre-composed term, because the mt aspect is non-accidental.

In our current system, yes, we would probably always pre-compose this term. However, if we moved to another type of system, more like SGD's phenotype system, perhaps we would not choose to pre-compose this term. "Accidental" is a rather poor choice of words to convey your meaning as I understood it at Oregon.
The things which you don't feel need to be represented with pre-composed terms are probably not accidental either.

> > However, I think the proposal to create a new qualifier system just for
> > the purpose of a subset of the CC terms would be a really fundamental
> > change in the way annotations are made. Right now, the existing qualifiers
> > basically modify the annotation, whether a gene is really part of a
> > complex or just associated with it. But with your proposal #3, it seems
> > like you would have to use a qualifier in order to generate a "term"
> > equivalent to what exists now.
>
> I would argue that there are 3 distinct things that should not be confused:
>
> [1] the gene product
> [2] the function/process/component
> [3] the relationship between [1] and [2].
>
> For historic reasons, [3] has never been explicitly stated but is left
> implicit, except in non-default scenarios which is where the existing
> qualifiers come in.
>
> I believe there are strong benefits to clearly separating [1], [2] and [3],
> and for being explicit about [3] and having formal semantics for [3]. These
> benefits include clearer annotations, more expressive power, cleaner easier
> to maintain ontologies, and better integration between GO and other
> resources.
>
> At the moment terms such as "fully spanning the plasma membrane", "integral
> to membrane", "host cell" confuse [2] and [3].
>
> > In terms of the effects on annotation process, need for curator training,
> > as well as the things Alex mentioned about programming time to modify
> > software, I think this falls into the lots of work for very little gain
> > category.
> >
> > While I can see your point that CC is a hodgepodge, I don't think this is
> > the only issue in CC and I don't think this is a good way to fix it. At
> > the moment, I vote for option 4, leave CC as is.
>
> I think we need to plan for the future. I think that given the current
> resistance we are unlikely to move beyond [4] for a number of years. But
> that does not mean we can't have a discussion about improvements on current
> hodgepodges and how to get there.

I agree that we need to plan for the future. However, I also feel that such planning needs to consider the needs and practical concerns of annotators, and not just ontological purity. After all, GO was started by annotators because of practical needs to annotate genes.

-Karen

> > -Karen
> >
> >
> > On Thu, 6 Aug 2009, Chris Mungall wrote:
> >
> > >
> > > I'm not at all worried by the additional terms, there aren't that many.
> > > The problem is the mixing of locations with cell parts in an ad-hoc way
> > >
> > > I agree that many existing qualifiers are confusing and are used
> > > inconsistently. But it doesn't follow that any future qualifiers will be
> > > confusing or lead to inconsistency. If these qualifiers are well-defined
> > > with a clearly specified semantics it may have the opposite effect.
> > >
> > > On Aug 6, 2009, at 9:45 AM, Karen Christie wrote:
> > >
> > > > Personally, I'm not fond of idea #3 at all. We already find the
> > > > qualifiers confusing and adding more will just increase the potential
> > > > problems with accurate and consistent use of qualifiers in annotation.
> > > > In light of the multiple other places where we are embracing the
> > > > explosion of precomposed terms, I don't see the rationale to post-
> > > > compose here by combining a term from the mini ontology of qualifiers
> > > > with a cellular component term.
> > > > -Karen
> > > > On Thu, 6 Aug 2009, Chris Mungall wrote:
> > > > > On Aug 6, 2009, at 3:43 AM, Jane Lomax wrote:
> > > > > > I would also favour 3, not least because it also has the potential
> > > > > > to solve the host [component] issue.
> > > > > > I imagine it would be fairly straightforward to automatically
> > > > > > transfer existing annotations from e.g.
extrinsic to membrane to > > > > > > be gp -> qual [extrinsic] -> GO:membrane. > > > > > Yep, that's the easy part.. > > > > > > An obvious downside to this is that it would become more > > > > > > complicated to retrieve e.g. all extrinsic membrane proteins as > > > > > > the query would now involve info in the annotation as well as the > > > > > > GO term. Tools would need to be more sophisticated to handle this. > > > > > ..that's the harder part. The extra level of sophistication is not > > > > > that high, but getting 50 developers to reengineer their enrichment > > > > > tools... > > > > > > That's not necessarily a deal-breaker though. We'd also need to > > > > > > make sure the qualifiers had accessible definitions somewhere > > > > > > because the GO term definition would not be sufficient to make the > > > > > > annotation. > > > > > These could live as relations declared in the main obo file. > > > > > > Shall we put this on the agenda for the GOC meeting? > > > > > yep > > > > > I wonder if it's worth soliciting people for other similar > > > > > qualifiers that may be useful? That may be too big a can of worms > > > > > > Jane > > > > > > Midori Harris wrote: > > > > > > > For the integral/intrinsic/extrinsic to membrane terms, it > > > > > > > doesn't really matter whether we interpret CC as locations or > > > > > > > "stuff" that has mass. Even if the CC terms were "located in X" > > > > > > > rather than the thing "X", terms like "intrinsic to membrane" > > > > > > > describe spatial relations -- it's "how" they're located there, > > > > > > > exactly as Harold says. > > > > > > > I don't want to do [1], not least because it wouldn't solve the > > > > > > > intrinsic-to-membrane problem anyway. I also agree that an > > > > > > > annotation makes a "located-in" (or on, or at) statement, so the > > > > > > > ontology doesn't have to. I don't like [2] at all; it sounds > > > > > > > like a lot of work for a rather small gain. 
My vote is to work > > > > > > > towards [3], and put up with [4] until we have [3] deployed. > > > > > > > m > > > > > > > On Wed, 5 Aug 2009, Harold Drabkin wrote: > > > > > > > > This is a good point. So when we annotate, we are saying > > > > > > > > something is in or at the component. Example: a protein is in > > > > > > > > the plasma membrane. But if we use a term integral_to_plasma > > > > > > > > _membrane as an annotation, we are now adding a "how" the > > > > > > > > protein is "in/at" the plasma membrane. It's still in the > > > > > > > > membrane. It might need a new relationship to link > > > > > > > > integral_to_plasma_membrane with "plasma membrane" > > > > > > > > hd > > > > > > > > Chris Mungall wrote: > > > > > > > > > I'd like to briefly revisit this issue > > > > > > > > > We have a request for a term "Fully spanning the plasma > > > > > > > > > membrane" > > > > > > > > > https://sourceforge.net/tracker/index.php?func=detail&aid=28 > > > > > > > > > 31884&group_id=36855&atid=440764 > > > > > > > > > This a location rather than a cellular component. The > > > > > > > > > determining factor would be whether instances of the type > > > > > > > > > would have mass. A nucleus instance has mass, but it's not > > > > > > > > > clear what a "Fully spanning the plasma membrane" instance > > > > > > > > > is. > > > > > > > > > This is also the case for the intrinsic/extrinsic terms. See > > > > > > > > > the email from Alan below which I guess never made it onto > > > > > > > > > the annotation list (I guess we should probably set up a > > > > > > > > > public GO ontology discussion list distinct from GO friends > > > > > > > > > to stop people spamming the wrong lists?) > > > > > > > > > Of course these terms are useful and I'm not suggesting > > > > > > > > > getting rid of current annotations. But it is important to > > > > > > > > > be clear what the terms in CC are. 
This is especially > > > > > > > > > important as other groups start using GO terms in > > > > > > > > > cross-products. > > > > > > > > > The options as I see it are: > > > > > > > > > [1] Interpret all GO CC terms as locations. > > > > > > > > > Thus GO:0005634 ! nucleus would be interpreted as "located > > > > > > > > > in the nucleus". Note that this is not the current > > > > > > > > > interpretation as far as I see it; an instance of GO:0005634 > > > > > > > > > is an instance of a nucleus. When we have an association > > > > > > > > > between a gene product and a nucleus then we interpret this > > > > > > > > > as the gene product being localized to the nucleus. > > > > > > > > > [2] Introduce a new high level term "cell location". > > > > > > > > > The is_a hierarchy would be something (roughly) like > > > > > > > > > cell location > > > > > > > > > membrane location > > > > > > > > > extrinsic > > > > > > > > > spanning > > > > > > > > > intrinsic > > > > > > > > > fully spanning > > > > > > > > > full spanning plasma membrane > > > > > > > > > (it would be more complex with dual parentage as we have the > > > > > > > > > cross product between membrane type and spatial qualifier) > > > > > > > > > There would be additional contained_in/part_of links > > > > > > > > > following the current structure, so query results, > > > > > > > > > enrichment etc would remain roughly the same > > > > > > > > > [3] Use spatial qualifiers in annotation > > > > > > > > > Here we would actually obsolete the locational terms, and > > > > > > > > > replace them with annotation qualifiers > > > > > > > > > * extrinsic, intrinsic: membranes only > > > > > > > > > * overlaps > > > > > > > > > * fully contained by > > > > > > > > > * fully spanning > > > > > > > > > [4] Keep things as they actually are and not worry about > > > > > > > > > giving a coherent explanation as to what a cell component > > > > > > > > > is. 
From cjm at berkeleybop.org Wed Aug 12 16:31:40 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Wed, 12 Aug 2009 16:31:40 -0700
Subject: [Ontology-editors] cellular component vs cellular location
In-Reply-To:
References: <20090801141012.401235B003A@mweb1.acsu.buffalo.edu> <1A9EA1CF-3985-4877-9664-6B12D0C4BFA9@berkeleybop.org>
 <4A79D457.3010104@informatics.jax.org> <4A7AB3D1.3080107@ebi.ac.uk> <7F947877-D3E8-4C53-BDFA-5AE9748D573C@berkeleybop.org> <535CD753-ACE9-43B5-8FC3-3A524EEB2322@berkeleybop.org>
Message-ID:

On Aug 12, 2009, at 2:36 PM, Karen Christie wrote:

> On Fri, 7 Aug 2009, Chris Mungall wrote:
>
> comments inserted inline
>
>> On Aug 7, 2009, at 4:19 PM, Karen Christie wrote:
>> > In some future GO version 2, I think I would prefer a more modular system where you can do annotations by combining terms from multiple ontologies rather than having to instantiate huge numbers of terms in order to distinguish things like cytoplasmic translation vs mitochondrial translation, or the terms for function process links, or 3'-end processing of snoRNA, snRNA, tRNA, etc.
>>
>> Everyone has had the expressivity to do this for some time:
>> http://wiki.geneontology.org/index.php/Annotation_Cross_Products
>
> Just because it is allowed by the gene-association file format does NOT mean that "everyone" can do this. Many groups curate into databases, not into flat files, so my actual ability to utilize this is determined by when our group has time to redesign the database to store it and write new code for both curator and user interfaces. Considering that we have little or no need to make annotations this way versus just asking for a new term, spending person-hours on the database and software changes required to implement this is not a priority at this time.

This is unfortunate but probably true. Of course, it might help if MODs actually coordinated to develop a single annotation tool. Some groups are already moving towards using Phenote for curation, which allows these kinds of annotations. But that's another discussion; I just wanted to point out that the expressivity is there.

Also, I disagree with the pejorative characterization that this is 'slapping on' an optional modular system.
It is a logically coherent extension that maximizes the benefits and minimizes the deficits of both pre- and post-composition.

> However, I was NOT talking about slapping an optional modular system on top of our existing system which functions primarily by instantiating new terms, even when they represent crossproducts. SGD's phenotype annotations are done by combining terms from multiple smaller ontologies. Thus, sometimes I wonder if the functional annotations made with GO might be better accomplished by combining terms from multiple ontologies to make an annotation and NOT instantiating huge numbers of cross-product terms.

As I understand it, SGD uses the APO, which is an ontology of pre-coordinated phenotypes such as "colony shape" or "bud neck morphology", in combination with an ontology of qualifiers (PATO?) to post-coordinate expressions such as "bud neck morphology, abnormal". Is this correct? For example:

http://www.yeastgenome.org/cgi-bin/phenotype/phenotype.pl?observable=bud%20neck%20morphology

"wide bud neck" appears to be a free text description.

The actual list of qualifiers used appears to be fairly small: abnormal, normal, decreased, increased. Some only make sense in certain contexts - e.g. "bud neck morphology increased" would make no sense. So in many cases you are modifying a pre-coordinated phenotype term with "abnormal" or "normal".

Given that you are in favor of the post-composition approach, I wonder why SGD didn't take this to the next logical step and compose phenotype descriptions from the GO and PATO? For example, your current system allows you to compose descriptions such as "mitochondrion morphology, abnormal" from a pre-composed APO term "mitochondrion morphology". But what about properties of the mitochondrial membrane? There is no APO term for this; you would have to create one, eventually recapitulating the full subset of GO-CC that is instantiated in fungi.
Perhaps this is how the system evolved and SGD would design it differently now.

However, consider the consequences of this ideal scenario in which everyone post-composes descriptions from a smaller set of primitives. GO no longer pre-composes "mitochondrial membrane" - if you want to talk about this you must compose an expression from the intersection of: (a) membrane and (b) . If you want to describe the conductivity or morphology of the mitochondrial membrane you need a nested class expression. This is completely within the expressivity of any reasonable logic or set-based ontology formalism. (Indeed, within the GO logical definitions we have classes defined in terms of other classes that are themselves defined in another xp set -- e.g. regulation of X biosynthesis.) However, nested class expressions are hard to get right and require additional training and a good grasp of logic. Even then you are opening the door to all kinds of errors and inconsistencies. Better to centralize these decisions in the ontology.

I have a lot of experience with different phenotype ontology methodologies, from MODs to human/clinical ontologies. There is a lot to be said on the merits of pre- and post-composition, but if I had to boil it down it would come to this:

* post-composition works well where the combinations are trivial, e.g., size x anatomical_entity.
* for anything else, post-composition can lead to annotator inconsistency, logical errors and confusion. I can give lots of real examples.
* for pre-coordinated ontologies it is *vital* to make explicit how the classes are composed from simpler classes, so that you can use a reasoner to maintain the ontology.

The problem with pre-coordination in GO is not the pre-coordination itself; it is that we are only just catching up with the methods for building these pre-coordinated classes rationally (methods that were known and applied even before GO came into existence).

> In such a system, you might only need to represent translation in an ontology of processes and to annotate you'd combine a process term with a location term at the time of annotation, instead of duplicating part or all of the translation process terms multiple times, for cytoplasmic, mitochondrial, chloroplast, and whatever else comes up.

You are considering this a weakness and I consider it a strength. Because a class "mitochondrial translation" exists in GO, I know that it is possible for translation to take place in the mitochondrion, even in the absence of annotations. In some cases I may even get a PubMed ID. Furthermore, if an annotator needs to say 'positive regulation of mitochondrial translation in response to stress' they don't need to have extensive training in description logics in order to compose the logically correct expression using the correct relations and boolean connectives.

There are limitations in the current implementation because of how GO evolved, but we are near to overcoming them. For example, currently the logical definitions live in external xp files, but these will soon move into the editors' version of GO. This will mean we can use the reasoner to automate the placement of the term within the DAG, removing most of the objections about ontology bloat.
Furthermore, as an optional bonus, any annotator using a modern ontology-aware tool such as Phenote can choose to build class expressions compositionally, and have the compositional expression automatically replaced by the equivalent pre-composed term using the logical definition. For example, first by choosing "mitochondrion" as the main term and then refining this with the relationship . By inspecting the logical definition in GO, the tool can swap this for the equivalent "mitochondrial translation" term.

> Before I move on to other parts of this discussion, let me make it clear that I am not advocating we go in that direction now.

Good. I think it is important to distinguish discussion of the optimal system from the individual practical steps to get there.

> While there are clearly some issues with the existing system in terms of managing the size of GO, there would be other issues in other systems. I just sometimes wonder if there might be a better way than instantiating 7 terms to represent the various different RNA substrates of "polyadenylation-dependent RNA catabolism", where basically there is one process and it acts on all types of RNA in the nucleus, but the only way to be able to see that that process acts on the various types of RNA a biologist might be interested in is to instantiate a bunch of separate terms.

I would say that if the processes were identical in all respects other than having a different input then we should not declare subtypes for every input. However, if the processes differ in any respect then it is a good candidate for declaring a term.

>> But I would argue GO should always retain 'mitochondrial translation' as a pre-composed term, because the mt aspect is non-accidental.
>
> In our current system, yes, we would probably always pre-compose this term. However, if we moved to another type of system, more like SGD's phenotype system, perhaps we would not choose to precompose this term.
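
As an aside on the tool behaviour described earlier in this message (swapping a composed expression for the equivalent pre-composed term by consulting logical definitions), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Expression` type, the `occurs_in` relation name, the choice of "translation" as the genus, and the two-entry definition table are hypothetical and are not taken from GO's xp files or from Phenote.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expression:
    """A post-composed class expression: a genus refined by one relation."""
    genus: str
    relation: str
    filler: str


# Hypothetical logical definitions: pre-composed label -> equivalent expression.
# The relation name "occurs_in" is an assumption made for this sketch.
LOGICAL_DEFINITIONS = {
    "mitochondrial translation": Expression("translation", "occurs_in", "mitochondrion"),
    "cytoplasmic translation": Expression("translation", "occurs_in", "cytoplasm"),
}

# Reverse index so a composed expression can be looked up directly.
PRECOMPOSED_BY_EXPRESSION = {
    expression: label for label, expression in LOGICAL_DEFINITIONS.items()
}


def resolve(expression: Expression) -> Optional[str]:
    """Return the equivalent pre-composed label, or None if no term is defined."""
    return PRECOMPOSED_BY_EXPRESSION.get(expression)
```

An annotator composing `Expression("translation", "occurs_in", "mitochondrion")` would get the existing pre-composed label back, while a combination with no logical definition returns `None` and stays post-composed. A real tool would need a reasoner rather than exact matching, since equivalent expressions can be written in more than one way.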
Given the opportunity to develop an ontology like GO from scratch with modern tools, I would still choose to go the pre-composed route for non-trivial combinatorial terms. The difference is that I would insist that classes were explicitly composed from simpler classes at the time of creation, thus prospectively maintaining the cross-product definitions, rather than doing it retrospectively as we are currently doing. For combinatorial classes such as {abnormal,normal,incr,decr} x phenotype, where all combinations are valid biological classes, I think post-composition makes sense.

> "Accidental" is a rather poor choice of words to convey your meaning as I understood it at Oregon. The things which you don't feel need to be represented with pre-composed terms are probably not accidental either.

I agree this is a poor choice of term.

>> > However, I think the proposal to create a new qualifier system just for the purpose of a subset of the CC terms would be a really fundamental change in the way annotations are made. Right now, the existing qualifiers basically modify the annotation, whether a gene is really part of a complex or just associated with it. But with your proposal #3, it seems like you would have to use a qualifier in order to generate a "term" equivalent to what exists now.
>>
>> I would argue that there are 3 distinct things that should not be confused:
>> [1] the gene product
>> [2] the function/process/component
>> [3] the relationship between [1] and [2].
>>
>> For historic reasons, [3] has never been explicitly stated but is left implicit, except in non-default scenarios which is where the existing qualifiers come in.
>>
>> I believe there are strong benefits to clearly separating [1], [2] and [3], and for being explicit about [3] and having formal semantics for [3].
>> These benefits include clearer annotations, more expressive power, cleaner easier to maintain ontologies, and better integration between GO and other resources.
>>
>> At the moment terms such as "fully spanning the plasma membrane", "integral to membrane", "host cell" confuse [2] and [3].
>>
>> > In terms of the effects on annotation process, need for curator training, as well as the things Alex mentioned about programming time to modify software, I think this falls into the lots of work for very little gain category.
>> >
>> > While I can see your point that CC is a hodgepodge, I don't think this is the only issue in CC and I don't think this is a good way to fix it. At the moment, I vote for option 4, leave CC as is.
>>
>> I think we need to plan for the future. I think that given the current resistance we are unlikely to move beyond [4] for a number of years. But that does not mean we can't have a discussion about improvements on current hodgepodges and how to get there.
>
> I agree that we need to plan for the future. However, I also feel that such planning needs to consider the needs and practical concerns of annotators, and not just ontological purity. After all, GO was started by annotators because of practical needs to annotate genes.

I'm only interested in ontological purity as a means to an end. One of those ends is having consistent annotations that accurately represent the biology in as precise a way as possible, and that still make sense in ten years or more. I'm not sure that having an ontology that we admit is a hodgepodge is a good way of achieving those ends.
We end up with bizarre constructs such as "colocalized_with extrinsic to plasma membrane":

http://amigo.geneontology.org/cgi-bin/amigo/gp-assoc.cgi?gp=dictyBase:DDB_G0285161

> -Karen

From aji at ebi.ac.uk Wed Aug 12 17:12:26 2009
From: aji at ebi.ac.uk (Amelia Ireland)
Date: Wed, 12 Aug 2009 17:12:26 -0700
Subject: [Ontology-editors] new monthly reports
Message-ID:

Hi guys,

I've written a new version of the monthly report script, which documents changes in the ontology.
I've uploaded a couple of sample files into go/scratch/reports/ (accessible via http://geneontology.org/scratch/reports/ ) ; if you're interested, please have a look, and let me know if they seem comprehensible or if you have any suggestions. I'm considering doing an html version, but I don't know how much demand there would be for it... what do people think? If the files aren't there when you look, it might be that the server hasn't updated - they are in cvs now, though. Cheers, Amelia. -- Amelia Ireland GO Editorial Office http://www.berkeleybop.org || http://www.ebi.ac.uk Boycott Trader Joe's Red List seafood: http://traitorjoe.com/ From midori at ebi.ac.uk Thu Aug 13 02:20:27 2009 From: midori at ebi.ac.uk (Midori Harris) Date: Thu, 13 Aug 2009 10:20:27 +0100 (BST) Subject: [Ontology-editors] new monthly reports In-Reply-To: References: Message-ID: On Wed, 12 Aug 2009, Amelia Ireland wrote: > Hi guys, > > I've written a new version of the monthly report script, which documents > changes in the ontology. I've uploaded a couple of sample files into > go/scratch/reports/ (accessible via http://geneontology.org/scratch/reports/ > ) ; if you're interested, please have a look, and let me know if they seem These look awesome. It's good to have the reports on the way back. Thanks! > comprehensible or if you have any suggestions. I'm considering doing an html > version, but I don't know how much demand there would be for it... what do > people think? I'm not aware of any demand ... but then, I'm not sure how widely known the reports are in the first place. m > If the files aren't there when you look, it might be that the server hasn't > updated - they are in cvs now, though. > > Cheers, > Amelia. 
> > -- > Amelia Ireland > GO Editorial Office > http://www.berkeleybop.org || http://www.ebi.ac.uk > Boycott Trader Joe's Red List seafood: http://traitorjoe.com/ > > > > > > > > > _______________________________________________ > Ontology-editors mailing list > Ontology-editors at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/ontology-editors From sart2 at gen.cam.ac.uk Fri Aug 14 01:47:35 2009 From: sart2 at gen.cam.ac.uk (Susan Tweedie) Date: Fri, 14 Aug 2009 09:47:35 +0100 Subject: [Ontology-editors] [Web-presence] new monthly reports In-Reply-To: References: Message-ID: <960BE2DE-B302-4712-9DC4-E2550B280949@gen.cam.ac.uk> Hi Amelia I think they are great. Plain text will do me. Susan On 13 Aug 2009, at 13/Aug/2009 01:12:26, Amelia Ireland wrote: > Hi guys, > > I've written a new version of the monthly report script, which > documents changes in the ontology. I've uploaded a couple of sample > files into go/scratch/reports/ (accessible via http://geneontology.org/scratch/reports/ > ) ; if you're interested, please have a look, and let me know if > they seem comprehensible or if you have any suggestions. I'm > considering doing an html version, but I don't know how much demand > there would be for it... what do people think? > > If the files aren't there when you look, it might be that the server > hasn't updated - they are in cvs now, though. > > Cheers, > Amelia. 
>
> --
> Amelia Ireland
> GO Editorial Office
> http://www.berkeleybop.org || http://www.ebi.ac.uk
> Boycott Trader Joe's Red List seafood: http://traitorjoe.com/
>
> _______________________________________________
> Web-presence-working-group mailing list
> Web-presence-working-group at geneontology.org
> http://fafner.stanford.edu/mailman/listinfo/web-presence-working-group

Susan Tweedie
FlyBase GO curator
s.tweedie at gen.cam.ac.uk

From jdeegan at ebi.ac.uk Fri Aug 14 09:19:11 2009
From: jdeegan at ebi.ac.uk (Jennifer Deegan (nee Clark))
Date: Fri, 14 Aug 2009 17:19:11 +0100
Subject: [Ontology-editors] disjoint links
Message-ID: <4A858E7F.5020209@ebi.ac.uk>

Hi,

Could everybody keep a close eye on the disjoint links in the live file when committing back to the repository? OBO-Edit has developed an intermittent bug that causes it not to save out the disjoint links, but we are not able to reproduce the error. If anybody spots it happening it would be great if they could send details to the OBO-Edit working group.

Thanks,

Jennifer

From cjm at berkeleybop.org Fri Aug 14 18:59:31 2009
From: cjm at berkeleybop.org (Chris Mungall)
Date: Fri, 14 Aug 2009 18:59:31 -0700
Subject: [Ontology-editors] displaying has_part
In-Reply-To:
References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org>
Message-ID: <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org>

[answering on ontology-editors list as this relates mostly to GO at the moment. Context for the discussion here: https://sourceforge.net/mailarchive/forum.php?forum_name=geneontology-oboedit-working-group&max_rows=25&style=ultimate&viewmonth=200908 ]

On Aug 14, 2009, at 2:37 PM, Karen Christie wrote:

> To me "deeply unintuitive" is an enormous problem, regardless of
> whether you call it a bug.

I agree unintuitive is a problem, which is why we are restricting this to the editors file and the gene_ontology_ext for now.
The reason the display is unintuitive is because the relation is unintuitive, at least to people used to the GO. There is in fact no completely satisfactory solution to the display problem.

> Personally, I do not want to see the input-output relationship
> determine the hierarchy of display of the terms.

I'm afraid that's how every single tool that displays ontologies works, so you are out of luck in a big way. What you are advocating is not a single change in oboedit but an across-the-board modification of everyone's ontology display tools.

> It does NOT make sense when viewing the terms linked by has_part and
> will be an even bigger problem when we need to display this
> relationship to users.

As far as denormalized-tree type displays such as the oboedit ontology tree editor are concerned, I agree, and we need to constructively come up with solutions.

> I also don't see why you say that parent-child doesn't apply to
> has_part.

I am saying the terminology is confusing and we should abandon it. See below for reasons.

> Our documentation states that the has_part relationship is the
> inverse of the is_part relationship.

This is not quite correct, and should be corrected. They are only inverses on the instance level. I'll explain this further below.

> For the biological examples so far, both the spliceosome ones in
> component that are already in and also a has_part relationship that
> I plan to add in process (see go/scratch/RNAsurveillance.obo), it
> makes sense to still define a parent-child relationship, just in the
> opposite direction of part_of. For part_of, one can say that the
> smaller/more granular/child thing is part_of the larger/less
> granular/parent. For has_part, the converse, one can say that the
> larger/less granular/parent has the smaller/more granular/child
> thing as a part.

But be aware that in the first case the subject of the relationship is the child and the target/object is the parent.
In the second case the subject is the parent and the target/object is the child. This reversal is bound to confuse people used to equating the two. This is why I advocate abandoning the terminology. That and the fact that there is no intuitive mapping of parent/child to other relations.

Furthermore, the fact that we choose to make this terminological distinction has absolutely no bearing on the behavior of all tools which will draw things like this

   target
      [rel] subject

in denormalized tree displays

and

   target
     ^
     |
   subject

in graph displays

> Hiding the has_part relationship so that only ontology editors see
> it is also not an acceptable permanent solution. Eurie and Mike are
> not pleased that the introduction of the has_part relationship, and
> its absence from the main file, means that SGD has lost information
> from our displays that used to be present. So, we need to be
> displaying has_part to users, and in a way that makes sense.

I'm sorry that you're not happy with the introduction of the has_part relationship. It was discussed for some time on this list prior to its introduction, as was the plan regarding gene_ontology_ext. I'm not sure why you, Mike or Eurie have not mentioned this until now. From the above, I'm not quite sure what it is SGD is not happy about - the introduction of the has_part relationship or its absence from the main GO file?

If the latter then SGD is free to use gene_ontology_ext. Be aware you will have to modify your software to avoid the unintuitive display seen in oboedit. You will also have to make sure the software developers understand the semantics of ontology relationships. With the introduction of has_part it's untenable to carry on using basic DAG traversal algorithms. I'm very happy to talk with the software developers of the various MODs and tool developers to help them understand this (I've already done this with a few). I also appreciate that we could do with more extensive documentation here.
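
As a rough illustration of the remark about DAG traversal, here is a minimal Python sketch (not taken from any GO tool). The edge list is a hand-written toy graph using term names from this thread, and the function name `ancestors` is hypothetical. It contrasts a traversal that follows every edge from subject to target with one restricted to is_a and part_of; treating only those two relations as safe for propagating annotations is an assumption made for this sketch.

```python
from collections import defaultdict

# Toy edges (subject, relation, target); term names come from this thread,
# but the graph itself is hand-written for illustration only.
EDGES = [
    ("U2-type prespliceosome", "is_a", "U2-type spliceosomal complex"),
    ("U2-type prespliceosome", "has_part", "U1 snRNP"),
    ("U1 snRNP", "is_a", "small nuclear ribonucleoprotein complex"),
    ("small nuclear ribonucleoprotein complex", "is_a", "nuclear part"),
    ("nuclear part", "part_of", "nucleus"),
]


def ancestors(term, edges, relations=None):
    """Return terms reachable from `term` along subject -> target edges.

    relations=None follows every edge regardless of its relation (the
    "basic DAG traversal"); otherwise only the listed relations are used.
    """
    outgoing = defaultdict(list)
    for subject, relation, target in edges:
        if relations is None or relation in relations:
            outgoing[subject].append(target)
    seen, stack = set(), [term]
    while stack:
        for target in outgoing[stack.pop()]:
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


# Naive: has_part is treated like any other "upward" edge.
naive = ancestors("U2-type prespliceosome", EDGES)

# Relation-aware: assumption for this sketch that only is_a and part_of
# are followed when propagating annotations.
safe = ancestors("U2-type prespliceosome", EDGES, relations={"is_a", "part_of"})

print(sorted(naive))
print(sorted(safe))
```

With the naive traversal, a gene product annotated to U2-type prespliceosome would also be returned by a query on U1 snRNP, which the has_part relationship does not license.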
> So, I still contend that the fact that the has_part relationship is
> inverting the display of terms from the way that makes sense is NOT
> OK.

- we knew the introduction of has_part would contradict assumptions and confuse people, which is why its introduction was delayed for so long and it was introduced very carefully
- all software constructed according to assumptions surrounding the original GO will display has_part in an unintuitive way in tree-type displays. It will also produce the wrong answers to queries.
- no one is saying this is a good thing, that's just how it is.
- this is why we only expose it in gene_ontology_ext, for sufficiently advanced tools
- this solution is not ideal for GO ontology developers such as yourself
- we should constructively work towards better solutions

One improvement is to simply not show the has_part relationship in denormalized tree displays like the ontology tree editor. I'm not saying this is a panacea or that it is 100% perfect. It's just a practical, simple, achievable step that doesn't involve completely rewriting display algorithms. That's all.

Let's consider other options that may involve some software rewriting (for everyone, not just OE).
I presume you would prefer something that preserves a visual structure with the broader entity at the top and the narrower at the bottom, such as:

U2-type spliceosomal complex
   U2-type prespliceosome
      U1 snRNP

(my assumptions may be wrong, correct me if they are)

Here are the transformation steps required to get tools to display things in this way:

Given the ontology contains the relationships:

   (all) U2-type prespliceosome [has_part] (some) U1 snRNP
   (all) U2-type prespliceosome [is_a] U2-type spliceosomal complex

the tool has to first infer from the relationship:

   (all) U2-type prespliceosome [has_part] (some) U1 snRNP

the inverse relationship:

   U1 snRNP [part_of_all] U2-type prespliceosome

****note the relation****

We haven't explicitly named the type-level inverse of has_part until now. I'm using [part_of_all] here to indicate the unusual inverted all-some direction but am open to other names. The semantics are:

   X part_of_all Y <-> every instance of Y (instance level) has_part some X

Hopefully everyone understands why part_of and has_part are not inverses on the type level:

   (all) U2-type prespliceosome [has_part] (some) U1 snRNP -- TRUE
   (all) U1 snRNP [part_of] (some) U2-type prespliceosome -- FALSE

If the tool is then configured to hide has_part but to show the inferred inverse then default display algorithms will show:

U2-type spliceosomal complex
   [is_a] U2-type prespliceosome
      [part_of_all] U1 snRNP

Is this the kind of thing you are getting at? If not some diagrams would help me. It's also easy for me to come down to SGD and discuss this with a whiteboard to help.

Assuming that it is -- with a bit of work it would be possible to get OE to show things this way. It would be considerably more work to do this across the board for all tools that visualize the GO in some way.

I would strongly advocate that even given sufficient developer hours it is better *not* to display the ontology in this way at all.
It perhaps looks more comforting, but people will make the same comforting assumptions that no longer hold. For example, it looks like there is some kind of transitive relationship between U1 snRNP and U2-type spliceosomal complex ***which there is not***. It looks like the true path rule might hold. ***it does not***. Queries for U2-type spliceosomal complex should ***not*** return gene products localized to U1 snRNP complex.

I'm happy to be overruled by majority vote here and consider recommending this sort of display for all tools. I've had this discussion a few times before with others; most people start off wanting to retain the comforting broad/narrow visual structure, but on understanding the semantics of has_part change their mind.

There are some alternatives for tree type displays. One is something like this:

nucleus
  [p] nuclear part
    [i] small nuclear ribonucleoprotein complex
      [i] U1 snRNP [part_of_all : U2-type spliceosome, penta-snRNP complex, ...]
      [i] U2 snRNP [part_of_all : U2-type spliceosome, penta-snRNP complex, ...]

This maintains the convention of having the direction of implication flow from bottom right to top left, i.e. the true path rule. Every U1 snRNP is part_of the nucleus.

I think it does make sense to show has_part in graph displays without introducing any kind of relation transformation. It might be possible to change the layout algorithm such that vertical layout correlates with relative size yet the arrows still accurately depict the ontology relationships. I include an example at the end of this file (although line crossing will always be a massive problem here).

Hopefully this is at least partly convincing you that this is not a bug and that oboedit and other tools are just accurately depicting the relationships in a consistent manner, and that solutions to the fact that this is unintuitive are non-trivial.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedGraphic.pdf
Type: application/pdf
Size: 15389 bytes
Desc: not available
URL:
-------------- next part --------------
>
> -Karen
>
> On Fri, 14 Aug 2009, Chris Mungall wrote:
>
>>
>> Answering on list:
>>
>> The display is indeed deeply unintuitive, but it's not a bug,
>> because it's completely consistent with how OE displays
>> relationships.
>>
>> The OTE shows relationships like this
>>
>>
>> Target
>>    [relation] Subject
>>
>> Which means that you have to read sentences from bottom right to
>> top left. This is true of all relationships in the OTE.
>>
>> Btw, the parent-child terminology doesn't really make any sense
>> once we go beyond an is_a/part_of graph. It makes complete
>> intuitive sense for simple graphs but leads to confusion when used
>> with other relationships. I'd say about a third of folks map
>> parent/child to subject/target, about a third of folks map
>> parent/child to bigger-entity/smaller-entity and the rest use
>> parent/child exclusively for is_a relationships.
>>
>> Given that it is unintuitive, how should OE make it simpler? One
>> option would be to add a new feature to allow certain relations to be
>> inverted in the displays. This seems like it might work. But in the
>> long run this makes things even more confusing as you are reversing
>> the direction of implication.
>>
>> Consider the two sentences currently in GO, first written in a
>> graphics neutral natural language subject-relation-target form:
>>
>> (every) catalytic step 1 spliceosome has_part (some) U5 snRNP
>> (every) U5 snRNP is_a small nuclear ribonucleoprotein complex
>>
>> From this it follows:
>>
>> (every) catalytic step 1 spliceosome has_part (some) small nuclear
>> ribonucleoprotein complex
>>
>> If we draw it graphically:
>>
>> small nuclear ribonucleoprotein complex
>>    ^
>>    |
>> [is_a]
>>    |
>> U5 snRNP
>>    ^
>>    |
>> [has_part]
>>    |
>> catalytic step 1 spliceosome
>>
>> The inferences become clear as they follow the arrows.
See the GO >> documentation: >> >> http://geneontology.org/GO.ontology-ext.relations.shtml#haspart >> >> If however we invert the arrow to make U5 snRNP the leaf of this >> particular subgraph it obfuscates this. This is particularly true >> of longer chains. >> >> Some steps we can make in order to make all this more palatable: >> >> - omit has_part from the basic public GO (done). Only trained >> curators see these at the moment >> >> - change the OTE such that relations have to be explicitly included >> rather than included by default. Most other ontology editors refuse >> to show anything other than subclass in the equivalent of the OTE. >> This is too extreme but there is really no value in showing >> has_part in the OTE, it's just confusing >> >> - discourage use of parent/child terminology for anything other >> than is_a/part_of (really the parent editor should be called the >> relationship editor) >> >> Any other ideas? >> >> In general this is part of a wider presentation problem. The >> majority of biologists I talk to think *all* the arrows in ontology >> diagrams are labeled in the wrong direction the first time they >> encounter ontologies (particularly the develops_from relation). I >> don't know if that's the experience of folks here. >> >> On Aug 14, 2009, at 11:04 AM, Karen Christie wrote: >> >>> Hi, >>> My top bug is the fact that the has_part relationships are >>> inverting the >>> appropriate parent-child relationships of terms connected by >>> has_part in >>> all components. >>> https://sourceforge.net/tracker/?func=detail&aid=2837799&group_id=36855&atid=418257 >>> thanks, >>> -Karen >>> ------------------------------------------------------------------------------ >>> Let Crystal Reports handle the reporting - Free Crystal Reports >>> 2008 30-Day >>> trial. Simplify your report design, integration and deployment - >>> and focus on >>> what you do best, core application coding. Discover what's new with >>> Crystal Reports now. 
http://p.sf.net/sfu/bobj-july >>> _______________________________________________ >>> Geneontology-oboedit-working-group mailing list >>> Geneontology-oboedit-working-group at lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/geneontology-oboedit-working-group >> > From cjm at berkeleybop.org Thu Aug 20 08:41:26 2009 From: cjm at berkeleybop.org (Chris Mungall) Date: Thu, 20 Aug 2009 08:41:26 -0700 Subject: [Ontology-editors] [Gohelp] GO synonyms In-Reply-To: References: <200908171308.n7HD8e0h017740@meatloaf.Stanford.EDU> Message-ID: If OE does not check for these I can put in a check in the ontology publishing pipeline to catch these upstream. On Aug 20, 2009, at 8:07 AM, Midori Harris wrote: > Dear Dr. Stenetorp, > > Thank you for these additional comments. I've interspersed specific > responses below. > > On Mon, 17 Aug 2009, ninjin at nada.kth.se wrote: > >> Date: Mon, 17 Aug 2009 06:08:40 -0700 >> >> contactName: Pontus Stenetorp >> contactEmail: ninjin at nada.kth.se >> contactText: Dear GO, >> >> I have continued working on my project and found some more oddities >> in the database. Currently my sanity check for synonyms has picked >> up the following for the latest database. >> >> 'trachea development' is an exact synonym for 'GO:0007424' and the >> name of 'GO:0060438'. In accordance to my earlier comment I believe >> this should not be allowed. Might it be that in the literature it >> is used as a shorthand for 'GO:0007424'? > > I think it's simply an error; I've removed the synonym from GO: > 0007424. > >> Now to less intriguing things I have picked up. The following GO- >> tags has their own name as an exact synonym. I am not sure whether >> this is allowed/intended or not. But I am attaching the list of the >> tags where this occurs. > > Thanks; I've removed these duplicates (I suppose they're not > actively a problem, but they are entirely redundant). 
>
>> I have also noticed that the constants for synonym_type_id have
>> changed some time recently. They are now:
>>
>> EXACT: 17
>> ALT_ID: 20
>> BROAD: 38
>> NARROW: 42
>> RELATED: 86
>>
>> About a month ago all the numbers were the current ones minus one:
>>
>> EXACT: 16
>> ALT_ID: 19
>> BROAD: 37
>> NARROW: 41
>> RELATED: 85
>>
>> I have tried to find these constants documented somewhere, as I
>> mentioned earlier, but I have failed to do so. I really would like
>> to know if there is a page with the exact numbers, not the meaning
>> of the synonym types themselves.
>
> I think these are just internal identifiers for the synonym scope
> tags, but I'll pass the question about why they changed to the
> programmers.
>
>> Sorry about only mentioning inconsistencies whenever I contact you.
>
> No problem; it brings them to our attention so we can fix them.
>
> Best regards,
> Midori
>
>> P.S
>> Would it be appreciated if I attempted to supply some SQL in order
>> for the database to check its own integrity in regard of these
>> things?
>
> Probably yes; it certainly wouldn't hurt. Thanks!
>
>
> _______________________________________________
> Gohelp mailing list
> Gohelp at geneontology.org
> http://fafner.stanford.edu/mailman/listinfo/gohelp
>

From midori at ebi.ac.uk Thu Aug 20 08:49:01 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Thu, 20 Aug 2009 16:49:01 +0100 (BST)
Subject: [Ontology-editors] [Gohelp] GO synonyms
In-Reply-To:
References: <200908171308.n7HD8e0h017740@meatloaf.Stanford.EDU>
Message-ID:

OE checks for cases where two terms have the same EXACT synonym, or where one term's name is the same as another term's exact synonym. There's a feature request open to have OE catch cases where a term has its own name as a synonym:

https://sourceforge.net/tracker/?func=detail&aid=1925380&group_id=36855&atid=418260

I suspect sometimes editors forget to run the existing check ...
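
The checks described in this message could be sketched along the following lines. This is an illustration only, not how OBO-Edit or the publishing pipeline implements them; the record layout, the function name `synonym_problems`, the case-insensitive comparison, the placeholder term name, and the made-up term `X:0000001` are all assumptions for the example.

```python
from collections import defaultdict

# Hypothetical records: (term id, term name, EXACT synonyms).
# The name given for GO:0007424 is a placeholder, and X:0000001 is a
# made-up term used to show the "own name as synonym" case.
TERMS = [
    ("GO:0007424", "placeholder name for GO:0007424", ["trachea development"]),
    ("GO:0060438", "trachea development", []),
    ("X:0000001", "made-up term", ["made-up term"]),
]


def synonym_problems(terms):
    """Report exact-synonym clashes as (kind, label, term ids) tuples.

    Labels are compared case-insensitively (an assumption of this sketch).
    """
    users = defaultdict(set)  # lowercased label -> ids of terms using it
    problems = []
    for term_id, name, exact_synonyms in terms:
        users[name.lower()].add(term_id)
        for synonym in exact_synonyms:
            if synonym.lower() == name.lower():
                # A term listing its own name as an exact synonym.
                problems.append(("own-name-synonym", synonym.lower(), (term_id,)))
            else:
                users[synonym.lower()].add(term_id)
    # A label used as name or exact synonym by more than one term.
    for label in sorted(users):
        ids = tuple(sorted(users[label]))
        if len(ids) > 1:
            problems.append(("shared-label", label, ids))
    return problems


for problem in synonym_problems(TERMS):
    print(problem)
```

A check of this kind could run either in the editor or upstream in a publishing step; the sketch does not assume one or the other.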
m On Thu, 20 Aug 2009, Chris Mungall wrote: > > If OE does not check for these I can put in a check in the ontology > publishing pipeline to catch these upstream. > > On Aug 20, 2009, at 8:07 AM, Midori Harris wrote: > >> Dear Dr. Stenetorp, >> >> Thank you for these additional comments. I've interspersed specific >> responses below. >> >> On Mon, 17 Aug 2009, ninjin at nada.kth.se wrote: >> >>> Date: Mon, 17 Aug 2009 06:08:40 -0700 >>> >>> contactName: Pontus Stenetorp >>> contactEmail: ninjin at nada.kth.se >>> contactText: Dear GO, >>> >>> I have continued working on my project and found some more oddities in the >>> database. Currently my sanity check for synonyms has picked up the >>> following for the latest database. >>> >>> 'trachea development' is an exact synonym for 'GO:0007424' and the name of >>> 'GO:0060438'. In accordance to my earlier comment I believe this should >>> not be allowed. Might it be that in the literature it is used as a >>> shorthand for 'GO:0007424'? >> >> I think it's simply an error; I've removed the synonym from GO:0007424. >> >>> Now to less intriguing things I have picked up. The following GO-tags has >>> their own name as an exact synonym. I am not sure whether this is >>> allowed/intended or not. But I am attaching the list of the tags where >>> this occurs. >> >> Thanks; I've removed these duplicates (I suppose they're not actively a >> problem, but they are entirely redundant). >> >>> I have also noticed that the constants for synonym_type_id has changed >>> some time recently. It is now. >>> >>> EXACT: 17 >>> ALT_ID: 20 >>> BROAD: 38 >>> NARROW: 42 >>> RELATED: 86 >>> >>> While it used to be about a month ago all numbers were the current ones >>> minus one. >>> >>> EXACT: 16 >>> ALT_ID: 19 >>> BROAD: 37 >>> NARROW: 41 >>> RELATED: 85 >>> >>> I have tried to find these constants documented somewhere, as I mentioned >>> earlier. 
But I have failed to do so, I really would like to know if there >>> is a page with the exact numbers, not the meaning of the synonym types >>> themselves. >> >> I think these are just internal identifiers for the synonym scope tags, but >> I'll pass the question about why they changed to the programmers. >> >>> Sorry about only mentioning inconsistencies whenever I contact you. >> >> No problem; it brings them to our attention so we can fix them. >> >> Best regards, >> Midori >> >>> P.S >>> Would it be appreciated if I attempted to supply some SQL in order to for >>> the database to check its own integrity in regard of these things? >> >> Probably yes; it certainly wouldn't hurt. Thanks! >> >> >> _______________________________________________ >> Gohelp mailing list >> Gohelp at geneontology.org >> http://fafner.stanford.edu/mailman/listinfo/gohelp > From jane at ebi.ac.uk Fri Aug 21 03:35:06 2009 From: jane at ebi.ac.uk (Jane Lomax) Date: Fri, 21 Aug 2009 11:35:06 +0100 Subject: [Ontology-editors] [Web-presence] new monthly reports In-Reply-To: References: Message-ID: <4A8E785A.5000900@ebi.ac.uk> Great! I'd like an html page to allow me to search all the reports at once e.g. for a specific id. Otherwise plain text is fine. Jane Amelia Ireland wrote: > Hi guys, > > I've written a new version of the monthly report script, which > documents changes in the ontology. I've uploaded a couple of sample > files into go/scratch/reports/ (accessible via > http://geneontology.org/scratch/reports/ ) ; if you're interested, > please have a look, and let me know if they seem comprehensible or if > you have any suggestions. I'm considering doing an html version, but I > don't know how much demand there would be for it... what do people think? > > If the files aren't there when you look, it might be that the server > hasn't updated - they are in cvs now, though. > > Cheers, > Amelia. 
>
> --
> Amelia Ireland
> GO Editorial Office
> http://www.berkeleybop.org || http://www.ebi.ac.uk
> Boycott Trader Joe's Red List seafood: http://traitorjoe.com/
>
> _______________________________________________
> Web-presence-working-group mailing list
> Web-presence-working-group at geneontology.org
> http://fafner.stanford.edu/mailman/listinfo/web-presence-working-group

--
Dr Jane Lomax
GO Editorial Office
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridgeshire, UK
CB10 1SD
p: +44 1223 492516
f: +44 1223 494468

From midori at ebi.ac.uk Fri Aug 21 08:20:02 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Fri, 21 Aug 2009 16:20:02 +0100 (BST)
Subject: [Ontology-editors] nitrogen and ChEBI xps
Message-ID:

Hi Chris,

Could you take a look at SF 2841833? It's about nitrogen compounds and bp-xp-chebi newlinks.

https://sourceforge.net/tracker/?func=detail&aid=2841833&group_id=36855&atid=440764

(ccing list mainly for archiving, but anyone else is welcome to jump in)

thanks,
m

From midori at ebi.ac.uk Fri Aug 21 08:36:11 2009
From: midori at ebi.ac.uk (Midori Harris)
Date: Fri, 21 Aug 2009 16:36:11 +0100 (BST)
Subject: [Ontology-editors] CC organization just won't go away (q about catabolism)
Message-ID:

Hi fellow ontology developers,

This isn't urgent, but we should think about it ... in the CC organization work, we briefly considered whether to link catabolism and disassembly. From the notes we seemed to be leaning away from linking, but I'm not sure it was quite completely resolved.
https://sourceforge.net/tracker/?func=detail&aid=2841836&group_id=36855&atid=440764
http://wiki.geneontology.org/index.php/Cellular_component_processes

m

From kchris at genome.stanford.edu Mon Aug 24 10:36:36 2009
From: kchris at genome.stanford.edu (Karen Christie)
Date: Mon, 24 Aug 2009 10:36:36 -0700 (PDT)
Subject: [Ontology-editors] displaying has_part
In-Reply-To: <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org>
References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org>
Message-ID:

Hi Chris,

Sorry it took so long to get back to you on this. I was sick half of last week and there was a lot to go through in your email. Comments inline.

-Karen

On Fri, 14 Aug 2009, Chris Mungall wrote:

>
> [answering on ontology-editors list as this relates mostly to GO at the
> moment. Context for the discussion here:
> https://sourceforge.net/mailarchive/forum.php?forum_name=geneontology-oboedit-working-group&max_rows=25&style=ultimate&viewmonth=200908 ]
>
> On Aug 14, 2009, at 2:37 PM, Karen Christie wrote:
>
> > To me "deeply unintuitive" is an enormous problem, regardless of whether
> > you call it a bug.
>
> I agree unintuitive is a problem, which is why we are restricting this to
> the editors file and the gene_ontology_ext for now. The reason the display
> is unintuitive is because the relation is unintuitive, at least to people
> used to the GO. There is in fact no completely satisfactory solution to the
> display problem.
>
> > Personally, I do not want to see the input-output relationship determine
> > the hierarchy of display of the terms.
>
> I'm afraid that's how every single tool that displays ontologies works, so
> you are out of luck in a big way. What you are advocating is not a single
> change in oboedit but an across-the-board modification of everyone's
> ontology display tools.
I thought we started GO because biologists needed something to annotate, and that we started making our own tool because existing ontology tools did not meet our needs, so I don't see why the fact that other ontology tools display something in a particular way should automatically override the needs of biologists to see something biologically sensible.

> > It does NOT make sense when viewing the terms linked by has_part and will
> > be an even bigger problem when we need to display this relationship to
> > users.
>
> As far as denormalized-tree type displays such as the oboedit ontology tree
> editor are concerned, I agree, and we need to constructively come up with
> solutions.
>
> > I also don't see why you say that parent-child doesn't apply to has_part.
>
> I am saying the terminology is confusing and we should abandon it. See below
> for reasons.
>
> > Our documentation states that the has_part relationship is the inverse of
> > the is_part relationship.
>
> This is not quite correct, and should be corrected. They are only inverses
> on the instance level. I'll explain this further below.
>
> > For the biological examples so far, both the spliceosome ones in component
> > that are already in and also a has_part relationship that I plan to add in
> > process (see go/scratch/RNAsurveillance.obo), it makes sense to still
> > define a parent-child relationship, just in the opposite direction of
> > part_of. For part_of, one can say that the smaller/more granular/child
> > thing is part_of the larger/less granular/parent. For has_part, the
> > converse, one can say that the larger/less granular/parent has the
> > smaller/more granular/child thing as a part.
>
> But be aware that in the first case the subject of the relationship is the
> child and the target/object is the parent. In the second case the subject is
> the parent and the target/object is the child. This reversal is bound to
> confuse people used to equating the two.
This is why I advocate abandoning > the terminology. That and the fact that there is no intuitive mappings of > parent/child to other relations. I am completely aware that has_part reverses the direction of the relationship as compared to is_a and part_of. However, it seems to me that the parent/child terminology and the subject/target terminology describe different things. Parent/child seems to convey information about the granularity, i.e. the child is smaller, or more specific than than the parent. subject/target conveys information about the direction of the relationship. By only using the subject/target information to draw the relationship, the terms related by has_part relationships all appear inverted, in both trees and in graphs, in a way that appears biologically wrong to people used to reading outline form or hierarchy graphs. It seems that if we could use both parent/child and subject/target information, we should show the information in a meaningful way that made sense to biologists and still showed the direction of the has_part relationship. > Furthermore, the fact that we choose to make this terminological distinction > has absolutely no bearing on the behavior of all tools which will draw > things like this > > target > [rel] subject > > in denormalized tree displays > > and > > target > ^ > | > subject > > in graph displays Yes, that's the crux of it isn't it. For has_part, I don't think this is the correct way to draw the terms. You've talked a bunch about sentences and order, but there are other analogies from English that people are also used to looking at, such as outline form, where the more specific thing is below and to the right of the more general thing. For has_part, I would rather see this: subject | \/ target and invert the direction of the arrow rather than the biologically meaningful parent/child relationship. > > Hiding the has_part relationship so that only ontology editors see it is > > also not an acceptable permanent solution. 
Eurie and Mike are not pleased > > that the introduction of the has_part relationship, and its absence from > > the main file, means that SGD has lost information from our displays that > > used to be present. So, we need to be displaying has_part to users, and in > > a way that makes sense. > > I'm sorry that you're not happy with the introduction of the has_part > relationship. It was discussed for some time on this list prior to its > introduction, as was the plan regarding gene_ontology_ext. I'm not sure why > you, Mike or Eurie have not mentioned this until now. I was intimately involved in the introduction of the has_part relationship and the development of the spliceosomal complex terms that brought it in; I may have even suggested it in the SF item. Personally, I think it is far better than the horrible kludgy terms we made for a similar situation in the TFIIH complex terms. However, I do not think that we fully considered the impact of introducing this relationship, and using it to replace existing part-of relationships. While it is true that we have not introduced something new and potentially confusing into the the main users' file; we have also effectively removed information that used to be present in that file. > From the above, I'm not quite sure what it is SGD is not happy about - the > introduction of the has_part relationship or its absence from the main GO > file? > If the latter then SGD is free to use gene_ontology_ext. Be aware you will > have to modify your software to avoid the unintuitive display seen in > oboedit. You will also have to make sure the software developers understand > the semantics of ontology relationships. With the introduction of has_part > it's untenable to carry on using basic DAG traversal algorithms. I'm very > happy to talk with the software developers of the various MODs and tool > developers to help them understand this (I've already done this with a few). 
> I also appreciate that we could do with more extensive documentation here. Not exactly either of those things. Mike and Eurie are not happy with the fact that SGD's displays are now missing information that used to be present via part_of relationships. Our displays and tools no longer show a connection between things like the U5 snRNP and any spliceosomal complex. However, we are not yet in a position to use gene_ontology_ext because we are aware that there is a lot of software that would need to be modified in order for us to load that file and we have other priorities at the moment. > > So, I still contend that the fact that the has_part relationship is > > inverting the display of terms from the way that makes sense is NOT OK. > > - we knew the introduction of has_part would contradict assumptions and > confuse people, which is why its introduction was delayed for so long and it > was introduced vary carefully > - all software constructed according to assumptions surrounding the original > GO will display has_part in an unintuitive way in tree-type displays. it > will also produce the wrong answers to queries. > - no one is saying this is a good thing, that's just how it is. > - this is why we only expose it in gene_ontology_ext, for sufficiently > advanced tools > - this solution is not ideal for GO ontology developers such as yourself > - we should constructively work towards better solutions > > One improvement is to simply not show the has_part relationship in ontology > tree editor like denormalized tree displays. I'm not saying this is a > panacea or that it is 100% perfect. It's just a practical, simple, > achievable step that doesn't involve completely rewriting display > algorithms. That's all. While I don't like the way it's displaying now, I'm not sure I would consider removing has_part relationships from displays in OE to be an improvement, personally. 
It seems that would make it even harder to detect that there is an existing relationship between something like the U2 snRNP and a spliceosomal complex term. > Let's consider other options that may involve some software rewriting (for > everyone, not just OE). I presume you would prefer something that preserves > a visual structure with the broader entity at the top and the narrower at > the bottom, such as: > > U2-type spliceosomal complex > U2-type prespliceosome > U1 snRNP > > (my assumptions may be wrong, correct me if they are) > > Here are the transformation steps required to get tools to display things in > this way: > > Given the ontology contains the relationships: > > (all) U2-type prespliceosome [has_part] (some) U1 snRNP > (all) U2-type prespliceosome [is_a] U2-type spliceosomal complex > the tool has to first infer from the relationship: > > (all) U2-type prespliceosome [has_part] (some) U1 snRNP > > the inverse relationship: > > U1 snRNP [part_of_all] U2-type prespliceosome > > ****note the relation**** We haven't explicitly named the type-level > inverse of has_part until now. I'm using [part_of_all] here to indicate the > unusual inverted all-some direction but am open to other names. The > semantics are: > > X part_of_all Y <-> every instance of Y (instance level)has_part > some X > > Hopefully everyone understands why part_of and has_part are not inverses on > the type level: > > (all) U2-type prespliceosome [has_part] (some) U1 snRNP -- TRUE > (all) U1 snRNP [part_of] (some) U2-type prespliceosome -- FALSE > > If the tool is then configured to hide has_part but to show the inferred > inverse then default display algorithms will show: > > U2-type spliceosomal complex > [is_a] U2-type prespliceosome > [part_of_all] U1 snRNP > > Is this the kind of thing you are getting at? If not some diagrams would > help me. It's also easy for me to come down to SGD and discuss this with a > whiteboard to help. 
> > Assuming that it is -- with a bit of work it would be possible to get OE to > show things this way. It would be considerably more work to do this across > the board for all tools that visualize the GO in some way. > > I would strongly advocate that even given sufficient developer hours it is > better *not* to display the ontology in this way at all. It perhaps looks > more comforting, but people will make the same comforting assumptions that > no longer hold. For example, it looks like there is some kind of transitive > relationship between U1 snRNP and U2-type spliceosomal complex ***which > there is not***. It looks like the true path rule might hold. ***it does > not*** . Queries for U2-type spliceosomal complex should ***not*** return > gene products localized to U1 snRNP complex. Why should "Queries for U2-type spliceosomal complex should ***not*** return gene products localized to U1 snRNP complex."? I would have thought that they should. If I wanted to know the parts of "U2-type spliceosomal complex" I would want to know all the things that compose the series of complexes that are all considered to be a "U2-type spliceosomal complex". [Note that we need to revise the defs of "U2-type spliceosomal complex ; GO:0005684" (and its sibling "U12-type spliceosomal complex ; GO:0005689") to be consistent with the def of the parent term "spliceosomal complex ; GO:0005681" and specify that these terms represent series of complexes. I'll submit a SF item for this.] > I'm happy to be overruled by majority vote here and consider recommending > this sort of display for all tools. I've had this discussion a few times > before with others, most people start off wanting to retain the comforting > broad/narrow visual structure, but on understanding the semantics of > has_part change their mind. I prefer to call the broad/narrow structure sensible with respect to the biology. 
It seems that we can display the fact that the has_part relationship goes in the opposite direction by changing the direction of the arrow rather than by destoying the broad/narrow display that conveys different information. > There are some alternatives for tree type displays. One is something like > this: > > nucleus > [p] nuclear part > [i] small nuclear ribonucleoprotein complex > [i] U1 snRNP [part_of_all : U2-type spliceosome, > penta-snRNP complex, ...] > [i] U2 snRNP [part_of_all : U2-type spilceosome, > penta-snRNP complex, ...] > > This maintains the convention of having the direction of implication flow > from bottom right to top left, i.e. the true path rule. every U1 snRNP is > part_of the nucleus. > > I think it does make sense to show has_part in graph displays without > introducing any kind of relation transformation. It might be possible to > change the layout algorithm such that vertical layout correlates with > relative size yet the arrows still accurately depict the ontology > relationships. I include an example at the end of this file (although line > crossing will always be a massive problem here). > > Hopefully this is at least partly convincing you that this is not a bug and > that oboedit and other tools are just accurately depicting the relationships > in a consistent manner, and that solutions to the fact that this is > unintuitive are non-trivial. Oh, I believe you that it's not a bug, and that solutions are non-trivial. I would consider this a significant design flaw that it is very important that we fix. Not just for OE, we are going to have to solve this issue in order to present reasonable displays to users, many of whom will probably never understand ontology design. 
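
The type-level reading quoted in the message above can be sketched in a few lines of Python. This is only an illustration, assuming the two relationships listed in the quoted text and a made-up annotation of a gene g1 to U1 snRNP; the function names are hypothetical and nothing here reflects how OBO-Edit or AmiGO are actually implemented.

```python
# Type-level edges as (subject, relation, object), read all-some:
# every instance of the subject stands in the relation to some instance
# of the object. These two edges are the ones quoted in the thread.
EDGES = [
    ("U2-type prespliceosome", "has_part", "U1 snRNP"),
    ("U2-type prespliceosome", "is_a", "U2-type spliceosomal complex"),
]

# Hypothetical annotation for illustration only; not taken from GO.
ANNOTATIONS = {"g1": "U1 snRNP"}


def part_of_all(edges):
    """Derive 'X part_of_all Y' from 'Y has_part X' (a display-only inverse)."""
    return [(obj, "part_of_all", subj)
            for subj, rel, obj in edges if rel == "has_part"]


def ancestors(term, edges, relations=("is_a", "part_of")):
    """Terms reachable from `term` by following only the given relations."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        for subj, rel, obj in edges:
            if subj == current and rel in relations and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def genes_localized_to(term, edges, annotations):
    """Genes annotated to `term` or to a term with `term` as is_a/part_of ancestor."""
    return sorted(gene for gene, annotated in annotations.items()
                  if annotated == term or term in ancestors(annotated, edges))


print(part_of_all(EDGES))
print(genes_localized_to("U2-type spliceosomal complex", EDGES, ANNOTATIONS))
print(genes_localized_to("U1 snRNP", EDGES, ANNOTATIONS))
```

Under these assumptions the derived part_of_all edge is available for display, but it is deliberately left out of the traversal in `ancestors`, so a query on "U2-type spliceosomal complex" does not pick up g1. Whether a tool should additionally list such gene products as related rather than entailed is the open question in this thread.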

From adiehl at informatics.jax.org Mon Aug 24 11:05:38 2009
From: adiehl at informatics.jax.org (Alexander Diehl)
Date: Mon, 24 Aug 2009 14:05:38 -0400
Subject: [Ontology-editors] [OBO-Edit Working Group] displaying has_part
In-Reply-To:
References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org>
 <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org>
Message-ID: <4A92D672.1060907@informatics.jax.org>

Karen,

For better or worse, these display issues have been a feature of other
ontologies for a long time. Try looking at the Sequence Ontology for
instance. Furthermore, as we build cross-products within the GO,
between the GO and other ontologies, and among other ontologies for
different uses, we simply have to have a consistent display of
relationships. After all, when relating items in orthogonal
ontologies, how do we decide which term is "broader" or "narrower"?

When I first encountered this display issue some time ago, I was
confused and taken aback, but upon continued exposure, I have gotten
used to it and appreciate its logic. OBO-Edit is not just a tool for
the GO, but indeed is in use by many people for many ontologies. As
ontologies grow more complicated, ontology visualization grows more
complicated -- we've been spared this in the GO for a long time by
limiting ourselves to is_a and part_of, and the three regulates
relationships, which themselves do not even represent child-parent
relationships or narrower-broader relationships in the way that is_a
or part_of relationships work.

It may well be true that introducing has_part has caused the loss of
information from versions of the GO that do not contain this
relationship. However, the implementation of has_part was a decision
of the GO Consortium that solves more problems than it creates.
Ideally, MODs that cannot support internal software development for
tools to handle this relationship should switch to AmiGO in the long
term for display of the ontology and term annotations, and continued
development of AmiGO to display and handle these relationships should
be a priority.

Thanks,

Alex

--
Alexander D. Diehl, Ph.D.
Senior Scientific Curator Mouse Genome Informatics The Jackson Laboratory 600 Main Street Bar Harbor, ME 04609 email: adiehl at informatics.jax.org work: +1 (207) 288-6427 fax: +1 (207) 288-6131 From cjm at berkeleybop.org Mon Aug 24 14:06:10 2009 From: cjm at berkeleybop.org (Chris Mungall) Date: Mon, 24 Aug 2009 14:06:10 -0700 Subject: [Ontology-editors] displaying has_part In-Reply-To: References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org> Message-ID: <7F2B6B8A-0F59-4F48-9274-B6AFC0F24493@berkeleybop.org> On Aug 24, 2009, at 10:36 AM, Karen Christie wrote: [snipped part of dialog for now to focus on one issue] >> I would strongly advocate that even given sufficient developer >> hours it is >> better *not* to display the ontology in this way at all. It perhaps >> looks >> more comforting, but people will make the same comforting >> assumptions that >> no longer hold. For example, it looks like there is some kind of >> transitive >> relationship between U1 snRNP and U2-type spliceosomal complex >> ***which >> there is not***. It looks like the true path rule might hold. ***it >> does >> not*** . Queries for U2-type spliceosomal complex should ***not*** >> return >> gene products localized to U1 snRNP complex. > > Why should "Queries for U2-type spliceosomal complex should ***not*** > return gene products localized to U1 snRNP complex."? OK, good, I think we are circling on on the crux of the issue here. According to GO, it is not necessarily the case that a gp localized to a U1 snRNP is necessarily localized to a U2-type spliceosomal complex. 
For example, the gp may be localized to a particular U1 snRNP that is part of a penta-snRNP complex (I may not have chosen the best example as we're lacking annotations to this new term, but I can pull out an analogous example if you're not convinced) This can be seen in the sub-graph which I reproduce at the end of this email (let me know if this is not visible in some email programs, I can make a wiki page with this all on it) - there is no path following the arrows from U1 snRNP to U2-type spliceosomal complex. Note that attempting to show the graph in an "intuitive" way as I have attempted to do below, with the smaller entities at the bottom, is actually *misleading* because it leads one to assume that there is some inferred all-some relationship between these two terms when in fact there is not. Biology is complex and logic is hard. There's no escaping this. I don't believe we should simplify either to the point where we get false positives. I do think we need better ways of displaying this complex information, but I think we should focus resources on doing this in end user-facing tools rather than oboedit, as we would hope everyone using oboedit to edit the ontology would have an understanding of the logic or a willingness to learn. Here is a roughly sketched out example of how this could work Let's say PMID:123 describes an observation of a product of g1 being localized to a 'U1 snRNP' via an IDA. Let's say that's all we know, either due to the resolution of the assay or that's all that the annotator specified. The user queries for 'U2-type spliceosomal complex' The query result screen could show something like: There are no genes with products known to be localized to 'U2-type spliceosomal complex'. However, every 'U2-type spliceosomal complex' has the following parts: a U1 snRNP -- genes: g1 a U2 snRNP -- genes: none A more advanced tool would be able to show even more: PMID:123 shows g1 in U1 snRNP. 
prob('U2-type spliceosomal complex') = 0.83
prob('penta-snRNP complex') = 0.05
...

This leads the user to terms of relevance, shows what is known, shows what might be the case, and does not show anything that is false.

> I would have
> thought that they should. If I wanted to know the parts of "U2-type
> spliceosomal complex" I would want to know all the things that compose
> the series of complexes that are all considered to be a "U2-type
> spliceosomal complex".

I'm not sure I really understand the statement; it sounds tautological, and I don't understand what "series of" adds.

If the question is "what parts can be found in every U2-type spliceosomal complex" then the answer is found via the has_part relation and its closure.

However, this is a different question from "what gene products have been observed to be present in a U2-type spliceosomal complex".

> [Note that we need to revise the defs of "U2-type spliceosomal complex
> ; GO:0005684" (and its sibling "U12-type spliceosomal complex ;
> GO:0005689") to be consistent with the def of the parent term
> "spliceosomal complex ; GO:0005681" and specify that these terms
> represent series of complexes. I'll submit a SF item for this.]

https://sourceforge.net/tracker/index.php?func=detail&aid=2843718&group_id=36855&atid=440764

I don't understand the motivation here. I think the definitions should employ a consistent style, but I would change the parent from:

GO:0005681 ! spliceosomal complex [DEF: "Any of a series of ribonucleoprotein complexes that...

to

GO:0005681 ! spliceosomal complex [DEF: "A ribonucleoprotein complex that...

I'm not sure how adding "any of a series of..." changes the meaning; it seems to just add extra verbiage that obfuscates the definition.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedGraphic.pdf Type: application/pdf Size: 15389 bytes Desc: not available URL: -------------- next part -------------- From cjm at berkeleybop.org Mon Aug 24 14:40:05 2009 From: cjm at berkeleybop.org (Chris Mungall) Date: Mon, 24 Aug 2009 14:40:05 -0700 Subject: [Ontology-editors] displaying has_part In-Reply-To: References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org> Message-ID: On Aug 24, 2009, at 10:36 AM, Karen Christie wrote: >> >> > Hiding the has_part relationship so that only ontology editors >> see it is >> > also not an acceptable permanent solution. Eurie and Mike are not >> pleased >> > that the introduction of the has_part relationship, and its >> absence from >> > the main file, means that SGD has lost information from our >> displays that >> > used to be present. So, we need to be displaying has_part to >> users, and in >> > a way that makes sense. >> I'm sorry that you're not happy with the introduction of the has_part >> relationship. It was discussed for some time on this list prior to >> its >> introduction, as was the plan regarding gene_ontology_ext. I'm not >> sure why >> you, Mike or Eurie have not mentioned this until now. > > I was intimately involved in the introduction of the has_part > relationship and the development of the spliceosomal complex terms > that brought it in; I may have even suggested it in the SF > item. Personally, I think it is far better than the horrible kludgy > terms we made for a similar situation in the TFIIH complex terms. > > However, I do not think that we fully considered the impact of > introducing this relationship, and using it to replace existing > part-of relationships. While it is true that we have not introduced > something new and potentially confusing into the the main users' file; > we have also effectively removed information that used to be present > in that file. I agree we could have done more impact assessment here. 
In particular, with respect to assessing the frequency of changes that fall into two distinct scenarios:

[1] Add-only changes to the ontology in which new has_part relationships are added

[2] Changes to the ontology in which erroneous part_of links are deleted and the correct has_part relationships are added

I think [1] is a win-win scenario. There is no loss as far as anyone is concerned, it's only optional new information.

[2] is more subtle. For tools that do not consume the has_part relationships there is an overall improvement logically and a reduction in false positive results to queries. However, there is a loss of navigational information. For example, previously we had

[i] every snRNP U1 (GO:0005685) is part_of a U2-dependent spliceosome (GO:0005684)

Now this relationship is gone (presumably because it did not hold in an all-some fashion). There is a new relationship:

[ii] every U2-type spliceosomal complex (GO:0005684) has_part some U1 snRNP (GO:0005685)

So even though [i] is false, it is useful for ontology navigational purposes, in the absence of [ii].

If there are many changes of type [2] then it increases the priority of fixing tools to be able to use gene_ontology_ext such that they can use links of type [ii] for navigation, and perhaps presentation of query results in a fashion similar to how I outlined in my previous email. I don't think this is a large change, and it has been discussed in the amigo and software groups for a number of years (groups with representatives from many of the reference genomes). However, we have yet to provide support for this in AmiGO. This was my oversight as I assumed most changes would be of type [1].
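
The effect of a type [2] change on a tool that only follows is_a and part_of can be sketched in a few lines of Python. This is an illustrative toy, not how AmiGO or OBO-Edit work: the edge lists, the annotation of g1 (borrowed from the hypothetical PMID:123 example earlier in the thread), and the function names `ancestors`, `genes_for` and `parts_with_genes` are all assumptions made for the example.

```python
# Toy model only: term names come from the thread, everything else is assumed.
# Each edge is an all-some statement: every SUBJECT [relation] some TARGET.
OLD_EDGES = [
    # [i] the erroneous link that was removed
    ("U1 snRNP", "part_of", "U2-type spliceosomal complex"),
]
NEW_EDGES = [
    # [ii] the replacement link, plus a second part for illustration
    ("U2-type spliceosomal complex", "has_part", "U1 snRNP"),
    ("U2-type spliceosomal complex", "has_part", "U2 snRNP"),
]
# Hypothetical annotation: a product of g1 observed in a U1 snRNP.
ANNOTATIONS = {"U1 snRNP": {"g1"}}


def ancestors(term, edges):
    """Terms reachable from `term` over is_a/part_of edges (subject -> target)."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        for subject, relation, target in edges:
            if (subject == current
                    and relation in ("is_a", "part_of")
                    and target not in seen):
                seen.add(target)
                stack.append(target)
    return seen


def genes_for(term, edges):
    """Classic query: genes annotated to `term` or to any term beneath it."""
    hits = set()
    for annotated, genes in ANNOTATIONS.items():
        if annotated == term or term in ancestors(annotated, edges):
            hits |= genes
    return hits


def parts_with_genes(term, edges):
    """Navigation aid: direct has_part targets of `term` and the genes annotated there."""
    return {target: sorted(ANNOTATIONS.get(target, set()))
            for subject, relation, target in edges
            if subject == term and relation == "has_part"}


query = "U2-type spliceosomal complex"
print(genes_for(query, OLD_EDGES))         # with the old part_of link, g1 is returned
print(genes_for(query, NEW_EDGES))         # with has_part only, nothing is returned
print(parts_with_genes(query, NEW_EDGES))  # parts are listed separately, with their genes
```

A tool that reads only is_a and part_of sees the second result, which is the loss of navigational information described above. A tool that also reads has_part can present the third result as a pointer to relevant terms without asserting that g1 is localized to the complex.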
From kchris at genome.stanford.edu Tue Aug 25 11:18:08 2009 From: kchris at genome.stanford.edu (Karen Christie) Date: Tue, 25 Aug 2009 11:18:08 -0700 (PDT) Subject: [Ontology-editors] [OBO-Edit Working Group] displaying has_part In-Reply-To: <4A92D672.1060907@informatics.jax.org> References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org> <4A92D672.1060907@informatics.jax.org> Message-ID: Alex, As I stated in my previous email, I was part of the group that REQUESTED use of the has_part relationship. Having also been involved in making horrible terms for TFIIH complexes, I completely understand what problems it solves. However, we did not do a good job of thinking through all the consequences, one of which is that AmiGO, which you propose as a solution for groups without their own GO display software, will show less information than it used to. Yes, my point is that this display issue is, as you put it, "for worse". Regarding has_part, it seems that if thing A has thing B as a part, then the broader/narrower relationship between thing A and thing B is established by the has_part relationship. I understand that parent/child or broader/narrower may not be relevant to regulates relationships, but I was only talking about has_part. I realize that as "ontologies grow more complicated, ontology visualization grows more complicated", but I think we can do better with has_part. -Karen On Mon, 24 Aug 2009, Alexander Diehl wrote: > Karen, > > For better or worse, these display issues have be a feature of other > ontologies for a long time. Try looking at the Sequence Ontology for > instance. Furthermore, as we build cross-products within the GO, between the > GO and other ontologies, and among other ontologies for different use, we > simply have to have a consistent display of relationships. After all, when > relating items in orthologous ontologies, how do we decide which term is > "broader" or "narrower." 
> > When I first encountered this display issue some time ago, I was confused and > taken aback, but upon continued exposure, I have gotten used to it and > appreciate its logic. OBO-Edit is not just a tool for the GO, but indeed is > in use by many people for many ontologies. As ontologies grow more > complicated, ontology visualization grows more complicated -- we've been > spared this in the GO for a long time by limiting ourselves to is_a and > part_of, and the three regulates relationships, which themselves do not even > represent child-parent relationships or narrower-broader relationships in the > way that is_a or part_of relationships work. > > It may well be true that introducing has_part has caused the loss of > information from versions of the GO that do not contain this relationship. > However, the implementation of has_part was a decision of the GO Consortium > that solves more problems than it creates. Ideally, MODs that cannot support > internal software development for tools to handle this relationship should > switch to AmiGO in the long term for display of the ontology and term > annotations, and continued development of AmiGO to display and handle these > relationships should be a priority. > > Thanks, > > Alex > > > Karen Christie wrote: >> Hi Chris, >> >> Sorry so long to get back to you on this. I was sick half of last week >> and there was a lot to go through in your email. Comments inline. >> >> -Karen >> >> On Fri, 14 Aug 2009, Chris Mungall wrote: >> >> >>> [answering on ontology-editors list as this relates mostly to GO at the >>> moment. Context for the discussion here: >>> https://sourceforge.net/mailarchive/forum.php?forum_name=geneontology-oboedi >>> t-working-group&max_rows=25&style=ultimate&viewmonth=200908 ] >>> >>> On Aug 14, 2009, at 2:37 PM, Karen Christie wrote: >>> >>> >>>> To me "deeply uniintuitive" is an enormous problem, regardless of whether >>>> you call it a bug. 
>>>> >>> I agree unintuitive is a problem, which is why we are restricting this to >>> the editors file and the gene_ontology_ext for now. The reason the display >>> is unintuitive is because the relation is unintuitive, at least to people >>> used to the GO. There is in fact no completely satisfactory solution to >>> the >>> display problem. >>> >>> >>>> Personally, I do not want to see the input-output relationship determine >>>> the hierarchy of display of the terms. >>>> >>> I'm afraid that's how every single tool that displays ontologies works, so >>> you are out of luck in a big way. What you are advocating is not a single >>> change in oboedit but an across-the-board modification of everyone's >>> ontology display tools. >>> >> >> I thought we started GO because biologists needed something to >> annotate, and that we started making our own tool because existing >> ontology tools did not meet our needs, so I don't see why the fact >> that other ontology tools display something in a particular way should >> automatically override the needs of biologists to see something >> biologically sensible. >> >> >>>> It does NOT make sense when viewing the terms linked by has_part and will >>>> be an even bigger problem when we need to display this relationship to >>>> users. >>>> >>> As far as denormalized-tree type displays such as the oboedit ontology >>> tree >>> editor I agree, and we need to constructively come up with solutions. >>> >>> >>>> I also don't see why you say that parent-child doesn't apply to has_part. >>>> >>> I am saying the terminology is confusing and we should abandon it. See >>> below >>> for reasons. >>> >>> >>>> Our documentation states that the has_part relationship is the inverse of >>>> the is_part relationship. >>>> >>> This is not quite correct, and should be corrected. They are only inverses >>> on the instance level. I'll explain this further below. 
>>> >>> >>>> For the biological examples so far, both the spliceosome ones in >>>> component >>>> that are already in and also a has_part relationship that I plan to add >>>> in >>>> process (see go/scratch/RNAsurveillance.obo), it makes sense to still >>>> define a parent-child relationship, just in the opposite direction of >>>> part_of. For part_of, one can can say that the smaller/more >>>> granular/child >>>> thing is part_of the larger/less granular/parent. For has_part, the >>>> converse, one can say that the larger/less granular/parent has the >>>> smaller/more granular/child thing as a part. >>>> >>> But be aware that in the first case the subject of the relationship is the >>> child and the target/object is the parent. In the second case the subject >>> is >>> the parent and the target/object is the child. This reversal is bound to >>> confuse people used to equating the two. This is why I advocate abandoning >>> the terminology. That and the fact that there is no intuitive mappings of >>> parent/child to other relations. >>> >> >> I am completely aware that has_part reverses the direction of the >> relationship as compared to is_a and part_of. However, it seems to me >> that the parent/child terminology and the subject/target terminology >> describe different things. Parent/child seems to convey information >> about the granularity, i.e. the child is smaller, or more specific >> than than the parent. subject/target conveys information about the >> direction of the relationship. >> >> By only using the subject/target information to draw the relationship, >> the terms related by has_part relationships all appear inverted, in >> both trees and in graphs, in a way that appears biologically wrong to >> people used to reading outline form or hierarchy graphs. 
It seems that >> if we could use both parent/child and subject/target information, we >> should show the information in a meaningful way that made sense to >> biologists and still showed the direction of the has_part >> relationship. >> >> >>> Furthermore, the fact that we choose to make this terminological >>> distinction >>> has absolutely no bearing on the behavior of all tools which will draw >>> things like this >>> >>> target >>> [rel] subject >>> >>> in denormalized tree displays >>> >>> and >>> >>> target >>> ^ >>> | >>> subject >>> >>> in graph displays >>> >> >> Yes, that's the crux of it isn't it. For has_part, I don't think this >> is the correct way to draw the terms. You've talked a bunch about >> sentences and order, but there are other analogies from English that >> people are also used to looking at, such as outline form, where the >> more specific thing is below and to the right of the more general >> thing. For has_part, I would rather see this: >> >> subject >> | >> \/ >> target >> >> and invert the direction of the arrow rather than the biologically >> meaningful parent/child relationship. >> >> >>>> Hiding the has_part relationship so that only ontology editors see it is >>>> also not an acceptable permanent solution. Eurie and Mike are not pleased >>>> that the introduction of the has_part relationship, and its absence from >>>> the main file, means that SGD has lost information from our displays that >>>> used to be present. So, we need to be displaying has_part to users, and >>>> in >>>> a way that makes sense. >>>> >>> I'm sorry that you're not happy with the introduction of the has_part >>> relationship. It was discussed for some time on this list prior to its >>> introduction, as was the plan regarding gene_ontology_ext. I'm not sure >>> why >>> you, Mike or Eurie have not mentioned this until now. 
>>> >> >> I was intimately involved in the introduction of the has_part >> relationship and the development of the spliceosomal complex terms >> that brought it in; I may have even suggested it in the SF >> item. Personally, I think it is far better than the horrible kludgy >> terms we made for a similar situation in the TFIIH complex terms. >> >> However, I do not think that we fully considered the impact of >> introducing this relationship, and using it to replace existing >> part-of relationships. While it is true that we have not introduced >> something new and potentially confusing into the the main users' file; >> we have also effectively removed information that used to be present >> in that file. >> >> >>> From the above, I'm not quite sure what it is SGD is not happy about - the >>> introduction of the has_part relationship or its absence from the main GO >>> file? >>> >> >> >>> If the latter then SGD is free to use gene_ontology_ext. Be aware you will >>> have to modify your software to avoid the unintuitive display seen in >>> oboedit. You will also have to make sure the software developers >>> understand >>> the semantics of ontology relationships. With the introduction of has_part >>> it's untenable to carry on using basic DAG traversal algorithms. I'm very >>> happy to talk with the software developers of the various MODs and tool >>> developers to help them understand this (I've already done this with a >>> few). >>> I also appreciate that we could do with more extensive documentation here. >>> >> >> Not exactly either of those things. Mike and Eurie are not happy with >> the fact that SGD's displays are now missing information that used to >> be present via part_of relationships. Our displays and tools no longer >> show a connection between things like the U5 snRNP and any >> spliceosomal complex. 
>> >> However, we are not yet in a position to use gene_ontology_ext because >> we are aware that there is a lot of software that would need to be >> modified in order for us to load that file and we have other >> priorities at the moment. >> >> >>>> So, I still contend that the fact that the has_part relationship is >>>> inverting the display of terms from the way that makes sense is NOT OK. >>>> >>> - we knew the introduction of has_part would contradict assumptions and >>> confuse people, which is why its introduction was delayed for so long and >>> it >>> was introduced vary carefully >>> - all software constructed according to assumptions surrounding the >>> original >>> GO will display has_part in an unintuitive way in tree-type displays. it >>> will also produce the wrong answers to queries. >>> - no one is saying this is a good thing, that's just how it is. >>> - this is why we only expose it in gene_ontology_ext, for sufficiently >>> advanced tools >>> - this solution is not ideal for GO ontology developers such as yourself >>> - we should constructively work towards better solutions >>> >>> One improvement is to simply not show the has_part relationship in >>> ontology >>> tree editor like denormalized tree displays. I'm not saying this is a >>> panacea or that it is 100% perfect. It's just a practical, simple, >>> achievable step that doesn't involve completely rewriting display >>> algorithms. That's all. >>> >> >> While I don't like the way it's displaying now, I'm not sure I would >> consider removing has_part relationships from displays in OE to be an >> improvement, personally. It seems that would make it even harder to >> detect that there is an existing relationship between something like >> the U2 snRNP and a spliceosomal complex term. >> >> >>> Let's consider other options that may involve some software rewriting (for >>> everyone, not just OE). 
I presume you would prefer something that >>> preserves >>> a visual structure with the broader entity at the top and the narrower at >>> the bottom, such as: >>> >>> U2-type spliceosomal complex >>> U2-type prespliceosome >>> U1 snRNP >>> >>> (my assumptions may be wrong, correct me if they are) >>> >>> Here are the transformation steps required to get tools to display things >>> in >>> this way: >>> >>> Given the ontology contains the relationships: >>> >>> (all) U2-type prespliceosome [has_part] (some) U1 snRNP >>> (all) U2-type prespliceosome [is_a] U2-type spliceosomal complex >>> the tool has to first infer from the relationship: >>> >>> (all) U2-type prespliceosome [has_part] (some) U1 snRNP >>> >>> the inverse relationship: >>> >>> U1 snRNP [part_of_all] U2-type prespliceosome >>> >>> ****note the relation**** We haven't explicitly named the type-level >>> inverse of has_part until now. I'm using [part_of_all] here to indicate >>> the >>> unusual inverted all-some direction but am open to other names. The >>> semantics are: >>> >>> X part_of_all Y <-> every instance of Y (instance level)has_part >>> some X >>> >>> Hopefully everyone understands why part_of and has_part are not inverses >>> on >>> the type level: >>> >>> (all) U2-type prespliceosome [has_part] (some) U1 snRNP -- TRUE >>> (all) U1 snRNP [part_of] (some) U2-type prespliceosome -- FALSE >>> >>> If the tool is then configured to hide has_part but to show the inferred >>> inverse then default display algorithms will show: >>> >>> U2-type spliceosomal complex >>> [is_a] U2-type prespliceosome >>> [part_of_all] U1 snRNP >>> >>> Is this the kind of thing you are getting at? If not some diagrams would >>> help me. It's also easy for me to come down to SGD and discuss this with a >>> whiteboard to help. >>> >>> Assuming that it is -- with a bit of work it would be possible to get OE >>> to >>> show things this way. 
It would be considerably more work to do this across >>> the board for all tools that visualize the GO in some way. >>> >>> I would strongly advocate that even given sufficient developer hours it is >>> better *not* to display the ontology in this way at all. It perhaps looks >>> more comforting, but people will make the same comforting assumptions that >>> no longer hold. For example, it looks like there is some kind of >>> transitive >>> relationship between U1 snRNP and U2-type spliceosomal complex ***which >>> there is not***. It looks like the true path rule might hold. ***it does >>> not*** . Queries for U2-type spliceosomal complex should ***not*** return >>> gene products localized to U1 snRNP complex. >>> >> >> Why should "Queries for U2-type spliceosomal complex should ***not*** >> return gene products localized to U1 snRNP complex."? I would have >> thought that they should. If I wanted to know the parts of "U2-type >> spliceosomal complex" I would want to know all the things that compose >> the series of complexes that are all considered to be a "U2-type >> spliceosomal complex". >> >> [Note that we need to revise the defs of "U2-type spliceosomal complex >> ; GO:0005684" (and its sibling "U12-type spliceosomal complex ; >> GO:0005689") to be consistent with the def of the parent term >> "spliceosomal complex ; GO:0005681" and specify that these terms >> represent series of complexes. I'll submit a SF item for this.] >> >> >>> I'm happy to be overruled by majority vote here and consider recommending >>> this sort of display for all tools. I've had this discussion a few times >>> before with others, most people start off wanting to retain the comforting >>> broad/narrow visual structure, but on understanding the semantics of >>> has_part change their mind. >>> >> >> I prefer to call the broad/narrow structure sensible with respect to >> the biology. 
>> It seems that we can display the fact that the has_part
>> relationship goes in the opposite direction by changing the direction
>> of the arrow rather than by destroying the broad/narrow display that
>> conveys different information.
>>
>>
>>> There are some alternatives for tree type displays. One is something like
>>> this:
>>>
>>>   nucleus
>>>    [p] nuclear part
>>>     [i] small nuclear ribonucleoprotein complex
>>>      [i] U1 snRNP [part_of_all : U2-type spliceosome,
>>>          penta-snRNP complex, ...]
>>>      [i] U2 snRNP [part_of_all : U2-type spliceosome,
>>>          penta-snRNP complex, ...]
>>>
>>> This maintains the convention of having the direction of implication flow
>>> from bottom right to top left, i.e. the true path rule. Every U1 snRNP is
>>> part_of the nucleus.
>>>
>>> I think it does make sense to show has_part in graph displays without
>>> introducing any kind of relation transformation. It might be possible to
>>> change the layout algorithm such that vertical layout correlates with
>>> relative size yet the arrows still accurately depict the ontology
>>> relationships. I include an example at the end of this file (although line
>>> crossing will always be a massive problem here).
>>>
>>> Hopefully this is at least partly convincing you that this is not a bug
>>> and that oboedit and other tools are just accurately depicting the
>>> relationships in a consistent manner, and that solutions to the fact that
>>> this is unintuitive are non-trivial.
>>>
>>
>> Oh, I believe you that it's not a bug, and that solutions are
>> non-trivial. I would consider this a significant design flaw that it
>> is very important that we fix. Not just for OE, we are going to have
>> to solve this issue in order to present reasonable displays to users,
>> many of whom will probably never understand ontology design.
>>
>
>
> --
> Alexander D. Diehl, Ph.D.
> Senior Scientific Curator
> Mouse Genome Informatics
> The Jackson Laboratory
> 600 Main Street
> Bar Harbor, ME 04609
>
> email: adiehl at informatics.jax.org
> work: +1 (207) 288-6427
> fax: +1 (207) 288-6131
>

From kchris at genome.stanford.edu Wed Aug 26 09:37:51 2009
From: kchris at genome.stanford.edu (Karen Christie)
Date: Wed, 26 Aug 2009 09:37:51 -0700 (PDT)
Subject: [Ontology-editors] editing freeze please - from noon-4 Pacific time
Message-ID:

Hi,

I'd like to try to merge in some RNA surveillance terms this afternoon. Since it's my first attempt to use obomerge, I'd like to block off the afternoon if no one minds. But if anyone has editing plans for that time, shoot me a mail and let's see if we can work out a mutually agreeable schedule.
thanks, -Karen From kchris at genome.stanford.edu Wed Aug 26 15:13:40 2009 From: kchris at genome.stanford.edu (Karen Christie) Date: Wed, 26 Aug 2009 15:13:40 -0700 (PDT) Subject: [Ontology-editors] [OBO-Edit Working Group] displaying has_part In-Reply-To: <7F2B6B8A-0F59-4F48-9274-B6AFC0F24493@berkeleybop.org> References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org> <7F2B6B8A-0F59-4F48-9274-B6AFC0F24493@berkeleybop.org> Message-ID: OK, I understand your explanation about "Why should "Queries for U2-type spliceosomal complex should ***not*** return gene products localized to U1 snRNP complex." But I think that is separate from the display issue I have with has_part. However, it also brings up a different issue. A question that a biologist might want to ask is "give me all the proteins in the major spliceosome". They would expect to get the proteins present in the 5 snRNPs (U1, U2, U4, U5, and U6). By replacing part_of relations with has_part, it seems that they might have difficulty getting a full answer to that question. In your example where g1 is annotated only to "U1 snRNP", you give the hypothetical result: > There are no genes with products known to be localized to 'U2-type > spliceosomal complex'. > However, every 'U2-type spliceosomal complex' has the following > parts: > a U1 snRNP -- genes: g1 > a U2 snRNP -- genes: none However, if I wanted to get all proteins in the major spliceosome, I don't want just gene products that are in EVERY 'U2-type spliceosomal complex', I want gene products that are in ANY 'U2-type spliceosomal complex'. What do we need to do to make this a question that biologists get a reasonable answer to? I understand that GO needs to implement more rigorous logic, but if we do it in a way that prevents biologists from getting reasonable answers, we are not doing what we set out to do. 
-Karen On Mon, 24 Aug 2009, Chris Mungall wrote: > > On Aug 24, 2009, at 10:36 AM, Karen Christie wrote: > > [snipped part of dialog for now to focus on one issue] > >>> I would strongly advocate that even given sufficient developer hours it is >>> better *not* to display the ontology in this way at all. It perhaps looks >>> more comforting, but people will make the same comforting assumptions that >>> no longer hold. For example, it looks like there is some kind of >>> transitive >>> relationship between U1 snRNP and U2-type spliceosomal complex ***which >>> there is not***. It looks like the true path rule might hold. ***it does >>> not*** . Queries for U2-type spliceosomal complex should ***not*** return >>> gene products localized to U1 snRNP complex. >> >> Why should "Queries for U2-type spliceosomal complex should ***not*** >> return gene products localized to U1 snRNP complex."? > > OK, good, I think we are circling on on the crux of the issue here. > > According to GO, it is not necessarily the case that a gp localized to a U1 > snRNP is necessarily localized to a U2-type spliceosomal complex. For > example, the gp may be localized to a particular U1 snRNP that is part of a > penta-snRNP complex (I may not have chosen the best example as we're lacking > annotations to this new term, but I can pull out an analogous example if > you're not convinced) > > This can be seen in the sub-graph which I reproduce at the end of this email > (let me know if this is not visible in some email programs, I can make a wiki > page with this all on it) - there is no path following the arrows from U1 > snRNP to U2-type spliceosomal complex. Note that attempting to show the graph > in an "intuitive" way as I have attempted to do below, with the smaller > entities at the bottom, is actually *misleading* because it leads one to > assume that there is some inferred all-some relationship between these two > terms when in fact there is not. > > Biology is complex and logic is hard. 
There's no escaping this. I don't > believe we should simplify either to the point where we get false positives. > I do think we need better ways of displaying this complex information, but I > think we should focus resources on doing this in end user-facing tools rather > than oboedit, as we would hope everyone using oboedit to edit the ontology > would have an understanding of the logic or a willingness to learn. > > Here is a roughly sketched out example of how this could work > > Let's say PMID:123 describes an observation of a product of g1 being > localized to a 'U1 snRNP' via an IDA. Let's say that's all we know, either > due to the resolution of the assay or that's all that the annotator > specified. The user queries for 'U2-type spliceosomal complex' > > The query result screen could show something like: > > There are no genes with products known to be localized to 'U2-type > spliceosomal complex'. > However, every 'U2-type spliceosomal complex' has the following > parts: > a U1 snRNP -- genes: g1 > a U2 snRNP -- genes: none > > A more advanced tool would be able to show even more: > > PMID:123 shows g1 in U1 snRNP. > prob('U2-type spliceosomal complex') = 0.83 > prob('penta-snRNP complex') = 0.05 > ... > > This leads the user to terms of relevance, shows what is known, shows what > might be the case, and does not show anything that is false. > >> I would have >> thought that they should. If I wanted to know the parts of "U2-type >> spliceosomal complex" I would want to know all the things that compose >> the series of complexes that are all considered to be a "U2-type >> spliceosomal complex". > > I'm not sure I really understand the statement, it sounds tautological, I > don't understand what adding "series of" adds. > > If the question is "what parts can be found in every U2-type spliceosomal > complex" then the answer is found via the has_part relation and it's closure. 
> > However, this is a different question from "what gene products have been > observed to be present in a U2-type spliceosomal complex > >> [Note that we need to revise the defs of "U2-type spliceosomal complex >> ; GO:0005684" (and its sibling "U12-type spliceosomal complex ; >> GO:0005689") to be consistent with the def of the parent term >> "spliceosomal complex ; GO:0005681" and specify that these terms >> represent series of complexes. I'll submit a SF item for this.] > > https://sourceforge.net/tracker/index.php?func=detail&aid=2843718&group_id=36855&atid=440764 > > I don't understand the motivation here. I think the definitions should employ > a consistent style, but I would change the parent from: > > GO:0005681 ! spliceosomal complex [DEF: "Any of a series of ribonucleoprotein > complexes that... > > to > > GO:0005681 ! spliceosomal complex [DEF: "A ribonucleoprotein complex that... > > I'm not sure how adding "any of a series of..." changes the meaning, it seems > to just add extra verbiage that obfuscates the definition. > > > From cjm at berkeleybop.org Thu Aug 27 13:46:36 2009 From: cjm at berkeleybop.org (Chris Mungall) Date: Thu, 27 Aug 2009 13:46:36 -0700 Subject: [Ontology-editors] [OBO-Edit Working Group] displaying has_part In-Reply-To: References: <40390C5E-D152-4B85-9D13-65EED54F0A81@berkeleybop.org> <46666E98-3809-4DD3-BE3E-11A586C8F8B3@berkeleybop.org> <7F2B6B8A-0F59-4F48-9274-B6AFC0F24493@berkeleybop.org> Message-ID: <0FF6CC9D-00D0-4A02-80BB-23930D85CADF@berkeleybop.org> Traveling at the moment so apologies for the cursory answer, more later. 
Currently in GO you can ask questions of the form [1] What gene products are found in ANY/SOME C (formally: for what P does there exist an instance p such that p is located in some C) But you cannot ask questions of the form: [2] What gene products are in ALL/EVERY C (formally: for what P does it hold that all instances of C have as part some instance p of type P) You can ask questions of type [1] and then interpret them as if you were asking questions of type [2], but this will often yield false positives, especially higher up the graph. For example, here is the answer to the question "what genes have products that are found in SOME HDA1 complexes": http://amigo.geneontology.org/cgi-bin/amigo/term-assoc.cgi?gptype=all&speciesdb=SGD&taxid=all&evcode=all&term_assocs=all&term=GO%3A0070823&action=filter You can interpret this page as the answer to the question "what gene products are found in ALL HDA1 complexes", and in this case you may be correct. But if you start looking at less granular classes then the answer will be wrong if this is the question. I think it would be good to be able to answer both kinds of questions. We will go some way toward being able to answer the second kind of question with the CC-PRO xp definitions. But to go all the way would require an extension of the GAF format and a change to annotation practice to explicitly indicate that an annotation indicates that all instances of the complex in a species have the gene product as part. I am interested in pursuing this but I'm not sure where this figures in everyone's priorities. On Aug 26, 2009, at 3:13 PM, Karen Christie wrote: > OK, I understand your explanation about "Why should "Queries for U2- > type > spliceosomal complex should ***not*** return gene products localized > to U1 snRNP complex." But I think that is separate from the display > issue I have with has_part. > > However, it also brings up a different issue.
A question that a > biologist might want to ask is "give me all the proteins in the major > spliceosome". They would expect to get the proteins present in the 5 > snRNPs (U1, U2, U4, U5, and U6). By replacing part_of relations with > has_part, it seems that they might have difficulty getting a full > answer to that question. In your example where g1 is annotated only to > "U1 snRNP", you give the hypothetical result: > >> There are no genes with products known to be localized to 'U2- >> type >> spliceosomal complex'. >> However, every 'U2-type spliceosomal complex' has the following >> parts: >> a U1 snRNP -- genes: g1 >> a U2 snRNP -- genes: none > > However, if I wanted to get all proteins in the major spliceosome, I > don't want just gene products that are in EVERY 'U2-type spliceosomal > complex', I want gene products that are in ANY 'U2-type spliceosomal > complex'. What do we need to do to make this a question that > biologists > get a reasonable answer to? > > I understand that GO needs to implement more rigorous logic, but if we > do it in a way that prevents biologists from getting reasonable > answers, we are not doing what we set out to do. > > -Karen > > > > > On Mon, 24 Aug 2009, Chris Mungall wrote: > >> >> On Aug 24, 2009, at 10:36 AM, Karen Christie wrote: >> >> [snipped part of dialog for now to focus on one issue] >> >>>> I would strongly advocate that even given sufficient developer >>>> hours it is >>>> better *not* to display the ontology in this way at all. It >>>> perhaps looks >>>> more comforting, but people will make the same comforting >>>> assumptions that >>>> no longer hold. For example, it looks like there is some kind of >>>> transitive >>>> relationship between U1 snRNP and U2-type spliceosomal complex >>>> ***which >>>> there is not***. It looks like the true path rule might hold. >>>> ***it does >>>> not*** . Queries for U2-type spliceosomal complex should >>>> ***not*** return >>>> gene products localized to U1 snRNP complex. 
>>> Why should "Queries for U2-type spliceosomal complex should >>> ***not*** >>> return gene products localized to U1 snRNP complex."? >> >> OK, good, I think we are circling in on the crux of the issue here. >> >> According to GO, it is not necessarily the case that a gp localized >> to a U1 snRNP is localized to a U2-type spliceosomal >> complex. For example, the gp may be localized to a particular U1 >> snRNP that is part of a penta-snRNP complex (I may not have chosen >> the best example as we're lacking annotations to this new term, but >> I can pull out an analogous example if you're not convinced) >> >> This can be seen in the sub-graph which I reproduce at the end of >> this email (let me know if this is not visible in some email >> programs, I can make a wiki page with this all on it) - there is no >> path following the arrows from U1 snRNP to U2-type spliceosomal >> complex. Note that attempting to show the graph in an "intuitive" >> way as I have attempted to do below, with the smaller entities at >> the bottom, is actually *misleading* because it leads one to assume >> that there is some inferred all-some relationship between these two >> terms when in fact there is not. >> >> Biology is complex and logic is hard. There's no escaping this. I >> don't believe we should simplify either to the point where we get >> false positives. I do think we need better ways of displaying this >> complex information, but I think we should focus resources on doing >> this in end user-facing tools rather than oboedit, as we would hope >> everyone using oboedit to edit the ontology would have an >> understanding of the logic or a willingness to learn. >> >> Here is a roughly sketched out example of how this could work >> >> Let's say PMID:123 describes an observation of a product of g1 >> being localized to a 'U1 snRNP' via an IDA. Let's say that's all we >> know, either due to the resolution of the assay or because that's all that >> the annotator specified.
The user queries for 'U2-type spliceosomal >> complex' >> >> The query result screen could show something like: >> >> There are no genes with products known to be localized to 'U2-type >> spliceosomal complex'. >> However, every 'U2-type spliceosomal complex' has the following >> parts: >> a U1 snRNP -- genes: g1 >> a U2 snRNP -- genes: none >> >> A more advanced tool would be able to show even more: >> >> PMID:123 shows g1 in U1 snRNP. >> prob('U2-type spliceosomal complex') = 0.83 >> prob('penta-snRNP complex') = 0.05 >> ... >> >> This leads the user to terms of relevance, shows what is known, >> shows what might be the case, and does not show anything that is >> false. >> >>> I would have >>> thought that they should. If I wanted to know the parts of "U2-type >>> spliceosomal complex" I would want to know all the things that >>> compose >>> the series of complexes that are all considered to be a "U2-type >>> spliceosomal complex". >> >> I'm not sure I really understand the statement; it sounds >> tautological, and I don't understand what adding "series of" adds. >> >> If the question is "what parts can be found in every U2-type >> spliceosomal complex" then the answer is found via the has_part >> relation and its closure. >> >> However, this is a different question from "what gene products have >> been observed to be present in a U2-type spliceosomal complex" >> >>> [Note that we need to revise the defs of "U2-type spliceosomal >>> complex >>> ; GO:0005684" (and its sibling "U12-type spliceosomal complex ; >>> GO:0005689") to be consistent with the def of the parent term >>> "spliceosomal complex ; GO:0005681" and specify that these terms >>> represent series of complexes. I'll submit a SF item for this.] >> >> https://sourceforge.net/tracker/index.php?func=detail&aid=2843718&group_id=36855&atid=440764 >> >> I don't understand the motivation here. I think the definitions >> should employ a consistent style, but I would change the parent from: >> >> GO:0005681 !
spliceosomal complex [DEF: "Any of a series of >> ribonucleoprotein complexes that... >> >> to >> >> GO:0005681 ! spliceosomal complex [DEF: "A ribonucleoprotein >> complex that... >> >> I'm not sure how adding "any of a series of..." changes the >> meaning, it seems to just add extra verbiage that obfuscates the >> definition. >> >> >> > From cherry at stanford.edu Thu Aug 27 22:15:52 2009 From: cherry at stanford.edu (Mike Cherry) Date: Thu, 27 Aug 2009 22:15:52 -0700 Subject: [Ontology-editors] OBO checking script found an error References: <200908280230.n7S2U2Ge020566@meatloaf.Stanford.EDU> Message-ID: Begin forwarded message: > From: root at genome.Stanford.EDU (Cron Daemon) > Date: August 27, 2009 7:30:02 PM PDT > To: gocvs at genome.stanford.edu > Subject: Cron /share/go/bin/build_obo2obo-plus- > copy.pl > > ERROR: ascii-check: Line: def: "The chemical reactions and pathways > occurring in the nucleus and resulting in the breakdown of a > ribosomal RNA (rRNA) molecule, including RNA fragments released as > part of processing the primary transcript into multiple mature rRNA > species, initiated by the enzymatic addition of a sequence of > adenylyl residues (polyadenylation) at the 3' end the target > rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang F, Butler JS, > Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", PMID: > 15935758 "LaCava J, et al. (2005)", PMID:17652137 "Dez C, Dlakic M, > Tollervey D ?(2007)", PMID:18591258 "Milligan L, et al. (2008)"] > check-obo-for-standard-release.pl failed: Inappropriate ioctl for > device From midori at ebi.ac.uk Fri Aug 28 05:47:17 2009 From: midori at ebi.ac.uk (Midori Harris) Date: Fri, 28 Aug 2009 13:47:17 +0100 (BST) Subject: [Ontology-editors] OBO checking script found an error In-Reply-To: References: <200908280230.n7S2U2Ge020566@meatloaf.Stanford.EDU> Message-ID: I think I've fixed this. 
m On Thu, 27 Aug 2009, Mike Cherry wrote: > > > Begin forwarded message: > >> From: root at genome.Stanford.EDU (Cron Daemon) >> Date: August 27, 2009 7:30:02 PM PDT >> To: gocvs at genome.stanford.edu >> Subject: Cron /share/go/bin/build_obo2obo-plus-copy.pl >> >> ERROR: ascii-check: Line: def: "The chemical reactions and pathways >> occurring in the nucleus and resulting in the breakdown of a ribosomal RNA >> (rRNA) molecule, including RNA fragments released as part of processing the >> primary transcript into multiple mature rRNA species, initiated by the >> enzymatic addition of a sequence of adenylyl residues (polyadenylation) at >> the 3' end the target rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang >> F, Butler JS, Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", >> PMID:15935758 "LaCava J, et al. (2005)", PMID:17652137 "Dez C, Dlakic M, >> Tollervey D (2007)", PMID:18591258 "Milligan L, et al. (2008)"] >> check-obo-for-standard-release.pl failed: Inappropriate ioctl for device > > > _______________________________________________ > Ontology-editors mailing list > Ontology-editors at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/ontology-editors From aji at ebi.ac.uk Fri Aug 28 15:10:12 2009 From: aji at ebi.ac.uk (Amelia Ireland) Date: Fri, 28 Aug 2009 15:10:12 -0700 Subject: [Ontology-editors] XP docs Message-ID: Hello people, I've drafted some XP docs for the GO website if you want to have a look at them: http://geneontology.org/test-html.shtml If anyone has suggestions for what should go in that last paragraph, they'd be gratefully received! Thanks, Amelia. 
-- Amelia Ireland GO Editorial Office http://www.berkeleybop.org || http://www.ebi.ac.uk Boycott Trader Joe's Red List seafood: http://traitorjoe.com/ From cherry at stanford.edu Fri Aug 28 21:05:24 2009 From: cherry at stanford.edu (Mike Cherry) Date: Fri, 28 Aug 2009 21:05:24 -0700 Subject: [Ontology-editors] obo check References: <200908290230.n7T2U2P6015266@meatloaf.Stanford.EDU> Message-ID: <3B708AD8-5F01-4600-8796-264ABFDFA046@stanford.edu> Another error, or the same one? Begin forwarded message: > From: root at genome.Stanford.EDU (Cron Daemon) > Date: August 28, 2009 7:30:02 PM PDT > To: gocvs at genome.stanford.edu > Subject: Cron /share/go/bin/build_obo2obo-plus- > copy.pl > > ERROR: ascii-check: Line: def: "The chemical reactions and pathways > occurring in the nucleus and resulting in the breakdown of a > ribosomal RNA (rRNA) molecule, including RNA fragments released as > part of processing the primary transcript into multiple mature rRNA > species, initiated by the enzymatic addition of a sequence of > adenylyl residues (polyadenylation) at the 3' end the target > rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang F, Butler JS, > Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", PMID: > 15935758 "LaCava J, et al. (2005)", PMID:17652137 "Dez C, Dlakic M, > Tollervey D ?(2007)", PMID:18591258 "Milligan L, et al. (2008)"] > check-obo-for-standard-release.pl failed: Inappropriate ioctl for > device From midori at ebi.ac.uk Sat Aug 29 03:44:48 2009 From: midori at ebi.ac.uk (Midori Harris) Date: Sat, 29 Aug 2009 11:44:48 +0100 (BST) Subject: [Ontology-editors] obo check In-Reply-To: <3B708AD8-5F01-4600-8796-264ABFDFA046@stanford.edu> References: <200908290230.n7T2U2P6015266@meatloaf.Stanford.EDU> <3B708AD8-5F01-4600-8796-264ABFDFA046@stanford.edu> Message-ID: Looks like the same problem, but on a different line (there were a few lines with the same offending character, and I missed one). Fixing now. 
m On Fri, 28 Aug 2009, Mike Cherry wrote: > Another error, or the same one? > > Begin forwarded message: > >> From: root at genome.Stanford.EDU (Cron Daemon) >> Date: August 28, 2009 7:30:02 PM PDT >> To: gocvs at genome.stanford.edu >> Subject: Cron /share/go/bin/build_obo2obo-plus-copy.pl >> >> ERROR: ascii-check: Line: def: "The chemical reactions and pathways >> occurring in the nucleus and resulting in the breakdown of a ribosomal RNA >> (rRNA) molecule, including RNA fragments released as part of processing the >> primary transcript into multiple mature rRNA species, initiated by the >> enzymatic addition of a sequence of adenylyl residues (polyadenylation) at >> the 3' end the target rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang >> F, Butler JS, Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", >> PMID:15935758 "LaCava J, et al. (2005)", PMID:17652137 "Dez C, Dlakic M, >> Tollervey D (2007)", PMID:18591258 "Milligan L, et al. (2008)"] >> check-obo-for-standard-release.pl failed: Inappropriate ioctl for device > > > _______________________________________________ > Ontology-editors mailing list > Ontology-editors at geneontology.org > http://fafner.stanford.edu/mailman/listinfo/ontology-editors From cjm at yuri.lbl.gov Fri Aug 28 07:52:10 2009 From: cjm at yuri.lbl.gov (Chris Mungall) Date: Fri, 28 Aug 2009 07:52:10 -0700 Subject: [Ontology-editors] problem with obo file Message-ID: ERROR: ascii-check: Line: def: "The chemical reactions and pathways occurring in the nucleus and resulting in the breakdown of a ribosomal RNA (rRNA) molecule, including RNA fragments released as part of processing the primary transcript into multiple mature rRNA species, initiated by the enzymatic addition of a sequence of adenylyl residues (polyadenylation) at the 3' end the target rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang F, Butler JS, Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", PMID:15935758 "LaCava J, et al. 
(2005)", PMID:17652137 "Dez C, Dlakic M, Tollervey D ???(2007)", PMID:18591258 "Milligan L, et al. (2008)"] From cjm at yuri.lbl.gov Sat Aug 29 07:52:11 2009 From: cjm at yuri.lbl.gov (Chris Mungall) Date: Sat, 29 Aug 2009 07:52:11 -0700 Subject: [Ontology-editors] problem with obo file Message-ID: ERROR: ascii-check: Line: def: "The chemical reactions and pathways occurring in the nucleus and resulting in the breakdown of a ribosomal RNA (rRNA) molecule, including RNA fragments released as part of processing the primary transcript into multiple mature rRNA species, initiated by the enzymatic addition of a sequence of adenylyl residues (polyadenylation) at the 3' end the target rRNA." [GOC:df, GOC:krc, PMID:15173578 "Kuai L, Fang F, Butler JS, Sherman F (2004)", PMID:15572680 "Fang F, et al. (2004)", PMID:15935758 "LaCava J, et al. (2005)", PMID:17652137 "Dez C, Dlakic M, Tollervey D ???(2007)", PMID:18591258 "Milligan L, et al. (2008)"] From cjm at berkeleybop.org Mon Aug 31 10:22:12 2009 From: cjm at berkeleybop.org (Chris Mungall) Date: Mon, 31 Aug 2009 10:22:12 -0700 Subject: [Ontology-editors] XP docs In-Reply-To: References: Message-ID: Looks very good! Couple of comments: > If we use ontology terms in the genus and the differentia, we > can see that these logical definitions take the general form > > term relation term > The problem is we already use a triple of the form to denote all X R some Y I suggest that instead of term relation term You write term and relation term or term that relation term e.g. instead of > lysosomal membrane is membrane surrounding lysosome > we should have: lysosomal membrane is (a) membrane that surrounds (a) lysosome This is consistent with what you have above, and with the formal semantics. Also the actual relation used is 'surrounds' not 'surrounding'. 
Of course it's not hard to have a simple algorithm that translates a logical definition of the form

  lysosomal membrane = membrane that surrounds lysosome

to a more user-friendly

  a lysosomal membrane is a membrane surrounding a lysosome

But this documentation should not be dependent on such an algorithm.

The docs should probably not shy away from showing the relevant piece of the obo format stanza and include a translation table for going between the two. E.g.

  id: X
  intersection_of: G
  intersection_of: R1 Y1
  intersection_of: Rn Yn

  <=>

  X = G that R1 Y1 and ... and Rn Yn

The docs use the terms 'term', 'concept' and 'category'. I'm not sure what the difference between these is. I think it would be simplest if we use 'class' throughout. Unfortunately the usage of 'term' for what are actually classes is very prevalent in GO. We could at least reduce it to term and class and eliminate concept and category.

It may be better to use a species-neutral ontology like cell or chebi instead of the species-specific fly_anatomy.

For the internal xp examples it would be good to have examples drawn from the xps that will go live first: the subset of bp_xp_cc that are simple ; regulation xps ; the subset of cc_xp_self that is .

We could also have an example of how this is being used to maintain the GO. David and Amina are writing docs for OE users so we should be sure to coordinate.

Thanks!

On Aug 28, 2009, at 3:10 PM, Amelia Ireland wrote:

> Hello people,
>
> I've drafted some XP docs for the GO website if you want to have a
> look at them:
>
> http://geneontology.org/test-html.shtml
>
> If anyone has suggestions for what should go in that last paragraph,
> they'd be gratefully received!
>
> Thanks,
> Amelia.
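
The stanza-to-text translation discussed in the XP docs thread (an obo stanza with intersection_of lines on one side, "X = G that R Y and ..." on the other) can be sketched roughly as follows. This is only an illustration, not the algorithm used by any GO tool: the function name render_logical_definition is hypothetical, and the sketch assumes that every identifier is a single whitespace-free token, that a genus line has one token, and that a differentia line has exactly two (relation, then filler).

```python
def render_logical_definition(stanza: str) -> str:
    """Render one stanza's intersection_of lines as 'X = G that R Y and ...'.

    Hypothetical helper for illustration; it handles only the simple
    genus-plus-differentia shape described in the thread.
    """
    term_id = None
    genus = None
    differentia = []  # (relation, filler) pairs, in stanza order

    for raw in stanza.splitlines():
        # Drop a trailing '! comment' and surrounding whitespace.
        line = raw.split("!", 1)[0].strip()
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            parts = line[len("intersection_of:"):].split()
            if len(parts) == 1:
                genus = parts[0]              # genus: a bare class
            elif len(parts) == 2:
                differentia.append((parts[0], parts[1]))  # relation + filler
            else:
                raise ValueError(f"unexpected intersection_of line: {raw!r}")

    if term_id is None or genus is None or not differentia:
        raise ValueError("stanza needs an id, a genus and at least one differentia")

    clauses = " and ".join(f"{rel} {filler}" for rel, filler in differentia)
    return f"{term_id} = {genus} that {clauses}"


# Placeholder identifiers standing in for real term IDs.
example = """id: lysosomal_membrane
intersection_of: membrane
intersection_of: surrounds lysosome"""
print(render_logical_definition(example))
```

Keeping the rendering as "G that R Y" rather than "G R Y" follows the point made in the thread: the bare triple form is already used to denote all-some relationships, so the definition form should read differently.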