[Go] addition of localization specific process terms ?

Chris Mungall cjm at berkeleybop.org
Wed Mar 4 12:25:12 PST 2009


On Mar 4, 2009, at 11:34 AM, Karen Christie wrote:

> I don't think you can necessarily count on the P and C annotations  
> always being made from the same paper. Certainly for cerevisiae, I  
> see genes where the localization may be shown in one paper and  
> subsequent papers build on that to show process, but may or may not  
> duplicate the experimental demonstration of localization in a way  
> that can be annotated.

Exactly. Using the PMID is a decent *heuristic*, but there will be
either false negatives (above) or false positives (which Alex pointed out).
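
To make the limits of the PMID heuristic concrete, here is a minimal
Python sketch. It assumes annotations are available as simple
(gene product, aspect, term, PMID) tuples; the gene names, PMIDs and the
link_by_pmid helper are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical annotation records: (gene_product, aspect, term, pmid).
# Gene names and PMIDs are invented for illustration.
ANNOTATIONS = [
    ("geneA", "P", "translation", "PMID:1"),
    ("geneA", "C", "mitochondrion", "PMID:1"),
    ("geneB", "P", "translation", "PMID:2"),
    ("geneB", "C", "mitochondrion", "PMID:3"),  # localization from another paper
    ("geneC", "P", "translation", "PMID:4"),
    ("geneC", "C", "mitochondrion", "PMID:4"),
    ("geneC", "C", "cytoplasm", "PMID:4"),
]


def link_by_pmid(annotations):
    """Pair each P annotation with every C annotation that shares
    both the gene product and the PMID."""
    components = defaultdict(list)
    for gene, aspect, term, pmid in annotations:
        if aspect == "C":
            components[(gene, pmid)].append(term)

    links = []
    for gene, aspect, term, pmid in annotations:
        if aspect == "P":
            for component in components.get((gene, pmid), []):
                links.append((gene, term, component))
    return links


links = link_by_pmid(ANNOTATIONS)
for link in links:
    print(link)
```

In this toy data, geneB gets no link at all because its localization
comes from a different paper (a false negative), while geneC is linked
to both components even though the paper may only support the process
in one of them (a possible false positive).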

> However, if the problem is that "co-annotation is insufficient,  
> because the annotations, once entered, are not linked to each  
> other", then possibly there are other solutions than creating all  
> possible localization-specific P terms. In Chris's original email,  
> he wrote this:
>
>  1. post-compose
>        http://wiki.geneontology.org/index.php/Annotation_Cross_Products
>        - annotate to P and C as normal, using two annotations
>        - in the annotation for P, put "occurs_in(C)" in col17
>
> However, we don't currently explicitly link a P annotation with a C  
> term in the way that he writes in his summary of the post-compose  
> approach.

We don't currently do this, but we will start doing so soon. In the
interim, we have to be fairly liberal in pre-composing BPxCC terms that
annotators request.
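
As a rough illustration of what the post-composed link could look like
in practice, here is a small Python sketch. It assumes a tab-separated
annotation row with 17 columns in which "col17" carries the
occurs_in(C) expression, as in the summary quoted above. The gene name,
the make_row and linked_components helpers, and the placement of the
other fields are assumptions for the example, not a description of any
existing tool.

```python
import re

N_COLUMNS = 17
EXTENSION_COL = 16  # zero-based index of "col17" from the quoted summary

# Matches expressions of the form occurs_in(GO:0005739).
OCCURS_IN = re.compile(r"occurs_in\(([^)]+)\)")


def make_row(gene, go_id, aspect, extension=""):
    """Build a tab-separated row; fields other than col17 are placed
    for illustration only."""
    row = [""] * N_COLUMNS
    row[1] = gene
    row[4] = go_id
    row[8] = aspect
    row[EXTENSION_COL] = extension
    return "\t".join(row)


def linked_components(line):
    """Return the C terms named in occurs_in(...) in col17, if any."""
    cols = line.split("\t")
    if len(cols) <= EXTENSION_COL:
        return []
    return OCCURS_IN.findall(cols[EXTENSION_COL])


# Two annotations as normal: one to P (translation), one to C (mitochondrion).
# Only the P annotation carries the explicit link to the component.
p_row = make_row("geneA", "GO:0006412", "P", "occurs_in(GO:0005739)")
c_row = make_row("geneA", "GO:0005739", "C")

print(linked_components(p_row))
print(linked_components(c_row))
```

The point of the sketch is that the link lives on the P annotation
itself, so it no longer has to be inferred from a shared PMID.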

> I can see advantages and disadvantages of both approaches, either
> pre-composing or post-composing. However, right now, it seems that we
> are kind of half and half, where we have created localization-specific
> process terms for some things, but not for lots of others.  A big
> problem with this half and half situation for me, as an annotator and
> ontology developer, is that it is not clear when it is, or is not,
> appropriate to ask for a new localization-specific process term.

I agree. We need more precise guidelines. I think we're getting there.

> If we are on the path of fully instantiating all possible  
> localization-specific process terms, we will have a vast expansion  
> of the number of terms. Even currently, it's not just the term  
> "mitochondrial translation", but the children under that term  
> already duplicate whole sections of the children of "translation",  
> and then there are things we don't even have localization-specific P  
> terms for, like the mitochondrial specific proteases that initiated  
> the discussion...

Yes, it would be unwise to pre-compose all possible BP-CC
combinations in advance. Doing this on an as-requested basis
potentially creates a bottleneck and may still open the floodgates if
we encourage annotators to request terms any time they see a new
combination -- though it's still debatable whether, and at what size,
a large ontology becomes a problem. See my response to Jim, particularly
w.r.t. duplicating sections under "translation".

>
> -Karen
>
>
>
>
> On Wed, 4 Mar 2009, Harold Drabkin wrote:
>
>> They are in the sense that they would have the same PMID for both  
>> annotations.  So, two MF activities for the same reaction but  
>> located in different components.... The only difference between the
>> two activities is that they are done by different proteins. That
>> should not be a reason for two terms. Conversely, there are  
>> actually proteins that end up in both mitochondria and cytoplasm  
>> AND do the same activity (some AARS's come to mind). In this case 1  
>> activity, 2 component, vs 2 activities two components. But it's the  
>> actual same activity as well as protein, so actually there is no  
>> difference in the activity, only where it is done. hjd
>>
>>
>> Alexander Diehl wrote:
>>> I agree with Chris.  Co-annotation is insufficient, because the  
>>> annotations, once entered, are not linked to each other.
>>> -- Alex
>>> Chris Mungall wrote:
>>>> On Mar 4, 2009, at 9:34 AM, Harold Drabkin wrote:
>>>>> Valerie Wood wrote:
>>>>>> I agree that the aa specific tRNA aminoacylation process terms  
>>>>>> are excessively granular (although I have used them), as they  
>>>>>> are equivalent to the function terms. I still think we need  
>>>>>> terms for mitochondrial tRNA aminoacylation because, although  
>>>>>> the process is the same (and sometimes the gene products  
>>>>>> involved) the process has different target genes and hence  
>>>>>> different biological consequences (and phenotypes).
>>>>> I really think these are annotation issues, not ontology issues.  
>>>>> You are saying that the gene products involved are different.  
>>>>> But we are still talking about translation. We do not, and  
>>>>> should not, have things like "translation of x protein",  
>>>>> "translation of y protein"; it's still translation. Different
>>>>> "targets"; same process.  The ontology can be used to describe
>>>>> the overall biology of all sorts of different proteins.
>>>>>> For your simple search (i.e., to retrieve genes involved in
>>>>>> mitochondrial aminoacylation), a combination of mitochondria
>>>>>> and tRNA aminoacylation would work fine.
>>>>>> However, this is not the major use of GO. Increasingly, GO is
>>>>>> used for hypothesis-generating exercises with a complete gene
>>>>>> set. For these, combination searches are not helpful; you need
>>>>>> to be able to look for enrichment at the level of process, and
>>>>>> none of the enrichment tools (including the one in AmiGO) can
>>>>>> perform these inter-ontology analyses.
>>>>>> In a genome wide set you would not be able to detect  (for  
>>>>>> example) that the set of essential genes in S. cerevisiae is  
>>>>>> enriched for translation components BUT not for mitochondrial  
>>>>>> translation components, but that in pombe the mitochondrial  
>>>>>> translation components are also essential (this is a real  
>>>>>> example).
>>>>> Why not? If the user understands the GO properly, they would set  
>>>>> up the scan to return things annotated to both the component and  
>>>>> the process. You will see overlap in gene products that appear  
>>>>> enriched to "mitochondria" and "translation" annotations.
>>>> See my original email. Co-annotation is insufficient.
>>>> _______________________________________________
>>>> Go mailing list
>>>> Go at geneontology.org
>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>
>


