[Go] addition of localization specific process terms ?

Chris Mungall cjm at berkeleybop.org
Tue Mar 24 10:26:29 PDT 2009


On Mar 24, 2009, at 9:00 AM, Jim Hu wrote:

> In general, I like precomposition too.  But for binding, and to a  
> lesser extent location, I don't like the idea of having parent terms  
> with thousands of children.  The terms like regulation of  
> translation of gene X mRNA are terrifying to me.  I noticed that  
> somewhere on wiki.geneontology.org, there's a statement that GO will  
> never do those kinds of terms by precomposition, but a few terms  
> like that are already in GO, and there was recently a sourceforge  
> item about protein chaperones for specific gene products.

I'll let David and Tanya answer the regulation case :-)

> I usually find terms by searching for a keyword combination,  
> navigating to a particular term, and then browsing up and down the  
> ontology.  Do others not do the browsing part?  I think that's where  
> the massive expansion is most problematic.
>
> I see what you mean about time, but requesting a new term is also a  
> time barrier to annotation.

True. I am wondering if sometimes annotators simply do not make a  
request and annotate to the most specific currently pre-composed term,  
even if the new term they want falls within the precomposition  
guidelines. I hope not.

> Perhaps a test version of the ontology could be automatically  
> generated with ChEBI x binding, and people could see if my  
> intuitions or everyone else's are correct.

I think this experiment would be biased to confirm your intuitions!

If we materialized the full binding x CHEBI xp then we would end up  
propagating all the unusual CHEBI distinctions and naming issues in GO.

Anyway, I don't think anyone has ever argued in favour of  
materializing the full cross-product. The GO pre-composition rules  
would state that we would create a new "X binding" as child of "Y  
binding", if the process of binding to X was significantly different  
from binding to Y. Whilst the application of this rule will be fuzzy  
in places, it would still be premature to do the full pre-composition.
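
To make the contrast concrete, here is a minimal sketch in Python. The toy
chemical hierarchy, the DISTINCT_BINDING set and both function names are
made up for illustration; none of this is real ChEBI or GO content, and it
is not how the ontology is actually built.

```python
# Hypothetical toy chemical hierarchy (child -> parent); NOT real ChEBI content.
CHEMICAL_PARENT = {
    "nucleotide": "chemical entity",
    "ATP": "nucleotide",
    "GTP": "nucleotide",
    "ATP(4-)": "ATP",
}

# Hypothetical curator decisions: chemicals whose binding is judged
# significantly different from binding to the parent chemical.
DISTINCT_BINDING = {"nucleotide", "ATP"}


def full_cross_product(parent_of):
    """Create an 'X binding' term for every chemical, mirroring the hierarchy."""
    terms = {"chemical entity binding": "binding"}
    for child, parent in parent_of.items():
        terms[f"{child} binding"] = f"{parent} binding"
    return terms


def selective_precomposition(parent_of, distinct):
    """Create 'X binding' only for chemicals flagged as distinct.

    Each new term is placed under the nearest flagged ancestor,
    or directly under 'binding' if there is none.
    """
    terms = {}
    for chem in distinct:
        ancestor = parent_of.get(chem)
        while ancestor is not None and ancestor not in distinct:
            ancestor = parent_of.get(ancestor)
        terms[f"{chem} binding"] = f"{ancestor} binding" if ancestor else "binding"
    return terms


full = full_cross_product(CHEMICAL_PARENT)
selective = selective_precomposition(CHEMICAL_PARENT, DISTINCT_BINDING)
print(len(full), "terms from the full cross-product")
print(len(selective), "terms from selective pre-composition")
```

In the full version every distinction in the chemical hierarchy, such as
the hypothetical "ATP(4-)" entry, turns into a binding term. The selective
version only carries over the ones someone has judged to matter.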

>  In general, I suspect that people want precomposition for their own  
> annotations and are annoyed at the excess terms that they don't see  
> themselves ever using.  E. coli being the most distant from everyone  
> else in the phylogeny may be why I'm where I am on this! ;)

maybe..

Of course, it's possible to use the taxon constraints Jen created to  
automatically filter GO such that you (E. coli) never see a term like  
mitochondrial translation. I can show folks how to set up this filter  
in OE, and work with the software groups who make annotation tools to  
recapitulate the logic there.
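
As a rough illustration of the filtering logic (not the OE implementation),
here is a minimal Python sketch. The lineages, the term hierarchy, the
ONLY_IN_TAXON table and the function names are all toy assumptions, and only
an "only in taxon" style of constraint is modelled.

```python
# Hypothetical toy data; not taken from the real GO or taxon constraint files.
LINEAGE = {
    "Escherichia coli": ["Escherichia coli", "Bacteria", "cellular organisms"],
    "Mus musculus": ["Mus musculus", "Eukaryota", "cellular organisms"],
}

# Term -> is_a parent (None for the root of this toy fragment).
TERM_PARENT = {
    "translation": None,
    "mitochondrial translation": "translation",
    "mitochondrial translational initiation": "mitochondrial translation",
}

# Term -> taxon it is restricted to. The constraint is inherited by children.
ONLY_IN_TAXON = {"mitochondrial translation": "Eukaryota"}


def term_allowed(term, species):
    """Return True if no constraint on the term or its ancestors excludes the species."""
    lineage = set(LINEAGE[species])
    node = term
    while node is not None:
        required = ONLY_IN_TAXON.get(node)
        if required is not None and required not in lineage:
            return False
        node = TERM_PARENT.get(node)
    return True


def visible_terms(species):
    """List the terms an annotator for this species would still see."""
    return sorted(t for t in TERM_PARENT if term_allowed(t, species))


print(visible_terms("Escherichia coli"))
print(visible_terms("Mus musculus"))
```

Because the check walks up the is_a parents, a constraint placed once on
"mitochondrial translation" also hides its more specific children.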

> Jim
>
> On Mar 24, 2009, at 8:38 AM, Alexander Diehl wrote:
>
>> I want to add my agreement to the words of Val and David.  It is  
>> much simpler to use a pre-composed existing term in annotation.   
>> One aspect of the annotation process I feel is overlooked as we  
>> add more complexity to the annotation process is that post- 
>> composition adds a significant bit of time to the annotation  
>> process, resulting in fewer annotations overall and lower metrics  
>> for the database and grant.  While it is important to do detailed  
>> and correct annotations whenever possible, anything we can do to  
>> increase throughput, such as precomposing likely terms, is  
>> beneficial.  I'm not saying we should add all possible combinations  
>> of X and Y, just the appropriate ones.  This is one of the main  
>> reasons for having annotators lead ontology development and holding  
>> ontology content meetings where expert biologists can discuss  
>> processes actually seen in nature, so that the appropriate  
>> combinations of X and Y are added.
>>
>> And knowing which pre-composed terms to use is a matter of training  
>> and experience, both in general biology, and in annotation.   
>> There's no way around it.
>>
>> -- Alex
>>
>>
>> val at sanger.ac.uk wrote:
>>> I agree, it is far better to have pre-composed terms if possible,
>>> especially for new curators.
>>> As we encourage annotation to the most specific term possible, it is
>>> hard to overlook the precomposed terms, because we (I hope) always
>>> check the child terms.
>>>
>>> Val
>>>
>>>
>>>> From someone who has been annotating using a lot of pre-composition as
>>>> well as post-composition for a reasonably long time; although there is
>>>> an initial activation energy to get a pre-composed term into the
>>>> ontology, once they are there, they are much easier to use than to look
>>>> up things in multiple ontologies for post-composition.
>>>>
>>>> The key to finding pre-composed terms easily is to have a good way of
>>>> viewing the ontology.
>>>>
>>>> my 2c
>>>>
>>>> D
>>>>
>>>>> I would like to hear the opinion of some of the annotators here. Is
>>>>> excessive pre-coordination a concern for curation?
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Go mailing list
>>>> Go at geneontology.org
>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> -- 
>> Alexander D. Diehl, Ph.D.
>> Senior Scientific Curator
>> Mouse Genome Informatics
>> The Jackson Laboratory
>> 600 Main Street
>> Bar Harbor, ME  04609
>>
>> email:  adiehl at informatics.jax.org
>> work:  +1 (207) 288-6427
>> fax:  +1 (207) 288-6131
>>
>
> =====================================
> Jim Hu
> Associate Professor
> Dept. of Biochemistry and Biophysics
> 2128 TAMU
> Texas A&M Univ.
> College Station, TX 77843-2128
> 979-862-4054
>
>


