[Go] addition of localization specific process terms ?

Chris Mungall cjm at berkeleybop.org
Wed Mar 4 12:21:15 PST 2009


On Mar 4, 2009, at 10:33 AM, Jim Hu wrote:

> On Mar 4, 2009, at 11:49 AM, Chris Mungall wrote:
>
>> On Mar 4, 2009, at 7:59 AM, Jim Hu wrote:
>>
>>> On Mar 4, 2009, at 2:38 AM, Valerie Wood wrote:
>>>
>>>> Because of all of the arguments in favour mentioned by Karen and
>>>> Chris I thought it was always necessary and required for
>>>> curators to make the more granular annotation in these cases. We
>>>> decided long ago that proliferation of the ontology was not an
>>>> issue when pitched against accurate capture of biology, and I
>>>> wasn't aware that it was ever GO philosophy not to capture
>>>> compartment-specific processes in this way.
>>>
>>> I wasn't involved in GO when this was decided, but as someone who  
>>> does stuff on the software side as well as the annotation side, I  
>>> think proliferation of the ontology should be an issue that is not  
>>> dismissed so lightly.
>>
>> What are your concerns in particular?
>
> My two concerns are the obvious ones, nothing particularly  
> sophisticated:
>
> 1) performance, especially of web-based tools that have to display  
> GO with short processing times.  IIRC, AmiGO has had this problem -  
> traversing the ontology to find all the children and annotations to  
> children is slow enough that Mike had to write a cron job to kill  
> excess db queries that came from users getting impatient and  
> reloading the page while the traversal was in progress.  As the  
> ontology gets big, these traversals take longer.  Maybe there are  
> more efficient algorithms to deal with this, maybe AJAX partially
> makes this tolerable, and maybe the problem is the same with
> post-composition.  But it seems to me that at some point sheer size
> carries a performance hit.
>
> 2) User interface.  When I browse the ontology to look for the  
> appropriate terms to do an annotation, there are nodes that would be  
> unreadable if precomposition was being done consistently.   
> Fortunately it isn't being done consistently at present.  For  
> example, look at the children of the positive and negative  
> regulation terms in the process ontology.  There are terms in there  
> for mRNAs for specific genes (oskar and bicoid)!  That strikes me as  
> being completely insane... if implemented for all regulated genes in  
> all organisms, that node would have hundreds of thousands of  
> children - it would be a large subset of UniProt/GenBank all at one
> level.  Or worse, because many genes would be present at multiple  
> overpopulated nodes in GO.
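
On the traversal cost in point 1: one common way to keep lookups cheap as the ontology grows is to compute the transitive closure once, ahead of time, so that "all descendants of a term" becomes a dictionary lookup instead of a graph walk per page load. The following is only a minimal sketch under that assumption. The term names, gene names and function names are made up for illustration, and this is not how AmiGO or its database is actually implemented.

```python
from collections import defaultdict

def build_ancestor_closure(parents):
    """Map every term to the set of itself plus all its is_a ancestors.

    `parents` maps term -> list of direct parents; assumed to be acyclic.
    """
    closure = {}

    def ancestors(term):
        if term not in closure:
            result = {term}
            for parent in parents.get(term, ()):
                result |= ancestors(parent)
            closure[term] = result
        return closure[term]

    for term in parents:
        ancestors(term)
    return closure

def build_descendant_index(closure):
    """Invert the closure: ancestor -> set of itself plus all descendants."""
    index = defaultdict(set)
    for term, ancs in closure.items():
        for anc in ancs:
            index[anc].add(term)
    return index

def annotated_genes(term, index, annotations):
    """Genes annotated to `term` or to any of its descendants."""
    genes = set()
    for desc in index.get(term, ()):
        genes |= annotations.get(desc, set())
    return genes

# Toy data (hypothetical): a three-term is_a chain and two annotations.
toy_parents = {
    "regulation of biological process": [],
    "regulation of translation": ["regulation of biological process"],
    "negative regulation of translation": ["regulation of translation"],
}
toy_annotations = {
    "regulation of translation": {"geneB"},
    "negative regulation of translation": {"geneA"},
}

toy_index = build_descendant_index(build_ancestor_closure(toy_parents))
toy_result = annotated_genes(
    "regulation of biological process", toy_index, toy_annotations
)
```

The closure costs time and space up front and has to be rebuilt when the ontology changes, so it trades build cost for faster queries rather than making size irrelevant.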

I would also add another concern that others often bring up:

3) Difficulty in maintaining the correct parentage in the ontology  
(Karen brought this up in her email)

However, I would respond to this and say that as we gain confidence in  
using the cross-product definitions and the reasoner to automate this  
procedure it becomes less of a concern (not yet eliminated, but less  
of a concern).

For example, there used to be massive errors in the regulation graph,
but we now run the reasoner over the regulation xps in batch
frequently, and as soon as OE2 is released we can incorporate this
directly into the ontology editing cycle. Thanks to Midori's
efforts we are making a lot of progress on the more difficult BPxCC
composite terms, and I feel we will soon be able to manage the
hierarchy for these terms automatically, making pre-composition less
of a worry here:

	http://wiki.geneontology.org/index.php/XP:biological_process_xp_cellular_component
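
To illustrate the idea of letting cross-product definitions drive the hierarchy, here is a minimal sketch, assuming each composite term is defined by a genus (a process), a relation, and a filler (a component). One defined term is placed under another when its genus and its filler are each subsumed by the other's. The term names, the `occurs_in` relation label and the function names are assumptions made for this toy example; this is not the OBO-Edit reasoner.

```python
def is_a_closure(parents, term):
    """Return `term` plus all of its is_a ancestors (iterative walk)."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def subsumes(parent_def, child_def, parents):
    """True if child_def is at least as specific as parent_def.

    Each definition is a (genus, relation, filler) triple.
    """
    p_genus, p_rel, p_filler = parent_def
    c_genus, c_rel, c_filler = child_def
    return (
        p_rel == c_rel
        and p_genus in is_a_closure(parents, c_genus)
        and p_filler in is_a_closure(parents, c_filler)
    )

def inferred_parents(xp_defs, parents):
    """For each defined term, list the other defined terms that subsume it.

    This returns every subsumer, not only the most specific (direct) ones.
    """
    result = {}
    for child, child_def in xp_defs.items():
        result[child] = sorted(
            other
            for other, other_def in xp_defs.items()
            if other != child and subsumes(other_def, child_def, parents)
        )
    return result

# Toy asserted is_a links (hypothetical subset).
toy_is_a = {
    "translation": [],
    "intracellular organelle": [],
    "mitochondrion": ["intracellular organelle"],
}

# Toy cross-product definitions: term -> (genus, relation, filler).
toy_xp_defs = {
    "mitochondrial translation": ("translation", "occurs_in", "mitochondrion"),
    "translation in organelle": (
        "translation", "occurs_in", "intracellular organelle"
    ),
}

toy_inferred = inferred_parents(toy_xp_defs, toy_is_a)
```

In this toy case the link from "mitochondrial translation" to "translation in organelle" is never asserted by hand; it follows from the two definitions plus the component hierarchy.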
>
> I shudder to think what the graph representations would look like.
>
>
> <snip>
>>> I find the argument that one can't do an AND with some tools to be  
>>> more of an argument to improve the tools than an argument to do  
>>> extensive precomposition.  If we have to build GO practice around  
>>> the weakest tools, then we should also do explicit annotation all  
>>> the way up to root for every term, to handle tools that don't use  
>>> the true path rule.  I'm NOT advocating that!!
>>
>> I agree that we shouldn't avoid doing the right thing because of  
>> the weakest tools. I think we should have a plan for how we can  
>> support tools, but I think we first need to agree roughly on what  
>> the right thing is.
>
> I think everyone agrees with this!
>
> Jim
>
>
>>
>>>
>>> Jim
>>>
>>> =====================================
>>> Jim Hu
>>> Associate Professor
>>> Dept. of Biochemistry and Biophysics
>>> 2128 TAMU
>>> Texas A&M Univ.
>>> College Station, TX 77843-2128
>>> 979-862-4054
>>>
>>>
>>>
>>


