[Go] addition of localization specific process terms ?

Jim Hu jimhu at tamu.edu
Wed Mar 4 10:33:40 PST 2009


On Mar 4, 2009, at 11:49 AM, Chris Mungall wrote:

> On Mar 4, 2009, at 7:59 AM, Jim Hu wrote:
>
>> On Mar 4, 2009, at 2:38 AM, Valerie Wood wrote:
>>
>>> Because of all of the arguments in favour mentioned by Karen and
>>> Chris I thought it was always necessary and required for curators
>>> to make the more granular annotation in these cases. We decided
>>> long ago that proliferation of the ontology was not an issue when
>>> pitched against accurate capture of biology, and I wasn't aware
>>> that it was ever GO philosophy not to capture compartment specific
>>> processes in this way.
>>
>> I wasn't involved in GO when this was decided, but as someone who  
>> does stuff on the software side as well as the annotation side, I  
>> think proliferation of the ontology should be an issue that is not  
>> dismissed so lightly.
>
> What are your concerns in particular?
>>

My two concerns are the obvious ones, nothing particularly  
sophisticated:

1) Performance, especially of web-based tools that have to display GO
with short processing times.  IIRC, AmiGO has had this problem -
traversing the ontology to find all the children and annotations to
children is slow enough that Mike had to write a cron job to kill
excess db queries that came from users getting impatient and reloading
the page while the traversal was in progress.  As the ontology grows,
these traversals take longer.  Maybe there are more efficient
algorithms to deal with this, maybe AJAX partially makes this
tolerable, and maybe the problem is the same with post-composition.
But it seems to me that at some point sheer size carries a performance hit.

2) User interface.  When I browse the ontology to look for the
appropriate terms to do an annotation, there are nodes that would be
unreadable if precomposition were being done consistently.  Fortunately
it isn't being done consistently at present.  For example, look at the
children of the positive and negative regulation terms in the process
ontology.  There are terms in there for mRNAs for specific genes
(oskar and bicoid)!  That strikes me as being completely insane... if
implemented for all regulated genes in all organisms, that node would
have hundreds of thousands of children - it would be a large subset of
UniProt/GenBank all at one level.  Or worse, because many genes would
be present at multiple overpopulated nodes in GO.
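
To make the traversal cost in concern 1 concrete, here is a minimal
sketch of one common workaround: walk the graph once, store each term's
descendants, and answer "all annotations at or below this term" from
the stored closure instead of re-traversing on every page load.  The
term IDs, the dict layout and the function names are made up for
illustration; this is not AmiGO's schema or code.

```python
from collections import defaultdict

# Hypothetical toy ontology: child term -> list of parent terms.
PARENTS = {
    "T:B": ["T:A"],
    "T:C": ["T:A"],
    "T:D": ["T:B", "T:C"],
}

# Hypothetical direct annotations: term -> set of gene product ids.
DIRECT = {
    "T:B": {"gene1"},
    "T:D": {"gene2", "gene3"},
}


def build_children(parents):
    """Invert the child -> parents map into parent -> children."""
    children = defaultdict(set)
    for child, parent_list in parents.items():
        for parent in parent_list:
            children[parent].add(child)
    return children


def descendants(term, children):
    """Collect every term reachable below `term` (iterative DFS)."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def build_closure(parents):
    """Precompute term -> set of all descendants, once, up front."""
    children = build_children(parents)
    terms = set(parents) | {p for ps in parents.values() for p in ps}
    return {t: descendants(t, children) for t in terms}


def annotations_under(term, closure, direct):
    """Gene products annotated to `term` or to any of its descendants."""
    genes = set(direct.get(term, ()))
    for d in closure.get(term, ()):
        genes |= direct.get(d, set())
    return genes


CLOSURE = build_closure(PARENTS)
```

The closure trades memory for lookup time, and it grows with the
ontology too, so this moves the size cost rather than removing it.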

I shudder to think what the graph representations would look like.
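
The growth in concern 2 is simple arithmetic, sketched below.  If every
combination gets its own precomposed term, the term count is the
product of the two lists; if the pieces are combined at annotation time
(post-composition), it is only their sum.  The function names and the
counts in the usage note are hypothetical, not figures taken from GO.

```python
def precomposed_term_count(n_processes, n_qualifiers):
    """One ontology term for every (process, qualifier) pair."""
    return n_processes * n_qualifiers


def postcomposed_term_count(n_processes, n_qualifiers):
    """Processes and qualifiers kept as separate terms, paired at annotation time."""
    return n_processes + n_qualifiers
```

With made-up counts of 100 processes and 50 qualifiers (compartments,
or regulated genes), that is 5000 terms precomposed against 150 kept
separate.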


<snip>
>> I find the argument that one can't do an AND with some tools to be  
>> more of an argument to improve the tools than an argument to do  
>> extensive precomposition.  If we have to build GO practice around  
>> the weakest tools, then we should also do explicit annotation all  
>> the way up to root for every term, to handle tools that don't use  
>> the true path rule.  I'm NOT advocating that!!
>
> I agree that we shouldn't avoid doing the right thing because of the  
> weakest tools. I think we should have a plan for how we can support  
> tools, but I think we first need to agree roughly on what the right  
> thing is.

I think everyone agrees with this!
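
For readers who have not met the two ideas in the quoted paragraph
above, here is a minimal sketch of both: the true path rule (an
annotation to a term implies annotation to all of its ancestors) and
an AND query across two terms done in the tool rather than through a
precomposed term.  The term names, gene names and data layout are
invented for illustration and are not real GO identifiers or any
tool's API.

```python
# Hypothetical toy ontology: term -> list of parent terms.
PARENTS = {
    "process:transport": [],
    "process:protein_transport": ["process:transport"],
    "component:cell": [],
    "component:nucleus": ["component:cell"],
}

# Hypothetical direct annotations: gene -> set of terms.
DIRECT = {
    "geneA": {"process:protein_transport", "component:nucleus"},
    "geneB": {"process:protein_transport"},
    "geneC": {"component:nucleus"},
}


def ancestors(term, parents):
    """All terms reachable above `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_terms(gene, direct, parents):
    """True path rule: direct terms plus every ancestor of each one."""
    terms = set(direct.get(gene, ()))
    for term in list(terms):
        terms |= ancestors(term, parents)
    return terms


def genes_with_all(query_terms, direct, parents):
    """AND query: genes whose implied terms include every query term."""
    wanted = set(query_terms)
    return {g for g in direct if wanted <= implied_terms(g, direct, parents)}
```

Here genes_with_all(["process:transport", "component:cell"], DIRECT,
PARENTS) picks out geneA only, with no combined "transport in cell"
term and no explicit annotation up to the root.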

Jim


=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054



