[Go] addition of localization specific process terms?
Chris Mungall
cjm at berkeleybop.org
Tue Mar 24 12:43:30 PDT 2009
I agree - but I don't personally have a good intuition for what these
novice annotators need. Perhaps the outreach group and WPWG could come
up with some requirements? So far we have been focusing on end users,
but as you say, novice annotators are an important group to serve.
On Mar 24, 2009, at 10:10 AM, Gwinn Giglio, Michelle wrote:
>
>
> Hi,
>
> I know this discussion is tabled until the meeting; I just want to make a
> small comment on what Chris said earlier about the MODs and UniProt having
> access to tools that will make finding the terms easier. I agree that is
> probably true, not to mention that the MODs' expertise with the GO will
> make it much easier for them to find terms. But when we approach this
> discussion, I want to remind people that we need to remember the needs of
> annotators beyond the MODs. One of the GO goals is to get more people doing
> GO annotation. Therefore we need to strike a careful balance between
> complexity and usability. Whenever possible it would be great for the GO to
> provide tools (browsing and annotation) that make all of this easy for
> novice annotators no matter how complex the underlying systems are (but I
> realize this is much easier said than done). We have to remember that if
> things get too complicated, people won't even try to use GO.
>
> Michelle
>
>
>
> On 3/23/09 7:37 PM, "David Hill" <dph at informatics.jax.org> wrote:
>
>> It's on the agenda in ontology development, although we may want to move
>> its timing.
>>
>> David
>>
>> Karen Christie wrote:
>>> I actually meant the "when to instantiate localization specific
>>> process terms" issue, though that is perhaps tied up in the col 16 and
>>> 17 discussion too.
>>>
>>> -Karen
>>>
>>>
>>> On Mon, 23 Mar 2009, Chris Mungall wrote:
>>>
>>>>
>>>> Thanks Karen
>>>>
>>>> I guess it makes sense to talk about col 16 (and 17 whilst we are
>>>> there anyway) before the binding discussion?
>>>>
>>>> On Mar 23, 2009, at 4:09 PM, Karen Christie wrote:
>>>>
>>>>> Maybe we should talk about this topic at the GO meeting. While there
>>>>> was lots of discussion, I never really got a sense of what I should
>>>>> actually do now, in terms of when, or when not, to request new
>>>>> "pre-composed" terms.
>>>>>
>>>>> I guess I'll put this on the agenda.
>>>>>
>>>>> -Karen
>>>>>
>>>>>
>>>>> On Mon, 23 Mar 2009, Chris Mungall wrote:
>>>>>
>>>>>>
>>>>>> On Mar 4, 2009, at 12:21 PM, Chris Mungall wrote:
>>>>>>
>>>>>>> On Mar 4, 2009, at 10:33 AM, Jim Hu wrote:
>>>>>>>> On Mar 4, 2009, at 11:49 AM, Chris Mungall wrote:
>>>>>>>>> On Mar 4, 2009, at 7:59 AM, Jim Hu wrote:
>>>>>>>>>> On Mar 4, 2009, at 2:38 AM, Valerie Wood wrote:
>>>>>>>>>>> Because of all of the arguments in favour mentioned by Karen
>>>>>>>>>>> and Chris I thought it was always necessary and required for
>>>>>>>>>>> curators to make the more granular annotation in these cases.
>>>>>>>>>>> We decided long ago that proliferation of the ontology was not
>>>>>>>>>>> an issue when pitched against accurate capture of biology,
>>>>>>>>>>> and I wasn't aware that it was ever GO philosophy not to
>>>>>>>>>>> capture compartment specific processes in this way.
>>>>>>>>>> I wasn't involved in GO when this was decided, but as someone
>>>>>>>>>> who does stuff on the software side as well as the annotation
>>>>>>>>>> side, I think proliferation of the ontology should be an issue
>>>>>>>>>> that is not dismissed so lightly.
>>>>>>>>> What are your concerns in particular?
>>>>>>>> My two concerns are the obvious ones, nothing particularly
>>>>>>>> sophisticated:
>>>>>>>> 1) performance, especially of web-based tools that have to
>>>>>>>> display GO with short processing times. IIRC, AmiGO has had this
>>>>>>>> problem - traversing the ontology to find all the children and
>>>>>>>> annotations to children is slow enough that Mike had to write a
>>>>>>>> cron job to kill excess db queries that came from users getting
>>>>>>>> impatient and reloading the page while the traversal was in
>>>>>>>> progress. As the ontology gets big, these traversals take
>>>>>>>> longer. Maybe there are more efficient algorithms to deal with
>>>>>>>> this, maybe AJAX partially makes this tolerable, and maybe the
>>>>>>>> problem is the same with post-composition. But it seems to me
>>>>>>>> that at some point sheer size has a performance hit.
>>>>>>>> 2) User interface. When I browse the ontology to look for the
>>>>>>>> appropriate terms to do an annotation, there are nodes that would
>>>>>>>> be unreadable if precomposition was being done consistently.
>>>>>>>> Fortunately it isn't being done consistently at present. For
>>>>>>>> example, look at the children of the positive and negative
>>>>>>>> regulation terms in the process ontology. There are terms in
>>>>>>>> there for mRNAs for specific genes (oskar and bicoid)! That
>>>>>>>> strikes me as being completely insane... if implemented for all
>>>>>>>> regulated genes in all organisms, that node would have hundreds
>>>>>>>> of thousands of children - it would be a large subset of
>>>>>>>> UniProt/GenBank all at one level. Or worse, because many genes
>>>>>>>> would be present at multiple overpopulated nodes in GO.
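One common way around the per-request traversal cost described in point 1
is to compute the ancestor closure once and index annotations against it,
so that "all annotations to this term and its children" becomes a lookup.
The following is only a minimal sketch under stated assumptions: the term
labels, gene names, and the PARENTS/DIRECT structures are made up for
illustration and are not real GO identifiers, and this is not a description
of how AmiGO works.

```python
from functools import lru_cache

# Toy is_a graph: child -> parents. Labels are illustrative, not real GO IDs.
PARENTS = {
    "root": (),
    "binding": ("root",),
    "regulation": ("root",),
    "positive regulation": ("regulation",),
    "negative regulation": ("regulation",),
}

# Direct annotations: gene -> terms. Gene names are hypothetical.
DIRECT = {
    "geneA": {"positive regulation"},
    "geneB": {"negative regulation", "binding"},
}


@lru_cache(maxsize=None)
def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return frozenset(result)


def build_index(direct):
    """Precompute term -> genes, propagating each annotation to all ancestors."""
    index = {term: set() for term in PARENTS}
    for gene, terms in direct.items():
        for term in terms:
            for anc in ancestors(term):
                index[anc].add(gene)
    return index


# Built once (e.g. at load time); queries are then plain dictionary lookups.
INDEX = build_index(DIRECT)
print(sorted(INDEX["regulation"]))  # ['geneA', 'geneB']
print(sorted(INDEX["binding"]))     # ['geneB']
```

The trade-off is that the index grows with the number of term/ancestor
pairs, so a larger ontology still costs something; the cost just moves from
query time to build time and storage.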
>>>>>>
>>>>>> I previously addressed this from an end-user point of view. But as
>>>>>> Jim mentions in the sf tracker item about binding, it's also
>>>>>> important to consider this from the curation point of view.
>>>>>>
>>>>>> Jim's point is that increased pre-coordination in the ontology
>>>>>> makes it harder for curators, because it will take longer to hone
>>>>>> in on the most appropriate term for an annotation.
>>>>>>
>>>>>> Whilst I can see that obviously there is some correlation between
>>>>>> ontology size and time to find a term, I'm wondering about the extent
>>>>>> to which this is a problem. I would have expected that most annotation
>>>>>> systems used at the MODs and UniProtKB would utilize some kind of
>>>>>> term completion rather than the curator manually traversing down
>>>>>> the graph. Also, if the curators are expected to post-compose using
>>>>>> col 16, then they have *two* terms to find: for example, to annotate
>>>>>> "PEP binding" they would find the most specific term in GO *and*
>>>>>> the relevant CHEBI terms (and finding terms in CHEBI is probably
>>>>>> harder than finding terms in GO).
>>>>>>
>>>>>> But I don't annotate so I'm not sure.
>>>>>>
>>>>>> I would like to hear the opinion of some of the annotators here. Is
>>>>>> excessive pre-coordination a concern for curation?
>>>>>>
>>>>>> I think it would be good if at the meeting a representative curator
>>>>>> from each of the main annotation producing groups were to comment
>>>>>> on the various situations in which pre-composed terms vs col 16 are
>>>>>> preferred.
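To make the pre-composed vs col 16 comparison concrete, here is a rough
sketch of the two ways the "PEP binding" example could be recorded, and of
how a cross-product definition could relate them. Everything here is an
assumption for illustration: the Annotation record, the XP table, the gene
name, and the "CHEBI:PEP" placeholder are hypothetical and do not reflect
the actual annotation file format or real identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    gene: str
    go_term: str                     # label used in place of a real GO ID
    extension: Optional[str] = None  # stand-in for a col 16 value


# Pre-composed: the curator finds one specific term in GO.
pre = Annotation(gene="geneA", go_term="PEP binding")

# Post-composed: the curator finds a general GO term *and* a CHEBI entity.
# "CHEBI:PEP" is a placeholder, not a real CHEBI identifier.
post = Annotation(gene="geneA", go_term="binding", extension="CHEBI:PEP")

# Hypothetical cross-product definitions: pre-composed label -> (genus, differentia).
XP = {"PEP binding": ("binding", "CHEBI:PEP")}


def normalize(annotation):
    """Expand a pre-composed term into (gene, general term, extension)."""
    if annotation.extension is None and annotation.go_term in XP:
        genus, differentia = XP[annotation.go_term]
        return (annotation.gene, genus, differentia)
    return (annotation.gene, annotation.go_term, annotation.extension)


# With an xp definition in place, the two styles carry the same information.
print(normalize(pre) == normalize(post))  # True
```

Under these assumptions the choice affects how many lookups the curator
makes (one term vs two), not what can ultimately be expressed.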
>>>>>>
>>>>>>> I would also add another concern that others often bring up:
>>>>>>> 3) Difficulty in maintaining the correct parentage in the ontology
>>>>>>> (Karen brought this up in her email)
>>>>>>> However, I would respond to this and say that as we gain
>>>>>>> confidence in using the cross-product definitions and the reasoner
>>>>>>> to automate this procedure it becomes less of a concern (not yet
>>>>>>> eliminated, but less of a concern).
>>>>>>> For example, there used to be massive errors in the regulation
>>>>>>> graph, but we now use the reasoner and the regulation xps in batch
>>>>>>> frequently, and as soon as OE2 is released we can incorporate
>>>>>>> this directly into the ontology editing cycle. Thanks
>>>>>>> to Midori's efforts we are making a lot of progress on the more
>>>>>>> difficult BPxCC composite terms, and I feel we will soon be able
>>>>>>> to manage the hierarchy for these terms automatically, making
>>>>>>> pre-composition less of a worry here:
>>>>>>>
>>>>>>>
>>>>>>> http://wiki.geneontology.org/index.php/XP:biological_process_xp_cellular_component
>>>>>>>
>>>>>>>> I shudder to think what the graph representations would look like.
>>>>>>>> <snip>
>>>>>>>>>> I find the argument that one can't do an AND with some tools to
>>>>>>>>>> be more of an argument to improve the tools than an argument to
>>>>>>>>>> do extensive precomposition. If we have to build GO practice
>>>>>>>>>> around the weakest tools, then we should also do explicit
>>>>>>>>>> annotation all the way up to root for every term, to handle
>>>>>>>>>> tools that don't use the true path rule. I'm NOT advocating
>>>>>>>>>> that!!
>>>>>>>>> I agree that we shouldn't avoid doing the right thing because of
>>>>>>>>> the weakest tools. I think we should have a plan for how we can
>>>>>>>>> support tools, but I think we first need to agree roughly on
>>>>>>>>> what the right thing is.
>>>>>>>> I think everyone agrees with this!
>>>>>>>> Jim
>>>>>>>>>> Jim
>>>>>>>>>> =====================================
>>>>>>>>>> Jim Hu
>>>>>>>>>> Associate Professor
>>>>>>>>>> Dept. of Biochemistry and Biophysics
>>>>>>>>>> 2128 TAMU
>>>>>>>>>> Texas A&M Univ.
>>>>>>>>>> College Station, TX 77843-2128
>>>>>>>>>> 979-862-4054
>>>>>>> _______________________________________________
>>>>>>> Go mailing list
>>>>>>> Go at geneontology.org
>>>>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>>>>
>>>>
>
>