[go] AmiGO and annotation qualifiers
Chris Mungall
cjm at fruitfly.org
Fri Feb 1 12:15:08 PST 2008
So I'm following up on Jim's comment on showing contradictions - I'm
just dealing with the simple case where there is a single splice-form.
Harold's case is different, as here there is in fact no
contradiction. I'll answer Harold's in a separate email.
On Feb 1, 2008, at 12:08 PM, Judith Blake wrote:
> hummm
> I think the case that Harold was describing, and that we currently have
> in other annotations here at MGI, is that we have
>
> x A
> x NOT A
>
> both lines of evidence exist at this point.
>
> In some cases, different experiments give different results
>
> In the case that Harold discussed, the issue really was that we
> don't properly distinguish isoforms, so from the gene level, the
> two are combined whereas if you could represent each isoform, the
> one would be x-1 A and one would be x-2 NOT A.
>
> judy
>
> Chris Mungall wrote:
>>
>> On Jan 31, 2008, at 2:04 PM, Jim Hu wrote:
>>
>>>
>>> On Jan 31, 2008, at 1:30 PM, Chris Mungall wrote:
>>>
>>>>
>>>> Unambiguous is good.
>>>>
>>>> A related question:
>>>>
>>>> Let's assume there is a NOT annotation for GO:0008047 enzyme
>>>> activator activity (is_a parent of proteasome activator activity)
>>>>
>>>> If you are on the page for GO:0008538 proteasome activator
>>>> activity, would you expect to see this annotation?
>>>
>>> I'd expect to see this iff there is an unqualified annotation to
>>> the same product. In other words, this is only useful as a
>>> conflict flag IMO.
>>
>> So let's summarise
>>
>> Given an is_a hierarchy:
>>
>> A
>>   B
>>     C
>>
>> And annotations of gene product x, y and z - here, numbered for
>> reference purposes:
>>
>> [1] x NOT A
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [5] z A
>> [6] z C
>>
>> The same information shown in the DAG:
>>
>> A : NOT-x, z
>> B : x, y
>> C : NOT-y, z
>>
>> We have a contradicting pair of annotations:
>>
>> [1]+[2]
>>
>> the other pairs ([3]+[4] and [5]+[6]) are non-contradicting.
>>
>> By your criteria the page for B should show any annotations of
>> gene products that are purported to have property B, *plus* any
>> annotations that contradict the annotations shown. i.e. the page
>> would show:
>>
>> [1] x NOT A
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [6] z C
>>
>> (structured in some way that reflects the DAG and with the
>> contradiction visually highlighted)
>>
>> Note that [5] is not shown because A is less specific than B, and
>> it contradicts nothing on the page
>>
>> Note that AmiGO (and I imagine most annotation browsers) is
>> currently set up so that the following are shown on the detail
>> page for B:
>>
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [6] z C
>>
>> The fact that [2] has a contradiction with [1] becomes apparent
>> only as focus is shifted up the DAG. We have a simple algorithm
>> that goes down the DAG (or propagates genes up depending how you
>> look at it), ignoring the NOT reversal rule.
>>
>> Which do people prefer?
>>
>> Either way AmiGO will most likely keep the current algorithm for
>> the next release.
>>
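To make the two display options above concrete, here is a minimal Python
sketch of the toy example. It is an illustration only, not AmiGO code: the
names PARENT, ANNOTATIONS, contradictions, current_view and proposed_view are
hypothetical, and it assumes a single-parent is_a chain A > B > C with one
splice-form per gene product.

```python
# Toy model of the example above; all names are illustrative, not AmiGO code.
PARENT = {"B": "A", "C": "B"}  # is_a links, child -> parent (A is the root)

# (reference number, gene product, term, is_not)
ANNOTATIONS = [
    (1, "x", "A", True),
    (2, "x", "B", False),
    (3, "y", "B", False),
    (4, "y", "C", True),
    (5, "z", "A", False),
    (6, "z", "C", False),
]


def ancestors_or_self(term):
    """Return the term followed by every term reachable via is_a."""
    lineage = [term]
    while term in PARENT:
        term = PARENT[term]
        lineage.append(term)
    return lineage


def contradictions(annotations):
    """Pairs (not_id, pos_id) for the same gene product where the NOT term
    is the positively annotated term or one of its ancestors."""
    pairs = []
    for not_id, not_gp, not_term, is_not in annotations:
        if not is_not:
            continue
        for pos_id, pos_gp, pos_term, pos_is_not in annotations:
            if pos_is_not or pos_gp != not_gp:
                continue
            if not_term in ancestors_or_self(pos_term):
                pairs.append((not_id, pos_id))
    return pairs


def current_view(term, annotations):
    """Everything annotated to the term or a descendant, NOT or otherwise."""
    return [a[0] for a in annotations if term in ancestors_or_self(a[2])]


def proposed_view(term, annotations):
    """The current view plus any annotation contradicting one that is shown."""
    shown = set(current_view(term, annotations))
    for not_id, pos_id in contradictions(annotations):
        if not_id in shown or pos_id in shown:
            shown.update((not_id, pos_id))
    return sorted(shown)


print(contradictions(ANNOTATIONS))
print(current_view("B", ANNOTATIONS))
print(proposed_view("B", ANNOTATIONS))
```

For the page for B this prints [(1, 2)], then [2, 3, 4, 6] for the current
behaviour, then [1, 2, 3, 4, 6] for the behaviour Jim describes. A real DAG
has multiple parents per term, so ancestors_or_self would need a proper graph
traversal there.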
>>>>
>>>>
>>>> I think most people's answer would be no, this would be too
>>>> confusing.
>>>>
>>>> However, it is the case that anything that is not an enzyme
>>>> activator is by definition not a proteasome activator activity.
>>>>
>>>>
>>>>
>>>> On Jan 31, 2008, at 11:21 AM, Michael Ashburner wrote:
>>>>
>>>>> I prefer demo3 - it is clear and unambiguous.
>>>>>
>>>>> Michael
>>>>>
>>>>> On 29 Jan 2008, at 14:00, Amelia Ireland wrote:
>>>>>
>>>>>> Hi GO Consortium,
>>>>>>
>>>>>> There has been some discussion in the web presence working
>>>>>> group about the display of NOT annotations and annotations
>>>>>> with qualifiers. There are two issues: how to display the
>>>>>> annotations with qualifiers, and how gene product counts for
>>>>>> terms with qualifiers should be displayed. We'd like to get
>>>>>> some feedback from the consortium on these issues before
>>>>>> releasing the new AmiGO.
>>>>>>
>>>>>> Annotations With Qualifiers
>>>>>> ===========================
>>>>>>
>>>>>> I've made some mock ups of different possible arrangements here:
>>>>>>
>>>>>> http://www.ebi.ac.uk/~aji/demo.html
>>>>>> http://www.ebi.ac.uk/~aji/demo2.html
>>>>>> http://www.ebi.ac.uk/~aji/demo3.html
>>>>>>
>>>>>> These pages are viewing the GPs annotated to a term (and its
>>>>>> children), but a similar arrangement will be used to view the
>>>>>> terms with which a GP is associated.
>>>>>>
>>>>>> Which arrangement is the clearest? Things to consider:
>>>>>>
>>>>>> - how easy is it to see inconsistent annotations, e.g. the
>>>>>> same GP has a normal and a NOT annotation?
>>>>>>
>>>>>> - is it clear what the qualifier applies to?
>>>>>>
>>>>>> - some views (e.g. having the associations with operators/
>>>>>> qualifiers separated out) require extra calculations to
>>>>>> generate and will hence be slower
>>>>>>
>>>>>> - should the default view contain NOT annotations? Is a GO
>>>>>> newbie going to understand them?
>>>>>>
>>>>>> - is the 'nn gene products' link unambiguous in meaning? Is it
>>>>>> helpful to know how many GPs there are annotated to a term?
>>>>>>
>>>>>>
>>>>>> Gene Product Counts
>>>>>> ===================
>>>>>>
>>>>>> The WPWG discussed generating separate totals for numbers of
>>>>>> GPs annotated to a term without any qualifier and the no. of
>>>>>> GPs annotated to a term with each of the other qualifiers
>>>>>> (currently the GP count for a term *includes* any GPs
>>>>>> annotated with 'contributes_to' and 'colocalizes_with', but
>>>>>> *excludes* NOT annotations). How useful would this be, and is
>>>>>> it intuitive to split the counts up thus, or is it confusing
>>>>>> (especially for GO newbies)? There is also this to bear in mind:
>>>>>>
>>>>>> On 24 Jan 2008, at 20:11, Chris Mungall wrote:
>>>>>>> On Jan 24, 2008, at 11:08 AM, Seth Carbon wrote:
>>>>>>>> *) ask chris about splitting the gp count by qualifier (seth)
>>>>>>>
>>>>>>> you mean by contributes_to, the NOT operator etc?
>>>>>>>
>>>>>>> unfortunately it is expensive to pre-compute the gp counts in
>>>>>>> any way that a single gp can end up in two partitions.
>>>>>>>
>>>>>>> we currently partition by gp db name - no gp can be in > 1.
>>>>>>> this has the advantage that we can get the correct total by
>>>>>>> summing the numbers at query time.
>>>>>>>
>>>>>>> however, if we have a partition that does not split gps in
>>>>>>> this way we can't sum across the partitions, which means we
>>>>>>> have to pre-compute for all combinations.
>>>>>>>
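The summing argument can be illustrated with a small Python sketch. The gene
products, database names and qualifier sets here are made up for the
illustration, and the function names are hypothetical; this is not the actual
pre-computation code.

```python
from collections import Counter

# Hypothetical gene products annotated to one term:
# name -> (source database, qualifiers seen on its annotations to the term)
GPS = {
    "gp1": ("FB", {"unqualified"}),
    "gp2": ("FB", {"unqualified", "contributes_to"}),
    "gp3": ("MGI", {"contributes_to"}),
}


def counts_by_db(gps):
    """Disjoint partition: each gene product belongs to exactly one database."""
    return Counter(db for db, _ in gps.values())


def counts_by_qualifier(gps):
    """Overlapping partition: a gene product is counted once per qualifier."""
    return Counter(q for _, quals in gps.values() for q in quals)


def distinct_for_qualifiers(gps, wanted):
    """Correct total for a combination: needs the gene product sets themselves."""
    return sum(1 for _, quals in gps.values() if quals & wanted)


print(sum(counts_by_db(GPS).values()))         # 3, matches the real total
print(sum(counts_by_qualifier(GPS).values()))  # 4, gp2 is counted twice
print(distinct_for_qualifiers(GPS, {"unqualified", "contributes_to"}))  # 3
```

Summing the per-database counts gives the right answer because no gene
product is in two databases. Summing the per-qualifier counts overcounts gp2,
so the total for each combination of qualifiers would have to be stored
separately or recomputed from the underlying sets.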
>>>>>>> This may not be so bad for the qualifiers however, since the
>>>>>>> qualifiers are relatively rare.
>>>>>>>
>>>>>>> would you really want this for the NOT operator though?
>>>>>>> remember the semantics are easy to get confused here. Would
>>>>>>> you want:
>>>>>>>
>>>>>>> all gps that are NOT GPCRs (ie include those that are
>>>>>>> asserted not-TM receptor in the count) - negation propagates
>>>>>>> up the graph
>>>>>>>
>>>>>>> all gps that have a NOT annotation to some kind of GPCR -
>>>>>>> negation propagates as normal
>>>>>>
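The two readings of a NOT count can give different answers for the same
term. A minimal Python sketch, assuming a toy single-parent hierarchy
(TM receptor > GPCR > peptide GPCR) and made-up NOT annotations; the term
labels, gene product names and function names are hypothetical:

```python
# Toy is_a chain, child -> parent; labels are illustrative only.
PARENT = {"GPCR": "TM receptor", "peptide GPCR": "GPCR"}

# Hypothetical NOT annotations: (gene product, term)
NOT_ANNOTATIONS = [
    ("gp1", "TM receptor"),
    ("gp2", "GPCR"),
    ("gp3", "peptide GPCR"),
]


def ancestors_or_self(term):
    """Return the term followed by every term reachable via is_a."""
    lineage = [term]
    while term in PARENT:
        term = PARENT[term]
        lineage.append(term)
    return lineage


def asserted_not(term):
    """Reading 1: gene products that are NOT <term>, which includes a NOT
    made to any more general term (not a TM receptor implies not a GPCR)."""
    lineage = ancestors_or_self(term)
    return sorted({gp for gp, t in NOT_ANNOTATIONS if t in lineage})


def not_annotated_under(term):
    """Reading 2: gene products with a NOT annotation to the term or to any
    of its descendants, collected the same way as ordinary annotations."""
    return sorted({gp for gp, t in NOT_ANNOTATIONS
                   if term in ancestors_or_self(t)})


print(asserted_not("GPCR"))         # ['gp1', 'gp2']
print(not_annotated_under("GPCR"))  # ['gp2', 'gp3']
```

Only gp2 appears under both readings, so any displayed count would need to
say which of the two it is reporting.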
>>>>>>
>>>>>>
>>>>>> Any feedback would be gratefully received. Thanks!
>>>>>>
>>>>>> Amelia / the Web Presence Working Group.
>>>>>>
>>>>>> --
>>>>>> Amelia Ireland
>>>>>> GO Editorial Office,
>>>>>> European Bioinformatics Institute, UK.
>>>>>> Carbon neutral driving: http://www.targetneutral.com/TONIC/index.jsp
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> =====================================
>>> Jim Hu
>>> Associate Professor
>>> Dept. of Biochemistry and Biophysics
>>> 2128 TAMU
>>> Texas A&M Univ.
>>> College Station, TX 77843-2128
>>> 979-862-4054
>>>
>>>
>>
>