[go] AmiGO and annotation qualifiers

Chris Mungall cjm at fruitfly.org
Fri Feb 1 12:15:08 PST 2008


So I'm following up on Jim's comment on showing contradictions - I'm  
just dealing with the simple case where there is a single splice-form.

Harold's case is different, as here there is in fact no  
contradiction. I'll answer Harold's in a separate email.

On Feb 1, 2008, at 12:08 PM, Judith Blake wrote:

> hummm
> I think the case that Harold was describing, and that we currently have  
> in other annotations here at MGI, is that we have
>
> x A
> x NOT A
>
> both lines of evidence exist at this point.
>
> In some cases, different experiments give different results
>
> In the case that Harold discussed, the issue really was that we  
> don't properly distinguish isoforms, so from the gene level, the  
> two are combined whereas if you could represent each isoform, the  
> one would be x-1 A and one would be x-2 NOT A.
>
> judy
>
> Chris Mungall wrote:
>>
>> On Jan 31, 2008, at 2:04 PM, Jim Hu wrote:
>>
>>>
>>> On Jan 31, 2008, at 1:30 PM, Chris Mungall wrote:
>>>
>>>>
>>>> Unambiguous is good.
>>>>
>>>> A related question:
>>>>
>>>> Let's assume there is a NOT annotation for GO:0008047 enzyme  
>>>> activator activity (is_a parent of proteasome activator activity)
>>>>
>>>> If you are on the page for GO:0008538 proteasome activator  
>>>> activity, would you expect to see this annotation?
>>>
>>> I'd expect to see this iff there is an unqualified annotation to  
>>> the same product. In other words, this is only useful as a  
>>> conflict flag IMO.
>>
>> So let's summarise
>>
>> Given an is_a hierarchy:
>>
>> A
>>   B
>>     C
>>
>> And annotations of gene product x, y and z - here, numbered for  
>> reference purposes:
>>
>> [1] x NOT A
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [5] z A
>> [6] z C
>>
>> The same information shown in the DAG:
>>
>> A    : NOT-x, z
>>   B    : x, y
>>     C    : NOT-y, z
>>
>> We have a contradicting pair of annotations:
>>
>> [1]+[2]
>>
>> The other pairs are non-contradicting.
>>
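The contradiction rule above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a toy single-parent is_a chain (real GO terms can have several parents) and a made-up tuple layout for annotations; none of the names below come from AmiGO. A NOT annotation contradicts a positive annotation for the same gene product when the NOT term is the same as, or an is_a ancestor of, the positive term.

```python
# Toy is_a chain from the example: child -> parent (assumed single parent).
IS_A = {"B": "A", "C": "B"}

# (reference number, gene product, is_NOT, term), numbered as in the example.
ANNOTATIONS = [
    (1, "x", True, "A"),
    (2, "x", False, "B"),
    (3, "y", False, "B"),
    (4, "y", True, "C"),
    (5, "z", False, "A"),
    (6, "z", False, "C"),
]


def ancestors_or_self(term):
    """Return the term plus every term above it in the is_a chain."""
    result = [term]
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def contradictions(annotations):
    """Return (NOT id, positive id) pairs for the same gene product where
    the NOT term is at or above the positively annotated term."""
    pairs = []
    for not_id, gp_n, neg_n, term_n in annotations:
        if not neg_n:
            continue
        for pos_id, gp_p, neg_p, term_p in annotations:
            if neg_p or gp_p != gp_n:
                continue
            if term_n in ancestors_or_self(term_p):
                pairs.append((not_id, pos_id))
    return pairs


print(contradictions(ANNOTATIONS))  # [(1, 2)]
```

Note that [3]+[4] is not flagged: y NOT C sits below y B, so y can have B without having C.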
>> By your criteria the page for B should show any annotations of  
>> gene products that are purported to have property B, *plus* any  
>> annotations that contradict the annotations shown. i.e. the page  
>> would show:
>>
>> [1] x NOT A
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [6] z C
>>
>> (structured in some way that reflects the DAG and with the  
>> contradiction visually highlighted)
>>
>> Note that [5] is not shown because A is less specific than B, and  
>> it contradicts nothing on the page
>>
>> Note that the way AmiGO (and I imagine most annotation browsers)  
>> are set up now is such that the following are shown on the detail  
>> page for B:
>>
>> [2] x B
>> [3] y B
>> [4] y NOT C
>> [6] z C
>>
>> The fact that [2] has a contradiction with [1] becomes apparent  
>> only as focus is shifted up the DAG. We have a simple algorithm  
>> that goes down the DAG (or propagates genes up depending how you  
>> look at it), ignoring the NOT reversal rule.
>>
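To make the two options concrete, here is a rough Python sketch of both views for a focus term. It assumes the same toy single-parent hierarchy and the six numbered annotations from the example; the function names are hypothetical and this is not the actual AmiGO code. `current_view` collects everything at or below the focus term regardless of NOT, and `conflict_view` adds any NOT annotation from higher up that contradicts a positive annotation already shown.

```python
# Toy is_a chain: child -> parent (assumed single parent for brevity).
IS_A = {"B": "A", "C": "B"}

# reference number -> (gene product, is_NOT, term)
ANNOTATIONS = {
    1: ("x", True, "A"),
    2: ("x", False, "B"),
    3: ("y", False, "B"),
    4: ("y", True, "C"),
    5: ("z", False, "A"),
    6: ("z", False, "C"),
}


def is_at_or_below(term, focus):
    """True if term is the focus term or an is_a descendant of it."""
    while True:
        if term == focus:
            return True
        if term not in IS_A:
            return False
        term = IS_A[term]


def current_view(focus):
    """Every annotation at or below the focus term, ignoring NOT reversal."""
    return sorted(
        ref for ref, (_, _, term) in ANNOTATIONS.items()
        if is_at_or_below(term, focus)
    )


def conflict_view(focus):
    """The current view plus NOT annotations that contradict a shown positive."""
    base = current_view(focus)
    extra = set()
    for not_ref, (gp_n, neg_n, term_n) in ANNOTATIONS.items():
        if not neg_n:
            continue
        for pos_ref in base:
            gp_p, neg_p, term_p = ANNOTATIONS[pos_ref]
            # A NOT at or above a positive term for the same product conflicts.
            if not neg_p and gp_p == gp_n and is_at_or_below(term_p, term_n):
                extra.add(not_ref)
    return sorted(set(base) | extra)


print(current_view("B"))   # [2, 3, 4, 6]
print(conflict_view("B"))  # [1, 2, 3, 4, 6]
```

The only difference on the page for B is [1], which is pulled in because it conflicts with [2].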
>> Which do people prefer?
>>
>> Either way AmiGO will most likely keep the current algorithm for  
>> the next release.
>>
>>>>
>>>>
>>>> I think most people's answer would be no, this would be too  
>>>> confusing.
>>>>
>>>> However, it is the case that anything that is not an enzyme  
>>>> activator is by definition not a proteasome activator.
>>>>
>>>>
>>>>
>>>> On Jan 31, 2008, at 11:21 AM, Michael Ashburner wrote:
>>>>
>>>>> I prefer demo3 - it is clear and unambiguous.
>>>>>
>>>>> Michael
>>>>>
>>>>> On 29 Jan 2008, at 14:00, Amelia Ireland wrote:
>>>>>
>>>>>> Hi GO Consortium,
>>>>>>
>>>>>> There has been some discussion in the web presence working  
>>>>>> group about the display of NOT annotations and annotations  
>>>>>> with qualifiers. There are two issues; how to display the  
>>>>>> annotations with qualifiers, and how gene product counts for  
>>>>>> terms with qualifiers should be displayed. We'd like to get  
>>>>>> some feedback from the consortium on these issues before  
>>>>>> releasing the new AmiGO.
>>>>>>
>>>>>> Annotations With Qualifiers
>>>>>> ===========================
>>>>>>
>>>>>> I've made some mock ups of different possible arrangements here:
>>>>>>
>>>>>> http://www.ebi.ac.uk/~aji/demo.html
>>>>>> http://www.ebi.ac.uk/~aji/demo2.html
>>>>>> http://www.ebi.ac.uk/~aji/demo3.html
>>>>>>
>>>>>> These pages are viewing the GPs annotated to a term (and its  
>>>>>> children), but a similar arrangement will be used to view the  
>>>>>> terms with which a GP is associated.
>>>>>>
>>>>>> Which arrangement is the clearest? Things to consider:
>>>>>>
>>>>>> - how easy is it to see inconsistent annotations, e.g. the  
>>>>>> same GP has a normal and a NOT annotation?
>>>>>>
>>>>>> - is it clear what the qualifier applies to?
>>>>>>
>>>>>> - some views (e.g. having the associations with  
>>>>>> operators/qualifiers separated out) require extra calculations  
>>>>>> to generate and will hence be slower
>>>>>>
>>>>>> - should the default view contain NOT annotations? Is a GO  
>>>>>> newbie going to understand them?
>>>>>>
>>>>>> - is the 'nn gene products' link unambiguous in meaning? Is it  
>>>>>> helpful to know how many GPs there are annotated to a term?
>>>>>>
>>>>>>
>>>>>> Gene Product Counts
>>>>>> ===================
>>>>>>
>>>>>> The WPWG discussed generating separate totals for numbers of  
>>>>>> GPs annotated to a term without any qualifier and the no. of  
>>>>>> GPs annotated to a term with each of the other qualifiers  
>>>>>> (currently the GP count for a term *includes* any GPs  
>>>>>> annotated with 'contributes_to' and 'colocalizes_with', but  
>>>>>> *excludes* NOT annotations). How useful would this be, and is  
>>>>>> it intuitive to split the counts up thus, or is it confusing  
>>>>>> (especially for GO newbies)? There is also this to bear in mind:
>>>>>>
>>>>>> On 24 Jan 2008, at 20:11, Chris Mungall wrote:
>>>>>>> On Jan 24, 2008, at 11:08 AM, Seth Carbon wrote:
>>>>>>>> *) ask chris about splitting the gp count by qualifier (seth)
>>>>>>>
>>>>>>> you mean by contributes_to, the NOT operator etc?
>>>>>>>
>>>>>>> unfortunately it is expensive to pre-compute the gp counts in  
>>>>>>> any way that a single gp can end up in two partitions.
>>>>>>>
>>>>>>> we currently partition by gp db name - no gp can be in > 1.  
>>>>>>> this has the advantage that we can get the correct total by  
>>>>>>> summing the numbers at query time.
>>>>>>>
>>>>>>> however, if we have a partition that does not split gps in  
>>>>>>> this way we can't sum across the partitions, which means we  
>>>>>>> have to pre-compute for all combinations.
>>>>>>>
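A small Python illustration of the summing problem, using invented gene products and qualifier sets for a single term (the data and variable names are hypothetical). Partitioning by source database puts each gp in exactly one bucket, so the bucket counts add up to the distinct total. Partitioning by qualifier can put one gp in several buckets, so the sum overcounts.

```python
from collections import Counter

# Hypothetical gene products annotated to one term:
# (gp id, source db, qualifiers used across its annotations)
GPS = [
    ("gp1", "MGI", {"none"}),
    ("gp2", "MGI", {"none", "contributes_to"}),
    ("gp3", "FB", {"contributes_to"}),
    ("gp4", "FB", {"none", "colocalizes_with"}),
]

# Disjoint partition: each gp belongs to exactly one db.
by_db = Counter(db for _, db, _ in GPS)

# Overlapping partition: a gp appears once per qualifier it carries.
by_qualifier = Counter(q for _, _, quals in GPS for q in quals)

distinct_total = len({gp for gp, _, _ in GPS})

print(sum(by_db.values()), distinct_total)         # 4 4
print(sum(by_qualifier.values()), distinct_total)  # 6 4
```

Getting the right total for a combination of qualifier buckets needs the union of the underlying gp sets, which is why the counts would have to be pre-computed per combination rather than summed at query time.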
>>>>>>> This may not be so bad for the qualifiers however, since the  
>>>>>>> qualifiers are relatively rare.
>>>>>>>
>>>>>>> would you really want this for the NOT operator though?  
>>>>>>> remember the semantics are easy to get confused here. Would  
>>>>>>> you want:
>>>>>>>
>>>>>>> all gps that are NOT GPCRs (ie include those that are  
>>>>>>> asserted not-TM receptor in the count) - negation propagates  
>>>>>>> up the graph
>>>>>>>
>>>>>>> all gps that have a NOT annotation to some kind of GPCR -  
>>>>>>> negation propagates as normal
>>>>>>
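The two readings of a NOT count give different answers, which a short Python sketch can show. The term names, gene products, and single-parent hierarchy below are made up for illustration. The first function collects gps with a NOT annotation at or above the term (everything asserted not to be a GPCR, including those asserted not to be TM receptors); the second collects gps with a NOT annotation at or below the term (a NOT to some kind of GPCR).

```python
# Hypothetical is_a chain: child -> parent.
IS_A = {"GPCR": "TM_receptor", "GPCR_subtype": "GPCR"}

# Hypothetical NOT annotations: (gene product, term).
NOT_ANNOTATIONS = [
    ("gp_a", "TM_receptor"),
    ("gp_b", "GPCR"),
    ("gp_c", "GPCR_subtype"),
]


def at_or_above(term):
    """Return the term together with all of its is_a ancestors."""
    seen = {term}
    while term in IS_A:
        term = IS_A[term]
        seen.add(term)
    return seen


def asserted_not(term):
    """Reading 1: gps with a NOT annotation at or above the term."""
    up = at_or_above(term)
    return sorted({gp for gp, t in NOT_ANNOTATIONS if t in up})


def has_not_at_or_below(term):
    """Reading 2: gps with a NOT annotation to the term or a descendant."""
    return sorted({gp for gp, t in NOT_ANNOTATIONS if term in at_or_above(t)})


print(asserted_not("GPCR"))         # ['gp_a', 'gp_b']
print(has_not_at_or_below("GPCR"))  # ['gp_b', 'gp_c']
```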
>>>>>>
>>>>>>
>>>>>> Any feedback would be gratefully received. Thanks!
>>>>>>
>>>>>> Amelia / the Web Presence Working Group.
>>>>>>
>>>>>> -- 
>>>>>> Amelia Ireland
>>>>>> GO Editorial Office,
>>>>>> European Bioinformatics Institute, UK.
>>>>>> Carbon neutral driving:
>>>>>> http://www.targetneutral.com/TONIC/index.jsp
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> =====================================
>>> Jim Hu
>>> Associate Professor
>>> Dept. of Biochemistry and Biophysics
>>> 2128 TAMU
>>> Texas A&M Univ.
>>> College Station, TX 77843-2128
>>> 979-862-4054
>>>
>>>
>>
>



