[go] AmiGO and annotation qualifiers

Chris Mungall cjm at fruitfly.org
Fri Feb 1 16:14:49 PST 2008



On Feb 1, 2008, at 6:20 AM, Harold Drabkin wrote:

> It is important to note that NOT is a very useful and potentially  
> important piece of information. I just finished annotations of the  
> gene products of two unrelated genes. Both of them made two  
> isoforms by alternate splicing, resulting in a long form and a  
> short form. The long form has an activity; the short form does not;  
> both assemble with another protein equally as well; by adjusting  
> the amount of the two isoforms, the activity of the resulting  
> complex is regulated. Because we annotate to gene object, these  
> each get two annotations, one for the long, one for the short. The  
> short has the NOT qualifier. At the moment, the isoform ID is not  
> displayed (but stored in special notes in our EI), nor is it in the  
> GAF. However, since both annotations map to the same reference, a  
> user should check it out.  One often sees "NOT" when one isoform is  
> nuclear, and one is not; or if a protein modification, such as  
> phosphorylation, alters the activity of the protein. Again, we  
> currently have no means of putting that in the GAF; however, I  
> expect a user to be able to read the references associated with the  
> annotation.  Additionally,  literature in question can often have  
> apparently conflicting data because of differences in cell type,  
> etc. The GAF is a summary of the totality of information collected.  
> In a separate discussion thread, to be taken up at SLC, we will  
> start to address how we can make the extra data readily available.  
> MODS that annotate protein ids directly do not have this dilemma.

So I think the above makes two related arguments for retaining NOTs.  
One is based solely on the limitations of GAFs w.r.t. alternate  
spliceforms. I think this one is a little wrong-headed. We shouldn't  
curate additional content solely to make up for inadequate file formats.

However, this may not be what you're saying. Regardless of GAFs,  
there is some interesting biology here that should be captured. The  
different isoforms have different activity - one appears to lack an  
activity the other possesses. We can't state this in a logically  
coherent way without introducing negation.

I think this example should go in a NOT FAQ somewhere.

I'm sure there must be other examples, perhaps pertaining to  
evolutionary loss of function, that will become apparent as we take  
less organism-centric views of annotation.

> In any event, it is fairly easy to parse a GAF and remove the  
> qualifiers if one wants.
>
> hjd
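
For what it's worth, the filtering mentioned above can be sketched in a
few lines of Python. This assumes the usual tab-delimited GAF layout, in
which the qualifier is the fourth column and may hold several
pipe-separated values, and it reads "remove the qualifiers" as dropping
NOT-qualified lines outright rather than blanking the column (blanking a
NOT would silently turn it into a positive annotation). The function
name and the sample rows are made up for illustration.

```python
import io

QUALIFIER_COL = 3  # 0-based index; the Qualifier is column 4 of a GAF line


def drop_not_annotations(lines):
    """Yield GAF lines, skipping annotations whose qualifier includes NOT."""
    for line in lines:
        if line.startswith("!"):
            yield line  # header/comment lines pass through untouched
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= QUALIFIER_COL:
            yield line  # too short to carry a qualifier; leave as-is
            continue
        if "NOT" in cols[QUALIFIER_COL].split("|"):
            continue  # negated annotation: drop the whole line
        yield line


# Invented sample rows (15 columns each); the IDs are placeholders.
TAIL = "\tREF:1\tIDA\t\tF\t\t\tgene\ttaxon:10090\t20080201\tDB\n"
SAMPLE = (
    "!gaf-version: 1.0\n"
    "DB\tID:1\tgeneA\t\tGO:0008047" + TAIL
    + "DB\tID:1\tgeneA\tNOT\tGO:0008047" + TAIL
    + "DB\tID:2\tgeneB\tcontributes_to\tGO:0008538" + TAIL
)

if __name__ == "__main__":
    for kept in drop_not_annotations(io.StringIO(SAMPLE)):
        print(kept, end="")
```

Other qualifiers such as contributes_to are kept here; whether they
should be is a separate decision.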
>
> Jim Hu wrote:
>>
>> On Jan 31, 2008, at 1:30 PM, Chris Mungall wrote:
>>
>>>
>>> Unambiguous is good.
>>>
>>> A related question:
>>>
>>> Let's assume there is a NOT annotation for GO:0008047 enzyme  
>>> activator activity (is_a parent of proteasome activator activity)
>>>
>>> If you are on the page for GO:0008538 proteasome activator  
>>> activity, would you expect to see this annotation?
>>
>> I'd expect to see this iff there is an unqualified annotation to  
>> the same product. In other words, this is only useful as a  
>> conflict flag IMO.
>>
>>>
>>>
>>> I think most people's answer would be no, this would be too  
>>> confusing.
>>>
>>> However, it is the case that anything that is not an enzyme  
>>> activator is by definition not a proteasome activator activity.
>>>
>>>
>>>
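
The logical point in the message above (a NOT on a parent term entails a
NOT on every is_a child, the reverse of how ordinary annotations roll up
to parents) can be made concrete with a small sketch. Only the two GO
terms named in the message are real; the EX:0000001 parent is a
placeholder, and the function names are hypothetical.

```python
# child term -> list of is_a parents
IS_A = {
    "GO:0008538": ["GO:0008047"],  # proteasome activator is_a enzyme activator
    "GO:0008047": ["EX:0000001"],  # placeholder parent, not a real term ID
}


def ancestors(term, is_a=IS_A):
    """All terms reachable from `term` by following is_a links upward."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def descendants(term, is_a=IS_A):
    """All terms that have `term` among their ancestors."""
    return {child for child in is_a if term in ancestors(child, is_a)}


def implied_terms(term, negated):
    """Other terms that an annotation to `term` logically extends to.

    A positive annotation holds for every ancestor; a NOT annotation
    holds for every descendant.
    """
    return descendants(term) if negated else ancestors(term)


if __name__ == "__main__":
    print(implied_terms("GO:0008047", negated=True))
    print(implied_terms("GO:0008538", negated=False))
```

Whether a display should actually show the entailed NOT on the child
term's page is the usability question being asked; the sketch only shows
what follows logically.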
>>> On Jan 31, 2008, at 11:21 AM, Michael Ashburner wrote:
>>>
>>>> I prefer demo3 - it is clear and unambiguous.
>>>>
>>>> Michael
>>>>
>>>> On 29 Jan 2008, at 14:00, Amelia Ireland wrote:
>>>>
>>>>> Hi GO Consortium,
>>>>>
>>>>> There has been some discussion in the web presence working  
>>>>> group about the display of NOT annotations and annotations with  
>>>>> qualifiers. There are two issues; how to display the  
>>>>> annotations with qualifiers, and how gene product counts for  
>>>>> terms with qualifiers should be displayed. We'd like to get  
>>>>> some feedback from the consortium on these issues before  
>>>>> releasing the new AmiGO.
>>>>>
>>>>> Annotations With Qualifiers
>>>>> ===========================
>>>>>
>>>>> I've made some mock ups of different possible arrangements here:
>>>>>
>>>>> http://www.ebi.ac.uk/~aji/demo.html
>>>>> http://www.ebi.ac.uk/~aji/demo2.html
>>>>> http://www.ebi.ac.uk/~aji/demo3.html
>>>>>
>>>>> These pages are viewing the GPs annotated to a term (and its  
>>>>> children), but a similar arrangement will be used to view the  
>>>>> terms with which a GP is associated.
>>>>>
>>>>> Which arrangement is the clearest? Things to consider:
>>>>>
>>>>> - how easy is it to see inconsistent annotations, e.g. the same  
>>>>> GP has a normal and a NOT annotation?
>>>>>
>>>>> - is it clear what the qualifier applies to?
>>>>>
>>>>> - some views (e.g. having the associations with operators/ 
>>>>> qualifiers separated out) require extra calculations to  
>>>>> generate and will hence be slower
>>>>>
>>>>> - should the default view contain NOT annotations? Is a GO  
>>>>> newbie going to understand them?
>>>>>
>>>>> - is the 'nn gene products' link unambiguous in meaning? Is it  
>>>>> helpful to know how many GPs there are annotated to a term?
>>>>>
>>>>>
>>>>> Gene Product Counts
>>>>> ===================
>>>>>
>>>>> The WPWG discussed generating separate totals for numbers of  
>>>>> GPs annotated to a term without any qualifier and the no. of  
>>>>> GPs annotated to a term with each of the other qualifiers  
>>>>> (currently the GP count for a term *includes* any GPs annotated  
>>>>> with 'contributes_to' and 'colocalizes_with', but *excludes*  
>>>>> NOT annotations). How useful would this be, and is it intuitive  
>>>>> to split the counts up thus, or is it confusing (especially for  
>>>>> GO newbies)? There is also this to bear in mind:
>>>>>
>>>>> On 24 Jan 2008, at 20:11, Chris Mungall wrote:
>>>>>> On Jan 24, 2008, at 11:08 AM, Seth Carbon wrote:
>>>>>>> *) ask chris about splitting the gp count by qualifier (seth)
>>>>>>
>>>>>> you mean by contributes_to, the NOT operator etc?
>>>>>>
>>>>>> unfortunately it is expensive to pre-compute the gp counts in  
>>>>>> any way that a single gp can end up in two partitions.
>>>>>>
>>>>>> we currently partition by gp db name - no gp can be in > 1.  
>>>>>> this has the advantage that we can get the correct total by  
>>>>>> summing the numbers at query time.
>>>>>>
>>>>>> however, if we have a partition that does not split gps in  
>>>>>> this way we can't sum across the partitions, which means we  
>>>>>> have to pre-compute for all combinations.
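
A toy illustration of the summing problem described above, under
invented data: each row is one annotation to a single term, given as
(gene product, source database, qualifier). Partitioning by database
never puts a gp in two groups, so the per-group counts add up to the
distinct total. Partitioning by qualifier can put the same gp in two
groups, so the sum overcounts. All names here are hypothetical.

```python
from collections import defaultdict

# (gene product, source database, qualifier); invented for illustration
ANNOTATIONS = [
    ("gp1", "DB_A", ""),
    ("gp1", "DB_A", "contributes_to"),
    ("gp2", "DB_A", ""),
    ("gp3", "DB_B", "contributes_to"),
]


def counts_by(key_index, annotations=ANNOTATIONS):
    """Distinct gene product count per value of the chosen column."""
    groups = defaultdict(set)
    for row in annotations:
        groups[row[key_index]].add(row[0])
    return {key: len(gps) for key, gps in groups.items()}


distinct_total = len({row[0] for row in ANNOTATIONS})
by_database = counts_by(1)   # disjoint: each gp belongs to one database
by_qualifier = counts_by(2)  # overlapping: gp1 appears in two groups

if __name__ == "__main__":
    print("distinct gps:", distinct_total)
    print("sum over databases:", sum(by_database.values()))
    print("sum over qualifiers:", sum(by_qualifier.values()))
```

With the rows above the database sum matches the distinct total while
the qualifier sum is one too high, which is why overlapping partitions
need their combinations pre-computed rather than added at query time.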
>>>>>>
>>>>>> This may not be so bad for the qualifiers however, since the  
>>>>>> qualifiers are relatively rare.
>>>>>>
>>>>>> would you really want this for the NOT operator though?  
>>>>>> remember the semantics are easy to get confused here. Would  
>>>>>> you want:
>>>>>>
>>>>>> all gps that are NOT GPCRs (ie include those that are asserted  
>>>>>> not-TM receptor in the count) - negation propagates up the graph
>>>>>>
>>>>>> all gps that have a NOT annotation to some kind of GPCR -  
>>>>>> negation propagates as normal
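
The two readings above give different counts for the same data. A
minimal sketch, assuming a three-term toy hierarchy (the labels are
stand-ins, not GO IDs, and GPCR_subtype is invented) and three invented
NOT annotations: the first reading collects NOT annotations made to the
term or any of its ancestors, the second collects NOT annotations made
to the term or any of its descendants.

```python
# child term -> list of is_a parents (toy labels, not real GO IDs)
IS_A = {
    "GPCR": ["TM_receptor"],
    "GPCR_subtype": ["GPCR"],  # invented subtype for illustration
}

# (gene product, term carrying the NOT annotation); invented data
NOT_ANNOTATIONS = [
    ("gpX", "TM_receptor"),
    ("gpY", "GPCR"),
    ("gpZ", "GPCR_subtype"),
]


def ancestors(term, is_a=IS_A):
    """All terms reachable from `term` by following is_a links upward."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def gps_asserted_not(term):
    """Reading 1: gps known not to be `term` (NOT on term or an ancestor)."""
    scope = ancestors(term) | {term}
    return {gp for gp, t in NOT_ANNOTATIONS if t in scope}


def gps_with_not_under(term):
    """Reading 2: gps with a NOT on `term` or on some kind of `term`."""
    scope = {child for child in IS_A if term in ancestors(child)} | {term}
    return {gp for gp, t in NOT_ANNOTATIONS if t in scope}


if __name__ == "__main__":
    print(sorted(gps_asserted_not("GPCR")))
    print(sorted(gps_with_not_under("GPCR")))
```

Only gpY is in both sets, so a single "NOT count" on the GPCR page would
have to say which of the two it means.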
>>>>>
>>>>>
>>>>>
>>>>> Any feedback would be gratefully received. Thanks!
>>>>>
>>>>> Amelia / the Web Presence Working Group.
>>>>>
>>>>> --
>>>>> Amelia Ireland
>>>>> GO Editorial Office,
>>>>> European Bioinformatics Institute, UK.
>>>>> Carbon neutral driving: http://www.targetneutral.com/TONIC/index.jsp
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>> =====================================
>>
>> Jim Hu
>>
>> Associate Professor
>>
>> Dept. of Biochemistry and Biophysics
>>
>> 2128 TAMU
>>
>> Texas A&M Univ.
>>
>> College Station, TX 77843-2128
>>
>> 979-862-4054
>>
>>
>>
>
>



