[go] AmiGO and annotation qualifiers

Jim Hu jimhu at tamu.edu
Fri Feb 1 07:40:34 PST 2008


I agree.  We are faced with a large number of Multifun to GO mappings  
that will have to be marked as NOT, based on the very different  
philosophies of the two functional annotation systems.  For E. coli,  
where we have multiple web-accessible data resources showing these  
annotations, we think it's important to assert NOT publicly.

Jim

On Feb 1, 2008, at 6:20 AM, Harold Drabkin wrote:

> It is important to note that NOT is a very useful and potentially  
> important piece of information. I just finished annotations of the  
> gene products of two unrelated genes. Both of them made two isoforms  
> by alternate splicing, resulting in a long form and a short form.  
> The long form has an activity; the short form does not; both  
> assemble with another protein equally well; by adjusting the  
> amount of the two isoforms, the activity of the resulting complex is  
> regulated. Because we annotate to the gene object, these each get two  
> annotations, one for the long, one for the short. The short has the  
> NOT qualifier. At the moment, the isoform ID is not displayed (but  
> stored in special notes in our EI), nor is it in the GAF. However,  
> since both annotations map to the same reference, a user should  
> check it out.  One often sees "NOT" when one isoform is nuclear, and  
> one is not; or if a protein modification, such as phosphorylation,  
> alters the activity of the protein. Again, we currently have no  
> means of putting that in the GAF; however, I expect a user to be  
> able to read the references associated with the annotation.   
> Additionally,  literature in question can often have apparently  
> conflicting data because of differences in cell type, etc. The GAF  
> is a summary of the totality of information collected.
>
> In a separate discussion thread, to be taken up at SLC, we will  
> start to address how we can make the extra data readily available.  
> MODS that annotate protein ids directly do not have this dilemma.
>
> In any event, it is fairly easy to parse a GAF and remove the  
> qualifiers if one wants.
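
As a rough illustration of the filtering Harold mentions, here is a minimal Python sketch. It assumes the standard tab-separated GAF layout, where the fourth column holds the qualifier (NOT, contributes_to, colocalizes_with, possibly joined with '|') and comment lines start with '!'. It also reads "remove the qualifiers" as "drop the rows carrying NOT", which is an interpretation rather than something stated in the thread. The function name and the sample rows are hypothetical.

```python
def drop_not_annotations(lines):
    """Return GAF lines with every NOT-qualified annotation row removed.

    Comment lines (starting with '!') and blank lines pass through untouched.
    """
    kept = []
    for line in lines:
        if line.startswith("!") or not line.strip():
            kept.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 4 (index 3) is the Qualifier field; it may be empty or
        # hold several values separated by '|'.
        qualifiers = set(cols[3].split("|")) if len(cols) > 3 else set()
        if "NOT" in qualifiers:
            continue
        kept.append(line)
    return kept


# Hypothetical rows, truncated to the first five GAF columns for brevity.
SAMPLE = [
    "!gaf-version: 1.0",
    "MGI\tMGI:0000001\tGeneA\t\tGO:0008047",
    "MGI\tMGI:0000001\tGeneA\tNOT\tGO:0008047",
    "MGI\tMGI:0000002\tGeneB\tcontributes_to\tGO:0008047",
]

if __name__ == "__main__":
    for row in drop_not_annotations(SAMPLE):
        print(row)
```

The same loop could instead blank out column 4 if the aim were to strip qualifiers while keeping the rows, but for NOT that would turn a negative assertion into a positive one.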
>
> hjd
>
> Jim Hu wrote:
>>
>> On Jan 31, 2008, at 1:30 PM, Chris Mungall wrote:
>>
>>>
>>> Unambiguous is good.
>>>
>>> A related question:
>>>
>>> Let's assume there is a NOT annotation for GO:0008047 enzyme  
>>> activator activity (is_a parent of proteasome activator activity)
>>>
>>> If you are on the page for GO:0008538 proteasome activator  
>>> activity, would you expect to see this annotation?
>>
>> I'd expect to see this iff there is an unqualified annotation to  
>> the same product. In other words, this is only useful as a conflict  
>> flag IMO.
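
A minimal Python sketch of the "conflict flag" rule described here: on a term's page, show a NOT annotation made to that term or one of its is_a ancestors only when the same gene product also has an unqualified annotation to the page's term. The two GO IDs come from Chris's example; the data layout, function names and gene product names are hypothetical, and for simplicity the sketch ignores positive annotations to descendants of the page term.

```python
# child -> set of is_a parents; only the single edge mentioned in the thread.
IS_A = {"GO:0008538": {"GO:0008047"}}


def ancestors_or_self(term, is_a):
    """Return the term plus everything reachable by following is_a upward."""
    seen, stack = {term}, [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def conflict_flags(annotations, page_term, is_a):
    """List (gene_product, term) NOT annotations to show on page_term's page.

    annotations is an iterable of (gene_product, term, is_not) tuples.
    """
    annotations = list(annotations)
    scope = ancestors_or_self(page_term, is_a)
    positives = {gp for gp, term, is_not in annotations
                 if not is_not and term == page_term}
    return [(gp, term) for gp, term, is_not in annotations
            if is_not and term in scope and gp in positives]


# Hypothetical annotations: GP1 conflicts, GP2 has only the NOT annotation.
ANNOTATIONS = [
    ("GP1", "GO:0008047", True),
    ("GP1", "GO:0008538", False),
    ("GP2", "GO:0008047", True),
]

if __name__ == "__main__":
    print(conflict_flags(ANNOTATIONS, "GO:0008538", IS_A))
```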
>>
>>>
>>>
>>> I think most people's answer would be no, this would be too  
>>> confusing.
>>>
>>> However, it is the case that anything that is not an enzyme  
>>> activator is by definition not a proteasome activator.
>>>
>>>
>>>
>>> On Jan 31, 2008, at 11:21 AM, Michael Ashburner wrote:
>>>
>>>> I prefer demo3 - it is clear and unambiguous.
>>>>
>>>> Michael
>>>>
>>>> On 29 Jan 2008, at 14:00, Amelia Ireland wrote:
>>>>
>>>>> Hi GO Consortium,
>>>>>
>>>>> There has been some discussion in the web presence working group  
>>>>> about the display of NOT annotations and annotations with  
>>>>> qualifiers. There are two issues: how to display the annotations  
>>>>> with qualifiers, and how gene product counts for terms with  
>>>>> qualifiers should be displayed. We'd like to get some feedback  
>>>>> from the consortium on these issues before releasing the new  
>>>>> AmiGO.
>>>>>
>>>>> Annotations With Qualifiers
>>>>> ===========================
>>>>>
>>>>> I've made some mock ups of different possible arrangements here:
>>>>>
>>>>> http://www.ebi.ac.uk/~aji/demo.html
>>>>> http://www.ebi.ac.uk/~aji/demo2.html
>>>>> http://www.ebi.ac.uk/~aji/demo3.html
>>>>>
>>>>> These pages are viewing the GPs annotated to a term (and its  
>>>>> children), but a similar arrangement will be used to view the  
>>>>> terms with which a GP is associated.
>>>>>
>>>>> Which arrangement is the clearest? Things to consider:
>>>>>
>>>>> - how easy is it to see inconsistent annotations, e.g. the same  
>>>>> GP has a normal and a NOT annotation?
>>>>>
>>>>> - is it clear what the qualifier applies to?
>>>>>
>>>>> - some views (e.g. having the associations with operators/ 
>>>>> qualifiers separated out) require extra calculations to generate  
>>>>> and will hence be slower
>>>>>
>>>>> - should the default view contain NOT annotations? Is a GO  
>>>>> newbie going to understand them?
>>>>>
>>>>> - is the 'nn gene products' link unambiguous in meaning? Is it  
>>>>> helpful to know how many GPs there are annotated to a term?
>>>>>
>>>>>
>>>>> Gene Product Counts
>>>>> ===================
>>>>>
>>>>> The WPWG discussed generating separate totals for numbers of GPs  
>>>>> annotated to a term without any qualifier and the no. of GPs  
>>>>> annotated to a term with each of the other qualifiers (currently  
>>>>> the GP count for a term *includes* any GPs annotated with  
>>>>> 'contributes_to' and 'colocalizes_with', but *excludes* NOT  
>>>>> annotations). How useful would this be, and is it intuitive to  
>>>>> split the counts up thus, or is it confusing (especially for GO  
>>>>> newbies)? There is also this to bear in mind:
>>>>>
>>>>> On 24 Jan 2008, at 20:11, Chris Mungall wrote:
>>>>>> On Jan 24, 2008, at 11:08 AM, Seth Carbon wrote:
>>>>>>> *) ask chris about splitting the gp count by qualifier (seth)
>>>>>>
>>>>>> you mean by contributes_to, the NOT operator etc?
>>>>>>
>>>>>> unfortunately it is expensive to pre-compute the gp counts in  
>>>>>> any way that a single gp can end up in two partitions.
>>>>>>
>>>>>> we currently partition by gp db name - no gp can be in > 1.  
>>>>>> this has the advantage that we can get the correct total by  
>>>>>> summing the numbers at query time.
>>>>>>
>>>>>> however, if we have a partition that does not split gps in this  
>>>>>> way we can't sum across the partitions, which means we have to  
>>>>>> pre-compute for all combinations.
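
The summing problem can be seen in a small Python sketch. It counts distinct gene products per partition, once partitioned by source database (each gp falls in exactly one partition) and once by qualifier (a gp can fall in several). The tuples, database names and the helper name are hypothetical and are not taken from the AmiGO schema.

```python
def counts_by(annotations, key):
    """Count distinct gene products in each partition chosen by key(row)."""
    groups = {}
    for row in annotations:
        groups.setdefault(key(row), set()).add(row[0])
    return {name: len(gps) for name, gps in groups.items()}


# Hypothetical (gene_product, source_db, qualifier) rows for one term.
ANNOTATIONS = [
    ("gp1", "MGI", ""),
    ("gp1", "MGI", "NOT"),
    ("gp2", "MGI", "contributes_to"),
    ("gp3", "FB", ""),
]

TRUE_TOTAL = len({row[0] for row in ANNOTATIONS})

# Disjoint partition: every gp belongs to exactly one database.
BY_DB = counts_by(ANNOTATIONS, lambda row: row[1])

# Overlapping partition: gp1 appears both unqualified and under NOT.
BY_QUALIFIER = counts_by(ANNOTATIONS, lambda row: row[2] or "unqualified")

if __name__ == "__main__":
    print("distinct gps:", TRUE_TOTAL)
    print("sum over databases:", sum(BY_DB.values()))
    print("sum over qualifiers:", sum(BY_QUALIFIER.values()))
```

With these rows the per-database counts add up to the number of distinct gene products, while the per-qualifier counts add up to one more, because gp1 is counted twice. That overcount is why totals over an overlapping partition cannot be obtained by summing at query time.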
>>>>>>
>>>>>> This may not be so bad for the qualifiers however, since the  
>>>>>> qualifiers are relatively rare.
>>>>>>
>>>>>> would you really want this for the NOT operator though?  
>>>>>> remember the semantics are easy to get confused here. Would you  
>>>>>> want:
>>>>>>
>>>>>> all gps that are NOT GPCRs (ie include those that are asserted  
>>>>>> not-TM receptor in the count) - negation propagates up the graph
>>>>>>
>>>>>> all gps that have a NOT annotation to some kind of GPCR -  
>>>>>> negation propagates as normal
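
The two readings give different counts, as this minimal Python sketch shows. The three-term is_a chain, the term labels and the gene product names are hypothetical placeholders rather than real GO IDs. Reading 1 counts gene products asserted not to be a GPCR, including those with NOT on an ancestor such as the TM receptor term; reading 2 counts gene products with a NOT annotation to the GPCR term or any of its descendants.

```python
# Hypothetical is_a chain: child -> set of parents.
IS_A = {"GPCR": {"TM_receptor"}, "GPCR_subtype": {"GPCR"}}


def reachable(start, edges):
    """Return start plus every node reachable by following edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invert(edges):
    """Turn a child -> parents map into a parent -> children map."""
    out = {}
    for child, parents in edges.items():
        for parent in parents:
            out.setdefault(parent, set()).add(child)
    return out


def count_asserted_not_member(not_annotations, term, is_a):
    """Reading 1: gps with NOT on the term itself or on any ancestor."""
    scope = reachable(term, is_a)
    return len({gp for gp, t in not_annotations if t in scope})


def count_not_to_some_kind(not_annotations, term, is_a):
    """Reading 2: gps with NOT on the term itself or on any descendant."""
    scope = reachable(term, invert(is_a))
    return len({gp for gp, t in not_annotations if t in scope})


# Hypothetical NOT annotations as (gene_product, term) pairs.
NOT_ANNOTATIONS = [
    ("gp1", "TM_receptor"),
    ("gp2", "GPCR"),
    ("gp3", "GPCR_subtype"),
    ("gp4", "TM_receptor"),
]

if __name__ == "__main__":
    print(count_asserted_not_member(NOT_ANNOTATIONS, "GPCR", IS_A))
    print(count_not_to_some_kind(NOT_ANNOTATIONS, "GPCR", IS_A))
```

Under reading 1, gp1, gp2 and gp4 are counted for the GPCR term; under reading 2, gp2 and gp3 are. Only gp2 appears in both, so a single "NOT count" per term needs to state which reading it uses.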
>>>>>
>>>>>
>>>>>
>>>>> Any feedback would be gratefully received. Thanks!
>>>>>
>>>>> Amelia / the Web Presence Working Group.
>>>>>
>>>>> --
>>>>> Amelia Ireland
>>>>> GO Editorial Office,
>>>>> European Bioinformatics Institute, UK.
>>>>> Carbon neutral driving: http://www.targetneutral.com/TONIC/index.jsp
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>> =====================================
>>
>> Jim Hu
>>
>> Associate Professor
>>
>> Dept. of Biochemistry and Biophysics
>>
>> 2128 TAMU
>>
>> Texas A&M Univ.
>>
>> College Station, TX 77843-2128
>>
>> 979-862-4054
>>
>>
>>
>

=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

