[go] Protein domain GO annotation

Michael Ashburner ma11 at gen.cam.ac.uk
Fri Nov 2 05:36:18 PDT 2007


I am sorry, but I am still not convinced. GO terms are used by the GO  
to annotate gene products.
I do not think that we, as the GO, should use these to annotate the
anatomy of a protein.
Of course, if some other group wants to do this then fine, but as
Emily points out they will need more
than the GO for that purpose.

Michael
On 2 Nov 2007, at 05:28, E Dimmer wrote:

> However there are quite a number of GO function terms which occur  
> on discrete portions of a protein sequence, for instance many of  
> the child terms of 'binding' (GO:0005488) (protein, ATP, lipid,
> co-factor etc.) and simple catalytic domains.
>
> I feel that there could be a mid-way point - there are GO terms  
> that can be annotated to a specific region of a sequence where it  
> is also appropriate for the function to be 'inherited' by the whole  
> protein.
> But there are also domain terms which are not appropriate for
> a whole protein - then these should go into another ontology, which  
> could be a composite of GO terms and domain-specific terms.
>
> So while protein domain function annotators and gene-product
> annotators will need to work from a different term set and add  
> different parameters to their annotations, where a domain has been  
> annotated to a GO term, e.g. 'DNA binding' IDA, then we could
> consider including these into GO.
>
> Would this suggestion be more acceptable to GO folk?
>
> Emily
>
>
> Michael Ashburner wrote:
>> I agree with Ben, this is not for the GO.
>> Michael
>>
>> On 1 Nov 2007, at 14:49, Benjamin Hitz wrote:
>>
>>>
>>> As resident protein structure expert, no.
>>> Not that what they are doing is wrong, or not important - but  
>>> it's not a biological process/function/component of the gene  
>>> (product) in question.
>>>
>>> What's next?  We annotate alpha helices?
>>>
>>> Ben
>>>
>>> On Nov 1, 2007, at 9:17 AM, E Dimmer wrote:
>>>
>>>> Hi,
>>>>
>>>> Could I please ask people's opinion on the functional annotation  
>>>> of protein domains/regions to the GO?
>>>>
>>>> I have been contacted by a group who would like to annotate GO  
>>>> functions to identified disordered regions in proteins.
>>>>
>>>> The thought so far is that they would annotate to a  
>>>> 'disordered_region' SO term, along with sequence co-ordinates,  
>>>> and then also attach a GO term with a reference and evidence code.
>>>> (I have spoken with Gabby Reeves from BioSapiens, who would be  
>>>> happy to add 'disordered_region' terms to the BioSapiens protein  
>>>> feature ontology section of SO).
>>>>
>>>> For an annotation example: protein LEF-1 (Q9QXN1) has a  
>>>> disordered region corresponding to residues 296 - 397. This  
>>>> domain has been found to act to bend DNA, as reported in an
>>>> experiment in PMID: 7651541.
>>>> In the normal course of GO annotation I would of course happily
>>>> annotate the whole protein (Q9QXN1) to the DNA bending term
>>>> (DNA bending activity, GO:0008301), and while I might read about  
>>>> the discrete region in the protein that is responsible for this  
>>>> function I would not capture this data.
>>>> However the IUP (Intrinsically Unstructured Protein) curators
>>>> would include the aa residue information in their annotations  
>>>> and want to describe the individual functions that a protein's  
>>>> multiple domains might have.
>>>>
>>>> So I assume that for these kinds of annotations, where an  
>>>> equivalent GO term exists, a GOC annotation group could  
>>>> integrate this group's annotations and relate them up to the whole
>>>> protein/gene product (and possibly be able to keep the SO
>>>> term in the new cross-reference column 16? but not the aa
>>>> residue location?).
>>>>
>>>> While the majority of the function terms that the IUP community  
>>>> are interested in applying to their domains do map quite  
>>>> straight-forwardly to GO terms, there are some new ones which  
>>>> would need to be requested. And some of these new terms seem to  
>>>> describe more domain-specific, intra-protein function. For  
>>>> example, for some of the function terms used in the DisProt  
>>>> database:
>>>>
>>>> flexible linker/spacer
>>>> Provides separation and permits movement between adjacent domains
>>>>
>>>> entropic bristle
>>>> A disordered region that creates a zone of exclusion by its  
>>>> entropic movement
>>>>
>>>> entropic spring
>>>> Provides a restoring force resulting from randomization of bond  
>>>> torsion angles that become restricted upon stretching.
>>>>
>>>> (see: http://www.disprot.org/view_function_subclass.php)
>>>>
>>>> So, would GO be willing to add these types of terms? And how  
>>>> much of the IUP community's annotation data would GOC groups be
>>>> happy to incorporate into their own annotation sets?
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>>
>>>> --
>>>> ************************************
>>>>    Emily Dimmer
>>>>    GOA Coordinator
>>>>    EMBL-EBI
>>>>    Wellcome Trust Genome Campus
>>>>    Hinxton
>>>>    Cambridge CB10 1SD, U.K.
>>>>    Tel:     +44 1223 494654
>>>>    Fax:    +44 1223 494468
>>>>    email:  edimmer at ebi.ac.uk
>>>> ************************************
>>>
>>> -- 
>>> Ben Hitz
>>> Senior Scientific Programmer ** Saccharomyces Genome Database **  
>>> GO Consortium
>>> Stanford University ** hitz at genome.stanford.edu
>>>
>>>
>>>
>
>
>



