[go] Protein domain GO annotation

Harold Drabkin hjd at informatics.jax.org
Fri Nov 2 09:36:26 PDT 2007


I agree with Ben.
Really, this seems very close to InterPro. These are closer to
"domains". We don't annotate domains; we annotate whole gene products.
An ip2go translation table can be used as a first pass at function
prediction for an entire protein, but we don't annotate the domain itself.
I suppose one could call the translation table a file that is essentially
GO annotation to the domain; it's a curated "set" of domains. But it's
not the same as annotating the entire gene product.
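
To make the "first pass" idea concrete, here is a minimal Python sketch of
looking up domain hits in an ip2go-style translation table to get candidate
GO terms for a whole protein. It assumes the usual external2go line layout
("InterPro:IPR... name > GO:term name ; GO:id"). The IPR identifiers, the
sample lines, and the function names parse_ip2go and predict_protein_terms
are invented for illustration and are not taken from the real file.

```python
import re
from collections import defaultdict

# Illustrative lines in the assumed external2go layout.
# The IPR identifiers and domain names below are made up.
SAMPLE_IP2GO = """\
! comment lines start with '!'
InterPro:IPR999001 Hypothetical ATP-binding domain > GO:ATP binding ; GO:0005524
InterPro:IPR999002 Hypothetical DNA-binding domain > GO:DNA binding ; GO:0003677
InterPro:IPR999002 Hypothetical DNA-binding domain > GO:nucleus ; GO:0005634
"""

# Capture the InterPro accession and the trailing GO identifier.
LINE_RE = re.compile(r"^InterPro:(IPR\d+)\s.*>\s*GO:.*;\s*(GO:\d{7})\s*$")


def parse_ip2go(text):
    """Return a dict mapping each InterPro accession to a set of GO ids."""
    mapping = defaultdict(set)
    for line in text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        match = LINE_RE.match(line)
        if match:
            mapping[match.group(1)].add(match.group(2))
    return mapping


def predict_protein_terms(domain_hits, mapping):
    """Union the GO ids of every domain found on one protein.

    This is only a prediction for the whole gene product; it is not a
    curated annotation of the protein or of the domain.
    """
    terms = set()
    for accession in domain_hits:
        terms |= mapping.get(accession, set())
    return terms


if __name__ == "__main__":
    table = parse_ip2go(SAMPLE_IP2GO)
    print(sorted(predict_protein_terms(["IPR999001", "IPR999002"], table)))
```

The result is a set of candidate terms for the entire protein. Which residues
carry the function is lost in the lookup, which is the distinction drawn
above between the translation table and gene product annotation.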

hjd

Ben Hitz wrote:
>
> The difference is whether or not the (sub)domain imparts the
> "quality" (i.e., MF) to the whole protein.
> An ATP binding domain imparts the function ATP binding to the protein 
> that contains it.
> "Linker Region" does not impart anything to the whole.
> "Entropic Bristle" might, I suppose, depending on the definition.
> "Entropic Spring" really sounds like a function of the whole protein.
>
> So really there are two separate issues here:  1) whether or not we 
> should annotate regions of protein ("domains") to distinct terms.  2) 
> Whether or not we should add certain MF terms that are related to a 
> protein domain's "disorder".
>
> InterPro, in principle, already does the former.  I can see a possible
> argument for the latter, but a curator would have to determine 
> biological significance.  (Assuming experimental evidence, yadda yadda)
>
> Ben
> On Nov 2, 2007, at 2:28 AM, E Dimmer wrote:
>
>> However there are quite a number of GO function terms which occur on 
>> discrete portions of a protein sequence, for instance many of the 
>> child terms of  'binding' (GO:0005488) (protein, ATP, lipid, 
>> co-factor etc) and simple catalytic domains.
>>
>> I feel that there could be a mid-way point - there are GO terms that 
>> can be annotated to a specific region of a sequence where it is also 
>> appropriate for the function to be 'inherited' by the whole protein.
>> But there are also domain terms which are not appropriate for a
>> whole protein - then these should go into another ontology, which 
>> could be a composite of GO terms and domain-specific terms.
>>
>> So while protein domain function annotators and gene-product
>> annotators will need to work from a different term set and add 
>> different parameters to their annotations, where a domain has been 
>> annotated to a GO term, e.g. 'DNA binding' IDA, then we could
>> consider including these into GO.
>>
>> Would this suggestion be more acceptable to GO folk?
>>
>> Emily
>>
>>
>> Michael Ashburner wrote:
>>> I agree with Ben, this is not for the GO.
>>> Michael
>>>
>>> On 1 Nov 2007, at 14:49, Benjamin Hitz wrote:
>>>
>>>>
>>>> As resident protein structure expert, no.
>>>> Not that what they are doing is wrong, or not important - but it's 
>>>> not a biological process/function/component of the gene (product) 
>>>> in question.
>>>>
>>>> What's next?  We annotate alpha helices?
>>>>
>>>> Ben
>>>>
>>>> On Nov 1, 2007, at 9:17 AM, E Dimmer wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could I please ask people's opinion on the functional annotation 
>>>>> of protein domains/regions to the GO?
>>>>>
>>>>> I have been contacted by a group who would like to annotate GO 
>>>>> functions to identified disordered regions in proteins.
>>>>>
>>>>> The thought so far is that they would annotate to a 
>>>>> 'disordered_region' SO term, along with sequence co-ordinates, and 
>>>>> then also attach a GO term with a reference and evidence code.
>>>>> (I have spoken with Gabby Reeves from BioSapiens, who would be 
>>>>> happy to add 'disordered_region' terms to the BioSapiens protein 
>>>>> feature ontology section of SO).
>>>>>
>>>>> For an annotation example: protein LEF-1 (Q9QXN1) has a disordered 
>>>>> region corresponding to residues 296 - 397. This domain has been 
>>>>> found to act to bend DNA, as reported in an experiment in PMID:
>>>>> 7651541.
>>>>> In the normal course of GO annotation I would of course happily
>>>>> annotate the whole protein (Q9QXN1) to the DNA bending term (DNA
>>>>> bending activity, GO:0008301), and while I might read about the 
>>>>> discrete region in the protein that is responsible for this 
>>>>> function I would not capture this data.
>>>>> However the IUP (Intrinsically Unstructured Protein) curators would
>>>>> include the aa residue information in their annotations and want 
>>>>> to describe the individual functions that a protein's multiple 
>>>>> domains might have.
>>>>>
>>>>> So I assume that for these kinds of annotations, where an 
>>>>> equivalent GO term exists, a GOC annotation group could integrate 
>>>>> this group's annotations and relate them up to the whole
>>>>> protein/gene product (and possibly being able to keep the SO term 
>>>>> in the new cross-reference column 16? but not the aa residue 
>>>>> location?).
>>>>>
>>>>> While the majority of the function terms that the IUP community 
>>>>> are interested in applying to their domains do map quite 
>>>>> straight-forwardly to GO terms, there are some new ones which 
>>>>> would need to be requested. And some of these new terms seem to 
>>>>> describe more domain-specific, intra-protein function. For 
>>>>> example, for some of the function terms used in the DisProt database:
>>>>>
>>>>> flexible linker/spacer
>>>>> Provides separation and permits movement between adjacent domains
>>>>>
>>>>> entropic bristle
>>>>> A disordered region that creates a zone of exclusion by its 
>>>>> entropic movement
>>>>>
>>>>> entropic spring
>>>>> Provides a restoring force resulting from randomization of bond 
>>>>> torsion angles that become restricted upon stretching.
>>>>>
>>>>> (see: http://www.disprot.org/view_function_subclass.php)
>>>>>
>>>>> So, would GO be willing to add these types of terms? And how much 
>>>>> of the IUP community's annotation data would GOC groups be happy
>>>>> to incorporate into their own annotation sets?
>>>>>
>>>>> Thanks,
>>>>> Emily
>>>>>
>>>>>
>>>>> --************************************
>>>>>    Emily Dimmer
>>>>>    GOA Coordinator
>>>>>    EMBL-EBI
>>>>>    Wellcome Trust Genome Campus
>>>>>    Hinxton
>>>>>    Cambridge CB10 1SD, U.K.
>>>>>    Tel:     +44 1223 494654
>>>>>    Fax:    +44 1223 494468
>>>>>    email:  edimmer at ebi.ac.uk
>>>>> ************************************
>>>>
>>>> -- 
>>>> Ben Hitz
>>>> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO 
>>>> Consortium
>>>> Stanford University ** hitz at genome.stanford.edu
>>>>
>>>>
>>>>
