[go] Protein domain GO annotation

David Hill dph at informatics.jax.org
Fri Nov 2 05:48:21 PDT 2007


Hi Everyone,

I might as well weigh in too. I think Michael hits the nail on the 
head:
> GO terms are used by the GO to annotate gene products.
Whenever we do anything with the GO it is always with the idea that we 
are using it for this purpose. The GO may suit other purposes, but our 
concern is that it is suitable for gene product annotation. That's what 
we do.

David
>
> Michael
> On 2 Nov 2007, at 05:28, E Dimmer wrote:
>
>> However there are quite a number of GO function terms which occur on 
>> discrete portions of a protein sequence, for instance many of the 
>> child terms of  'binding' (GO:0005488) (protein, ATP, lipid, 
>> co-factor etc) and simple catalytic domains.
>>
>> I feel that there could be a mid-way point - there are GO terms that 
>> can be annotated to a specific region of a sequence where it is also 
>> appropriate for the function to be 'inherited' by the whole protein.
>> But there are also domain terms which are not appropriate for a 
>> whole protein - then these should go into another ontology, which 
>> could be a composite of GO terms and domain-specific terms.
>>
>> So while protein domain function annotators and gene-product 
>> annotators will need to work from a different term set and add 
>> different parameters to their annotations, where a domain has been 
>> annotated to a GO term, e.g. 'DNA binding' (IDA), then we could 
>> consider including these in GO.
>>
>> Would this suggestion be more acceptable to GO folk?
>>
>> Emily
>>
>>
>> Michael Ashburner wrote:
>>> I agree with Ben, this is not for the GO.
>>> Michael
>>>
>>> On 1 Nov 2007, at 14:49, Benjamin Hitz wrote:
>>>
>>>>
>>>> As resident protein structure expert, no.
>>>> Not that what they are doing is wrong, or not important - but it's 
>>>> not a biological process/function/component of the gene (product) 
>>>> in question.
>>>>
>>>> What's next?  We annotate alpha helices?
>>>>
>>>> Ben
>>>>
>>>> On Nov 1, 2007, at 9:17 AM, E Dimmer wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Could I please ask people's opinion on the functional annotation 
>>>>> of protein domains/regions to the GO?
>>>>>
>>>>> I have been contacted by a group who would like to annotate GO 
>>>>> functions to identified disordered regions in proteins.
>>>>>
>>>>> The thought so far is that they would annotate to a 
>>>>> 'disordered_region' SO term, along with sequence co-ordinates, and 
>>>>> then also attach a GO term with a reference and evidence code.
>>>>> (I have spoken with Gabby Reeves from BioSapiens, who would be 
>>>>> happy to add 'disordered_region' terms to the BioSapiens protein 
>>>>> feature ontology section of SO).
>>>>>
>>>>> For an annotation example: protein LEF-1 (Q9QXN1) has a disordered 
>>>>> region corresponding to residues 296 - 397. This domain has been 
>>>>> found to act to bend DNA, as reported in an experiment in PMID: 
>>>>> 7651541.
>>>>> In the normal course of GO annotation I would of course happily 
>>>>> annotate the whole protein (Q9QXN1) to the DNA bending term (DNA 
>>>>> bending activity, GO:0008301), and while I might read about the 
>>>>> discrete region in the protein that is responsible for this 
>>>>> function I would not capture this data.
>>>>> However, the IUP (Intrinsically Unstructured Protein) curators would 
>>>>> include the aa residue information in their annotations and want 
>>>>> to describe the individual functions that a protein's multiple 
>>>>> domains might have.
>>>>>
>>>>> So I assume that for these kinds of annotations, where an 
>>>>> equivalent GO term exists, a GOC annotation group could integrate 
>>>>> this group's annotations and relate it up to the whole 
>>>>> protein/gene product (and possibly being able to keep the SO term 
>>>>> in the new cross-reference column 16? but not the aa residue 
>>>>> location?).
>>>>>
>>>>> While the majority of the function terms that the IUP community 
>>>>> are interested in applying to their domains do map quite 
>>>>> straight-forwardly to GO terms, there are some new ones which 
>>>>> would need to be requested. And some of these new terms seem to 
>>>>> describe more domain-specific, intra-protein function. For 
>>>>> example, for some of the function terms used in the DisProt database:
>>>>>
>>>>> flexible linker/spacer
>>>>> Provides separation and permits movement between adjacent domains
>>>>>
>>>>> entropic bristle
>>>>> A disordered region that creates a zone of exclusion by its 
>>>>> entropic movement
>>>>>
>>>>> entropic spring
>>>>> Provides a restoring force resulting from randomization of bond 
>>>>> torsion angles that become restricted upon stretching.
>>>>>
>>>>> (see: http://www.disprot.org/view_function_subclass.php)
>>>>>
>>>>> So, would GO be willing to add these types of terms? And how much 
>>>>> of the IUP community's annotation data would GOC groups be happy 
>>>>> to incorporate into their own annotation sets?
>>>>>
>>>>> Thanks,
>>>>> Emily
>>>>>
>>>>>
>>>>> --************************************
>>>>>    Emily Dimmer
>>>>>    GOA Coordinator
>>>>>    EMBL-EBI
>>>>>    Wellcome Trust Genome Campus
>>>>>    Hinxton
>>>>>    Cambridge CB10 1SD, U.K.
>>>>>    Tel:     +44 1223 494654
>>>>>    Fax:    +44 1223 494468
>>>>>    email:  edimmer at ebi.ac.uk
>>>>> ************************************
>>>>
>>>> --Ben Hitz
>>>> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO 
>>>> Consortium
>>>> Stanford University ** hitz at genome.stanford.edu
>>>>
>>>>
>>>>
>>
>>
>>
>




