[go] Protein domain GO annotation
Valerie Wood
val at sanger.ac.uk
Fri Nov 2 04:33:15 PDT 2007
Hi Emily,
I agree that when a function or process is known to be the role of a
specific region of a protein, it is useful to capture this information. I
often do this by recording the amino acid coordinates or the domain
accession (if there is one) as a qualifier to the annotation, but at
present do nothing further with this information. I was also hoping that
this could somehow eventually be cross-referenced to a separate ontology
in column 16 (i.e. the protein feature region of SO).
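To make the idea of a region qualifier concrete, here is a minimal Python
sketch. The record layout, the field names and the 'feature:start-end'
qualifier string are all invented for illustration and are not part of any
GO annotation file format; the values are taken from Emily's LEF-1 example
quoted below. No evidence code is filled in because the thread does not
give one for that example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RegionAnnotation:
    """One GO annotation plus optional details of the region responsible.

    Hypothetical structure for illustration only.
    """
    protein: str                    # protein accession
    go_id: str                      # GO term the protein is annotated to
    reference: str                  # supporting publication
    start: Optional[int] = None     # first residue of the region, if known
    end: Optional[int] = None       # last residue of the region, if known
    feature: Optional[str] = None   # SO feature term or domain accession


def region_qualifier(ann: RegionAnnotation) -> str:
    """Build a free-text qualifier; the 'feature:start-end' format is invented."""
    parts = []
    if ann.feature:
        parts.append(ann.feature)
    if ann.start is not None and ann.end is not None:
        parts.append(f"{ann.start}-{ann.end}")
    return ":".join(parts)


def protein_level(ann: RegionAnnotation) -> Tuple[str, str, str]:
    """Drop the region details, leaving the ordinary whole-protein annotation."""
    return (ann.protein, ann.go_id, ann.reference)


# Values from the LEF-1 example in the quoted message.
lef1 = RegionAnnotation(
    protein="Q9QXN1",
    go_id="GO:0008301",
    reference="PMID:7651541",
    start=296,
    end=397,
    feature="disordered_region",
)

print(region_qualifier(lef1))
print(protein_level(lef1))
```

The point of keeping the region fields optional is that the same record
still works as a normal whole-protein annotation when the coordinates are
unknown, as with the fusion proteins where the cleavage site has not been
determined.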
I have some extreme examples. In pombe there are a number of 'fusion
proteins' which appear to be post-translationally cleaved. For example,
the mitochondrial type I [2Fe-2S] ferredoxin Etp1/cytochrome oxidase
cofactor Cox15 fusion. As I don't know where the cleavage site is, and
I annotate to the gene, the GO terms are all annotated to a single
sequence. Clearly it will be useful to record the region to which the
process and function apply in these cases.
I also agree that if a region or domain has clearly been ascribed a
specific function or activity, then it is useful to record this too:
for example, in obvious cases of multifunctional enzymes like ura1
(http://pfam.sanger.ac.uk/protein?acc=Q09794),
but also, more subtly, when different regions of a protein have been
identified as important for different processes. This is all
important data that should be captured by curators, even if it is not
part of GO.
Val
E Dimmer wrote:
> However there are quite a number of GO function terms which occur on
> discrete portions of a protein sequence, for instance many of the
> child terms of 'binding' (GO:0005488) (protein, ATP, lipid, co-factor
> etc) and simple catalytic domains.
>
> I feel that there could be a mid-way point - there are GO terms that
> can be annotated to a specific region of a sequence where it is also
> appropriate for the function to be 'inherited' by the whole protein.
> But there are also domain terms which are not appropriate for a
> whole protein - then these should go into another ontology, which
> could be a composite of GO terms and domain-specific terms.
>
> So while protein domain function annotators and gene-product
> annotators will need to work from a different term set and add
> different parameters to their annotations, where a domain has been
> annotated to a GO term, e.g. 'DNA binding' IDA, then we could consider
> including these in GO.
>
> Would this suggestion be more acceptable to GO folk?
>
> Emily
>
>
> Michael Ashburner wrote:
>
>> I agree with Ben, this is not for the GO.
>> Michael
>>
>> On 1 Nov 2007, at 14:49, Benjamin Hitz wrote:
>>
>>>
>>> As resident protein structure expert, no.
>>> Not that what they are doing is wrong, or not important - but it's
>>> not a biological process/function/component of the gene (product) in
>>> question.
>>>
>>> What's next? We annotate alpha helices?
>>>
>>> Ben
>>>
>>> On Nov 1, 2007, at 9:17 AM, E Dimmer wrote:
>>>
>>>> Hi,
>>>>
>>>> Could I please ask people's opinion on the functional annotation of
>>>> protein domains/regions to the GO?
>>>>
>>>> I have been contacted by a group who would like to annotate GO
>>>> functions to identified disordered regions in proteins.
>>>>
>>>> The thought so far is that they would annotate to a
>>>> 'disordered_region' SO term, along with sequence co-ordinates, and
>>>> then also attach a GO term with a reference and evidence code.
>>>> (I have spoken with Gabby Reeves from BioSapiens, who would be
>>>> happy to add 'disordered_region' terms to the BioSapiens protein
>>>> feature ontology section of SO).
>>>>
>>>> For an annotation example: protein LEF-1 (Q9QXN1) has a disordered
>>>> region corresponding to residues 296 - 397. This domain has been
>>>> found to act to bend DNA, as reported in an experiment in
>>>> PMID:7651541.
>>>> In the normal course of GO annotation I would of course happily
>>>> annotate the whole protein (Q9QXN1) to the DNA bending term (DNA
>>>> bending activity, GO:0008301), and while I might read about the
>>>> discrete region in the protein that is responsible for this
>>>> function I would not capture this data.
>>>> However, the IUP (Intrinsically Unstructured Protein) curators would
>>>> include the aa residue information in their annotations and want to
>>>> describe the individual functions that a protein's multiple domains
>>>> might have.
>>>>
>>>> So I assume that for these kinds of annotations, where an
>>>> equivalent GO term exists, a GOC annotation group could integrate
>>>> this group's annotations and relate it up to the whole protein/gene
>>>> product (and possibly being able to keep the SO term in the new
>>>> cross-reference column 16? but not the aa residue location?).
>>>>
>>>> While the majority of the function terms that the IUP community are
>>>> interested in applying to their domains do map quite
>>>> straight-forwardly to GO terms, there are some new ones which would
>>>> need to be requested. And some of these new terms seem to describe
>>>> more domain-specific, intra-protein function. For example, for some
>>>> of the function terms used in the DisProt database:
>>>>
>>>> flexible linker/spacer
>>>> Provides separation and permits movement between adjacent domains
>>>>
>>>> entropic bristle
>>>> A disordered region that creates a zone of exclusion by its
>>>> entropic movement
>>>>
>>>> entropic spring
>>>> Provides a restoring force resulting from randomization of bond
>>>> torsion angles that become restricted upon stretching.
>>>>
>>>> (see: http://www.disprot.org/view_function_subclass.php)
>>>>
>>>> So, would GO be willing to add these types of terms? And how much
>>>> of the IUP community's annotation data would GOC groups be happy to
>>>> incorporate into their own annotation sets?
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>>
>>>> --************************************
>>>> Emily Dimmer
>>>> GOA Coordinator
>>>> EMBL-EBI
>>>> Wellcome Trust Genome Campus
>>>> Hinxton
>>>> Cambridge CB10 1SD, U.K.
>>>> Tel: +44 1223 494654
>>>> Fax: +44 1223 494468
>>>> email: edimmer at ebi.ac.uk
>>>> ************************************
>>>
>>>
>>> --
>>> Ben Hitz
>>> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO
>>> Consortium
>>> Stanford University ** hitz at genome.stanford.edu
>>>
>>>
>>>
>
>