[Annotation] filtering IEA translations?

Harold Drabkin hjd at informatics.jax.org
Thu Mar 12 15:39:06 PDT 2009


We do load and display all of the domains from both types of records,
but we only translate the ones from the SwissProt records. In the
example I have, the translation is correct, but the assignment of the
domain to the protein usually is not. In the past we have both
requested removal of translations and questioned domain assignments
that we thought were dubious.
hd
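
A minimal sketch of the two filtering policies discussed in this thread,
under stated assumptions. The record and annotation classes, the function
names, the gene "geneX", and the one-entry mapping table are hypothetical
stand-ins, not any group's actual pipeline or file format. Whether lzic's
domain comes from a SwissProt record is also assumed here purely for
illustration. The first function translates domains only from reviewed
(SwissProt) records; the second is one optional way to drop an IEA
annotation that contradicts an experimentally supported NOT annotation.

```python
from dataclasses import dataclass

# Hypothetical one-entry stand-in for the interpro2go translation file.
# Real files map to GO IDs; the term name from the thread is used here.
INTERPRO2GO = {"IPR009428": ["beta-catenin binding"]}

# GO experimental evidence codes.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class ProteinRecord:
    gene: str
    reviewed: bool   # True for SwissProt, False for TrEMBL
    domains: tuple   # InterPro accessions assigned to the record


@dataclass(frozen=True)
class Annotation:
    gene: str
    term: str
    evidence: str
    negated: bool = False  # True for a NOT annotation


def iea_from_domains(records, mapping=INTERPRO2GO):
    """Translate domains to IEA annotations for reviewed records only."""
    annotations = []
    for rec in records:
        if not rec.reviewed:
            continue  # domains may still be displayed, just not translated
        for domain in rec.domains:
            for term in mapping.get(domain, []):
                annotations.append(Annotation(rec.gene, term, "IEA"))
    return annotations


def drop_contradicted(iea, curated):
    """Remove IEA annotations that an experimental NOT annotation contradicts."""
    blocked = {
        (a.gene, a.term)
        for a in curated
        if a.negated and a.evidence in EXPERIMENTAL_CODES
    }
    return [a for a in iea if (a.gene, a.term) not in blocked]


records = [
    ProteinRecord("lzic", True, ("IPR009428",)),
    ProteinRecord("geneX", False, ("IPR009428",)),  # unreviewed: skipped
]
curated = [Annotation("lzic", "beta-catenin binding", "IPI", negated=True)]

iea = iea_from_domains(records)
kept = drop_contradicted(iea, curated)
```

Keeping both annotations and relying on the evidence code, as suggested
later in the thread, corresponds to skipping the second step.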


Doug wrote:
> We start with both SwissProt and TrEMBL records for domain
> assignments, so I suppose we may pick up some dubious ones. There
> seem to be several places where such an issue could be corrected:
>
> 1.  Don't download domain info from TrEMBL records.
> 2.  Improve the protein domain model so it produces fewer false 
> positives.
> 3.  Remove the translation from the translation file if warranted.
>
> Doug Howe, Ph.D.
> Scientific Curator
> Zebrafish Nomenclature Coordinator
> Zebrafish Information Network
> 541-346-0120
> dhowe at cs.uoregon.edu
>
>
>
> On Mar 12, 2009, at 3:10 PM, Harold Drabkin wrote:
>
>> We get our IEAs by applying interpro2go (ip2go) to domains contained
>> in SwissProt records. We make sure we use only the curated records
>> and not TrEMBL records, which often appear to contain questionable
>> domain assignments. For example, an S6 kinase domain often appears,
>> and this results in ip2go annotations to ribosome, etc., for
>> proteins that aren't ribosomal.
>> How are you getting the domain assignments?
>>
>> Harold
>>
>> Doug wrote:
>>> I vaguely recall that InterPro can mark domains as false positive
>>> hits. The problem with that system is that, I believe, the domain
>>> hit remains and only InterPro knows it is marked as a false
>>> positive; i.e., we don't get that false positive information when
>>> we sync our data with UniProt (our source for protein domains).
>>> Perhaps I am mistaken there, though?
>>>
>>> On Mar 12, 2009, at 2:32 PM, David Hill wrote:
>>>
>>>> I think if the problem is with the mis-assignment of the domain, 
>>>> then the translation should be kept and the interpro domain 
>>>> assignment should be corrected.
>>>>
>>>>
>>>>
>>>> Doug wrote:
>>>>> So in my example, I would speculate wildly that the zebrafish gene 
>>>>> does in fact have a domain that looks very much like something 
>>>>> that would cause beta-catenin binding, but is perhaps different 
>>>>> enough as to not promote such binding...so the translation should 
>>>>> be stricken from the translation file until the domain model 
>>>>> itself can be improved so it can distinguish between domains that 
>>>>> do and don't bind beta-catenin?
>>>>>
>>>>>
>>>>> On Mar 12, 2009, at 12:41 PM, David Hill wrote:
>>>>>
>>>>>> I think we should remove the translation. When we originally made 
>>>>>> these translation tables, we used the very conservative rule that 
>>>>>> if we could find an exception to the translation being correct, 
>>>>>> we would remove it. Otherwise, erroneous data may be generated 
>>>>>> for organisms that don't have experimental support.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> Doug wrote:
>>>>>>> For groups that apply interpro2go, spkw2go, or ec2go translation 
>>>>>>> files:
>>>>>>>
>>>>>>> If a translation from interpro2go for example takes you to a GO 
>>>>>>> term which is directly contradictory to an experimentally 
>>>>>>> supported annotation in your database, do you apply the IEA 
>>>>>>> annotation or do you filter it out?
>>>>>>>
>>>>>>> Example:
>>>>>>>  We have an IPI annotation to NOT beta-catenin binding and an 
>>>>>>> IEA annotation (translation of InterPro:IPR009428) to 
>>>>>>> 'beta-catenin binding' on our lzic gene.  Should such an IEA 
>>>>>>> annotation be made when it conflicts with experimental annotations?
>>>>>>>
>>>>>>> I see no problem as long as the evidence code is taken into 
>>>>>>> account...what do others think?
>>>>>>>
>>>>>>> -Doug
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Annotation mailing list
>>>>>>> Annotation at geneontology.org
>>>>>>> http://fafner.stanford.edu/mailman/listinfo/annotation
>>>>>>
>>>>>> -- 
>>>>>> David P. Hill, Ph.D.
>>>>>> Bioinformatics Scientist: Ontology Development
>>>>>> Gene Ontology Consortium
>>>>>> The Jackson Laboratory
>>>>>> www.geneontology.org
>>>>>> www.informatics.jax.org
>>>>>> tel:207-288-6430
>>>>>
>>>>
>>>
>


