[Annotation] filtering IEA translations?
Doug
dhowe at cs.uoregon.edu
Thu Mar 12 15:33:03 PDT 2009
We start with both Swiss-Prot and TrEMBL records for domain
assignments, so we may pick up some dubious domain hits, I suppose. It
seems there are several places where such an issue could be corrected:
1. Don't download domain info from TrEMBL records.
2. Improve the protein domain model so it produces fewer false
positives.
3. Remove the translation from the translation file if warranted.
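
As a rough illustration of option 3, here is a minimal sketch of dropping one mapping from a translation file. It assumes the usual external2go line layout ("InterPro:ID name > GO:term name ; GO:ID", with "!" comment lines). The function name, the sample lines, and the domain name in them are hypothetical, and the GO ID shown should be checked against the ontology before use.

```python
def drop_translation(lines, interpro_id, go_id):
    """Return translation-file lines with one InterPro-to-GO mapping removed."""
    kept = []
    for line in lines:
        stripped = line.strip()
        # Comment lines ("!") and lines without a mapping pass through untouched.
        if stripped.startswith("!") or ">" not in stripped:
            kept.append(line)
            continue
        source, target = stripped.split(">", 1)
        # The first token on the left is the InterPro accession.
        same_domain = source.split()[:1] == [f"InterPro:{interpro_id}"]
        # The GO ID is whatever follows the last ";" on the right.
        same_term = target.rsplit(";", 1)[-1].strip() == go_id
        if same_domain and same_term:
            continue  # drop only this one domain-to-term mapping
        kept.append(line)
    return kept


# Illustrative input only; not copied from a real interpro2go release.
sample = [
    "! illustrative excerpt",
    "InterPro:IPR009428 hypothetical domain name > GO:beta-catenin binding ; GO:0008013",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
]
result = drop_translation(sample, "IPR009428", "GO:0008013")
```

Matching on both the accession and the GO ID means other translations for the same domain stay in place, which seems closer to "if warranted" than removing every mapping for that domain.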
Doug Howe, Ph.D.
Scientific Curator
Zebrafish Nomenclature Coordinator
Zebrafish Information Network
541-346-0120
dhowe at cs.uoregon.edu
On Mar 12, 2009, at 3:10 PM, Harold Drabkin wrote:
> We get our IEAs based on interpro2go from domains contained in Swiss-Prot
> records. However, we make sure we use only the curated records and
> not TrEMBL records. The TrEMBL records often appear to contain domains
> that are kind of weird. For example, there is an S6 kinase domain that
> often appears, and this results in interpro2go annotations to ribosome,
> etc., for proteins that aren't ribosomal.
> How are you getting the domain assignments?
>
> Harold
>
> Doug wrote:
>> I vaguely recall that InterPro can mark domains as false positive
>> hits. The problem with that system is that I believe the domain
>> hit remains but only InterPro knows it is marked as a false
>> positive, i.e., we don't get that false positive information when we
>> sync our data with UniProt (our source for protein domains).
>> Perhaps I am mistaken there, though?
>>
>> On Mar 12, 2009, at 2:32 PM, David Hill wrote:
>>
>>> I think if the problem is with the misassignment of the domain,
>>> then the translation should be kept and the InterPro domain
>>> assignment should be corrected.
>>>
>>>
>>>
>>> Doug wrote:
>>>> So in my example, I would speculate wildly that the zebrafish
>>>> gene does in fact have a domain that looks very much like
>>>> something that would cause beta-catenin binding, but is perhaps
>>>> different enough as to not promote such binding...so the
>>>> translation should be stricken from the translation file until
>>>> the domain model itself can be improved so it can distinguish
>>>> between domains that do and don't bind beta-catenin?
>>>>
>>>>
>>>> On Mar 12, 2009, at 12:41 PM, David Hill wrote:
>>>>
>>>>> I think we should remove the translation. When we originally
>>>>> made these translation tables, we used the very conservative rule
>>>>> that if we could find an exception to the translation being
>>>>> correct, we would remove it. Otherwise, erroneous data may be
>>>>> generated for organisms that don't have experimental support.
>>>>>
>>>>> David
>>>>>
>>>>> Doug wrote:
>>>>>> For groups that apply interpro2go, spkw2go, or ec2go
>>>>>> translation files:
>>>>>>
>>>>>> If a translation from interpro2go, for example, takes you to a GO
>>>>>> term that directly contradicts an experimentally
>>>>>> supported annotation in your database, do you apply the IEA
>>>>>> annotation or do you filter it out?
>>>>>>
>>>>>> Example:
>>>>>> We have an IPI annotation to NOT 'beta-catenin binding' and an
>>>>>> IEA annotation (translation of InterPro:IPR009428) to
>>>>>> 'beta-catenin binding' on our lzic gene. Should such an IEA
>>>>>> annotation be made when it conflicts with experimental
>>>>>> annotations?
>>>>>>
>>>>>> I see no problem as long as the evidence code is taken into
>>>>>> account...what do others think?
>>>>>>
>>>>>> -Doug
>>>>>>
>>>>>> _______________________________________________
>>>>>> Annotation mailing list
>>>>>> Annotation at geneontology.org
>>>>>> http://fafner.stanford.edu/mailman/listinfo/annotation
>>>>>
>>>>> --
>>>>> David P. Hill, Ph.D.
>>>>> Bioinformatics Scientist: Ontology Development
>>>>> Gene Ontology Consortium
>>>>> The Jackson Laboratory
>>>>> www.geneontology.org
>>>>> www.informatics.jax.org
>>>>> tel:207-288-6430
>>>>
>>>
>>
>> _______________________________________________
>> Annotation mailing list
>> Annotation at geneontology.org
>> http://fafner.stanford.edu/mailman/listinfo/annotation
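
The question that opened this thread (whether to filter out an IEA annotation that contradicts an experimentally supported NOT annotation) can be sketched as a simple pre-load filter. This is only one possible reading of the "filter it out" option, not a policy the thread settled on. The Annotation record, the function name, and the gene "geneX" are hypothetical, and the match is on the exact term only, so it ignores parent and child terms in the ontology.

```python
from dataclasses import dataclass

# GO experimental evidence codes.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Annotation:
    gene: str
    go_term: str
    evidence: str
    negated: bool = False  # True for a NOT annotation


def filter_conflicting_iea(annotations):
    """Drop positive IEA annotations refuted by an experimental NOT annotation."""
    # Collect (gene, term) pairs that carry an experimentally supported NOT.
    refuted = {
        (a.gene, a.go_term)
        for a in annotations
        if a.negated and a.evidence in EXPERIMENTAL_CODES
    }
    return [
        a
        for a in annotations
        if not (
            a.evidence == "IEA"
            and not a.negated
            and (a.gene, a.go_term) in refuted
        )
    ]


annotations = [
    Annotation("lzic", "beta-catenin binding", "IPI", negated=True),
    Annotation("lzic", "beta-catenin binding", "IEA"),
    Annotation("geneX", "beta-catenin binding", "IEA"),  # hypothetical gene
]
kept = filter_conflicting_iea(annotations)
```

Here the lzic IEA annotation is dropped, while the same translation is still applied to a gene with no contradicting experimental data, so the translation file itself is left unchanged.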