[Go] documentation for the new 'Experimental' evidence code.

Chris Mungall cjm at fruitfly.org
Mon Apr 14 11:28:40 PDT 2008


On Apr 14, 2008, at 5:19 AM, Ruth Lovering wrote:

> Hi Emily
>
> I think we need to try not to make things more confusing for the  
> users of GO.  It would be good to choose something both logical,  
> clear and consistent with the other evidence code acronyms.   
> Unfortunately there is not such an obvious acronym for the  
> experimental code as there was for the ISS subgroups.
>
> From the text written for the IE code it is apparent that this code  
> is only really for non-GOC members, e.g. Reactome and other  
> species-specific databases, which may have a bearing on the decision  
> for what acronym to choose (although maybe not).
>
> I would be very wary of using IE because we already have IEA and  
> IEP and it might appear to be related to both.
>
> EXP has the advantage of being the most obvious acronym for  
> experimental (and the majority of scientists will recognise it  
> instantly as such); however, it breaks the usual convention of  
> starting the acronym with 'I', although as you point out not all  
> evidence codes have 3 letters (or start with I).  I guess the RCA  
> evidence code provides the support for the EXP acronym.
>
> An alternative would be: IEX: Inferred from EXperiment
>
> Given the choice between IE, IEX and EXP I would favour EXP:
>
> EXP: inferred from EXPeriment

Hi Ruth

I agree that it's good to choose something logical, clear and  
consistent. However, this is completely at odds with the requirement  
to choose a unique name consisting of 3 letters.

EXP is the clearest, least confusing, and least open to  
misinterpretation (though perhaps many people will read it as  
"expression"), so it's the best choice. However, we have to acknowledge  
it's kind of daft - it's not even an acronym, as you point out.

I think we have to admit the current system is weird, ad-hoc, broken,  
inconsistent, illogical and confusing. For example, nothing in the  
letters IEA vs ISS clearly indicates the most salient difference  
between these two codes. Yet it's impossible to change them because  
we use them as IDs, not labels. The whole code system is reminiscent  
of pre-GO era terminologies. I don't think GO would have made much  
progress if we had wasted time discussing whether apoptosis should be  
APO or APT or APS. We have a system with a robust identifier  
lifecycle policy, synonym mechanism etc. that works for GO terms. Why  
not use it for evidence too?
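
To make the ID-versus-label point concrete, here is a minimal sketch in Python. It is only an illustration: the identifiers (EV:0000001 etc.) are made up for the example and are not real ECO IDs, and the class, field and function names are hypothetical. The idea is that the short codes become synonyms hanging off a stable ID, so the label can change without breaking anything that stores the ID.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class EvidenceTerm:
    term_id: str                      # stable identifier, never renamed
    label: str                        # human-readable name, free to change
    synonyms: Set[str] = field(default_factory=set)  # e.g. legacy short codes
    parent_id: Optional[str] = None   # more general evidence term, if any


# Hypothetical IDs, for illustration only (NOT real ECO identifiers).
TERMS = {
    "EV:0000001": EvidenceTerm("EV:0000001", "experiment", {"EXP", "IE"}),
    "EV:0000002": EvidenceTerm("EV:0000002", "direct assay", {"IDA"}, "EV:0000001"),
    "EV:0000003": EvidenceTerm("EV:0000003", "mutant phenotype", {"IMP"}, "EV:0000001"),
}


def resolve(name: str) -> Optional[str]:
    """Return the stable ID for an ID, label or synonym; None if unknown."""
    if name in TERMS:
        return name
    for term in TERMS.values():
        if name == term.label or name in term.synonyms:
            return term.term_id
    return None
```

With something like this, whether the label ends up as "EXP" or "IE" stops mattering: both resolve to the same term, and the parent/child relation is recorded explicitly rather than implied by the letters.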

Sticking with the system would be defensible if we knew that the set  
of codes was stable and frozen. But this seems unlikely - there will  
always be new experimental methods, and the subtle interactions  
between human and computer based methods will continue to evolve.

The only reason not to switch to ECO IDs immediately is the fact that  
this requires a non-backwards compatible change. I think any such  
change would need to be well orchestrated and would need a lead time  
of a year, but I think it may be necessary.

As a possible short term measure: what about allowing >3 letters?  
We'd need to give a little lead time for people who have hardcoded a  
<=3 letter assumption in software and database schemas, but this is  
relatively trivial compared to switching to a full-blown ontology. We  
could then call EXP "EXPT". Or "EXPERIMENT". Or how about  
"Experiment" even. So it's not a "code". Does that matter? It does  
have the advantage of being in the same language that the majority of  
scientists (outside the GO cognoscenti) use.
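
As a rough sketch of what removing the hardcoded length assumption might look like in software (Python, illustrative only): validate against the set of known codes instead of checking for three uppercase letters. The code set below just collects codes mentioned in this thread, and the function names are hypothetical.

```python
# Codes mentioned in this thread; a real list would come from the GO docs.
KNOWN_CODES = {
    "EXP", "IDA", "IPI", "IMP", "IGI", "IEP",
    "ISS", "ISO", "ISA", "ISM", "IGC", "RCA", "IEA",
}


def brittle_is_valid(code):
    """The kind of check that breaks if codes grow beyond 3 letters."""
    return len(code) <= 3 and code.isupper()


def is_valid_evidence(code, known=KNOWN_CODES):
    """Membership check: makes no assumption about length or case."""
    return code in known
```

If the list were later extended with a longer name such as "Experiment", `is_valid_evidence` would accept it once it is added to the set, whereas `brittle_is_valid` would reject it and would also accept any unknown three-letter string.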

> With respect to the text, in the three sections the hierarchy is  
> referred to in 3 different ways, parent, granular and specific.  So  
> perhaps the 'specific' phrase could be used in these sections?  
> (although this does lead to the text being rather repetitive):
>> This code is used in an annotation to indicate that an experimental
>> assay has been located in the cited reference, whose results indicate a
>> gene product's function, process involvement, or subcellular location
>> (indicated by the GO term). The IE code is the parent code for the  
>> more specific IDA, IMP, IGI, IEP and IPI experimental codes.
>
>> The IE evidence code can be used where any of the assays described
>> for the IDA, IMP, IGI, IPI or IEP evidence codes is reported. However
>> it is highly encouraged that groups should annotate to the
>> more specific experimental codes (IDA, IMP, IGI, IPI or IEP) instead
>> of the general IE code, and all curators directly involved in the  
>> GO Reference Genome
>> annotation effort are obliged to use these and not IE.
>>
>> The IE code exists for groups who would like to contribute
>> high-quality GO annotations that are produced from directly
>> associating GO terms to
>> gene products by citing experimental published results, but where the
>> group is unable to fit the appropriate specific experimental GO
>> evidence codes to each annotation.
>>
>
>
> Finally, although the text in the ISS and IE documentation explains  
> the hierarchy created by the new experimental codes I think it  
> would be clearer to users if the list of evidence codes at the top  
> of the webpage also indicated that the hierarchy exists, eg using  
> the bullet tabs (as used for ISO etc in the supporting  
> documentation for ISS).
>
> Curator-assigned Evidence Codes
>   Experimental Evidence Codes
>     EXP: inferred from EXPeriment
>       IDA: Inferred from Direct Assay
>       IPI: Inferred from Physical Interaction
>       IMP: Inferred from Mutant Phenotype
>       IGI: Inferred from Genetic Interaction
>       IEP: Inferred from Expression Pattern
>
>   Computational Analysis Evidence Codes
>     ISS: Inferred from Sequence or Structural Similarity
>       ISO: Inferred from Sequence Orthology
>       ISA: Inferred from Sequence Alignment
>       ISM: Inferred from Sequence Model
>     IGC: Inferred from Genomic Context
>     RCA: inferred from Reviewed Computational Analysis
>
> Ruth
>
>
> On 10 Apr 2008, at 14:16, Emily Dimmer wrote:
>> Hi Jen,
>>
>> Thanks. Sure, I agree that 'EXP' is a very memorable code - the
>> reason I've put the experimental code as 'IE' is that it matches the
>> style of acronyms used for the other evidence codes, and that 'IE' is
>> used in the ECO. It would be good to hear more votes for/against on
>> this point.
>>
>> Emily
>>
>> Jennifer Deegan (nee Clark) wrote:
>>> Hi Emily,
>>>
>>> That looks brilliant. Very clear documentation. Was there any
>>> objection to the idea of calling this evidence code 'EXP'? I found
>>> that very memorable.
>>>
>>> Thanks,
>>>
>>> Jen
>>>
>>> E Dimmer wrote:
>>>
>>>> Hi,
>>>>
>>>> Below you will find a draft of the new documentation for the
>>>> 'Experimental' evidence code.  This draft has been passed by the
>>>> evidence code committee. I'd be grateful if you could have a look
>>>> and send any comments or suggestions to me by the 18th of April so
>>>> that we can start adding it to the GO website etc., which will mean
>>>> that the code is available for Reactome to use in their gene
>>>> association file. The documentation of this code is quite short as
>>>> it only acts to group each of the well-defined child experimental
>>>> terms. The acronym for this code is currently 'IE' - in line with
>>>> the entry in the ECO ontology.
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>>
>>>> IE: Inferred from Experiment
>>>>
>>>> This code is used in an annotation to indicate that an experimental
>>>> assay has been located in the cited reference, whose results
>>>> indicate a gene product's function, process involvement, or
>>>> subcellular location (indicated by the GO term). The IE code is the
>>>> parent code for the IDA, IMP, IGI, IEP and IPI experimental codes.
>>>>
>>>> The IE evidence code can be used where any of the assays described
>>>> for the IDA, IMP, IGI, IPI or IEP evidence codes is reported.
>>>> However it is highly encouraged that groups should annotate to one
>>>> of the more granular experimental codes (IDA, IMP, IGI, IPI or IEP)
>>>> instead of IE, and all curators directly involved in the GO
>>>> Reference Genome annotation effort are obliged to use these and not
>>>> IE.
>>>>
>>>> The IE code exists for groups who would like to contribute
>>>> high-quality GO annotations that are produced from directly
>>>> associating GO terms to gene products by citing experimental
>>>> published results, but where the group is unable to fit the
>>>> appropriate specific experimental GO evidence codes to each
>>>> annotation.
>>>>
>>>> ------------------------------------------------------------------
>>>>    Emily Dimmer Ph.D.
>>>>    GOA Coordinator
>>>>    EMBL-EBI
>>>>    Wellcome Trust Genome Campus
>>>>    Hinxton
>>>>    Cambridge CB10 1SD, U.K.
>>>>    Tel:     +44 1223 494654
>>>>    Fax:    +44 1223 494468
>>>>    email:  edimmer at ebi.ac.uk
>>>>    URL:    http://www.ebi.ac.uk/goa
>>>>
>>>> _______________________________________________
>>>> Go mailing list
>>>> Go at geneontology.org
>>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>
>>
>>
