[go] finishing up Evidence Code Issues

Karen Christie kchris at genome.Stanford.EDU
Mon Sep 10 16:15:37 PDT 2007


Hi,

Since I'm due on September 20th and will be going on maternity leave 
shortly before the GO meeting, Mike asked me to send these remaining items 
to finish up the Evidence Code documentation directly to the list to at 
least get the discussion started. Some issues may need to be discussed at 
the GO meeting as well.

This email contains responses to Midori's last email and to a few
other emailed comments, to resolve some minor points on the current draft
of the new Evidence Code documentation. I will send separate emails to
deal with each of these specific issues:

1. Restriction that all unknowns MUST use ND

2. IMP vs IGI for single gene mutations, regardless of gene being annotated

3. How to put program or method names in the with column for ISS

4. Scope of the RCA evidence code

For both issues 2 and 4, I think that the recommendations I've made
will help make it possible to create a decision tree/flowchart that is
fairly simple and clear. I'll send a very rough draft of a flowchart
separately as well.

Note that for both #s 3 and 4, I have put some supplemental info into
html docs in my personal space.  I did not spend much time doing html
formatting for these docs, on the thought that people might prefer to
move them to the GOC wiki. However, as the Evidence Code Committee was
not designated as a Working Group, I have no idea where to put them
within the wiki structure. If a spot is designated for them, they can
be moved to the wiki.

-Karen



Responses and comments on things in red on this page:

http://www-dev.yeastgenome.org/draftGO/go/www/GO.evidence.new.shtml


1. GO_REF documentation

> We should have documentation that explains GO_REFs and links to it
> when we refer to them.

Midori (15 Jun 2007):
   Links can go to the existing GO References page:

     http://www.geneontology.org/cgi-bin/references.cgi

   I can write up a description (which will be brief; there's not an
   enormous amount to say) and give it to Amelia to be added to the blurb
   at the top of this page. The plain text file from which the web page
   is generated contains a brief description of the format, which could
   be HTMLified and also added to the blurb if it would be useful.

Karen (9 Sept 2007):

   Please do. It would also be good if the page for the GO_REFs is made
   easier to find in general in our documentation.


2. ChEBI IDs in with field?

> Do we allow things like ChEBI IDs in the with field?

Midori (15 Jun 2007):
   I would say yes.

Karen (9 Sept 2007):

   Perhaps we should make this a quick agenda item for the next GO
   meeting, so that people can ratify this face to face, unless we get an
   overwhelming response via email to proceed with allowing this new ID
   for the with field.

3. IMP examples

> any more positive examples for IMP?, e.g. phenotypic similarity

Midori (15 Jun 2007):
   Dredged up from email from January 2002 ...

   Erich Schwarz needed to know which code to use for "other mutations
   sharing a complex mutant phenotype syndrome with [a well-characterized
   mutant]." My comment at the time was: "The situation you've described
   is IMP, not IGI, because (if I understand correctly) you're looking at
   one mutation at a time. Comparing the phenotype of one mutation to
   that of another helps you interpret the meaning, but is not a kind of
   genetic interaction."

   I think this still holds. Erich provided some details of an example,
   which I can forward if you want.

Karen (9 Sept 2007):

   We can certainly include it; the more examples the better, in my
   opinion. But don't send it to me: I'll be going on maternity leave
   soon and don't want to be responsible for this getting added.


4. use of with field for NAS

> The Evidence Code Committee discussed the idea of making GO
> annotations from Reactome entries. ... What does the full group feel
> about the idea of allowing the ID for a database record, when such
> exist, in the with field?

Midori (15 Jun 2007):
   I'm all for including annotations based on Reactome entries -- they
   have a well-developed curation system that deeply involves expert
   biologists, so the statements in their records are very reliable.

   I am not in favor of putting the Reactome ID in the with field for
   these annotations, however, because the Reactome entry does not modify
   or supplement the evidence; rather, the entry provides the
   evidence. GO would effectively be using a Reactome record as a source
   of information about a gene product, so it would make much more sense
   to put the Reactome ID in the reference field.

   For the more general database record case, it may be that I don't
   sufficiently understand what might go in a GO_REF (or equivalent), so
   I don't understand the rationale for allowing 'with' for NAS.

   For the case where the author infers one thing from another, using a
   GO ID in 'with' makes more sense, but I think it's not really
   necessary because the author (presumably) hasn't actually made any GO
   annotations, and hasn't stated observations or conclusions in terms
   of, well, GO terms. (Perhaps this will change some day!) Also, note
   that we have expressly disallowed the use of 'with' for NAS, so the
   script would have to be changed if the use of with-for-NAS is agreed.

Karen (9 Sept 2007):

   Regarding the idea of allowing Reactome IDs in the with field, the
   thought was that it provided the specific information about which
   record in Reactome made the statement, but the idea was
   controversial even just with the Evidence Code Committee.

   Regarding the idea of allowing GOids for NAS, I think you bring up a
   good point that this may not make sense since the author has typically
   not stated their statement in terms of a GOid from which an inference
   was made. Allowing this may just be more confusing than helpful,
   especially since deciding which GOid to put in the with field will
   almost always be a curator judgement.

   However, I wasn't one of the proponents of this idea, so those who
   are may wish to defend it.

   In any case, rather than adding yet another usage of the with column
   that is potentially confusing to users, I could personally just go
   with not allowing use of the with column at all for NAS.
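
   The restriction discussed above (no 'with' entry for NAS) could be
   checked along these lines. This is only a minimal sketch: the actual
   checking script is not shown in this thread, and the function name,
   the three-field row layout, and the placeholder gene IDs are all
   assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Evidence codes for which the with column must stay empty.
# Only NAS is listed because it is the only code discussed here.
CODES_WITHOUT_WITH = {"NAS"}

# Hypothetical simplified row: (object ID, evidence code, with field)
Row = Tuple[str, str, str]


def find_with_violations(annotations: Iterable[Row]) -> List[Row]:
    """Return rows whose evidence code forbids 'with' but that have one."""
    return [
        (obj, code, with_field)
        for obj, code, with_field in annotations
        if code in CODES_WITHOUT_WITH and with_field.strip()
    ]


# Placeholder IDs, not real annotations.
rows = [
    ("gene_1_ID", "NAS", ""),
    ("gene_2_ID", "NAS", "GO:0005515"),
    ("gene_3_ID", "IPI", "UniProt:protein_2_ID"),
]
violations = find_with_violations(rows)
for row in violations:
    print("with column not allowed for this evidence code:", row)
```

   If with-for-NAS were ever agreed, a check of this kind would only
   need NAS removed from the set of restricted codes.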


5. Representation of examples for with/from:

Susan (14 Jun 2007):

   IPI examples

   Looks good but there is something odd about the IPI example,
   assuming I am looking at the latest version ok.

   Firstly, the paper is about mouse proteins not Drosophila so could we
   change FB to MGI please. Also, I am confused as to why there are three
   lines shown - MGI just lists the middle one:

   FB:gene_1_ID	Abcd3	GO:0005515	PMID:10551832	IPI	UniProt:protein_2_ID	...
   FB:gene_1_ID	Abcd3	GO:0005515	PMID:10551832	IPI	UniProt:protein_2_ID|UniProt:protein_3_ID	...
   FB:gene_1_ID	Abcd3	GO:0005515	PMID:10551832	IPI	FB:gene_2_ID

   So unless I'm missing something I suggest we lose the extra lines and
   have either:

   MGI:1349216	Abcd3	GO:0005515	PMID:10551832	IPI	UniProt:P33897|UniProt:Q61285

   OR

   MGI:gene_1_ID	Abcd3	GO:0005515	PMID:10551832	IPI	UniProt:protein_2_ID|UniProt:protein_3_ID

   I'd prefer to include the real identifiers so it isn't a mix of 'real'
   and 'example'.

   Similarly there seems to be a mix of FB and SGD db identifiers in the
   IGI examples. A possible alternative for IGI is:

   In PMID:9043060, flies simultaneously mutant for three genes: klingon
   (klg), sevenless (sev) and Son of sevenless (Sos) are used to show that
   klingon participates in R7 photoreceptor fate commitment. This leads to
   the annotation:

   FB:FBgn0017590	klg	GO:0045466	PMID:9043060	IGI	FB:FBgn0003366|FB:FBgn0001965
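
   As an illustration of how the example lines above are laid out, here
   is a minimal sketch that splits one of them into its tab-separated
   columns and breaks the with entry on the "|" separator. The
   six-column layout mirrors the simplified examples in this email, not
   the full annotation file format, and the field names are assumptions
   chosen for readability.

```python
from typing import List, NamedTuple


class ExampleAnnotation(NamedTuple):
    # Field names are illustrative labels for the six example columns.
    object_id: str
    symbol: str
    go_id: str
    reference: str
    evidence: str
    with_ids: List[str]


def parse_example_line(line: str) -> ExampleAnnotation:
    """Parse one simplified, tab-separated example annotation line."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) < 5:
        raise ValueError("expected at least 5 tab-separated columns")
    # The with column is optional; multiple IDs are separated by "|".
    with_field = fields[5] if len(fields) > 5 else ""
    with_ids = [w.strip() for w in with_field.split("|") if w.strip()]
    return ExampleAnnotation(
        fields[0], fields[1], fields[2], fields[3], fields[4], with_ids
    )


line = (
    "FB:FBgn0017590\tklg\tGO:0045466\tPMID:9043060\tIGI\t"
    "FB:FBgn0003366|FB:FBgn0001965"
)
ann = parse_example_line(line)
print(ann.evidence, ann.with_ids)
```

   Keeping the with entries as a list makes it easy to show both the
   single-ID and the multiple-ID cases in the same way.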


Karen (9 Sept 2007):

   I'm all for real examples, but I don't have time to dig them up for
   every evidence code. Perhaps we could distribute this task around, so
   that we have multiple real examples for each evidence code. It would
   be good to have at least one example with one entry in the with
   column, as well as the one with multiple. It would also be good if
   they showed various IDs in the with field.

   This would be a reasonable task if there was one person for each
   evidence code to find some real examples, and then hopefully it would
   be easy for Amelia to put them in the right format if she was given
   all the specific info that should be in the table.

6. ISS & with col:

> Note that there should be good evidence that the gene product(s)
> placed in the with/from column actually has the activity, process,
> etc. being annotated.

Midori (15 Jun 2007):
   Do we want to specifically say the "good evidence" should be
   *experimental* evidence? Would be consistent with the Ref Genome
   requirement, and good practice generally ...

Karen (9 Sept 2007):
   We do have to remember that this Evidence Code document is not just
   for the use of the Reference Genomes. While we did agree that ISS should
   not be made from pairwise BLAST unless the gene to be placed in the
   with column has been experimentally characterized, the ISS code covers
   more situations than just that. The with field may also contain Pfams,
   Prosite, TIGRFAMS, CBS, COG, PANTHER, and we also have to determine
   how to include method names here for stuff like tRNAscan and my
   specific question about snoRNAs. Michelle Gwinn may wish to comment
   on this too.


Typos, other trivial fixes:
-------------------------------------------------------

1. IGI

> Should we add a statement in the paragraph above to IGI, similar to
> the one in IMP, about care in making annotations from gain of
> function mutations ...?

Midori (15 Jun 2007):
   Sounds reasonable to me.

Karen (9 Sept 2007):

   OK, added to first paragraph of IGI.

2. Last paragraph of Introduction:

Midori (15 Jun 2007):
   Change "effect" to "affect" in "... will also effect the quality of the resulting annotation."

Karen (9 Sept 2007):
   done

3. IDA & IMP:

Midori (15 Jun 2007):
   Does "over-expression" really need to be hyphenated? I've seen it
   unhyphenated more frequently; also, there's one unhyphenated
   occurrence in the document.

Karen (9 Sept 2007):
   changed to unhyphenated

4. IGI examples:

Midori (15 Jun 2007):
   The statements "For this type of experiment, use the IGI Code" could
   be deleted -- they're redundant with the fact that the description
   appears in a list headed "where the IGI code should be used."

Karen (9 Sept 2007):

   done




