[annotation] [Fwd:What evidence code to use?]

Tanya Berardini tberardi at acoma.Stanford.EDU
Wed Nov 21 10:39:39 PST 2007


Forwarding this from the evidence code discussion group. Apologies to 
those who are on both lists.  I've sorted the emails from top to bottom 
in chronological order for easier reading:

----------
My original email:

 > Ah, the eternal question:  Is it ISS, is it RCA?
 >
 > I've got a paper that describes the identification of a nice big set
 > of transcription factors in Arabidopsis.
 >
 > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
 >
 > The authors used a combination of motif searches + BLAST + sequence
 > alignment, reviewed the results by eye, and came up with 1500 or so
 > genes that they call 'transcription factors.'
 >
 > Right now, we've got these annotated to 'transcription factor
 > activity' with the evidence code ISS but nothing in the evidence_with
 > column.  If I leave these as ISS, I'd like to put something in the
 > with column, but what?  Does this type of a combination of sequence
 > analysis methods that's reviewed manually make it RCA?  Not according
 > to the current RCA documentation:
 >
 > "Examples where the RCA evidence code should not be used:
 >
 >     * Annotations based on more than one type of gene product sequence
 > based evidence, including such things as BLAST, profile HMMs, TMHMM,
 > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
 > should use the ISS code. "
 >
 > Should I wait till ISS comes to a resolution?
 >
 > Help!

---------
Ben's reply:

If you can't put something USEFUL in the WITH column, I think this has 
to be RCA.
I guess under the new, non-documented system, this would be ISS with no 
"With"; ISA/ISO/ISM would require withs (either seq ids or model, aka 
InterPro, ids).


Ben

----------

Val's reply:

This is *exactly* the type of data I had in mind when I was originally 
suggesting that RCA should not be restricted to analyses which include 
some experimental component.  Unfortunately I couldn't come up with any 
good examples at the time.

These would surely be better as RCA, even though they are sequence based.

Val

----------

Susan's reply:

I've just hit another example...

Enhanced function annotations for Drosophila serine proteases: A case
study for systematic annotation of multi-member gene families.

Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE, Rodrigues V,
White KP, Bork P, Sowdhamini R.

PMID: 17996400

This is a functional classification of serine proteases based on a
'function residue clustering' algorithm. The algorithm incorporates info
from sequence alignments, hydrophobicity plots and info about key
residues from 3D structures - all sequence based but no one thing to put
in the 'with'.

Susan

-----------

Pascale's reply:

Tanya,

I thought we agreed that BLAST and InterPro were ISS, as you point out. 
I don't think ISS + ISS = RCA?? That is, I would say using InterPro or 
the BLAST result should be enough to make the annotation; we don't need 
to capture both? In this case, the easiest might be using ISS with an 
InterPro domain ID in the 'with'.

Similarly, in the paper Susan cites, they mention several domains, and 
they have also compared against several proteins whose 3D structure has 
been determined, which can hence be used in the 'with' - I would pick one 
of those example proteins and ISS to that.

Pascale

---------

Any other thoughts?


Thanks,

Tanya

-------- Original Message --------
Subject: Re: [evidence] What evidence code to use?
Date: Wed, 21 Nov 2007 08:43:16 -0500
From: Pascale Gaudet <pgaudet at northwestern.edu>
Reply-To: pgaudet at northwestern.edu
Organization: Northwestern University
To: tberardi at acoma.stanford.edu
CC: evidence at genome.stanford.edu
References: <47437C88.5070204 at acoma.stanford.edu>



-- 
------------------------------------------------------------------------------------------
Tanya Berardini, Ph.D.            tberardi at acoma.stanford.edu
The Arabidopsis Information Resource    FAX: (650) 325-6857
Carnegie Institution of Washington    Tel: (650) 325-1521 ext. 325
Department of Plant Biology        URL: http://arabidopsis.org/
260 Panama St.
Stanford, CA 94305
------------------------------------------------------------------------------------------


