[annotation] [Fwd:What evidence code to use?]

Tanya Berardini tberardi at acoma.Stanford.EDU
Wed Nov 28 10:36:59 PST 2007


Thanks, everyone, for your replies.  To come back around to the original 
question, then, should I use:

1. RCA

or

2. ISS and pick one of the domain identifiers/GenBank sequences from the 
paper to put into the 'with' field

?

I've heard opinions supporting both options.
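
To make the two options concrete, here is a rough sketch of how the records might look. It is only an illustration: the gene ID and the InterPro ID are made-up placeholders, the field names are hypothetical rather than any official file format, and treating plain ISS as needing a 'with' value is my assumption for this sketch (the replies quoted below only say that ISA/ISO/ISM would require one).

```python
from dataclasses import dataclass

# Evidence codes that this sketch treats as needing a 'with' value.
# ISA/ISO/ISM come from the replies quoted below; including plain ISS
# is an assumption made here, not documented policy.
CODES_NEEDING_WITH = {"ISS", "ISA", "ISO", "ISM"}


@dataclass
class Annotation:
    gene: str             # hypothetical placeholder gene ID
    term: str             # GO term name as used in this thread
    evidence: str         # evidence code, e.g. "ISS" or "RCA"
    reference: str        # paper the annotation is drawn from
    with_field: str = ""  # contents of the 'with' column, may be empty


def missing_with(annotation: Annotation) -> bool:
    """Return True if the evidence code expects a 'with' value but none is given."""
    return (
        annotation.evidence in CODES_NEEDING_WITH
        and not annotation.with_field.strip()
    )


TERM = "transcription factor activity"
PAPER = "PMID:11118137"

# How the annotations stand today: ISS with an empty 'with' column.
current = Annotation("AT0G00000", TERM, "ISS", PAPER)

# Option 1: switch to RCA and leave 'with' empty.
option_1 = Annotation("AT0G00000", TERM, "RCA", PAPER)

# Option 2: keep ISS and add one identifier (placeholder ID, not a real domain).
option_2 = Annotation("AT0G00000", TERM, "ISS", PAPER, with_field="InterPro:IPR000000")

for label, record in [("current", current), ("option 1", option_1), ("option 2", option_2)]:
    status = "missing 'with'" if missing_with(record) else "ok"
    print(f"{label}: {record.evidence} -> {status}")
```

Under these assumptions, only the current state gets flagged; either option would resolve it, which is why the choice comes down to policy rather than mechanics.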

Thanks,

Tanya



Mike Cherry wrote:
> I believe RCA was proposed by SGD to use with analyses like Biopixie.
> 
> Cheers, Mike
> 
> 
> On Nov 27, 2007, at 9:00 PM, Judith Blake <jblake at informatics.jax.org> 
> wrote:
> 
>> This is exactly what RCA was originally used for.  With the FANTOM 
>> project [mouse full length cDNA annotations], participants employed a 
>> series of algorithmic approaches combined with manual inspection and 
>> evaluation to provide annotations.  Actually, I think RCA was created 
>> as a result of the FANTOM project.
>>
>> Judy
>>
>> Tanya Berardini wrote:
>>> Forwarding this from the evidence code discussion group. Apologies to 
>>> those who are on both lists.  I've sorted the emails from top to 
>>> bottom in chronological order for easier reading:
>>>
>>> ----------
>>> My original email:
>>>
>>> > Ah, the eternal question:  Is it ISS, is it RCA?
>>> >
>>> > I've got a paper that describes the identification of a nice big set
>>> > of transcription factors in Arabidopsis.
>>> >
>>> >
>>> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
>>> >
>>> > The authors used a combination of motif searches + BLAST + sequence
>>> > alignment, reviewed the results by eye, and came up with 1500 or so
>>> > genes that they call 'transcription factors.'
>>> >
>>> > Right now, we've got these annotated to 'transcription factor
>>> > activity' with the evidence code ISS but nothing in the evidence_with
>>> > column.  If I leave these as ISS, I'd like to put something in the
>>> > with column, but what?  Does this type of a combination of sequence
>>> > analysis methods that's reviewed manually make it RCA?  Not according
>>> > to the current RCA documentation:
>>> >
>>> > "Examples where the RCA evidence code should not be used:
>>> >
>>> >     * Annotations based on more than one type of gene product sequence
>>> > based evidence, including such things as BLAST, profile HMMs, TMHMM,
>>> > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
>>> > should use the ISS code. "
>>> >
>>> > Should I wait till ISS comes to a resolution?
>>> >
>>> > Help!
>>>
>>> ---------
>>> Ben's reply:
>>>
>>> If you can't put something USEFUL in the WITH column, I think this 
>>> has to be RCA.
>>> I guess under the new, non-documented system, this would be ISS with 
>>> no "With"; ISA/ISO/ISM would require withs (either seq IDs or model, 
>>> aka InterPro, IDs).
>>>
>>>
>>> Ben
>>>
>>> ----------
>>>
>>> Val's reply:
>>>
>>> This is *exactly* the type of data I had in mind when I originally 
>>> suggested that RCA should not be restricted to analyses that include 
>>> some experimental component.  Unfortunately I couldn't come up with 
>>> any good examples at the time.
>>>
>>> These would surely be better as RCA, even though they are sequence 
>>> based.
>>>
>>> Val
>>>
>>> ----------
>>>
>>> Susan's reply:
>>>
>>> I've just hit another example...
>>>
>>> Enhanced function annotations for Drosophila serine proteases: A case
>>> study for systematic annotation of multi-member gene families.
>>>
>>> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
>>> Rodrigues V, White KP, Bork P, Sowdhamini R.
>>>
>>> PMID: 17996400
>>>
>>> This is a functional classification of serine proteases based on a
>>> 'function residue clustering' algorithm. The algorithm incorporates info
>>> from sequence alignments, hydrophobicity plots and info about key
>>> residues from 3D structures - all sequence based but no one thing to put
>>> in the 'with'.
>>>
>>> Susan
>>>
>>> -----------
>>>
>>> Pascale's reply:
>>>
>>> Tanya,
>>>
>>> I thought we agreed that BLAST and InterPro were ISS, as you point 
>>> out. I don't think ISS + ISS = RCA?? That is, I would say using 
>>> InterPro or the BLAST result should be enough to make the annotation; 
>>> we don't need to capture both? In this case, the easiest might be 
>>> using ISS with an InterPro domain ID in the 'with'.
>>>
>>> Similarly, in the paper Susan cites, they mention several domains and 
>>> they have also compared to several proteins whose 3D structure has 
>>> been determined, which can be used in the 'with' - I would pick one of 
>>> those example proteins and ISS to that.
>>>
>>> Pascale
>>>
>>> ---------
>>>
>>> Any other thoughts?
>>>
>>>
>>> Thanks,
>>>
>>> Tanya
>>>
>>> -------- Original Message --------
>>> Subject: Re: [evidence] What evidence code to use?
>>> Date: Wed, 21 Nov 2007 08:43:16 -0500
>>> From: Pascale Gaudet <pgaudet at northwestern.edu>
>>> Reply-To: pgaudet at northwestern.edu
>>> Organization: Northwestern University
>>> To: tberardi at acoma.stanford.edu
>>> CC: evidence at genome.stanford.edu
>>> References: <47437C88.5070204 at acoma.stanford.edu>
>>>
>>> Tanya,
>>>
>>> I thought we agreed that BLAST and InterPro were ISS, as you point out.
>>> I don't think ISS + ISS = RCA?? That is, I would say using InterPro or
>>> the BLAST result should be enough to make the annotation; we don't need
>>> to capture both? In this case, the easiest might be using ISS with an
>>> InterPro domain ID in the 'with'.
>>>
>>> Similarly, in the paper Susan cites, they mention several domains and
>>> they have also compared to several proteins whose 3D structure has been
>>> determined, which can be used in the 'with' - I would pick one of those
>>> example proteins and ISS to that.
>>>
>>> Pascale
>>>
>>>
>>>
>>>

-- 
------------------------------------------------------------------------------------------
Tanya Berardini, Ph.D.            tberardi at acoma.stanford.edu
The Arabidopsis Information Resource    FAX: (650) 325-6857
Carnegie Institution of Washington    Tel: (650) 325-1521 ext. 325
Department of Plant Biology        URL: http://arabidopsis.org/
260 Panama St.
Stanford, CA 94305
------------------------------------------------------------------------------------------


