[annotation] [Fwd:What evidence code to use?]

Judith Blake jblake at informatics.jax.org
Tue Nov 27 18:00:56 PST 2007


This is exactly what RCA was originally used for.  With the FANTOM 
project [mouse full-length cDNA annotations], participants employed a 
series of algorithmic approaches combined with manual inspection and 
evaluation to provide annotations.  Actually, I think RCA was created as 
a result of the FANTOM project.

Judy

Tanya Berardini wrote:
> Forwarding this from the evidence code discussion group. Apologies to 
> those who are on both lists.  I've sorted the emails from top to 
> bottom in chronological order for easier reading:
>
> ----------
> My original email:
>
> > Ah, the eternal question:  Is it ISS, is it RCA?
> >
> > I've got a paper that describes the identification of a nice big set
> > of transcription factors in Arabidopsis.
> >
> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
> >
> > The authors used a combination of motif searches + BLAST + sequence
> > alignment, reviewed the results by eye, and came up with 1500 or so
> > genes that they call 'transcription factors.'
> >
> > Right now, we've got these annotated to 'transcription factor
> > activity' with the evidence code ISS but nothing in the evidence_with
> > column.  If I leave these as ISS, I'd like to put something in the
> > with column, but what?  Does a combination of sequence analysis
> > methods that is reviewed manually make it RCA?  Not according
> > to the current RCA documentation:
> >
> > "Examples where the RCA evidence code should not be used:
> >
> >     * Annotations based on more than one type of gene product sequence
> > based evidence, including such things as BLAST, profile HMMs, TMHMM,
> > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
> > should use the ISS code. "
> >
> > Should I wait till ISS comes to a resolution?
> >
> > Help!
>
> ---------
> Ben's reply:
>
> If you can't put something USEFUL in the WITH column, I think this has 
> to be RCA.
> I guess under the new, non-documented system, this would be ISS with no 
> "With"; ISA/ISO/ISM would require withs... (either seq ids or model, 
> aka InterPro, ids).
>
>
> Ben
>
> ----------
>
> Val's reply:
>
> This is *exactly* the type of data that led me to suggest originally 
> that RCA should not be restricted to analyses which include some 
> experimental component.  Unfortunately I couldn't come up with any 
> good examples at the time.
>
> These would surely be better as RCA, even though they are sequence-based.
>
> Val
>
> ----------
>
> Susan's reply:
>
> I've just hit another example...
>
> Enhanced function annotations for Drosophila serine proteases: A case
> study for systematic annotation of multi-member gene families.
>
> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
> Rodrigues V, White KP, Bork P, Sowdhamini R.
>
> PMID: 17996400
>
> This is a functional classification of serine proteases based on a
> 'function residue clustering' algorithm. The algorithm incorporates info
> from sequence alignments, hydrophobicity plots and info about key
> residues from 3D structures - all sequence based but no one thing to put
> in the 'with'.
>
> Susan
>
> -----------
>
> Pascale's reply:
>
> Tanya,
>
> I thought we agreed that BLAST and InterPro were ISS, as you point 
> out. I don't think ISS + ISS = RCA?? That is, I would say using 
> InterPro or the BLAST result should be enough to make the annotation; 
> we don't need to capture both. In this case, the easiest might be using 
> ISS with an InterPro domain ID in the 'with'.
>
> Similarly, in the paper Susan cites, they mention several domains and 
> they have also compared against several proteins whose 3D structure 
> has been determined, which can therefore be used in the 'with' - I 
> would pick one of those example proteins and ISS to that.
>
> Pascale
>
> ---------
>
> Any other thoughts?
>
>
> Thanks,
>
> Tanya
>
> -------- Original Message --------
> Subject: Re: [evidence] What evidence code to use?
> Date: Wed, 21 Nov 2007 08:43:16 -0500
> From: Pascale Gaudet <pgaudet at northwestern.edu>
> Reply-To: pgaudet at northwestern.edu
> Organization: Northwestern University
> To: tberardi at acoma.stanford.edu
> CC: evidence at genome.stanford.edu
> References: <47437C88.5070204 at acoma.stanford.edu>
>
> Tanya,
>
> I thought we agreed that BLAST and InterPro were ISS, as you point out.
> I don't think ISS + ISS = RCA?? That is, I would say using InterPro or
> the BLAST result should be enough to make the annotation; we don't need
> to capture both. In this case, the easiest might be using ISS with an
> InterPro domain ID in the 'with'.
>
> Similarly, in the paper Susan cites, they mention several domains and
> they have also compared against several proteins whose 3D structure has
> been determined, which can therefore be used in the 'with' - I would
> pick one of those example proteins and ISS to that.
>
> Pascale
>
>
>> ------------------------------------------------------------------------------------------ 
>>
>> Tanya Berardini, Ph.D.            tberardi at acoma.stanford.edu
>> The Arabidopsis Information Resource    FAX: (650) 325-6857
>> Carnegie Institution of Washington    Tel: (650) 325-1521 ext. 325
>> Department of Plant Biology        URL: http://arabidopsis.org/
>> 260 Panama St.
>> Stanford, CA 94305
>> ------------------------------------------------------------------------------------------ 
>>
>>
>>
>
>
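The rule of thumb Ben proposes in the thread (ISS only when something useful can go in the 'with' column, otherwise RCA) can be sketched in a few lines. This is only an illustration of that one suggestion, not GO Consortium policy; Pascale's reply argues for a different outcome (ISS with a single InterPro ID or example protein). The function names, the dictionary layout, and the identifiers in the sample data are hypothetical placeholders.

```python
def suggest_evidence_code(with_ids):
    """Return 'ISS' if at least one usable 'with' ID is present, else 'RCA'.

    Assumption: 'usable' is approximated here as any non-blank string.
    Deciding whether an ID is genuinely useful is a curator's judgment.
    """
    usable = [w for w in with_ids if w and w.strip()]
    return "ISS" if usable else "RCA"


def flag_iss_without_with(annotations):
    """List the ISS annotations whose 'with' column is effectively empty."""
    return [
        a for a in annotations
        if a["evidence"] == "ISS" and suggest_evidence_code(a["with"]) == "RCA"
    ]


# Hypothetical records; gene and InterPro IDs are placeholders.
annotations = [
    {"gene": "GENE_A", "evidence": "ISS", "with": []},
    {"gene": "GENE_B", "evidence": "ISS", "with": ["InterPro:IPR000000"]},
]

for a in flag_iss_without_with(annotations):
    print(a["gene"], "has ISS with an empty 'with' column")
```

A check like this would only surface candidates for review, such as the transcription factor set described above; it does not settle which code is correct.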


