[annotation] [Fwd:What evidence code to use?]
Judith Blake
jblake at informatics.jax.org
Tue Nov 27 18:00:56 PST 2007
This is exactly what RCA was originally used for. With the FANTOM
project [mouse full-length cDNA annotations], participants employed a
series of algorithmic approaches combined with manual inspection and
evaluation to provide annotations. Actually, I think RCA was created as
a result of the FANTOM project.
Judy
Tanya Berardini wrote:
> Forwarding this from the evidence code discussion group. Apologies to
> those who are on both lists. I've sorted the emails from top to
> bottom in chronological order for easier reading:
>
> ----------
> My original email:
>
> > Ah, the eternal question: Is it ISS, is it RCA?
> >
> > I've got a paper that describes the identification of a nice big set
> > of transcription factors in Arabidopsis.
> >
> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
> >
> > The authors use a combination of motif searches + BLAST + sequence
> > alignment and review those by eye and came up with 1500 or so genes
> > that they call 'transcription factors.'
> >
> > Right now, we've got these annotated to 'transcription factor
> > activity' with the evidence code ISS but nothing in the evidence_with
> > column. If I leave these as ISS, I'd like to put something in the
> > with column, but what? Does this kind of combination of sequence
> > analysis methods, reviewed manually, make it RCA? Not according
> > to the current RCA documentation:
> >
> > "Examples where the RCA evidence code should not be used:
> >
> > * Annotations based on more than one type of gene product sequence
> > based evidence, including such things as BLAST, profile HMMs, TMHMM,
> > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
> > should use the ISS code. "
> >
> > Should I wait till ISS comes to a resolution?
> >
> > Help!
>
> ---------
> Ben's reply:
>
> If you can't put something USEFUL in the WITH column, I think this has
> to be RCA.
> I guess under the new, non-documented system, this would be ISS with no
> "With"; ISA/ISO/ISM would require withs (either seq IDs or model,
> aka InterPro, IDs).
>
>
> Ben
>
> ----------
>
> Val's reply:
>
> This is *exactly* the type of data that led me to suggest originally that
> RCA should not be restricted to analyses that include some
> experimental component. Unfortunately I couldn't come up with any
> good examples at the time.
>
> These would surely be better as RCA, even though they are sequence based.
>
> Val
>
> ----------
>
> Susan's reply:
>
> I've just hit another example...
>
> Enhanced function annotations for Drosophila serine proteases: A case
> study for systematic annotation of multi-member gene families.
>
> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
> Rodrigues V, White KP, Bork P, Sowdhamini R.
>
> PMID: 17996400
>
> This is a functional classification of serine proteases based on a
> 'function residue clustering' algorithm. The algorithm incorporates info
> from sequence alignments, hydrophobicity plots and info about key
> residues from 3D structures - all sequence based but no one thing to put
> in the 'with'.
>
> Susan
>
> -----------
>
> Pascale's reply:
>
> Tanya,
>
> I thought we agreed that BLAST and InterPro were ISS, as you point
> out. I don't think ISS + ISS = RCA?? That is, I would say using
> InterPro or the BLAST result should be enough to make the annotation;
> we don't need to capture both? In this case, the easiest might be using
> ISS with an InterPro domain ID in the 'with'.
>
> Similarly in the paper Susan cites, they mention several domains and
> also they have compared to several proteins whose 3D structure has
> been determined hence can be used in the 'with' - I would pick one of
> those example proteins and ISS to that.
>
> Pascale
>
> ---------
>
> Any other thoughts?
>
>
> Thanks,
>
> Tanya
>
>
> -------- Original Message --------
> Subject: Re: [evidence] What evidence code to use?
> Date: Wed, 21 Nov 2007 08:43:16 -0500
> From: Pascale Gaudet <pgaudet at northwestern.edu>
> Reply-To: pgaudet at northwestern.edu
> Organization: Northwestern University
> To: tberardi at acoma.stanford.edu
> CC: evidence at genome.stanford.edu
> References: <47437C88.5070204 at acoma.stanford.edu>
>
> Tanya,
>
> I thought we agreed that BLAST and InterPro were ISS, as you point out.
> I don't think ISS + ISS = RCA?? That is, I would say using InterPro or
> the BLAST result should be enough to make the annotation; we don't need
> to capture both? In this case, the easiest might be using ISS with an
> InterPro domain ID in the 'with'.
>
> Similarly in the paper Susan cites, they mention several domains and
> also they have compared to several proteins whose 3D structure has been
> determined hence can be used in the 'with' - I would pick one of those
> example proteins and ISS to that.
>
> Pascale
>
>
>> ------------------------------------------------------------------------------------------
>>
>> Tanya Berardini, Ph.D. tberardi at acoma.stanford.edu
>> The Arabidopsis Information Resource FAX: (650) 325-6857
>> Carnegie Institution of Washington Tel: (650) 325-1521 ext. 325
>> Department of Plant Biology URL: http://arabidopsis.org/
>> 260 Panama St.
>> Stanford, CA 94305
>> ------------------------------------------------------------------------------------------
>>
>>
>>
>
>
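
For what it's worth, Ben's rule of thumb from the thread above (no useful entry for the WITH column means RCA, otherwise ISS) can be written down as a tiny check. This is only a sketch of that one suggestion, not agreed GO policy: the function name and the example identifier are made up for illustration, and it ignores Pascale's point that a single InterPro or protein ID may often be available.

```python
from typing import Sequence


def suggest_evidence_code(with_ids: Sequence[str]) -> str:
    """Hypothetical helper encoding only Ben's rule of thumb from the thread.

    Returns "ISS" when at least one non-empty identifier is available for
    the WITH column, and "RCA" when there is nothing useful to put there.
    """
    # Ignore empty strings and whitespace-only entries.
    useful = [w.strip() for w in with_ids if w and w.strip()]
    return "ISS" if useful else "RCA"


# No usable WITH entry (the transcription factor set as currently annotated).
print(suggest_evidence_code([]))

# "InterPro:IPR000000" is a placeholder, not a real domain for these genes.
print(suggest_evidence_code(["InterPro:IPR000000"]))
```

Under Pascale's reading, a curator would pick one domain or reference protein for the WITH column first, so the second branch would apply far more often than the first.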