[annotation] [Fwd:What evidence code to use?]
Mike Cherry
cherry at stanford.edu
Wed Nov 28 04:47:09 PST 2007
I believe RCA was proposed by SGD for use with analyses like Biopixie.
Cheers, Mike
On Nov 27, 2007, at 9:00 PM, Judith Blake <jblake at informatics.jax.org>
wrote:
> This is exactly what RCA was originally used for. With the FANTOM
> project [mouse full length cDNA annotations], participants employed a
> series of algorithmic approaches combined with manual inspection and
> evaluation to provide annotations. Actually, I think RCA was
> created as a result of the FANTOM project.
>
> Judy
>
> Tanya Berardini wrote:
>> Forwarding this from the evidence code discussion group. Apologies
>> to those who are on both lists. I've sorted the emails from top to
>> bottom in chronological order for easier reading:
>>
>> ----------
>> My original email:
>>
>> > Ah, the eternal question: Is it ISS, is it RCA?
>> >
>> > I've got a paper that describes the identification of a nice big
>> > set of transcription factors in Arabidopsis.
>> >
>> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
>> >
>> >
>> >
>> > The authors used a combination of motif searches + BLAST + sequence
>> > alignment, reviewed those by eye, and came up with 1500 or so genes
>> > that they call 'transcription factors.'
>> >
>> > Right now, we've got these annotated to 'transcription factor
>> > activity' with the evidence code ISS but nothing in the
>> > evidence_with column. If I leave these as ISS, I'd like to put
>> > something in the with column, but what? Does this type of a
>> > combination of sequence analysis methods that's reviewed manually
>> > make it RCA? Not according to the current RCA documentation:
>> >
>> > "Examples where the RCA evidence code should not be used:
>> >
>> > * Annotations based on more than one type of gene product
>> > sequence based evidence, including such things as BLAST, profile
>> > HMMs, TMHMM, SignalP, PROSITE, InterPro, mapping files such as
>> > interpro2go etc. should use the ISS code."
>> >
>> > Should I wait till ISS comes to a resolution?
>> >
>> > Help!
>>
>> ---------
>> Ben's reply:
>>
>> If you can't put something USEFUL in the WITH column, I think this
>> has to be RCA.
>> I guess under the new, non-documented system, this would be ISS/no
>> "With" ISA/ISO/ISM would require withs... (either seq ids or model
>> aka interpro ids).
>>
>>
>> Ben
>>
>> ----------
>>
>> Val's reply:
>>
>> This is *exactly* the type of data that led me to suggest originally
>> that RCA should not be restricted to analyses that include some
>> experimental component. Unfortunately I couldn't come up with any
>> good examples at the time.
>>
>> These would surely be better as RCA, even though they are
>> sequence-based.
>>
>> Val
>>
>> ----------
>>
>> Susan's reply:
>>
>> I've just hit another example...
>>
>> Enhanced function annotations for Drosophila serine proteases: A case
>> study for systematic annotation of multi-member gene families.
>>
>> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
>> Rodrigues V, White KP, Bork P, Sowdhamini R.
>>
>> PMID: 17996400
>>
>> This is a functional classification of serine proteases based on a
>> 'function residue clustering' algorithm. The algorithm incorporates
>> info from sequence alignments, hydrophobicity plots and info about key
>> residues from 3D structures - all sequence based but no one thing to
>> put in the 'with'.
>>
>> Susan
>>
>> -----------
>>
>> Pascale's reply:
>>
>> Tanya,
>>
>> I thought we agreed that BLAST and InterPro were ISS, as you point
>> out. I don't think ISS + ISS = RCA?? That is, I would say using
>> InterPro or the BLAST result should be enough to make the
>> annotation; we don't need to capture both? In this case, the easiest
>> might be using ISS with an InterPro domain ID in the 'with'.
>>
>> Similarly, in the paper Susan cites, they mention several domains
>> and they have also compared against several proteins whose 3D
>> structure has been determined, which can hence be used in the
>> 'with' - I would pick one of those example proteins and ISS to that.
>>
>> Pascale
>>
>> ---------
>>
>> Any other thoughts?
>>
>>
>> Thanks,
>>
>> Tanya
>>
>> -------- Original Message --------
>> Subject: Re: [evidence] What evidence code to use?
>> Date: Wed, 21 Nov 2007 08:43:16 -0500
>> From: Pascale Gaudet <pgaudet at northwestern.edu>
>> Reply-To: pgaudet at northwestern.edu
>> Organization: Northwestern University
>> To: tberardi at acoma.stanford.edu
>> CC: evidence at genome.stanford.edu
>> References: <47437C88.5070204 at acoma.stanford.edu>
>>
>>
>>> ------------------------------------------------------------------
>>> Tanya Berardini, Ph.D. tberardi at acoma.stanford.edu
>>> The Arabidopsis Information Resource FAX: (650) 325-6857
>>> Carnegie Institution of Washington Tel: (650) 325-1521 ext. 325
>>> Department of Plant Biology URL: http://arabidopsis.org/
>>> 260 Panama St.
>>> Stanford, CA 94305
>>> ------------------------------------------------------------------
>>>
>>>
>>
>>