[annotation] [Fwd:What evidence code to use?]
Judith Blake
jblake at informatics.jax.org
Wed Nov 28 10:42:35 PST 2007
RCA I think,
judy
Tanya Berardini wrote:
> Thanks, everyone, for your replies. To come back around to the
> original question, then, should I use:
>
> 1. RCA
>
> or
>
> 2. ISS and pick one of the domain identifiers/Genbank sequences from
> the paper to put into the 'with' field
>
> ?
>
> I've heard opinions supporting both options.
>
> Thanks,
>
> Tanya
>
>
>
> Mike Cherry wrote:
>> I believe RCA was proposed by SGD to use with analyses like Biopixie.
>>
>> Cheers, Mike
>>
>>
>> On Nov 27, 2007, at 9:00 PM, Judith Blake
>> <jblake at informatics.jax.org> wrote:
>>
>>> This is exactly what RCA was originally used for. With the FANTOM
>>> project [mouse full-length cDNA annotations], participants employed a
>>> series of algorithmic approaches combined with manual inspection and
>>> evaluation to provide annotations. Actually, I think RCA was
>>> created as a result of the FANTOM project.
>>>
>>> Judy
>>>
>>> Tanya Berardini wrote:
>>>> Forwarding this from the evidence code discussion group. Apologies
>>>> to those who are on both lists. I've sorted the emails from top to
>>>> bottom in chronological order for easier reading:
>>>>
>>>> ----------
>>>> My original email:
>>>>
>>>> > Ah, the eternal question: Is it ISS, is it RCA?
>>>> >
>>>> > I've got a paper that describes the identification of a nice big set
>>>> > of transcription factors in Arabidopsis.
>>>> >
>>>> >
>>>> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
>>>> >
>>>> > The authors use a combination of motif searches + BLAST + sequence
>>>> > alignment, reviewed those by eye, and came up with 1500 or so genes
>>>> > that they call 'transcription factors.'
>>>> >
>>>> > Right now, we've got these annotated to 'transcription factor
>>>> > activity' with the evidence code ISS but nothing in the evidence_with
>>>> > column. If I leave these as ISS, I'd like to put something in the
>>>> > with column, but what? Does this type of a combination of sequence
>>>> > analysis methods that's reviewed manually make it RCA? Not according
>>>> > to the current RCA documentation:
>>>> >
>>>> > "Examples where the RCA evidence code should not be used:
>>>> >
>>>> > * Annotations based on more than one type of gene product sequence
>>>> > based evidence, including such things as BLAST, profile HMMs, TMHMM,
>>>> > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
>>>> > should use the ISS code. "
>>>> >
>>>> > Should I wait till ISS comes to a resolution?
>>>> >
>>>> > Help!
>>>>
>>>> ---------
>>>> Ben's reply:
>>>>
>>>> If you can't put something USEFUL in the WITH column, I think this
>>>> has to be RCA.
>>>> I guess under the new, non-documented system, this would be ISS with
>>>> no "With"; ISA/ISO/ISM would require withs (either seq IDs or model,
>>>> aka InterPro, IDs).
>>>>
>>>>
>>>> Ben
>>>>
>>>> ----------
>>>>
>>>> Val's reply:
>>>>
>>>> This is *exactly* the type of data why I was originally suggesting
>>>> that RCA should not be restricted to analyses which include some
>>>> experimental component. Unfortunately I couldn't come up with any
>>>> good examples at the time.
>>>>
>>>> These would surely be better as RCA, even though they are sequence
>>>> based.
>>>>
>>>> Val
>>>>
>>>> ----------
>>>>
>>>> Susan's reply:
>>>>
>>>> I've just hit another example...
>>>>
>>>> Enhanced function annotations for Drosophila serine proteases: A case
>>>> study for systematic annotation of multi-member gene families.
>>>>
>>>> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
>>>> Rodrigues V, White KP, Bork P, Sowdhamini R.
>>>>
>>>> PMID: 17996400
>>>>
>>>> This is a functional classification of serine proteases based on a
>>>> 'function residue clustering' algorithm. The algorithm incorporates
>>>> info from sequence alignments, hydrophobicity plots and info about key
>>>> residues from 3D structures - all sequence based but no one thing to
>>>> put in the 'with'.
>>>>
>>>> Susan
>>>>
>>>> -----------
>>>>
>>>> Pascale's reply:
>>>>
>>>> Tanya,
>>>>
>>>> I thought we agreed that BLAST and InterPro were ISS, as you point
>>>> out. I don't think ISS + ISS = RCA?? That is, I would say using
>>>> InterPro or the BLAST result should be enough to make the
>>>> annotation; we don't need to capture both? In this case, the easiest
>>>> might be using ISS with an InterPro domain ID in the 'with'.
>>>>
>>>> Similarly in the paper Susan cites, they mention several domains
>>>> and also they have compared to several proteins whose 3D structure
>>>> has been determined hence can be used in the 'with' - I would pick
>>>> one of those example proteins and ISS to that.
>>>>
>>>> Pascale
>>>>
>>>> ---------
>>>>
>>>> Any other thoughts?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Tanya
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [evidence] What evidence code to use?
>>>> Date: Wed, 21 Nov 2007 08:43:16 -0500
>>>> From: Pascale Gaudet <pgaudet at northwestern.edu>
>>>> Reply-To: pgaudet at northwestern.edu
>>>> Organization: Northwestern University
>>>> To: tberardi at acoma.stanford.edu
>>>> CC: evidence at genome.stanford.edu
>>>> References: <47437C88.5070204 at acoma.stanford.edu>
>>>>
>>>> Tanya,
>>>>
>>>> I thought we agreed that BLAST and InterPro were ISS, as you point out.
>>>> I don't think ISS + ISS = RCA?? That is, I would say using InterPro or
>>>> the BLAST result should be enough to make the annotation; we don't need
>>>> to capture both? In this case, the easiest might be using ISS with an
>>>> InterPro domain ID in the 'with'.
>>>>
>>>> Similarly in the paper Susan cites, they mention several domains and
>>>> also they have compared to several proteins whose 3D structure has been
>>>> determined hence can be used in the 'with' - I would pick one of those
>>>> example proteins and ISS to that.
>>>>
>>>> Pascale
>>>>
>>>>
>>>>> ------------------------------------------------------------------------------------------
>>>>>
>>>>> Tanya Berardini, Ph.D. tberardi at acoma.stanford.edu
>>>>> The Arabidopsis Information Resource FAX: (650) 325-6857
>>>>> Carnegie Institution of Washington Tel: (650) 325-1521 ext. 325
>>>>> Department of Plant Biology URL: http://arabidopsis.org/
>>>>> 260 Panama St.
>>>>> Stanford, CA 94305
>>>>> ------------------------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>