[annotation] [Fwd:What evidence code to use?]

Judith Blake jblake at informatics.jax.org
Wed Nov 28 10:42:35 PST 2007


RCA I think,
judy
Tanya Berardini wrote:
> Thanks, everyone, for your replies.  To come back around to the 
> original question, then, should I use:
>
> 1. RCA
>
> or
>
> 2. ISS and pick one of the domain identifiers/Genbank sequences from 
> the paper to put into the 'with' field
>
> ?
>
> I've heard opinions supporting both options.
>
> Thanks,
>
> Tanya
>
>
>
> Mike Cherry wrote:
>> I believe RCA was proposed by SGD for use with analyses like Biopixie.
>>
>> Cheers, Mike
>>
>>
>> On Nov 27, 2007, at 9:00 PM, Judith Blake 
>> <jblake at informatics.jax.org> wrote:
>>
>>> This is exactly what RCA was originally used for.  With the FANTOM 
>>> project [mouse full length cDNA annotations], participants employed a 
>>> series of algorithmic approaches combined with manual inspection and 
>>> evaluation to provide annotations.  Actually, I think RCA was 
>>> created as a result of the FANTOM project.
>>>
>>> Judy
>>>
>>> Tanya Berardini wrote:
>>>> Forwarding this from the evidence code discussion group. Apologies 
>>>> to those who are on both lists.  I've sorted the emails from top to 
>>>> bottom in chronological order for easier reading:
>>>>
>>>> ----------
>>>> My original email:
>>>>
>>>> > Ah, the eternal question:  Is it ISS, is it RCA?
>>>> >
>>>> > I've got a paper that describes the identification of a nice big set
>>>> > of transcription factors in Arabidopsis.
>>>> >
>>>> > http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11118137&dopt=AbstractPlus
>>>> >
>>>> > The authors use a combination of motif searches + BLAST + sequence
>>>> > alignment and review those by eye and came up with 1500 or so genes
>>>> > that they call 'transcription factors.'
>>>> >
>>>> > Right now, we've got these annotated to 'transcription factor
>>>> > activity' with the evidence code ISS but nothing in the evidence_with
>>>> > column.  If I leave these as ISS, I'd like to put something in the
>>>> > with column, but what?  Does this type of a combination of sequence
>>>> > analysis methods that's reviewed manually make it RCA?  Not according
>>>> > to the current RCA documentation:
>>>> >
>>>> > "Examples where the RCA evidence code should not be used:
>>>> >
>>>> >     * Annotations based on more than one type of gene product sequence
>>>> > based evidence, including such things as BLAST, profile HMMs, TMHMM,
>>>> > SignalP, PROSITE, InterPro, mapping files such as interpro2go etc.
>>>> > should use the ISS code. "
>>>> >
>>>> > Should I wait till ISS comes to a resolution?
>>>> >
>>>> > Help!
>>>>
>>>> ---------
>>>> Ben's reply:
>>>>
>>>> If you can't put something USEFUL in the WITH column, I think this 
>>>> has to be RCA.
>>>> I guess under the new, non-documented system, this would be ISS with 
>>>> no "With"; ISA/ISO/ISM would require withs (either seq IDs or model, 
>>>> aka InterPro, IDs).
>>>>
>>>>
>>>> Ben
>>>>
>>>> ----------
>>>>
>>>> Val's reply:
>>>>
>>>> This is *exactly* the type of data for which I was originally 
>>>> suggesting that RCA should not be restricted to analyses that include 
>>>> some experimental component.  Unfortunately I couldn't come up with 
>>>> any good examples at the time.
>>>>
>>>> These would surely be better as RCA, even though they are sequence 
>>>> based.
>>>>
>>>> Val
>>>>
>>>> ----------
>>>>
>>>> Susan's reply:
>>>>
>>>> I've just hit another example...
>>>>
>>>> Enhanced function annotations for Drosophila serine proteases: A case
>>>> study for systematic annotation of multi-member gene families.
>>>>
>>>> Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE,
>>>> Rodrigues V, White KP, Bork P, Sowdhamini R.
>>>>
>>>> PMID: 17996400
>>>>
>>>> This is a functional classification of serine proteases based on a
>>>> 'function residue clustering' algorithm. The algorithm incorporates
>>>> info from sequence alignments, hydrophobicity plots and info about key
>>>> residues from 3D structures - all sequence based but no one thing to
>>>> put in the 'with'.
>>>>
>>>> Susan
>>>>
>>>> -----------
>>>>
>>>> Pascale's reply:
>>>>
>>>> Tanya,
>>>>
>>>> I thought we agreed that BLAST and InterPro were ISS, as you point 
>>>> out. I don't think ISS + ISS = RCA?? That is, I would say using 
>>>> InterPro or the BLAST result should be enough to make the 
>>>> annotation; we don't need to capture both? In this case, the easiest 
>>>> might be using ISS with an InterPro domain ID in the 'with'.
>>>>
>>>> Similarly in the paper Susan cites, they mention several domains 
>>>> and also they have compared to several proteins whose 3D structure 
>>>> has been determined hence can be used in the 'with' - I would pick 
>>>> one of those example proteins and ISS to that.
>>>>
>>>> Pascale
>>>>
>>>> ---------
>>>>
>>>> Any other thoughts?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Tanya
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [evidence] What evidence code to use?
>>>> Date: Wed, 21 Nov 2007 08:43:16 -0500
>>>> From: Pascale Gaudet <pgaudet at northwestern.edu>
>>>> Reply-To: pgaudet at northwestern.edu
>>>> Organization: Northwestern University
>>>> To: tberardi at acoma.stanford.edu
>>>> CC: evidence at genome.stanford.edu
>>>> References: <47437C88.5070204 at acoma.stanford.edu>
>>>>
>>>> Tanya,
>>>>
>>>> I thought we agreed that BLAST and InterPro were ISS, as you point out.
>>>> I don't think ISS + ISS = RCA?? That is, I would say using InterPro or
>>>> the BLAST result should be enough to make the annotation; we don't need
>>>> to capture both? In this case, the easiest might be using ISS with an
>>>> InterPro domain ID in the 'with'.
>>>>
>>>> Similarly in the paper Susan cites, they mention several domains and
>>>> also they have compared to several proteins whose 3D structure has been
>>>> determined hence can be used in the 'with' - I would pick one of those
>>>> example proteins and ISS to that.
>>>>
>>>> Pascale
>>>>
>>>>
>>>>> ------------------------------------------------------------------------------------------ 
>>>>>
>>>>> Tanya Berardini, Ph.D.            tberardi at acoma.stanford.edu
>>>>> The Arabidopsis Information Resource    FAX: (650) 325-6857
>>>>> Carnegie Institution of Washington    Tel: (650) 325-1521 ext. 325
>>>>> Department of Plant Biology        URL: http://arabidopsis.org/
>>>>> 260 Panama St.
>>>>> Stanford, CA 94305
>>>>> ------------------------------------------------------------------------------------------ 
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>




