[Annotation] [Go-top] check for ND + additional annotation

Mike Cherry cherry at stanford.edu
Wed Sep 3 16:37:54 PDT 2008


Here is the definition of ND from the GO web site.

--
Use of the ND evidence code indicates that the annotator at the  
contributing database found no information that allowed making an  
annotation to any term indicating specific knowledge from the ontology  
in question (molecular function, biological process, or cellular  
component) as of the date indicated. This code should be used only for  
annotations to the root terms, molecular function ; GO:0003674,  
biological process ; GO:0008150, or cellular component ; GO:0005575,  
which, when used in annotations, indicate that no knowledge is  
available about a gene product in that aspect of GO.
--

To us this phrase is significant, "indicates that the annotator ...  
found no information".  To us that means information in the literature.



The ND means we have looked and found no literature for that protein.   
This is with the idea that many groups filter out the computational  
annotations from the yeast GAF.  The computational annotations come  
from other groups.

-Mike
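
A minimal sketch of the check Midori asked about, for anyone who wants
to try it against a GAF. It is not Mike's script. It assumes the
standard tab-separated GAF layout (object ID in column 2, GO ID in
column 5, evidence code in column 7, aspect in column 9). The function
name, the ignore_computational switch, and the sample records are all
hypothetical. Treating IEA as the only "computational" code is also an
assumption; adjust the set to local policy.

```python
# Root terms named in the ND definition quoted above.
ROOT_TERMS = {"GO:0003674", "GO:0008150", "GO:0005575"}

# Assumption: only IEA counts as "computational" for this sketch.
COMPUTATIONAL_CODES = {"IEA"}


def find_nd_conflicts(gaf_lines, ignore_computational=False):
    """Return sorted (object_id, aspect) pairs that have an ND annotation
    to a root term and also a non-root annotation in the same aspect.

    With ignore_computational=True, non-root annotations whose evidence
    code is in COMPUTATIONAL_CODES do not count as a conflict.
    """
    has_nd_root = set()
    has_other = set()
    for line in gaf_lines:
        if line.startswith("!"):  # GAF header/comment line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:  # too short to hold an aspect column
            continue
        key = (cols[1], cols[8])
        go_id, evidence = cols[4], cols[6]
        if go_id in ROOT_TERMS:
            if evidence == "ND":
                has_nd_root.add(key)
        else:
            if ignore_computational and evidence in COMPUTATIONAL_CODES:
                continue
            has_other.add(key)
    return sorted(has_nd_root & has_other)


# Hypothetical records, truncated to the first nine columns.
SAMPLE = [
    "!gaf-version: 1.0",
    "DB\tID1\tGENE1\t\tGO:0008150\tREF:1\tND\t\tP",
    "DB\tID1\tGENE1\t\tGO:0000398\tREF:2\tIEA\t\tP",
    "DB\tID2\tGENE2\t\tGO:0008150\tREF:1\tND\t\tP",
    "DB\tID2\tGENE2\t\tGO:0000398\tREF:3\tIMP\t\tP",
]

strict = find_nd_conflicts(SAMPLE)
curated_only = find_nd_conflicts(SAMPLE, ignore_computational=True)
```

With ignore_computational left off, both sample genes are reported,
which is the behaviour Midori's user expected. With it switched on,
only GENE2 is reported, which matches the reading above where ND
stays in place when only computational annotations exist.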



On Sep 3, 2008, at 2:28 PM, Judith Blake wrote:

> hummm
>
> yah it makes sense in that there is no experimental data.  no it  
> does not make sense when there is a computational annotation.
>
> one way around this is that the ND genes should be (and are in MGI)  
> tagged as 'complete'.  That means that the literature has been  
> reviewed.  Then the database has a mechanism for adding new  
> literature tags (pubmedIDs) to a file when a new publication is  
> identified for that gene.  In this way, the ND designation  
> disappears in the annotation file, the gene is marked as 'complete',  
> and the  ongoing QC report alerts curators as to when experimental  
> literature enters the system.
>
> here's an example from our 'complete' QC file.  'outstanding refs'  
> are jnum (our lit number, tied to PubmedID) added to MasterBib after  
> the 'Date_complete'.  These are cases where the new literature needs  
> to be curated to maintain the 'complete' status.
>
> The columns are
>
> Gene Symbol   MGI-ID   reference gene status   Date_complete    
> #Refs_used   outstanding_refs
>
> in this list, the 'y' means that it is a designated ref gene.  Genes  
> on this list with 'n' are, for example, genes with ND that have not  
> been chosen as a reference genome gene, but are on the 'complete'  
> list for us.  You can also see here for Aggf1, a ref genome gene,  
> that we have no literature.
>
> Judy
>
> cvr1b     MGI:1338944  y  04/19/2007  6
> Acvr2a    MGI:102806   y  03/06/2008  12
> Acvr2b    MGI:87912    y  04/16/2008  16
> Acvrl1    MGI:1338946  y  04/09/2008  5
> Ada       MGI:87916    y              13  13
> Adh1      MGI:87921    y              5   18
> Adh4      MGI:1349472  y  03/03/2008  2
> Adh5      MGI:87929    y  02/29/2008  5
> Adh7      MGI:87926    y  03/12/2008  9
> Adhfe1    MGI:1923437  y  02/05/2008  2
> Adsl      MGI:103202   y  06/14/2007  1
> Adss      MGI:87948    y  05/02/2008  2
> Adssl1    MGI:87947    y  05/08/2008  3
> Aggf1     MGI:1913799  y  06/14/2007  0
> Agl       MGI:1924809  y  04/09/2008  3
> Agxt      MGI:1329033  y  10/15/2007  2
> Alas2     MGI:87990    y  10/15/2007  4   J:127025,J:130852
> Alb       MGI:87991    y  08/11/2007  6   J:123909,J:129068,J:132868,J:134487
> Aldh5a1   MGI:2441982  y  03/28/2008  10
> Alg12     MGI:2385025  y  04/06/2007  1
>
>
>
>
> Pascale Gaudet wrote:
>> That seems to make sense. Otherwise, how do you know a curator has  
>> reviewed the literature for that gene?
>>
>> Stacia Engel wrote:
>>> the gene in question was a yeast gene, CWC23.  there is ND to  
>>> generic BP because we have no experimental data for that gene.
>>>
>>> the gene has computational annotations from both UniProtKB and  
>>> bioPIXIE.  at SGD, we don't remove NDs to root nodes when only  
>>> computational annotations exist.
>>>
>>> stacia
>>>
>>>
>>> On Sep 3, 2008, at 11:37 AM, Judith Blake wrote:
>>>
>>>> Typically, I think this should be done at the time of generating  
>>>> the annotation...even an IEA.  Each group might consider  
>>>> implementing something to QC this, but a QC at the time of  
>>>> submission of data file to AmiGO would be good too.
>>>>
>>>> Judy
>>>>
>>>>
>>>>
>>>> Midori Harris wrote:
>>>>> Hi,
>>>>>
>>>>> We received a query on the GO Help list from a user who noticed  
>>>>> a gene that was annotated to both the biological process (BP)  
>>>>> root with ND, and annotated to a non-root BP term with another  
>>>>> evidence code. I agreed with him that it doesn't make sense to  
>>>>> keep the ND/root annotation if a "real"
>>>>> annotation could be made to a term in the same ontology. Would  
>>>>> it be worthwhile to add a check for cases like this, perhaps in  
>>>>> Mike's script?
>>>>>
>>>>> cheers,
>>>>> midori
>>>>> _______________________________________________
>>>>> Annotation mailing list
>>>>> Annotation at geneontology.org
>>>>> http://fafner.stanford.edu/mailman/listinfo/annotation
>>>
> _______________________________________________
> Go-top mailing list
> Go-top at geneontology.org
> http://fafner.stanford.edu/mailman/listinfo/go-top
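
The 'outstanding refs' column in Judy's QC file, as described in the
thread, lists references added to the bibliography after the gene's
Date_complete. A minimal sketch of that rule follows. It is not MGI's
implementation. The function name, the shape of the input (a mapping
from reference ID to the date it was added), and the reference IDs in
the example are all hypothetical.

```python
from datetime import date


def outstanding_refs(date_complete, refs_added):
    """Return the sorted IDs of references added after date_complete.

    refs_added maps a reference ID to the date it entered the
    bibliography (an assumed input shape for this sketch).
    """
    return sorted(
        ref_id for ref_id, added in refs_added.items() if added > date_complete
    )


# Hypothetical example: one reference predates completion, one follows it.
example = outstanding_refs(
    date(2007, 10, 15),
    {"REF:A": date(2007, 1, 1), "REF:B": date(2008, 2, 2)},
)
```

Here only REF:B is outstanding, so a curator would need to review it
for the gene to keep its 'complete' status.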


