[go] Requirement for all 'unknown' annotations to use ND code

Harold Drabkin hjd at informatics.jax.org
Tue Sep 18 07:00:39 PDT 2007


I think that if we had a paper with a KO that showed no phenotype, it 
wouldn't be used for an annotation. Lack of a phenotype (expected or 
not) might just mean that the organism in question has a way around the 
lack of the gene.

hjd

Suzanna Lewis wrote:
>
> On Sep 17, 2007, at 11:47 AM, Karen Christie wrote:
>
>> Hi,
>>
>> The reason I brought this issue up was that I was very uncomfortable 
>> with the rationale that people could use the ND evidence code as a 
>> way to find the unknown annotations, or with having that purpose as a 
>> justification to only allow the ND code for root annotations in our 
>> documentation. It seems that we have come to consensus that we should 
>> not be saying anything about this as a software shortcut to find 
>> unknown annotations.
>
> Whatever is decided. No one should do the above, ever. It was never, 
> ever the justification for the ND-only-at-root restriction. So yes, we 
> have consensus: we do -not- mention this in any documentation.
>
> I am convinced by what Ben and Mike have said. Yes, we have overloaded 
> the meaning of the evidence code here. ND is really curation status. 
> What I'm still confused about, Mike, is the question I was trying to 
> get at. You say
>
>>>>> A knockout with no phenotype is data. It's a negative result, yes, 
>>>>> but it's data. You have data that there is no phenotype for that 
>>>>> gene for the screen that was performed.
>
> So would this be an annotation to a root node with IMP?
>
>>
>> The more recent discussion has been dealing with the issue that 
>> annotations to the root node are a special case anyway, allowing us 
>> to track curation progress. While we know that these aren't 
>> annotations of knowledge in the same way as other annotations, I 
>> think the group has agreed many, many times that we DO want a way to 
>> distinguish between genes that have been looked at but where nothing 
>> is known and genes that just haven't been curated yet, so I think we're 
>> stuck with tracking curation progress in some way.
>>
>> In that sense, I can see a rationale for only allowing annotations to 
>> the root node to be made with the ND code in that we are making a 
>> curatorial statement about what a curator looked at in order to make 
>> an annotation to the root node. The ND evidence code is already a 
>> special case in that it can only be used for annotations to the root 
>> node.
>>
>> Provided that the documentation is phrased in terms of curatorial 
>> process, i.e. the procedure required in order to be able to make the 
>> statement that a given aspect is unknown for a given gene, I'm OK with 
>> this restriction.
>>
>> Note that I'm not volunteering to examine, or do any rewriting of, 
>> the ND documentation since I'm due to deliver within the coming week.
>>
>> Thanks to everyone for a good discussion.
>>
>> -Karen
>>
>>
>>
>>
>> On Mon, 17 Sep 2007, Jim Hu wrote:
>>
>>> Makes sense to me (and I'm sure it won't be the last bad idea I 
>>> throw out there).
>>>
>>> So... as a user, having access to instances where negative results 
>>> have been found, as in the no phenotype example, is useful.  Knowing 
>>> that the mutant has been made and looked at is valuable.  But I can 
>>> see that it probably doesn't belong in GO.
>>>
>>> I think the distinction that you and Ben raise about curation 
>>> progress vs. annotation is important.  Perhaps curation progress 
>>> really doesn't belong at all.
>>>
>>> Jim
>>>
>>>
>>>
>>> On Sep 17, 2007, at 12:53 PM, Mike Cherry wrote:
>>>
>>>> I think this is a bad idea.  ISS to another organism that has the 
>>>> root association is not useful.  That just means that in the other 
>>>> organism there was no data; you have to look for experiments in 
>>>> each organism.  A knockout with no phenotype is data. It's a 
>>>> negative result, yes, but it's data. You have data that there is no 
>>>> phenotype for that gene for the screen that was performed.
>>>> An association to the root was a convenience we used to show that 
>>>> we looked for a result.  It's not an annotation; it's a curation 
>>>> progress statement, a note to say we are looking at all the 
>>>> genes.  I don't like the use of any experimental code for the root.
>>>> -Mike
>>>> On Sep 17, 2007, at 10:02 AM, Jim Hu wrote:
>>>>> On Sep 17, 2007, at 11:07 AM, Valerie Wood wrote:
>>>>>> I don't see how you can make an annotation to the root node using
>>>>>> RCA/IC/IMP/ISS or IDA?
>>>>> We haven't done these yet, but
>>>>> ISS - similarity to proteins annotated to the root node with ND in 
>>>>> another organism?
>>>>> IMP - What does one do for large-scale knockout screens when a KO 
>>>>> shows no phenotype?  Someone did look, so it's not really ND, is it?
>>>>> IDA seems pretty hard to rationalize.  I can imagine negative 
>>>>> results, as in "previous analysis suggested that gene X has 
>>>>> activity Y, but we can't detect it"  but wouldn't that get a NOT 
>>>>> modifier for the assayed activity Y, if it was annotated at all?  
>>>>> I'm actually thinking of a case where paper A says that an E. coli 
>>>>> protein is a nuclease, and paper B shows that the nuclease 
>>>>> activity is a contaminant.  I'm thinking there are the following 
>>>>> choices if that results in not knowing the function of the gene X 
>>>>> product:
>>>>> * delete the annotation from paper A
>>>>> ** no annotation to the root node
>>>>> ** annotate to the root node
>>>>> * add the annotation from paper B with NOT Y
>>>>> ** no annotation to the root node
>>>>> ** annotate to the root node
>>>>> I recall discussing this kind of situation with Karen, but I'm not 
>>>>> sure that we covered how to handle the root node.  Does this 
>>>>> change if the information that the putative activity was a 
>>>>> contaminant is not published but the curator knows about it from a 
>>>>> meeting or a personal communication?
>>>>> Similarly, can one annotate to the root node with RCA if a 
>>>>> computational analysis shows that protein X does not have 
>>>>> previously suggested activity Y based on improved sophistication 
>>>>> of motif analysis?  Again, if this removes the only putative 
>>>>> activity from an earlier analysis, does the protein get a root 
>>>>> node annotation, or does it get nothing?  For example, a protein is 
>>>>> annotated by some project as a thioredoxin based on good sequence 
>>>>> similarity to the fold family members.  Later, someone notices 
>>>>> that the active site residues are missing.
>>>>> Jim
>>>>>> The ND means the curator has looked at all the papers for this 
>>>>>> gene (and for some databases checked the annotations to orthologs 
>>>>>> to see if any sensible inferences can be made), and as of the 
>>>>>> date the annotation was made there is "no data".
>>>>>> We wouldn't be able to do this with any of the other evidence 
>>>>>> codes.
>>>>>> Val
>>>>>> Suzanna Lewis wrote:
>>>>>>> After reading through this thread I see no strong reason for 
>>>>>>> requiring ND as the evidence code for annotations to the root.
>>>>>>> In fact, I'm now wondering why we have ND at all. Seems to me 
>>>>>>> that "no data" is a result. It is not the type of experiment 
>>>>>>> that was done. Maybe the only accurate use of ND is when we 
>>>>>>> don't even know what kind of experiment was carried out.
>>>>>>> -S
>>>>>>> On Sep 17, 2007, at 8:42 AM, Valerie Wood wrote:
>>>>>>>> So we don't all need to run the query......
>>>>>>>> biological_process Dictybase ISS 1
>>>>>>>> biological_process Dictybase ND 1313
>>>>>>>> biological_process FB ND 1022
>>>>>>>> biological_process GeneDB_Pfalciparum ND 702
>>>>>>>> biological_process GeneDB_Spombe ND 1021
>>>>>>>> biological_process GeneDB_Tbrucei ND 1087
>>>>>>>> biological_process GeneDB_Tbrucei TAS 1
>>>>>>>> biological_process GR_protein IC 11
>>>>>>>> biological_process MGI IDA 1
>>>>>>>> biological_process MGI IMP 2
>>>>>>>> biological_process MGI ND 1382
>>>>>>>> biological_process PseudoCAP IDA 13
>>>>>>>> biological_process PseudoCAP ISS 2
>>>>>>>> biological_process PseudoCAP RCA 26
>>>>>>>> biological_process RGD IEA 1
>>>>>>>> biological_process RGD ND 607
>>>>>>>> biological_process SGD IMP 1
>>>>>>>> biological_process SGD NAS 1
>>>>>>>> biological_process SGD ND 1429
>>>>>>>> biological_process SGD TAS 1
>>>>>>>> biological_process TAIR ND 11086
>>>>>>>> biological_process TAIR RCA 12
>>>>>>>> biological_process TAIR TAS 3
>>>>>>>> biological_process TIGR_CMR ND 19190
>>>>>>>> biological_process TIGR_Tba1 ND 194
>>>>>>>> biological_process UniProt IEA 6
>>>>>>>> biological_process UniProt ND 966
>>>>>>>> biological_process WB IMP 1326
>>>>>>>> biological_process WB ND 2
>>>>>>>> biological_process ZFIN ND 5269
>>>>>>>> cellular_component Dictybase ISS 3
>>>>>>>> cellular_component Dictybase ND 1551
>>>>>>>> cellular_component FB ISS 1
>>>>>>>> cellular_component FB ND 2058
>>>>>>>> cellular_component GeneDB_Pfalciparum ND 288
>>>>>>>> cellular_component GeneDB_Spombe ND 190
>>>>>>>> cellular_component GeneDB_Tbrucei NAS 2
>>>>>>>> cellular_component GeneDB_Tbrucei ND 1623
>>>>>>>> cellular_component GeneDB_Tbrucei TAS 1
>>>>>>>> cellular_component GR_protein TAS 8
>>>>>>>> cellular_component MGI ND 1362
>>>>>>>> cellular_component MGI TAS 1
>>>>>>>> cellular_component PseudoCAP IDA 13
>>>>>>>> cellular_component PseudoCAP ISS 2
>>>>>>>> cellular_component RGD ND 718
>>>>>>>> cellular_component SGD ND 972
>>>>>>>> cellular_component SGD TAS 1
>>>>>>>> cellular_component TAIR ND 9877
>>>>>>>> cellular_component TAIR TAS 12
>>>>>>>> cellular_component TIGR_CMR ND 14318
>>>>>>>> cellular_component TIGR_Tba1 NAS 2
>>>>>>>> cellular_component TIGR_Tba1 ND 184
>>>>>>>> cellular_component UniProt ND 1278
>>>>>>>> cellular_component WB ND 55
>>>>>>>> cellular_component ZFIN ND 6283
>>>>>>>> molecular_function Dictybase ND 1064
>>>>>>>> molecular_function FB ND 1935
>>>>>>>> molecular_function FB TAS 1
>>>>>>>> molecular_function GeneDB_Lmajor IEA 57
>>>>>>>> molecular_function GeneDB_Pfalciparum IEA 38
>>>>>>>> molecular_function GeneDB_Pfalciparum ND 789
>>>>>>>> molecular_function GeneDB_Spombe ND 1452
>>>>>>>> molecular_function GeneDB_Tbrucei IEA 44
>>>>>>>> molecular_function GeneDB_Tbrucei ND 977
>>>>>>>> molecular_function GeneDB_Tbrucei TAS 7
>>>>>>>> molecular_function GR_protein IEA 255
>>>>>>>> molecular_function GR_protein RCA 15
>>>>>>>> molecular_function MGI ND 1381
>>>>>>>> molecular_function PseudoCAP IDA 13
>>>>>>>> molecular_function PseudoCAP ISS 2
>>>>>>>> molecular_function PseudoCAP RCA 46
>>>>>>>> molecular_function RGD ND 701
>>>>>>>> molecular_function SGD ISS 4
>>>>>>>> molecular_function SGD NAS 1
>>>>>>>> molecular_function SGD ND 2166
>>>>>>>> molecular_function SGD TAS 19
>>>>>>>> molecular_function TAIR NAS 3
>>>>>>>> molecular_function TAIR ND 10095
>>>>>>>> molecular_function TAIR RCA 403
>>>>>>>> molecular_function TAIR TAS 72
>>>>>>>> molecular_function TIGR_CMR ND 19337
>>>>>>>> molecular_function TIGR_Tba1 ND 181
>>>>>>>> molecular_function TIGR_Tba1 TAS 7
>>>>>>>> molecular_function UniProt ND 1124
>>>>>>>> molecular_function WB NAS 1
>>>>>>>> molecular_function WB ND 51
>>>>>>>> molecular_function WB TAS 2
>>>>>>>> molecular_function ZFIN ND 4950
>>>>>>>> -- 
>>>>>>>> The Wellcome Trust Sanger Institute is operated by Genome 
>>>>>>>> Research Limited, a charity registered in England with number 
>>>>>>>> 1021457 and a company registered in England with number 
>>>>>>>> 2742969, whose registered office is 215 Euston Road, London, 
>>>>>>>> NW1 2BE.
>>>>> =====================================
>>>>> Jim Hu
>>>>> Associate Professor
>>>>> Dept. of Biochemistry and Biophysics
>>>>> 2128 TAMU
>>>>> Texas A&M Univ.
>>>>> College Station, TX 77843-2128
>>>>> 979-862-4054
>>>
>>
>
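
For anyone who wants to regenerate a tally like the one Val posted, here is 
a minimal Python sketch. It assumes tab-separated gene association lines 
where column 1 is the source database, column 5 the GO ID, and column 7 the 
evidence code; whether Val's query grouped on that same database field is 
an assumption on my part. The function names are hypothetical and the 
sample rows are invented placeholders, not real annotations. The three GO 
IDs are the root terms of the three ontologies.

```python
from collections import Counter

# Root terms of the three GO ontologies.
ROOT_TERMS = {
    "GO:0008150": "biological_process",
    "GO:0003674": "molecular_function",
    "GO:0005575": "cellular_component",
}


def tally_root_annotations(gaf_lines):
    """Count root-node annotations by (ontology, database, evidence code)."""
    counts = Counter()
    for line in gaf_lines:
        # Skip blank lines and comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # too short to carry an evidence code
        db, go_id, evidence = cols[0], cols[4], cols[6]
        if go_id in ROOT_TERMS:
            counts[(ROOT_TERMS[go_id], db, evidence)] += 1
    return counts


def non_nd_root_annotations(counts):
    """List the (ontology, database, evidence) groups at the root that are not ND."""
    return sorted(key for key in counts if key[2] != "ND")


# Invented placeholder rows (hypothetical IDs and references).
sample = [
    "!placeholder comment line",
    "SGD\tID0001\tGENE1\t\tGO:0008150\tREF:placeholder\tND\t\tP",
    "SGD\tID0002\tGENE2\t\tGO:0008150\tREF:placeholder\tND\t\tP",
    "WB\tID0003\tgene-3\t\tGO:0008150\tREF:placeholder\tIMP\t\tP",
    "MGI\tID0004\tGene4\t\tGO:0006412\tREF:placeholder\tIDA\t\tP",  # not a root term
]

counts = tally_root_annotations(sample)
for (ontology, db, evidence), n in sorted(counts.items()):
    print(ontology, db, evidence, n)
```

The second function only lists the groups that would be affected if ND 
were required at the root; it makes no judgement about whether those 
annotations are wrong.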



