[go] Requirement for all 'unknown' annotations to use ND code
Harold Drabkin
hjd at informatics.jax.org
Tue Sep 18 07:00:39 PDT 2007
I think if we had a paper with a KO that showed no phenotype, it
wouldn't be used for an annotation; lack of a phenotype (expected or
not) might just mean that the organism in question has a way around the
lack of the gene.
hjd
Suzanna Lewis wrote:
>
> On Sep 17, 2007, at 11:47 AM, Karen Christie wrote:
>
>> Hi,
>>
>> The reason I brought this issue up was that I was very uncomfortable
>> with the rationale that people could use the ND evidence code as a
>> way to find the unknown annotations, or with having that purpose as a
>> justification to only allow the ND code for root annotations in our
>> documentation. It seems that we have come to consensus that we should
>> not be saying anything about this as a software shortcut to find
>> unknown annotations.
>
> Whatever is decided. No one should do the above, ever. It was never,
> ever the justification for the ND-only-at-root restriction. So yes, we
> have consensus: we do -not- mention this in any documentation.
>
> I am convinced by what Ben and Mike have said. Yes, we have overloaded
> the meaning of the evidence code here. ND is really curation status.
> What I'm still confused about, Mike, is the question I was trying to
> get at. You say:
>
>>>>> A knockout with no phenotype is data; it's a negative result, yes,
>>>>> but it's data. You have data that there is no phenotype for that
>>>>> gene for the screen that was performed.
>
> So would this be an annotation to a root node with IMP?
>
>>
>> The more recent discussion has been dealing with the issue that
>> annotations to the root node are a special case anyway, allowing us
>> to track curation progress. While we know that these aren't
>> annotations of knowledge in the same way as other annotations, I
>> think the group has agreed many, many times that we DO want a way to
>> distinguish between genes that have been looked at but where nothing
>> is known and genes that just haven't been curated yet, so I think
>> we're stuck with tracking curation progress in some way.
>>
>> In that sense, I can see a rationale for only allowing annotations to
>> the root node to be made with the ND code in that we are making a
>> curatorial statement about what a curator looked at in order to make
>> an annotation to the root node. The ND evidence code is already a
>> special case in that it can only be used for annotations to the root
>> node.
>>
>> Provided that the documentation is phrased in terms of curatorial
>> process, i.e. the procedure required in order to be able to make the
>> statement that a given aspect is unknown for a given gene, I'm OK with
>> this restriction.
>>
>> Note that I'm not volunteering to examine, or do any rewriting of,
>> the ND documentation since I'm due to deliver within the coming week.
>>
>> Thanks to everyone for a good discussion.
>>
>> -Karen
>>
>>
>>
>>
>> On Mon, 17 Sep 2007, Jim Hu wrote:
>>
>>> Makes sense to me (and I'm sure it won't be the last bad idea I
>>> throw out there).
>>>
>>> So... as a user, having access to instances where negative results
>>> have been found, as in the no phenotype example, is useful. Knowing
>>> that the mutant has been made and looked at is valuable. But I can
>>> see that it probably doesn't belong in GO.
>>>
>>> I think the distinction that you and Ben raise about curation
>>> progress vs. annotation is important. Perhaps curation progress
>>> really doesn't belong at all.
>>>
>>> Jim
>>>
>>>
>>>
>>> On Sep 17, 2007, at 12:53 PM, Mike Cherry wrote:
>>>
>>>> I think this is a bad idea. ISS to another organism that has the
>>>> root association is not useful. That just means that in the other
>>>> organism there was no data; you have to look for experiments in
>>>> each organism. A knockout with no phenotype is data; it's a
>>>> negative result, yes, but it's data. You have data that there is no
>>>> phenotype for that gene for the screen that was performed.
>>>> An association to the root was a convenience we used to show that
>>>> we looked for a result. It's not an annotation; it's a curation
>>>> progress statement, a note to say we are looking at all the
>>>> genes. I don't like the use of any experimental code for the root.
>>>> -Mike
>>>> On Sep 17, 2007, at 10:02 AM, Jim Hu wrote:
>>>>> On Sep 17, 2007, at 11:07 AM, Valerie Wood wrote:
>>>>>> I don't see how you can make an annotation to the root node using
>>>>>> RCA/IC/IMP/ISS or IDA?
>>>>> We haven't done these yet, but
>>>>> ISS - similarity to proteins annotated to the root node with ND in
>>>>> another organism?
>>>>> IMP - What does one do for large scale knockout screens when a KO
>>>>> shows no phenotype? Someone did look, so it's not really ND, is it?
>>>>> IDA seems pretty hard to rationalize. I can imagine negative
>>>>> results, as in "previous analysis suggested that gene X has
>>>>> activity Y, but we can't detect it" but wouldn't that get a NOT
>>>>> modifier for the assayed activity Y, if it was annotated at all?
>>>>> I'm actually thinking of a case where paper A says that an E. coli
>>>>> protein is a nuclease, and paper B shows that the nuclease
>>>>> activity is a contaminant. I'm thinking there are the following
>>>>> choices if that results in not knowing the function of the gene X
>>>>> product:
>>>>> * delete the annotation from paper A
>>>>> ** no annotation to the root node
>>>>> ** annotate to the root node
>>>>> * add the annotation from paper B with a not Y
>>>>> ** no annotation to the root node
>>>>> ** annotate to the root node
>>>>> I recall discussing this kind of situation with Karen, but I'm not
>>>>> sure that we covered how to handle the root node. Does this
>>>>> change if the information that the putative activity was a
>>>>> contaminant is not published but the curator knows about it from a
>>>>> meeting or a personal communication?
>>>>> Similarly, can one annotate to the root node with RCA if a
>>>>> computational analysis shows that protein X does not have
>>>>> previously suggested activity Y based on improved sophistication
>>>>> of motif analysis? Again, if this removes the only putative
>>>>> activity from an earlier analysis, does the protein get a root
>>>>> node annotation, or does it get nothing? Example, a protein is
>>>>> annotated by some project as a thioredoxin based on good sequence
>>>>> similarity to the fold family members. Later, someone notices
>>>>> that the active site residues are missing.
>>>>> Jim
>>>>>> The ND means the curator has looked at all the papers for this
>>>>>> gene (and for some databases checked the annotations to orthologs
>>>>>> to see if any sensible inferences can be made), and as of the
>>>>>> date the annotation was made there is "no data".
>>>>>> We wouldn't be able to do this with any of the other evidence
>>>>>> codes.
>>>>>> Val
>>>>>> Suzanna Lewis wrote:
>>>>>>> After reading through this thread I see no strong reason for
>>>>>>> requiring ND as the evidence code for annotations to the root.
>>>>>>> In fact, I'm now wondering why we have ND at all. Seems to me
>>>>>>> that "no data" is a result. It is not the type of experiment
>>>>>>> that was done. Maybe the only accurate use of ND is when we
>>>>>>> don't even know what kind of experiment was carried out.
>>>>>>> -S
>>>>>>> On Sep 17, 2007, at 8:42 AM, Valerie Wood wrote:
>>>>>>>> So we don't all need to run the query......
>>>>>>>> biological_process Dictybase ISS 1
>>>>>>>> biological_process Dictybase ND 1313
>>>>>>>> biological_process FB ND 1022
>>>>>>>> biological_process GeneDB_Pfalciparum ND 702
>>>>>>>> biological_process GeneDB_Spombe ND 1021
>>>>>>>> biological_process GeneDB_Tbrucei ND 1087
>>>>>>>> biological_process GeneDB_Tbrucei TAS 1
>>>>>>>> biological_process GR_protein IC 11
>>>>>>>> biological_process MGI IDA 1
>>>>>>>> biological_process MGI IMP 2
>>>>>>>> biological_process MGI ND 1382
>>>>>>>> biological_process PseudoCAP IDA 13
>>>>>>>> biological_process PseudoCAP ISS 2
>>>>>>>> biological_process PseudoCAP RCA 26
>>>>>>>> biological_process RGD IEA 1
>>>>>>>> biological_process RGD ND 607
>>>>>>>> biological_process SGD IMP 1
>>>>>>>> biological_process SGD NAS 1
>>>>>>>> biological_process SGD ND 1429
>>>>>>>> biological_process SGD TAS 1
>>>>>>>> biological_process TAIR ND 11086
>>>>>>>> biological_process TAIR RCA 12
>>>>>>>> biological_process TAIR TAS 3
>>>>>>>> biological_process TIGR_CMR ND 19190
>>>>>>>> biological_process TIGR_Tba1 ND 194
>>>>>>>> biological_process UniProt IEA 6
>>>>>>>> biological_process UniProt ND 966
>>>>>>>> biological_process WB IMP 1326
>>>>>>>> biological_process WB ND 2
>>>>>>>> biological_process ZFIN ND 5269
>>>>>>>> cellular_component Dictybase ISS 3
>>>>>>>> cellular_component Dictybase ND 1551
>>>>>>>> cellular_component FB ISS 1
>>>>>>>> cellular_component FB ND 2058
>>>>>>>> cellular_component GeneDB_Pfalciparum ND 288
>>>>>>>> cellular_component GeneDB_Spombe ND 190
>>>>>>>> cellular_component GeneDB_Tbrucei NAS 2
>>>>>>>> cellular_component GeneDB_Tbrucei ND 1623
>>>>>>>> cellular_component GeneDB_Tbrucei TAS 1
>>>>>>>> cellular_component GR_protein TAS 8
>>>>>>>> cellular_component MGI ND 1362
>>>>>>>> cellular_component MGI TAS 1
>>>>>>>> cellular_component PseudoCAP IDA 13
>>>>>>>> cellular_component PseudoCAP ISS 2
>>>>>>>> cellular_component RGD ND 718
>>>>>>>> cellular_component SGD ND 972
>>>>>>>> cellular_component SGD TAS 1
>>>>>>>> cellular_component TAIR ND 9877
>>>>>>>> cellular_component TAIR TAS 12
>>>>>>>> cellular_component TIGR_CMR ND 14318
>>>>>>>> cellular_component TIGR_Tba1 NAS 2
>>>>>>>> cellular_component TIGR_Tba1 ND 184
>>>>>>>> cellular_component UniProt ND 1278
>>>>>>>> cellular_component WB ND 55
>>>>>>>> cellular_component ZFIN ND 6283
>>>>>>>> molecular_function Dictybase ND 1064
>>>>>>>> molecular_function FB ND 1935
>>>>>>>> molecular_function FB TAS 1
>>>>>>>> molecular_function GeneDB_Lmajor IEA 57
>>>>>>>> molecular_function GeneDB_Pfalciparum IEA 38
>>>>>>>> molecular_function GeneDB_Pfalciparum ND 789
>>>>>>>> molecular_function GeneDB_Spombe ND 1452
>>>>>>>> molecular_function GeneDB_Tbrucei IEA 44
>>>>>>>> molecular_function GeneDB_Tbrucei ND 977
>>>>>>>> molecular_function GeneDB_Tbrucei TAS 7
>>>>>>>> molecular_function GR_protein IEA 255
>>>>>>>> molecular_function GR_protein RCA 15
>>>>>>>> molecular_function MGI ND 1381
>>>>>>>> molecular_function PseudoCAP IDA 13
>>>>>>>> molecular_function PseudoCAP ISS 2
>>>>>>>> molecular_function PseudoCAP RCA 46
>>>>>>>> molecular_function RGD ND 701
>>>>>>>> molecular_function SGD ISS 4
>>>>>>>> molecular_function SGD NAS 1
>>>>>>>> molecular_function SGD ND 2166
>>>>>>>> molecular_function SGD TAS 19
>>>>>>>> molecular_function TAIR NAS 3
>>>>>>>> molecular_function TAIR ND 10095
>>>>>>>> molecular_function TAIR RCA 403
>>>>>>>> molecular_function TAIR TAS 72
>>>>>>>> molecular_function TIGR_CMR ND 19337
>>>>>>>> molecular_function TIGR_Tba1 ND 181
>>>>>>>> molecular_function TIGR_Tba1 TAS 7
>>>>>>>> molecular_function UniProt ND 1124
>>>>>>>> molecular_function WB NAS 1
>>>>>>>> molecular_function WB ND 51
>>>>>>>> molecular_function WB TAS 2
>>>>>>>> molecular_function ZFIN ND 4950
>>>>>>>> --
>>>>>>>> The Wellcome Trust Sanger Institute is operated by Genome
>>>>>>>> Research Limited, a charity registered in England with number
>>>>>>>> 1021457 and a company registered in England with number
>>>>>>>> 2742969, whose registered office is 215 Euston Road, London,
>>>>>>>> NW1 2BE.
>>>>> =====================================
>>>>> Jim Hu
>>>>> Associate Professor
>>>>> Dept. of Biochemistry and Biophysics
>>>>> 2128 TAMU
>>>>> Texas A&M Univ.
>>>>> College Station, TX 77843-2128
>>>>> 979-862-4054
>>
>
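
For anyone who wants to reproduce a tally like the one Val posted in the
quoted thread, here is a minimal Python sketch. It is not the query Val
ran (that query does not appear in the thread). It assumes the
annotations are available as tab-separated gene association (GAF) lines,
with the database in column 1, the GO ID in column 5 and the evidence
code in column 7. The function names and the sample records are
hypothetical, made up only for illustration.

```python
from collections import Counter

# The three GO root terms, keyed by GO ID.
ROOT_TERMS = {
    "GO:0008150": "biological_process",
    "GO:0003674": "molecular_function",
    "GO:0005575": "cellular_component",
}


def tally_root_annotations(lines):
    """Count root-node annotations per (aspect, database, evidence code).

    Assumes GAF-style tab-separated lines: column 1 is the database,
    column 5 the GO ID, column 7 the evidence code. Lines starting with
    '!' are treated as comments and skipped.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # too short to be an annotation line
        db, go_id, evidence = cols[0], cols[4], cols[6]
        if go_id in ROOT_TERMS:
            counts[(ROOT_TERMS[go_id], db, evidence)] += 1
    return counts


def non_nd_root_annotations(counts):
    """Return only the root-node tallies whose evidence code is not ND."""
    return {key: n for key, n in counts.items() if key[2] != "ND"}


if __name__ == "__main__":
    # Hypothetical sample records, not real annotations.
    sample = [
        "!gaf-version: 1.0\n",
        "SGD\tS000000001\tGENE1\t\tGO:0008150\tSGD_REF:1\tND\t\tP\n",
        "SGD\tS000000002\tGENE2\t\tGO:0008150\tPMID:1\tIMP\t\tP\n",
        "MGI\tMGI:0000001\tGene3\t\tGO:0003674\tMGI:1\tND\t\tF\n",
        "MGI\tMGI:0000001\tGene3\t\tGO:0005634\tPMID:2\tIDA\t\tC\n",
    ]
    tally = tally_root_annotations(sample)
    for (aspect, db, evidence), n in sorted(tally.items()):
        print(aspect, db, evidence, n)
    print("non-ND root annotations:", non_nd_root_annotations(tally))
```

The second function only lists the combinations that the proposed
restriction would affect, i.e. root-node annotations made with a code
other than ND. It is meant for reviewing existing annotations, not as a
way for users to find "unknown" genes.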