Epistemic formalism (was Re: [Phenoscape] Re: [go] evidence code ontology)
Tanya Berardini
tberardi at acoma.Stanford.EDU
Wed Feb 6 16:54:51 PST 2008
> But even then there may be key contexts that we are not testing for
> in experiments.
Yes, but this is the nature of research in biology. You test as many
of the contexts as you can think of that might be relevant, but how do
you know which are relevant? It's the old 'unknown unknown' question.
So, we are stuck with capturing what has been tested and published.
> > Another simple case of context is two authors measuring the
> > expression of a gene at the RNA level. If one author measures by
> > Northern Blot, they may conclude the gene is not expressed. If
> > another author uses a very sensitive assay such as RT-PCR they may
> > detect expression of the gene. Would you consider this
> > contradictory? I'd conclude that you can't detect it by Northern,
> > but you can by RT-PCR. I'd also conclude that it is expressed. So
> > in a way the conclusion made by the first author is incorrect, but
> > it is consistent with the context in which he did the experiment.
>
> Out of interest, when annotating an experiment as this, would a
> curator make a NOT annotation? To what extent does this depend on the
> certainty of the author, and to what extent on the curator's judgement
> (including knowledge of the sensitivity of various techniques)?
I think this is a great example of curation as an art as opposed to a
science. It's easy to make hard and fast rules that are just waiting
to be broken: "If you don't see it in a Northern, it's not there."
vs. "If you don't see it in a Northern, it could be there but you need
more tissue/more mRNA/longer exposure/more sensitive probe...."
Personally, I annotate the positives in Northerns and tend to stay
away from the negatives (I don't make the NOT annotation) because you
never know. On the other hand, if the transcript isn't detected by
Q- or SQ-RT-PCR, I would make that NOT annotation.
Others may differ in that practice.
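
As a rough illustration only, the rule of thumb above could be written
down as a small decision function. The assay labels, the function name
and the return values are hypothetical and exist just for this sketch;
they are not GO or TAIR conventions.

```python
from typing import Optional

# Assays sensitive enough that a negative result is annotated as NOT.
# This set reflects only the personal practice described above.
SENSITIVE_ASSAYS = {"Q-RT-PCR", "SQ-RT-PCR"}


def expression_annotation(assay: str, detected: bool) -> Optional[str]:
    """Return "expressed", "NOT expressed", or None (make no annotation)."""
    if detected:
        # Positives are annotated regardless of the assay used.
        return "expressed"
    if assay in SENSITIVE_ASSAYS:
        # A negative from a sensitive assay is recorded as a NOT annotation.
        return "NOT expressed"
    # e.g. a negative Northern: absence of signal is left unannotated.
    return None


print(expression_annotation("Northern", False))
print(expression_annotation("Q-RT-PCR", False))
```

Another curator could encode a different practice simply by changing
which assays are listed as sensitive.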
> I think the idea of contexts is crucial here. We can start capturing
> some contexts of the first variety (biological contexts) using the
> new annotation property column. And we can perhaps capture some of
> the second variety (experimental contexts) with a finer grained
> evidence/experiment ontology.
That is what we try to capture at TAIR with what has become ECO, a
link to more than just an evidence code - a link to an actual method.
So, at TAIR, a user can see whether the expression was detected by
Northern or RT-PCR or in situ hybridization.
my 2 cents,
Tanya
> > Larry Hunter wrote:
> >>
> >> On Feb 6, 2008, at 12:52 PM, Chris Mungall wrote:
> >>
> >>> Let's be clear about what you're asking for.
> >>>
> >>> If we have two assertions:
> >>>
> >>> [1] R(X,Y)
> >>> [2] R(X,not-Y)
> >>>
> >>> Where assertion [1] is supported by e1, and assertion [2] is
> >>> supported by e2.
> >>>
> >>> e1 and e2, on the surface, contradict one another (this situation
> >>> is actually a bit more subtle than this, it depends on how we
> >>> treat [1] and [2]).
> >>>
> >>> You would like relations such as has_evidence, between the
> >>> assertion and the evidence, and a contradicts relation between
> >>> evidences that is entailed by the assertions?
> >>
> >> Not quite.
> >>
> >> I was actually thinking that the epistemic relationships would be
> >> between the evidence and the propositions. I wouldn't say the
> >> "contradicts" relationship is between the evidences, but between
> >> hypotheses and evidences. And, of course, depending on what R
> >> is, R(X, Y) may or may not be incompatible with R(X, not-Y). Let
> >> me try an alternate formulation to make sure I am being clear,
> >> trying to hew to your notation.
> >>
> >> Let's start with instances. Imagine we have made GO annotations
> >> use explicit relations, as I described in my email addressed to
> >> Sue this morning, and we have an instance level assertion (called
> >> [1]) of a gene participating in a process:
> >>
> >> [1] gene1 participates-in process1
> >>
> >> Further imagine that this would have been annotated "inferred from
> >> direct assay" with a pointer to PMID1 as evidence. Glossing over
> >> what the relationship "inferred from direct assay" really means
> >> for the moment, and just treating it as a kind of evidential
> >> support, we might want to assert:
> >>
> >> PMID1 supports [1]
> >>
> >> Now imagine a second publication that comes along that provides
> >> evidence (perhaps also "inferred from direct assay") that gene1
> >> does NOT participate in process1. Currently there would be
> >> another GO annotation created, much like the first, but with the
> >> NOT qualifier. Then I would suggest something like:
> >>
> >> PMID2 contradicts [1]
> >>
> >> In this case, we have one hypothesis (gene1 participates-in
> >> process1), two pieces of evidence, and two epistemic relationships
> >> (one between each piece of evidence and the hypothesis).
> >>
> >> It would not be difficult to make subclasses of supports (and
> >> contradicts) that reflected the evidence codes as they are
> >> currently used (e.g. supports-via-direct-assay,
> >> contradicts-via-mutant-phenotype), although the many kinds of
> >> relationships between evidence and hypotheses (many of which
> >> require inference by someone along the way) suggest that a really
> >> good taxonomy would be non-trivial to create.
> >>
> >> It would also not be difficult to use this structure to capture
> >> evidence that is finer grained than the journal article (as
> >> represented by PMID above), say using the contents of a particular
> >> figure, table, or result statement as the evidence. I think the
> >> OBI DENRIE hierarchy nicely captures the things that can be
> >> evidence for a scientific hypothesis, and subsumes each of the
> >> above. It does make evidence a kind of information entity, but
> >> the alternatives to that all strike me as problematic. OBI also
> >> has a "hypothesis" term. Although there are some problems with
> >> the definition as it currently stands (see
> >> https://sourceforge.net/tracker/?func=detail&atid=886178&aid=1887478&group_id=177891),
> >> it may ultimately be the right domain for these epistemic relations.
> >>
> >> It would also not be difficult to make both "supports" and
> >> "contradicts" children of a relationship like "is-relevant-to"
> >> (Barry suggested "is-about"). Such a system would be
> >> immediately useful to support all kinds of tools that could be
> >> helpful to biologists -- not the least of which would be a query
> >> that says "show me all the evidence relevant to the role of this
> >> gene in that process". An instance store that was represented
> >> using this formalism I think could offer quite a boon in the
> >> utility of the annotation work.
> >>
> >> The situation with universals is analogous to the instance case
> >> above, although we have never before mentioned evidence (as codes
> >> or otherwise) regarding relationships among universals. The
> >> reason for that is the "reality-based" desiderata, and the
> >> consequent assumption that its contents never need evidence (and
> >> can't be contradicted). The proposal for a set of epistemic
> >> terms with a range over OBI DENRIEs and with a domain of OBI
> >> hypotheses seems plausible to me at this point.
> >>
> >> There are at least two objections that I can see that
> >> need to be addressed. First, these relations are most clearly
> >> second-order, and cannot be represented in OWL-DL (although they
> >> seem to offer no challenges in OWL-FULL). Second, it's not clear if
> >> every proposition with which one might want to associate evidence
> >> is properly considered a subclass of OBI hypothesis. If we were
> >> to turn all of the GO annotations into explicit relationship
> >> assertions (using "participates in", "is located in" etc.) would
> >> we want all of those propositions to be subclasses of OBI
> >> hypothesis? If not, we need to come up with something else to
> >> define the range of the epistemic relations.
> >>
> >> Larry
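
The structure Larry describes in the quoted message (one hypothesis, two
pieces of evidence, and one epistemic relation per piece of evidence)
can be sketched as a small data model. This is only an illustration:
the class and function names are hypothetical, and treating "supports"
and "contradicts" as the two kinds of "is-relevant-to" is modelled here
simply by returning both from a single query.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Relation(Enum):
    # Both are treated as children of a general "is-relevant-to" relation.
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"


@dataclass(frozen=True)
class Hypothesis:
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class EpistemicLink:
    evidence: str        # e.g. a PMID, or a figure/table identifier
    relation: Relation
    hypothesis: Hypothesis


def relevant_evidence(links: List[EpistemicLink],
                      hypothesis: Hypothesis) -> List[EpistemicLink]:
    """Return every link about the hypothesis, supporting or contradicting."""
    return [link for link in links if link.hypothesis == hypothesis]


# Assertion [1] from the quoted message, with its two pieces of evidence.
h1 = Hypothesis("gene1", "participates-in", "process1")
links = [
    EpistemicLink("PMID1", Relation.SUPPORTS, h1),
    EpistemicLink("PMID2", Relation.CONTRADICTS, h1),
]

for link in relevant_evidence(links, h1):
    print(link.evidence, link.relation.value, "[1]")
```

The query "show me all the evidence relevant to the role of this gene in
that process" then amounts to one call to relevant_evidence. Finer
grained relations such as supports-via-direct-assay would need a richer
taxonomy than the two-member enum assumed here.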