Epistemic formalism (was Re: [Phenoscape] Re: [go] evidence code ontology)

Larry Hunter Larry.Hunter at UCHSC.edu
Wed Feb 6 12:40:06 PST 2008


On Feb 6, 2008, at 12:52 PM, Chris Mungall wrote:

> Let's be clear about what you're asking for.
>
> If we have two assertions:
>
> [1] R(X,Y)
> [2] R(X,not-Y)
>
> Where assertion [1] is supported by e1, and assertion [2] is  
> supported by e2.
>
> e1 and e2, on the surface, contradict one another (this situation is  
> actually a bit more subtle than this, it depends on how we treat [1]  
> and [2]).
>
> You would like relations such as has_evidence, between the assertion  
> and the evidence, and a contradicts relation between evidences that  
> is entailed by the assertions?

Not quite.

I was actually thinking that the epistemic relationships would be  
between the evidence and the propositions.  I wouldn't say the  
"contradicts" relationship is between the evidences, but between  
hypotheses and evidences.   And, of course, depending on what R is,  
R(X, Y) may or may not be incompatible with R(X, not-Y).  Let me try  
an alternate formulation to make sure I am being clear, trying to hew  
to your notation.

Let's start with instances.  Imagine we have made GO annotations use  
explicit relations, as I described in my email addressed to Sue this  
morning, and we have an instance level assertion (called [1]) of a  
gene participating in a process:

[1] gene1 participates-in process1

Further imagine that this would have been annotated "inferred from  
direct assay" with a pointer to PMID1 as evidence.  Glossing over what  
the relationship "inferred from direct assay" really means for the  
moment, and just treating it as a kind of evidential support, we might  
want to assert:

PMID1 supports [1]

Now imagine that a second publication comes along and provides  
evidence (perhaps also "inferred from direct assay") that gene1 does  
NOT participate in process1.  Currently there would be another GO  
annotation created, much like the first, but with the NOT qualifier.   
Then I would suggest something like:

PMID2 contradicts [1]

In this case, we have one hypothesis (gene1 participates-in process1),  
two pieces of evidence, and two epistemic relationships (one between  
each piece of evidence and the hypothesis).
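
The structure just described can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the class names (Hypothesis, Evidence, EpistemicRelation) and the helper evidence_for are hypothetical, not part of GO or OBI, and PMID1 and PMID2 are the placeholder identifiers used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """An instance-level assertion such as [1] (hypothetical class)."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class Evidence:
    """A piece of evidence, identified here only by a placeholder source."""
    source: str


@dataclass(frozen=True)
class EpistemicRelation:
    """Links one piece of evidence to one hypothesis."""
    evidence: Evidence
    kind: str  # "supports" or "contradicts"
    hypothesis: Hypothesis


# One hypothesis, two pieces of evidence, two epistemic relationships.
h1 = Hypothesis("gene1", "participates-in", "process1")
relations = [
    EpistemicRelation(Evidence("PMID1"), "supports", h1),
    EpistemicRelation(Evidence("PMID2"), "contradicts", h1),
]


def evidence_for(hypothesis, kind, relations):
    """Return the sources standing in the given relation to the hypothesis."""
    return [
        r.evidence.source
        for r in relations
        if r.hypothesis == hypothesis and r.kind == kind
    ]
```

Note that the contradicting publication does not create a second, negated hypothesis here; both relations point at the same h1.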

It would not be difficult to make subclasses of supports (and  
contradicts) that reflected the evidence codes as they are currently  
used (e.g. supports-via-direct-assay, contradicts-via-mutant-phenotype),  
although the many kinds of relationships between evidence  
and hypotheses (many of which require inference by someone along the  
way) suggest that a really good taxonomy would be non-trivial to create.

It would also not be difficult to use this structure to capture  
evidence that is finer grained than the journal article (as  
represented by PMID above), say using the contents of a particular  
figure, table, or result statement as the evidence.  I think the OBI  
DENRIE hierarchy nicely captures the things that can be evidence for a  
scientific hypothesis, and subsumes each of the above.  It does make  
evidence a kind of information entity, but the alternatives to that  
all strike me as problematic.   OBI also has a "hypothesis" term.   
Although there are some problems with the definition as it currently  
stands (see https://sourceforge.net/tracker/?func=detail&atid=886178&aid=1887478&group_id=177891),  
it may ultimately be the right domain for these epistemic relations.

It would also not be difficult to make both "supports" and  
"contradicts" children of a relationship like "is-relevant-to" (Barry  
suggested "is-about").  Such a system would be immediately useful to  
support all kinds of tools that could be helpful to biologists -- not  
the least of which would be a query that says "show me all the  
evidence relevant to the role of this gene in that process".  I think an  
instance store represented using this formalism could add considerably  
to the utility of the annotation work.
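
As a rough sketch of how such a query might work over a relation hierarchy, consider the Python below.  Everything in it is an assumption made for illustration: the PARENT table, the is_a and evidence_relevant_to helpers, and the PMID3/gene2 entry (added only to show that unrelated hypotheses are filtered out) are hypothetical and come from neither GO nor OBI.

```python
# Child relation -> parent relation.  The names follow the examples in the
# text; the table itself is a hypothetical illustration.
PARENT = {
    "supports": "is-relevant-to",
    "contradicts": "is-relevant-to",
    "supports-via-direct-assay": "supports",
    "contradicts-via-mutant-phenotype": "contradicts",
}


def is_a(relation, ancestor):
    """True if relation equals ancestor or descends from it via PARENT."""
    while relation is not None:
        if relation == ancestor:
            return True
        relation = PARENT.get(relation)
    return False


# (evidence, epistemic relation, hypothesis) triples.
# PMID3 and gene2 are made up, only to show filtering by hypothesis.
ASSERTIONS = [
    ("PMID1", "supports-via-direct-assay",
     ("gene1", "participates-in", "process1")),
    ("PMID2", "contradicts",
     ("gene1", "participates-in", "process1")),
    ("PMID3", "supports",
     ("gene2", "participates-in", "process1")),
]


def evidence_relevant_to(hypothesis, relation="is-relevant-to"):
    """All evidence linked to the hypothesis by relation or any subrelation."""
    return [
        ev for ev, rel, hyp in ASSERTIONS
        if hyp == hypothesis and is_a(rel, relation)
    ]
```

Asking for "is-relevant-to" returns both the supporting and the contradicting evidence for a hypothesis, while asking for "supports" narrows the result to the supporting evidence, including evidence asserted with a more specific subrelation.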

The situation with universals is analogous to the instance case above,  
although we have never before mentioned evidence (as codes or  
otherwise) regarding relationships among universals.  The reason for  
that is the "reality-based" desideratum, and the consequent assumption  
that an ontology's contents never need evidence (and can't be contradicted).  
The proposal for a set of epistemic terms with a domain of OBI  
DENRIEs and a range of OBI hypotheses seems plausible to me at  
this point.

There are at least two objections that I can see that need to  
be addressed.  First, these relations are most clearly second-order,  
and cannot be represented in OWL-DL (although they seem to pose no  
challenges in OWL-FULL).  Second, it's not clear if every proposition  
with which one might want to associate evidence is properly considered  
a subclass of OBI hypothesis.   If we were to turn all of the GO  
annotations into explicit relationship assertions (using "participates  
in", "is located in" etc.) would we want all of those propositions to  
be subclasses of OBI hypothesis?   If not, we need to come up with  
something else to define the range of the epistemic relations.

Larry


