[Ontology-editors] Automating regulation terms (and combinatorial terms in general)
Chris Mungall
cjm at berkeleybop.org
Thu Aug 14 18:01:38 PDT 2008
I'm kind of embarrassed that poor curators like Ruth have to type out
sourceforge requests like this:
* https://sourceforge.net/tracker/index.php?func=detail&aid=2045154&group_id=36855&atid=440764
And then someone has to manually add this into the graph, being careful
to get the is_a links correct.
Regulation term creation (and much xp creation in general) should be
far more automatable. This will be easier with OE2 when we can have
the intersection_of lines live in the main ontology, but I think there
is a lot we can do now.
The current methodology is to parse terms into xp defs, run the
reasoner via script, and have people look through reports. I don't
think this is scalable. The first time we did this, we ended up having
to fix 1000 links. Looking at the current reports[*], there are nearly
1000 more links that need fixing. This translates to an alarmingly high
rate of false negatives in search results and analyses.
David and Tanya like to look through the reports, as examining the
reasoner results individually allows them to find errors in the source
graph. This happens if someone puts the biologically correct link in
the regulation graph but neglects to do the equivalent in the source
graph. The fact that this is happening at all is worrying - no one should
be editing the regulation graph manually!
It would be really easy to automate things more, even whilst waiting
for OE2.
One idea would be to have a file in CVS that just contains a list of
terms or term requests. E.g.
regulation of blood vessel remodeling
negative regulation of blood vessel remodeling
positive regulation of blood vessel remodeling
Or it could just be the source term, with the request for 3 regulation
terms being implicit.
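A minimal sketch of the implicit variant, where one source term stands for
all three regulation requests. The function name is hypothetical and not
part of any existing GO tooling.

```python
def regulation_requests(source_term):
    """Expand one source term name into the three implied regulation term names."""
    prefixes = ("regulation of", "negative regulation of", "positive regulation of")
    return [f"{prefix} {source_term}" for prefix in prefixes]


for name in regulation_requests("blood vessel remodeling"):
    print(name)
```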
A nightly script would generate obo including synonyms, standard defs
and correct placement in the obo file. It would write the unparseables
to a report file. It could even be automatically merged but I'm
guessing folks would prefer some kind of oversight. Either way it
should just be a simple script to merge the results in. We could
probably use obomerge.
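A rough sketch of what the core of such a nightly script might look like.
Everything here is an assumption made for illustration: the relation names,
the wording of the standard defs, the TEMP: ID scheme, the EX: placeholder
IDs, and the function name. Synonym generation and the merge step are left
out.

```python
# Name prefix -> (relation, standard definition template). The relation names
# and definition wording are assumptions made for this sketch.
PATTERNS = {
    "regulation of ": (
        "regulates",
        "Any process that modulates the frequency, rate or extent of {}."),
    "negative regulation of ": (
        "negatively_regulates",
        "Any process that stops, prevents, or reduces the frequency, rate or extent of {}."),
    "positive regulation of ": (
        "positively_regulates",
        "Any process that activates or increases the frequency, rate or extent of {}."),
}


def build_stanzas(requests, name_to_id, genus_id, genus_name):
    """Return (stanzas, unparsed) for a list of requested term names."""
    stanzas, unparsed = [], []
    for name in requests:
        for prefix, (relation, template) in PATTERNS.items():
            target = name[len(prefix):]
            # Parseable only if the prefix matches and the target term exists.
            if name.startswith(prefix) and target in name_to_id:
                temp_id = f"TEMP:{len(stanzas) + 1:07d}"  # hypothetical temporary ID
                stanzas.append("\n".join([
                    "[Term]",
                    f"id: {temp_id}",
                    f"name: {name}",
                    f'def: "{template.format(target)}" []',
                    f"intersection_of: {genus_id} ! {genus_name}",
                    f"intersection_of: {relation} {name_to_id[target]} ! {target}",
                ]))
                break
        else:
            # No pattern matched: this name goes to the report file.
            unparsed.append(name)
    return stanzas, unparsed


stanzas, unparsed = build_stanzas(
    ["regulation of blood vessel remodeling", "some unparseable request"],
    {"blood vessel remodeling": "EX:0000001"},  # placeholder ID, not a real GO ID
    "EX:0000000", "biological regulation",      # placeholder genus ID
)
print("\n\n".join(stanzas))
print("unparsed:", unparsed)
```

Keeping the unparsed names in a separate list is what lets the script write
a report instead of failing on the first bad request.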
Folks should also be aware that as far as "standard" regulation terms
go, OE2 has a lot of useful functionality. For example, you can just
create a bunch of regulation terms under biological process, run the
semantic parser (under the reasoner menu) to generate the xp defs, run
the reasoner, then select "assert implied links". The main fiddly step
here is that the intersection_of lines have to be stripped before
going into the editors version. Should be possible with save filters.
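If save filters turn out to be awkward, the stripping step could also be
done outside OE2 with plain text filtering. This is a swapped-in
alternative, not the save-filter approach, and it assumes the implied links
have already been asserted as is_a lines before the file is stripped. The
function name is hypothetical.

```python
def strip_intersection_of(obo_text):
    """Drop every intersection_of line, leaving all other tags untouched."""
    kept = [line for line in obo_text.splitlines()
            if not line.startswith("intersection_of:")]
    return "\n".join(kept) + "\n"


example = """[Term]
id: EX:0000002
name: regulation of blood vessel remodeling
is_a: EX:0000000 ! biological regulation
intersection_of: EX:0000000 ! biological regulation
intersection_of: regulates EX:0000001 ! blood vessel remodeling
"""
print(strip_intersection_of(example))
```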
I'm leaning towards doing this out of oboedit. The file-in-cvs
approach fits well with the annotator submission model. It's easily
extensible beyond standard regulation terms. It does introduce a
"fork" in the term submission process, but I think it creates less
work and fewer errors all round. I could also implement the whole thing
in < 5 mins.
A variation is a web form that allows you to build xps and then submit
them.
Thoughts? Call?
Cheers
Chris
[*] http://wiki.geneontology.org/index.php/XP:biological_process_xp_regulation#Availability