[Ontology-editors] Automating regulation terms (and combinatorial terms in general)

Chris Mungall cjm at berkeleybop.org
Thu Aug 14 18:01:38 PDT 2008


I'm kind of embarrassed that poor curators like Ruth have to type out  
sourceforge requests like this:

* https://sourceforge.net/tracker/index.php?func=detail&aid=2045154&group_id=36855&atid=440764

And then someone has to manually add this into the graph, being careful  
to get the is_a links correct.

Regulation term creation (and much xp creation in general) should be  
far more automatable. This will be easier with OE2 when we can have  
the intersection_of lines live in the main ontology, but I think there  
is a lot we can do now.

The current methodology is to parse terms into xp defs, run the  
reasoner via script, and have people look through reports. I don't  
think this is scalable. The first time we did this, we ended up having  
to fix 1000 links. Looking at the current reports[*] there are nearly  
1000 more links that need fixing. This translates to an alarmingly high  
rate of false negatives in search results and analyses.

David and Tanya like to look through the reports as examining the  
reasoner results individually allows them to find errors in the source  
graph. This happens if someone puts the biologically correct link in  
the regulation graph but neglects to do the equivalent in the source  
graph. The fact this is happening at all is worrying - no one should  
be editing the regulation graph manually!

It would be really easy to automate things more, even whilst waiting  
for OE2.

One idea would be to have a file in CVS that just contains a list of  
terms or term requests. E.g.

	regulation of blood vessel remodeling
	negative regulation of blood vessel remodeling
	positive regulation of blood vessel remodeling

Or it could just be the source term, with the request for 3 regulation  
terms being implicit.
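
For the implicit variant, the expansion step could be as small as the
sketch below. The function name and the prefix list are made up for
illustration; they are not part of any existing script.

```python
# Hypothetical helper: expand one source term into the three
# regulation term names that a request for it would imply.
PREFIXES = (
    "regulation of",
    "negative regulation of",
    "positive regulation of",
)


def expand_source_term(source_term):
    """Return the three implied regulation term names, in PREFIXES order."""
    target = source_term.strip()
    return [f"{prefix} {target}" for prefix in PREFIXES]
```

Given "blood vessel remodeling", this yields the three names listed
above.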

A nightly script would generate obo including synonyms, standard defs  
and correct placement in the obo file. It would write the unparseables  
to a report file. It could even be automatically merged but I'm  
guessing folks would prefer some kind of oversight. Either way it  
should just be a simple script to merge the results in. We could  
probably use obomerge.
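
As a rough sketch of what the core of such a nightly script might do
(not an existing implementation), the Python below turns requested
names into obo stanzas with xp defs and collects the unparseables
separately. Everything in it is an assumption made for illustration:
the function name, the relation names, the use of GO:0065007 as the
genus, the def wording and the synonym patterns. ID allocation and the
name-to-ID lookup are passed in by the caller, and placement in the
file is not covered.

```python
import re

# Assumed pattern for "standard" regulation term names.
REGULATION_RE = re.compile(r"^(?:(positive|negative) )?regulation of (.+)$")

# Assumed genus for the xp def; check against the real xp file.
GENUS = "GO:0065007 ! biological regulation"

# direction -> (relation, def template, synonym templates); all illustrative.
TEMPLATES = {
    None: (
        "regulates",
        "Any process that modulates the frequency, rate or extent of {t}.",
        [],
    ),
    "positive": (
        "positively_regulates",
        "Any process that activates or increases the frequency, rate or "
        "extent of {t}.",
        ["up regulation of {t}", "upregulation of {t}"],
    ),
    "negative": (
        "negatively_regulates",
        "Any process that stops, prevents or reduces the frequency, rate or "
        "extent of {t}.",
        ["down regulation of {t}", "downregulation of {t}"],
    ),
}


def build_stanzas(requests, name_to_id, new_ids):
    """Return (stanzas, unparseable) for an iterable of requested names.

    name_to_id maps existing term names to IDs; new_ids is an iterator
    that yields fresh IDs. Names that do not match the pattern, or whose
    target term is unknown, go to the unparseable list.
    """
    stanzas, unparseable = [], []
    for line in requests:
        name = line.strip()
        if not name:
            continue
        match = REGULATION_RE.match(name)
        target_id = name_to_id.get(match.group(2)) if match else None
        if target_id is None:
            unparseable.append(name)
            continue
        target = match.group(2)
        relation, definition, synonyms = TEMPLATES[match.group(1)]
        lines = [
            "[Term]",
            f"id: {next(new_ids)}",
            f"name: {name}",
            f'def: "{definition.format(t=target)}" []',
        ]
        lines += [f'synonym: "{s.format(t=target)}" EXACT []' for s in synonyms]
        lines += [
            f"intersection_of: {GENUS}",
            f"intersection_of: {relation} {target_id} ! {target}",
        ]
        stanzas.append("\n".join(lines))
    return stanzas, unparseable
```

The stanzas carry only the xp def; the is_a links would still come
from running the reasoner over the result, and the unparseable list is
what would be written to the report file.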

Folks should also be aware that as far as "standard" regulation terms  
go OE2 has a lot of useful functionality. For example, you can just  
create a bunch of regulation terms under biological process, run the  
semantic parser (under the reasoner menu) to generate the xp defs, run  
the reasoner, then select "assert implied links". The main fiddly step  
here is that the intersection_of lines have to be stripped before  
going into the editors version. Should be possible with save filters.
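
If save filters turn out to be awkward, the stripping step could also
be done as a plain text filter outside the editor. This is a minimal
sketch, assuming the implied is_a links have already been asserted and
that each tag sits at the start of its own line; the function name is
made up.

```python
def strip_intersection_of(obo_text):
    """Return obo_text with every intersection_of line removed.

    All other lines, including blank lines between stanzas, are kept
    unchanged and in their original order.
    """
    return "".join(
        line
        for line in obo_text.splitlines(keepends=True)
        if not line.startswith("intersection_of:")
    )
```

Because it only drops whole lines, the rest of the file stays
byte-for-byte the same, which should keep the CVS diffs readable.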

I'm leaning towards doing this outside of OBO-Edit. The file-in-cvs  
approach fits well with the annotator submission model. It's easily  
extensible beyond standard regulation terms. It does introduce a  
"fork" in the term submission process, but I think it creates less  
work and fewer errors all round. I could also implement the whole thing  
in < 5 mins.

A variation is a web form that allows you to build xps and then submit  
them.

Thoughts? Call?

Cheers
Chris

[*] http://wiki.geneontology.org/index.php/XP:biological_process_xp_regulation#Availability


