[Ontology-editors] Automating regulation terms (and combinatorial terms in general)

Midori Harris midori at ebi.ac.uk
Fri Aug 15 06:09:47 PDT 2008


On Thu, 14 Aug 2008, Chris Mungall wrote:

> I'm kind of embarrassed that poor curators like Ruth have to type out 
> sourceforge requests like this:
>
> * https://sourceforge.net/tracker/index.php?func=detail&aid=2045154&group_id=36855&atid=440764
>
> And then someone has to manually add this into the graph, being careful to 
> get the is_a links correct.

Indeed. I've been moaning about this for years; you may recall my feature 
request on the subject (which may be addressed or superseded by actual OE2 
functionality):

https://sourceforge.net/tracker/index.php?func=detail&aid=1563000&group_id=36855&atid=418260

>
> Regulation term creation (and much xp creation in general) should be far more 
> automatable. This will be easier with OE2 when we can have the 
> intersection_of lines live in the main ontology, but I think there is a lot 
> we can do now.
>
> The current methodology is to parse terms into xp defs, run the reasoner via 
> script, and have people look through reports. I don't think this is scalable. 
> The first time we did this, we ended up having to fix 1000 links. Looking at 
> the current reports[*] there are nearly 1000 more links needing to be fixed. This 
> translates to an alarmingly high rate of false negatives in search results 
> and analyses.

Yeah, I've been thinking of those reports as an interim approach to be 
phased out when intersection links and OE2 go live.

> David and Tanya like to look through the reports as examining the reasoner 
> results individually allows them to find errors in the source graph. This 
> happens if someone puts the biologically correct link in the regulation graph 
> but neglects to do the equivalent in the source graph. The fact this is 
> happening at all is worrying - no one should be editing the regulation graph 
> manually!
>
> It would be really easy to automate things more, even whilst waiting for OE2.
>
> One idea would be to have a file in CVS that just contains a list of terms or 
> term requests. E.g.
>
> 	regulation of blood vessel remodeling
> 	negative regulation of blood vessel remodeling
> 	positive regulation of blood vessel remodeling
>
> Or it could just be the source term, with the request for 3 regulation terms 
> being implicit.
>
> A nightly script would generate obo including synonyms, standard defs and 
> correct placement in the obo file. It would write the unparseables to a 
> report file. It could even be automatically merged but I'm guessing folks 
> would prefer some kind of oversight. Either way it should just be a simple 
> script to merge the results in. We could probably use obomerge.

Sounds good. I vote for curator oversight.
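
A minimal sketch of what such a nightly script could look like, with the 
output left for a curator to review before merging. Everything here is an 
assumption for illustration, not an existing GO script: the function names, 
the TEMP: placeholder ids, the relation names, the use of GO:0065007 as the 
genus, and the definition wording are not taken from this thread.

```python
import re

# Assumed name patterns, relation names and definition templates.
# None of these are specified in the thread; adjust to the real conventions.
PATTERNS = [
    (re.compile(r"^negative regulation of (.+)$"), "negatively_regulates",
     "Any process that stops, prevents or reduces the frequency, rate or "
     "extent of {target}."),
    (re.compile(r"^positive regulation of (.+)$"), "positively_regulates",
     "Any process that activates or increases the frequency, rate or "
     "extent of {target}."),
    (re.compile(r"^regulation of (.+)$"), "regulates",
     "Any process that modulates the frequency, rate or extent of {target}."),
]


def parse_request(name):
    """Return (relation, target name, definition), or None if unparseable."""
    for pattern, relation, def_template in PATTERNS:
        match = pattern.match(name.strip())
        if match:
            target = match.group(1)
            return relation, target, def_template.format(target=target)
    return None


def build_stanzas(request_lines, name_to_id, first_new_id=1):
    """Turn requested term names into OBO stanzas plus an unparseable list.

    name_to_id maps existing term names to their ids (hypothetical input,
    e.g. built from the current ontology file).
    """
    stanzas, unparseable = [], []
    next_id = first_new_id
    for line in request_lines:
        name = line.strip()
        if not name:
            continue  # skip blank lines in the request file
        parsed = parse_request(name)
        if parsed is None or parsed[1] not in name_to_id:
            unparseable.append(name)  # goes to the report file
            continue
        relation, target, definition = parsed
        stanzas.append("\n".join([
            "[Term]",
            f"id: TEMP:{next_id:07d}",  # placeholder id, assigned for real later
            f"name: {name}",
            f'def: "{definition}" []',
            "intersection_of: GO:0065007 ! biological regulation",
            f"intersection_of: {relation} {name_to_id[target]} ! {target}",
        ]))
        next_id += 1
    return stanzas, unparseable
```

The stanzas would be written to a file for review and the unparseable names 
to a separate report; synonym generation and is_a placement by the reasoner 
are left out of this sketch.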

> Folks should also be aware that as far as "standard" regulation terms go OE2 
> has a lot of useful functionality. For example, you can just create a bunch 
> of regulation terms under biological process, run the semantic parser (under 
> the reasoner menu) to generate the xp defs, run the reasoner, then select 
> "assert implied links". The main fiddly step here is that the intersection_of 
> lines have to be stripped before going into the editors version. Should be 
> possible with save filters.

I don't think any of us have enough experience using the semantic parser 
to do this yet. It sounds promising, though, so we should learn our way 
around it. The OE2 user guide has no documentation for the semantic parser 
(just an empty placeholder page for the s.p. manager), so that will have 
to be written, and not by me, as a first step.
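
Until someone has worked out the save filters, the stripping step could also 
be done outside OBO-Edit with a plain text filter. A minimal sketch, assuming 
the file is ordinary OBO text with one tag per line; the function name is 
hypothetical and this is not how OE2 itself does it.

```python
def strip_intersections(obo_text):
    """Return obo_text with every intersection_of line removed."""
    kept = [
        line for line in obo_text.splitlines()
        if not line.startswith("intersection_of:")
    ]
    result = "\n".join(kept)
    # Preserve a trailing newline if the input had one.
    if obo_text.endswith("\n"):
        result += "\n"
    return result
```

This would be run on the reasoned file after "assert implied links", so that 
the asserted is_a links are kept and only the xp definitions are dropped 
from the editors version.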

> I'm leaning towards doing this out of oboedit. The file-in-cvs approach fits 
> well with the annotator submission model. It's easily extensible beyond 
> standard regulation terms. It does introduce a "fork" in the term submission 
> process, but I think it creates less work and fewer errors all round. I could 
> also implement the whole thing in < 5 mins.

It would make more sense to think of it as a "fork" in the handling, not 
the submission, of term requests. I would expect, and encourage, anyone 
requesting a new term to use the SF tracker as usual. A couple of reasons:

  - Most SF submissions come from annotators. I don't want them to have to 
worry about submitting some requests in SF and others somewhere else, with 
a different mechanism. Also, some of them don't have, or want, go cvs 
write access.

  - I'd like to carry on using SF as a (nearly) one-stop archive of 
requests.

> A variation is a web form that allows you to build xps and then submit them.
>
> Thoughts? Call?

I'm all for anything that would make it easier to add regulation terms! 
The file-in-cvs approach is fine with me. Those of us who claim SF items 
(for regulation terms, mainly David & Tanya these days, though I've certainly 
done my share in the not-too-distant past) would be the ones writing to 
the new script-fodder file in CVS.

thanks for the suggestions!
m

>
> Cheers
> Chris
>
> [*] http://wiki.geneontology.org/index.php/XP:biological_process_xp_regulation#Availability

