[Ontology-editors] cross products

Chris Mungall cjm at berkeleybop.org
Mon Apr 6 11:31:38 PDT 2009


The answer is that it's a mix, depending on the xp set. *Ultimately* the
goal is to never have to parse a term again, and to create xp defs
prospectively rather than retrospectively (we can still use obol for
*generation* of names, defs and synonyms, based on the newly created
xp def). We're nearer this goal with some xp sets than with others.

For example, the inter-organism regulation ones are quite hard, and we
probably have a few iterations to go. Beyond improving the grammar,
these iterations are useful because they clarify many parts of the
ontology and improve the consistency of naming.

The bp_xp_self set, which you are working on, is a special case. There
is a long tail of terms that can be trivially parsed; for example "X
during Y" is translated to "an X that is part_of Y". These have high
accuracy. But there are other terms in your set for which I liberally
applied grammatical rules I knew to be less accurate. One such rule
involved using a relational adjective to guess an xp def, resulting in

GO:0048007 ! antigen processing and presentation, exogenous lipid antigen via MHC class Ib
synonym: "exogenous lipid antigen processing and presentation via MHC class Ib" ==>
intersection_of: GO:0048003 ! antigen processing and presentation of lipid antigen via MHC class Ib
intersection_of: part_of GO:0042638 ! exogen

(currently in the vetted file)

This is quite obviously wrong! I knew from the outset that there would
be many incorrect ones in the bp_xp_self set. I did a first-pass filter
myself, but probably not a good enough job. So in some ways your set
is the hardest, as it has the highest ratio of nonsense. We should
perhaps invert the procedure for you, such that everything starts off
unvetted and must be explicitly vetted.
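
To make the inverted procedure concrete, here is a rough sketch in
Python, not anything that exists in the current tooling. The function
name, the dict layout and the EX: ids are all made up for illustration;
the only point is that an xp def counts as vetted only if its term id
has been explicitly listed.

```python
def split_by_vetting(candidates, vetted_ids):
    """Partition candidate xp defs so that nothing is vetted by default.

    candidates: dict mapping a term id to its proposed intersection_of lines
                (hypothetical layout, assumed for this sketch).
    vetted_ids: set of term ids that an editor has explicitly approved.
    """
    vetted, unvetted = {}, {}
    for term_id, xp_def in candidates.items():
        # A candidate is accepted only when its id was explicitly approved.
        target = vetted if term_id in vetted_ids else unvetted
        target[term_id] = xp_def
    return vetted, unvetted


# Placeholder ids and definitions, not real GO content.
candidates = {
    "EX:0000001": ["intersection_of: EX:0000010", "intersection_of: part_of EX:0000020"],
    "EX:0000002": ["intersection_of: EX:0000011", "intersection_of: part_of EX:0000021"],
}
vetted, unvetted = split_by_vetting(candidates, {"EX:0000001"})
```

With this arrangement a nonsense def that nobody has looked at stays in
the unvetted pile rather than slipping into the vetted file.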

The goal in this case is not to improve the obol rule: it's simply not
possible to get a 100% accurate xp def from terms such as this. Obol
is really for the low-hanging fruit (LHF), the trivially composable
terms.
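
As an illustration of what "trivially composable" means, the "X during
Y" rule mentioned earlier could be sketched roughly as below. This is
Python rather than obol's actual grammar, which is much richer than one
regular expression; the function name, the name-to-id lookup and the
EX: ids are assumptions made for the example.

```python
import re

# Hypothetical pattern for names of the form "X during Y".
DURING = re.compile(r"^(?P<genus>.+?) during (?P<whole>.+)$")


def parse_during(name, name_to_id):
    """Return obo-style intersection_of lines for an "X during Y" name.

    Returns None when the name does not match the pattern or when either
    part is not an existing term name in name_to_id.
    """
    m = DURING.match(name)
    if m is None:
        return None
    genus, whole = m.group("genus"), m.group("whole")
    # Only emit an xp def when both parts resolve to known terms.
    if genus not in name_to_id or whole not in name_to_id:
        return None
    return [
        f"intersection_of: {name_to_id[genus]} ! {genus}",
        f"intersection_of: part_of {name_to_id[whole]} ! {whole}",
    ]


# Placeholder term names and ids, not real GO entries.
ids = {"cell growth": "EX:0000001", "cell division": "EX:0000002"}
xp_def = parse_during("cell growth during cell division", ids)
```

A name that does not fit the pattern, or whose parts are not existing
terms, simply yields no xp def, which is the behaviour we want for the
low-hanging fruit: high accuracy on the easy cases and silence on the
rest.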

We'll probably revisit the example above when we return to do
bp_xp_pro. For now, just delete erroneous xp defs such as the one above.

On Apr 6, 2009, at 5:27 AM, Jennifer Deegan (nee Clark) wrote:

> Hi,
>
> David and I were editing some of my cross products together on  
> Friday and we realised that we had different ideas of what we were  
> meant to do.  I just wanted to check that we are on the right lines  
> now, and Chris asked me to do it on list so we are all up to date on  
> the correct scheme.
>
> I had assumed that the intersection files that Chris generated with  
> obol were a one-off, and that we were to hand-edit the stanzas and  
> put the resulting corrected intersection tags into the vetted file.
>
> David thought that the plan was for us to correct the live GO and  
> feed back on obol's rules so that autogeneration would work  
> correctly for all future runs.
>
> Which one is right?
>
> Thanks,
>
> Jen


