[Ontology-editors] cross products

Jennifer Deegan (nee Clark) jdeegan at ebi.ac.uk
Tue Apr 7 02:19:56 PDT 2009


Hi Chris,

Thanks for explaining. In cases where obol was wrong-footed by problems
in the graph, or by rules that were not quite right, I'm assuming that
for my file I should:

1) Fix the live graph.
2) Feed back to you on how rules could be improved.
3) Fix the intersection tags by hand and then move these from unvetted to
vetted.

Is that right? It doesn't make sense to me to have you regenerate the 
intersection tags using obol when we could fix them by hand fairly quickly.

Thanks,

Jen



Chris Mungall wrote:

> 
> The answer is it's a mix, depending on the xp set. *Ultimately* the
> goal is to never have to parse a term again and to create xp defs
> prospectively rather than retrospectively (we can still use obol for
> *generation* of names, defs and synonyms, based on the newly created
> xp def). We're nearer this goal with some xp sets than with others.
> 
> For example, the inter-organism regulation ones are quite hard, and we
> probably have a few iterations to go. Besides improving the grammar,
> these iterations are quite useful, as they clarify many parts of the
> ontology and improve consistency of naming.
> 
> The bp_xp_self which you are working on is a special case. There is a  
> long tail of terms that can be trivially parsed; for example "X during  
> Y" is translated to "an X that part_of Y". These have a high accuracy.  
> But there are other terms in your set for which I liberally applied  
> grammatical rules I knew to be less accurate. One such rule involved  
> using a relational adjective to guess an xp def, resulting in
> 
> GO:0048007 ! antigen processing and presentation, exogenous lipid antigen via MHC class Ib
> synonym: "exogenous lipid antigen processing and presentation via MHC class Ib" ==>
> intersection_of: GO:0048003 ! antigen processing and presentation of lipid antigen via MHC class Ib
> intersection_of: part_of GO:0042638 ! exogen
> 
> (currently in the vetted file)
> 
> This is quite obviously wrong!! I knew from the outset that there would
> be many incorrect ones in the bp_xp_self set; I did a first pass filter
> myself, but probably not a good enough job. So in some ways your set is
> the hardest, as it has the highest ratio of nonsense. We should perhaps
> invert the procedure for you, such that everything starts off unvetted
> and must be explicitly vetted.
> 
> The goal in this case is not to improve the obol rule: it's simply not
> possible to get a 100% accurate xp def from terms such as this. Obol is
> really for the low hanging fruit (LHF), the trivially composable terms.
> 
> We'll probably revisit the example above when we return to do
> bp_xp_pro. For now, just delete erroneous xp defs such as the one above.
> 
> On Apr 6, 2009, at 5:27 AM, Jennifer Deegan (nee Clark) wrote:
> 
>> Hi,
>>
>> David and I were editing some of my cross products together on Friday
>> and we realised that we had different ideas of what we were meant to
>> do. I just wanted to check that we are on the right lines now, and
>> Chris asked me to do it on list so we are all up to date on the
>> correct scheme.
>>
>> I had assumed that the intersection files that Chris generated with
>> obol were a one-off, and that we were to hand edit the stanzas and
>> put the resulting corrected intersection tags into the vetted file.
>>
>> David thought that the plan was for us to correct the live GO and
>> feed back on obol's rules so that autogeneration would work correctly
>> for all future runs.
>>
>> Which one is right?
>>
>> Thanks,
>>
>> Jen
>>
>>
>> _______________________________________________
>> Ontology-editors mailing list
>> Ontology-editors at geneontology.org
>> http://fafner.stanford.edu/mailman/listinfo/ontology-editors
>>
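
For illustration, the trivially parsable "X during Y" case that Chris
describes above might be sketched in Python roughly as follows. This is
not obol's actual grammar or code: the function name parse_during, the
regular expression, and the EX: identifiers are all made up for the
example, and the output only mimics the intersection_of lines quoted
above.

```python
import re

# Hypothetical pattern for term names of the form "X during Y".
# This is an assumption for illustration, not obol's real rule.
DURING = re.compile(r"^(?P<genus>.+?) during (?P<whole>.+)$")


def parse_during(term_name, name_to_id):
    """Turn an "X during Y" name into "an X that part_of Y".

    Returns two OBO-style intersection_of lines, or None when the name
    does not match or when X or Y is not a known term name.
    """
    match = DURING.match(term_name)
    if match is None:
        return None
    genus = match.group("genus")
    whole = match.group("whole")
    if genus not in name_to_id or whole not in name_to_id:
        # Refuse to guess: unknown parts are left for hand curation.
        return None
    return [
        f"intersection_of: {name_to_id[genus]} ! {genus}",
        f"intersection_of: part_of {name_to_id[whole]} ! {whole}",
    ]


# Placeholder names and identifiers, invented for this example only.
example_ids = {"process A": "EX:0000001", "process B": "EX:0000002"}

result = parse_during("process A during process B", example_ids)
if result is not None:
    print("\n".join(result))
```

A name that does not follow the pattern, or whose parts are not known
terms, yields None here, so it would simply stay out of the generated
set and be left for hand editing.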

-- 
Jennifer Deegan (nee Clark)
EMBL-European Bioinformatics Institute
Gene Ontology Consortium

