[Ontology-editors] cross products

Chris Mungall cjm at berkeleybop.org
Tue Apr 7 09:15:12 PDT 2009


Sounds good

On Apr 7, 2009, at 2:19 AM, Jennifer Deegan (nee Clark) wrote:

> Hi Chris,
>
> Thanks for explaining. In cases where obol was wrong-footed by
> problems in the graph, or by rules that were not quite right, I'm
> assuming that for my file I should:
>
> 1) Fix the live graph.
> 2) Feed back to you on how rules could be improved.
> 3) Fix the intersection tags by hand and then move these from
> unvetted to vetted.
>
> Is that right? It doesn't make sense to me to have you regenerate  
> the intersection tags using obol when we could fix them by hand  
> fairly quickly.
>
> Thanks,
>
> Jen
>
>
>
> Chris Mungall wrote:
>
>> The answer is it's a mix, depending on the xp set. *Ultimately* the
>> goal is to never have to parse a term again and to create xp defs
>> prospectively rather than retrospectively (we can still use obol for
>> *generation* of names, defs and synonyms, based on the newly created
>> xp def). We're nearer this goal with some xp sets than with others.
>>
>> For example, the inter-organism regulation ones are quite hard, and
>> we probably have a few iterations to go. These iterations are quite
>> useful beyond improving the grammar, as they clarify many parts of
>> the ontology and also the consistency of naming.
>>
>> The bp_xp_self set, which you are working on, is a special case.
>> There is a long tail of terms that can be trivially parsed; for
>> example "X during Y" is translated to "an X that part_of Y". These
>> have a high accuracy. But there are other terms in your set for
>> which I liberally applied grammatical rules I knew to be less
>> accurate. One such rule involved using a relational adjective to
>> guess an xp def, resulting in
>>
>> GO:0048007 ! antigen processing and presentation, exogenous lipid
>> antigen via MHC class Ib
>> synonym: "exogenous lipid antigen processing and presentation via
>> MHC class Ib" ==>
>> intersection_of: GO:0048003 ! antigen processing and presentation
>> of lipid antigen via MHC class Ib
>> intersection_of: part_of GO:0042638 ! exogen
>> (currently in the vetted file)
>>
>> This is quite obviously wrong!! I knew from the outset that there
>> would be many incorrect ones in the bp_xp_self set. I did a first
>> pass filter myself, but probably not a good enough job. So in some
>> ways your set is the hardest, as it has the highest ratio of
>> nonsense. We should perhaps invert the procedure for you, such that
>> everything starts off unvetted and must be explicitly vetted.
>>
>> The goal in this case is not to improve the obol rule: it's simply
>> not possible to get a 100% accurate xp def from terms such as this.
>> Obol is really for the low hanging fruit (LHF), the trivially
>> composable terms.
>>
>> We'll probably revisit the example above when we return to do
>> bp_xp_pro. For now, just delete erroneous xp defs such as the one
>> above.
>>
>> On Apr 6, 2009, at 5:27 AM, Jennifer Deegan (nee Clark) wrote:
>>> Hi,
>>>
>>> David and I were editing some of my cross products together on
>>> Friday and we realised that we had different ideas of what we were
>>> meant to do. I just wanted to check that we are on the right lines
>>> now, and Chris asked me to do it on list so we are all up to date
>>> on the correct scheme.
>>>
>>> I had assumed that the intersection files that Chris generated
>>> with obol were a one-off, and that we were to hand edit the
>>> stanzas and put the resulting corrected intersection tags into
>>> the vetted file.
>>>
>>> David thought that the plan was for us to correct the live GO and
>>> feed back on obol's rules so that autogeneration would work
>>> correctly for all future runs.
>>>
>>> Which one is right?
>>>
>>> Thanks,
>>>
>>> Jen
>>>
>>>
>>> _______________________________________________
>>> Ontology-editors mailing list
>>> Ontology-editors at geneontology.org
>>> http://fafner.stanford.edu/mailman/listinfo/ontology-editors
>>>
>
> -- 
> Jennifer Deegan (nee Clark)
> EMBL-European Bioinformatics Institute
> Gene Ontology Consortium
>
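
For readers following the thread, the "X during Y" rule mentioned in the quoted message (a name of that shape becomes "an X that part_of Y") can be sketched roughly as below. This is only an illustration and not how obol itself is implemented: the function name, the label lookup table and the EX: identifiers are all hypothetical placeholders, and only the intersection_of line layout is taken from the example in the thread.

```python
import re

# Hypothetical label -> ID table. The EX: IDs are placeholders for
# illustration only; they are not real GO identifiers.
LABEL_TO_ID = {
    "cell growth": "EX:0000001",
    "development": "EX:0000002",
}

# Matches names of the form "X during Y".
DURING = re.compile(r"^(?P<genus>.+) during (?P<whole>.+)$")


def xp_def_for_during(name, label_to_id):
    """Return intersection_of lines for an 'X during Y' name, else None.

    The genus X becomes the first intersection_of line and Y becomes a
    part_of differentia, mirroring the stanza layout quoted in the thread.
    """
    match = DURING.match(name)
    if match is None:
        return None
    genus = match.group("genus")
    whole = match.group("whole")
    # Refuse to guess when either label is unknown, so that nothing
    # doubtful is emitted for a curator to clean up later.
    if genus not in label_to_id or whole not in label_to_id:
        return None
    return [
        f"intersection_of: {label_to_id[genus]} ! {genus}",
        f"intersection_of: part_of {label_to_id[whole]} ! {whole}",
    ]
```

Returning None for anything that does not match exactly keeps the sketch limited to the trivially composable names; names needing a looser rule would be left for hand editing, as the thread suggests.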


