[Ontology-editors] Wikipedia

Chris Mungall cjm at berkeleybop.org
Tue Jan 27 13:26:47 PST 2009


On Jan 27, 2009, at 12:51 PM, Jane Lomax wrote:

> Looks fine to me - looks like they're all exact text matches?

Yes. I suspect there's no need for fancy NLP stemming techniques or  
anything like that, as Wikipedia uses sensible naming rules, as we do.
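
For anyone wanting to reproduce this kind of matching, here is a rough
sketch in Python. It is not the script used to build the attached list.
The function names, the abbreviation heuristic (short, all-uppercase
labels) and the sample data are assumptions for illustration only.
Lower-casing both sides is also a simplification, since Wikipedia titles
are case-sensitive after the first letter.

```python
def is_abbreviation(label):
    # Assumed heuristic: short, purely alphanumeric, all upper case (e.g. "SRP").
    return len(label) <= 6 and label.isalnum() and label.isupper()


def normalize(title):
    # Wikipedia page names use underscores where labels use spaces.
    return title.replace("_", " ").strip().lower()


def match_terms(terms, wiki_titles):
    """Match term labels to page titles by exact (normalized) text.

    terms: dict mapping a term ID to its labels (name plus exact synonyms).
    wiki_titles: iterable of page titles.
    Returns (matches, omitted): matches maps term ID to the first title hit,
    omitted lists (term ID, label) pairs skipped as abbreviations.
    """
    index = {normalize(t): t for t in wiki_titles}
    matches = {}
    omitted = []
    for term_id, labels in terms.items():
        for label in labels:
            if is_abbreviation(label):
                # Abbreviations tend to land on disambiguation pages, so skip.
                omitted.append((term_id, label))
                continue
            hit = index.get(normalize(label))
            if hit is not None:
                # Keep the first hit per term.
                matches.setdefault(term_id, hit)
    return matches, omitted


# Illustrative data only: the full label below is a stand-in, not taken
# from the attached list.
sample_terms = {"GO:0005786": ["signal recognition particle", "SRP"]}
sample_titles = ["Signal_recognition_particle", "Ribosome"]
matches, omitted = match_terms(sample_terms, sample_titles)
for term_id, label in omitted:
    print("omitting abbrev:", term_id, label)
```

No stemming is involved: a label either equals a page title after
normalization or it does not match at all.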

> Might be able to pull out some more links via the definition dbxrefs  
> - I know I use Wikipedia as a ref quite a lot. Those would obviously  
> need more vetting though.

These are just the generic ones, pointing to:
http://www.wikipedia.org/

We could use these to check for xrefs that the string matching failed  
to find.

I propose that the Wikipedia xrefs go in with the IDspace Wikipedia:  
rather than as full URLs - I've already taken the liberty of adding this  
to GO.xrf_abbs
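
To illustrate what the IDspace form buys us, here is a small sketch in
Python of expanding a prefixed xref into a link. The URL prefix and the
names ID_SPACES and expand_xref are assumptions for illustration. The
real prefix is whatever the Wikipedia entry in GO.xrf_abbs specifies.

```python
from urllib.parse import quote

# Assumed mapping from IDspace to URL prefix; in practice this would be
# read from the abbreviations file rather than hard-coded.
ID_SPACES = {
    "Wikipedia": "http://en.wikipedia.org/wiki/",
}


def expand_xref(xref):
    """Turn 'IDspace:local_id' into a full URL using ID_SPACES."""
    prefix, sep, local_id = xref.partition(":")
    if not sep or prefix not in ID_SPACES:
        raise ValueError("unknown or malformed xref: %r" % xref)
    # Page names use underscores in place of spaces; quote() percent-encodes
    # any remaining characters that are unsafe in a URL path.
    return ID_SPACES[prefix] + quote(local_id.replace(" ", "_"))


print(expand_xref("Wikipedia:Signal_recognition_particle"))
```

Storing only the short form means a change of base URL is a one-line
edit to the mapping rather than an edit to every xref.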

>
> Jane
>
> On Mon, 26 Jan 2009, Chris Mungall wrote:
>
>>
>> Link should work now.
>>
>> List for CC attached. Let me know if an alternate format (e.g. html  
>> page) is preferable. I could possibly slurp in the first paragraph  
>> from the wikipage into the table to help.
>>
>> This is based on exact synonyms only. I omitted abbreviations since  
>> these most probably go to the wikipedia disambiguation page.
>>
>> omitting abbrev: GO:0000407 PAS
>> omitting abbrev: GO:0000500 UAF
>> omitting abbrev: GO:0005628 FSM
>> omitting abbrev: GO:0005786 SRP
>> omitting abbrev: GO:0005854 NAC
>> omitting abbrev: GO:0008180 CSN
>> omitting abbrev: GO:0015627 MTB
>> omitting abbrev: GO:0016010 DGC
>> omitting abbrev: GO:0022623 PAN
>> omitting abbrev: GO:0030895 APOBEC
>> omitting abbrev: GO:0030896 CCC
>> omitting abbrev: GO:0030907 MBF
>> omitting abbrev: GO:0031431 DDK
>> omitting abbrev: GO:0031510 SAE
>> omitting abbrev: GO:0032046 MIPA
>> omitting abbrev: GO:0033597 MCC
>> omitting abbrev: GO:0034515 PSG
>> omitting abbrev: GO:0034751 AHRC
>> omitting abbrev: GO:0034992 MAS
>> omitting abbrev: GO:0035061 ICG
>> omitting abbrev: GO:0035145 EJC
>> omitting abbrev: GO:0035301 HSC
>> omitting abbrev: GO:0042101 TCR
>>
>>
>
> -- 
> Dr Jane Lomax
> GO Editorial Office
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridgeshire, UK
> CB10 1SD
>
> p: +44 1223 492516
> f: +44 1223 494468
>


