[go] Do you upgrade InterPro2GO IEA annotations to ISS?
Valerie Wood
val at sanger.ac.uk
Wed Oct 10 10:01:32 PDT 2007
Hi Emily,
I don't make a conscious effort to 'convert these to ISS', but I have
filtering in place so that if a manual annotation exists, the electronic
annotations are suppressed.
Of 28666 possible mappings (SPKW and InterPro), 4454 are
retained post filtering. So most InterPro annotations are already
represented by an existing annotation (either experimental or from an
ISS to an ortholog with experimental evidence).
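As a rough illustration of this kind of filter, here is a minimal Python sketch. It assumes an electronic annotation is suppressed only when a manual annotation exists for exactly the same gene and GO term; the real filter may well be broader (for example, also matching more specific terms). The Annotation class, the filter_electronic function and the gene names are hypothetical and used only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    gene: str      # gene identifier (hypothetical values below)
    go_id: str     # GO term ID
    evidence: str  # evidence code, e.g. "IEA", "ISS", "IDA"


def filter_electronic(electronic, manual):
    """Return the electronic annotations not already covered manually.

    Assumption: "covered" means a manual annotation exists for the
    same (gene, GO term) pair.
    """
    covered = {(a.gene, a.go_id) for a in manual}
    return [a for a in electronic if (a.gene, a.go_id) not in covered]


if __name__ == "__main__":
    manual = [Annotation("geneA", "GO:0004672", "IDA")]
    electronic = [
        Annotation("geneA", "GO:0004672", "IEA"),  # suppressed
        Annotation("geneB", "GO:0004672", "IEA"),  # retained
    ]
    for a in filter_electronic(electronic, manual):
        print(a.gene, a.go_id, a.evidence)
```

Whatever survives the filter is the set that still needs a look, either to keep as IEA or to report as an inappropriate mapping.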
Some terms which come through the filter are clearly not appropriate
mappings and these are submitted to the SF annotation tracker
https://sourceforge.net/tracker/?group_id=36855&atid=605890
to be removed. This is usually because the mapping does not apply to the
entire family (i.e. it is obviously too broad).
Occasionally, annotations are made manually to a domain or family with
ISS InterPro (174) or Pfam (1240). This is usually because the
ortholog cannot be identified unambiguously, but all family
members are considered to have this process or function,
e.g. protein kinase for PF00069 or transcription factor for PF00172.
Sometimes I will make these annotations when I have built a family for
Pfam, before it has been given any mappings by the InterPro database,
so they are not directly InterPro mappings as such (I am just using the
alignment as the object in the "with" column, because it is better
evidence than a single homolog). I use GOC:unpublished in the ref
column when I do this.
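To make the column usage concrete, here is a small Python sketch of what such a family-based ISS annotation might carry. It shows only the fields discussed in this thread and is not a complete annotation file line; the function name, the field names and the gene identifier are hypothetical. GO:0004672 (protein kinase activity) is used as the term that goes with PF00069.

```python
def family_iss_annotation(gene_id, go_id, family_acc):
    """Build a simplified record for a manual ISS annotation to a family.

    Only the fields mentioned in the thread are included; a real
    annotation file has more columns.
    """
    return {
        "gene": gene_id,
        "go_id": go_id,
        "reference": "GOC:unpublished",  # ref column, as described above
        "evidence": "ISS",
        "with": family_acc,              # the family, not a single homolog
    }


if __name__ == "__main__":
    record = family_iss_annotation(
        gene_id="gene0001",              # hypothetical identifier
        go_id="GO:0004672",              # protein kinase activity
        family_acc="Pfam:PF00069",
    )
    print("\t".join(record.values()))
```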
I guess in answer to "What data sources do you look at to confirm
whether these electronic predictions are correct" most of the errors
are when a mapping has been made to a family which contains several
subfamilies and only the members of one subfamily have been shown to
have a particular process. It needs to be reasonably conclusive that all
members of the family are likely to have the applied mapping, so family
size is usually a consideration. A very general rule of thumb would be
that a large enzyme family is unlikely to be accurately mapped to a
very specific activity or process. However, a family with uniform
conservation can often have more granular mappings.
This process seems to work really well for identifying mapping errors.
If you look at a few of the examples on the tracker it would probably be
clearer.
Val
E Dimmer wrote:
> Hi,
>
> A question for those GO annotation groups that manually assess
> InterPro2GO 'IEA' predictions, and when appropriate, convert them into
> manual ISS annotations.
>
> I would be very grateful if you could let me know what criteria your
> group uses to evaluate InterPro2GO annotations. What data sources do
> you look at to confirm whether these electronic predictions are
> correct and what information does your final annotation include? (i.e.
> I assume that the InterPro ID would go into the 'with' column, but
> what would be cited in the reference column - do you have a GO
> reference?), also how long does this process take? Am I right in
> thinking that S. Pombe and DictyBase groups carry out these kinds of
> annotations?
>
> I have just been asked this by a group who are considering whether
> they could carry out this kind of assessment while annotating their
> new genome.
>
> Thanks,
> Emily
>