[go] Putting method/program names into the with field for ISS
Ben Hitz
hitz at genome.Stanford.EDU
Fri Sep 28 14:32:19 PDT 2007
>
First point: I don't think there is any need to separate "sequence"
from "structure", since sequence is merely a proxy for the chemical
structure, so I will use "sequence" to mean both. If anyone has a
contrary opinion, I would like to hear it.
> I think it is important to be able to distinguish methods that are
> based on just sequence analysis from everything else and that ISS
> should be the code to describe this.
I don't necessarily disagree - but I don't take this as a given
either. WHY is this important? And why SOLELY sequence analysis and
not partially sequence analysis?
Here are some things I think might be important:
o That the association is based on some computational theory, not an
experiment (so it would not fall under the proposed EXP hierarchy).
o That in cases where an association is transferred from a specific
gene product or family of gene products, the "transferree" is
mentioned.
o whether or not the association has been reviewed by a curator
o whether or not the method has been reviewed by a curator (sub case
if the above is not true)
o whether or not this is a (computational) prediction based on
combining several sources of data (aka "Bayesian Blah Blah Blah")
> I think we need at least 2 categories:
> -one for all things sequence based (ISS or whatever new name might
> be created)
> -one for combinatorial analyses that bring together different types
> of information to reach a conclusion (ICA/RCA)
You are not accounting for other non-sequence, non-combinatorial
analyses. For example, there are many algorithms that infer
biological process from patterns of physical interactions. While this
seems to me to belong in your 2nd class (non-sequence), it is based
on only 1 source of data.
>
> If people feel there should be a code for alignments and only
> alignments then we will need to split the sequence-based category
> into 2 which would then give us 3 total:
> -orthology based evidence
> -all other sequence based evidence
> -combinatorial analyses that bring together different types of
> information to reach a conclusion (ICA/RCA)
>
> I favor the first option (2 categories, not 3) as I think it is
> cleaner and easier for people to understand. If we feel the need
> to change the name of ISS to reflect this more encompassing
> definition, then OK, but that brings another whole can of worms
> with it (what about legacy data, will the community have a cow, etc.)
There is a practical issue you are overlooking. It is very important
that we capture WITH information for certain types of homology- or
similarity-based methods of inference. So important that your
association will be tossed back by Mike if you don't provide this
information.
I would say this is necessary for:
1) all pairwise sequence alignment methods
2) all "curated ortholog" methods (subset of 1, above)
3) all protein-family assignment based methods (Pfam, SMART, ProDom)
So, for the above, WITH information is mandatory. For other methods
it isn't. It is much, much easier from a practical standpoint to
mandate WITH for all of evidence code X, rather than mandate WITH for
some complicated subset of evidence code X.
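
To make the practical point concrete, here is a minimal sketch of the
kind of check I have in mind, assuming the rule is simply "WITH is
mandatory for every annotation carrying a given evidence code". The
Annotation class, the WITH_REQUIRED set and the missing_with function
are all hypothetical names made up for illustration; this is not how
any existing submission filter is actually written.

```python
from dataclasses import dataclass

# Assumption: the set of evidence codes for which WITH is mandatory.
# Only ISS is listed here, purely for illustration.
WITH_REQUIRED = {"ISS"}


@dataclass
class Annotation:
    """Hypothetical, stripped-down view of one association line."""
    gene_product: str
    go_id: str
    evidence_code: str
    with_field: str = ""


def missing_with(annotations, required=WITH_REQUIRED):
    """Return the annotations whose evidence code requires WITH
    but whose WITH field is empty or whitespace only."""
    return [
        a for a in annotations
        if a.evidence_code in required and not a.with_field.strip()
    ]
```

With a rule like this the check is a single set lookup per line. If
WITH were mandatory only for some subset of methods inside one code,
the check would also need to know which method produced each
association, which is exactly the information we do not capture today.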
Should we take this to the evidence-code mailing list?
Ben
--
Ben Hitz
Senior Scientific Programmer ** Saccharomyces Genome Database ** GO
Consortium
Stanford University ** hitz at genome.stanford.edu