Hi Karen,
Here are a couple of ppt slides from the talk I gave at the biocurator
meeting...and that segment of the paper is below. [see bolded part at
end of section]. The point is that if Shh was not annotated to 'heart
development' in the GO, then the effect of Shh as measured by this
experiment would not be recovered in any GO-based analysis, and thus is
'missing data'. Shh has many effects on development described largely as
the phenotypic results of knocking out the gene, but the exact
functioning of Shh in the particular instance of affecting heart
development is unknown.
Judy
Hill, D., Smith, B., McAndrews-Hill, M., Blake, J.A., 2008. Gene Ontology
Annotations: What they mean and where they come from. BMC Bioinformatics
9(Suppl 5):S2.
Biological Process Annotation
A molecular function instance is the enduring potential of a gene
product instance to act in a certain way. A biological process instance
is the execution of one or more such molecular function instances
working together to accomplish a certain biological objective. A
biological process instance is at the cellular or organismal level of
granularity what the execution of a function is at the level of the
molecule. There is a relationship between molecular functions and
biological processes. At this time this relationship is not represented
explicitly in GO. From a gene annotation perspective, we are interested
in going beyond the instance-instance relations at the cell- or
organism-level; we would like to infer type-type relations which
link gene product types at the molecular level of granularity to process
types at the level of the cell or organism. We are interested in the
fact that molecules of a given gene product type can be associated with
instances of a molecular function type (known or unknown) whose
execution contributes to the occurrence of a biological process of a
given type. Inferences about such type-type relations can be made
because experiments are designed to test what transpires when specified
biological conditions are satisfied in typical circumstances --
circumstances in which, as a result of the efforts of the experimenter,
disturbing events do not interfere. Experiments are designed to be
reproducible and predictive, describing the instances that one would
expect to find in biological systems meeting the defined conditions. If
future experiments show that preceding experiments did not describe the
intended typical situation, then the conclusions from the preceding
experiments are questioned and may be reanalyzed and reinterpreted, or
even rejected entirely, and the corresponding annotations then need to
be amended accordingly.
Annotations in this way sometimes point to errors in the type-type
relationships described in the ontology. An example is the recent
removal of the type /serotonin secretion/ as an is_a child of
/neurotransmitter secretion/ from the GO Biological Process ontology.
This modification was made as a result of an annotation from a paper
showing that serotonin can be secreted by cells of the immune system
where it does /not/ act as a neurotransmitter. Associations between gene
products and biological processes, too, can be detected experimentally.
When instances of biological process type /P/ are detected, either by
direct observation or by experimental assay, as being associated with
instances of a given gene product type /M/, then this justifies the
assertion of that sort of association between /M/ and /P/ which is
called a biological process annotation.
*For those species of organisms where the tools of genetic study can be
successfully applied, the association of gene product types with
biological process types is usually achieved through the study of the
perturbations of biological processes following genetic mutation.
Curators use the IMP evidence code for these annotations. Figure 3 shows
an example of a mutational analysis done by Washington-Smoak /et al/ on
the effects of a mutation of the /Shh/ gene on mouse heart development
[11]. The left panel shows an image of a heart with normal copies of the
gene (WT) at 16.5 days of embryogenesis; the right panel shows a heart
with defective copies of the gene at 16.5 days of embryogenesis. The
figure clearly illustrates that the development of the outflow tracts of
the heart is defective in the embryo with the defective gene. The GO
Biological Process ontology defines the type /heart development/ as:
'the process whose specific outcome is the progression of the heart over
time, from its formation to the mature structure. The heart is a hollow,
muscular organ, which, by contracting rhythmically, keeps up the
circulation of the blood.'
Based on the mutational study reported in Washington-Smoak /et al/, an
MGI curator has made an annotation linking /heart development/ to the /Shh/ gene
using the IMP evidence code (Fig. 1). This annotation rests on the
identification in the normal animal of a molecule of the product of the
/Shh/ gene with a molecular function whose execution contributes to an
occurrence of the biological process /heart development/. We know that
the biological process /heart development/ exists because we observe it
in the normal animal. We know that a molecule of SHH contributes to this
process because when we take away all instances of the gene product of
the /Shh/ gene in an animal, the process of heart development is
disturbed. The annotation thus affirms that a molecule of SHH protein
has the potential to execute a molecular function that contributes to an
instance of the type /heart development/ in the Biological Process
ontology. We also generalize that the execution of the molecular
function of a molecule of SHH in a given mouse will in some way
contribute to the development of that mouse's heart. However, the
results of any phenotypic assay are limited to the resolution of the
phenotype itself. In the experiment described above, we have validated
the biological process, but cannot make any direct inferences about the
nature of the function executed. It is for this and other practical
reasons that the molecular function and biological process ontologies
were developed independently.*
Karen Christie wrote:
> Hi Judy,
>
> I'm glad to hear that isn't what you meant. This is the part of your
> email that gave me that impression:
>
> "the SGD genes would I think appear under-annotated if the effect of
> the gene on phenotype is not curated in BP. For comparative genomics
> studies using GO, this would be missing"
>
> If you want to clarify what you meant, I'd be happy to think about how
> that fits in with how SGD has been thinking about curating phenotype
> data and whether or not to make GO annotations from it.
>
> thanks,
>
> -Karen
>
>
> On Wed, 9 Jul 2008, Judith Blake wrote:
>
>> Hi Karen,
>>
>> I'm not sure what made you think that I meant that process annotations
>> should be made from all mutant phenotypes...I certainly didn't mean
>> to suggest that. judy
>>
>> Karen Christie wrote:
>>> I don't think the GOC has ever had a policy, or even a
>>> recommendation, that process annotations should be made from all
>>> mutant phenotypes, nor do I think that it should.
>>>
>>> For example, SGD is currently working on annotating phenotypes for
>>> Cell Division Cycle (CDC) mutants, i.e. mutations which cause a cell
>>> cycle arrest phenotype. Here are some of the ones I worked on
>>> yesterday:
>>>
>>> CDC60 leucyl tRNA synthetase
>>> PRT1 Subunit of eIF3
>>> ALA1 alanyl-tRNA synthetase
>>> CDC65 mitochondrial tRNA-Glu
>>> SPT16 Subunit FACT transcription elongation complex
>>>
>>> I don't think that anyone in the yeast community would expect or
>>> want to see any of these genes annotated to a GO process related to
>>> the cell cycle. There are lots of examples of where a mutant
>>> phenotype is due to some downstream effect and not due to the
>>> primary defect.
>>>
>>> So, at SGD, we try to focus on the primary process. Obviously, we
>>> don't always know, but once we do, we like to avoid making GO
>>> annotations for processes that are known to be downstream, rather
>>> than direct, results of the mutation.
>>>
>>> For Doug's specific example, if comparative data suggested that the
>>> gene was a specific regulatory transcription factor, I'd probably be
>>> inclined to go ahead and make specific process annotations. However,
>>> if comparative data suggested that it was related to a Pol II
>>> general transcription factor, I might not want to make a GO process
>>> annotation to such a specific process.
>>>
>>> At all of the Annotation Camps, we've always said that one should be
>>> careful when making annotations from mutant phenotypes. At both of
>>> the public ones, the question has come up of how much to annotate
>>> from mutant phenotypes. The answer we've given has been that if one
>>> only has a mutant phenotype to annotate from, then make the best
>>> annotations you can. However, be aware that as you learn more, you
>>> may find that some of the mutant phenotypes are indirect results
>>> rather than something the gene product is directly involved in, and
>>> that in these cases you may choose to remove process annotations
>>> based on these phenotypes.
>>>
>>> I think this is still good advice, that curator judgement should
>>> play a role in deciding whether a GO process annotation is merited
>>> from any particular mutant phenotype.
>>>
>>> -Karen
>>>
>>>
>>> On Sun, 6 Jul 2008, Judith Blake wrote:
>>>
>>>> I can understand the duplication of effort, but since the GO and
>>>> phenotype annotations aren't co-mingled in GOdb, the SGD genes
>>>> would I think appear under-annotated if the effect of the gene on
>>>> phenotype is not curated in BP. For comparative genomics studies
>>>> using GO, this would be missing, yet available in the literature,
>>>> information.
>>>>
>>>> for mouse, the phenotype data is effectively 'dysfunction' data, so
>>>> the phenotype annotation reflects a different view from the GO
>>>> annotation.
>>>>
>>>> Judy
>>>>
>>>> Julie Park wrote:
>>>>> Hi Doug,
>>>>>
>>>>> SGD's practice on this is that if it is known that what is being
>>>>> observed is a secondary/downstream effect, then we only capture it
>>>>> via phenotypes and not as a GO process. However, if the gene
>>>>> product in question is not well characterized or there is a
>>>>> conflict in the literature about whether it is a direct or
>>>>> indirect involvement then we would give it a GO annotation.
>>>>>
>>>>> We've made a decision to use GO to try and capture the primary
>>>>> role of a gene product as much as possible and to reduce the
>>>>> duplication of effort required to capture data both in GO and as
>>>>> phenotypes.
>>>>>
>>>>> Just our take on things.
>>>>>
>>>>> Regards,
>>>>> -Julie
>>>>>
>>>>>
>>>>> On Jul 3, 2008, at 3:16 PM, Doug Howe wrote:
>>>>>
>>>>>> Hi David,
>>>>>> It still seems like there is a line that has to be drawn somewhere.
>>>>>> We've talked in the past about the scope of a process...when does it
>>>>>> start and when does it end? A gene that has as its primary role
>>>>>> regulation of transcription (perhaps binds DNA etc. etc.) may have a
>>>>>> secondary effect upon eye morphogenesis. However, the process of eye
>>>>>> morphogenesis does not start with the binding of such a gene to a
>>>>>> regulatory sequence...it is a downstream consequence....and perhaps it
>>>>>> is the gene whose expression is being regulated that is really involved
>>>>>> in the downstream process. It seems like there is a significant amount
>>>>>> of redundant curation work to do if we always annotate both GO and
>>>>>> phenotype using the same GO process terms. I'm not strongly opposed to
>>>>>> such annotations, I just want to revisit the discussion and see if
>>>>>> anyone has other views on the issue.
>>>>>> -Doug
>>>>>>
>>>>>> David Hill wrote:
>>>>>>> Doug,
>>>>>>>
>>>>>>> I do this all the time. I just finished systematically doing all
>>>>>>> the homeobox genes in mouse. Many of them are annotated to
>>>>>>> things like pattern specification. I think in the future, it
>>>>>>> will be very nice to know these are playing roles in regulating
>>>>>>> transcription but that regulation is fundamental in other
>>>>>>> processes as well.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> Doug Howe wrote:
>>>>>>>> I'm still struggling with the issue of whether to make a GO
>>>>>>>> annotation (processes in particular) or only phenotype
>>>>>>>> annotation. The zebrafish literature is replete with mutant
>>>>>>>> papers that often describe phenotypes involving eyes, otic
>>>>>>>> vesicles, or pharyngeal arches, organ development etc. Often,
>>>>>>>> the IEA annotations for a gene seem to indicate that the gene
>>>>>>>> is binding DNA, and may be some sort of transcriptional
>>>>>>>> regulator. Should such a gene be annotated with GO terms like
>>>>>>>> 'otic vesicle development', or 'eye morphogenesis', or should
>>>>>>>> that be left for phenotype annotations?
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Doug Howe, Ph.D.
>>>>>> ZFIN Scientific Curator
>>>>>> Zebrafish Nomenclature Coordinator
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Annotation mailing list
>>>>>> Annotation at geneontology.org
>>>>>> http://fafner.stanford.edu/mailman/listinfo/annotation
>>>>>
>>>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ExampleIMP.ppt
Type: application/vnd.ms-powerpoint
Size: 734720 bytes
Desc: not available
URL: <http://fafner.stanford.edu/pipermail/annotation/attachments/20080710/fda985aa/attachment-0001.ppt>