[Go] Spliceform column in GAF

Chris Mungall cjm at berkeleybop.org
Wed Sep 10 11:29:42 PDT 2008


On Sep 10, 2008, at 11:13 AM, Rama Balakrishnan wrote:

> Chris,
>
> I am trying to understand this proposal. So please bear with me.
>
> Column 17 is optional and column 12 is mandatory. If you don't know  
> the spliceform of the product in col 2 and hence leave col 17 as  
> blank, then what would you put in column 12?

Protein, if it's a protein-coding gene; RNA if it's RNA-coding, and so on.

I didn't document the case where the gene has not been molecularly  
characterized and we have IGI annotations. I'm open to suggestions  
here. Perhaps the best thing is to allow this column to be blank if we  
truly do not know if the gene is protein coding or not.

However, if the gene is known to be protein coding yet the particular  
spliceform is not known, then the type column should be protein.
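
A minimal sketch of the rule as described so far, in Python. The function
name, the argument names, and the literal values "protein" and "RNA" are
assumptions made for illustration; they are not taken from the proposal, and
the blank fallback is only the suggestion floated above, not agreed policy.

```python
from typing import Optional

# Illustrative mapping only; the proposal may define different type labels.
CODING_KIND_TO_TYPE = {
    "protein_coding": "protein",
    "rna_coding": "RNA",
}


def choose_type_column(spliceform_type: Optional[str],
                       gene_coding_kind: Optional[str]) -> str:
    """Pick a value for column 12 (type) of a GAF line.

    spliceform_type: type of the spliceform named in column 17, or None
        when column 17 is left blank.
    gene_coding_kind: "protein_coding", "rna_coding", or None when the gene
        has not been molecularly characterized.
    """
    # Column 17 is filled in: column 12 reflects that spliceform's type.
    if spliceform_type:
        return spliceform_type
    # Column 17 is blank but the kind of gene is known: fall back to it.
    if gene_coding_kind in CODING_KIND_TO_TYPE:
        return CODING_KIND_TO_TYPE[gene_coding_kind]
    # Nothing is known about the product: leave column 12 blank (open question).
    return ""
```

Under this sketch, a protein-coding gene with column 17 left blank gets
"protein" in column 12, and an uncharacterized gene gets an empty column 12.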

> Because as I understand the proposal, what is in Col12 should  
> reflect the type of the spliceform in col 17?

Yes, but we're making the open-world assumption here: absence of data  
does not mean an absence of the entity in reality.

Thanks for the questions. It looks like the document needs work to make  
it more readable.

We only had a very cursory discussion of the type column in SLC, and  
we certainly didn't give people time to absorb the ramifications.

>
> Thanks,
>
> Rama
>
> On Sep 10, 2008, at 10:51 AM, Chris Mungall wrote:
>
>>
>> If a group submitted annotations for two records corresponding to  
>> the same gene, this would violate the non-redundancy requirement. The  
>> most likely way for this to happen would be when a MOD submits  
>> annotations to both UniProtKB IDs and MOD IDs.
>>
>> On Sep 10, 2008, at 10:34 AM, Stoddard, Alexander wrote:
>>
>>> I do not clearly understand the following part of the spec regarding
>>> non-redundant canonical entities:
>>>
>>> "In addition the GAF must be non-redundant with respect to canonical
>>> entities in a genome"
>>>
>>> Chris, would you please give an example of how a GAF file could be
>>> redundant with respect to canonical entities and how to correct the
>>> example?
>>>
>>> Thank you,
>>> Alex Stoddard
>>>
>>>
>>> -----Original Message-----
>>> From: go-bounces at genome.stanford.edu
>>> [mailto:go-bounces at genome.stanford.edu] On Behalf Of Chris Mungall
>>> Sent: Tuesday, September 09, 2008 5:30 PM
>>> To: go list; Paul D Thomas
>>> Subject: [Go] Spliceform column in GAF [was Re: [Gofriends] Redundancy
>>> in go_XXXXXX-assocdb-tables/dbxref.txt]
>>>
>>> [redirected to GO]
>>>
>>> The change Mike speaks of is for the new spliceform column in the  
>>> GAF.
>>>
>>> I have specced this out here:
>>>
>>> 	http://wiki.geneontology.org/index.php/GAF_Spliceform_Column_Proposal
>>>
>>> Note that most of you will have read the previous document describing
>>> current practices for annotating alternate spliceforms:
>>>
>>> 	http://wiki.geneontology.org/index.php/Annotation_of_Alternate_Spliceforms
>>>
>>> But you won't have read the fully formulated proposal, as I only put
>>> it on the wiki today.
>>>
>>> Note that this proposal was ratified at the SLC GOC meeting, but the
>>> majority of the discussion was at the RefG portion of the meeting.
>>> It's particularly important that folks who weren't at this part read
>>> and understand the proposal. Ratification at the GOC meeting may have
>>> been premature as I only intended to sketch out a solution
>>> collaboratively at that meeting.
>>>
>>> Once the above wiki page is in shape, we should send an announcement
>>> to gofriends (promptly, as it is of relevance to the current
>>> discussion below), all data providers and consumers, and then after
>>> that in the newsletter and on the main GO docs.
>>>
>>> As Mike says, we are aiming for an introduction some time in 2009. It's
>>> important that anyone involved with producing GAFs is aware of the
>>> changes and is OK with this timetable.
>>>
>>> Cheers
>>> Chris
>>>
>>> On Sep 9, 2008, at 1:22 PM, Mike Cherry wrote:
>>>
>>>> There is a change coming to the format of the gene association file
>>>> which will solve this problem.  Annotations to proteins, genes,
>>>> transcripts, etc. for a particular locus will be identified as such.
>>>> The change should occur in 2009.
>>>>
>>>> -Mike
>>>>
>>>>
>>>>> From: "Quaid Morris" <quaid.morris at gmail.com>
>>>>> To: "Gabriel Berriz" <gberriz at hms.harvard.edu>
>>>>> Subject: Re: [Gofriends] Redundancy in go_XXXXXX-assocdb-tables/dbxref.txt
>>>>> Cc: gofriends at genome.stanford.edu
>>>>>
>>>>> Hi Gabriel,
>>>>>
>>>>> It looks like in the example that you gave RGD ID 1302948 is a gene ID
>>>>> and ENSRNOP00000034933 is a protein ID.  Are all your examples like
>>>>> this?  Maybe there are circumstances when it's possible to annotate a
>>>>> specific isoform and others when only the gene can be annotated.
>>>>>
>>>>> Q
>>>>>
>>>> _______________________________________________
>>>> Gofriends mailing list
>>>> Gofriends at geneontology.org
>>>> http://fafner.stanford.edu/mailman/listinfo/gofriends
>>>>
>>>
>>> _______________________________________________
>>> Go mailing list
>>> Go at geneontology.org
>>> http://fafner.stanford.edu/mailman/listinfo/go
>>>
>>
>
>


