[go] Re: additional checks for gene association files
Mike Cherry
cherry at stanford.edu
Mon Oct 1 08:24:28 PDT 2007
There were no objections to these additional checks, either via email
or at the GOC meeting.
I have just committed the new version of the filtering script so
these new checks are now in place. Any files that change this week
will be subject to these checks, and all files (except GOA Uniprot)
will be filtered on Saturday (as is normal practice).
-Mike
On Sep 18, 2007, at 2:04 PM, Mike Cherry wrote:
> I have an update ready for the script that checks the syntax of the
> submitted gene association files. Each of the checks below has either
> been discussed and approved or drew no comment. I'll wait until after
> the GOC meetings next week to put this into production. Please let me
> know if you have any problems with these.
>
> -Mike
>
>
> 1. If a database name is included in DB_OBJECT_ID, it must be a
> valid name found in the go/doc/GO.xrf_abbs file. This check should
> have been in place last year, but I found a bug that caused it
> never to run. Currently none of the gene association files have this
> type of problem.
>
> 2. Check for double colons ('::') in the DB_OBJECT_ID, GOID,
> REFERENCE, WITH and TAXON ID fields. If a double colon is found, that
> line is not included in the filtered output and an error message is
> created. There are only two of these errors in the current files, one
> in GeneDB_Tbrucei and the other in RGD.
>
> 3. Check for multiple DB_OBJECT_SYMBOLs associated with a
> DB_OBJECT_ID. This error was reported by Gavin Sherlock and
> discussed in late July and early August. There was no comment, so
> I'm assuming everyone agrees this is an error. The checking script
> will allow one symbol to be associated with an ID. If a second
> symbol is found, the lines containing the second (or third, ...)
> symbol will not be included in the filtered file, and an error is
> created in the report. At the moment there are errors of this type
> in the RGD and WB files. There were also errors in the pseudocap
> file, but I've fixed those myself because that file is not active.
>
>
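
The three checks quoted above can be sketched roughly as follows. This is
only an illustration in Python, not the actual filtering script. The
function name filter_lines, the valid_dbs argument (standing in for the
abbreviations listed in go/doc/GO.xrf_abbs) and the column positions are
assumptions made for the example. The message does not say whether a line
failing check 1 is dropped; the sketch assumes it is. The column-count
guard is an extra that is not one of the three checks.

```python
# Assumed 0-based column positions in a tab-delimited gene association line.
DB_OBJECT_ID, DB_OBJECT_SYMBOL, GOID, REFERENCE, WITH, TAXON = 1, 2, 4, 5, 7, 12

# Fields that check 2 scans for a double colon.
DOUBLE_COLON_FIELDS = {
    "DB_OBJECT_ID": DB_OBJECT_ID,
    "GOID": GOID,
    "REFERENCE": REFERENCE,
    "WITH": WITH,
    "TAXON": TAXON,
}


def filter_lines(lines, valid_dbs):
    """Return (kept_lines, errors); errors is a list of (line_number, message)."""
    kept, errors = [], []
    first_symbol = {}  # DB_OBJECT_ID -> first DB_OBJECT_SYMBOL seen for it

    for lineno, line in enumerate(lines, start=1):
        if line.startswith("!"):  # comment/header lines pass through
            kept.append(line)
            continue

        cols = line.rstrip("\n").split("\t")
        if len(cols) <= TAXON:  # guard for the sketch, not one of the checks
            errors.append((lineno, "too few columns"))
            continue

        obj_id = cols[DB_OBJECT_ID]
        symbol = cols[DB_OBJECT_SYMBOL]

        # Check 1: a database prefix in DB_OBJECT_ID must be a known name.
        if ":" in obj_id:
            prefix = obj_id.split(":", 1)[0]
            if prefix not in valid_dbs:
                errors.append((lineno, "unknown database name: " + prefix))
                continue  # assumption: the offending line is dropped

        # Check 2: no '::' in any of the listed fields.
        bad = [name for name, i in DOUBLE_COLON_FIELDS.items() if "::" in cols[i]]
        if bad:
            errors.append((lineno, "double colon in " + ", ".join(bad)))
            continue

        # Check 3: only the first symbol seen for an ID is allowed.
        expected = first_symbol.setdefault(obj_id, symbol)
        if symbol != expected:
            errors.append((lineno, "multiple symbols for " + obj_id))
            continue

        kept.append(line)

    return kept, errors
```

In this sketch the first symbol encountered for an ID wins, which matches
the behaviour described in item 3: later lines carrying a different symbol
are left out of the filtered output and reported.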