[Go] go db issues

Benjamin Hitz hitz at genome.stanford.edu
Mon Apr 13 09:15:28 PDT 2009



Update: This failed again this morning. As near as I can tell, the
part of the sequence loading that uses esearch/efetch at NCBI is not
working. My guess (from some hand-rolled tests) is that esearch is
failing but still returning a valid WebEnv key. That key, when used with
efetch, returns enough of GenBank to break everything.
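A minimal sketch of the kind of guard that might catch this before efetch is ever called. It assumes the esearch XML reply carries Count, QueryKey and WebEnv as direct children of the root, and reports problems in ERROR or ErrorList elements. The function name check_esearch and the max_expected parameter are made up for illustration and are not part of the loader.

```python
import xml.etree.ElementTree as ET


def check_esearch(xml_text, max_expected):
    """Return (web_env, query_key, count), or raise ValueError if the
    esearch reply looks failed or implausibly large.

    Assumption: element names follow the esearch XML output as I understand it.
    """
    root = ET.fromstring(xml_text)

    # Collect any explicit error reports from the reply.
    errors = [e.text or "" for e in root.iter("ERROR")]
    error_list = root.find("ErrorList")
    if error_list is not None:
        errors += [f"{child.tag}: {child.text}" for child in error_list]
    if errors:
        raise ValueError("esearch reported errors: " + "; ".join(errors))

    count_text = root.findtext("Count")
    web_env = root.findtext("WebEnv")
    query_key = root.findtext("QueryKey")
    if count_text is None or not web_env or not query_key:
        raise ValueError("esearch reply is missing Count, WebEnv or QueryKey")

    # A WebEnv key alone proves nothing: also check the hit count is sane.
    count = int(count_text)
    if count == 0 or count > max_expected:
        raise ValueError(f"unexpected hit count {count} (limit {max_expected})")
    return web_env, query_key, count
```

The point of the count check is that a key which looks valid can still refer to a result set far larger than the handful of proteins that were asked for.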

I suppose I could just comment out the NCBI sequence loading... that's  
going to lose quite a bit though.  I will run a test on the dev  
database and give a report.

Ben

On Apr 10, 2009, at 3:17 PM, Benjamin Hitz wrote:

> Just FYI, I am currently debugging an issue with the GOLITE sequence
> loading (the golite load failed early this morning; this is the load
> that amigo uses).
>
> Apparently, NCBI has somehow altered their output content or format  
> such that it runs the loader completely out of RAM - even when  
> requesting 73 or so proteins.
>
> The command in question is:
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&db=protein&retmax=500&WebEnv=0BUt3HnRet03iYBWkOas0DxsyJKMQliURHWxwJ8fWngZjPLNInSiXg8XrHz@03EE1B259DFC1621_0084SID&rettype=genbank&query_key=1&retstart=0
>
> Where WebEnv is set up by the previous query to esearch.fcgi...
>
> Currently the first wget is returning a 200+ MB XML file... and it's
> not done yet...
>
> My choices at this point are basically to either disable sequence
> loading for golite (as is already the case for go full), or disable
> golite loading altogether until I can figure out what the correct
> NCBI jiggery-pokery is.
>
> Ben
> --
> Ben Hitz
> Senior Scientific Programmer ** Saccharomyces Genome Database ** GO  
> Consortium
> Stanford University ** hitz at genome.stanford.edu
>
>
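On the out-of-memory problem in the quoted message: one possible safeguard, sketched here under assumptions, is to read the efetch response in chunks and give up once it exceeds a size limit, rather than letting the loader take in whatever arrives. The function read_capped and its parameters are hypothetical and not taken from the loader.

```python
import io


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary file-like object in chunks.

    Raises ValueError as soon as more than max_bytes have arrived, so an
    unexpectedly huge response cannot exhaust memory.
    """
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"response exceeded {max_bytes} bytes; aborting")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # Stand-in for a network response; any object with read(n) works.
    small = io.BytesIO(b"LOCUS example\n//\n")
    print(len(read_capped(small, max_bytes=1024)))
```

The same function should work on the object returned by urllib.request.urlopen, since that also exposes read(n). A sensible limit for 73 or so GenBank protein records would have to be picked by hand.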

--
Ben Hitz
Senior Scientific Programmer
Saccharomyces Genome Project
Stanford University
hitz at genome.stanford.edu




