I would like to save the output of a Redshift query locally. I have tried using blaze/odo, but with the default settings odo tries to load all the data into memory before writing, and trying to stream the data instead throws other errors, described in another question: Streaming results with Blaze and SqlAlchemy
Since that question does not have any answers, I am looking for an alternative way to do this.
What is a simple, reliable, and reasonably efficient way of loading a larger-than-memory query result from Redshift into bcolz?
I see this is an old question, but I'll answer it to help anyone with the same problem who stumbles upon it. I assume you use Blaze to access Redshift. You'll have to partition the table on some column and do it iteratively, like this:
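Here is a minimal sketch of what I mean. The connection string, the numeric `id` column used for partitioning, the chunk size, and the `out` directory are all assumptions to adapt to your setup:

```python
import os

import blaze as bz
from odo import odo

# hypothetical connection string; format: dialect://user:pass@host:port/db::table
uri = 'redshift://user:pass@host:5439/db::my_table'
t = bz.Data(uri)

# bounds of the partitioning column (assumed here to be a numeric `id`)
lo = int(bz.compute(t.id.min()))
hi = int(bz.compute(t.id.max()))
chunk = 100000  # id values per partition; pick something that fits in memory

os.makedirs('out', exist_ok=True)
for i, start in enumerate(range(lo, hi + 1, chunk)):
    # select one slice of the table at a time so only that slice
    # is pulled from Redshift and held in memory
    part = t[(t.id >= start) & (t.id < start + chunk)]
    # each partition becomes its own on-disk bcolz ctable directory
    odo(part, 'out/part%05d.bcolz' % i)
```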
and then use bcolz.walk to iterate over the results.
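For completeness, reading the partitions back might look like this (assuming the `out` directory from the sketch above; counting rows stands in for whatever processing you need):

```python
import bcolz

total = 0
# walk the output directory, yielding each partitioned ctable in turn
for ct in bcolz.walk('out', classname='ctable'):
    # iterblocks yields numpy structured arrays one block at a time,
    # so the full result never has to fit in memory
    for block in bcolz.iterblocks(ct):
        total += len(block)

print('rows read:', total)
```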