How to Read the Result of a Query into a Dask Dataframe with a Distributed Client?


I am trying to read the results of a query (against an AWS Athena database) into a Dask dataframe, following the read_sql_query method described in the official documentation.

Here is how I am calling it.

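import sqlalchemy
from sqlalchemy.orm import Query
from dask import dataframe

query: Query  # an existing SQLAlchemy ORM query, built elsewhere
engine: sqlalchemy.engine.Engine  # an existing engine connected to Athena

# Raises AttributeError: 'OptionEngine' object has no attribute 'execute'
dataframe.read_sql_query(sql=query.statement, con=str(engine.url), index_col='date')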

So why do I get this AttributeError when the documentation says to pass a string for the con argument? Should it be a different string? Note that the query works fine without Dask, so the database configuration parameters are correct.
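
For context, this exact message is commonly reported when a pandas release older than 2.0 is installed next to SQLAlchemy 2.x: Dask hands each query to pandas, and older pandas calls an execute method that SQLAlchemy 2.0 removed from engines. Whether that is the situation here is an assumption, not something confirmed above. A minimal sketch of a version check follows; the helper names major, is_known_bad_combo and installed are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def major(version_string):
    """Return the leading integer of a version string such as '2.0.4'."""
    return int(version_string.split(".")[0])


def is_known_bad_combo(pandas_version, sqlalchemy_version):
    """True for pandas < 2.0 paired with SQLAlchemy >= 2.0 (assumed cause)."""
    return major(pandas_version) < 2 and major(sqlalchemy_version) >= 2


def installed(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    pandas_version = installed("pandas")
    sqlalchemy_version = installed("SQLAlchemy")
    print("pandas:", pandas_version, "SQLAlchemy:", sqlalchemy_version)
    if pandas_version and sqlalchemy_version:
        print("known-bad combination:",
              is_known_bad_combo(pandas_version, sqlalchemy_version))
```

If the check reports a known-bad combination, the usual options are upgrading pandas to 2.x or pinning SQLAlchemy below 2.0. The same versions need to be installed on every worker, not only on the client.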

Also, I am looking for a read method that scales to a multi-node Dask cluster without having to collect all the results on the driver node.
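
On the scaling point, read_sql_query is lazy and splits the work by ranges of index_col, so each partition becomes its own range-restricted query executed by a worker task instead of one large result collected on the client. The sketch below passes explicit divisions on the date column. It assumes that column holds datetimes and that the Athena SQLAlchemy dialect accepts such range filters; date_divisions and read_partitioned are hypothetical helper names.

```python
from datetime import datetime


def date_divisions(start, end, npartitions):
    """Return npartitions + 1 evenly spaced boundaries from start to end."""
    step = (end - start) / npartitions
    bounds = [start + i * step for i in range(npartitions)]
    bounds.append(end)  # use the exact end value to avoid rounding drift
    return bounds


def read_partitioned(statement, con_url, index_col, divisions):
    """Build a lazy Dask dataframe with one query per pair of boundaries."""
    import dask.dataframe as dd  # imported here so the helper above runs without Dask

    return dd.read_sql_query(
        sql=statement,        # a SQLAlchemy selectable, e.g. query.statement
        con=con_url,          # the full connection URL as a string
        index_col=index_col,  # column used to split the query into ranges
        divisions=divisions,  # explicit partition boundaries
    )


if __name__ == "__main__":
    # Four partitions covering the first four days of January 2023.
    for bound in date_divisions(datetime(2023, 1, 1), datetime(2023, 1, 5), 4):
        print(bound)
```

Nothing is read until the dataframe is computed or persisted, at which point each worker needs its own network access and credentials for the database.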


There are 0 answers