Load Snowflake data into Pandas dataframe using AWS Sagemaker


I'm attempting to read data from Snowflake into a Pandas DataFrame using Jupyter Lab on AWS SageMaker. I can successfully load 100 rows, but not 2.2 million rows.

  • I gave up loading the complete data after ~20 minutes
  • On my local computer it takes 4 minutes; if I switch to fetching in batches on my local computer it takes longer (~10 minutes), but I could try this on SageMaker? (A batch-fetching sketch follows the code block below.)
  • The command is sent successfully to Snowflake (checked via the query history log)
  • Materializing the underlying view into a table on Snowflake showed a size of 222.5 MB

Steps:

  • Run: !pip install snowflake-connector-python[pandas]

  • Restart kernel

  • Execute:

      import snowflake.connector

      ctx = snowflake.connector.connect(
          user='<username>',
          account='<account>',
          password='<password>',
          warehouse='<wh>',
          database='<db>',
          role='<role>',
          schema='<schema>'
      )
      cur = ctx.cursor()
      sql = """
              SELECT * FROM <db>.<schema>.<view>
      """
      cur.execute(sql)
      # Fetch the result set from the cursor and deliver it as a Pandas DataFrame.
      df = cur.fetch_pandas_all()

      cur.close()
      ctx.close()
    
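For reference, the batch-fetching approach mentioned in the list above could look roughly like this. This is only a sketch: it reuses the same `ctx` connection and placeholder names from the snippet above, and relies on `fetch_pandas_batches()`, which yields the result set as a sequence of smaller DataFrames instead of one large one.

      import pandas as pd

      # Reuses the 'ctx' connection created in the snippet above.
      cur = ctx.cursor()
      cur.execute("SELECT * FROM <db>.<schema>.<view>")

      # Concatenate the per-batch DataFrames into one; if even that exceeds memory,
      # each batch could instead be processed or written to disk individually.
      df = pd.concat(cur.fetch_pandas_batches(), ignore_index=True)

      cur.close()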

Are there any specific configurations I need to use to make this work?


1 Answer

Answered by user1420372

In this case, sizing up the on-demand notebook instance from ml.tx.xlarge (2 vCPU, 8 GiB) worked. See Amazon SageMaker Pricing for notebook instance CPU/memory specifications.
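As a rough sanity check (not part of the original post), the in-memory footprint of the loaded DataFrame can be inspected directly; the 222.5 MB reported by Snowflake is compressed columnar storage, so the pandas DataFrame can occupy considerably more memory than that.

      # 'df' is the DataFrame returned by fetch_pandas_all() in the question's code.
      # deep=True also counts object/string columns, which usually dominate memory use.
      mem_gib = df.memory_usage(deep=True).sum() / 1024**3
      print(f"DataFrame occupies ~{mem_gib:.2f} GiB in memory")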

In an earlier attempt to fix the problem, we had increased the volume size, but that only provides storage for data and didn't help with memory (see Customize your notebook volume size, up to 16 TB, with Amazon SageMaker for more details on storage volumes); so we were able to decrease the volume size back from 50 GB EBS to 10 GB EBS.

Memory can be monitored by opening a terminal from the Jupyter interface and running the Linux command free.
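The same check can also be run directly from a notebook cell via IPython's shell escape (assuming a Linux-based notebook instance):

      # '!' is IPython's shell escape; 'free -m' reports total/used/free memory in MiB.
      !free -m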