Connect to AWS Redshift using awswrangler

import awswrangler as wr
con = wr.redshift.connect("MY_GLUE_CONNECTION")

What would be the value of "MY_GLUE_CONNECTION"?

Answer by Pavel Slepiankou (best answer)

You first need to create a Glue connection with a unique name in the AWS console. "MY_GLUE_CONNECTION" is the name of that connection.

Let's name it test_1.
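
If you would rather script this step than click through the console, a Glue connection can also be created with boto3. This is only a minimal sketch, assuming a JDBC-type connection. The helper names, cluster endpoint, database, user and password below are hypothetical placeholders, and a real setup may also need VPC settings (PhysicalConnectionRequirements), which are omitted here.

```python
def build_glue_connection_input(name, host, port, database, user, password):
    """Build the ConnectionInput payload for a JDBC-type Glue connection."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": f"jdbc:redshift://{host}:{port}/{database}",
            "USERNAME": user,
            "PASSWORD": password,
        },
    }


def create_glue_connection(connection_input):
    """Register the connection in Glue (needs boto3 and valid AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_connection(ConnectionInput=connection_input)


# Hypothetical endpoint and credentials, for illustration only.
payload = build_glue_connection_input(
    name="test_1",
    host="my-cluster.example.us-east-1.redshift.amazonaws.com",
    port=5439,
    database="dev",
    user="awsuser",
    password="REPLACE_ME",
)
print(payload["ConnectionProperties"]["JDBC_CONNECTION_URL"])
# create_glue_connection(payload)  # uncomment to actually create the connection
```

The "Name" field is the value that later goes into wr.redshift.connect().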


Then you can just pass this name to awswrangler:

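import awswrangler as wr

# "test_1" is the name of the Glue connection
con = wr.redshift.connect("test_1")
with con.cursor() as cursor:
    cursor.execute("SELECT 1;")
    print(cursor.fetchall())
con.close()  # close the connection after the cursor block has finished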

This works without any further steps if you have the AWS CLI installed and authenticated.

The trick here is the boto3 auth mechanism used by awswrangler. The documentation for awswrangler.redshift.connect() describes the relevant parameter as follows:

boto3_session (boto3.Session(), optional) – Boto3 Session. The default boto3 session will be used if boto3_session receive None.

So a boto3 session is in fact always used, but because the parameter is optional, that is easy to overlook. With the AWS CLI configured, the default session checks ~/.aws/credentials and ~/.aws/config and authenticates based on these files.

Another option is to create a boto3 session in your code and pass it directly to the method:

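import awswrangler as wr
import boto3

# ACCESS_KEY, SECRET_KEY and SESSION_TOKEN are placeholders for your own
# credentials; aws_session_token is only needed for temporary credentials.
session = boto3.Session(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    aws_session_token=SESSION_TOKEN,
)

con = wr.redshift.connect("test_1", boto3_session=session)
with con.cursor() as cursor:
    cursor.execute("SELECT 1;")
    print(cursor.fetchall())
con.close()  # close the connection after the cursor block has finished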
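
If the default session does not authenticate as expected, it can help to check which profiles exist in the shared credentials file. The following is a rough sketch using only the standard library. It assumes the usual INI layout of ~/.aws/credentials; the helper name list_credential_profiles and the sample file contents are made up for illustration.

```python
import configparser
import os
import tempfile


def list_credential_profiles(path):
    """Return the profile names found in an AWS shared credentials file."""
    # interpolation=None so '%' characters in secrets are never interpreted
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file simply yields no sections
    return parser.sections()


# Fake credentials file with two profiles, written to a temp directory
# so the example does not touch the real ~/.aws/credentials.
sample = """[default]
aws_access_key_id = EXAMPLEKEY
aws_secret_access_key = EXAMPLESECRET

[analytics]
aws_access_key_id = OTHERKEY
aws_secret_access_key = OTHERSECRET
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "credentials")
    with open(path, "w") as fh:
        fh.write(sample)
    profiles = list_credential_profiles(path)

print(profiles)
```

To inspect your real file, call list_credential_profiles(os.path.expanduser("~/.aws/credentials")). A non-default profile can then be selected with boto3.Session(profile_name="analytics") and passed as boto3_session, in the same way as the session above.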