I have a pretty simple AWS Lambda function in which I connect to an Amazon Keyspaces for Cassandra database. This code in Python works, but from time to time I get the error. How do I fix this strange behavior? I have an assumption that you need to make additional settings when initializing the cluster. For example, set_max_connections_per_host
. I would appreciate any help.
ERROR:
('Unable to complete the operation against any hosts', {<Host: X.XXX.XX.XXX:XXXX eu-central-1>: ConnectionShutdown('Connection to X.XXX.XX.XXX:XXXX was closed')})
lambda_function.py:
import sessions
cassandra_db_session = None
cassandra_db_username = 'your-username'
cassandra_db_password = 'your-password'
cassandra_db_endpoints = ['your-endpoint']
cassandra_db_port = 9142
def lambda_handler(event, context):
global cassandra_db_session
if not cassandra_db_session:
cassandra_db_session = sessions.create_cassandra_session(
cassandra_db_username,
cassandra_db_password,
cassandra_db_endpoints,
cassandra_db_port
)
result = cassandra_db_session.execute('select * from "your-keyspace"."your-table";')
return 'ok'
sessions.py:
from ssl import SSLContext
from ssl import CERT_REQUIRED
from ssl import PROTOCOL_TLSv1_2
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.policies import DCAwareRoundRobinPolicy
def create_cassandra_session(db_username, db_password, db_endpoints, db_port):
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('your-path/AmazonRootCA1.pem')
ssl_context.verify_mode = CERT_REQUIRED
auth_provider = PlainTextAuthProvider(username=db_username, password=db_password)
cluster = Cluster(
db_endpoints,
ssl_context=ssl_context,
auth_provider=auth_provider,
port=db_port,
load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='eu-central-1'),
protocol_version=4,
connect_timeout=60
)
session = cluster.connect()
return session
There isn't much point setting the max connections on the client side since AWS Lambdas are effectively "dead" between runs. For the same reason, the recommendation is to disable driver heartbeats (with
idle_heartbeat_interval = 0
) since there is no activity that occurs until the next time the function is called.This doesn't necessarily cause the issue you are seeing but there's a good chance the connection is being reused by the driver after it has been closed server-side.
With the lack of public documentation on the inner-workings of AWS Keyspaces, it's difficult to know what is happening on the cluster. I've always suspected that AWS Keyspaces has a CQL-like API engine in front of a Dynamo DB so there are quirks like what you're seeing that are hard to track down since it requires knowledge only available internally at AWS.
FWIW the DataStax drivers aren't tested against AWS Keyspaces.