AWS VPC Endpoints - Not functioning


I am struggling to get my AWS VPC endpoints to work. I've missed something in the setup but cannot work it out.

I have a Lambda function inside a VPC that is triggered by an SQS message. When triggered, it attempts to write to a different SQS queue.

When the Lambda function runs, it times out while attempting to send the message to that queue.

I have created a VPC endpoint for 'com.amazonaws.eu-west-2.sqs' (London region) attached to that VPC. The documentation says I have to route traffic through the endpoint, but I have no idea how to do that.

Can anyone help me with the next steps in setting this up, or point me to documentation that provides a step-by-step approach (for someone who has very little networking knowledge)?

Answer by John Rotenstein:

To reproduce your situation I did the following (in the London region):

  • Created an Amazon SQS standard queue
  • Created an IAM Role for Lambda that has permission to write to the queue
  • Used the VPC Wizard to create a VPC:

[Screenshot: VPC Wizard]

  • Created a Security Group for the Lambda function (Lambda-SG) that permits all Outbound traffic
  • Created an AWS Lambda function and configured it to use a private subnet in the VPC, and associated the Lambda-SG security group with the function:
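import boto3

def lambda_handler(event, context):
    # The client picks up the region from the Lambda environment (eu-west-2 here)
    sqs = boto3.client('sqs')

    # URL of the target queue
    queue_url = 'https://sqs.eu-west-2.amazonaws.com/782031212076/stack'

    # Send a message to the SQS queue
    response = sqs.send_message(QueueUrl=queue_url, MessageBody='Foo')

    # Log the ID that SQS assigned to the message
    print(f"Message ID: {response['MessageId']}")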

I then invoked the Lambda function by pressing the Test button (and using the default test values) and received the error:

"errorMessage": "2023-12-03T11:15:03.853Z 048c31a5-6760-44d6-a9f3-3bd84a787e88 Task timed out after 3.01 seconds"

This is expected since the Lambda function is connected to a private subnet that does not have a VPC Endpoint for SQS.
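
A timeout like this, rather than an access-denied error, tends to point at the network path and not at IAM permissions. As a rough diagnostic, the sketch below tries to open a TCP connection to the regional SQS hostname from inside the function. The helper name `can_connect` and the 2-second timeout are assumptions of this sketch, chosen to stay under the 3-second function timeout shown in the error above.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def lambda_handler(event, context):
    # Hostname taken from the queue URL used in the function above.
    reachable = can_connect("sqs.eu-west-2.amazonaws.com")
    print(f"SQS endpoint reachable: {reachable}")
    return {"reachable": reachable}
```

If this reports `False` from the private subnet, the problem is most likely connectivity (no endpoint, or a Security Group blocking it) and not the queue policy or the IAM role.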

I then:

  • Created a Security Group for the endpoint (Endpoint-SG) that permits all inbound traffic from Lambda-SG (that is, the inbound rule in Endpoint-SG specifically references Lambda-SG)
  • Added a VPC Endpoint for SQS to the private subnet of the VPC
  • Confirmed connectivity using the VPC Reachability Analyzer:

[Screenshot: VPC Reachability Analyzer]

  • Used Test to run the Lambda function again

It worked:

Function Logs
START RequestId: 5a0d6359-1f57-4e50-98a1-4de6ccacd9b9 Version: $LATEST
Message ID: f1b46e9a-981d-4e92-9864-501b4ccb8db0
END RequestId: 5a0d6359-1f57-4e50-98a1-4de6ccacd9b9
REPORT RequestId: 5a0d6359-1f57-4e50-98a1-4de6ccacd9b9  Duration: 1332.49 ms    Billed Duration: 1333 ms    Memory Size: 128 MB Max Memory Used: 74 MB  Init Duration: 249.31 ms
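
The two endpoint steps above (Endpoint-SG, then the SQS endpoint) could also be scripted. The following is a minimal boto3 sketch, not the console procedure described in the steps. The function name `create_sqs_endpoint` and its parameters are hypothetical, `ec2` is assumed to be a client such as `boto3.client("ec2", region_name="eu-west-2")`, and the inbound rule is narrowed to TCP 443 (HTTPS) where the steps above permitted all traffic.

```python
def create_sqs_endpoint(ec2, vpc_id, subnet_id, lambda_sg_id, region="eu-west-2"):
    """Create Endpoint-SG and an interface endpoint for SQS; return both IDs."""
    # Security Group that will be attached to the endpoint.
    sg = ec2.create_security_group(
        GroupName="Endpoint-SG",
        Description="Allows HTTPS from Lambda-SG to the SQS endpoint",
        VpcId=vpc_id,
    )
    endpoint_sg_id = sg["GroupId"]

    # Inbound rule that references Lambda-SG instead of an IP range.
    # Assumption: HTTPS only, which is narrower than "all traffic".
    ec2.authorize_security_group_ingress(
        GroupId=endpoint_sg_id,
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
        }],
    )

    # Interface endpoint placed in the private subnet used by the function.
    endpoint = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=f"com.amazonaws.{region}.sqs",
        SubnetIds=[subnet_id],
        SecurityGroupIds=[endpoint_sg_id],
        PrivateDnsEnabled=True,
    )
    return endpoint_sg_id, endpoint["VpcEndpoint"]["VpcEndpointId"]
```

With private DNS enabled (which requires DNS support and DNS hostnames to be turned on for the VPC), the regular `sqs.eu-west-2.amazonaws.com` hostname should resolve to the endpoint from inside the VPC, so an interface endpoint should need no route table entries and no change to the Lambda code.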

I will admit that, immediately after adding the VPC Endpoint, the function still timed out. That's why I used the VPC Reachability Analyzer to test the connection. It said everything should work okay. Then, when I tried the Lambda function again, it worked successfully. So, there might be a delay between adding the VPC Endpoint and having it work correctly.
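
One way to allow for such a delay is to wait until the endpoint reports the `available` state before testing. The sketch below polls `describe_vpc_endpoints` on a boto3 EC2 client. The helper name `wait_for_endpoint`, the polling interval, and the overall timeout are assumptions, and it is only a guess that the delay described above corresponds to the endpoint state (DNS changes may take a little longer to become visible).

```python
import time


def wait_for_endpoint(ec2, endpoint_id, timeout=300.0, poll=10.0, sleep=time.sleep):
    """Poll until the endpoint state is 'available'; return True on success."""
    deadline = time.monotonic() + timeout
    while True:
        resp = ec2.describe_vpc_endpoints(VpcEndpointIds=[endpoint_id])
        # Compare case-insensitively in case the state casing varies.
        state = resp["VpcEndpoints"][0]["State"].lower()
        if state == "available":
            return True
        if state in {"failed", "rejected", "deleting", "deleted", "expired"}:
            # Terminal states: waiting longer will not help.
            return False
        if time.monotonic() >= deadline:
            return False
        sleep(poll)
```

A call might look like `wait_for_endpoint(boto3.client("ec2", region_name="eu-west-2"), endpoint_id)`, where `endpoint_id` is the ID of the SQS endpoint.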

Hopefully you can compare your setup with the above steps to see what differs. I recommend that you start by examining the Security Group on the Lambda function and the Security Group on the VPC Endpoint. These should be different Security Groups -- one with an Outbound rule and one with an Inbound rule.
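
To make that Security Group comparison concrete, here is a small sketch that checks whether the endpoint's Security Group has an inbound rule admitting HTTPS from the Lambda function's Security Group. The function name `allows_https_from` is hypothetical. It expects one entry from the `SecurityGroups` list returned by boto3's `describe_security_groups`, and it only looks at rules that reference a Security Group, not at CIDR ranges.

```python
def allows_https_from(endpoint_sg: dict, lambda_sg_id: str) -> bool:
    """True if an inbound rule on endpoint_sg admits TCP 443 from lambda_sg_id."""
    for rule in endpoint_sg.get("IpPermissions", []):
        proto = rule.get("IpProtocol")
        if proto == "-1":
            # "-1" means all protocols and all ports.
            port_ok = True
        elif proto == "tcp":
            port_ok = rule.get("FromPort", 0) <= 443 <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        # The rule must name the Lambda function's Security Group as its source.
        from_lambda = any(
            pair.get("GroupId") == lambda_sg_id
            for pair in rule.get("UserIdGroupPairs", [])
        )
        if port_ok and from_lambda:
            return True
    return False
```

For example, `allows_https_from(ec2.describe_security_groups(GroupIds=[endpoint_sg_id])["SecurityGroups"][0], lambda_sg_id)` should return `True` for a setup like the one described above, where `ec2`, `endpoint_sg_id` and `lambda_sg_id` are placeholders for your own client and IDs.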