AWS EKS Network Load Balancer traffic issues


I have run into some strange traffic/connection issues; let me describe the problem below:

I have AWS EKS with the "aws-load-balancer-controller" configured.

I have several pods plus an Ingress (when the Ingress was added to EKS, an ALB was created successfully and it works without any issues, so I am good there).

Now I need to run a pod that listens on a raw TCP port, for example 5555, and route traffic to it. So I can't use an ALB; I need an NLB.

I found that an NLB can be created by setting specific annotations on the pod's Service, so I am using this config:

apiVersion: v1
kind: Service
metadata:
  name: {{ include "XXX.fullname" . }}
  labels:
    {{- include "XXX.labels" . | nindent 4 }}
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-xxx, subnet-yyy
    service.beta.kubernetes.io/aws-load-balancer-name: xx-nlb-xx
    service.beta.kubernetes.io/aws-load-balancer-security-groups: sg-xxx
# The spec was omitted above; a minimal spec for the 5555/TCP example
# (selector helper name is a typical Helm convention, adjust to the chart):
spec:
  type: LoadBalancer
  selector:
    {{- include "XXX.selectorLabels" . | nindent 4 }}
  ports:
    - name: tcp-5555
      protocol: TCP
      port: 5555
      targetPort: 5555

This configuration created an NLB for me. But when I try to send some app-specific data, I get an error.

Here is the error (though I guess the exact message does not matter much):

13:12:24.855 INFO - STORESCU->COINSDCMRCV(1) >> A-ASSOCIATE-RJ[result: 2 - rejected-transient, source: 3 - service-provider (Presentation related function), reason: 2 - local-limit-exceeded]

The most interesting thing I found: I can send data successfully, but only during one specific time window:

  • When the NLB was created, a target group was created as well
  • A target was registered in the target group: the pod with its specific IP (my deployed service)
  • This target is marked as Healthy, so it looks like there are no problems

And here is the key observation: when I redeploy my service, the target is recreated (the new pod has a new IP address, so the target is re-registered).

During that window, while the old target is being deregistered and the new one is being registered, I can send data successfully! But as soon as the new target is registered and its status changes to Healthy, I start getting the error again!

So there is a working time window during target recreation, but after that it does not work as expected.

What am I missing? What can I check? Any advice?


There are 2 answers

prosto.vint (BEST ANSWER)

I found the reason.

The Python application (not mine, but the one I am deploying to K8s) by default had a limit of 10 concurrent connections (and kept each connection open for 60 seconds before closing it).

So once the NLB health checks were configured, all of these connections were busy: the health checks consumed all of them, since they probe the target every 10 seconds while the app holds each connection open for 60 seconds.
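A minimal sketch of this exhaustion, using the numbers from the description (10-connection cap, 60 s hold, 10 s probe interval; the assumption of two probing sources, e.g. two NLB availability zones, is mine):

```python
import threading

# Hypothetical limits taken from the description above:
MAX_CONNECTIONS = 10   # app-side concurrent connection cap
HOLD_SECONDS = 60      # app keeps each connection open this long
CHECK_INTERVAL = 10    # NLB probes the target this often

# Each probe occupies a slot for HOLD_SECONDS, so at steady state each
# probing source holds HOLD_SECONDS / CHECK_INTERVAL = 6 connections.
pool = threading.BoundedSemaphore(MAX_CONNECTIONS)

def try_connect():
    """Return True if a connection slot is free, False otherwise."""
    return pool.acquire(blocking=False)

# Two probing sources would try to hold 2 * 6 = 12 slots, but only
# MAX_CONNECTIONS of those attempts can succeed:
busy = sum(try_connect() for _ in range(2 * (HOLD_SECONDS // CHECK_INTERVAL)))
print(busy)           # 10: the pool is fully saturated by health checks

# A real client arriving now is rejected, which on the DICOM side
# surfaces as "A-ASSOCIATE-RJ ... local-limit-exceeded":
print(try_connect())  # False
```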

That's it =)
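Besides raising the app's connection limit, the probe pressure can also be reduced on the NLB side. A sketch of the relevant aws-load-balancer-controller annotations (this mitigation is my suggestion, not part of the original answer; NLB health-check intervals typically accept 10 or 30 seconds):

```yaml
metadata:
  annotations:
    # Probe less often so health checks hold fewer app connections:
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "30"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "3"
```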

Dmytro Sirant

What is the reason for having a custom SG in the annotations? According to the documentation:

service.beta.kubernetes.io/aws-load-balancer-security-groups specifies the frontend securityGroups you want to attach to an NLB. When this annotation is not present, the controller will automatically create one security group. The security group will be attached to the LoadBalancer and allow access from inbound-cidrs to the listen-ports. Also, the securityGroups for target instances/ENIs will be modified to allow inbound traffic from this securityGroup.

Please try to remove this annotation and test again.
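Concretely, the annotations from the question would shrink to something like this (sketch; the controller then creates and attaches its own frontend security group):

```yaml
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-xxx, subnet-yyy
    service.beta.kubernetes.io/aws-load-balancer-name: xx-nlb-xx
    # service.beta.kubernetes.io/aws-load-balancer-security-groups: sg-xxx  # removed
```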