EKS Nodegroup fails to create when launch template ID is specified


I'm trying to create a nodegroup with a custom launch template that has user data, but nodegroup creation fails with the following error:

Resource handler returned message: "[Issue(Code=NodeCreationFailure, Message=Instances failed to join the kubernetes cluster, ResourceIds=[i-0ce4d16dfdaba673f])] (Service: null, Status Code: 0, Request ID: null)" (RequestToken: 26cdbc72-af9f-e2cd-3280-fa0e737b8d5f, HandlerErrorCode: GeneralServiceException)

  1. I've verified my NAT gateway is in the public subnet.
  2. My node group is being created in us-east-1b, the same AZ as my NAT gateway. I can also confirm that the route table in the private subnet points to the NAT gateway for egress.
  3. The security group that I'm attaching to the launch template is the security group of the EKS cluster.
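
For anyone reproducing this, a rough sketch of the checks to run on the failed node (assuming SSH/SSM access and the default Amazon Linux 2 EKS AMI; the instance ID is the one from the error above):

# Open a shell on the failed instance via SSM (assumes the SSM agent and instance-profile permissions are in place)
aws ssm start-session --target i-0ce4d16dfdaba673f

# On the node: confirm the user data ran and see whether kubelet ever started
sudo cat /var/log/cloud-init-output.log
sudo journalctl -u kubelet --no-pager | tail -n 50

# With a custom AMI ID in the launch template, EKS doesn't merge its own bootstrap user data,
# so it's worth checking whether /etc/eks/bootstrap.sh was ever invoked
sudo grep -i bootstrap /var/log/cloud-init-output.log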

Launch template code:

AWSTemplateFormatVersion: 2010-09-09
Description: Creates a custom EC2 launch template for an EKS managed nodegroup.

Parameters:
  ImageID:
    Type: AWS::EC2::Image::Id
    Description: AMI ID for the nodegroup instances.
    Default: "ami-0a3da8b47de1d87b8"
  accountname:
    Type: String
    Default: "sandboxdemo3"
  InstanceType:
    Type: String
    Default: t3.medium

Resources:
  MyLaunchTemplate:
   Type: AWS::EC2::LaunchTemplate
   Properties:
     LaunchTemplateName: eks-customlaunch-template
     LaunchTemplateData:
       UserData: 
         Fn::Base64: 
           !Sub |
            #!/bin/bash
            echo "Custom user data script"
            yum update -y 
            yum install -y awslogs
            cat > /etc/awslogs/awslogs.conf << EOF
            [general]
            state_file = /var/lib/awslogs/agent-state

            [syslog]
            datetime_format = %b %d %H:%M:%S
            file = /var/log/messages
            buffer_duration = 5000
            log_stream_name = {instance_id}
            initial_position = start_of_file
            log_group_name = /aws/k8/${accountname}-syslog

            [zio]
            datetime_format = %b %d %H:%M:%S
            file = /run/containerd/io.containerd.runtime.v2.task/k8s.io/*/rootfs/var/log/zio-testing.log
            buffer_duration = 5000
            log_stream_name = {instance_id}-${accountname}
            initial_position = start_of_file
             log_group_name = /aws/k8/${accountname}-zio
            EOF
            systemctl start awslogsd
            systemctl enable awslogsd
       ImageId: !Ref ImageID
       InstanceType: !Ref InstanceType
       BlockDeviceMappings:
         - DeviceName: '/dev/sdh'
           Ebs:
             VolumeSize: 80
       SecurityGroupIds:
         - Fn::ImportValue:
            #  'Fn::Sub': 'eksctl-myeks-${AWS::AccountId}-cluster::ClusterSecurityGroupId'
             'Fn::Sub': 'eksctl-demov6-cluster::ClusterSecurityGroupId'
       TagSpecifications:
         - ResourceType: instance
           Tags:
             - Key: alpha.eksctl.io/cluster-name
               Value: demov6
             - Key: alpha.eksctl.io/nodegroup-name
               Value: demobasev6-2
             - Key: k8s.io/cluster-autoscaler/enabled
                Value: "true"
            #  - Key: !Sub k8s.io/cluster-autoscaler/myeks-${AWS::AccountId}
            #    Value: owned
            #  - Key: !Sub kubernetes.io/cluster/myeks-${AWS::AccountId}
            #    Value: owned
             - Key: k8s.io/cluster-autoscaler/demov6
               Value: owned
             - Key: kubernetes.io/cluster/demov6
               Value: owned
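
For context, the launch template stack is deployed with something along these lines (the stack name and template file name here are placeholders):

aws cloudformation deploy \
  --template-file launch-template.yaml \
  --stack-name eks-custom-launch-template \
  --parameter-overrides accountname=sandboxdemo3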
        

Nodegroup YAML file:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
managedNodeGroups:
- amiFamily: AmazonLinux2
  desiredCapacity: 1
  launchTemplate:
    version: "1"
    id: lt-0a99b80de662a40be
  iam:
    withAddonPolicies:
      albIngress: false
      appMesh: false
      appMeshPreview: false
      autoScaler: true
      awsLoadBalancerController: true
      certManager: false
      cloudWatch: false
      ebs: false
      efs: false
      externalDNS: false
      fsx: false
      imageBuilder: false
      xRay: false
  labels:
    alpha.eksctl.io/cluster-name: demov6
    alpha.eksctl.io/nodegroup-name: demobasev6-2
  maxSize: 4
  minSize: 1
  name: demobasev6-2
  availabilityZones: ["us-east-1b"]
  privateNetworking: true
  releaseVersion: ""
  tags:
    alpha.eksctl.io/nodegroup-name: demobasev6-2
    alpha.eksctl.io/nodegroup-type: managed
  volumeThroughput: 125
  volumeType: gp3
metadata:
  name: demov6
  region: us-east-1
  version: "1.24"
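
The nodegroup is then created from this config with eksctl (the config file name here is a placeholder):

eksctl create nodegroup --config-file=nodegroup.yaml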

1 Answer

user2051904

Yes, I'm able to SSH into the node. I ended up fixing this by switching over to Terraform instead of eksctl and CloudFormation. It doesn't really answer the question above, but using Terraform resolved the issue for me.