How can I fix this issue with using FSx as Persistent Volume for EKS?


I have an EKS cluster with Windows-based nodes. I followed the AWS blog post "Using SMB CSI Driver on Amazon EKS Windows nodes" (Microsoft Workloads on AWS), but when I deployed the Windows pod (step 5.6), the pods stayed in the Pending state. This is the warning I got:

Reason: FailedMount

From: kubelet

Message:

  1. MountVolume.MountDevice failed for volume "pv-smb" : rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix C:\\var\\lib\\kubelet\\plugins\\smb.csi.k8s.io\\csi.sock: connect: No connection could be made because the target machine actively refused it."

  2. Unable to attach or mount volumes: unmounted volumes=[smb], unattached volumes=[smb kube-api-access-5v5p6]: timed out waiting for the condition

I would appreciate it if anyone could help me out with this. Thank you :)

EDIT: After checking connectivity and fixing the security group, I ended up with another error:

MountVolume.MountDevice failed for volume "pv-smb" : rpc error: code = Internal desc = volume(FSx_id) mount "//Fsx_id.AD_DNS_name/share" on "\var\lib\kubelet\plugins\kubernetes.io\csi\smb.csi.k8s.io\da35e2ac08d4bd6b3f917c217d32fc33bb4c2b87b9068efb5845c8eb666d8d5d\globalmount" failed with NewSmbGlobalMapping(\Fsx_id.AD_DNS_name\share, c:\var\lib\kubelet\plugins\kubernetes.io\csi\smb.csi.k8s.io\da35e2ac08d4bd6b3f917c217d32fc33bb4c2b87b9068efb5845c8eb666d8d5d\globalmount) failed with error: rpc error: code = Unknown desc = NewSmbGlobalMapping failed. output: "New-SmbGlobalMapping : The network path was not found. \r\nAt line:1 char:190\r\n+ ... ser, $PWord;New-SmbGlobalMapping -RemotePath $Env:smbremotepath -Cred ...\r\n+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n + CategoryInfo : NotSpecified: (MSFT_SmbGlobalMapping:ROOT/Microsoft/...mbGlobalMapping) [New-SmbGlobalMa \r\n pping], CimException\r\n + FullyQualifiedErrorId : Windows System Error 53,New-SmbGlobalMapping\r\n \r\n", err: exit status 1

Answer by Fuat Ulugay:

Some possible reasons:

- Missing driver: Check whether C:\var\lib\kubelet\plugins\smb.csi.k8s.io\csi.sock exists on the Windows nodes. You can SSH into the Windows nodes and check if the file is present. If it is missing, that indicates a problem with the CSI driver installation.

- Network connection, firewall, or security group issues: Test connectivity to the SMB share from a Windows node. You can use a tool such as Test-NetConnection: Test-NetConnection -Port .
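If you want to script the connectivity check from the second bullet, here is a minimal sketch in Python, assuming Python is available on the machine you test from. The helper name `can_connect` and the hostname in the usage comment are hypothetical placeholders, not taken from the tutorial. Port 445 is the standard TCP port for SMB. The check only shows whether a TCP connection can be opened. It says nothing about credentials or share permissions.

```python
import socket


def can_connect(host: str, port: int = 445, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (hypothetical hostname; substitute the DNS name of your FSx file system):
# print(can_connect("fsx.example.internal"))
```

A result of False from a node, combined with True from a machine that can already mount the share, would point toward security groups or routing rather than the driver itself.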

As far as I can tell from the error message, it is probably a security group or network access issue.

If you have already tested these and checked the security rules, please provide more details so we can troubleshoot further.

--- After Edit of Question Above ---

  • Verify FSx access permissions
  • Check credentials and authentication: Ensure that the credentials used to access the FSx file system are valid and have the necessary permissions.
  • Review the SMB configuration: Double-check the configuration parameters for the SMB (Server Message Block) mount. Make sure the share path is accurate, and all necessary SMB-related settings, such as authentication methods and access control, are properly configured.
  • Check this tutorial for step-by-step instructions to see whether you are missing something: https://aws.amazon.com/blogs/storage/accessing-smb-file-shares-remotely-with-amazon-fsx-for-windows-file-server/
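One more thing that may be worth ruling out when double-checking the share path: the message in the edited question ("The network path was not found", Windows system error 53) is commonly seen when the server name in the share path does not resolve or cannot be reached from the node. Below is a rough sketch, again in Python, that splits an SMB source into server and share and checks whether the server name resolves. The function names `split_smb_source` and `resolves` are made up for illustration, and the sketch assumes the source has the `//server/share` form shown in the error.

```python
import socket


def split_smb_source(source: str) -> tuple[str, str]:
    """Split an SMB source such as //server/share into (server, share path)."""
    # Accept both forward slashes and Windows-style backslashes.
    normalized = source.replace("\\", "/")
    if not normalized.startswith("//"):
        raise ValueError(f"expected //server/share, got {source!r}")
    server, _, share = normalized[2:].partition("/")
    if not server or not share:
        raise ValueError(f"expected //server/share, got {source!r}")
    return server, share


def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except socket.gaierror:
        return False
```

If `resolves` returns False for the server part when run on a node, the DNS configuration the node uses (for example, whether it can reach the Active Directory DNS servers) would be the next place to look. That is a guess based on the error text, not something confirmed in the question.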