Unable to load bnxt_en driver intermittently on a Linux OS backed by a hypervisor

I have a VM backed by vCenter. The ESXi host has a physical adapter, "Broadcom BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller", with SR-IOV enabled on it.

The VM is connected to one management network (vmxnet3) and two SR-IOV adapters (SRIOVPassthrough).

After the VM boots, only two network interfaces show up: the management interface and one of the SR-IOV adapters.

The journalctl -k output shows the following error:

[ 4832.408471] bnxt_en 0000:13:00.0 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg {0x0 0x0} len:0
[ 4832.408930] bnxt_en: probe of 0000:13:00.0 failed with error -1

Rebooting the machine did not help at all.

For the one adapter that loaded successfully, the log shows:

bnxt_en 0000:03:00.0 eth1: NIC Link is Up, 25000 Mbps full duplex, Flow control: ON - receive & transmit
bnxt_en 0000:03:00.0 eth1: FEC autoneg off encodings: None

I rescanned the PCI devices and rebooted several times, without any success.

Any pointers would be really helpful.

There are 2 answers

1
Rob On

Disabling PXE didn't work for us, but we can get the ports back online by running:

echo 0000:af:00.0 > /sys/bus/pci/drivers/bnxt_en/bind

Here 0000:af:00.0 is the PCI address of the port, which you can find by running dmesg | grep bnxt_en and looking for the port or ports that failed.
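If several ports fail, the lookup and the bind can be scripted. The following is only a rough sketch, not something from the answer itself: it assumes the failure always appears in the kernel log in the "bnxt_en: probe of <address> failed" form quoted in the question, and the function names and the SYSFS_BIND override are made up for illustration.

```shell
#!/bin/sh
# Print the PCI address of every device whose bnxt_en probe failed.
# Reads kernel log text on stdin, for example: dmesg | failed_bnxt_ports
# Assumption: failures are logged as "bnxt_en: probe of <addr> failed ...".
failed_bnxt_ports() {
    sed -n 's/.*bnxt_en: probe of \([0-9a-fA-F:.]*\) failed.*/\1/p' | sort -u
}

# Write each failed address to the driver's bind file (needs root).
# SYSFS_BIND is a hypothetical override so the loop can be dry-run
# against a scratch file instead of the real sysfs entry.
rebind_failed_ports() {
    bind_file="${SYSFS_BIND:-/sys/bus/pci/drivers/bnxt_en/bind}"
    failed_bnxt_ports | while read -r addr; do
        echo "rebinding $addr"
        echo "$addr" > "$bind_file"
    done
}
```

Run as root, this would be used as dmesg | rebind_failed_ports. Piping the log in rather than calling dmesg inside the function makes it easy to check the parsing against saved log output first.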

0
Georg On

We had a similar issue and were able to fix it. In our case we saw the same error message on Debian 10, Debian 11 and Oracle Linux 8, but we had installed them directly on hardware without a hypervisor. It could still be the same issue, since you're using passthrough.

There are two ways to fix it:

  • Use UEFI boot
  • Disable PXE boot and keep BIOS / legacy boot

Either option fixed it for us.
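To see which of the two boot modes a machine is currently using, one common check is whether the kernel exposes /sys/firmware/efi, which is present only after a UEFI boot. A minimal sketch follows; the function name and its optional sysfs-root argument are hypothetical and exist only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS.
# /sys/firmware/efi exists only when the kernel was booted through UEFI.
# The optional first argument (an alternative sysfs root) is a
# hypothetical hook for testing; by default the real /sys is used.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}
```

Calling boot_mode with no arguments on the affected machine (or inside the VM) shows which of the two options above applies before changing any firmware settings.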