How does GPUDirect enforce isolation on a shared device?


I have been reading about GPUDirect here: https://developer.nvidia.com/gpudirect. In their example there is a network card attached to the PCIe bus together with two GPUs and a CPU.

How is isolation enforced between all clients trying to access the network device? Are they all accessing the same PCI BAR of the device?

Is the network device using some kind of SR-IOV mechanism to enforce isolation?

1 Answer

Answer from datboi:

I believe you're talking about rDMA, which arrived in a later release of GPUDirect. It's where the NIC can send/receive data to and from machines outside the host, using peer-to-peer DMA transfers to interact with the GPU's memory.

nVidia exports a set of functions to kernel space that let programmers look up where a buffer's physical pages reside on the GPU itself and map them manually. nVidia also requires the use of physical addressing within kernel space, which greatly simplifies how other (3rd-party) drivers interact with GPUs -- through the host machine's physical address space.

"RDMA for GPUDirect currently relies upon all physical addresses being the same from the PCI devices' point of view."

-nVidia, Design Considerations for rDMA and GPUDirect

Because nVidia requires a physical addressing scheme, all IOMMUs must be disabled in the system, as these would alter the way each card views the memory space(s) of the other cards. Currently, nVidia only supports physical addressing for rDMA+GPUDirect in kernel space. Virtual addressing is available to user space via their UVA.
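
To make the "look up physical pages and map them" part concrete, here is a minimal kernel-space sketch, assuming the nv-p2p interface (nvidia_p2p_get_pages / nvidia_p2p_put_pages / nvidia_p2p_free_page_table from nv-p2p.h) documented in nVidia's GPUDirect RDMA kernel guide. Exact signatures vary between driver versions, and error handling and the surrounding driver plumbing are omitted -- treat it as an illustration, not a drop-in implementation.

    /*
     * Sketch (kernel module): pin a GPU buffer and read back the physical
     * addresses a NIC's DMA engine would target.
     */
    #include <linux/kernel.h>
    #include <nv-p2p.h>

    /* GPUDirect RDMA mappings use 64 KiB GPU pages; the CUDA virtual
     * address and length passed in must be aligned accordingly. */
    #define GPU_PAGE_SIZE 0x10000ULL

    static struct nvidia_p2p_page_table *gpu_pages;

    /* Callback invoked by the nVidia driver if the mapping is torn down
     * underneath us (e.g. the owning CUDA process exits). */
    static void free_gpu_pages_cb(void *data)
    {
        nvidia_p2p_free_page_table(gpu_pages);
        gpu_pages = NULL;
    }

    /* Pin 'len' bytes of GPU memory starting at CUDA virtual address
     * 'gpu_va' and print the physical address of each backing GPU page. */
    static int pin_gpu_buffer(uint64_t gpu_va, uint64_t len)
    {
        uint32_t i;
        int ret;

        /* The p2p_token/va_space arguments are legacy; 0 is typical today. */
        ret = nvidia_p2p_get_pages(0, 0, gpu_va, len, &gpu_pages,
                                   free_gpu_pages_cb, NULL);
        if (ret)
            return ret;

        for (i = 0; i < gpu_pages->entries; i++)
            pr_info("GPU page %u -> physical 0x%llx\n", i,
                    (unsigned long long)gpu_pages->pages[i]->physical_address);

        return 0;
    }

    /* When done: nvidia_p2p_put_pages(0, 0, gpu_va, gpu_pages); */

The physical addresses enumerated here are exactly what a 3rd-party NIC driver would hand to its DMA engine, which is why everything on the bus has to agree on one flat, physical view of memory.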

How is isolation enforced between all clients trying to access the network device? Are they all accessing the same PCI BAR of the device?

Yes. In kernel space, each GPU's memory is accessed through its physical address.

Is the network device using some kind SR-IOV mechanism to enforce isolation?

The network card's driver is what does all of the work of setting up descriptor lists and managing concurrent access to resources -- which would be the GPU's memory in this case. As I mentioned above, nVidia gives driver developers the ability to manage physical memory mappings on the GPU, allowing the 3rd-party NIC driver to control which resource(s) are or are not available to remote machines.

From what I understand about NIC drivers, I believe this to be a very rough outline of what's going on under the hood with rDMA and GPUDirect (a code sketch follows the list):

  1. The network card receives an rDMA request (whether a read or a write).
  2. The network card's driver is notified -- either by an interrupt or by some polling mechanism -- that data has arrived.
  3. The driver processes the request; any address translation is performed now, since all memory mappings for the GPUs are made available to kernel space. Additionally, the driver will more than likely have to configure the network card itself to prep for the transfer (e.g. set up specific registers, determine addresses, create descriptor lists, etc.).
  4. The DMA transfer is initiated and the network card reads data directly from the GPU.
  5. This data is then sent over the network to the remote machine.
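
Here is a hypothetical sketch of steps 3 and 4. The struct names and the nic_post_descriptor() hand-off are invented purely for illustration (real NICs have their own descriptor formats and submission paths); only struct nvidia_p2p_page_table comes from nv-p2p.h.

    #include <linux/types.h>
    #include <nv-p2p.h>

    #define GPU_PAGE_SIZE 0x10000ULL    /* 64 KiB GPU pages, as above */

    struct rdma_req {       /* parsed from the incoming rDMA read request */
        u64 gpu_offset;     /* offset into the exported GPU buffer */
        u32 length;
    };

    struct nic_dma_desc {   /* what the (hypothetical) DMA engine consumes */
        u64 src_phys;       /* bus/physical address to read from */
        u32 len;
        u32 flags;
    };

    /*
     * Step 3: translate the request offset into a physical GPU address using
     * the page table obtained earlier from nvidia_p2p_get_pages().
     * Step 4 would then post the descriptor so the NIC reads the GPU directly.
     * A real driver must also validate the range and split requests that
     * cross a GPU page boundary.
     */
    static void handle_rdma_read(struct nvidia_p2p_page_table *pt,
                                 const struct rdma_req *req,
                                 struct nic_dma_desc *desc)
    {
        u64 page = req->gpu_offset / GPU_PAGE_SIZE;
        u64 off  = req->gpu_offset % GPU_PAGE_SIZE;

        desc->src_phys = pt->pages[page]->physical_address + off;
        desc->len      = req->length;
        desc->flags    = 0;
        /* nic_post_descriptor(desc);   hypothetical hardware hand-off */
    }

Because desc->src_phys is a raw bus/physical address, the NIC ends up reading GPU memory with the same kind of address it would use for host RAM -- which is the whole point of the physical-addressing requirement above.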

All remote machines requesting data via rDMA will use that host machine's physical addressing scheme to manipulate memory. If, for example, two separate computers wish to read the same buffer from a third computer's GPU with rDMA+GPUDirect support, one would expect the incoming read requests' offsets to be the same. The same goes for writing; however, an additional problem is introduced if multiple DMA engines are set to manipulate data in overlapping regions. This concurrency issue should be handled by the 3rd-party NIC driver (a minimal sketch of one approach follows).
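
One way (of many) a driver could serialize overlapping writes is to lock around descriptor submission for a given exported buffer. The sketch below, reusing the hypothetical nic_dma_desc from above, is only meant to make the concern concrete, not to describe how any particular NIC driver actually does it.

    #include <linux/spinlock.h>

    struct nic_dma_desc;    /* hypothetical descriptor from the sketch above */

    /* One lock per exported GPU buffer (illustrative only); real drivers use
     * finer-grained schemes or rely on ordering guarantees of the DMA engine. */
    static DEFINE_SPINLOCK(gpu_buf_lock);

    static void post_write_desc(struct nic_dma_desc *desc)
    {
        unsigned long flags;

        spin_lock_irqsave(&gpu_buf_lock, flags);
        /* program the DMA engine here; no other write descriptor for this
         * buffer can be submitted until this one is queued */
        spin_unlock_irqrestore(&gpu_buf_lock, flags);
    }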

On a very related note, another post of mine has a lot of information regarding nVidia's UVA (Unified Virtual Addressing) scheme and how memory manipulation from within kernel space itself is handled. A few of the sentences in this post were taken from it.

Short answer to your question: if by "isolated" you mean how each card preserves its own unique address space for rDMA+GPUDirect operations, this is accomplished by relying on the host machine's physical address space, which fundamentally separates the physical address region(s) requested by all devices on the PCI bus. By forcing the use of each host machine's physical addressing scheme, nVidia essentially isolates each GPU within that host machine.