RDMA Scatter/Gather in verbs API


RDMA scatter/gather is a nice way to consolidate data transfers. For example, the verbs API allows data at multiple local locations to be written to a remote buffer with a single RDMA write operation, and data in a remote buffer can be read into multiple local locations with a single RDMA read operation.

However, I cannot initiate a single RDMA operation that writes to multiple locations on the remote side (or reads from multiple locations on the remote side). This feature is appealing to us because it would use the wide RDMA lanes efficiently for multiple small writes. I also checked the Intel qsm APIs and the Cray gni APIs, and none of them seems to support such a feature. Let's call it "writer-controlled remote scatter". Is there a deep reason this is not supported?

There are 2 answers

Yuval Degani On

I do not have a good explanation for why the verbs interface does not support it, as it could definitely be implemented in hardware.

However, there are at least two ways to do this more efficiently:

1. The easier way: post a list of RDMA requests for the multiple remote locations in a single call, and request a completion entry only for the last one. This performs better than posting them one by one.
2. The more advanced way: create a "UMR" on the remote host that groups all of those locations into one contiguous virtual MR. You can then target that remote virtual MR with a single post operation.

Peleg Abergel On

The reason RDMA write has a limited scatter list is that the list has to be transmitted over the wire and fulfilled by the HCA on the remote side, and HCAs can have limited resources for storing this information. This is in contrast to local operations such as posting a receive descriptor, where the descriptor stays local to the machine.