RDMA connection manager driver pattern


I'm using the OFED 3.18r2 implementation of the InfiniBand drivers for my application, in particular the RDMA connection manager wrapper functions. To better understand what happens under the hood, I usually read the source code. While doing so I came across something that looks like a pattern, but I cannot make sense of it. Take an example: the RDMA connection manager functions are in cma.c. Consider the rdma_listen call (the same structure appears in almost every function in the library whose name starts with "rdma_"):

int rdma_listen(struct rdma_cm_id *id, int backlog)
{
    struct ucma_abi_listen cmd;
    struct cma_id_private *id_priv;
    int ret;

    CMA_INIT_CMD(&cmd, sizeof cmd, LISTEN);
    id_priv = container_of(id, struct cma_id_private, id);
    cmd.id = id_priv->handle;
    cmd.backlog = backlog;

    ret = write(id->channel->fd, &cmd, sizeof cmd);
    if (ret != sizeof cmd)
        return (ret >= 0) ? ERR(ENODATA) : -1;

    if (af_ib_support)
        return ucma_query_addr(id);
    else
        return ucma_query_route(id);
}

Here you can see the pattern I mentioned before:

ret = write(id->channel->fd, &cmd, sizeof cmd);

The first argument to the write() call is the file descriptor associated with /dev/infiniband/rdma_cm, but I cannot understand how the cmd argument is used. Digging into the source, I only found that cmd is a struct that comes from the ABI definition of the rdma cm function calls. I don't understand whether this is a common pattern for executing device driver calls, how it works, or where the real code associated with the cmd argument lives. Could you please help me?

Answer by haggai_e (best answer)

Using the write() system call to execute commands is a common method in the RDMA subsystem. It is used by, among others, the rdma_ucm module and the ib_uverbs module. The kernel code associated with rdma_ucm can be found in the drivers/infiniband/core/ucma.c file. Specifically, the write() system call for this device is implemented in the ucma_write() function.
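To make the idea concrete, here is a minimal user-space sketch of the general shape of such a write() handler: the buffer starts with a small header naming the command and its payload size, and the handler looks the command up in a table of functions. This is not the kernel code. Every name below (demo_cmd_hdr, demo_write, demo_cmd_table and so on) is hypothetical, and memcpy() stands in for the copy from user space that a real driver would perform.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical header that precedes every command in the written buffer. */
struct demo_cmd_hdr {
    uint32_t cmd;   /* index into the command table */
    uint16_t in;    /* size of the payload that follows the header */
    uint16_t out;   /* size of the response buffer, if any */
};

enum { DEMO_CMD_LISTEN, DEMO_CMD_COUNT };

/* Hypothetical payload for the "listen" command. */
struct demo_listen {
    uint32_t id;
    uint32_t backlog;
};

/* Records the last backlog seen, so the effect of a command is observable. */
uint32_t demo_last_backlog;

ssize_t demo_listen_handler(const char *payload, size_t in_len, size_t out_len)
{
    struct demo_listen cmd;

    (void)out_len;                      /* this command has no response */
    if (in_len < sizeof cmd)
        return -EINVAL;
    memcpy(&cmd, payload, sizeof cmd);  /* stand-in for a copy from user space */
    demo_last_backlog = cmd.backlog;
    return 0;
}

typedef ssize_t (*demo_handler)(const char *, size_t, size_t);

/* One handler per command number: this is where the "real code" lives. */
static const demo_handler demo_cmd_table[DEMO_CMD_COUNT] = {
    [DEMO_CMD_LISTEN] = demo_listen_handler,
};

/* Plays the role of the driver's write() entry point. */
ssize_t demo_write(const char *buf, size_t len)
{
    struct demo_cmd_hdr hdr;
    ssize_t ret;

    if (len < sizeof hdr)
        return -EINVAL;
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.cmd >= DEMO_CMD_COUNT || !demo_cmd_table[hdr.cmd])
        return -EINVAL;
    if (hdr.in + sizeof hdr > len)
        return -EINVAL;

    ret = demo_cmd_table[hdr.cmd](buf + sizeof hdr, hdr.in, hdr.out);
    /* On success report the whole buffer as consumed, as write() would. */
    return ret ? ret : (ssize_t)len;
}
```

A caller fills in a header plus payload and passes the whole buffer to demo_write(). A return value equal to the buffer length means the command ran, which mirrors the `ret != sizeof cmd` check in rdma_listen from the question.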

I don't think there is a lot of documentation on this method of calling into the driver. The user_verbs.txt document in the kernel documentation states:

Commands are sent to the kernel via write()s on these device files.
The ABI is defined in drivers/infiniband/include/ib_user_verbs.h.
The structs for commands that require a response from the kernel
contain a 64-bit field used to pass a pointer to an output buffer.
Status is returned to userspace as the return value of the write()
system call.
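The "64-bit field used to pass a pointer to an output buffer" can be illustrated with a small user-space sketch. The caller stores the address of its response struct in a 64-bit integer field of the command, and the handler writes the result back through that address. All names here (demo_create_cmd, demo_create_resp, demo_create_handler, demo_create) and the handle value 42 are hypothetical. memcpy() stands in for the copy to user space that a kernel handler would perform.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command that expects a response from the handler. */
struct demo_create_cmd {
    uint64_t response;  /* address of the caller's output buffer */
    uint32_t flags;
    uint32_t reserved;
};

/* Hypothetical response written back by the handler. */
struct demo_create_resp {
    uint32_t handle;
};

/* Handler side: produce a response and copy it to the caller's buffer. */
int demo_create_handler(const struct demo_create_cmd *cmd, size_t out_len)
{
    struct demo_create_resp resp = { .handle = 42 };  /* made-up value */

    if (out_len < sizeof resp)
        return -ENOSPC;
    /* Stand-in for a copy to user space through the 64-bit pointer field. */
    memcpy((void *)(uintptr_t)cmd->response, &resp, sizeof resp);
    return 0;
}

/* Caller side: pack the pointer, issue the command, read the response. */
int demo_create(uint32_t *handle)
{
    struct demo_create_cmd cmd;
    struct demo_create_resp resp;
    int ret;

    assert(handle != NULL);
    memset(&cmd, 0, sizeof cmd);
    memset(&resp, 0, sizeof resp);
    cmd.response = (uint64_t)(uintptr_t)&resp;

    ret = demo_create_handler(&cmd, sizeof resp);
    if (ret)
        return ret;
    *handle = resp.handle;
    return 0;
}
```

Using a fixed 64-bit integer rather than a native pointer keeps the struct layout identical for 32-bit and 64-bit user programs, which is one plausible reason for the design described in the quoted text.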

I think it may be a small abuse of the write() system call, which here implements something closer to an ioctl().

Edit: Note that I added a link to the upstream kernel module, but the OFED source structure is similar.

Edit: Added some documentation pointers.