Difference between how Linux raw sockets and DPDK pass data to userspace?


I am trying to understand the difference between how DPDK passes packet data to userspace and how Linux raw sockets do it.

The man page for packet sockets implies the kernel does no packet processing:

Packet sockets are used to receive or send raw packets at the device driver (OSI Layer 2) level. They allow the user to implement protocol modules in user space on top of the physical layer.

SOCK_RAW packets are passed to and from the device driver without any changes in the packet data.

https://man7.org/linux/man-pages/man7/packet.7.html

which at first glance might appear to be the same as 'kernel bypass'. However, I presume this is incorrect; otherwise there would be no point to DPDK, OpenOnload, AF_XDP, etc.

So, what is the difference between how DPDK passes frame/packet data to userspace, versus using raw sockets?

Is the answer that raw socket packets still go through the kernel and still incur context switches, copying, etc., but userspace (eventually) sees the entire packet unmodified? And that, in contrast, DPDK passes the (unmodified) data directly to userspace without copying or context switches into the kernel?

So both provide the same data to userspace, just via different paths?


There are 2 answers

4va1anch3 On

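You are close to the answer. A SOCK_RAW socket still needs a system call (with its user/kernel transition) for each send or receive and, most importantly, a copy of the packet to or from kernel memory. DPDK needs neither: its poll-mode driver runs in userspace, and the NIC DMAs packets directly to and from userspace memory.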
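To make the raw socket path concrete, here is a minimal C sketch. It is not taken from the man page, and the function names (`open_raw_socket`, `recv_one_frame`, `frame_ethertype`) are made up for illustration. It assumes Linux with glibc, and opening the socket needs CAP_NET_RAW (usually root). The point is that every frame costs one `recv` system call plus one copy into your buffer, and what arrives is the whole Layer 2 frame, which you then parse yourself.

```c
#include <arpa/inet.h>        /* htons, ntohs */
#include <linux/if_ether.h>   /* ETH_P_ALL, ETH_HLEN, struct ethhdr */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Open a packet socket that sees frames of every protocol.
 * Returns -1 with errno set if the process lacks CAP_NET_RAW. */
int open_raw_socket(void)
{
    return socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
}

/* Receive one frame: one system call and one kernel-to-user copy. */
ssize_t recv_one_frame(int fd, uint8_t *buf, size_t cap)
{
    return recv(fd, buf, cap, 0);
}

/* The buffer starts at the Ethernet header, so userspace does the parsing.
 * Returns the EtherType in host byte order, or -1 if the frame is too short. */
int frame_ethertype(const uint8_t *frame, size_t len)
{
    struct ethhdr eh;

    if (len < ETH_HLEN)
        return -1;
    memcpy(&eh, frame, sizeof eh);   /* avoid unaligned access */
    return ntohs(eh.h_proto);
}
```

A caller would loop on `recv_one_frame` and hand each buffer to its own protocol code. That per-frame syscall and copy is what DPDK avoids.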

djen.yi On

http://yusufonlinux.blogspot.com/2010/11/data-link-access-and-zero-copy.html

Summary (quoted from this article):

1. Unfortunately not much has been written about this feature of packet socket and through this article I will try to bridge this gap.

2. To enable this feature, the kernel should be compiled with the following configuration:

CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y

3. The important step is to allocate the RX ring buffer and the TX ring buffer. To do this, use the setsockopt call as below:

setsockopt(fd , SOL_PACKET , PACKET_RX_RING , (void*)&req , sizeof(req));
setsockopt(fd , SOL_PACKET , PACKET_TX_RING , (void*)&req , sizeof(req));

struct tpacket_req {
    unsigned int tp_block_size; /* Minimal size of contiguous block */
    unsigned int tp_block_nr; /* Number of blocks */
    unsigned int tp_frame_size; /* Size of frame */
    unsigned int tp_frame_nr; /* Total number of frames */
};

The setsockopt call above sets up the circular buffer, which is unswappable memory in the kernel. These buffers are then mapped into user space with mmap, so the application reads frames straight out of the ring.

4. To use zero copy, the socket should be bound to an interface.

5. Other stages of the path, such as DMA from the NIC, may still need one more copy.
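As a rough illustration of steps 3 and 4, here is a C sketch of how the ring might be sized, mapped and walked. It is not from the article, and all function names are hypothetical. The sizing checks reflect my understanding of the kernel's packet_mmap documentation (block size a multiple of the page size, frame size a multiple of `TPACKET_ALIGNMENT`, and frame count equal to frames per block times the number of blocks). It assumes the default TPACKET_V1 header layout and leaves out `bind`, `poll` and the memory barriers that real code would need.

```c
#include <linux/if_packet.h>  /* struct tpacket_req, struct tpacket_hdr, TP_STATUS_* */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/socket.h>       /* setsockopt, SOL_PACKET */
#include <unistd.h>           /* sysconf */

/* Fill req so that the four fields are mutually consistent.
 * Returns 0 on success, -1 if the sizes would be rejected. */
int fill_ring_req(struct tpacket_req *req, unsigned block_size,
                  unsigned block_nr, unsigned frame_size)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || block_size == 0 || block_nr == 0)
        return -1;
    if (block_size % (unsigned)page != 0)          /* whole pages only */
        return -1;
    if (frame_size < TPACKET_HDRLEN || frame_size % TPACKET_ALIGNMENT != 0)
        return -1;
    if (frame_size > block_size)                   /* at least one frame per block */
        return -1;

    req->tp_block_size = block_size;
    req->tp_block_nr   = block_nr;
    req->tp_frame_size = frame_size;
    req->tp_frame_nr   = (block_size / frame_size) * block_nr;
    return 0;
}

/* Ask the kernel for an RX ring, then map it. Returns MAP_FAILED on error. */
void *map_rx_ring(int fd, const struct tpacket_req *req)
{
    size_t len = (size_t)req->tp_block_size * req->tp_block_nr;

    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof *req) < 0)
        return MAP_FAILED;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Locate frame slot idx. Frames never straddle a block boundary. */
struct tpacket_hdr *ring_frame(void *ring, const struct tpacket_req *req,
                               unsigned idx)
{
    unsigned per_block = req->tp_block_size / req->tp_frame_size;
    size_t off = (size_t)(idx / per_block) * req->tp_block_size
               + (size_t)(idx % per_block) * req->tp_frame_size;

    return (struct tpacket_hdr *)((uint8_t *)ring + off);
}

/* Pointer to the link-layer bytes if user space owns the slot, else NULL. */
uint8_t *frame_data(struct tpacket_hdr *hdr)
{
    if (!(hdr->tp_status & TP_STATUS_USER))
        return NULL;
    return (uint8_t *)hdr + hdr->tp_mac;
}

/* Hand the slot back to the kernel once the frame has been consumed. */
void frame_release(struct tpacket_hdr *hdr)
{
    hdr->tp_status = TP_STATUS_KERNEL;
}
```

A receive loop would call `ring_frame` for the next slot, wait with `poll` until `frame_data` returns non-NULL, process the bytes in place, then call `frame_release`. No `recv` call per frame is needed, although the kernel has still copied the frame into the ring, which is where this differs from DPDK.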