EPOLLRDHUP not reliable

3.8k views Asked by At

I'm using non-blocking read/writes over a client-server TCP connection with epoll_wait.

Problem is, I can't reliably detect 'peer closed connection' event using the EPOLLRDHUP flag. It often happens that the flag is not set. The client uses close() and the server, most of the time, receives, from epoll_wait, an EPOLLIN | EPOLLRDHUP event. Reading yields zero bytes, as expected. Sometimes, though, only EPOLLIN comes, yielding zero bytes.

Investigation using tcpdump shows that normal shutdown occurs as far as I can tell. I see a Flags [F.], Flags [F.], Flags [.] sequence of events, which should correspond to FIN, FIN and ACK. SO_LINGER is nowhere used.

I considered handling 'peer closed' on zero-byte read, however, there is the possibility that you get an EPOLLIN | EPOLLRDHUP event with non-zero bytes available, when the peer sends & immediately closes the connection - case in which I need to base myself on the EPOLLRDHUP. Suggestions?

1

There are 1 answers

2
haelix On BEST ANSWER

To answer this: EPOLLRDHUP indeed comes if you continue to poll after receiving a zero-byte read. So from my experiments it looks like either an EPOLLIN with zero-byte read or an EPOLLRDHUP are reliable indicators for orderly shutdown, the only trouble was, they are not received together. Sometimes (the case that makes the subject of this question), it happens that EPOLLIN is received, yielding zero bytes (connection terminated), and on subsequent polling you get to see the EPOLLRDHUP. Other times, it's vice-versa: you get the EPOLLRDHUP together with an EPOLLIN that signals actual bytes to be read. Then, on subsequent reads, you get zero bytes.