I've written an event loop to handle file descriptor read/write events. I have successfully written a version of the code that supports kqueue and a second version that supports select. I am working on my third and final version which will support epoll.
I am experiencing a problem when I register a new descriptor for an EPOLLIN event. The descriptor in question is already "listening" for connections, so I wait for a read event to occur so that I know the next call to "accept" will succeed (common practice for non-blocking accept).
All file descriptors are set to non-blocking.
My call to epoll_wait returns two events for the same descriptor. The first event structure has its events field set to EPOLLIN. The second event structure has its events field set to 0 (empty), and its data.fd field holds the same FD number as the first struct.
Under what circumstances will epoll_wait return an event structure with a zeroed events field?
This does NOT happen every time but it happens 90+% of the time.
Lastly, I'd post code but this is written in Ruby and there is a LOT of boilerplate to wrap up the socket, listen, accept, etc. functions in FFI, set constants, etc. The example code would be quite long and unwieldy so I am not posting any code.
The problem above was a case of garbage in, garbage out. I had removed the Ruby tag after a complaint from a commenter, but I need to add it back. The problem stemmed from the Ruby FFI definition of the `epoll_event` struct. Here is the original, incorrect code:
The above definition yielded an `EpollEventStruct` with a size of 16 bytes. The struct should be 12 bytes.
The problem was that the `data` field in the second struct was offset 8 bytes. By default, Ruby's FFI implementation aligns each field on its natural boundary, which puts the 8-byte `data` field on an 8-byte boundary. The fix is to specify that the struct should be packed.
So when my code was passing the heap memory to the `epoll_ctl` and `epoll_wait` functions, it was operating on event structs that were too large. This corrupted memory, which in turn produced corrupted results that made no sense (i.e. returning 2 events for the same FD, with no `events` bits set in the second struct).
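To make the size mismatch concrete, here is a minimal sketch in plain Ruby that builds the two byte layouts by hand with `Array#pack`. It is not the FFI code from my event loop. The helper names `pack_packed` and `pack_aligned` are made up for illustration, and I am assuming x86-64 Linux (little-endian, with the kernel using the packed 12-byte layout).

```ruby
# Illustration only: the two candidate byte layouts of struct epoll_event.
# Assumes x86-64 Linux; the helper names below are hypothetical.

EPOLLIN = 0x001 # value from <sys/epoll.h>

# Packed layout: uint32 events immediately followed by the 64-bit data union.
def pack_packed(events, data)
  [events, data].pack("L<Q<") # 4 + 8 = 12 bytes
end

# Naturally aligned layout: 4 padding bytes push data out to offset 8.
def pack_aligned(events, data)
  [events, data].pack("L<x4Q<") # 4 + 4 + 8 = 16 bytes
end

packed  = pack_packed(EPOLLIN, 7)
aligned = pack_aligned(EPOLLIN, 7)

puts "packed size:  #{packed.bytesize}"
puts "aligned size: #{aligned.bytesize}"

# Decode the aligned buffer as if it were packed, the way a reader that
# expects the 12-byte layout would. The padding bytes are consumed as the
# low half of data, so the value 7 ends up in the upper 32 bits.
misread_events, misread_data = aligned.unpack("L<Q<")
puts "events seen: #{misread_events}, data seen: #{misread_data}"
```

The `events` field survives because it sits at offset 0 in both layouts, but everything after it is shifted by 4 bytes. Once more than one struct sits in the same buffer, the 12-byte versus 16-byte stride no longer lines up either.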