I'm writing a client and a server on SUSE Linux that communicate over POSIX message queues, similar to the accepted answer in "How do I use mqueue in a c program on a Linux based system?". When the server dies, it calls mq_close and mq_unlink. However, the client gets no notification of this, so calls to mq_send in the client continue to succeed even though the queue has been unlinked.
The problem is that when the server is restarted, it tries to create the queue again using mq_open with O_CREAT, but that fails because the client still has an open fd. So even though the name no longer appears in /dev/mqueue, the server can't create the queue until the client exits or closes its descriptor. I just want to be sure I understand this correctly: if I want the server to close, unlink, and re-open the mqueue (e.g. with different attributes), does the client definitely need to either exit or close its fd? This is very different from how a plain file behaves: I can remove a file that another process is using, the file system might rename it ".nfsXXX" so that process can keep using it, and I can still create a new file with that name right away.
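To make the scenario concrete, here is a minimal single-process sketch of the sequence described above. The helper name `unlink_then_recreate`, the queue attributes, and the message are made up for illustration, and one process holds both the "server" and the "client" descriptor, which may not behave exactly like two separate processes. It reports errno instead of assuming an outcome.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>
#include <sys/stat.h>

/* Hypothetical helper. Returns -1 if the setup failed, 1 if mq_send on the
   unlinked queue failed, 0 if it succeeded. When the return value is not -1,
   *recreate_errno is 0 if the name could be created again while the old
   descriptor was still open, otherwise the errno reported by mq_open. */
int unlink_then_recreate(const char *name, int *recreate_errno)
{
    struct mq_attr attr = { .mq_maxmsg = 4, .mq_msgsize = 64 };

    mq_unlink(name);  /* drop any leftover from an earlier run */

    /* "Server" creates the queue, "client" opens it for writing. */
    mqd_t server = mq_open(name, O_CREAT | O_EXCL | O_RDONLY, 0600, &attr);
    if (server == (mqd_t)-1)
        return -1;
    mqd_t client = mq_open(name, O_WRONLY);
    if (client == (mqd_t)-1) {
        mq_close(server);
        mq_unlink(name);
        return -1;
    }

    /* Server dies: it closes and unlinks, the client descriptor stays open. */
    mq_close(server);
    mq_unlink(name);

    /* Client keeps sending on its old descriptor. */
    int send_failed = (mq_send(client, "ping", 5, 0) != 0);

    /* Restarted server tries to create the same name again. */
    mqd_t again = mq_open(name, O_CREAT | O_EXCL | O_RDONLY, 0600, &attr);
    *recreate_errno = (again == (mqd_t)-1) ? errno : 0;

    if (again != (mqd_t)-1) {
        mq_close(again);
        mq_unlink(name);
    }
    mq_close(client);
    return send_failed;
}
```

Calling it as `unlink_then_recreate("/queueName", &err)` and printing `strerror(err)` when `err` is nonzero shows which error the second mq_open reports, if any. With glibc older than 2.34 this needs to be linked with -lrt.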
My first attempt at a fix is simply not to unlink the mqueue when the server exits. If I want the server to be restartable without also restarting the client, then I suppose the server shouldn't unlink the queue, since it knows the client might still be using it.
What I would ideally like to happen is for the new mq_open to succeed in the server, and the next mq_send to fail in the client. Is there a simple way to simulate this? The ways that occur to me are:
- Doing a stat (or something similar) on "/dev/mqueue/queueName" before every mq_send (yuck!) and closing the fd if the name doesn't exist (while the server tries to recreate it in a loop). Even that doesn't work perfectly if the client is currently blocked in mq_send because the queue was full.
- Having a separate socket in the client that the server would send a message to when it wants the client(s) to close their mqueues (and probably a separate thread in the client to monitor that socket).
- Having the server kill the client(s).
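For what it's worth, the first idea could be sketched roughly as below, using mq_timedsend instead of mq_send so that a client stuck on a full queue wakes up periodically and re-checks the name. The names `queue_name_visible`, `send_checked`, and `SEND_STALE` and the one-second timeout are hypothetical, and the sketch assumes the mqueue file system is mounted at /dev/mqueue.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <mqueue.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

#define SEND_STALE (-2)  /* hypothetical return code: the name is gone */

/* Returns 1 if the queue name is visible under /dev/mqueue, else 0.
   Assumes the mqueue file system is mounted at /dev/mqueue. */
int queue_name_visible(const char *name)
{
    char path[300];
    struct stat st;
    /* Queue names start with '/', so "/dev/mqueue" + name is the path. */
    int n = snprintf(path, sizeof path, "/dev/mqueue%s", name);
    if (n < 0 || (size_t)n >= sizeof path)
        return 0;
    return stat(path, &st) == 0;
}

/* Sends msg, waking up about once per second to re-check the name.
   Returns 0 on success, SEND_STALE if the name has disappeared,
   or -1 with errno set for any other mq_timedsend failure. */
int send_checked(mqd_t q, const char *name, const char *msg, size_t len)
{
    for (;;) {
        if (!queue_name_visible(name))
            return SEND_STALE;  /* caller should mq_close(q) and reopen */

        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);  /* absolute timeout */
        deadline.tv_sec += 1;

        if (mq_timedsend(q, msg, len, 0, &deadline) == 0)
            return 0;
        if (errno != ETIMEDOUT && errno != EINTR)
            return -1;
        /* Queue still full (or call interrupted): check the name again. */
    }
}
```

This is still racy: the name can vanish between the check and the send, and if the server has already recreated the name, the check passes even though the descriptor refers to the old queue. It also adds a stat call to every send, which is the "yuck" part.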