Reliable way to determine file size on POSIX/OS X given a file descriptor

I wrote a function that waits for a file (given as an fd) to grow to a certain size, with a timeout. I'm using kqueue()/kevent() to wait for the file to be "extended", but after I get the notification that the file grew I have to check the file size and compare it against the desired size. That seems easy, but I cannot figure out a way to do it reliably in POSIX.

NB: The timeout fires if the file doesn't grow at all within the specified time. So it is not an absolute timeout, only a limit on how long the file may go without growing. I'm on OS X, but this question is meant for every POSIX system that has kevent()/kqueue(), which should be OS X and the BSDs, I think.

Here's the current version of my function:

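#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/**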
 * Blocks until `fd` reaches `size`. Times out if `fd` isn't extended for `timeout`
 * amount of time. Returns `-1` and sets `errno` to `EFBIG` should the file be bigger
 * than wanted.
 */
int fwait_file_size(int fd,
                    off_t size,
                    const struct timespec *restrict timeout)
{
    int ret = -1;
    int kq = kqueue();
    struct kevent changelist[1];

    if (kq < 0) {
        /* errno set by kqueue */
        ret = -1;
        goto out;
    }

    memset(changelist, 0, sizeof(changelist));
    EV_SET(&changelist[0], fd, EVFILT_VNODE, EV_ADD | EV_ENABLE | EV_CLEAR, NOTE_DELETE | NOTE_RENAME | NOTE_EXTEND, 0, 0);

    if (kevent(kq, changelist, 1, NULL, 0, NULL) < 0) {
        /* errno set by kevent */
        ret = -1;
        goto out;
    }

    while (true) {
        {
            /* Step 1: Check the size */
            int suc_sz = evaluate_fd_size(fd, size); /* IMPLEMENTATION OF THIS IS THE QUESTION */
            if (suc_sz > 0) {
                /* wanted size */
                ret = 0;
                goto out;
            } else if (suc_sz < 0) {
                /* errno and return code already set */
                ret = -1;
                goto out;
            }
        }

        {
            /* Step 2: Wait for growth */
            int suc_kev = kevent(kq, NULL, 0, changelist, 1, timeout);

            if (0 == suc_kev) {
                /* That's a timeout */
                errno = ETIMEDOUT;
                ret = -1;
                goto out;
            } else if (suc_kev > 0) {
                if (changelist[0].filter == EVFILT_VNODE) {
                    if (changelist[0].fflags & NOTE_RENAME || changelist[0].fflags & NOTE_DELETE) {
                        /* file was deleted, renamed, ... */
                        errno = ENOENT;
                        ret = -1;
                        goto out;
                    }
                }
            } else {
                /* errno set by kevent */
                ret = -1;
                goto out;
            }
        }
    }

    out: {
        int errno_save = errno;
        if (kq >= 0) {
            close(kq);
        }
        errno = errno_save;
        return ret;
    }
}

So the basic algorithm works the following way:

  1. Set up the kevent
  2. Check size
  3. Wait for file growth

Steps 2 and 3 are repeated until the file has reached the wanted size.

The code uses a function int evaluate_fd_size(int fd, off_t wanted_size), which returns < 0 for "some error happened or the file is larger than wanted", == 0 for "file not big enough yet", or > 0 for "file has reached the wanted size".

Obviously this only works if evaluate_fd_size is reliable in determining the file size. My first attempt was to implement it with off_t eof_pos = lseek(fd, 0, SEEK_END) and compare eof_pos against wanted_size. Unfortunately, lseek seems to cache its results. So even after kevent returns with NOTE_EXTEND, meaning the file grew, the result may be unchanged! Then I thought about switching to fstat, but found articles saying that fstat caches as well.
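
For reference, the lseek-based first attempt described above could be sketched as follows. This is a minimal sketch of the approach, not a fix: it assumes lseek reports the current end-of-file position, which is exactly the assumption in doubt here. It also moves the file offset of fd to the end of the file as a side effect.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of evaluate_fd_size() based on lseek().
 * Returns > 0 if the file has exactly the wanted size,
 *         0 if the file is still smaller than wanted,
 *       < 0 on error (errno from lseek) or if the file is
 *           larger than wanted (errno set to EFBIG).
 * Side effect: leaves the file offset of fd at end-of-file.
 */
int evaluate_fd_size(int fd, off_t wanted_size)
{
    off_t eof_pos = lseek(fd, 0, SEEK_END);

    if (eof_pos < 0) {
        /* errno set by lseek */
        return -1;
    }
    if (eof_pos > wanted_size) {
        errno = EFBIG;
        return -1;
    }
    return (eof_pos == wanted_size) ? 1 : 0;
}
```

If moving the offset is a problem (for example because fd shares its offset with a writer), the same comparison can be made against st_size from fstat(fd, &st) instead.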

The last thing I tried was using fsync(fd); before off_t eof_pos = lseek(fd, 0, SEEK_END); and suddenly things started working. But:

  1. Nothing states that fsync() really solves my problem
  2. I don't want to fsync() because of performance
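
The fsync() variant can be sketched like this. The function name evaluate_fd_size_synced is made up for illustration, and the sketch only shows the order of the calls; as noted in the list above, nothing found so far states that the fsync() call makes the size check reliable.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical variant of evaluate_fd_size(): calls fsync() on fd
 * before asking for the end-of-file position.
 * Returns > 0 if the file has exactly the wanted size,
 *         0 if it is still smaller,
 *       < 0 on error or if the file is larger than wanted (EFBIG).
 */
int evaluate_fd_size_synced(int fd, off_t wanted_size)
{
    off_t eof_pos;

    if (fsync(fd) < 0) {
        /* errno set by fsync */
        return -1;
    }

    eof_pos = lseek(fd, 0, SEEK_END);
    if (eof_pos < 0) {
        /* errno set by lseek */
        return -1;
    }
    if (eof_pos > wanted_size) {
        errno = EFBIG;
        return -1;
    }
    return (eof_pos == wanted_size) ? 1 : 0;
}
```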

EDIT: It's really hard to reproduce, but I saw one case in which fsync() didn't help. It seems to take (very little) time until the file size is actually larger after a NOTE_EXTEND event hits user space. fsync() probably just acts as a good enough sleep() and therefore works most of the time.

So, in other words: how can I reliably check the file size in POSIX without opening and closing the file, which I cannot do because I don't know the file name? Additionally, I can't find a guarantee that reopening would help.

By the way: int new_fd = dup(fd); off_t eof_pos = lseek(new_fd, 0, SEEK_END); close(new_fd); did not overcome the caching issue.

EDIT 2: I also created an all-in-one demo program. If it prints Ok, success before exiting, everything went fine. But usually it prints Timeout (10000000), which demonstrates the race condition: the size check that follows the last kevent notification reports a size smaller than the actual file size at that moment, so the loop waits for another event that never arrives. Weirdly, when using ftruncate() to grow the file instead of write(), it seems to work (you can compile the test program with -DUSE_FTRUNCATE to test that).
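
The demo program itself is not reproduced here. As a rough sketch of the two ways of growing a file that it compares, a hypothetical helper could look like the following. The name grow_file, the zero-filled buffer, and the 4096-byte chunk size are assumptions for illustration and are not taken from the demo; the write() path also assumes that the file offset of fd is already at end-of-file.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: grows the file behind fd by n bytes.
 * With -DUSE_FTRUNCATE the file is extended via ftruncate(),
 * otherwise zero bytes are written at the current offset
 * (assumed to be at end-of-file).
 * Returns 0 on success, -1 on error with errno set.
 */
int grow_file(int fd, size_t n)
{
#ifdef USE_FTRUNCATE
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -1;
    }
    return ftruncate(fd, st.st_size + (off_t)n);
#else
    static const char zeros[4096] = {0};

    while (n > 0) {
        size_t chunk = n < sizeof(zeros) ? n : sizeof(zeros);
        ssize_t written = write(fd, zeros, chunk);

        if (written < 0) {
            if (errno == EINTR) {
                continue; /* interrupted before writing anything: retry */
            }
            return -1;
        }
        n -= (size_t)written;
    }
    return 0;
#endif
}
```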

There is 1 answer

Fred the Magic Wonder Dog
  1. Nothing states that fsync() really solves my problem
  2. I don't want to fsync() because of performance

Your problem isn't "fstat caching results"; it's the I/O system buffering writes. The size reported by fstat doesn't get updated until the kernel flushes the I/O buffers to the underlying file system.

This is why fsync fixes your problem, and why any solution more or less has to do the equivalent of fsync. (This is what the open/close solution does as a side effect.)

I can't help you with point 2, because I don't see any way to avoid doing fsync.