Using fsync() to ensure data consistency on a real time system


I'm having a hard time determining the best way to use fsync() in a real-time system. The only requirement I need to meet is that the fsync() calls must not break frames (100 Hz, i.e. 10 ms per frame). I did some initial benchmarking, and I'm currently leaning toward calling fsync() after every fixed-size write (about 1 KB) until the file is finished. Another suggestion I was given is to call fsync() from a slower task/thread (either once at the end of the entire file, or on every frame of that slower task).

You can probably guess that I'm a newbie at this by the way I described the issue and the options I explored, but hit me with the complicated stuff anyway. Are there any other implementations that I can try? What would be the most efficient way to go about this?

Thanks!

Edited: The OS I'm running on is Linux. I am using the C library's FILE * interface for file I/O. Since this currently happens in a 100 Hz task, that's 100 frames a second with a 1 KB write per frame (that's just for this specific operation, not accounting for writes made elsewhere in the frame by other operators).


There is 1 answer

Answered by jschmerge

You really need to give specifics as to what operating system you are using to get a good answer for this. Most Unix-like OSes have no concept of real-time guarantees, and the ones that do generally have very loose guarantees regarding file I/O.

For the rest of this answer, I'm going to assume that you are using some variation of modern Linux, which does have some limited real-time scheduling functionality. I'm also going to assume that you are writing data to a simple file on a standard filesystem (ext[234], btrfs, etc.), and that you are using the low-level read()/write() style system calls rather than application-level buffering through C stdio or C++ iostreams...
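Since the question's edit mentions FILE * rather than raw file descriptors, one caveat is worth spelling out: fsync() only acts on data the kernel already has, so anything still sitting in the stdio buffer has to be pushed down with fflush() first. A minimal sketch of that ordering, where flush_to_disk is a hypothetical helper name and not something from the question:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push stdio-buffered data to the kernel, then ask
 * the kernel to commit it to the storage device.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
int flush_to_disk(FILE *fp)
{
    if (fflush(fp) != 0)        /* user-space buffer -> kernel page cache */
        return -1;
    if (fsync(fileno(fp)) != 0) /* kernel page cache -> storage device */
        return -1;
    return 0;
}
```

Calling fsync(fileno(fp)) without the preceding fflush(fp) can return successfully while the most recent fwrite() data is still in user space.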

The way Linux's filesystem layers are designed, all I/O to and from disks is cached in memory and is written out to the storage hardware asynchronously as needed. A kernel thread periodically flushes dirty pages from memory to disk; the flush interval is a tunable that can be changed with sysctl or through the /proc/sys interface. Under light I/O loads, this asynchronous scheme is more than adequate for ensuring that your processes won't block for long on I/O. Once your I/O load starts to exceed the rate at which data can physically be written to disk, however, your application will block, and that block can potentially be extremely lengthy.
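To see what the periodic flush is currently set to on a given machine, the values can be read straight from /proc/sys. The sketch below assumes the relevant knobs are vm.dirty_writeback_centisecs and vm.dirty_expire_centisecs (my assumption about which tunables are meant here); read_tunable and print_writeback_tunables are hypothetical helper names.

```c
#include <stdio.h>

/* Read one integer tunable from /proc/sys. Returns 0 on success and stores
 * the value in *out; returns -1 if the file is missing or unparsable. */
int read_tunable(const char *path, long *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int ok = (fscanf(fp, "%ld", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/* Print the two writeback timing knobs (units: hundredths of a second).
 * The paths are assumptions; adjust them if your kernel differs. */
void print_writeback_tunables(void)
{
    const char *paths[] = {
        "/proc/sys/vm/dirty_writeback_centisecs", /* flusher wake-up period */
        "/proc/sys/vm/dirty_expire_centisecs",    /* age before dirty data is written */
    };
    for (size_t i = 0; i < sizeof paths / sizeof paths[0]; i++) {
        long value;
        if (read_tunable(paths[i], &value) == 0)
            printf("%s = %ld\n", paths[i], value);
        else
            printf("%s: not readable\n", paths[i]);
    }
}
```

Calling print_writeback_tunables() from main() prints one line per tunable; the same values are visible with `sysctl vm.dirty_writeback_centisecs vm.dirty_expire_centisecs`.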

What you are doing with your fsync() calls is circumventing the kernel's asynchronous mechanisms for amortizing I/O cost: each call forces the dirty pages you created to be flushed before it returns. If you do this with too small an I/O size, you are actually, counter-intuitively, making the I/O much slower.

The question puts the typical I/O size at about 1 KiB per frame at 100 frames per second, which works out to roughly 100 KiB per second. That should be well within the OS's ability to flush data to disk on its own. As such, my advice is to just kick the can down the road and worry about blocking on I/O only if it becomes an issue. I would, however, also spend some time writing code that measures the time spent in the write system call, just to be sure :)
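A rough sketch of what that measurement could look like, using clock_gettime() with CLOCK_MONOTONIC around each call. The helper names (now_ns, timed_write_ns, timed_fsync_ns) are made up for illustration, and the code assumes a raw file descriptor rather than a FILE *.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: write one buffer and report how long the write()
 * call took. Returns elapsed nanoseconds, or -1 if the write failed or
 * was short. */
int64_t timed_write_ns(int fd, const void *buf, size_t len)
{
    int64_t start = now_ns();
    ssize_t n = write(fd, buf, len);
    int64_t elapsed = now_ns() - start;
    return (n == (ssize_t)len) ? elapsed : -1;
}

/* Same idea for fsync(): elapsed nanoseconds, or -1 on failure. */
int64_t timed_fsync_ns(int fd)
{
    int64_t start = now_ns();
    int rc = fsync(fd);
    int64_t elapsed = now_ns() - start;
    return (rc == 0) ? elapsed : -1;
}
```

At 100 Hz the frame budget is 10 ms, i.e. 10,000,000 ns, so the number to watch is the worst case over a long run under realistic load, not the average.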