How to determine reasonable number of bytes to read per read system call?

I am playing with file reading/writing but have difficulty deciding how large to make my read buffer for the "read" system call.

In particular, I am looking at "http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html"

It doesn't seem to state any restriction on how many bytes I can read at once other than SSIZE_MAX.

To make matters worse, if I make an array of SSIZE_MAX characters, the program yields:

sh: ./codec: Bad file number

Is there any reasonable way to decide how many bytes to read per read() system call? My concern is that this may vary from system to system (I can't just make larger and larger reads until one fails in order to determine the exact number of bytes I can read, and even if I could, reading that much won't necessarily be any faster than reading fewer bytes).

One idea I had was to check my CPU cache size and try to make my buffer no larger than that, but since I don't know how CPU caches work, I am not sure if this is necessarily correct.

Thanks ahead of time.

There are 3 answers

Nominal Animal (accepted answer)

I've pondered basically the same question, and I've come to a very simple conclusion:

Use a conservative default or heuristic, but let the user override it easily if they want.

You see, in some cases the user might not want the maximum throughput from your utility, and would rather have it do whatever it does in the background. Perhaps the task is just not that important. Personally, on Linux, I often use the nice and ionice utilities to put long-but-not-priority tasks on the back burner, so to speak, so that they don't interfere with my actual work.

Benchmarks within the last decade indicate that block sizes of 128k to 2M (2^17 to 2^21 bytes) consistently work well -- not far from optimal rates in almost all situations -- with the average slowly shifting towards the larger end of that range. Typically, power-of-two sizes seem to work better than non-powers-of-two, although I haven't seen enough benchmarks of various RAID configurations to trust that fully.

Because your utility will almost certainly be recompiled for each new hardware type/generation, I'd prefer a default block size defined at compile time, but one that can be trivially overridden at run time (via a command-line option, environment variable, and/or configuration file).
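
A minimal sketch of that arrangement in C, assuming a hypothetical DEFAULT_BLOCK_SIZE macro and a hypothetical CODEC_BLOCK_SIZE environment variable (neither name comes from the question; pick whatever suits your utility):

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical compile-time default; override with
       -DDEFAULT_BLOCK_SIZE=... when building for a specific machine. */
    #ifndef DEFAULT_BLOCK_SIZE
    #define DEFAULT_BLOCK_SIZE (256 * 1024)
    #endif

    /* Pick the I/O block size: use the environment variable if it is
       set and sane, otherwise fall back to the compile-time default. */
    static size_t block_size(void)
    {
        const char *env = getenv("CODEC_BLOCK_SIZE");  /* hypothetical name */
        if (env) {
            char *end;
            unsigned long value = strtoul(env, &end, 10);
            if (*env && !*end && value >= 4096 && value <= 16UL * 1024 * 1024)
                return (size_t)value;
            fprintf(stderr, "Ignoring invalid CODEC_BLOCK_SIZE '%s'\n", env);
        }
        return DEFAULT_BLOCK_SIZE;
    }

A command-line option would take precedence over the environment variable in the same way; the point is only that the default is never the last word.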

If your utility is packaged for current POSIXy OSes, the binaries could use a default that best suits the types of tasks typically done on that class of machine; for example, Raspberry Pis and other SBCs often don't have that much memory to start with, so a smaller default block size (say, 65536 bytes) might work best. Desktop users might not care about memory hogs, so you might use a much larger default block size on current desktop machines.

(On servers, and in high performance computing (which is where I've pondered this), the block size is basically either benchmarked on the exact hardware and workload, or it is just a barely informed guess. Typically the latter.)

Alternatively, you could construct a heuristic based on the st_blksizes of the files involved, perhaps multiplied by a default factor, and clamped to some preferred range. However, such heuristics tend to bit-rot fast, as hardware changes.
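
A rough sketch of such a heuristic; the multiplier and the clamping limits below are illustrative assumptions, not benchmarked values:

    #include <sys/stat.h>
    #include <unistd.h>

    /* Heuristic: scale the file's preferred I/O block size by a factor,
       then clamp the result to a preferred range. */
    static size_t choose_buffer_size(int fd)
    {
        const size_t factor = 16;              /* assumed multiplier  */
        const size_t lo = 64 * 1024;           /* assumed lower clamp */
        const size_t hi = 2 * 1024 * 1024;     /* assumed upper clamp */
        struct stat st;
        size_t size = lo;

        if (fstat(fd, &st) == 0 && st.st_blksize > 0)
            size = (size_t)st.st_blksize * factor;

        if (size < lo) size = lo;
        if (size > hi) size = hi;
        return size;
    }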

With heuristics, it is important to remember that the idea is not to always achieve the optimum, but to avoid really poor results. If a user wants to squeeze out the last few percent of performance, they can do some benchmarking within their own workflow, and tune the defaults accordingly. (I personally have, and do.)

fuz

Call stat() or fstat() on the file you want to read. The struct stat member st_blksize contains the preferred I/O block size for that file, which is a good buffer size to use when reading from it.
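
A minimal sketch of that, assuming the file is already open as a descriptor so fstat() can be used (error handling kept to a minimum):

    #include <stdio.h>
    #include <sys/stat.h>

    /* Return the preferred I/O block size for an open file descriptor,
       or -1 if fstat() fails. */
    static long preferred_block_size(int fd)
    {
        struct stat st;
        if (fstat(fd, &st) == -1) {
            perror("fstat");
            return -1;
        }
        return (long)st.st_blksize;
    }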

Luis Colorado

Well, determining the proper buffer size is completely problem dependent. First of all, consider the state of the art in buffer sizing: stdio uses BUFSIZ as its buffer size, which is normally the size of one Unix disk block (once fixed at 512 bytes, and now typically somewhere between 1024 and 4096, i.e. from a disk block size up to a virtual page size). This is far lower than the sizes being discussed here, but it is a well-considered, perfectly acceptable value.

On the other hand, think of an embedded system with only 8 KiB of memory trying to use a one-megabyte buffer; it sounds somewhat strange to rely on virtual memory for buffer storage (even if that is possible at all).

Suppose you are designing a file copy utility, where choosing the best buffer size would seem essential. You might think the largest admissible value is a must, but after some testing you will find you have mostly wasted memory. Suppose the process uses a single thread that acts as both reader and writer: it reads a chunk of data, then writes that data to another file. The first thing you notice is that the amount of memory you use hardly matters; it only changes the ordering of your process's reads and writes. If one read() corresponds to one disk read (say, one block at a time), a buffer larger than your disk's block size does not save you any reads for the same data. In fact the system already buffers your data at the kernel level, which is what makes it feasible to read data in single-byte chunks while the system reads it from disk block by block.

The second approach is to minimize system calls. Everybody knows by now that a system call is expensive, so if we can arrange to make as few of them as possible, we should gain something. But after a while you will find there is little extra performance in that: the system is reading your data block by block anyway, and your process spends its time waiting for the disk, so the system call penalty is nearly invisible, representing less than 1% of the wait time. Also, the system has to guarantee that your data does not change mid-call, since read() calls are atomic (this is done by locking the file's inode for the duration of the call), so no other process can touch the same file until your call finishes. Meanwhile, a very large buffer can make your process too large to fit comfortably in memory and load the system with extra swap activity, so be careful before going to large buffers.

Lastly, the extra system call penalty is so small compared with the costs that arise from using very large buffers that chasing it has no real payoff. If you are on a normal-sized system (say, a laptop or desktop computer with RAM on the order of 2-8 GiB), a buffer size of 8 KiB will probably be fine for all these scenarios.
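
As a sketch of the copy loop discussed above, using the 8 KiB buffer suggested here (the partial-write handling is the usual POSIX pattern; EINTR retries and other refinements are omitted):

    #include <unistd.h>

    /* Copy everything from in_fd to out_fd using a fixed 8 KiB buffer.
       Returns 0 on success, -1 on a read or write error. */
    static int copy_fd(int in_fd, int out_fd)
    {
        char buffer[8192];
        ssize_t got;

        while ((got = read(in_fd, buffer, sizeof buffer)) > 0) {
            char *p = buffer;
            while (got > 0) {
                ssize_t put = write(out_fd, p, (size_t)got);
                if (put < 0)
                    return -1;          /* write error */
                p += put;
                got -= put;
            }
        }
        return got < 0 ? -1 : 0;        /* got < 0 means read error */
    }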

One final thing to consider is buffer sizing for audio/video streaming. Here there is normally a producer (the reader of the data to be played) that produces data at a varying rate (varying over time with network load, for example) and a consumer that eats that data at a fixed rate (say, 8 kbit/s for a telephone call, around 192 kbit/s for CD playback, etc.). The buffer must be large enough to absorb changes in the supply rate without ever running empty, since an empty buffer means a gap of silence. This is dynamic in nature, and you have to allow for possible network packet loss and retransmission delays. By delaying the consumer and pre-filling the buffer with streaming data, you can compensate and keep the consumer happy with no data loss. In this case a 2 MB streaming buffer is common sense in some scenarios, depending on the probability of data loss, the retransmission time, and the streaming quality you want to afford.