Why is the iops observed by fio different from that observed by iostat?


Recently I have been testing my disk with fio. My fio configuration is as follows:

[global]
invalidate=0    # mandatory
direct=1
#sync=1
fdatasync=1
thread=1
norandommap=1
runtime=10000
time_based=1

[write4k-rand]
stonewall
group_reporting
bs=4k
size=1g
rw=randwrite
numjobs=1
iodepth=1

With this configuration, fio does random writes using direct I/O. While the test is running, I use iostat to monitor I/O performance. I found that with fdatasync set to 1, the IOPS reported by fio is about 64, while the IOPS reported by iostat is about 170. Why do they differ? If I leave out "fdatasync", both figures are approximately the same but much higher, about 450. Why? As far as I know, direct I/O does not go through the page cache, which, in my opinion, means a write should take about the same time no matter whether fdatasync is used.

I have also heard that iostat can produce wrong statistics under some circumstances. Is that true? Under exactly what circumstances does iostat go wrong? Are there any other tools I can use to monitor I/O performance?


There is 1 answer

Anon On

Looking at your job file, it appears you are doing I/O not against a block device but against a file within a filesystem. So while you ask the filesystem to "put this data at that location in that file", the filesystem may turn that into multiple block device requests because it also has to update the metadata associated with the file (e.g. the journal, file timestamps, copy on write etc.). Thus by the time the requests are sent down to the disk (which is what you're measuring with iostat), the original request has been amplified. With fdatasync=1, fio calls fdatasync() after every write, so each 4k write probably has to wait for that extra I/O to reach the disk, which would also explain why the fio figure drops.
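To put a number on that amplification, here is a minimal Python sketch. It assumes the device-level write rate is taken from the "writes completed" counter in /proc/diskstats (the kind of kernel counter iostat reports from), sampled twice. The function names and the sample device name "sda" are made up for illustration and are not part of fio or iostat.

```python
def writes_completed(diskstats_text, device):
    """Return the 'writes completed' counter for `device`.

    `diskstats_text` is the content of /proc/diskstats. Each line starts
    with: major, minor, name, then the per-device counters; 'writes
    completed' is the 8th whitespace-separated field (index 7).
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 11 and fields[2] == device:
            return int(fields[7])
    raise ValueError(f"device {device!r} not found")


def device_write_iops(before, after, device, interval_s):
    """Write requests per second completed by the device between two samples."""
    delta = writes_completed(after, device) - writes_completed(before, device)
    return delta / interval_s


def amplification(device_iops, fio_iops):
    """How many device-level writes were issued per application-level write."""
    return device_iops / fio_iops
```

With the figures from the question, amplification(170, 64) is roughly 2.7, i.e. on average between two and three block device writes per 4k write issued by fio. To use it live, read /proc/diskstats twice a known number of seconds apart and pass both snapshots to device_write_iops.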

Something else to bear in mind is that Linux may have an I/O scheduler for that disk. It can rearrange, split and merge requests before they are submitted to the disk or returned further up the stack. See the different values of the nomerges parameter in https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt for how to avoid some of the merging/rearranging, but note you can't control the splitting of a request that is too large (though a filesystem won't make overly large requests).
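As a rough illustration, the following Python sketch reads the current nomerges setting for a disk and describes it. The meanings of 0, 1 and 2 follow the kernel document linked above; the helper names are hypothetical, and the device name you pass in (e.g. "sda") is an assumption about your system.

```python
from pathlib import Path

# Meanings as described in the kernel's queue-sysfs documentation.
NOMERGES_MEANING = {
    0: "all merges enabled (default)",
    1: "only simple one-hit merges are tried",
    2: "no merging is attempted",
}


def describe_nomerges(raw):
    """Translate the raw content of the nomerges file into a description."""
    return NOMERGES_MEANING.get(int(raw.strip()), "unknown value")


def read_nomerges(device, sysfs_root="/sys/block"):
    """Read and describe /sys/block/<device>/queue/nomerges."""
    path = Path(sysfs_root) / device / "queue" / "nomerges"
    return describe_nomerges(path.read_text())
```

Calling read_nomerges("sda") on a Linux machine that has such a disk returns one of the descriptions above. Changing the value means writing to the same file as root.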

(PS: I've not known iostat to be "wrong", so you might need to ask the people who say so directly to find out what they mean.)