fio -numjobs=8 -directory=/mnt -iodepth=64 -direct=1 -ioengine=libaio -sync=1 -rw=randread -bs=4k
FioTest: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
iops: (8 threads and iodepth=64)-> 356, 397, 399, 396, ... but when -numjobs=1 and iodepth=64, the iops -> 15873
I feel a little confused. Why the -numjobs larger, the iops will be smaller?
It's hard to make a general statement because the correct answer depends on a given setup.
For example, imagine I have a cheap spinning SATA disk whose sequential speed is fair but whose random access is poor. The more random I make the accesses the worse things get (because of the latency involved in each I/O being serviced - https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html suggests 3ms is the cost of having to seek). So 64 simultaneous random access is bad because the disk head is seeking to 64 different locations before the last I/O is serviced. If I now bump the number of jobs up to 8 that 64 * 8 = 512 means even MORE seeking. Worse, there are only so many simultaneous I/Os that can actually be serviced at any given time. So the disk's queue of in-flight simultaneous I/Os can become completely full, other queues start backing up, latency in turn goes up again and IOPS start tumbling. Also note this is compounded because you're prevent the disk saying "It's in my cache, you can carry on" because
sync=1
forces the I/O to have to be on non-volatile media before it is marked as done.This may not be what is happening in your case but is an example of a "what if" scenario.