I am trying to benchmark a tool I'm developing in terms of time, memory, and disk use. I know /usr/bin/time gives me basically what I want for the first two, but for disk use I concluded that I would have to roll my own bash script that periodically extracts the 'bytes written' value from /proc/<my_pid>/io. Based on this script, here's what I came up with:
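# Run the command to benchmark in the background and remember its PID.
"$@" &
pid=$!
status=$(ps -o rss -o vsz -o pid | grep "$pid")
maxdisk=0
# Poll for as long as the process still shows up in ps.
while [ "${#status}" -gt "0" ];
do
    sleep 0.05
    delta=false
    # write_bytes is a cumulative counter, reported in bytes.
    disk=$(cat /proc/$pid/io | grep -P '^write_bytes:' | awk '{print $2}')
    # Convert bytes to KB using arithmetic expansion.
    disk=$((disk/1024))
    if [ "0$disk" -gt "0$maxdisk" ] 2>/dev/null; then
        maxdisk=$disk
        delta=true
    fi
    # Only print when a new maximum was observed.
    if $delta; then
        echo disk: $disk
    fi
    status=$(ps -o rss -o vsz -o pid | grep "$pid")
done
# Collect the exit status of the benchmarked command.
wait $pid
ret=$?
echo "maximal disk used: $maxdisk KB"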
Unfortunately, I am running into two problems:
- The first is that I am piping the output of this script along with that of the tool I would like to benchmark to a file, and it seems occasionally these streams interfere, leading me to see 0 or too low disk use reported at the bottom of this file.
- The second problem is that I don't know how to handle processes that delete temporary files while they run. In this case I think the fair benchmark would be to record the maximum net disk use (i.e., the peak of bytes written minus bytes erased), but I don't know where the second term of this difference can be found.
How can I resolve these problems?
You may like to have a look at filetop from BCC (Tools for BPF-based Linux IO analysis, networking, monitoring, and more). Brendan Gregg gives good talks and demos about Linux performance tools; they are quite instructive.
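If installing BCC is not an option, the two problems in the question can also be approached with /proc alone. The following is only a minimal sketch under some assumptions: the function names (io_field, net_written_kb, monitor) and the log file argument are made up for illustration. It relies on /proc/<pid>/io also exposing a cancelled_write_bytes counter, which proc(5) describes as writes that never reached the disk because the page cache was truncated (for example, a file deleted before writeback). Subtracting it from write_bytes approximates "net" bytes written; it is not an exact measure of the on-disk footprint. The result goes to a separate log file so it cannot interleave with the tool's own output.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the log file argument are hypothetical.

# Print one numeric field (e.g. write_bytes) from a /proc/<pid>/io-style
# file, or 0 if the file or the field is missing (process already gone).
io_field() {
    local file="$1" field="$2" value
    value=$(awk -v key="${field}:" '$1 == key { print $2; exit }' "$file" 2>/dev/null)
    echo "${value:-0}"
}

# Approximate net KB written: write_bytes minus cancelled_write_bytes.
net_written_kb() {
    local file="$1" written cancelled
    written=$(io_field "$file" write_bytes)
    cancelled=$(io_field "$file" cancelled_write_bytes)
    echo $(( (written - cancelled) / 1024 ))
}

# Usage: monitor LOGFILE COMMAND [ARGS...]
# Runs COMMAND, samples its counters, and appends the peak to LOGFILE.
monitor() {
    local log="$1"; shift
    "$@" &
    local pid=$! peak=0 now
    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        now=$(net_written_kb "/proc/$pid/io")
        [ "$now" -gt "$peak" ] && peak=$now
        sleep 0.05
    done
    wait "$pid"
    local ret=$?
    echo "peak net disk written: $peak KB" >> "$log"
    return "$ret"
}
```

Assuming the functions are sourced and ./mytool stands in for the tool being benchmarked, an invocation could look like `monitor disk.log ./mytool input.dat > tool.out`. Because the counters are sampled every 50 ms, a short-lived peak between two samples can still be missed.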