Is there a standard way to diff du outputs to detect where disk space usage has grown the most?


I work with a small team of developers, and we share a Unix file system to store fairly large datasets. This file system has a rather restrictive quota, so about once a month we have to figure out where our free space has gone and see what we can recover.

Obviously we use du a fair amount, but this is still a tedious process. I had the thought that we could keep last month's du output around and compare it to this month's to see where we've had the most growth. My guess is that this plan isn't very original.

With this in mind, I am asking whether there are any scripts out there that already do this.

Thanks.


There are 4 answers

NawaMan (best answer)

I don't know whether there is a standard way, but I needed this some time ago and wrote a small Perl script to handle it. Here is the relevant part of my code:

#!/usr/bin/perl
use strict;
use warnings;

my $FileName = "du-previous";   # snapshot saved by the previous run
my $Location = ".";             # directory to measure; adjust as needed
my %Sizes;                      # path => size change since the previous run

# Current +++++++++++++++++++++++++++++
my $Current = `du "$Location"`;
warn "du reported errors for $Location\n" if $? != 0;
for (split /\n/, $Current) {
    if (/^([0-9]+)[ \t]+(.*)$/) {
        $Sizes{$2} = $1;
    }
}

# Previous ++++++++++++++++++++++++++++
# On the first run there is no snapshot yet, so every size is reported as growth.
if (open(my $Previous, '<', $FileName)) {
    while (<$Previous>) {
        chomp;
        if (/^([0-9]+)[ \t]+(.*)$/) {
            # Paths missing from the current run count as 0, giving a negative change.
            $Sizes{$2} = ($Sizes{$2} // 0) - $1;
        }
    }
    close($Previous);
}

# Show result +++++++++++++++++++++++++
# Largest growth first; unchanged paths are skipped.
for my $Path (sort { $Sizes{$b} <=> $Sizes{$a} } keys %Sizes) {
    next if $Sizes{$Path} == 0;
    printf("%-10d %s\n", $Sizes{$Path}, $Path);
}

# Save Current ++++++++++++++++++++++++
open(my $Save, '>', $FileName) or die "Cannot write $FileName: $!\n";
print $Save $Current;
close($Save);

The code is not very error-tolerant, so you may want to adjust it.

Basically, the code gets the current disk usage information, compares each size with the one from the last time it ran (saved in 'du-previous'), prints the differences, and saves the current usage information.
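If you already have two saved du outputs (say last month's and this month's), the compare step alone can be sketched as below. This is only a rough Python sketch of the same idea, not a port of the script above: the function names parse_du and diff_du are made up for illustration, and it assumes plain `du` output with one "size, whitespace, path" entry per line.

```python
import re

# One line of plain `du` output: a block count, whitespace, then the path.
DU_LINE = re.compile(r"^([0-9]+)[ \t]+(.*)$")


def parse_du(text):
    """Map each path in `du` output to its size (hypothetical helper)."""
    sizes = {}
    for line in text.splitlines():
        match = DU_LINE.match(line)
        if match:
            sizes[match.group(2)] = int(match.group(1))
    return sizes


def diff_du(previous_text, current_text):
    """Return (change, path) pairs, largest growth first.

    Paths present in only one snapshot are treated as size 0 in the other,
    so new directories show up as growth and deleted ones as negative change.
    Unchanged paths are left out.
    """
    previous = parse_du(previous_text)
    current = parse_du(current_text)
    changes = []
    for path in previous.keys() | current.keys():
        delta = current.get(path, 0) - previous.get(path, 0)
        if delta != 0:
            changes.append((delta, path))
    # Sort by descending change, then by path so the order is stable.
    return sorted(changes, key=lambda item: (-item[0], item[1]))
```

To use it, read the two saved files into strings, pass the older one first, and print the returned pairs. Both snapshots should be taken with the same du options, otherwise the sizes are not comparable.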

If you like it, take it.

Hope this helps.

DigitalRoss

What you really really want is the awesome kdirstat.

Robie Basak

For completeness, I've also found du-diff and don't see it mentioned in any other answer. Andrew's diff-du (mentioned in another answer) seems to be more advanced than this one.

Andrew

I wrote a program to do this called diff-du. I can't believe nobody had already done this! Anyhow, I find it useful and I hope you will too.