Get the size and other info with the "du" command


I'm writing a small Bash script that shows, for each folder, the total size in MB, the number of files, the folder's number and the folder's name. I have almost everything except the size in MB:

du -a -h | cut -d/ -f2 | sort | uniq -c

It shows something like this:

  4 01 folder 01
  6 02 folder 02
 11 03 folder 03
 13 04 folder 04
 16 05 folder 05
 .....
 15 13 folder 13
  1 5.7G    .

As you can see, the columns are: number of files, folder number and folder name.

I want this:

  300M 4 01 folder 01
  435M 6 02 folder 02
  690M 11 03 folder 03
  780M 13 04 folder 04
  1.6G 16 05 folder 05
 .....
 15 13 folder 13
  1 5.7G    .

Thank you in advance.

P.S. Is there a way to show a header above each column, like this?

  M    F  # name
  300M 4 01 folder 01
  435M 6 02 folder 02
  690M 11 03 folder 03
  780M 13 04 folder 04
  1.6G 16 05 folder 05
 .....
 15 13 folder 13
  1 5.7G    .

There is 1 answer

Josh Jolly (best answer)

How about this?

echo -e "Size\tFiles\tDirectory"; paste <(du -sh ./*/ | sort -k2 | cut -f1) <(find ./*/ | cut -d/ -f2 | uniq -c | sort -k2 | awk '{print ($1-1)"\t"$2}') | sort -nk2

Sample output:

Size    Files   Directory
172M    36      callrecords
17M     747     manual
83M     2251    input
7.5G    16867   output

Explanation:

Add the header:

echo -e "Size\tFiles\tDirectory";

<(COMMAND) is process substitution, a Bash construct that lets the output of a command be used as if it were a file. paste takes two files and prints them side by side, line by line, so here we are pasting together the outputs of two commands. The first is this:
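
To see process substitution and paste on their own, here is a minimal sketch. The two printf commands are just stand-ins for the real du and find pipelines, and the values are borrowed from the sample output above:

```bash
#!/usr/bin/env bash
# Two throwaway commands stand in for the real du and find pipelines.
# paste joins line N of its first input with line N of its second,
# separated by a tab.
joined=$(paste <(printf '%s\n' 172M 17M) <(printf '%s\n' callrecords manual))
printf '%s\n' "$joined"
```

Each <(...) is handed to paste as a file name, so this needs Bash (or another shell with process substitution) rather than plain sh.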

<(du -sh ./*/ | sort -k2 | cut -f1)

This finds the total size of each subfolder of the current folder (-s summarises everything inside it, -h prints human-readable units). The output is sorted by folder name, and then only the first column is kept. This gives us a list of the sizes of the subfolders of the current folder, sorted by name.

The second command is this:

<(find ./*/ | cut -d/ -f2 | uniq -c | sort -k2 | awk '{print ($1-1)"\t"$2}')

This is similar to your original command: it lists every file and folder below each subfolder of the current directory, truncates each path to its first-level folder name, then counts the repeats to give each sub-folder of the current folder along with the number of entries in it. This is then sorted by the folder names, and the awk command formats the results and subtracts 1 from each count (as the folder itself is included). Note that nested subfolders are counted as entries too, and that $2 prints only the first word of a folder name, so names containing spaces are cut short. We can then paste the results together to get the (almost) final output.
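
If you want to check the counting step in isolation, something like the following should do. It builds a throwaway tree (the names demo, a and b are made up for the illustration) and runs the same pipeline inside it:

```bash
#!/usr/bin/env bash
# Build a throwaway tree: demo/a holds 2 files, demo/b holds 1 file.
tmp=$(mktemp -d)
mkdir -p "$tmp/demo/a" "$tmp/demo/b"
touch "$tmp/demo/a/one" "$tmp/demo/a/two" "$tmp/demo/b/three"

# Same pipeline as in the answer, run from inside the demo directory.
# find lists each folder itself plus its contents, hence the "- 1" in awk.
counts=$(cd "$tmp/demo" && find ./*/ | cut -d/ -f2 | uniq -c | sort -k2 | awk '{print ($1-1)"\t"$2}')
printf '%s\n' "$counts"

rm -rf "$tmp"
```

Folder a is reported with 2 entries and folder b with 1, because find prints three lines for a (the folder and its two files) and two for b.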

Finally, we use sort -nk2 on the output of the paste command to sort numerically on the 2nd field, i.e. the number of files.
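
Since the folder names in the question contain spaces, a loop may be easier to get right than the one-liner. This is only a sketch of an alternative, not the command above: report_dirs is a made-up helper name, it counts regular files only (find -type f), and it lists folders in glob (alphabetical) order instead of sorting by file count:

```bash
#!/usr/bin/env bash
# Print size, file count and name for each subdirectory of a directory.
# report_dirs is a hypothetical helper name used for this illustration.
report_dirs() {
    local base=${1:-.} dir size files
    printf 'Size\tFiles\tDirectory\n'
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue      # skip the literal pattern if nothing matches
        dir=${dir%/}                   # drop the trailing slash from the glob
        size=$(du -sh "$dir" | cut -f1)                    # human-readable total
        files=$(find "$dir" -type f | wc -l | tr -d ' ')   # regular files only
        printf '%s\t%s\t%s\n' "$size" "$files" "${dir##*/}"
    done
}

# Report on the directory given as the first argument (default: current).
report_dirs "${1:-.}"
```

Because the name is printed from the shell variable rather than from an awk field, a folder such as "01 folder 01" comes out whole. File names containing newlines would still throw off the wc -l count.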