Delete directories that are not listed in a file of directory names


I have a file listing the names of the directories I want to keep. Say it is called file1 and its contents are directory names like

  • dir1
  • dir2
  • dir3

My directory (the actual directory on disk), on the other hand, contains directories like

  • dir1
  • dir2
  • dir3
  • dir4
  • dirs

What I want to do is delete dir4, dirs and any other directories whose names do not appear in file1 from My directory. file1 has one directory name per line. There might be subdirectories or files under dir4 and dirs, so the deletion needs to be recursive.

I can use xargs to delete the directories on the list from My directory:

xargs -a file1 rm -r

But instead of removing them, I want to keep them and remove the others, which are not in file1. I could do

xargs -a file1 mv -t /home/user1/store/

and then delete the remaining directories in My directory, but I am wondering if there is a better way?
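
For reference, the move-aside idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a recommendation over the answers: keep_by_moving and its two arguments are hypothetical names, the temporary store is created inside the target directory so the moves stay on one filesystem, and GNU find, mv -t and mktemp are assumed to be available.

```shell
#!/bin/sh
# keep_by_moving TARGET LIST   (hypothetical helper, not from the question)
# Moves the directories named in LIST aside, removes every other
# directory directly under TARGET, then moves the kept ones back.
keep_by_moving() {
    target=$1   # directory to clean up
    list=$2     # file with one directory name per line

    # Temporary store inside TARGET, so mv never crosses filesystems.
    store=$(mktemp -d "$target/.keep.XXXXXX") || return 1

    # Move every listed directory that actually exists into the store.
    while IFS= read -r name || [ -n "$name" ]; do
        case $name in
            ''|.|..|*/*) continue ;;   # skip blank or path-like entries
        esac
        if [ -d "$target/$name" ]; then
            mv -- "$target/$name" "$store/"
        fi
    done < "$list"

    # Delete the remaining top-level directories, except the store itself.
    find "$target" -mindepth 1 -maxdepth 1 -type d \
        ! -name "${store##*/}" -exec rm -rf -- {} +

    # Put the kept directories back and drop the now-empty store.
    find "$store" -mindepth 1 -maxdepth 1 -exec mv -t "$target" -- {} +
    rmdir -- "$store"
}
```

It would be called as keep_by_moving /path/to/mydir file1. Regular files directly under the target are left alone, since only directories are deleted.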

Thanks.


There are 3 answers

Eugeniu Rosca On BEST ANSWER
find . -maxdepth 1 -type d -path "./*" -exec sh -c \
    'for f; do f=${f#./}; grep -qxF -- "$f" file1 || rm -rf -- "$f"; done' sh {} +
zedfoxus On

Anish has a great one-liner answer for you. If you want something more verbose that you can adapt later for other data manipulation, here's a longer version:

#!/bin/bash

# send this function the directory name
# it compares that name with all entries in
# file1. If entry is found, 0 is returned
# That means...do not delete directory
#
# Otherwise, 1 is returned
# That means...delete the directory
isSafe()
{
    # accept the directory name parameter
    DIR=$1
    echo "Received $DIR"

    # assume that directory will not be found in file list
    IS_SAFE=1 # false

    # read file line by line
    while IFS= read -r line; do

        echo "Comparing $DIR and $line."
        if [ "$DIR" = "$line" ]; then
            IS_SAFE=0 # true
            echo "$DIR is safe"
            break
        fi

    done < file1

    return $IS_SAFE
}

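# find all directories directly under the current directory
# and loop through them (a recursive find would also flag
# subdirectories of the directories we want to keep)
for i in */; do

    # skip if the glob matched nothing, and strip the trailing slash
    [ -d "$i" ] || continue
    i=${i%/}

    # send each directory name to the function and
    # capture its exit status with $?
    isSafe "$i"
    SAFETY=$?

    # decide whether to delete directory or not
    if [ $SAFETY -eq 1 ]; then
        echo "$i will be deleted"
        # uncomment below
        # rm -rf -- "$i"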
    else
        echo "$i will NOT be deleted"
    fi
    echo "-----"

done
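
The same dry-run-first idea can be written more compactly by letting grep do the comparison. The following is a minimal sketch under assumptions, not part of the script above: prune_unlisted and its --delete switch are hypothetical names, and it only looks at directories directly under the current directory.

```shell
#!/bin/sh
# prune_unlisted LIST [--delete]   (hypothetical helper)
# Without --delete it only reports what it would do.
prune_unlisted() {
    list=$1
    mode=${2:-dry-run}

    for d in */; do
        d=${d%/}                              # strip the trailing slash
        # Skip an unmatched glob and symlinks that point to directories.
        [ -d "$d" ] && [ ! -L "$d" ] || continue

        if grep -qxF -- "$d" "$list"; then    # exact, literal, whole-line match
            echo "keep $d"
        elif [ "$mode" = "--delete" ]; then
            rm -rf -- "$d" && echo "deleted $d"
        else
            echo "would delete $d"
        fi
    done
}
```

Run prune_unlisted file1 first and check the output, then prune_unlisted file1 --delete once the list looks right.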
atti On

You can exclude the directories you want to keep using grep:

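find . -mindepth 1 -maxdepth 1 -type d -printf '%P\n' | grep -f file1 -Fx -v | xargs -r -d '\n' rm -r --

-printf '%P\n' is used to remove the leading './' from directory names. With GNU xargs, -r skips running rm when nothing is left to delete, and -d '\n' keeps names that contain spaces intact.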
From man find, description of -printf format:

%P     File's name with the name of the starting-point under which it was found removed.

grep parameters:

-f FILE   Obtain patterns from FILE, one per line.
-F     Interpret PATTERNS as fixed strings, not regular expressions.
-x     Select only those matches that exactly match the whole line. For a regular expression pattern, this is like parenthesizing the pattern and then surrounding it with ^ and $.
-v     Invert the sense of matching, to select non-matching lines.
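
To see why -F and -x both matter here, the filter can be tried on a few made-up names before any rm is involved. This is a small sketch: unlisted is a hypothetical wrapper name and the sample names are invented for the demonstration.

```shell
#!/bin/sh
# unlisted KEEPFILE   (hypothetical wrapper)
# Reads names on stdin and prints those that are NOT in KEEPFILE.
unlisted() {
    grep -Fxv -f "$1"
}

keep=$(mktemp)
printf '%s\n' dir1 a.b > "$keep"

# dir10 contains "dir1" and aXb matches the regex a.b, but neither is
# an exact literal match, so both count as unlisted.
printf '%s\n' dir1 dir10 a.b aXb | unlisted "$keep"
# prints:
# dir10
# aXb

rm -f "$keep"
```

Without -x, dir10 would be treated as listed because it contains dir1; without -F, aXb would be treated as listed because the dot in a.b matches any character.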