Bash - How To Copy Files From A Directory To Multiple Directories?


The directory mydir has 1000 files (counted with ls | wc -l), but I want to copy only the files named like file.num.txt, each to the directory for its number num.

Here is an example:

  1. mydir
    • file.1
    • file.1.txt
    • file.2
    • file.2.txt
    • ...
  2. /home/user1/store, which has subdirectories like
    • dir1
    • dir2
    • ...

So I want to copy file.1.txt to dir1, file.2.txt to dir2, and so forth.

Thanks.


There are 3 answers

Jahid (best answer)

This should work:

#!/bin/bash
src="mydir"
dest="/home/user1/store"
dir="dir" # name of the target dirs without the number, i.e. "dir" from dir1, dir2
regex='(.*\.)([0-9]+)(\.txt$)'
for file in "$src"/*;do
  if [[ -f $file ]];then
    if [[ $file =~ $regex ]];then
      mkdir -p "$dest"/"$dir${BASH_REMATCH[2]}"
      cp "$file" "$dest"/"$dir${BASH_REMATCH[2]}"
    fi
  fi
done

Explanation:

${BASH_REMATCH[2]} contains capture group 2 (the number part of the filename) from matching $file against the pattern $regex. The match is done in the if statement:

if [[ $file =~ $regex ]];then

mkdir -p creates the target directory if it does not already exist.
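
To see what the capture groups hold, the match can be tried on a few sample names. This is only a small sketch: the sample filenames and the helper name dir_number are made up for illustration, and nothing is copied.

```bash
#!/bin/bash
# Same pattern as in the script above; group 2 is the number before ".txt".
regex='(.*\.)([0-9]+)(\.txt$)'

# dir_number NAME (hypothetical helper)
# Prints the captured number, or returns 1 if NAME does not match.
dir_number() {
  if [[ $1 =~ $regex ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Sample names (assumed, not from a real directory listing).
for name in mydir/file.1.txt mydir/file.12.txt mydir/file.2; do
  if num=$(dir_number "$name"); then
    echo "$name -> dir$num"
  else
    echo "$name -> skipped (no match)"
  fi
done
```

Names without the .txt suffix, such as mydir/file.2, fail the match and are skipped, which is why the script only copies the file.num.txt files.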

Ole Tange

With GNU Parallel you can run:

parallel '{= $_ = /\.\d+\.txt$/ ? "true" : "false" =} && mkdir -p dir{= s/\D//g =} && cp {} dir{= s/\D//g =}' ::: file.*.txt

The first part evaluates to the command 'true' or 'false', which acts as a grep-like filter. If you know that every file matching 'file.*.txt' has the form 'file.num.txt', then it is not needed.

'mkdir -p' will create the dir if it is not already there.

The &&'s are needed to make sure the command is only run if the first part evaluates to 'true'.
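
For comparison, the same test-then-act chain can be written for a single file in plain bash, without GNU Parallel. This is only an analogue, not what GNU Parallel runs internally: [[ ... =~ ... ]] stands in for the Perl match, ${base//[!0-9]/} stands in for s/\D//g, and the function name copy_one and its destination argument are assumptions made for this sketch.

```bash
#!/bin/bash
# copy_one FILE DESTROOT (hypothetical helper)
# Each step runs only if the one before it succeeded, as with the && chain above.
copy_one() {
  local file=$1 destroot=$2
  local re='\.[0-9]+\.txt$'
  local base=${file##*/}        # filename without the leading path
  # Delete every non-digit, like s/\D//g. Digits elsewhere in the
  # name would be kept too, so this assumes names like file.num.txt.
  local num=${base//[!0-9]/}
  [[ $base =~ $re ]] &&
    mkdir -p "$destroot/dir$num" &&
    cp "$file" "$destroot/dir$num"
}
```

A call such as copy_one mydir/file.1.txt /home/user1/store would then create dir1 if needed and copy the file into it, while copy_one mydir/file.1 /home/user1/store stops at the first test and copies nothing.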

GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: Simple scheduling]

GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]
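
The difference between the two schemes can be estimated with a small simulation. This sketch is a simplified model, not GNU Parallel's actual implementation: the job durations (job i takes i seconds) are made up, and "dynamic" here just means that each job starts on whichever CPU becomes free first.

```bash
#!/bin/bash
jobs_total=32
cpus=4
per_cpu=$(( jobs_total / cpus ))   # 8 jobs per CPU in the static scheme

# Hypothetical durations: job i runs for i seconds.
durations=()
for (( i = 1; i <= jobs_total; i++ )); do durations+=( "$i" ); done

# Static: each CPU gets a fixed block of 8 jobs up front;
# the slowest CPU determines the total time.
static=0
for (( c = 0; c < cpus; c++ )); do
  load=0
  for (( j = c * per_cpu; j < (c + 1) * per_cpu; j++ )); do
    load=$(( load + durations[j] ))
  done
  if (( load > static )); then static=$load; fi
done

# Dynamic: each job in turn goes to the CPU that becomes free earliest.
busy=()
for (( c = 0; c < cpus; c++ )); do busy+=( 0 ); done
for d in "${durations[@]}"; do
  min=0
  for (( c = 1; c < cpus; c++ )); do
    if (( busy[c] < busy[min] )); then min=$c; fi
  done
  busy[min]=$(( busy[min] + d ))
done
dynamic=0
for t in "${busy[@]}"; do
  if (( t > dynamic )); then dynamic=$t; fi
done

echo "static:  $static s"
echo "dynamic: $dynamic s"
```

With these made-up durations the static split finishes after 228 s and the dynamic one after 144 s, because in the static case one CPU is stuck with the eight longest jobs while the others sit idle.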

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

Marc Bredt

I was wondering whether this could be achieved with find's -exec option or with xargs, but I got stuck on variable substitution for the filenames.

So I ended up piping to a bash while loop:

find mydir/ -maxdepth 1 -type f -regex '.*\.[0-9]+\.txt' |
  while IFS= read -r line; do
    num=${line%.txt}    # strip the trailing .txt
    cp "$line" "/home/user1/store/dir${num##*.}"    # keep what follows the last dot
  done
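
For what it's worth, the -exec route can work if the substitution is moved into a small inline bash script, so that find only passes the filenames along. This is a sketch and has not been run against the asker's real data; the function name copy_with_find is made up, and it assumes the dirN directories should be created when they are missing.

```bash
#!/bin/bash
# copy_with_find SRC DESTROOT (hypothetical helper)
copy_with_find() {
  local src=$1 destroot=$2
  # find passes DESTROOT and then the matching filenames to the inline
  # script; the "_" only fills the $0 slot of bash -c.
  find "$src" -maxdepth 1 -type f -name '*.[0-9]*.txt' -exec bash -c '
    destroot=$1; shift
    for f; do
      num=${f%.txt}      # mydir/file.12.txt -> mydir/file.12
      num=${num##*.}     # mydir/file.12     -> 12
      # skip names whose last field is not a number
      case $num in (*[!0-9]*|"") continue ;; esac
      mkdir -p "$destroot/dir$num" && cp "$f" "$destroot/dir$num"
    done
  ' _ "$destroot" {} +
}
```

It would be called as copy_with_find mydir /home/user1/store. Because the parameter expansions run inside the inline script rather than in the find command line, each filename gets its own num.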