Find files that are newer than another file with a similar name but a different extension


This is a fairly simple issue that has been bothering me. A little backstory: I have a folder full of scripts. These scripts take data files (*.dat) and generate output in *.eps. The extension of my scripts is *.plt. I created a one-line shell script that runs all the *.plt files in that folder.

#!/bin/sh
find . -name "*.plt" -exec {} \;

I just want to make sure that all the *.pdf images I will use in my document are up to date. For a while, the one-line script was good enough, but with more than 50 files it takes some time to run. I rarely change the data files, but I change the *.plt scripts frequently. The scripts are written in such a way that a script named this_script_does_something.plt will create a file called this_script_does_something.eps.

Hence, here's my question.

  • Is there a way to write a refined shell script that executes only the *.plt files that are newer than the similarly named *.eps?

I know I can do this in Python, but that seems like cheating. I also know that I can look for the newest *.eps and execute all the *.plt files that are newer than it. This would solve my problem in most practical cases. I only thought of this option while typing the question, so thank you SX. However, as a didactic exercise, and to settle my original doubt, I would like to handle each case individually: compare the modification time of each *.plt with that of its *.eps, and execute the script only when it is more recent than its output. Is it possible? Can it be done in a single line?

EDIT: I forgot to add that a *.plt script should also be executed when there is no *.eps file with the same name, which normally means that the script is new and has not been executed yet.
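
The shortcut mentioned in the question (find the newest *.eps, then run every *.plt newer than it) could be sketched roughly as follows. The function name run_newer_than_newest_eps is made up for illustration, and the sketch assumes that the scripts are executable, that the *.eps files sit in the current directory, and that file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: run every .plt script that is newer than the most
# recently modified .eps file in the current directory.
run_newer_than_newest_eps() {
    # ls -t lists newest first; the error for an unmatched pattern is discarded.
    newest=$(ls -t -- *.eps 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        # No .eps exists yet, so every script is considered out of date.
        find . -name '*.plt' -exec {} \;
    else
        find . -name '*.plt' -newer "$newest" -exec {} \;
    fi
}
```

This can miss a script that was edited before the newest *.eps was written but after its own output, which is why a per-file comparison is still worth having.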


There are 2 answers

Answer by Jonathan Leffler (best answer)

I think I'd be using:

#!/bin/bash

for plt in *.plt
do
    eps=$(basename "$plt" .plt).eps
    if [ "$plt" -nt "$eps" ]
    then "./$plt"    # the ./ prefix stops the shell from searching PATH
    fi
done

This uses the Bash/Korn shell operator -nt for 'newer than' (and there's the converse -ot operator for 'older than'). I'm assuming the files are all in a single directory so there's no need for a recursive search. If that's not correct, then use a separate:

find . -type d -exec sh -c 'cd "$1" && new-script.sh' sh {} \;

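(where new-script.sh is the script I just showed, installed somewhere on your PATH). Or, in Bash 4 or later, enable the globstar option and use the ** operator, which also matches files in the current directory:

shopt -s globstar
for plt in **/*.plt

In that case, derive the output name with eps=${plt%.plt}.eps so that the directory part is preserved, and run the script as "./$plt".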

You might need to set the Bash nullglob option:

shopt -s nullglob

With nullglob set, a pattern that matches no files expands to nothing instead of being left as the literal string *.plt, so the loop body is simply skipped.
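
Putting the recursive glob and nullglob together, a recursive variant of the loop might look like the sketch below. It assumes Bash 4 or later and executable scripts; the function name run_stale_plots is hypothetical. Each script is started from the top-level directory, which may or may not suit scripts that refer to data files by relative path.

```shell
#!/bin/bash
# Hypothetical helper: run every .plt script, at any depth, whose .eps
# output is missing or older than the script itself.
# The body is a subshell ( ... ) so the shopt settings do not leak out.
run_stale_plots() (
    shopt -s globstar nullglob
    for plt in **/*.plt
    do
        # Keep the directory part: sub/foo.plt -> sub/foo.eps
        eps=${plt%.plt}.eps
        if [ ! -f "$eps" ] || [ "$plt" -nt "$eps" ]
        then "./$plt"
        fi
    done
)
```

Because the options are set inside a subshell, calling run_stale_plots from an interactive session leaves the caller's globstar and nullglob settings untouched.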


To also run the script when the .eps file does not exist:

#!/bin/bash

for plt in *.plt
do
    eps=$(basename "$plt" .plt).eps
    if [ ! -f "$eps" ] || [ "$plt" -nt "$eps" ]
    then "./$plt"    # the ./ prefix stops the shell from searching PATH
    fi
done

The only not-completely-generic shell feature in this is the -nt operator. If your /bin/sh doesn't support it, check the /bin/[ command — it might — or use Korn Shell or Bash instead of /bin/sh in the shebang line.

Answer by jlliagre

This script should do what you expect:

find . -name "*.eps" -exec sh -c \
     'plt=${1%.eps}.plt; [ "$plt" -nt "$1" ] && "$plt"' sh {} \;

It will recurse into subdirectories, if any. If you don't want that, and you use GNU find, a simple workaround is to run:

find . -maxdepth 1 -name "*.eps" -exec sh -c \
     'plt=${1%.eps}.plt; [ "$plt" -nt "$1" ] && "$plt"' sh {} \;

If you don't use GNU find, you might use this syntax instead:

find *.eps -type f -exec sh -c \
     'plt=${1%.eps}.plt; [ "$plt" -nt "$1" ] && "./$plt"' sh {} \;

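but the latter might fail with an "arg list too long" error if a very large number of files match the *.eps pattern, because the expanded list is passed as arguments to the external find command. A for file in *.extension loop does not hit this limit, since the shell expands the pattern internally without executing another program.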

Note also that -nt is not specified by POSIX so depending on your system, you might want to specifically state the shell to use instead of sh (mainstream shells like dash, bash, ksh, ksh93 or zsh do support -nt). For example on Solaris 10, you would use:

find . -name "*.eps" -exec ksh -c \
     'plt=${1%.eps}.plt; [ "$plt" -nt "$1" ] && "$plt"' ksh {} \;
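
If no shell with -nt is available, a similar check can be built from find's -newer primary, which POSIX does specify. The following is only a sketch: the helper names is_stale and run_stale_posix are invented here, and the scripts are assumed to be executable and located in the current directory.

```shell
#!/bin/sh
# is_stale SCRIPT OUTPUT: succeed if OUTPUT is missing or SCRIPT is newer.
# Hypothetical helper that avoids the non-POSIX -nt operator.
is_stale() {
    # find prints the script's name only when it is newer than the output.
    [ ! -e "$2" ] || [ -n "$(find "$1" -newer "$2")" ]
}

# Hypothetical driver: run each stale .plt script in the current directory.
run_stale_posix() {
    for plt in ./*.plt
    do
        [ -e "$plt" ] || continue    # the pattern matched nothing
        if is_stale "$plt" "${plt%.plt}.eps"
        then "$plt"
        fi
    done
}
```

The ./ prefix in the pattern serves two purposes: it keeps find from mistaking a file name that starts with a dash for an option, and it lets the script be executed without a PATH lookup.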

Edit:

As the script should also run if the .eps file does not exist, the command should loop over the .plt files instead, e.g.:

find *.plt -type f -exec bash -c \
     'eps=${1%.plt}.eps;
     { [ ! -f "$eps" ] || [ "$1" -nt "$eps" ]; } && "./$1"' bash {} \;