Many years ago, I wrote a shell function to search for a text string within PDF files. For example, when I'm in a directory with hundreds or thousands of PDF files, I call this function to search for a particular phrase. This is the function:
function pdfsearch() {
local searchStr=${1:?"The string to search must be the argument"}
find . -iname "*.pdf" | while read fname
do
pdftotext -q -enc ASCII7 "$fname" ".$fname~"; grep -s -H --color=always -i $searchStr ".$fname~"
rm ".$fname~"
done
}
Although ugly, this works fine. The encoding option passed to pdftotext is there to remove the character ligatures in certain documents (the double f in "stuff" was being interpreted as the single ligature character "ﬀ").
Now I'm trying to modify it so that it searches only certain PDFs, based on the filename, so it takes an extra argument. My attempt was this:
function pdfsearch() {
local searchStr=${1:?"The string to search must be the argument"}
local fileStr=${2:-"*.pdf"}
find . -iname $fileStr | while read fname
do
pdftotext -q -enc ASCII7 "$fname" ".$fname~"; grep -s -H --color=always -i $searchStr ".$fname~"
rm ".$fname~"
done
}
However, this does not work. I always get
pdfsearch:3 *.pdf not found
with the *.pdf replaced by whatever I pass as the second argument.
I believe the problem is the quotes I'm putting within fileStr. I tried single quotes, escaping them with \, and using "$fileStr".
I'm almost sure this is pretty basic character expansion syntax. Also, in case it matters, I'm using zsh as my shell. I used to keep this function in my .bashrc and it's now in my .zshrc.
Any suggestion?
Thanks
POSIX shell behavior regarding variable expansion is awful. I strongly recommend switching to a modern, sane shell like Fish or Elvish. Having said that, your problem is the unquoted $fileStr in
find . -iname $fileStr | while read fname
The shell replaces $fileStr with its value, *.pdf, and then attempts to expand the glob before running the find command. Since you don't have any files with a pdf extension in the CWD, the glob expansion fails with the error you're seeing. Had the glob expanded to two or more files, you would get a different error, this time from find, because it would receive several file names where it expects a single pattern. The solution is to quote the variable expansion:
find . -iname "$fileStr" | while read fname
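The difference between the unquoted and quoted forms can be seen in a small experiment. This is only a sketch: the helper count_args, the scratch directory, and the file names a.pdf and b.pdf are made up for illustration, and it assumes bash or another POSIX sh. In zsh, whether an unquoted expansion is globbed depends on the GLOB_SUBST option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments a command would receive.
count_args() { echo "$#"; }

# Scratch directory with two made-up PDF names (illustration only).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.pdf b.pdf

pattern='*.pdf'

# Unquoted: under bash or POSIX sh the shell expands the glob before the
# command runs, so the command sees a.pdf and b.pdf (prints 2).
count_args $pattern

# Quoted: the pattern reaches the command as one literal argument (prints 1),
# which is what find -iname needs.
count_args "$pattern"
```

With the quoted form, find receives the pattern itself and does the matching in every directory it visits, rather than the shell matching only against the current directory.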
In fact, you should almost always double-quote variable expansion in POSIX shells.
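Applied to the whole function, the quoting advice might look like the sketch below. It is not the only way to write it: the temporary file name ("$fname.txt~") is my own choice rather than something from the question, and the sketch assumes pdftotext and a grep that supports --color are installed.

```shell
pdfsearch() {
    local searchStr=${1:?"The string to search must be the argument"}
    local fileStr=${2:-"*.pdf"}
    local fname tmp

    # Quoting "$fileStr" hands the pattern to find untouched.
    # IFS= and -r keep leading spaces and backslashes in file names intact.
    find . -iname "$fileStr" | while IFS= read -r fname
    do
        tmp="$fname.txt~"    # assumed temp name, placed next to the PDF
        pdftotext -q -enc ASCII7 "$fname" "$tmp"
        # Quoting "$searchStr" lets a multi-word phrase stay one pattern.
        grep -s -H --color=always -i -- "$searchStr" "$tmp"
        rm -f -- "$tmp"
    done
}
```

A call such as pdfsearch "some phrase" "report*.pdf" then needs the second argument quoted at the call site too, for the same reason: otherwise the interactive shell may try to expand it before the function ever sees it.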