I have a bash script that takes a booklet-format PDF and converts it to separate pages. The script is called by PHP running under nginx.
I am using pdfcrop, which calls pdfTeX, and that call is the point of failure.
The script runs fine as root from the command line. However, when it is run by nginx (via PHP), it fails when pdfcrop calls pdfTeX.
Here is the line where it fails:
pdfcrop --ini --verbose --bbox "0 0 1000 600" --margins "-490 10 10 10" ${tempDir}$1 ${tempDir}right.pdf
I log the verbose output and get the following:
nginx
PDFCROP 1.40, 2020/06/06 - Copyright (c) 2002-2020 by Heiko Oberdiek, Oberdiek Package Support Group.
* PDF header: %PDF-1.5
* Running ghostscript for BoundingBox calculation ...
GPL Ghostscript 9.25 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 2.
Page 1
%%BoundingBox: 90 83 972 571
* Page 1: 0 0 1000 600
%%HiResBoundingBox: 90.674997 83.069997 971.999970 570.725983
Page 2
%%BoundingBox: 33 23 969 572
* Page 2: 0 0 1000 600
%%HiResBoundingBox: 33.731999 23.939999 968.147970 571.697983
* Running pdfTeX ...
The first line, 'nginx', is there because I log the result of whoami to confirm which user is running the script. Note that the script just stops at the pdfTeX call.
Again, the script works as it should when run as root from the command line. It seems that pdfTeX is not available to the nginx user. If that is the case, how do I fix it?
TIA
EDIT: Thinking that the issue could be permissions (that pdfTeX couldn't write its temp files), I changed the owner and group of the directory where my script runs to nginx. The results are the same.
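One way to test the theory that pdfTeX is not visible to the nginx user is to log the search path alongside the user name. The following is only a sketch: the function name report_tool is hypothetical and not part of my script.

```shell
#!/bin/bash
# Hypothetical helper: report who is running the script and
# whether a given tool can be found on that user's PATH.
report_tool() {
    local tool="$1"
    echo "user: $(whoami)"
    echo "PATH: $PATH"
    if command -v "$tool" >/dev/null 2>&1; then
        # command -v prints the resolved location of the tool
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
        return 1
    fi
}
```

Calling report_tool pdftex near the top of the script, once from the root shell and once through PHP, should show whether the two environments see different PATH values.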
EDIT 2: Here is the PHP call to my script:
chdir($scriptDir);
$result = shell_exec('./friedensBulletin.sh' . ' ' . $bulletinName . ' ' . $bulletinName);
chdir($cwd);
$scriptDir is the location of my script. $cwd holds the working directory from before the call, which is restored here.
EDIT 3: The entire bash script
#!/bin/bash
#################################################
# takes a PDF, crops, exports as html #
# req. pdfcrop for cropping #
# req. poppler (pdftohtml) for file conversion #
# $1 input file #
# $2 output file #
# author: [email protected] #
# 01.09.2021 #
#################################################
tempDir="tmp/"
# handle pages 1 and 3
pdfcrop --ini --bbox "0 0 1000 600" --margins "-490 10 10 10" "${tempDir}$1" "${tempDir}right.pdf"
pdfseparate "${tempDir}right.pdf" "${tempDir}right%d.pdf"
# handle pages 2 and 4
pdfcrop --ini --bbox "0 0 1000 600" --margins "10 10 -490 10" "${tempDir}$1" "${tempDir}left.pdf"
pdfseparate "${tempDir}left.pdf" "${tempDir}left%d.pdf"
# recombine in the correct order
pdfunite "${tempDir}right1.pdf" "${tempDir}left2.pdf" "${tempDir}right2.pdf" "${tempDir}left1.pdf" "${tempDir}tmp.pdf"
mv "${tempDir}tmp.pdf" "$2"
# clean up unneeded files
rm "${tempDir}"*.pdf
My original theory that pdfTeX was not available to the nginx user was correct.
In my script, I logged the result of `which pdftex`, and the command returned not found. The solution was to create a symlink to pdftex. I did this by adding a check to my script that tests whether the link exists and creates it if it does not. This approach allows my script to work if moved to another server, assuming of course that pdftex is always installed in the same location. I found the location of pdftex by running `which pdftex` as root on the command line.
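A check of this kind can be sketched as follows. This is a minimal illustration, not necessarily the exact code I used: the function name ensure_symlink is hypothetical, and the real paths depend on where pdftex is installed on the server.

```shell
#!/bin/bash
# Hypothetical helper: make sure a symlink at $2 points to the program at $1.
ensure_symlink() {
    local target="$1" link="$2"
    # Nothing to do if the link (or a regular file) is already there.
    if [ -e "$link" ] || [ -L "$link" ]; then
        return 0
    fi
    # Refuse to create a dangling link.
    if [ ! -x "$target" ]; then
        echo "ensure_symlink: $target not found or not executable" >&2
        return 1
    fi
    ln -s "$target" "$link"
}
```

In the script this would be called before the first pdfcrop line, with the location reported by `which pdftex` as root as the first argument and, as the second, a path inside a directory that is on the nginx user's PATH. Both values are placeholders to be filled in per server. The user running the script needs write permission in that directory; otherwise the link has to be created once as root.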
Thanks to Heiko Oberdiek, the author of pdfcrop, for help in solving this.