How can I scale an image while retaining quality when creating a PDF in Swift on macOS?

I have a command-line Swift script that takes a series of images of close to the same pixel dimensions and creates a PDF from them. Part of what this script does is normalize the images so that each image is added as a page of the same height and width, in inches. That is, if the document I’ve scanned from is 6x9, then I normalize the images so that they are all 6x9. Due to some of the corrections I apply and varying deterioration of pages, the various images can differ in actual pixels by up to 3.5%, which leaves white space around the top/bottom or sides if I don’t normalize them.

In the following code, height and width are the desired height and width in inches of the PDF document.

//get the current pixels per inch
let imageRep = image.representations[0]
let originalWidth = CGFloat(imageRep.pixelsWide)
let originalHeight = CGFloat(imageRep.pixelsHigh)
let pixelsPerInch = originalWidth*72.0/imageRep.size.width

//determine horizontal scale factor
let inchesWide = originalWidth/pixelsPerInch
let scaleX = width/inchesWide

//determine vertical scale factor
let inchesHigh = originalHeight/pixelsPerInch
let scaleY = height/inchesHigh

//normalize image so that when added to the PDF it is the correct width and height in inches
let pageSize = NSSize(width:image.size.width*scaleX, height:image.size.height*scaleY)
let normalizedImage = NSImage(size:pageSize, flipped:false) { (resizedRect) -> Bool in
    image.draw(in:resizedRect)
    return true
}
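To make the arithmetic above concrete, here is a minimal sketch that repeats the same scale-factor calculation on plain numbers. The helper name `normalizedPageSize` and the 300 pixels-per-inch scan are assumptions for illustration, and it assumes the image's size equals the size of its first representation. Under those assumptions the result is simply the target size in inches times 72 points, while the pixel count stays the same.

```swift
import Foundation

// Hypothetical helper (not part of the original script): repeats the
// scale-factor arithmetic above using plain numbers instead of an NSImage.
func normalizedPageSize(pixelsWide: CGFloat, pixelsHigh: CGFloat,
                        repSize: CGSize, targetInches: CGSize) -> CGSize {
    // A point is 1/72 inch, so this is the image's current pixels per inch.
    let pixelsPerInch = pixelsWide * 72.0 / repSize.width
    let scaleX = targetInches.width / (pixelsWide / pixelsPerInch)
    let scaleY = targetInches.height / (pixelsHigh / pixelsPerInch)
    return CGSize(width: repSize.width * scaleX, height: repSize.height * scaleY)
}

// Assumed example: a 6x9 inch page scanned at 300 pixels per inch,
// which gives 1800x2700 pixels and a size of 432x648 points.
let page = normalizedPageSize(pixelsWide: 1800, pixelsHigh: 2700,
                              repSize: CGSize(width: 432, height: 648),
                              targetInches: CGSize(width: 5.625, height: 8.125))

// About 405 by 585 points, that is 5.625 x 8.125 inches at 72 points per inch.
print(page.width, page.height)

// Pixels per inch the page must carry if no pixels are to be discarded.
print(1800 / 5.625, 2700 / 8.125)
```

The second print is the number to watch: the normalized page only keeps its readability if the PDF still carries roughly that many pixels per inch.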

Under macOS Monterey, this worked; the script scaled the height and width without, as far as I can tell, changing the underlying image. If a folder of images contained 10.4 megabytes of images, for example, the resulting PDF file was very close to 10.4 megabytes (10.6, in the files I’ve been using for testing). I could change height and width and the dimensions of the PDF as seen by and reported by PDF viewers would change, but the size of the PDF file would not.

I upgraded to Ventura last week (macOS 13.3.1), and the first time I ran this script I noticed that while the scaling was correct the image quality was drastically lower. I ran it on that same 10.4 megabytes worth of images and the resulting file was 3.5 megabytes rather than 10.6. The text was nearly unreadable.

It appears to be adjusting the quality of the image based on the scaling I apply to the NSImage, where it did not before. That is, this script used to create an image of the correct spatial dimensions while keeping the readability the same; it now creates an image of the correct spatial dimensions while reducing the readability.

Obviously, I can improve the readability by using larger dimensions, but then Preview and other apps open the PDF much larger than I’d like.

I can also, weirdly, forgo normalizing the image and apply a new size to the existing image. If I replace the code commented as //normalize image so that when added to the PDF it is the correct width and height in inches with the following, it works, except of course that the image is no longer normalized and has white space either vertically or horizontally:

image.size = pageSize
return image

If I use normalizedImage.size = pageSize however, it adjusts the quality of the image according to pageSize and not according to however arbitrarily much I increase the scaling of normalizedImage. The normalizedImage NSImage does retain the quality of the original, however, at least initially. If I create normalizedImage as in the code above, and then add…

let arbitraryPageSize = NSSize(width:image.size.width*scaleX*4, height:image.size.height*scaleY*4)
normalizedImage.size = arbitraryPageSize

…the script creates a PDF that is readable, albeit (in this case) at 22.5x32.5 inches. Which means that the quality is not lost immediately on creating the new NSImage, it’s lost somewhere else.

The script under Ventura is for some reason treating the normalizedImage NSImage and the image NSImage differently when applying a new .size to them. I’m guessing it has something to do with image having been created from a file, and normalizedImage being one step removed from the file, but I don’t know how to take advantage of this, or even if I should.
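One way to investigate that difference is to look at what representations each NSImage actually holds. The following is a minimal diagnostic sketch; `describeRepresentations` is a hypothetical helper, not part of the script, and it only reports what AppKit exposes rather than explaining the Ventura behavior.

```swift
import AppKit

// Hypothetical diagnostic helper, not part of the original script.
// Lists the class, pixel dimensions, and point size of each representation.
func describeRepresentations(of image: NSImage) -> [String] {
    var lines: [String] = []
    for rep in image.representations {
        // pixelsWide and pixelsHigh can be 0 for representations
        // that have no fixed pixel size.
        lines.append("\(type(of: rep)): \(rep.pixelsWide)x\(rep.pixelsHigh) px, "
            + "\(rep.size.width)x\(rep.size.height) pt")
    }
    return lines
}
```

Calling it on image before normalizing and on normalizedImage afterwards, and printing each returned line, should show whether the two images are backed by the same kind of representation.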

What I’d like is to be able to normalize each scanned image to be the exact same dimensions in inches, while retaining the original image’s quality.

Here is a very stripped-down version of the script I use for creating the PDF from a series of images. It’s what I’ve been using for testing possible solutions.

//Create a PDF from a series of images
import PDFKit

class PDFCreator {
    let images: [NSImage]
    var document = PDFDocument()

    init(images: [NSImage]) {
        self.images = images
    }

    public func create() {
        //add images to PDF
        for var image in self.images {
            print("Adding…", image.name()!, image.size)
            image = self.normalizeImage(image: image)

            //add to PDF document
            guard let page = PDFPage(image: image) else {
                //the normalized image is a new NSImage without a name, so don't force-unwrap it
                print("Unable to add", image.name() ?? "image", "to document.")
                exit(1)
            }
            self.document.insert(page, at: self.document.pageCount)
        }

        //save PDF to file
        print("Saving PDF as", outputPath)
        let fileURL = URL(fileURLWithPath: outputPath)
        if !self.document.write(to: fileURL) {
            print("Unable to save PDF to", outputPath)
            exit(1)
        }
    }

    //resize image to the appropriate width and height in inches
    //while retaining the image quality
    private func normalizeImage(image: NSImage) -> NSImage {
        if width <= 0 || height <= 0 { return image }

        //get the current pixels per inch
        let imageRep = image.representations[0]
        let originalWidth = CGFloat(imageRep.pixelsWide)
        let originalHeight = CGFloat(imageRep.pixelsHigh)
        let pixelsPerInch = originalWidth*72.0/imageRep.size.width

        //determine horizontal scale factor
        let inchesWide = originalWidth/pixelsPerInch
        let scaleX = width/inchesWide
    
        //determine vertical scale factor
        let inchesHigh = originalHeight/pixelsPerInch
        let scaleY = height/inchesHigh

        //normalize image so that when added to the PDF it is the correct width and height in inches
        let pageSize = NSSize(width:image.size.width*scaleX, height:image.size.height*scaleY)
        let normalizedImage = NSImage(size:pageSize, flipped:false) { (resizedRect) -> Bool in
            image.draw(in:resizedRect)
            return true
        }

        return normalizedImage
    }
}

//NSImage cannot resolve aliases on its own
func resolvePath(path: String) -> URL {
    do {
        let file = NSURL.fileURL(withPath: path)
        return try URL(resolvingAliasFileAt: file)
    } catch {
        print("Cannot find file at", path)
        exit(0)
    }
}

// set defaults and read images
let outputPath = "test.pdf"
var pages:[NSImage] = []
let width:CGFloat = 5.625
let height:CGFloat = 8.125
for file in CommandLine.arguments[1...] {
    let resolvedPath = resolvePath(path:file)
    let image = NSImage(byReferencing: resolvedPath)
    image.setName(resolvedPath.lastPathComponent)
    pages.append(image)
}
if pages.isEmpty { print("At least one image is required."); exit(1) }

//create the PDF
let pdf = PDFCreator(images: pages)
pdf.create()
There is 1 answer

Answered by Jerry Stratton

This is not a solution so much as a workaround. I noted in the question that if the original NSImage is not modified, altering image.size appears to alter the resolution, not the actual pixel dimensions, and thus retains the quality—probably, in fact, it retains the actual original image.

So I started to experiment with saving the higher quality image to a file and then re-reading it into an NSImage to alter its .size. It turns out that not all of those steps are necessary. It is possible to work around the issue by:

  1. Requesting an image representation of the page’s image.
  2. Converting that representation to JPEG format (skipping this step will maintain image quality, but will also generate huge files).
  3. Creating a new NSImage from the JPEG image data.
  4. Assigning the smaller pageSize to the new image’s .size.

This appears to maintain the higher pixel dimensions of the image on the page while still ensuring that the page registers as the correct size in inches.

Replace:

        let normalizedImage = NSImage(size:pageSize, flipped:false) { (resizedRect) -> Bool in
            image.draw(in:resizedRect)
            return true
        }

        return normalizedImage

with:

let normalizedImage = NSImage(size:pageSize, flipped:false) { (resizedRect) -> Bool in
    image.draw(in:resizedRect)
    return true
}

//generating a bitmap representation and re-encoding it appears to mostly fix the problem
//the result still looks very slightly lower in quality, but far better than without this step
//the file size is no longer almost exactly the sum of all images, but it is close
//each step gets its own name because Swift does not allow redeclaring a constant in the same scope
guard let tiffData = normalizedImage.tiffRepresentation else {
    print("Trouble getting TIFF representation of image", image.name()!)
    exit(0)
}
guard let bitmapVersion = NSBitmapImageRep(data: tiffData) else {
    print("Trouble getting bitmap of image representation for image", image.name()!)
    exit(0)
}
guard let imageData = bitmapVersion.representation(using: NSBitmapImageRep.FileType.jpeg, properties:[:]) else {
    print("Trouble turning bitmap into a jpeg for image", image.name()!)
    exit(0)
}
guard let resizedImage = NSImage(data:imageData) else {
    print("Trouble creating image from data for image", image.name()!)
    exit(0)
}
resizedImage.size = pageSize

return resizedImage

Using images normalized in this manner generates PDF documents that are very close to the original in quality while also being the correct size in inches.

The resulting PDF files are not the same size as the sum of the constituent images, probably because the JPEG compression used does not match that used on the original images. However, the JPEG creation line can also specify a compression factor:

let compression:Float = 0.537
guard let imageData = bitmapVersion.representation(using: NSBitmapImageRep.FileType.jpeg, properties:[.compressionFactor : compression]) else {
    print("Trouble turning bitmap into a jpeg for image", image.name()!)
    exit(0)
}
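A value such as 0.537 has to be found by trial. As a rough sketch of how that search could be made less tedious, the hypothetical helper below (`jpegSizes` is not part of the script) reports the encoded byte count for several candidate factors, where 1.0 means the least compression and 0.0 the most.

```swift
import AppKit

// Hypothetical helper: encodes one bitmap as JPEG at several compression
// factors and reports how many bytes each encoding takes.
func jpegSizes(for bitmap: NSBitmapImageRep, factors: [Float]) -> [(factor: Float, bytes: Int)] {
    var results: [(factor: Float, bytes: Int)] = []
    for factor in factors {
        // Skip any factor for which the JPEG encoding fails.
        guard let data = bitmap.representation(using: .jpeg,
                                               properties: [.compressionFactor: factor]) else {
            continue
        }
        results.append((factor: factor, bytes: data.count))
    }
    return results
}
```

Calling it with bitmapVersion and a handful of factors, then comparing the byte counts with the size of the original image file, gives a starting point for choosing a factor.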

This makes me wonder if there’s a way to simply copy the old image over to the new image, but with the new, normalized, dimensions. That would seem likely to preserve image quality exactly, or at least as exactly as generating PDFs in this manner pre-Ventura.
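One thing that might be worth trying along those lines is to copy the original image's representation, pixels and all, and change only the size it reports. This is a sketch under stated assumptions, not a confirmed fix: `imageByResizingRepresentation` is a hypothetical helper, and I have not verified that PDFPage(image:) under Ventura embeds the untouched pixels from an image built this way.

```swift
import AppKit

// Hypothetical alternative to redrawing the image, untested under Ventura.
// Copies the first representation and changes only its size in points;
// the pixel data is not redrawn.
func imageByResizingRepresentation(of image: NSImage, to pageSize: NSSize) -> NSImage? {
    guard let original = image.representations.first,
          let rep = original.copy() as? NSImageRep else {
        return nil
    }
    // The same pixels now map onto the normalized page dimensions.
    rep.size = pageSize
    let resized = NSImage(size: pageSize)
    resized.addRepresentation(rep)
    return resized
}
```

In normalizeImage this would replace the block-based NSImage, for example by returning imageByResizingRepresentation(of: image, to: pageSize) ?? image. Because the width and height are scaled independently, the pixels are stretched slightly in one direction, which is the intended normalization.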