Edit PDF text using C#


How can I find and then hide (or delete) a specific text phrase in a PDF?

For example, I have created a PDF file containing all sorts of data such as images, tables, text etc.

Now I want to find a specific phrase such as "Hello World" wherever it appears in the file and either hide it or, better yet, delete it from the PDF.

Finally, I want to save the resulting PDF without the phrase.

I have tried iTextSharp and Spire, but couldn't find anything that worked.


There are 2 answers

astef

The following snippet, based on iText 7 and its pdfSweep add-on, lets you find and redact text in a PDF document:

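// SRC and DEST are the input and output file paths
PdfDocument pdf = new PdfDocument(new PdfReader(SRC), new PdfWriter(DEST));
ICleanupStrategy cleanupStrategy = new RegexBasedCleanupStrategy(new Regex(@"Alice", RegexOptions.IgnoreCase)).SetRedactionColor(ColorConstants.PINK);
PdfAutoSweep autoSweep = new PdfAutoSweep(cleanupStrategy);
autoSweep.CleanUp(pdf); // apply the redactions to every page
pdf.Close();            // write the result to DEST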

Pay attention to the license: iText is released under the AGPL unless you buy a commercial license.
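
To target the phrase from the question rather than "Alice", the regex passed to RegexBasedCleanupStrategy could be built from the literal text. The following is only a sketch using System.Text.RegularExpressions: the helper name BuildPhrasePattern is hypothetical, and allowing any run of whitespace between words is an assumption about how the phrase may be spaced in the document, not something the library requires.

```csharp
using System;
using System.Text.RegularExpressions;

static class PhrasePattern
{
    // Hypothetical helper: escapes each word so regex metacharacters are
    // matched literally, and allows any whitespace run between words.
    public static Regex BuildPhrasePattern(string phrase)
    {
        string[] words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        string pattern = string.Join(@"\s+", Array.ConvertAll(words, w => Regex.Escape(w)));
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Regex regex = BuildPhrasePattern("Hello World");
        Console.WriteLine(regex.IsMatch("hello   world")); // True
    }
}
```

The resulting Regex can be passed in place of new Regex(@"Alice", RegexOptions.IgnoreCase) in the snippet above.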

vaalex

Try the following code snippet to hide a specific text phrase in a PDF using Spire.PDF.

using Spire.Pdf;
using Spire.Pdf.General.Find;
using System.Drawing;

namespace HideText
{
    class Program
    {
        static void Main(string[] args)
        {
            // load PDF file ("input.pdf" is a placeholder path)
            PdfDocument doc = new PdfDocument();
            doc.LoadFromFile("input.pdf");

            // find all results where "Hello World" appears, page by page
            foreach (PdfPageBase page in doc.Pages)
            {
                PdfTextFind[] finds = page.FindText("Hello World").Finds;

                // cover each result with a white background color
                foreach (PdfTextFind find in finds)
                {
                    find.ApplyRecoverString("", Color.White, false);
                }
            }

            // save to file ("output.pdf" is a placeholder path)
            doc.SaveToFile("output.pdf");
            doc.Close();
        }
    }
}
