ColdFusion CFDOCUMENT with links to other PDFs

I am creating a PDF using the cfdocument tag at the moment. The PDF is not much more than a bunch of links to other PDFs.

So I create this PDF index and the links are all HREFs:

<a href="Another_PDF.pdf">Another PDF</a>

If I set the localURL attribute to "no", my URLs have the whole web path in them:

<a href="http://www.mywebsite.com/media/PDF/Another_PDF.pdf">Another PDF</a>

If I set the localURL attribute to "yes", then I get:

<a href="File:/D:/website/media/PDF/Another_PDF.pdf">Another PDF</a>

So this index PDF is going to go onto a CD and all of the linked PDFs are going to sit right next to it so I need a relative link ... more like:

<a href="Another_PDF.pdf">Another PDF</a>

cfdocument does not seem to do this. I can modify the file name of the document and make it "File:///Another_PDF.pdf", but this does not work either, because I don't know the drive letter of the CD drive, or whether the files are going to end up inside a directory on the CD.

Is there a way (possibly using iText or something) of opening up the PDF once it is created and converting the URL links to actual PDF GoTo tags?

I know this is kind of a stretch, but I am at my wits' end with this.

So I've managed to get into the objects, but I'm still struggling with converting from:

5 0 obj<</C[0 0 1]/Border[0 0 0]/A<</URI(File:///75110_002.PDF)/S/URI>>/Subtype/Link/Rect[145 502 184 513]>>endobj

To this:

19 0 obj<</S/GoToR/D[0/XYZ null null 0]/F(75110_002.PDF)>>endobj
20 0 obj<</Subtype/Link/Rect[145 502 184 513]/Border[0 0 0]/A 19 0 R>>endobj

Wow this is really kicking my ass! :)

So I've managed to open the document, loop through the Link annotations, capture the Rect coordinates and the name of the linked-to document (saved into an array of structures), and then successfully delete each annotation that was a URI link.

So I thought I could now loop over that array of structures and put the annotations back into the document using the createLink method or the setAction method. But all the examples I've seen of these methods are attached to a Chunk (of text). My document already has the text in place, so I don't need to remake the text links; I just need to put the links back in the same spot.

So I figured I could reopen the document, look for the actual text that was the link, and then attach the setAction to the already existing chunk of text ... but I can't find the text!!

I suck! :)

There are 2 answers

Joe Simes (best answer)

I finally got it:

public function resetLinks( string source, string destination) {

    try {

        // initialize objects
        Local.reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( arguments.source );
        Local.pdfName = createObject("java", "com.lowagie.text.pdf.PdfName");
        Local.annot = createObject("java", "com.lowagie.text.pdf.PdfAnnotation");
        Local.out = createObject("java", "java.io.FileOutputStream").init( arguments.destination );
        Local.stamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init(Local.reader, Local.out);
        Local.PdfAction = createObject("java", "com.lowagie.text.pdf.PdfAction");
        Local.PdfRect = createObject("java", "com.lowagie.text.Rectangle");
        Local.PdfBorderArray = createObject("java", "com.lowagie.text.pdf.PdfBorderArray").init(javacast("float", "0"), javacast("float", "0"), javacast("float", "0"));
        Local.newAnnots = [];

        // check each page for hyperlinks
        // Save the data to a structure then write it to an array 
        // then delete the hyperlink Annotation
        for ( Local.i = 1; Local.i <= Local.reader.getNumberOfPages(); Local.i = Local.i + 1) {
            //Get all of the annotations for the current page
            Local.page = Local.reader.getPageN( Local.i );
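            Local.annotArray = Local.page.getAsArray( Local.PdfName.ANNOTS );
            // A page without links has no /Annots entry, so skip it
            if ( isNull( Local.annotArray ) ) { continue; }
            Local.annotations = Local.annotArray.getArrayList();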

            // search annotations for links
            for (Local.x = arrayLen(Local.annotations); !isNull( Local.annotations) && Local.x > 0; Local.x--) {
                // get current properties
                Local.current     = Local.annotations[ Local.x ]; 
                Local.dictionary  = Local.reader.getPdfObject( Local.current );
                Local.subType     = Local.dictionary.get( Local.PdfName.SUBTYPE );
                Local.action      = Local.dictionary.get( Local.PdfName.A );
                Local.hasLink     = true;

                //Skip this item if it does not have a link AND action
                if (Local.subType != Local.PdfName.LINK || isNull(Local.action)) {
                    Local.hasLink = false;
                }
                //Skip this item if it does not have a URI
                if ( Local.hasLink && Local.action.get( Local.PdfName.S ) != Local.PdfName.URI ) {
                    Local.hasLink = false;
                } 

                //If it is a valid URI, update link
                if (Local.hasLink) {
                    // extract file name from URL
                    Local.oldLink = Local.action.get( Local.pdfName.URI );
                    Local.newLink  = getFileFromPath( Local.oldLink );
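                    // Save the link rectangle, target file name and page number,
                    // keeping every variable in the Local scope
                    Local.rect = Local.dictionary.get( Local.pdfName.RECT );
                    Local.saved = structNew();
                    Local.saved.rectSTR = Local.rect.toString();
                    Local.saved.link = Local.newLink;
                    Local.saved.page = Local.i;
                    arrayAppend( Local.newAnnots, Local.saved );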
                    // Delete
                    Local.annotations.remove(Local.current);
                }
            }

        }

        // Now really remove them!   
        Local.reader.RemoveUnusedObjects();

        // Now loop over the saved annotations and put them back!!
        for ( Local.z = 1; Local.z <= arrayLen(Local.newAnnots); Local.z++) {
            Local.item = Local.newAnnots[ Local.z ];
            // Parse the rect string we saved, e.g. "[145, 502, 184, 513]", back into an array
            Local.rectArray = listToArray( replace( replace( Local.item.rectSTR, "[", "" ), "]", "" ) );
            // Create the GoToR action
            Local.goToAction = Local.PdfAction.gotoRemotePage( javacast("string", Local.item.link), javacast("string", Local.item.link), javacast("boolean", "false"), javacast("boolean", "false") );
            // Create the Link annotation with the above action and the rect
            // (Rectangle takes floats; trim() drops the space left after each comma)
            Local.newAnnot = Local.annot.createLink( Local.stamper.getWriter(), Local.PdfRect.init( javacast("float", trim(Local.rectArray[1])), javacast("float", trim(Local.rectArray[2])), javacast("float", trim(Local.rectArray[3])), javacast("float", trim(Local.rectArray[4])) ), Local.annot.HIGHLIGHT_INVERT, Local.goToAction );
            // Remove the border; the underlined text underneath already flags the item as a link
            Local.newAnnot.setBorder( Local.PdfBorderArray );
            // Add the annotation to the page
            Local.stamper.addAnnotation( Local.newAnnot, Local.item.page );
        }
    }

    finally {
        // cleanup: close the stamper first, since closing it is what writes the new file
        if (structKeyExists(Local, "stamper")) { Local.stamper.close(); }
        if (structKeyExists(Local, "reader")) { Local.reader.close(); }
        if (structKeyExists(Local, "out")) { Local.out.close(); }
    }
}

I couldn't have done this without the help of Leigh!!
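
For anyone porting this to plain Java, the rectangle round trip in the function above (save the /Rect as a string, then split it back into four numbers) can be sketched with the standard library alone. This is only a sketch: `RectParser` and `parseRect` are made-up names, not part of iText, and it assumes the saved string has the bracketed, comma-separated form that the function above parses, such as "[145, 502, 184, 513]".

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of iText or ColdFusion.
class RectParser {

    /** Parses text such as "[145, 502, 184, 513]" into {llx, lly, urx, ury}. */
    static float[] parseRect(String rectText) {
        // Drop the surrounding brackets, then split on the commas
        String[] parts = rectText.replace("[", "").replace("]", "").split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 coordinates: " + rectText);
        }
        float[] coords = new float[4];
        for (int i = 0; i < parts.length; i++) {
            // trim() removes the space that follows each comma
            coords[i] = Float.parseFloat(parts[i].trim());
        }
        return coords;
    }

    public static void main(String[] args) {
        // prints [145.0, 502.0, 184.0, 513.0]
        System.out.println(Arrays.toString(parseRect("[145, 502, 184, 513]")));
    }
}
```

Parsing to floats rather than ints keeps any fractional coordinates intact, which matters because the iText Rectangle constructor takes four floats.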

Leigh

This thread has an example of updating the link actions by modifying the PDF annotations. It is written in iTextSharp 5.x, but the Java code is not much different.

The thread provides a solid explanation of how annotations work. But to summarize, you need to read in your source pdf and loop through the individual pages for annotations. Extract the links and use something like getFileFromPath() to replace them with a file name only.

I was curious, so I did a quick and ugly conversion of the iTextSharp code above. Disclaimer: it has not been heavily tested.

/**
    Usage:

    util = createObject("component", "path.to.ThisComponent");
    util.fixLinks( "c:/path/to/sourceFile.pdf", "c:/path/to/newFile.pdf");

*/
component {

    /**
        Convert all absolute links, in the given pdf, to relative links (file name only)
        @source - absolute path to the source pdf file
        @destination - absolute path to save copy
    */
    public function fixLinks( string source, string destination) {
        // initialize objects
        Local.reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( arguments.source );
        Local.pdfName = createObject("java", "com.lowagie.text.pdf.PdfName");

        // check each page for hyperlinks
        for ( Local.i = 1; Local.i <= Local.reader.getNumberOfPages(); Local.i++) {

            //Get all of the annotations for the current page
            Local.page = Local.reader.getPageN( Local.i );
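            Local.annotArray = Local.page.getAsArray( Local.PdfName.ANNOTS );
            // A page without links has no /Annots entry, so skip it
            if ( isNull( Local.annotArray ) ) { continue; }
            Local.annotations = Local.annotArray.getArrayList();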

            // search annotations for links
            // arrays are 1-based here, so use <= to include the last annotation
            for (Local.x = 1; !isNull( Local.annotations) && Local.x <= arrayLen(Local.annotations); Local.x++) {

                  // get current properties
                  Local.current     = Local.annotations[ Local.x ]; 
                  Local.dictionary  = Local.reader.getPdfObject( Local.current );
                  Local.subType     = Local.dictionary.get( Local.PdfName.SUBTYPE );
                  Local.action      = Local.dictionary.get( Local.PdfName.A );
                  Local.hasLink     = true;

                  //Skip this item if it does not have a link AND action
                  if (Local.subType != Local.PdfName.LINK || isNull(Local.action)) {
                       Local.hasLink = false;
                  }
                  //Skip this item if it does not have a URI
                  if ( Local.hasLink && Local.action.get( Local.PdfName.S ) != Local.PdfName.URI ) {
                       Local.hasLink = false;
                  } 

                  //If it is a valid URI, update link
                  if (Local.hasLink) {
                      // extract file name from URL
                      Local.oldLink = Local.action.get( Local.pdfName.URI );
                      Local.newLink  = getFileFromPath( Local.oldLink );

                      // replace link
                      // WriteDump("Changed link from ["& Local.oldLink &"] ==> ["& Local.newLink &"]");
                      Local.pdfString = createObject("java", "com.lowagie.text.pdf.PdfString");
                      Local.action.put( Local.pdfName.URI, Local.pdfString.init( Local.newLink ) );
                  }
            }

        }

        // save all pages to new file   
        copyPDF( Local.reader , arguments.destination );    
    }

    /**
        Copy all pages in pdfReader to the given destination file
        @pdfReader - pdf to copy
        @destination - absolute path to save copy
    */
    public function copyPDF( any pdfReader, string destination) {
        try {

          Local.doc = createObject("java", "com.lowagie.text.Document").init();
          Local.out = createObject("java", "java.io.FileOutputStream").init( arguments.destination );
          Local.writer = createObject("java", "com.lowagie.text.pdf.PdfCopy").init(Local.doc, Local.out);

          // open document and save individual pages        
          Local.doc.open();
          for (Local.i = 1; Local.i <= arguments.pdfReader.getNumberOfPages(); Local.i++) {
              Local.writer.addPage( Local.writer.getImportedPage( arguments.pdfReader,  Local.i) );
          }
          Local.doc.close();
        }
        finally 
        {
            // cleanup
            if (structKeyExists(Local, "doc")) { Local.doc.close(); }
            if (structKeyExists(Local, "writer")) { Local.writer.close(); }
            if (structKeyExists(Local, "out")) { Local.out.close(); }
        }
    }

}
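
The file-name step above relies on ColdFusion's getFileFromPath(). If you end up doing the same thing directly in Java, a rough equivalent for the link values in this question could look like the sketch below. `LinkFileName` and `fileNameOf` are hypothetical names, and the helper simply keeps whatever follows the last slash or backslash; it does not decode URL escapes or strip query strings.

```java
// Hypothetical helper for illustration; not part of iText or ColdFusion.
class LinkFileName {

    /** Returns the text after the last '/' or '\', or the whole link if it has neither. */
    static String fileNameOf(String link) {
        int cut = Math.max(link.lastIndexOf('/'), link.lastIndexOf('\\'));
        // cut is -1 when no separator is present, so substring(0) returns the link unchanged
        return link.substring(cut + 1);
    }

    public static void main(String[] args) {
        // Both lines print Another_PDF.pdf
        System.out.println(fileNameOf("File:/D:/website/media/PDF/Another_PDF.pdf"));
        System.out.println(fileNameOf("http://www.mywebsite.com/media/PDF/Another_PDF.pdf"));
    }
}
```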