Auto-Fill I-9 PDF XFA Form

Good morning. I am hoping someone can help me with this topic. Last year I set up a VB.NET program using iTextSharp where a user could enter the information for the I-9, and that information would fill in the PDF and print. With the new I-9 I am running into difficulties I can't identify.

First, the code doesn't error out or anything. I simply get a poor result: instead of a filled form I get a PDF that says "The document you are trying to load requires Adobe Reader 8 or higher. You may not have the Adobe Reader installed..." etc. So, I made sure that I have the most recent Reader version, tried again, and got the same result.

Thinking that perhaps there were changes in the field name structure, I attempted to read in the format/fields as I had the first time around. (Code below). However, now it tells me that there're no fields to read (AcroFields.Fields.Count = 0).

Private Sub ListFieldNames(pdfTemplate As String)
    ' pdfTemplate is the path to the form, e.g. "c:\Temp\PDF\fw4.pdf"
    Dim pdfReader As PdfReader = New PdfReader(pdfTemplate)

    ' Each entry maps an AcroForm field name to its AcroFields.Item
    For Each de As KeyValuePair(Of String, iTextSharp.text.pdf.AcroFields.Item) In pdfReader.AcroFields.Fields
        Console.WriteLine(de.Key)
    Next
    pdfReader.Close()
End Sub

So, I started doing some searching and found a reference to another type of PDF structure that they could have switched to: XFA. I honestly still haven't found any satisfactory documentation/samples of this, but I did find some code that seems like it should work to read in the structure of an XFA PDF. (Code below). There are actually 2 different methods here that I tried. The first essentially shows that there are no XML nodes in xfaFields. The second does find a node called "data" (that's the only one it finds) but doesn't find any child nodes.

Private Sub ReadXfa(pdfTemplate As String)
    ' Static flag on the PdfReader class: lets iTextSharp work with the password-secured form
    PdfReader.unethicalreading = True
    Dim readerPDF As New PdfReader(pdfTemplate)

    Dim xfaFields = readerPDF.AcroFields.Xfa.DatasetsSom.Name2Node

    For Each xmlNode In xfaFields
        Console.WriteLine(xmlNode.Value.Name + ":" + xmlNode.Value.InnerText)
    Next
    'Example of how to get a field value
    '   Dim lastName = xfaFields.First(Function(a) a.Value.Name = "textFieldLastNameGlobal").Value.InnerText


    Dim reader As New PdfReader(pdfTemplate)
    Dim xfa As New XfaForm(reader)
    Dim node As XmlNode = xfa.DatasetsNode()
    Dim list As XmlNodeList = node.ChildNodes()
    For i As Integer = 0 To list.Count - 1
        Console.WriteLine(list.Item(i).LocalName())
        If "data".Equals(list.Item(i).LocalName()) Then
            node = list.Item(i)
            Exit For
        End If
    Next
    list = node.ChildNodes()
    For i As Integer = 0 To list.Count - 1
        Console.WriteLine(list.Item(i).LocalName())
    Next
    reader.Close()
End Sub

https://www.uscis.gov/system/files_force/files/form/i-9.pdf?download=1

The above link goes to the i9 PDF provided by the government.

SO...I guess I have multiple questions. The simplest is if anybody has done this process/if they can help me. Barring that, if someone could point me in the right direction regarding how to read/write from this new PDF file, that would be stupendous. I'm frankly not even certain how to determine what "type" of form they used - AcroField, XFA, something else?

Thank you so much for your time/help!

There are 2 answers

kuujinbo (best answer)

First, sorry, I don't do VB.NET anymore, but you should be able to convert the code that follows.

You already found out for yourself that the new form is XFA. There's an easy non-programmatic way to see the form fields and data. You noted that you upgraded your version of Adobe Reader, so I'm guessing you're using Reader DC. From the menu options:

Edit => Form Options => Export Data...

That exports the form data to an XML file you can inspect. The XML file gives you a hint that a corresponding XML document is needed to fill the form, which is quite different from how it's done with an AcroForm.
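Once the data has been exported, the element paths in it can be listed with a few lines of LINQ to XML, which makes it easier to see which names occur more than once and which need a longer path to be unique. This is only a rough sketch: the class name `XfaFieldLister`, the method `LeafPaths`, and the tiny sample document are made up for illustration and are not the real I-9 export.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XfaFieldLister
{
    // Returns one slash-separated path per leaf element, in document order.
    public static List<string> LeafPaths(XDocument doc)
    {
        return doc.Descendants()
            .Where(e => !e.HasElements)
            .Select(e => string.Join("/",
                // AncestorsAndSelf() starts at the element itself, so reverse it
                // to get the path from the root down.
                e.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName)))
            .ToList();
    }

    public static void Main(string[] args)
    {
        // Pass the path of the exported XML file as the first argument;
        // without one, a hypothetical two-field sample is used instead.
        XDocument doc = args.Length > 0
            ? XDocument.Load(args[0])
            : XDocument.Parse("<form1><page1><lastName/><address/></page1></form1>");

        foreach (string path in LeafPaths(doc))
        {
            Console.WriteLine(path);
        }
    }
}
```

A path printed this way can be shortened to just the element name when that name is unique in the file.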

Here's some simple code to get you started. First a method to read the blank XML document and update it:

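// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Xml.Linq; using System.Xml.XPath;
public string FillXml(Dictionary<string, string> fields)
{
    // XML_INFILE => physical path to XML file exported from I-9
    XDocument xDoc = XDocument.Load(XML_INFILE);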
    foreach (var kvp in fields)
    {
        // handle multiple elements in I-9 form
        var elements = xDoc.XPathSelectElements(
            string.Format("//{0}", kvp.Key)
        );
        if (elements.Count() > 0)
        {
            foreach (var e in elements) { e.Value = kvp.Value; }
        }
    }

    return xDoc.ToString();
}
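To make the `//name` lookup concrete, here is a small self-contained sketch of the same idea run against an in-memory document. The sample XML, the class name `XfaDataDemo`, and the element names under `page1`/`page2` are invented stand-ins, not the actual I-9 structure. It shows that a bare element name updates every element with that name, while a key written as a path only touches the element at that path.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XfaDataDemo
{
    // Hypothetical, heavily trimmed stand-in for an exported data file.
    public const string SampleXml =
        "<form1>" +
        "<page1><textFieldLastNameGlobal/><row><textFieldAddress/></row></page1>" +
        "<page2><textFieldLastNameGlobal/><row><textFieldAddress/></row></page2>" +
        "</form1>";

    public static XDocument Fill(string xml, IDictionary<string, string> fields)
    {
        XDocument doc = XDocument.Parse(xml);
        foreach (var kvp in fields)
        {
            // "//key" matches anywhere in the tree; key may be a single
            // element name or a slash-separated path of element names.
            foreach (XElement e in doc.XPathSelectElements("//" + kvp.Key))
            {
                e.Value = kvp.Value;
            }
        }
        return doc;
    }

    public static void Main()
    {
        XDocument doc = Fill(SampleXml, new Dictionary<string, string>
        {
            { "textFieldLastNameGlobal", "Doe" },              // fills both pages
            { "form1/page1/row/textFieldAddress", "1 Main St" } // fills page1 only
        });
        Console.WriteLine(doc);
    }
}
```

Using a path as the key is one way to deal with element names that repeat in different parts of the form but need different values.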

Now that we have a method to create valid XML, fill the form fields with some sample data:

var fields = new Dictionary<string, string>()
{
    { "textFieldLastNameGlobal", "Doe" },
    { "textFieldFirstNameGlobal", "Jane" }
};
var filledXml = FillXml(fields);

using (var ms = new MemoryStream())
{
    // PDF_READER => I-9 PdfReader instance
    using (PDF_READER)
    {
        // I-9 has password security
        PdfReader.unethicalreading = true;
        // maintain usage rights on output file
        using (var stamper = new PdfStamper(PDF_READER, ms, '\0', true))
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(filledXml);
            stamper.AcroFields.Xfa.FillXfaForm(doc.DocumentElement);
        }
    }
    File.WriteAllBytes(OUTFILE, ms.ToArray());
}

To answer your last question, how to determine the form 'type', use the PdfReader instance like so:

PDF_READER.AcroFields.Xfa.XfaPresent

true means XFA; false means the PDF has no XFA, so any form in it is an AcroForm.
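Wrapped up as a helper, the check might look like the sketch below. It assumes iTextSharp 5.x is referenced; the class name `FormTypeCheck` and the method `Describe` are hypothetical. It combines `XfaPresent` with the `AcroFields.Fields.Count` check from the question, so a PDF with no form at all is not reported as an AcroForm.

```csharp
using System;
using iTextSharp.text.pdf;

public static class FormTypeCheck
{
    // Returns a short description of the kind of form found in the PDF.
    public static string Describe(string pdfPath)
    {
        // The I-9 has password security, as noted above.
        PdfReader.unethicalreading = true;
        using (var reader = new PdfReader(pdfPath))
        {
            AcroFields form = reader.AcroFields;
            if (form.Xfa.XfaPresent)
            {
                return "XFA";
            }
            // No XFA: distinguish a classic AcroForm from a PDF without fields.
            return form.Fields.Count > 0 ? "AcroForm" : "no form fields found";
        }
    }
}
```

For the new I-9 described in the question (zero AcroForm fields, XFA present), this would be expected to report "XFA".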

Brenda

Here's my final code in case someone out there can use it...I do have an On Error Resume Next in place because the i9 is a ridiculously picky form and I'm choosing to fill some things in slightly differently than they want me to. I've also chopped out where I'm setting some of my variables in order to keep it shorter. Thanks again to kuujinbo for your help!

Private Sub ExportI9()
    Dim pdfTemplate As String = Path.Combine(Application.StartupPath, "PDFs\2017-I9.pdf")
    pdfTemplate = Replace(pdfTemplate, "bin\Debug\", "")


    Dim fields = New Dictionary(Of String, String)() From {
    {"textFieldLastNameGlobal", Me.tbLast.Text},
    {"textFieldFirstNameGlobal", Me.tbFirst.Text},
    {"textFieldMiddleInitialGlobal", Mid(Me.tbMiddle.Text, 1, 1)},
    {"textFieldOtherNames", Me.tbOtherName.Text},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldAddress", addr1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldAptNum", ""},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldCityOrTown", city1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/State", state1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldZipCode", zip1},
    {"dateFieldBirthDate", Me.dtpBirth.Value},
    {"SSN", Me.tbSSN.Text},
    {"fieldEmail", ""},
    {"fieldPhoneNum", sphone},
    {"radioButtonListCitizenship", citizenship},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subCitizenshipStatus/textFieldResidentType", alienuscis},
    {"dateAlienAuthDate", dauth},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subAuthorizedAlien/numFormI94Admission", Me.tbi94.Text},
    {"numForeignPassport", Me.tbPassport.Text},
    {"CountryofIssuance", Me.tbPassportCountry.Text},
    {"numAlienOrUSCIS", usc},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subAuthorizedAlien/textFieldResidentType", alienuscis},
    {"rbListPerparerOrTranslator", 3},
    {"dropdownMultiPreparerOrTranslator", 1},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow2/textFieldFirstName", prepfirst},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow2/textFieldLastName", preplast},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldAddress", Replace(prepadd, "#", "No. ")},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldCityOrTown", prepcity},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/State", prepstate},
        {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldZipCode", prepzip},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/selectListA1DocumentTitle", doctitle1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/selectListBDocumentTitle", doctitle2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/selectListCDocumentTitle", doctitle3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/textFieldIssuingAuthority", issued1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/textFieldIssuingAuthority", issued2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/textFieldIssuingAuthority", issued3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/dateExpiration", expdate1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/dateExpiration", expdate2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/dateExpiration", expdate3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/textFieldDocumentNumber", docnum1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/textFieldDocumentNumber", docnum2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/textFieldDocumentNumber", docnum3},
        {"form1/section2and3Page2/subSection2/subCertification/subAttest/dateEmployeesFirstDay", CDate(Me.dtpHired.Value).ToShortDateString},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow2/textFieldLastName", certlast},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow2/textFieldFirstName", certfirst},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldAddress", orgadd},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldCityOrTown", orgcity},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/State", orgstate},
        {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldZipCode", orgzip},
        {"textBusinessOrgName", orgname}
    }


    Dim PDFUpdatedFile As String = pdfTemplate
    PDFUpdatedFile = Replace(PDFUpdatedFile, "I9", Me.tbSSN.Text & "-I9")
    If System.IO.File.Exists(PDFUpdatedFile) Then System.IO.File.Delete(PDFUpdatedFile)
    Dim readerPDF As New PdfReader(pdfTemplate)


    Dim filledXml = FillXml(fields)
    Using ms = New MemoryStream()
        Using readerPDF
            ' I-9 has password security
            PdfReader.unethicalreading = True
            Dim stamper As New PdfStamper(readerPDF, ms, ControlChars.NullChar, True)
            Using stamper
                Dim doc As New XmlDocument()
                doc.LoadXml(filledXml)
                stamper.AcroFields.Xfa.FillXfaForm(doc.DocumentElement)
            End Using
        End Using
        File.WriteAllBytes(PDFUpdatedFile, ms.ToArray())
    End Using
End Sub


Public Function FillXml(fields As Dictionary(Of String, String)) As String
    ' XML_INFILE => physical path to XML file exported from I-9
    Dim xmlfile As String

    xmlfile = Path.Combine(Application.StartupPath, "PDFs\2017-I9_data.xml")
    xmlfile = Replace(xmlfile, "bin\Debug\", "")
    Dim kvp As KeyValuePair(Of String, String)

    Dim xDoc As XDocument = XDocument.Load(xmlfile)
    For Each kvp In fields
        ' handle multiple elements in I-9 form
        Dim elements = xDoc.XPathSelectElements(String.Format("//{0}", kvp.Key))
        If elements.Count() > 0 Then
            For Each e As XElement In elements
                On Error Resume Next
                e.Value = kvp.Value
            Next
        End If
    Next

    Return xDoc.ToString()
End Function