Corrupted .docx download using phpdocx


I have a project in which we are using phpdocx pro to generate a .docx file from templates. I can get the data into the template easily enough. However, when the file is downloaded and opened in MS Word 2010, the program reports that the file cannot be opened because there are problems with the contents, the details being "The file is corrupt and cannot be opened". Word can repair the document, but the file should not be corrupted in the first place.

This is how I'm generating the document:

function generateUnitDesign(){
  if($this->populateRecords()){
      require_once dirname(__FILE__).'/phpdocx/classes/CreateDocx.inc';
      $filename = 'UnitDesignTemplate-'.str_replace(' ', '', $this->rec->title);
      //Create Document
      $document = new CreateDocx();
      $document->addTemplate(dirname(__FILE__).'/templates/unitdesigntemplate.docx');

      // Fill in text fields
      $document->addTemplateVariable('TITLE', $this->rec->title);
      $document->addTemplateVariable('CHALLENGE', $this->rec->challenge, 'html');
      $document->addTemplateVariable('HOOK', $this->rec->hook, 'html');
      $document->addTemplateVariable('RESEARCH', $this->rec->research, 'html');
      $document->addTemplateVariable('AUDIENCE', $this->rec->audience, 'html');
      $document->addTemplateVariable('SUMMARY', $this->rec->project_brief, 'html');
      $document->addTemplateVariable('RESOURCES', $this->rec->resources, 'html');
      $document->addTemplateVariable('REQUIREMENTS', $this->rec->requirements, 'html');
      $document->addTemplateVariable('SCAFFOLDING', $this->rec->scaffolding, 'html');

      $document->createDocx($filename);
      unset($document);
      header("Content-Type: application/vnd.ms-word");
      header("Content-Length: ".filesize($filename.'.docx'));
      header('Content-Disposition: attachment; filename='.$filename.'.docx');
      header('Content-Transfer-Encoding: binary');
      ob_clean();
      flush();
      readfile($filename.'.docx');
      unlink($filename.'.docx');
  }
}

Originally, I was trying to use their createDocxAndDownload() function to get the file, but it would leave a copy of the .docx file on the server, which was not ideal. Am I missing something? Is there someone with more experience with phpdocx to lend a hand?

Edit: Well, I feel like an idiot. After narrowing the issue down to the portion of code that outputs the file, I finally opened the file in a hex editor. The file itself was output successfully, but the web front end then appended the start of its HTML to the end of the docx file, producing a 'corrupted' file. This one line immediately after the unlink() fixed the whole thing:

exit;

Pekka: If you would like to answer this with the new information, I'll accept your answer.

There are 2 answers

Justin Pearce (best answer)

After narrowing the issue down to the portion of code that outputs the file, I finally opened the file in a hex editor. The file itself was output successfully, but the web front end then appended the start of its HTML to the end of the docx file, producing a 'corrupted' file. This one line immediately after the unlink() fixed the whole thing:

exit;
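
For anyone landing here with the same symptom, the output step with the fix applied might look roughly like the sketch below. The helper names buildDocxDownloadHeaders() and sendDocxAndExit() are made up for illustration and are not part of phpdocx. The sketch also swaps in the registered Office Open XML MIME type and quotes the filename, which the code in the question does not do.

```php
<?php
/**
 * Build the response headers for a .docx download.
 * Hypothetical helper for illustration; not a phpdocx function.
 */
function buildDocxDownloadHeaders(string $path, string $downloadName): array
{
    // Make sure filesize() reports the size of the freshly written file.
    clearstatcache(true, $path);

    return [
        'Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        'Content-Length: ' . filesize($path),
        'Content-Disposition: attachment; filename="' . $downloadName . '"',
    ];
}

/**
 * Stream the generated file to the browser, delete it, and stop the script.
 * Hypothetical helper for illustration; not a phpdocx function.
 */
function sendDocxAndExit(string $path, string $downloadName): void
{
    // Discard any buffered output so only the file bytes reach the client.
    while (ob_get_level() > 0) {
        if (!ob_end_clean()) {
            break;
        }
    }

    foreach (buildDocxDownloadHeaders($path, $downloadName) as $line) {
        header($line);
    }

    readfile($path);
    unlink($path);

    // Stop here so the surrounding front end cannot append its HTML
    // after the last byte of the document.
    exit;
}
```

In the code from the question, the header(), readfile() and unlink() lines would then collapse into a single call such as sendDocxAndExit($filename.'.docx', $filename.'.docx');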

Richard Keller

This is difficult to pinpoint without direct access to the template file, but here are some pointers where templating engines often fail:

  • Try logging all your PHP variables, for example with print print_r($this->rec->variable_name, true); and then check that every variable is a string and that none are NULL.
  • Inspect your template file and make sure that the style (i.e. font type, font size, etc.) is consistent within each template variable. In other words, make sure that no variable has part of its name in a different style from the rest. This particular subtlety is very easy to introduce in a template file, and generally the easiest way to fix it is to simply delete and retype each template variable.
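
The first check above can be automated with a few lines. This is only a sketch: findNonStringValues() is a made-up helper name, and the array keys are whatever placeholder names your template uses.

```php
<?php
/**
 * Return the entries that are not strings, keyed by placeholder name,
 * with the offending PHP type as the value.
 * Hypothetical helper for illustration; not a phpdocx function.
 */
function findNonStringValues(array $values): array
{
    $bad = [];
    foreach ($values as $name => $value) {
        if (!is_string($value)) {
            // gettype() reports "NULL" for null, "integer" for ints, etc.
            $bad[$name] = gettype($value);
        }
    }
    return $bad;
}
```

It could be called just before the addTemplateVariable() calls, for example error_log(print_r(findNonStringValues(['TITLE' => $this->rec->title, 'HOOK' => $this->rec->hook]), true)); an empty array means every value is a string.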

Lastly, try removing the 'html' parameter when invoking the addTemplateVariable method and see whether that makes a difference. If you're not actually using HTML, there's no point in passing the 'html' parameter. Conversely, if you are using HTML, the corruption may be caused by incorrectly structured HTML, which leads Microsoft Word to flag the document as corrupted.
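
One rough way to spot badly nested markup before handing it to the template is to parse it as XML. This is a sketch under assumptions, not something phpdocx provides: isWellFormedFragment() is a made-up name, it relies on PHP's DOM extension, and the check is deliberately stricter than HTML itself, so valid HTML such as a bare <br> or an &nbsp; entity will also be reported as not well formed.

```php
<?php
/**
 * Return true if the fragment parses as well-formed XML when wrapped
 * in a single root element. Hypothetical helper for illustration.
 */
function isWellFormedFragment(string $html): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML('<root>' . $html . '</root>');

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}
```

A field that fails this check is a good candidate to test first, either with the 'html' parameter removed or with its markup cleaned up.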