Edit docx using nokogiri and rubyzip


I'm using rubyzip and nokogiri to modify a .docx file.

RubyZip -> Unzip the .docx file
Nokogiri -> Parse and change the content of the body of word/document.xml

I wrote the sample code below. It modifies the file, but the other files in the archive are disturbed. In other words, the updated file does not open: it shows an error and the word processor crashes. How can I resolve this issue?

require 'zip/zipfilesystem'
require 'nokogiri'
zip = Zip::ZipFile.open("SecurityForms.docx")
doc = zip.find_entry("word/document.xml")
xml = Nokogiri::XML.parse(doc.get_input_stream)
wt = xml.root.xpath("//w:t", {"w" => "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}).first
wt.content = "FinalStatement"
zip.get_output_stream("word/document.xml") {|f| f << xml.to_s}
zip.close

There are 2 answers

2
Waynn Lue On

According to the official GitHub documentation, you should use write_buffer instead of open. There's also a code example at the link.

0
Muhammad Ateq Ejaz On

The following code edits the content of a .docx template file. It first creates a new copy of template.docx. You need to create this template file yourself and keep it in the same folder as your Ruby class file, for example My_Class.rb, into which you copy the code below. It works perfectly in my case. Remember that you need to install the rubyzip and nokogiri gems in a gemset. Thanks.

require 'rubygems'
require 'fileutils'          # needed for FileUtils.cp
require 'zip/zipfilesystem'  # rubyzip < 1.0; on rubyzip >= 1.0 use require 'zip' and Zip::File
require 'nokogiri'

class Edit_docx
  def initialize
    # Build a random 50-letter file name for the working copy of the template.
    coupling = [('a'..'z'), ('A'..'Z')].map { |i| i.to_a }.flatten
    secure_string = (0...50).map { coupling[rand(coupling.length)] }.join
    FileUtils.cp 'template.docx', "#{secure_string}.docx"

    zip = Zip::ZipFile.open("#{secure_string}.docx")
    doc = zip.find_entry("word/document.xml")
    xml = Nokogiri::XML.parse(doc.get_input_stream)
    wt = xml.root.xpath("//w:t", {"w" => "http://schemas.openxmlformats.org/wordprocessingml/2006/main"})
    #puts wt
    # Replace the text of every <w:t> node with its index.
    wt.each_with_index do |tag, i|
      tag.content = i.to_s
    end
    zip.get_output_stream("word/document.xml") { |f| f << xml.to_s }
    zip.close
    puts secure_string
    #FileUtils.rm("#{secure_string}.docx")
  end
end

# Run the edit; this call belongs outside the class body.
Edit_docx.new