I am trying to convert an HTML page to PDF using iText and Flying Saucer. The HTML page is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<title>中文測試</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style type="text/css">
name
{
font-family: "Arial Unicode MS";
color: blue;
font-size: 48;
}
</style>
</head>
<body>
<name>名偵探小怪獸</name>
<h1>भारतीय जनता पार्टी ने फिर कहा है कि बहुमत न होने के कारण वो दिल्ली में सरकार बनाने की
इच्छुक नहीं है और दोबारा चुनाव के लिए तैयार है.
</h1>
<h1>Japanese 日本国</h1>
</body>
</html>
and the Java code is:
import java.io.*;
import org.xhtmlrenderer.pdf.*;
import com.lowagie.text.pdf.*;

public class ChineseToPdf {
    public static void main(String[] args) {
        try {
            String inputFile = "chinese.html";
            String url = new File(inputFile).toURI().toURL().toString();
            String outputFile = "test.pdf";
            OutputStream os = new FileOutputStream(outputFile);
            ITextRenderer renderer = new ITextRenderer();
            ITextFontResolver resolver = renderer.getFontResolver();
            resolver.addFont("C:/Windows/Fonts/arialuni.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            renderer.setDocument(url);
            renderer.layout();
            renderer.createPDF(os);
            os.close();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
In the output, only the Chinese text is rendered properly; the Hindi and Japanese text comes out as white space.
Please help me out.
The style you defined applies only to the name tag, and the Hindi and Japanese text is outside this tag. That text is rendered with the default font, which does not support all Unicode characters. To fix this, change your style so that the "Arial Unicode MS" font is used for the whole document, for example by declaring font-family on body instead of only on name.
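A minimal sketch of such a stylesheet, assuming the font is set on body so that h1 and name inherit it. The choice of the body selector is an assumption for illustration and may not be the exact rule the answer originally showed. The rest of the name rule is carried over unchanged from the question.

```css
/* Assumed selector: body, so every element in the page inherits the font. */
body {
    font-family: "Arial Unicode MS";
}

/* The original rule keeps its own properties and inherits the font from body. */
name {
    color: blue;
    font-size: 48;
}
```

The font-family value should match the family name of the font registered with resolver.addFont in the Java code, otherwise the renderer will probably fall back to the default font again.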