XLIFF source with HTML content


Is it common to give a translator an XLIFF file that contains HTML?

Do CAT tools support HTML tags properly?

<trans-unit id="1" xml:space="preserve">
    <source>This is &lt;b&gt;bold&lt;/b&gt;</source>
</trans-unit>

Update:

I'm working on an HTML5 WYSIWYG editor with widgets, and we need an export-for-translation feature.


There are 2 answers

Thomas

Have a look at the XLIFF 1.2 Representation Guide for HTML.

Ideally, you should wrap the HTML elements in XLIFF inline elements, like this: <g id='d' ctype='bold'>bold</g>.

Most CAT tools support all the native XLIFF elements, but they treat escaped HTML as plain text, which is likely to cause issues (for example, markup being edited or translated by mistake).
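As a rough illustration of the wrapping described above, here is a minimal Python sketch that turns an HTML fragment into a `<source>` element with `<g>` inline elements. The `CTYPES` mapping, the `InlineToXliff` class, and the `html_to_source` helper are hypothetical names made up for this example; the sketch only handles paired inline tags (no `<br>`, no attributes) and is not a replacement for a real XLIFF filter.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed, illustrative mapping from HTML tags to XLIFF 1.2 ctype values.
CTYPES = {"b": "bold", "strong": "bold", "i": "italic", "em": "italic", "u": "underlined"}


class InlineToXliff(HTMLParser):
    """Builds a <source> element, wrapping each paired HTML tag in a <g>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.source = ET.Element("source")
        self.stack = [self.source]  # elements that are currently open
        self.next_id = 1

    def handle_starttag(self, tag, attrs):
        # Unknown tags fall back to a custom "x-html-*" type; attributes are ignored.
        ctype = CTYPES.get(tag, "x-html-" + tag)
        g = ET.SubElement(self.stack[-1], "g", {"id": str(self.next_id), "ctype": ctype})
        self.next_id += 1
        self.stack.append(g)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:  # never pop the root <source>
            self.stack.pop()

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text that follows a child element belongs in that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_source(html_fragment):
    """Return the fragment serialized as an XLIFF-style <source> element."""
    parser = InlineToXliff()
    parser.feed(html_fragment)
    parser.close()
    return ET.tostring(parser.source, encoding="unicode")


print(html_to_source("This is <b>bold</b>"))
```

For the example from the question, this prints `<source>This is <g id="1" ctype="bold">bold</g></source>`, so the CAT tool sees a protected inline element instead of escaped text.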

Jenszcz

If you have an HTML file that needs to be translated, it is not common to convert it to XLIFF, because most translation environments support HTML directly. If you have some other format (for example Java .properties, .resx, or .json with embedded HTML content), check with your translator whether their environment handles it. Generally speaking, every conversion between file types risks breaking characters that have a special meaning in the native format, such as < and > in XML and HTML, apostrophes in .properties files, or commas and double quotes in CSV files. Avoid unnecessary conversions if possible.
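To make the conversion risk concrete, here is a small Python sketch using only the standard library. It shows escaped HTML surviving a round trip through XML when it is escaped exactly once, and the same text coming back damaged when it is escaped twice. The sample string and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

html_fragment = "This is <b>bold</b> & more"

# Safe: hand the raw text to the XML library and let it escape once.
unit = ET.Element("trans-unit", {"id": "1"})
ET.SubElement(unit, "source").text = html_fragment
serialized = ET.tostring(unit, encoding="unicode")
round_tripped = ET.fromstring(serialized).find("source").text

# Broken: escape by hand first, then let the library escape again.
bad = ET.Element("source")
bad.text = escape(html_fragment)
double_escaped = ET.fromstring(ET.tostring(bad, encoding="unicode")).text

print(serialized)
print(round_tripped == html_fragment)   # the single-escaped text comes back intact
print(double_escaped == html_fragment)  # the double-escaped text does not
```

After the second path, the parsed text still contains literal `&lt;` and `&amp;` sequences, which is the kind of silent damage each extra conversion step can introduce.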

Since you mentioned Trados Studio: for some file types (XML, for example) it lets you add an embedded content processor that takes care of any embedded HTML tags. Check the Trados Studio help for details. The output of this process is not pretty, but at least it gets the tags right.

You could also use the Okapi framework to convert your format to XLIFF and define a pipeline that does all the conversions you need. Okapi is very powerful, but not always intuitive.