Flexmark Typographic Extension silently dropping characters?


Using Flexmark 0.64.0 with Java 17, I expect the Typographic Extension to turn ' into ’ with its default configuration, but instead it seems to simply drop the character. The same happens with other sequences, e.g. ---.

I'm setting up my parser and HTML renderer like this:

MutableDataHolder parserOptions = new MutableDataSet()
    //emoji; see https://www.webfx.com/tools/emoji-cheat-sheet/
    .set(EmojiExtension.USE_IMAGE_TYPE, EmojiImageType.UNICODE_ONLY)
    //GFM tables
    .set(TablesExtension.COLUMN_SPANS, false)
    .set(TablesExtension.APPEND_MISSING_COLUMNS, true)
    .set(TablesExtension.DISCARD_EXTRA_COLUMNS, true)
    .set(TablesExtension.HEADER_SEPARATOR_COLUMN_MATCH, true)
    //extensions
    .set(Parser.EXTENSIONS, List.of(DefinitionExtension.create(), EmojiExtension.create(), SuperscriptExtension.create(), TablesExtension.create(),
        TypographicExtension.create(), YamlFrontMatterExtension.create()));
parser = Parser.builder(parserOptions).build();
htmlRenderer = HtmlRenderer.builder().build();

Note that I just use TypographicExtension.create(). Maybe further configuration is needed, but I wouldn't expect the extension to drop characters by default.

I use the parser like this:

com.vladsch.flexmark.util.ast.Document markdownDocument = parser.parse("it's working");
System.out.println(htmlRenderer.render(markdownDocument));

I expect:

<p>it&rsquo;s working</p>

Instead I get:

<p>its working</p>
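
To pin down the expectation independently of Flexmark, here is a minimal plain-Java sketch of the substitutions I have in mind for the two inputs mentioned above. The class `ExpectedSmarts` and its `apply` method are hypothetical names made up for this illustration; this is not Flexmark code and not how the extension is implemented. It also assumes that --- should become &mdash;, which is my reading of the extension's defaults.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, NOT Flexmark code: it only spells out the output
// expected for the two inputs mentioned in the question.
class ExpectedSmarts {
    // Insertion order is preserved, so the multi-character pattern is applied first.
    private static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("---", "&mdash;"); // assumption: three hyphens become an em dash entity
        REPLACEMENTS.put("'", "&rsquo;");   // unmatched single quote becomes a right single quote entity
    }

    // Applies each literal replacement in order and returns the result.
    static String apply(String text) {
        String result = text;
        for (Map.Entry<String, String> entry : REPLACEMENTS.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("<p>" + apply("it's working") + "</p>"); // prints <p>it&rsquo;s working</p>
        System.out.println("<p>" + apply("wait---what") + "</p>");  // prints <p>wait&mdash;what</p>
    }
}
```

The point is only that the character should be replaced, never removed: the output should contain either the original character or its substitute.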

I tried mucking with the settings, using a literal character instead of a character reference:

.set(TypographicExtension.ENABLE_QUOTES, true)
.set(TypographicExtension.SINGLE_QUOTE_UNMATCHED, "x")

Nothing changed. The character still simply disappeared.

However I was able to disable quote processing altogether:

.set(TypographicExtension.ENABLE_QUOTES, false)

Then I got my original string back. (Of course this defeats the purpose of the extension.)

Does this mean the Typographic Extension is simply broken, or am I missing some additional configuration? (In any case, I certainly wouldn't expect the default configuration to silently discard content.)

I opened Flexmark Issue #547, but so far I haven't had any responses. Maybe someone can point out my error.
