Pandoc - HTML to Markdown not processing nested unordered lists correctly

I am trying to convert a nested unordered HTML list to Markdown using Pandoc. The nested list in the HTML document is in the format:

<ul>
    <li>outer list item</li>
    <li>outer list item</li>
    <li>outer list item</li>
    <ul>
        <li>inner list item</li>
        <li>inner list item</li>
        <li>inner list item</li>
    </ul>
    <li>outer list item</li>
    <li>outer list item</li>
</ul>

The command I am using to convert the HTML to Markdown is:

pandoc -o output.md input.html

The result I am getting in the generated Markdown file is:

outer list item

outer list item

outer list item

- inner list item
- inner list item
- inner list item

outer list item

outer list item

outer list item

So the outer list is not being converted to an unordered list in Markdown. I have tried passing the --parse-raw option (see http://pandoc.org/README.html#pandocs-markdown) to Pandoc, and the outer HTML is then passed through as raw HTML into the Markdown document, which suggests that for some reason the outer HTML cannot be translated.

Does anyone have any ideas why this isn't working?

Thanks, Gary

There is 1 answer

mb21 (best answer)

Your HTML is not valid: a nested <ul> must sit inside an <li>, not directly inside the outer <ul>. It should be:

<ul>
    <li>outer list item</li>
    <li>outer list item</li>
    <li>outer list item</li>
    <li>
      <ul>
        <li>inner list item</li>
        <li>inner list item</li>
        <li>inner list item</li>
      </ul>
    </li>
    <li>outer list item</li>
    <li>outer list item</li>
</ul>
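
If the HTML comes from a source you cannot fix by hand, the same repair can be scripted before running Pandoc. The following is only a minimal sketch, not part of Pandoc: the function name wrap_stray_lists is made up for illustration, and it assumes the input is well-formed enough to be parsed as XML by Python's xml.etree.ElementTree (real-world HTML may need a lenient HTML parser instead). It wraps any <ul> or <ol> that is a direct child of another list in its own <li>, which produces the structure shown above.

```python
import xml.etree.ElementTree as ET

LIST_TAGS = {"ul", "ol"}


def wrap_stray_lists(root):
    """Wrap lists nested directly inside another list in a new <li>.

    Hypothetical helper for illustration; assumes `root` was parsed
    from markup that is well-formed XML.
    """
    # Snapshot the elements first so the tree can be modified safely.
    for parent in list(root.iter()):
        if parent.tag not in LIST_TAGS:
            continue
        for index, child in enumerate(list(parent)):
            if child.tag in LIST_TAGS:
                wrapper = ET.Element("li")
                # Keep any trailing text attached to the same position.
                wrapper.tail = child.tail
                child.tail = None
                parent.remove(child)
                parent.insert(index, wrapper)
                wrapper.append(child)
    return root


if __name__ == "__main__":
    broken = (
        "<ul><li>outer list item</li>"
        "<ul><li>inner list item</li></ul>"
        "<li>outer list item</li></ul>"
    )
    repaired = wrap_stray_lists(ET.fromstring(broken))
    print(ET.tostring(repaired, encoding="unicode"))
```

The repaired markup can then be saved and converted with the same pandoc command as in the question.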