How to convert SML/NJ HTML4 representation to a string

When using the HTML4 library that ships with the SML/NJ library, how do I convert the Standard ML representation of an HTML4 document into a string?

For example, if I have the HTML representation below, what function can I use to get a string similar to <html><head><title>Example</title></head><body><h1>Hello!</h1></body></html>?

(* CM.make "$/html4-lib.cm"; *)
open HTML4;
val myHTML = HTML {
  version=NONE,
  head=[Head_TITLE ([], [PCDATA "Example"])],
  content=BodyOrFrameset_BODY (BODY ([], [
    BlockOrScript_BLOCK (H1 ([], [CDATA [PCDATA "Hello!"]]))]))
};

(SML/NJ version: 110.99.2)

There are 2 answers

Flux (best answer)

According to the SML/NJ bug tracker, the following function can be used to convert HTML4.html to a string:

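fun toString html =
  let
    (* growable character buffer; 1024 is only an initial size hint *)
    val buf = CharBuffer.new 1024
  in
    (* the printer reports output through these two callbacks *)
    HTML4Print.prHTML {
      putc = fn c => CharBuffer.add1 (buf, c),
      puts = fn s => CharBuffer.addVec (buf, s)
    } html;
    (* everything written so far, returned as a single string *)
    CharBuffer.contents buf
  end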

To be able to use HTML4Print.prHTML in the SML/NJ REPL, the REPL should be started using sml '$/html4-lib.cm'. Alternatively, enter CM.make "$/html4-lib.cm"; after starting the REPL.

The REPL reports the function's type as val toString = fn : HTML4.html -> CharBuffer.vector. CharBuffer is an extension to the Basis Library (reference: 2018 001 Addition of monomorphic buffers). CharBuffer.vector is the same type as CharVector.vector, which is the same type as String.string, which is the same type as string, so the result can be used anywhere an ordinary string is expected.

Ionuț G. Stan

It seems you could use the HTML4Print structure (which appears in the export list in the CM file):

$ sml '$/html4-lib.cm'
Standard ML of New Jersey (64-bit) v110.99.2 [built: Thu Sep 23 13:44:44 2021]
[library $/html4-lib.cm is stable]
- open HTML4Print;
[autoloading]
[library $SMLNJ-LIB/Util/smlnj-lib.cm is stable]
[autoloading done]
opening HTML4Print
  val prHTML : {putc:char -> unit, puts:string -> unit} -> HTML4.html -> unit
  val prBODY : {putc:char -> unit, puts:string -> unit} -> HTML4.body -> unit

So, with your value, it produces:

- HTML4Print.prHTML { putc = print o String.str, puts = print } myHTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>
Example
</TITLE>
</HEAD>
<BODY>
<H1>Hello!</H1>
</BODY>
</HTML>
val it = () : unit
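
To get a string back instead of printing, the same two callbacks can collect the output. Below is a minimal sketch that uses only Basis Library functions, so it does not rely on CharBuffer being available. The name htmlToString is made up for this example; only HTML4Print.prHTML and its record of putc/puts callbacks come from the library output shown above.

```sml
(* Assumes the library is loaded, e.g. CM.make "$/html4-lib.cm"; *)

(* htmlToString is a hypothetical helper name, not part of html4-lib. *)
fun htmlToString (html : HTML4.html) : string =
  let
    (* chunks are collected in reverse order, newest first *)
    val chunks : string list ref = ref []
    fun puts s = chunks := s :: !chunks
    fun putc c = puts (String.str c)
  in
    HTML4Print.prHTML { putc = putc, puts = puts } html;
    (* restore the original order and join into one string *)
    String.concat (List.rev (!chunks))
  end
```

With the value from the question, htmlToString myHTML should return the same text that the print-based call above writes to standard output, including the DOCTYPE line and the line breaks inserted by the printer.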