When to use SGML and XML?

This question is equivalent to: "What are the advantages and disadvantages of using SGML over XML, and of using XML over SGML?"

I already know several similarities and differences between SGML and XML, but they don't answer this question.

SIMILARITIES

  1. SGML and XML both allow us to describe documents (structure, data, metadata);
  2. Both separate appearance (colors, etc.) from data/structure/metadata;
  3. Both SGML and XML can be used in Web pages and on the Web (even if XML is more specialized for the Web than SGML);
  4. SGML and XML documents must have a DTD to be "VALID".

DIFFERENCES

  1. SGML provides several ways to write the same thing (e.g. we can write empty and unclosed tags, we can write <foo>d</>, etc.)
  2. SGML documents can be very hard to write
  3. Thus, parsing SGML documents can be very slow and complex
  4. XML is a subset of SGML that is simpler to learn and to use
  5. Unlike SGML, XML doesn't allow writing the same thing in several ways (e.g. empty and unclosed tags are NOT allowed)
  6. Thus, parsing XML is simpler and faster than parsing SGML
  7. SGML documents have no "WELL-FORMED" status; XML documents do (they have this status if their syntax is correct)
  8. SGML documents must contain a DTD; XML documents need not.
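
Differences 5, 7 and 8 can be illustrated with a small sketch. It assumes Python's standard `xml.etree.ElementTree` module as the XML parser; the helper name `is_well_formed` and the sample strings are made up for this illustration. None of the samples carries a DTD, so the check concerns well-formedness only, not validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: two XML spellings and two SGML-style shortcuts.
samples = {
    "explicit end tag": "<foo>d</foo>",
    "XML empty-element tag": "<doc><br/></doc>",
    "SGML short end tag": "<foo>d</>",
    "unclosed element": "<doc><br></doc>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

The first two samples are accepted even though no DTD is present, while the SGML-style short end tag and the unclosed element are rejected by the XML parser. An SGML parser with a suitable SGML declaration and DTD could accept the last two; that part is not shown here.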

BUT THE QUESTION REMAINS

What are the advantages and disadvantages of SGML and XML (i.e. when should one be used rather than the other)?

There are 2 answers

Michael Kay

The difference is that all the world uses XML and there's vast amounts of software for it, whereas SGML is used only by a small high priesthood and has very little software available.

Technical differences in such a situation are largely irrelevant.

imhotap

Allow me to chime in as someone who has spent considerable effort on SGML just recently.

I think your point 3 (XML is more specialized for the Web than SGML) isn't correct, because parsing HTML is beyond XML's capabilities. On the contrary, I'd argue that we're going to see increased use of SGML in contemporary HTML-based workflows, where HTML is used as both the authoring and the delivery format.
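
To make the "beyond XML's capabilities" claim concrete, here is a minimal sketch using only the Python standard library. The snippet and the `TagCollector` class are made up for the illustration. Note the assumption: `html.parser` is a lenient HTML tokenizer, not a DTD-driven SGML parser, so it only stands in for "something that tolerates HTML's omitted tags"; it does not infer the missing end tags the way SGML would.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Ordinary HTML: <br> is a void element and the </p> end tag is omitted.
snippet = "<p>first line<br>second line"


class TagCollector(HTMLParser):
    """Record every start tag the HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


collector = TagCollector()
collector.feed(snippet)
collector.close()

# The same text handed to an XML parser is not well-formed.
try:
    ET.fromstring(snippet)
    xml_accepts = True
except ET.ParseError:
    xml_accepts = False

print("HTML tokenizer saw:", collector.start_tags)
print("XML parser accepts snippet:", xml_accepts)
```

The HTML tokenizer reports the `p` and `br` start tags without complaint, whereas the XML parser raises a `ParseError` because neither element is ever closed.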

Your point 4 (SGML must have a DTD) holds only for traditional SGML. As early as 1998, along with the XML specification, the Annex K revision of SGML (aka "WebSGML") dropped this requirement, precisely to make DTD-less XML a proper subset of SGML. Of course, without DTD declarations you don't have tag omission/inference, empty elements (HTML "void" elements), Wiki syntaxes, and all the other power features that SGML has over XML.

Also, let me point out that I find an "SGML vs. XML" discussion pointless. SGML is a proper superset of XML and can be down-converted to XML, so you're not giving up anything at all by using SGML. I personally use both XML and SGML; I use SGML when I need its additional features.

For a modern account of using SGML, I'd like to point you to my talk/paper at http://www.xmlprague.cz/day2-2017/ ("The HTML 5.1 DTD").