Steps to reproduce on Debian Jessie GNU/Linux.
Check xmllint version:
$ xmllint --version
xmllint: using libxml version 20901
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma
Make an XHTML 1.0 Transitional file by saving this as example.xhtml:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>A title</title>
</head>
<body>
Some content
</body>
</html>
N.B. Pasting the contents of example.xhtml into the W3C Validator yields "This document was successfully checked as XHTML 1.0 Transitional!", so it should also validate when using xmllint.
xmllint online validation
This fails, despite the fact that the computer has internet access:
$ xmllint --noout --valid example.xhtml
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
^
example.xhtml:2: validity error : Validation failed: no DTD found !
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
^
xmllint offline validation
Install XHTML 1.0 DTDs and entity files:
$ wget -qO- https://www.w3.org/TR/xhtml1/xhtml1.tgz | tar xvz
xhtml1-20020801/
xhtml1-20020801/W3C-REC.css
xhtml1-20020801/xhtml.css
xhtml1-20020801/logo-REC.png
xhtml1-20020801/w3c_home.png
xhtml1-20020801/wcag1AAA.png
xhtml1-20020801/acks.html
xhtml1-20020801/Cover.html
xhtml1-20020801/definitions.html
xhtml1-20020801/diffs.html
xhtml1-20020801/dtds.html
xhtml1-20020801/guidelines.html
xhtml1-20020801/introduction.html
xhtml1-20020801/issues.html
xhtml1-20020801/normative.html
xhtml1-20020801/Overview.html
xhtml1-20020801/prohibitions.html
xhtml1-20020801/references.html
xhtml1-20020801/xhtml1-diff.html
xhtml1-20020801/DTD/
xhtml1-20020801/DTD/xhtml-lat1.ent
xhtml1-20020801/DTD/xhtml-special.ent
xhtml1-20020801/DTD/xhtml-symbol.ent
xhtml1-20020801/DTD/xhtml.soc
xhtml1-20020801/DTD/xhtml1-frameset.dtd
xhtml1-20020801/DTD/xhtml1-strict.dtd
xhtml1-20020801/DTD/xhtml1-transitional.dtd
xhtml1-20020801/DTD/xhtml1.dcl
xhtml1-20020801/xhtml1.ps
xhtml1-20020801/xhtml1.pdf
Still fails:
$ xmllint --noout --dtdvalid xhtml1-20020801/DTD/xhtml1-transitional.dtd example.xhtml
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
^
Likewise if using the --nonet option:
$ xmllint --noout --nonet --dtdvalid xhtml1-20020801/DTD/xhtml1-transitional.dtd example.xhtml
I/O error : Attempt to load network entity http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
^
Questions
I have two questions:
- Why did none of these validation attempts succeed?
- The second one seems to fail because despite using the
--dtdvalidoption,xmllintstill tries to visithttp://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtdbecause it is referenced inexample.xhtml. Is there some way to tellxmllintto ignore that reference and to instead use a local DTD (e.g. the one already stored atxhtml1-20020801/DTD/xhtml1-transitional.dtd?
It seems the simplest workaround is:
This installs the relevant DTDs locally. Thereafter, validation succeeds:
However, although this allows me to validate the XHTML file, it does not really answer the questions. Therefore, I will not mark this question as "answered", in the hope that someone will provide an answer that does indeed answer them.