Java, xml, catalog file, XSD schema validation and NullPointerException (JAXP09020006: The argument 'systemId' can not be null.)


I'm trying to get a small example running that stores the catalog file in a Java package together with the schema file. The aim is a self-contained package that does not rely on any external files. Ultimately this would be packaged in a JAR, but for now the files are ordinary files in the file system.

But I get a NullPointerException deep within the JAXP classes and have so far been unable to understand what it is that triggers this exception and what I should do to get rid of it.

$ java -version
openjdk version "15" 2020-09-15
OpenJDK Runtime Environment (build 15+36-1562)
OpenJDK 64-Bit Server VM (build 15+36-1562, mixed mode, sharing)

$ java -classpath . foo.Foo
java.lang.NullPointerException: JAXP09020006: The argument 'systemId' can not be null.
        at java.xml/javax.xml.catalog.CatalogMessages.reportNPEOnNull(CatalogMessages.java:129)
        at java.xml/javax.xml.catalog.CatalogResolverImpl.resolveEntity(CatalogResolverImpl.java:70)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.resolveEntity(XMLEntityManager.java:1154)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.resolveDocument(XMLSchemaLoader.java:662)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.findSchemaGrammar(XMLSchemaValidator.java:2694)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleStartElement(XMLSchemaValidator.java:2069)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.startElement(XMLSchemaValidator.java:829)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDriver.scanRootElementHook(XMLNSDocumentScannerImpl.java:613)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3078)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:836)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:541)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.validation.StreamValidatorHelper.validate(StreamValidatorHelper.java:176)
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(ValidatorImpl.java:115)
        at foo.Foo.main(Foo.java:45)

Java source code (Content of ./foo/Foo.java)

package foo;

import java.io.File;
import java.io.StringWriter;

import java.net.URL;
import java.net.URI;
import java.net.URISyntaxException;

import javax.xml.XMLConstants;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.Validator;

public class Foo
{
  public static void main(String[] args)
  {
    final String  catalogFile = CatalogFeatures.Feature.FILES.getPropertyName();
    final String  catalogPath = "foo/catalog.xml";

    final ClassLoader  classLoader = Foo.class.getClassLoader();

    try
    {
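      final URL  catalogUrl = classLoader.getResource(catalogPath);
      // getResource() returns null if the catalog is not on the classpath.
      final URI  catalog = (catalogUrl != null) ? catalogUrl.toURI() : null;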

      if (catalog != null)
      {
        SchemaFactory  schemaFactory =
          SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema  schema = schemaFactory.newSchema();

        StreamSource  source = new StreamSource(new File("xyzzy.xml"));
        Validator  validator = schema.newValidator();

        validator.setProperty(catalogFile, catalog.toString());

        StringWriter  writer = new StringWriter();
        StreamResult  result = new StreamResult(writer);
        validator.validate(source, result);  // Triggers NullPointerException

        System.out.println(writer);
      }
    }
    catch (Exception e)
    {
      e.printStackTrace();
    }
  }
}

Catalog file (Content of ./foo/catalog.xml)

<?xml version="1.0"?>
<!DOCTYPE catalog
PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
"http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">

<catalog  xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="urn:foo:bar:xyzzy.xsd:0.1"
       uri="schemas/xyzzy.xsd"/>
</catalog>
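
One way to check the catalog in isolation, before any validation is involved, is to ask a CatalogResolver for the uri entry directly. The following is a minimal sketch under stated assumptions, not code from the question: the class name CatalogUriCheck and its helper resolveUri are hypothetical, and the catalog is written to a temporary directory (without the DOCTYPE) instead of being loaded from the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.transform.Source;

// Hypothetical helper class, used only for this illustration.
class CatalogUriCheck
{
  // Same <uri> entry as foo/catalog.xml, minus the DOCTYPE declaration.
  static final String CATALOG =
    "<?xml version=\"1.0\"?>\n" +
    "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n" +
    "  <uri name=\"urn:foo:bar:xyzzy.xsd:0.1\" uri=\"schemas/xyzzy.xsd\"/>\n" +
    "</catalog>\n";

  // Writes the catalog to a temporary directory and resolves 'name'
  // against its <uri> entries, returning the resolved system identifier.
  static String resolveUri(String name)
  {
    try
    {
      Path dir = Files.createTempDirectory("catalog-check");
      Path catalogFile = dir.resolve("catalog.xml");
      Files.writeString(catalogFile, CATALOG);

      CatalogResolver resolver =
        CatalogManager.catalogResolver(CatalogFeatures.defaults(),
                                       catalogFile.toUri());

      // resolve(href, base) consults the <uri> entries of the catalog.
      Source source = resolver.resolve(name, "");
      return source.getSystemId();
    }
    catch (IOException e)
    {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args)
  {
    System.out.println(resolveUri("urn:foo:bar:xyzzy.xsd:0.1"));
  }
}
```

The relative uri attribute is resolved against the location of the catalog file, so the printed identifier should point at schemas/xyzzy.xsd next to the temporary catalog.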

XSD Schema file (Content of ./foo/schemas/xyzzy.xsd)

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:foo:bar"
           xmlns:gazonk="urn:foo:bar"
           elementFormDefault="qualified">
  <xs:element name="xyzzy">
    <xs:complexType/>
  </xs:element>
</xs:schema>

XML file xyzzy.xml (Content of ./xyzzy.xml)

<?xml version="1.0"?>
<gazonk:xyzzy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:gazonk="urn:foo:bar"
              xsi:schemaLocation="urn:foo:bar:xyzzy.xsd:0.1">
</gazonk:xyzzy>

What should I do to get rid of this exception?

Update

I think I have figured out what is happening here. My gut feeling is that I have triggered a bug in java.xml/javax.xml.catalog.CatalogResolverImpl.resolveEntity, which rejects a null systemId regardless of whether publicId is null or not. It should be perfectly fine for systemId to be null as long as publicId is not null.
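
The null check can be observed without the validator. The sketch below assumes the behaviour seen in the JDK 15 stack trace above; other JDK releases may behave differently. The class name NullSystemIdDemo, the public identifier, and the temporary catalog are all hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;

// Hypothetical demo class, not part of the question's code.
class NullSystemIdDemo
{
  // Made-up public identifier with a matching <public> catalog entry.
  static final String PUBLIC_ID = "-//EXAMPLE//DTD Demo//EN";

  static final String CATALOG =
    "<?xml version=\"1.0\"?>\n" +
    "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n" +
    "  <public publicId=\"" + PUBLIC_ID + "\" uri=\"demo.dtd\"/>\n" +
    "</catalog>\n";

  // Builds a resolver backed by a catalog in a temporary directory.
  static CatalogResolver newResolver()
  {
    try
    {
      Path dir = Files.createTempDirectory("null-system-id");
      Path catalogFile = dir.resolve("catalog.xml");
      Files.writeString(catalogFile, CATALOG);
      return CatalogManager.catalogResolver(CatalogFeatures.defaults(),
                                            catalogFile.toUri());
    }
    catch (IOException e)
    {
      throw new UncheckedIOException(e);
    }
  }

  // Returns true if a null systemId is rejected with a
  // NullPointerException even though publicId is not null.
  static boolean rejectsNullSystemId()
  {
    try
    {
      newResolver().resolveEntity(PUBLIC_ID, null);
      return false;
    }
    catch (NullPointerException e)
    {
      return true;
    }
  }

  public static void main(String[] args)
  {
    System.out.println("null systemId rejected: " + rejectsNullSystemId());
  }
}
```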

To work around this, I created a wrapper class that implements the CatalogResolver interface and intercepts the pass-through calls. When only systemId is null, it replaces null with ""; when both systemId and publicId are null, it throws an exception that provides a more sensible explanation. My modified code is below.

There is also a small error in the XML example above (it does not match the schema). A matching XML file is:

<?xml version="1.0"?>
<gazonk:xyzzy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:gazonk="urn:foo:bar"
              xsi:schemaLocation="urn:foo:bar:xyzzy.xsd:0.1"/>

Java file Resolver.java (Content of ./foo/Resolver.java)

package foo;

import java.io.InputStream;

import javax.xml.catalog.CatalogResolver;
import javax.xml.transform.Source;

import org.w3c.dom.ls.LSInput;

import org.xml.sax.InputSource;


public class Resolver implements CatalogResolver
{
  private final CatalogResolver  m_resolver;


  public Resolver(CatalogResolver resolver)
  {
    if (resolver != null)
    {
      m_resolver = resolver;
    }
    else
    {
      String  message = "Wrapped resolver must not be null.";
      throw new IllegalArgumentException(message);
    }
  }

  public Source resolve(String href, String base)
  {
    return m_resolver.resolve(href, base);
  }

  public InputSource resolveEntity(String publicId, String systemId)
  {
    // Ensure systemId is not null.
    return m_resolver.resolveEntity(publicId,
                                    (systemId == null)? "" : systemId);
  }

  public InputStream resolveEntity(String publicId,
                                   String systemId,
                                   String baseUri,
                                   String namespace)
  {
    // Ensure systemId is not null.
    return m_resolver.resolveEntity(publicId,
                                    (systemId == null)? "" : systemId,
                                    baseUri,
                                    namespace);
  }

  public LSInput resolveResource(String type,
                                 String namespaceUri,
                                 String publicId,
                                 String systemId,
                                 String baseUri)
  {
    // Ensure both publicId and systemId are not null at the same time
    // before passing it on to the real resolver.
    if ((publicId == null) && (systemId == null))
    {
      String  message = ("Missing namespace and schema location pair, " +
                         "only have namespace URI '" + namespaceUri +
                         "' which is not enough to go on when trying to " +
                         "locate the schema file...");
      throw new NullPointerException(message);
    }

    // Ensure systemId is not null.
    return m_resolver.resolveResource(type,
                                      namespaceUri,
                                      publicId,
                                      (systemId == null)? "" : systemId,
                                      baseUri);
  }
}

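Modified portion of Java file Foo.java (a few lines added to the if-clause, shown with some context; the file also needs imports for javax.xml.catalog.CatalogManager and javax.xml.catalog.CatalogResolver)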

...
      if (catalog != null)
      {
        CatalogFeatures  features = CatalogFeatures.builder()
          .with(CatalogFeatures.Feature.PREFER, "public")
          .with(CatalogFeatures.Feature.DEFER, "true")
          .with(CatalogFeatures.Feature.RESOLVE, "strict")
          .build();
        CatalogResolver  resolver = CatalogManager.catalogResolver(features,
                                                                   catalog);
        Resolver  wrapper = new Resolver(resolver);

        SchemaFactory  schemaFactory =
          SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
...
        validator.setProperty(catalogFile, catalog.toString());
        validator.setResourceResolver(wrapper);

        StringWriter  writer = new StringWriter();
...

Working result

$ java -version
openjdk version "15" 2020-09-15
OpenJDK Runtime Environment (build 15+36-1562)
OpenJDK 64-Bit Server VM (build 15+36-1562, mixed mode, sharing)

$ java -classpath . foo.Foo
<?xml version="1.0" encoding="UTF-8"?><gazonk:xyzzy xmlns:gazonk="urn:foo:bar">
</gazonk:xyzzy>

Answer by Rafael Winterhalter

The same issue occurs when using the jaxb2-maven-plugin, where it cannot be worked around as easily because there is no programmatic option. To compensate for this, one can however create a class:

package com.sun.tools.xjc;

import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;

import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.EntityResolver;

public class CatalogUtil {

    static EntityResolver getCatalog(EntityResolver entityResolver, File catalogFile, ArrayList<URI> catalogUrls) throws IOException {
        if (entityResolver != null) {
            return entityResolver;
        }
        CatalogResolver resolver = CatalogManager.catalogResolver(CatalogFeatures.builder().build(), catalogUrls.toArray(URI[]::new));
        return (publicId, systemId) -> resolver.resolveEntity(publicId, systemId == null ? "" : systemId);
    }
}

and place it within a JAR file to be included in the dependencies of the jaxb2-maven-plugin. The dependency is placed before the actual xjc dependency so that this adjusted class shadows the original one. The safeguard that replaces the null value with an empty string then avoids the issue.
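
The safeguard itself can be tried outside of xjc. The following is only a sketch of the same null-to-empty-string substitution: the class NullSafeEntityResolverDemo, the public identifier, and the temporary catalog are hypothetical, and PREFER is set to "public" explicitly, as in the question, so that the public entry is consulted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, used only for this illustration.
class NullSafeEntityResolverDemo
{
  static final String CATALOG =
    "<?xml version=\"1.0\"?>\n" +
    "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n" +
    "  <public publicId=\"-//EXAMPLE//DTD Demo//EN\" uri=\"demo.dtd\"/>\n" +
    "</catalog>\n";

  // Wraps a CatalogResolver so that a null systemId is passed on as "".
  static EntityResolver nullSafe(CatalogResolver resolver)
  {
    return (publicId, systemId) ->
      resolver.resolveEntity(publicId, systemId == null ? "" : systemId);
  }

  // Resolves a public identifier with a null system identifier through
  // the wrapper and returns the system identifier found in the catalog.
  static String resolvePublicOnly(String publicId)
  {
    try
    {
      Path dir = Files.createTempDirectory("null-safe");
      Path catalogFile = dir.resolve("catalog.xml");
      Files.writeString(catalogFile, CATALOG);

      CatalogFeatures features = CatalogFeatures.builder()
        .with(CatalogFeatures.Feature.PREFER, "public")
        .build();
      CatalogResolver resolver =
        CatalogManager.catalogResolver(features, catalogFile.toUri());

      InputSource input = nullSafe(resolver).resolveEntity(publicId, null);
      return input.getSystemId();
    }
    catch (IOException e)
    {
      throw new UncheckedIOException(e);
    }
    catch (SAXException e)
    {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args)
  {
    System.out.println(resolvePublicOnly("-//EXAMPLE//DTD Demo//EN"));
  }
}
```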