XML Validation with XML-Reader in PHP

I get errors when validating a generated XML string. I load the XML string with XMLReader and assign the XSD file for validation.

The XML contains object IDs and URLs that are validated against patterns of allowed characters. I think the IDs and URLs are correct, so why does the validation produce errors?

I get error messages like these:

Element 'objectID': [facet 'pattern'] The value 'ffc89' is not accepted by the pattern '^[a-z]{1,1}[a-z0-9.-]{3,14}$'.
Element 'objectID': 'ffc89' is not a valid value of the local atomic type.
Element 'originUrl': [facet 'pattern'] The value 'http://domain.com/images/89/f972c66982290125.jpg' is not accepted by the pattern '^(http|https){1}(://){1}[a-zA-Z0-9\-\./#?&_]+'.
Element 'originUrl': 'http://domain.com/images/89/f972c66982290125.jpg' is not a valid value of the local atomic type.

Here is the code snippet:

$reader = new XMLReader();

// we enable user error handling
libxml_use_internal_errors(true);

// load xml structure for testing against xsd
$reader->xml($xml_str_tocheck);
$reader->setSchema($xsd_file_name);

// read xml structure
while( $reader->read() ) ;

// close xml
$reader->close();

// get found xml errors
$errors = libxml_get_errors();

// we disable user error handling
// (Disabling will also clear any existing libxml errors.)
libxml_use_internal_errors(false);

// check if xml is not valid
if( count($errors) )
{
    foreach ($errors as $error)
    {
        echo $error->message;
    }
}

This is the XML string to validate:

<?xml version="1.0" encoding="UTF-8"?>
<oimages startFetchDate="2015-06-10T12:48:20+00:00">
  <object>
    <objectID>ffc89</objectID>
    <images>
      <image>
        <originUrl>http://domain.com/images/89/f972c66982290125.jpg</originUrl>
      </image>
    </images>
  </object>
</oimages>

This is the XSD file:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="images">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="object" maxOccurs="unbounded" minOccurs="1">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="objectID" minOccurs="1" maxOccurs="1">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:minLength value="4"/>
                    <xs:maxLength value="15"/>
                    <xs:pattern value="^[a-z]{1,1}[a-z0-9.-]{3,14}$"/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="images" maxOccurs="1" minOccurs="1">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="image" maxOccurs="unbounded" minOccurs="0">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="url" minOccurs="1" maxOccurs="1">
                            <xs:simpleType>
                              <xs:restriction base="xs:string">
                                <xs:minLength value="10"/>
                                <xs:pattern value="^(http|https){1}(://){1}[a-zA-Z0-9\-\./#?&amp;_]+" />
                              </xs:restriction>
                            </xs:simpleType>
                          </xs:element>
                          </xs:element>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Answer by kjhughes (best answer):

Your XML is not valid with respect to your XSD.

Make the following changes to your XSD:

  1. Delete the extra closing </xs:element> tag on line 31 of the XSD.
  2. Change the root element name from images to oimages.
  3. Declare the startFetchDate attribute on oimages.
  4. Remove the leading ^ and trailing $ from ^[a-z]{1,1}[a-z0-9.-]{3,14}$. An XSD pattern always has to match the entire value, so anchors are never needed. In XSD regular expressions ^ and $ are not anchors at all; outside a character class they are matched as literal characters, which is why 'ffc89' is rejected.
  5. Remove the leading ^ from ^(http|https){1}(://){1}[a-zA-Z0-9\-\./#?&amp;_]+ for the same reason.
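
The effect of the anchors can be checked in isolation. The following is a minimal sketch, not code from the question: xsdErrors() and schemaWithPattern() are hypothetical helpers introduced here for illustration, the one-element schema is an assumption, and DOMDocument::schemaValidateSource() stands in for XMLReader only because it accepts the schema as a string.

```php
<?php
// Hypothetical helper (not from the original post): validates an XML string
// against an XSD string and returns the libxml error messages.
function xsdErrors(string $xml, string $xsd): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $doc->schemaValidateSource($xsd);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Hypothetical helper: wraps a pattern in a schema with a single objectID element.
function schemaWithPattern(string $pattern): string
{
    return <<<XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectID">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="$pattern"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
XSD;
}

$xml = '<objectID>ffc89</objectID>';

// Pattern as posted in the question, with ^ and $.
$anchored = xsdErrors($xml, schemaWithPattern('^[a-z]{1,1}[a-z0-9.-]{3,14}$'));

// Same pattern with the anchors removed.
$unanchored = xsdErrors($xml, schemaWithPattern('[a-z]{1,1}[a-z0-9.-]{3,14}'));

echo 'with ^ and $:    ', count($anchored) > 0 ? 'rejected' : 'accepted', "\n";
echo 'without anchors: ', count($unanchored) > 0 ? 'rejected' : 'accepted', "\n";
```

With the anchored pattern, the same "[facet 'pattern']" message quoted in the question should appear in $anchored, while the unanchored pattern should produce no errors.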

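After making the above changes to your XSD, the XML will validate successfully against the XSD, provided the URL element is declared as originUrl (as in the XML and in the error messages) rather than url (as in the posted XSD).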
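
As a rough check of the whole fix, here is a sketch that runs the XMLReader approach from the question against a corrected schema. Several details are assumptions rather than facts from the post: the schema is reconstructed from the five changes listed above, the URL element is named originUrl to match the XML and the error messages, startFetchDate is typed as xs:dateTime, and the redundant minOccurs="1"/maxOccurs="1" attributes are dropped for brevity. The schema is written to a temporary file because XMLReader::setSchema() expects a file name.

```php
<?php
// The XML from the question, unchanged.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<oimages startFetchDate="2015-06-10T12:48:20+00:00">
  <object>
    <objectID>ffc89</objectID>
    <images>
      <image>
        <originUrl>http://domain.com/images/89/f972c66982290125.jpg</originUrl>
      </image>
    </images>
  </object>
</oimages>
XML;

// Corrected schema (assumptions: originUrl element name, xs:dateTime attribute type).
$xsd = <<<'XSD'
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="oimages">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="object" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="objectID">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:minLength value="4"/>
                    <xs:maxLength value="15"/>
                    <xs:pattern value="[a-z]{1,1}[a-z0-9.-]{3,14}"/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="images">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="image" maxOccurs="unbounded" minOccurs="0">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="originUrl">
                            <xs:simpleType>
                              <xs:restriction base="xs:string">
                                <xs:minLength value="10"/>
                                <xs:pattern value="(http|https){1}(://){1}[a-zA-Z0-9\-\./#?&amp;_]+"/>
                              </xs:restriction>
                            </xs:simpleType>
                          </xs:element>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="startFetchDate" type="xs:dateTime"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// XMLReader::setSchema() takes a file name, so store the schema in a temp file.
$xsdFile = tempnam(sys_get_temp_dir(), 'xsd');
file_put_contents($xsdFile, $xsd);

libxml_use_internal_errors(true);
libxml_clear_errors();

$reader = new XMLReader();
$reader->xml($xml);
$reader->setSchema($xsdFile);
while ($reader->read()) {
    // Reading every node drives the validation.
}
$reader->close();

$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors(false);
unlink($xsdFile);

echo count($errors) === 0 ? "valid\n" : "invalid\n";
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

If the element is left as url, the same script should instead report a validation error for originUrl, which makes the name mismatch easy to spot.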