How do we make integer datatype behave like String data type in XML/XSD?


I have the following XSD/XML type definition. It is used by a number of business units/applications.

<xsd:simpleType name="NAICSCodeType">
    <xsd:annotation>
        <xsd:documentation>NAICSCode</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="000001"/>
        <xsd:maxInclusive value="999000"/>
    </xsd:restriction>
</xsd:simpleType>

Because this type is defined with an "integer" base, parsing strips the leading zeros from the input. E.g., 0078 becomes 78 after parsing.

We need to pass the input through as is, without stripping leading zeros, e.g. 0078 should remain 0078 after parsing.

The ideal fix is to change the restriction base from integer to string. That is a non-starter because it requires buy-in from other groups.

Is there a way to redefine the above data type for desired outcome?

How do I do it?

Books and the net don't seem to have helped much either, so I am starting to question whether this is theoretically possible at all.

2

There are 2 answers

0
C. M. Sperberg-McQueen On BEST ANSWER

It sounds as if the values in question are not in fact integers, but strings consisting only of numeric digits. Why does the schema say that they are integers if 78 and 078 and 0078 are three distinct values instead of three ways of naming the same value?

You can of course restrict xs:integer by requiring leading zeroes in the lexical space, or a fixed number of digits. But that is unlikely to have any effect on the way software reading the document re-serializes it or passes values to other software.
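As a rough sketch of the kind of restriction meant here, a pattern facet can be added to the existing type. A pattern on xs:integer constrains the lexical form, so the variant below only accepts exactly six digits, leading zeroes included. The fixed width of six is an assumption drawn from the min/max facets in the question, not something the question states outright.

```xml
<xsd:simpleType name="NAICSCodeType">
    <xsd:restriction base="xsd:integer">
        <!-- Assumption: codes are always written with exactly six digits -->
        <xsd:pattern value="[0-9]{6}"/>
        <xsd:minInclusive value="000001"/>
        <xsd:maxInclusive value="999000"/>
    </xsd:restriction>
</xsd:simpleType>
```

With this, 000078 validates while 78 and 0078 are rejected. The value is still the integer 78 as far as any schema-aware software is concerned, so a data binding layer remains free to write it back out as 78.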

3
Petru Gardea On

In theory, there shouldn't be; and as far as I know, there are no out-of-the-box XML serializers that can be configured to do what you describe. Leading zeroes and padding whitespace are remnants of the fixed-length record era (your example would be a PIC 9(6) in a COBOL copybook).

Depending on your platform, you might be able to create custom serializers. In my shop I would argue that doing so is just plain wrong.

If I were forced to do it, I would simply use a "private" variation of the XSD (based on string), implement whatever formatting is needed on my side, and be done with it. "Private" means that you don't need to share the XSD artifact you use internally to generate code with the other groups; it could produce the "input" you refer to, and the "refactoring" of the schema could be done with minimal overhead...
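For illustration only, such a private, string-based variation might look like the sketch below. The type name is kept the same so generated code lines up with the shared schema. The digits-only pattern and the length of one to six digits are assumptions made here, chosen so that an input such as 0078 is accepted unchanged.

```xml
<xsd:simpleType name="NAICSCodeType">
    <xsd:annotation>
        <xsd:documentation>NAICSCode (private, string-based variant; not shared with other groups)</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:string">
        <!-- Assumption: one to six digits, leading zeroes preserved as written -->
        <xsd:pattern value="[0-9]{1,6}"/>
    </xsd:restriction>
</xsd:simpleType>
```

Note that this variant no longer enforces the numeric range of the shared schema (for example, 000000 and 999999 would pass), so any range check would have to be done in your own code.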

I am suggesting it simply because having to put up with this is an indication that in your environment there are obviously bigger problems to deal with, starting with not necessarily understanding how to properly bridge XML with legacy systems (a wild guess, of course).