How to check for errors in the input XML when parsing with Go?


I'm a beginner with golang, writing an XML parser.

I would like to check that the XML file is formatted correctly, catching missing brackets and misspelled element or attribute names. If there is such a mistake, the code should return an error telling the user what to correct.

Let's take a concrete example of an xml file, example.xml:

<?xml version="1.0" encoding="utf-8"?>

<servers version="1">
    <server>
        <model name="Cisco" type="modelA"></model>
        <serverName>Tokyo_VPN</serverName>
        <serverIP>127.0.0.1</serverIP>
    </server>
    <server>
        <model name="Dell" type="modelB"></model>
        <serverName>Moscow_VPN</serverName>
        <serverIP>127.0.0.2</serverIP>
    </server>
</servers>

Using the standard Go package "encoding/xml", it's straightforward to define structures and parse the XML as follows:

package main

import (
    "encoding/xml"
    "fmt"
    "os"
)

type Servers struct {
    XMLName xml.Name `xml:"servers"`
    Version string   `xml:"version,attr"`
    Svs     []server `xml:"server"`
}

type server struct {
    XMLName    xml.Name `xml:"server"`
    Model      model    `xml:"model"`
    ServerName string   `xml:"serverName"`
    ServerIP   string   `xml:"serverIP"`
}

type model struct {
    XMLName xml.Name `xml:"model"`
    Name    string   `xml:"name,attr"`
    Type    string   `xml:"type,attr"`
}

func main() {
    // Read the whole XML file into a byte slice (os.ReadFile needs Go 1.16+).
    byteValue, err := os.ReadFile("example.xml")
    if err != nil {
        fmt.Printf("error: %v\n", err)
        return
    }

    var allservers Servers

    // Unmarshal reports syntax errors, but unknown elements and
    // attributes are ignored without an error.
    err = xml.Unmarshal(byteValue, &allservers)
    if err != nil {
        fmt.Printf("error: %v\n", err)
        return
    }

    fmt.Println(allservers)
}

Mistakes such as missing brackets, e.g.

<model name="Cisco" type="modelA"></model

or misspelled element names whose opening and closing tags no longer match, e.g.

<serverNammme>Moscow_VPN</serverName>

are reported as XML syntax errors.

There are other errors which could occur though. For example, misspelled words for the attributes:

<model namMMe="Cisco" typeE="modelA"></model>

Although this is valid XML format, I would like to catch this as an error, as (for my purposes) these are spelling mistakes in the input XML file which should be corrected.

The input is parsed without any error, and the misspelled attributes are silently dropped, giving the following:

{{ servers} 1 [{{ server} {{ model}  } Tokyo_VPN 127.0.0.1} {{ server} {{ model} Dell modelB} Moscow_VPN 127.0.0.2}]}

How could I catch these mistakes and return an error?

Answer by Shubham Srivastava

The documentation of encoding/xml

https://golang.org/pkg/encoding/xml/#Unmarshal

includes an example of writing custom Marshal/Unmarshal methods. You just need to implement the Unmarshaler interface, i.e. its UnmarshalXML method.

Your custom unmarshaler can then check the attributes and values as it unmarshals, and return an error when something is wrong.