I am trying to parse XML content together with all the attributes of each XML element, like this:
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// Node is meant to represent an arbitrary XML element.
type Node struct {
	XMLName      xml.Name
	Attributes   []xml.Attr `xml:",attr"`
	BodyElements string     `xml:",innerxml"`
	Nodes        []Node     `xml:",any"`
}

var xmldata = []byte("<div><div data-id=\"images/6C7161080\" data-imagesize=\"medium\" data-alignment=\"none\"></div></div>")

// walk visits every node and descends into its children while f returns true.
func walk(nodes []Node, f func(Node) bool) {
	for _, n := range nodes {
		if f(n) {
			walk(n.Nodes, f)
		}
	}
}

func main() {
	buf := bytes.NewBuffer(xmldata)
	dec := xml.NewDecoder(buf)
	var n Node
	err := dec.Decode(&n)
	if err != nil {
		panic(err)
	}
	walk([]Node{n}, func(n Node) bool {
		if n.XMLName.Local == "p" {
			fmt.Println(n.BodyElements)
		} else if n.XMLName.Local == "div" {
			fmt.Println(n.BodyElements)
			fmt.Println(len(n.Attributes))
		}
		return true
	})
}
But the value of len(n.Attributes) is always 0. How can I get all the attributes of a given element? NOTE: The attribute names are not constant, because the element can sometimes be a "div" tag, an "img" tag, or something else. So I can't bind a field to a fixed attribute name, as in
DataId string `xml:"data-id,attr"`
The fundamental problem is that unmarshalling XML into your struct Node doesn't work. Your BodyElements captures the whole content of your root node and nothing is unmarshaled to your Nodes. (Btw: Adding a simple fmt.Printf would have revealed this.)
Why do you try to write your own XML unmarshalling/parsing code? You will fail. Just use the Decoder and its Token method to parse your XML by hand, one token after another, populating your tree manually. And: if your XML actually is HTML, you might want to parse it with package html (golang.org/x/net/html).