How can I convert a text node to a string in Go with Gokogiri?


For my first programming attempt with Go I'm trying to automate the downloading of the lovely wallpapers from Psiu Puxa, saving the images with filenames based on titles in the posts in the HTML.

However, I haven't found how to get the value of a text node as a string.

Example HTML, simplified:

<div class="post">
    <a class="w-inline-block post-name-link" href="/posts/mars-30">
        <h4>#80 Martian Landscape</h4>
    </a>
</div>
<div class="post">
    <a class="w-inline-block post-name-link" href="#">
        <h4><strong>#79 MARTIAN terrain</strong></h4>
    </a>
</div>

My Go package:

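package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"

    "github.com/moovweb/gokogiri"
)

func main() {
    resp, err := http.Get("http://psiupuxa3.webflow.io/")
    if err != nil {
        log.Fatal(err)
    }
    page, err := ioutil.ReadAll(resp.Body)
    resp.Body.Close()
    if err != nil {
        log.Fatal(err)
    }

    doc, err := gokogiri.ParseHtml(page)
    if err != nil {
        log.Fatal(err)
    }
    // Release the parsed document when main returns.
    defer doc.Free()

    // Select every post container.
    res, err := doc.Search("//div[@class='post']")
    if err != nil {
        log.Fatal(err)
    }

    for i := range res {
        // Collect the text nodes below the post's title link.
        postTitleRes, err := res[i].Search("a[contains(@class,'post-name-link')]//text()")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%T: %v\n", postTitleRes, postTitleRes)
    }
}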

Result:

[]xml.Node: [#80 Martian Landscape]
[]xml.Node: [#79 MARTIAN terrain]
[]xml.Node: [#78 MARTIAN TERRAIN]

How can I obtain #79 MARTIAN terrain, etc., as strings for later use when saving files?

I've tried postTitle := postTitleRes.String() but the method apparently isn't available for xml.Node. I've spent some time looking through Gokogiri's source code and have found methods/instructions on coercing to strings, but I'm quite lost and would appreciate any pointers.

Answer by Peter Mellett (best answer):

What you have there is a slice of xml.Node values ([]xml.Node), and a slice has no methods of its own. You need to access the individual nodes it contains.

If you're sure the slice holds exactly one element, you can index it directly:

postTitleRes[0].Content()

Or, to handle every node in the slice:

for _, node := range postTitleRes {
    fmt.Printf("%T: %v\n", node, node.Content())
}

The Content method is available once you have a single xml.Node; see its definition in the Gokogiri source.
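
Since the goal is to use the titles as file names, here is a rough sketch of how the strings returned by Content() might be joined and cleaned up. It is not part of Gokogiri: contentNode, fakeNode, titleFrom and fileNameFor are all hypothetical names, and the file-name rule (keep ASCII letters and digits, hyphenate the rest) is just one possible choice. fakeNode stands in for xml.Node so the example runs without Gokogiri; the assumption is that a []xml.Node can be passed to titleFrom in the same way, because xml.Node provides Content() string. It needs Go 1.18 or later for generics.

```go
package main

import (
	"fmt"
	"strings"
)

// contentNode is a hypothetical constraint describing the only thing this
// sketch needs from a node: a Content() method returning its text.
type contentNode interface {
	Content() string
}

// fakeNode is a hypothetical stand-in for xml.Node so the sketch runs
// without Gokogiri installed.
type fakeNode string

func (n fakeNode) Content() string { return string(n) }

// titleFrom joins the text of all nodes and collapses runs of whitespace,
// so whitespace-only text nodes do not end up in the title.
func titleFrom[T contentNode](nodes []T) string {
	var parts []string
	for _, n := range nodes {
		parts = append(parts, strings.Fields(n.Content())...)
	}
	return strings.Join(parts, " ")
}

// fileNameFor turns a title into a conservative file name (assumed rule):
// ASCII letters and digits are kept in lower case, and every other run of
// characters becomes a single hyphen.
func fileNameFor(title, ext string) string {
	var b strings.Builder
	lastHyphen := true // suppresses a leading hyphen
	for _, r := range strings.ToLower(title) {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
			b.WriteRune(r)
			lastHyphen = false
		case !lastHyphen:
			b.WriteByte('-')
			lastHyphen = true
		}
	}
	return strings.TrimSuffix(b.String(), "-") + ext
}

func main() {
	nodes := []fakeNode{"#79 MARTIAN terrain"}
	title := titleFrom(nodes)
	fmt.Println(title)                      // #79 MARTIAN terrain
	fmt.Println(fileNameFor(title, ".jpg")) // 79-martian-terrain.jpg
}
```

In the loop from the question, the call would presumably look like titleFrom(postTitleRes), with the result passed to fileNameFor before the image is written to disk.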