How to parse the result from httpclient in enlive

Question

162 views Asked by Daniel Wu At 24 January 2016 at 12:50

The enlive tutorial at https://github.com/swannodette/enlive-tutorial/blob/master/src/tutorial/scrape1.clj shows how to parse a page fetched from a URL. I need to go through a SOCKS5 proxy, though, and I can't figure out how to use a proxy inside enlive. I do know how to use a proxy with httpclient (clj-http), but how do I parse the result it returns? I have the following code, but the last line shows an empty result:

    (:require [clojure.set :as set]
              [clj-http.client :as client]
              [clj-http.conn-mgr :as conn-mgr]
              [clj-time.core :as time]
              [jsoup.soup :as soup]
              [clj-time.coerce :as tc]
              [net.cgrand.enlive-html :as html])

    (def a (client/get "https://news.ycombinator.com/"
                       {:connection-manager (conn-mgr/make-socks-proxied-conn-manager "127.0.0.1" 9150)
                        :socket-timeout 10000
                        :conn-timeout 10000
                        :client-params {"http.useragent" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.20 (KHTML, like Gecko) Chrome/11.0.672.2 Safari/534.20"}}))

    (def b (html/html-resource a))
    (html/select b [:td.title :a])

**rabidpraxis** · Answer 1 · 2016-01-24T16:00:18+00:00

When using enlive, the html-resource function loads a resource (for example a URL) and converts it into a data structure that enlive can run selectors against. It seems that when you pass it an already completed clj-http response, it just hands the response back instead of throwing an error, which would explain the empty result.

Either way, the function you want is html-snippet, and you will want to pass it the body of the response, like so:

;; It does not matter whether you use a connection manager or not, as long as
;; the call returns a response with a body
(def req (client/get "https://news.ycombinator.com/"))

;; The HTML is in the :body of the response map
(def body (:body req))
(def nodes (html/html-snippet body))
(html/select nodes [:td.title :a])

;; Or you can put it all together like this

(-> req
    :body
    html/html-snippet
    (html/select [:td.title :a]))
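
To try the parsing step without a network connection or proxy, the same pipeline can be run on a literal HTML string. The following is only a minimal sketch: the `example.offline-parse` namespace, the sample markup, and the `example.com` links are made up for illustration and stand in for the `:body` of a real response. It also shows one way to pull the link text and `href` out of the selected nodes, and that `html-resource` can be given a `java.io.Reader` wrapped around the body string as an alternative to `html-snippet`.

```clojure
(ns example.offline-parse
  (:require [net.cgrand.enlive-html :as html]))

;; Hypothetical stand-in for (:body response); no HTTP request is made here.
(def body
  (str "<table>"
       "<tr><td class=\"title\"><a href=\"https://example.com/a\">First</a></td></tr>"
       "<tr><td class=\"title\"><a href=\"https://example.com/b\">Second</a></td></tr>"
       "</table>"))

;; Parse the string into enlive nodes, then select the links.
(def links
  (-> body
      html/html-snippet
      (html/select [:td.title :a])))

;; Each selected node is a map with :tag, :attrs and :content.
(def titles (map html/text links))
(def hrefs (map #(get-in % [:attrs :href]) links))

;; Alternative: wrap the string in a Reader and let html-resource parse it.
(def links-via-resource
  (-> (java.io.StringReader. body)
      html/html-resource
      (html/select [:td.title :a])))
```

With a real response, replacing `body` with `(:body req)` should be the only change needed, assuming the body is returned as a string (clj-http's default).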
