I am trying to use a single http.Client to make multiple requests to the same host through different proxy servers. It is important that every new request goes through the next proxy (round-robin scheme). Here is my code sample:
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

var client *http.Client

func main() {
	roundRobin := NewRoundRobinProxy(
		"http://myproxy1:8888",
		"http://myproxy2:8888",
		"http://myproxy3:8888")
	client = &http.Client{
		Transport: &http.Transport{
			MaxConnsPerHost:   10,
			DisableKeepAlives: false, // if it's true - it works fine, app really calls Proxy func on EACH req
			Proxy:             roundRobin.Proxy,
		},
	}
	sendReq("https://www.binance.com")
	sendReq("https://www.binance.com")
	sendReq("https://www.binance.com")
	sendReq("https://www.binance.com")
}

func sendReq(urlStr string) {
	req, _ := http.NewRequest("GET", urlStr, nil)
	resp, _ := client.Do(req)
	resp.Body.Close()
	fmt.Println("got resp from ", urlStr)
}

type RoundRobinProxy struct {
	urls   []*url.URL
	cursor int
}

func NewRoundRobinProxy(urls ...string) *RoundRobinProxy {
	p := &RoundRobinProxy{cursor: 0}
	for _, v := range urls {
		u, _ := url.Parse(v)
		p.urls = append(p.urls, u)
	}
	return p
}

func (p *RoundRobinProxy) Proxy(*http.Request) (*url.URL, error) {
	fmt.Println("i'm in proxy, cursor=", p.cursor)
	u := p.urls[p.cursor]
	if p.cursor < len(p.urls)-1 {
		p.cursor++
	} else {
		p.cursor = 0
	}
	return u, nil
}
So if I run this code, I expect to see the "i'm in proxy..." message once per request (4 times). But in fact I see this output:
i'm in proxy, cursor= 0
got resp from https://www.binance.com
got resp from https://www.binance.com
got resp from https://www.binance.com
got resp from https://www.binance.com
So it uses the first proxy in the pool and then somehow caches it.
Yes, one solution is to set DisableKeepAlives=true. In that case it works:
i'm in proxy, cursor= 0
i'm in proxy, cursor= 1
got resp from https://www.binance.com
i'm in proxy, cursor= 2
i'm in proxy, cursor= 0
got resp from https://www.binance.com
i'm in proxy, cursor= 1
i'm in proxy, cursor= 2
got resp from https://www.binance.com
i'm in proxy, cursor= 0
i'm in proxy, cursor= 1
got resp from https://www.binance.com
There are more "i'm in proxy" messages than requests, but that doesn't matter (maybe some redirects are followed under the hood).
However, it's important to reuse TCP connections to avoid handshake overhead on each request.
Are there any ideas besides using a pool of clients (each with one proxy) instead of a pool of proxies? I would like to find a more straightforward and elegant solution. Thanks.
This isn't a Go issue, it's just the way keep-alive works. When you are using HTTP keep-alive (persistent connections, which is what DisableKeepAlives controls; it is unrelated to TCP keepalive probes), the client keeps the connection open. As you recognize, this lets you avoid some of the handshake overhead. But what you are connected to is the proxy: in this case, yes, the first one in the list.
What's happening:
Since the connection from the client terminates at the specific proxy, that is what is being kept alive.
I would follow the suggestion of a 1:1 client-to-proxy mapping; then you can load-balance across the proxies while still using keep-alive.