Long request returns with empty response after 120 seconds, caused by Network Load Balancer


I have a GKE cluster with 2 nodes and a service of type LoadBalancer. When I call the service internally, a long request completes without timing out, even after 120 seconds. But if I call the external IP of the Network Load Balancer that forwards to the internal service, I get an "Empty reply from server" response after 120 seconds.

External call example:

curl -v "http://<public-ip>/longResponse"
*   Trying <public-ip>...
* TCP_NODELAY set
* Connected to <public-ip> (<public-ip>) port 80 (#0)
> GET /longResponse HTTP/1.1
> Host: <public-ip>
> User-Agent: curl/7.54.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host <public-ip> left intact
curl: (52) Empty reply from server

Internal call example:

/ # wget -O - -S <service-name>/longResponse
Connecting to location-service (10.3.255.181:80)
  HTTP/1.1 200 OK
  Access-Control-Allow-Origin: *
  Content-Type: application/json
  Content-Length: 15
  Date: Thu, 28 Feb 2019 10:31:14 GMT
  Connection: close

-                    100% |*********************************************************************************************************************************************************************************************************************|    15  0:00:00 ETA
/ # 

I've looked for documentation on a request or socket timeout at the load balancer level, but I didn't find anything. Any ideas?

Thanks.

2

There are 2 answers

4
Tim Hockin

Are you sure that's not a client-side timeout? Network LB doesn't process packets other than to route them, so it should never send any response back.

Try curl's -m (--max-time) flag to set an explicit client-side limit?

Also maybe capture a tcpdump on your client-side so you can see what the network is actually doing.
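A minimal sketch of both suggestions, in case it helps. The helper name `probe_long_request`, the 300-second default, and the tcpdump interface and filter are assumptions for illustration; only the `/longResponse` path and port 80 come from the question.

```shell
# Hypothetical helper: request /longResponse with an explicit client-side limit.
# $1 = host or IP (placeholder), $2 = limit in seconds (assumed default: 300).
probe_long_request() {
  host="$1"
  limit="${2:-300}"
  rc=0
  # -m sets the maximum total time; -w prints the status code and elapsed time.
  curl -sS -o /dev/null -m "$limit" \
    -w 'status=%{http_code} total=%{time_total}s\n' \
    "http://${host}/longResponse" || rc=$?
  # curl exit codes: 28 = the -m limit was hit, 52 = empty reply from server.
  echo "curl exit code: ${rc}"
  return "$rc"
}

# Usage (replace <public-ip> with the load balancer address):
#   probe_long_request <public-ip> 300
#
# In a second terminal, capture the traffic to see which side closes the
# connection (the interface name and filter here are assumptions):
#   sudo tcpdump -i any -nn 'host <public-ip> and tcp port 80'
```

If the exit code is still 52 at roughly 120 seconds with a much larger `-m` value, the limit is probably not curl's own, and the capture should show which side sent the FIN or RST.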

1
spender

Get the load-balancer's backend name with:

gcloud compute backend-services list

then

BACKEND=name-of-your-backend
gcloud compute backend-services update "$BACKEND" --timeout=600s

Alternatively, in the console, go to Network services ⇒ Load balancing ⇒ Backends, then click your HTTP backend(s) and edit the settings, including the timeout.

On a wider note, this may be one of several hops between server and client, any of which might time out. You're better off either living with the timeout (and making your long polls complete before it fires) or drip-feeding data down the line. For instance, you can prepend whitespace to JSON: send a space character every 30 seconds until you have a proper response body. This will keep the load balancer from timing out.
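The drip-feed idea can be sketched in shell as below. This is only an illustration of the pattern, not how the service in the question is implemented: `drip_until_done` is a hypothetical helper, and in a real HTTP handler the status line and headers would already have been sent before the first space, so the status code can no longer reflect a late failure.

```shell
# Hypothetical helper: run a slow command in the background and emit one space
# every $1 seconds until it finishes, then emit its real output.
# Leading whitespace is legal in JSON, so clients can parse the body unchanged.
drip_until_done() {
  drip_interval="$1"; shift
  drip_tmp="$(mktemp)"
  "$@" > "$drip_tmp" &          # the slow work, with its output buffered
  drip_pid=$!
  while kill -0 "$drip_pid" 2>/dev/null; do
    printf ' '                  # keep-alive byte so the connection is not idle
    sleep "$drip_interval"
  done
  drip_status=0
  wait "$drip_pid" || drip_status=$?
  cat "$drip_tmp"               # the actual response body
  rm -f "$drip_tmp"
  return "$drip_status"
}

# Usage (slow_job is a placeholder for whatever produces the JSON body):
#   drip_until_done 30 slow_job
```

The 30-second interval follows the suggestion above; any value comfortably below the shortest idle timeout on the path should serve the same purpose.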