Since last month I have been seeing strange behaviour in my application.
I have a Go HTTP application running on Kubernetes that calls other internal Go HTTP services within the organization.
Service A calls Service B. In Service A, I get the following error while calling Service B:
context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I have checked the APM for Service B and its response time is under control, yet Service A keeps hitting the above error at random. Service B's average response time is 2 ms, and I have verified in APM that it never exceeds 50 ms. There are also no error logs.
In Service A I use HTTP connection pooling and keep a single instance of the HTTP client for the entire application life cycle. I also checked the HTTP connections in Service A using netstat inside the Kubernetes pod, and the connection count appears to be under control while the issue is occurring.
Here is the HTTP client configuration for Service A.
Generic function to create the client:
    func DefaultHTTPClient(httpConfig *config.HTTPConfig) *http.Client {
        transport := &http.Transport{
            Proxy: http.ProxyFromEnvironment,
            DialContext: (&net.Dialer{
                Timeout:   time.Duration(httpConfig.ConnectionTimeout) * time.Millisecond,
                KeepAlive: time.Duration(httpConfig.KeepAlive) * time.Millisecond,
                DualStack: true,
            }).DialContext,
            TLSHandshakeTimeout: time.Duration(httpConfig.TLSHandshakeTimeout) * time.Millisecond,
            MaxIdleConns:        httpConfig.DefaultMaxIdleConns,
            MaxIdleConnsPerHost: httpConfig.DefaultMaxIdleConns,
            IdleConnTimeout:     time.Duration(httpConfig.IdleConnectionTimeout) * time.Millisecond,
            Dial:                setupDialTimeOut(httpConfig),
        }
        return &http.Client{
            Timeout:   time.Duration(httpConfig.RequestTimeout) * time.Millisecond,
            Transport: transport,
        }
    }
HTTP policy (all durations are in milliseconds):
    defaultMaxIdleConns: 50
    defaultMaxIdleConnsPerHost: 50
    idleConnectionTimeout: 90000
    requestTimeout: 300
    connectionTimeout: 1000
    keepAlive: 60000
    tlsHandshakeTimeout: 1000
    apiKey: "1231243214214"
    retryConfig:
      enabled: true
      intervals: [100, 200, 400]
I have tried several things but still cannot pin down the cause. I need help debugging and resolving this issue.