I am having a problem with a Node.js server using Fastify.
At some point during the execution of a request, the server seems to be closing the connection and the client is getting a socket hang up error.
The logic in the server is:
- Fastify client calling a service.
- Service sending HTTP request using Axios to get certain information. The service implements a retry mechanism and after every retry, it waits for 15 seconds to make sure the information is available.
The code is as follows:
Fastify server:
fastify.post('/request', async (request, reply) => {
try {
const result = await service.performOperation(request.body);
return result;
} catch(error) {
console.error('Error during operation: %s', error.toString());
throw error;
}
})
fastify.addHook('onError', (request, reply, error, done) => {
console.error('onError hook: %o', error);
done();
})
Service:
async function performOperation(request) {
let attempt = 0;
let latestErrorMessage;
while(attempt++ < 5) {
try {
await waitBeforeAttempt();
return await getInfoFromServer(request);
} catch (error) {
latestErrorMessage = getErrorMessage(error);
if (attempt < 5) {
console.log(`Re-attempting after error: ${latestErrorMessage}`);
}
}
}
throw new Error(`Error after 5 attempts. Last error: ${latestErrorMessage}`);
}
function waitBeforeAttempt() {
return new Promise(resolve => setTimeout(resolve, 15000));
}
async function getInfoFromServer(request) {
const response = await axios.post('http://localhost:3000/service', request, {timeout: 120000});
return response.data.toString();
}
The problem is that the server seems to be closing the connection.
According to the logs, this is happening after waiting for 15 seconds and before the call via Axios, before finishing the first attempt.
You can see in the logs that after closing the connection, the logic continues and finishes with all the attempts with no problems whatsoever.
There is nothing in the logs as to why the connection is closed, not even from the Fastify onError hook declared.
Nothing from Axios either. I guess if there were any timeouts that would throw an exception and be logged.
Important note
Noted that connections are not dropped if I change the waitBeforeAttempt implementation to implement a busy wait instead of setTimeout ie:
function waitBeforeAttempt() {
const start = new Date();
let now;
while (true) {
now = new Date();
if (now - start >= 15000) {
break;
}
}
}
Is there anything I'm doing wrong that is causing the connections to be dropped? Perhaps the 15 seconds wait is too high? I have other setTimeout in the code via Puppetter (same implementation as mine) that don't seem to be causing the problem.
Just answering my own question. The problem turned out to be nothing related to the wait or timeouts.
This wasn't happening when the Node.js service was running locally, only happening intermittently when running on Kubernets + Nginx.
Nginx was just restarting without any apparent reason.
Nginx has been updated and the issue is not showing anymore.