I get sporadic transient errors when connecting from my Docker container to the database on the host machine.
For this reason, I configured my DB context to retry on failure. This solved the problem only partially: the client application still fails because the retried request can take up to 2 minutes (30 seconds on average). This is too long and makes for a bad user experience. I was trying to understand why it takes so long, and the only thing I can think of is that the timeout before the connection is declared failed is too long. I thought of reducing the command timeout (30 seconds by default, if I am not wrong) to maybe 2-3 seconds, since most of my queries take less than 30 ms, but I don't know whether this would create other problems.
When checking my logs, I discovered that the problem does not lie in the retry logic, because it retries straight after the failure. What takes so long is the failure response itself.
This is my current configuration:
builder.Services.AddDbContext<AuthDbContext>(options =>
{
    options.UseNpgsql(EnvironmentVariables.GetEnvironmentVariable(EnvironmentVariables.DB_AUTH), conf =>
    {
        conf.EnableRetryOnFailure(5, TimeSpan.FromSeconds(5), new List<string> { "4060" });
        conf.CommandTimeout(2); // This is the command timeout that I want to add.
    });
    // Log every retry attempt made by the execution strategy.
    options.LogTo(
        filter: (eventId, level) => eventId.Id == CoreEventId.ExecutionStrategyRetrying,
        logger: (eventData) =>
        {
            // The filter above only lets retry events through, so the cast is safe.
            var retryEventData = (ExecutionStrategyEventData)eventData;
            var exceptions = retryEventData.ExceptionsEncountered;
            Log.Information("TRANSIENT ERROR Retry #{attemptNumber} with delay {delayMs} due to error: {errorMessage}", exceptions.Count, retryEventData.Delay.TotalMilliseconds, exceptions.Last().Message);
        });
});
There are a few points that you may consider:
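One point worth separating out is that, in Npgsql, the command timeout and the connection timeout are two different settings. `CommandTimeout` limits how long an already-connected command may run, while the `Timeout` connection string parameter limits how long opening a connection may take (15 seconds by default, as far as I know). If the slow part is the failure to connect, shortening the command timeout alone will probably not help. Below is a minimal sketch of setting both through `NpgsqlConnectionStringBuilder`. The connection string and the values 3 and 5 are placeholders I made up, and `builder` and `AuthDbContext` are the ones from your question, so treat this as a fragment to adapt rather than a drop-in fix.

```csharp
using Npgsql;

// Hypothetical connection string; in the question it comes from an environment variable.
var raw = "Host=host.docker.internal;Database=auth;Username=app;Password=secret";

var csb = new NpgsqlConnectionStringBuilder(raw)
{
    // Seconds to wait while establishing a connection (assumed Npgsql default: 15).
    Timeout = 3,
    // Seconds to wait for a command to complete (assumed Npgsql default: 30).
    CommandTimeout = 5
};

builder.Services.AddDbContext<AuthDbContext>(options =>
{
    options.UseNpgsql(csb.ConnectionString, conf =>
    {
        // Same retry settings as in the question, without extra error codes.
        conf.EnableRetryOnFailure(5, TimeSpan.FromSeconds(5), null);
    });
});
```

If the delay really is the connection attempt timing out, then with the default of 15 seconds, six attempts (one initial plus five retries) could add up to roughly the 2 minutes you observe, although that is a guess that your logs would have to confirm. Keep the command timeout comfortably above your slowest legitimate query, otherwise the retry strategy may start retrying queries that were merely slow.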