"Connection reset by peer" in python gRPC


We are sending multiple requests to a gRPC server, but every once in a while we get a "Connection reset by peer" error with UNAVAILABLE status.

gRPC server: NestJS

Client: Python

Python version: 3.8

grpcio version: 1.50.0

Code:

# Connect to server from client:

def connect_to_user_manager_server() -> AuthorizationControllerStub:
    channel = grpc.insecure_channel(envs.USER_MANAGER_GRPC_URL, options=(
        ('grpc.keepalive_time_ms', 120000),
        ('grpc.keepalive_permit_without_calls', True),
    ))
    stub = AuthorizationControllerStub(channel)

    return stub

client = connect_to_user_manager_server()

user_response = client.CheckAuthorization(authorizationData(authorization=token, requiredRoles=roles))

There is 1 answer

Serge de Gosson de Varennes

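You can add retry logic to your client code, either with a library such as retrying or by implementing it manually. Note that grpc.insecure_channel does not connect eagerly, so the UNAVAILABLE error is raised by the RPC call rather than by channel creation. The retry therefore has to wrap the call itself.

For instance, with the retrying library:

import grpc
from retrying import retry

def is_unavailable(exc):
    # Retry only on UNAVAILABLE, the status reported with "Connection reset by peer"
    return isinstance(exc, grpc.RpcError) and exc.code() == grpc.StatusCode.UNAVAILABLE

@retry(stop_max_attempt_number=3, wait_fixed=1000, retry_on_exception=is_unavailable)
def check_authorization(client, token, roles):
    # The error surfaces here, on the RPC call, not when the channel is created
    return client.CheckAuthorization(authorizationData(authorization=token, requiredRoles=roles))

This makes up to 3 attempts at the CheckAuthorization call, with a 1 second delay between attempts, and retries only when the status is UNAVAILABLE.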

You can also implement it manually with a loop and a try/except block, like this:

import time
import grpc

max_attempts = 3
for attempt in range(1, max_attempts + 1):
    try:
        # The channel connects lazily, so the retry wraps the RPC call, where the error is raised
        user_response = client.CheckAuthorization(
            authorizationData(authorization=token, requiredRoles=roles))
        break
    except grpc.RpcError as e:
        # Re-raise on the last attempt or for any status other than UNAVAILABLE
        if attempt == max_attempts or e.code() != grpc.StatusCode.UNAVAILABLE:
            raise
        time.sleep(1)

This also makes up to 3 attempts, with a 1 second delay between them.

You can adjust the number of retries and the delay time to fit your needs.
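
As one way to make the retry count and delay adjustable, here is a minimal sketch of a generic helper that uses exponential backoff with a little jitter instead of a fixed delay. The names call_with_retry and should_retry, and the default values, are hypothetical and not part of grpc or retrying; treat it as a starting point rather than a definitive implementation.

```python
import random
import time


def call_with_retry(fn, should_retry, max_attempts=3, base_delay=1.0,
                    max_delay=10.0, sleep=time.sleep):
    """Call fn(); on a retryable exception, wait and try again.

    The delay doubles after each failed attempt (capped at max_delay), and
    up to 10% random jitter is added so many clients do not retry in lockstep.
    All names here are illustrative, not part of grpc.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors that should not be retried.
            if attempt == max_attempts or not should_retry(exc):
                raise
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With gRPC, fn would typically be a lambda wrapping the RPC, for example lambda: client.CheckAuthorization(...), and should_retry a check such as isinstance(exc, grpc.RpcError) and exc.code() == grpc.StatusCode.UNAVAILABLE, so that other failures (for example invalid arguments) are not retried.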