standard C++ TCP socket, connect fails with EINTR when using std::async


I am having trouble using std::async to run tasks in parallel when the task involves a socket.

My program is a simple TCP socket server written in standard C++ for Linux. When a client connects, a dedicated port is opened and a separate thread is started, so each client is serviced in its own thread.

The client objects are contained in a map.

I have a function to broadcast a message to all clients. I originally wrote it like below:

//  ConnectedClient is an object representing a single client
//  ConnectedClient::SendMessageToClient opens a socket, connects, writes, reads response and then closes socket
//  broadcastMessage is the std::string to go out to all clients

//  iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{
    printf("%s\n", nextClient->second->SendMessageToClient(broadcastMessage).c_str());

}   

I have tested this and it works with 3 clients at a time. The message gets to all three clients (one at a time), and the response string is printed out three times in this loop. However, it is slow, because the message only goes out to one client at a time.

In order to make it more efficient, I was hoping to take advantage of std::async to call the SendMessageToClient function for every client asynchronously. I rewrote the code above like this:

vector<future<string>> futures;

//  iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{   
    printf("start send\n"); 
    futures.push_back(async(launch::async, &ConnectedClient::SendMessageToClient, nextClient->second, broadcastMessage, wait));
    printf("end send\n");

}   

vector<future<string>>::iterator nextFuture;
for( nextFuture = futures.begin(); nextFuture != futures.end(); ++nextFuture )
{
    printf("start wait\n");
    nextFuture->wait();
    printf("end wait\n");
    printf("%s\n", nextFuture->get().c_str());
}

The code above works as expected when there is only one client in the map. You see "start send" quickly followed by "end send", quickly followed by "start wait". Three seconds later (I have a three-second sleep on the client response side to test this) the trace from the socket read function shows the response coming in, and then you see "end wait".

The problem appears when there is more than one client in the map. The part of the SendMessageToClient function that opens and connects the socket fails at the point marked below:

//  connected client object has a pipe open back to the client for sending messages
int clientSocketFileDescriptor;
clientSocketFileDescriptor = socket(AF_INET, SOCK_STREAM, 0);

//  set the socket timeouts
//  this part using setsockopt is omitted for brevity

//  host name
struct hostent *server;
server = gethostbyname(mIpAddressOfClient.c_str());

if (server == 0)
{
    close(clientSocketFileDescriptor);
    return "";
}

//  address and port the client is listening on
struct sockaddr_in clientsListeningServerAddress;
memset(&clientsListeningServerAddress, 0, sizeof(struct sockaddr_in));

clientsListeningServerAddress.sin_family = AF_INET;
bcopy((char*)server->h_addr, (char*)&clientsListeningServerAddress.sin_addr.s_addr, server->h_length);
clientsListeningServerAddress.sin_port = htons(mPortNumberClientIsListeningOn);

//  The connect function fails !!!
if ( connect(clientSocketFileDescriptor, (struct sockaddr *)&clientsListeningServerAddress, sizeof(clientsListeningServerAddress)) < 0 )
{
    //  print out error code
    printf("Connected client thread: fail to connect %d \n", errno);
    close(clientSocketFileDescriptor);
    return response;
}

The output reads: "Connected client thread: fail to connect 4".

I looked this error code up. It is defined as:

#define EINTR            4      /* Interrupted system call */

I searched around on the internet, but all I found were some references to system calls being interrupted by signals.
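For reference, here is a minimal sketch of what "interrupted by a signal" looks like in practice. It is not taken from the server code: the names onAlarm and errnoFromInterruptedRead are made up for illustration, and it uses a blocking read() on an empty pipe in place of connect(). A timer delivers SIGALRM while the call is blocked. Because the handler is installed without SA_RESTART, the call returns -1 with errno set to EINTR.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

// Empty handler: it only has to exist so that SIGALRM interrupts the
// blocked call instead of terminating the process.
extern "C" void onAlarm(int) {}

// Hypothetical demo function: blocks in read() on an empty pipe until
// SIGALRM arrives, then returns the errno that read() left behind.
int errnoFromInterruptedRead()
{
    int pipeFds[2];
    int pipeResult = pipe(pipeFds);
    assert(pipeResult == 0);
    (void)pipeResult;

    // Install the handler WITHOUT SA_RESTART, so an interrupted call
    // fails with EINTR rather than being restarted by the kernel.
    struct sigaction action = {};
    action.sa_handler = onAlarm;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    struct sigaction previousAction;
    sigaction(SIGALRM, &action, &previousAction);

    // Fire SIGALRM after 50 ms and every 50 ms after that, so a signal
    // that arrives before read() starts blocking cannot hang the demo.
    struct itimerval timer = {};
    timer.it_value.tv_usec = 50 * 1000;
    timer.it_interval.tv_usec = 50 * 1000;
    setitimer(ITIMER_REAL, &timer, nullptr);

    char byte;
    ssize_t bytesRead = read(pipeFds[0], &byte, 1);  // nothing is ever written
    int savedErrno = (bytesRead < 0) ? errno : 0;

    // Disarm the timer and restore the previous signal disposition.
    struct itimerval disarm = {};
    setitimer(ITIMER_REAL, &disarm, nullptr);
    sigaction(SIGALRM, &previousAction, nullptr);
    close(pipeFds[0]);
    close(pipeFds[1]);
    return savedErrno;
}
```

The same mechanism applies to connect(): any signal handled by the calling thread while it is blocked in the call can make it return early with EINTR.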

Does anyone know why this works when I call my send message function one at a time, but it fails when the send message function is called using async? Does anyone have a different suggestion how I should send a message to multiple clients?

1 answer
user3487549 On

First, I would deal with the EINTR issue. connect() has been interrupted (that is what EINTR means) and is not retried because you are using an asynchronous descriptor. What I usually do in this situation is retry: I wrap the call (connect in this case) in a while loop. If connect succeeds, I break out of the loop. If it fails, I check the value of errno. If it is EINTR, I try again. Note that other errno values also deserve a retry (EWOULDBLOCK is one of them).
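A minimal sketch of that retry idea follows, with one deliberate change that should be stated plainly. POSIX specifies that when connect() is interrupted, the connection attempt continues in the background, and calling connect() again on the same socket can then fail with EALREADY or EISCONN. So instead of looping on connect() itself, this version waits for the socket to become writable with poll() and then reads the outcome from SO_ERROR. The helper name connectToleratingEintr and its timeoutMs parameter are assumptions made for illustration, not part of the question's code.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper, not part of the question's code.
// Returns 0 once the socket is connected, or -1 with errno set.
int connectToleratingEintr(int fd, const struct sockaddr* address,
                           socklen_t addressLength, int timeoutMs)
{
    assert(address != nullptr);

    if (connect(fd, address, addressLength) == 0)
        return 0;                       // connected immediately

    // Any error other than "interrupted" or "still in progress" is final.
    if (errno != EINTR && errno != EINPROGRESS)
        return -1;

    // The attempt is still running in the kernel: wait until the socket
    // becomes writable, which signals that the attempt has finished.
    struct pollfd waitForConnect;
    waitForConnect.fd = fd;
    waitForConnect.events = POLLOUT;
    waitForConnect.revents = 0;

    int ready;
    do {
        // If poll() itself is interrupted, simply wait again. Note that
        // each retry restarts the full timeout.
        ready = poll(&waitForConnect, 1, timeoutMs);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;                      // poll() failed for another reason
    if (ready == 0) {
        errno = ETIMEDOUT;              // nothing happened within timeoutMs
        return -1;
    }

    // Writable does not mean success: SO_ERROR holds the real result.
    int socketError = 0;
    socklen_t socketErrorLength = sizeof(socketError);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &socketError, &socketErrorLength) < 0)
        return -1;
    if (socketError != 0) {
        errno = socketError;
        return -1;
    }
    return 0;
}
```

In the question's code, the direct connect call could be replaced with connectToleratingEintr(clientSocketFileDescriptor, (struct sockaddr *)&clientsListeningServerAddress, sizeof(clientsListeningServerAddress), 3000), where 3000 ms is an arbitrary example timeout.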