select() in a proxy server


I am building a proxy server in C and I'm trying to understand the select() function. My code currently accepts a connection from a client, extracts the web address from the request, and opens a second connection to the actual web server. The proxy then receives the page and passes it back to the client.

I understand that select() will allow this to handle multiple client requests, but I don't understand how it helps (or rather how to implement) the second connection to the web server. From what I understand I will no longer need a while loop to keep receiving data from the web server and passing it back to the client.

Do I need a second file descriptor set for the web server connections? If I am handling two or more client requests, how do I make sure that each one is linked to the proper connection to the web server? Anyway, I would appreciate any help. I've gone through Beej's networking tutorial and a few others online, but after a couple of days I still haven't wrapped my head around this.

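#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>     // bzero
#include <unistd.h>      // read, write, close
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>   // inet_ntop
#include <netdb.h>       // getaddrinfo, gethostbyname
#include <string>
#include <iostream>
using namespace std;

void error(const char *msg)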
{
    perror(msg);
    exit(1);
}

void *get_in_addr(struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET)
        return &(((struct sockaddr_in*)sa)->sin_addr);

    return &(((struct sockaddr_in6*)sa)->sin6_addr);
}

string getHostString(const char *buf);

int main(int argc, char *argv[])
{   
    int sockfd, newsockfd, portno;
    int fdmax; //maximum file descriptor number
    socklen_t clilen;
    char buffer[256];
    struct sockaddr_storage remoteaddr; //client address
    char clientIP[INET6_ADDRSTRLEN];

    //fd_set readfds, writefds, exceptfds;
    fd_set masterfds, readfds;
    struct timeval timeout;
    int rc;

    /*Set time limit. */
    timeout.tv_sec = 3;
    timeout.tv_usec = 0;

    /*Create a descriptor set containing the sockets */
    FD_ZERO(&readfds);
    FD_ZERO(&masterfds);
    /*FD_ZERO(&writefds);
    FD_ZERO(&exceptfds);
    FD_SET(newsockfd, &readfds);

    rc = select(sizeof(readfds)*8, &readfds, NULL, NULL, &timeout);
    if (rc==-1){
        perror("select failed");
        return -1;
    }*/

    struct sockaddr_in serv_addr, cli_addr;
    int n;
    if (argc < 2) {
        fprintf(stderr,"ERROR, no port provided\n");
        exit(1);
    }

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) 
       error("ERROR opening socket");

    bzero((char *) &serv_addr, sizeof(serv_addr));
    portno = atoi(argv[1]);
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;
    serv_addr.sin_port = htons(portno);

    if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) 
             error("ERROR on binding");

    if((listen(sockfd,5)) == -1 )
        error("Server-listen() error!!!");
    printf("Server-listen() is OK...\n");


    FD_SET(sockfd, &masterfds);

    //keep track of the biggest file descriptor so far
    fdmax = sockfd; //so far it's this one

    for(;;){

        readfds = masterfds;

        if(select(fdmax+1, &readfds, NULL, NULL, NULL) == -1){
            error("Server-select() error !");
        }
        printf("Server-select() is OK...\n");

        for(int i = 0; i <= fdmax; i++){
            printf("i: %d sockfd %d\n", i, sockfd);
            if(FD_ISSET(i, &readfds)){
                if(i == sockfd){ //sockfd is the listener
                    //following handles new connections
                    clilen = sizeof(cli_addr);
                    if((newsockfd = accept(sockfd,(struct sockaddr *) &cli_addr, &clilen)) == -1)
                        error("Server-accept() error!!!");
                    else{
                        printf("Server-accept() is OK...\n");
                        FD_SET(newsockfd, &masterfds); //add to master set
                        if(newsockfd > fdmax) 
                            fdmax = newsockfd;
                        printf("selectserver: New connection from %s on  "
                                "socket %d\n",
                                inet_ntop(cli_addr.sin_family, //remoteaddr is never filled in; accept() wrote to cli_addr
                                    get_in_addr((struct sockaddr*)&cli_addr),
                                    clientIP, INET6_ADDRSTRLEN),
                                newsockfd);
                    }
                }
                else{
                    //handle data from a client
                    //Here is where the FIRST read occurs
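                    bzero(buffer,256);
                    //n = read(newsockfd,buffer,255);
                    n = read(i,buffer,255);
                    if (n < 0) error("ERROR reading from socket");
                    if (n == 0) { //client closed the connection: stop watching it
                        close(i);
                        FD_CLR(i, &masterfds);
                        continue;
                    }
                    printf("Here is the message: \n\n%s\n\n",buffer);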

                    //string hoststring(buffer);
                    string hoststring(getHostString(buffer));

                    int html_port = 80;
                    int html_socket;

                    printf("Prior to struct addrinfo\n");

                    struct addrinfo hints, *res;

                    memset(&hints, 0, sizeof hints);
                    hints.ai_family = AF_UNSPEC;
                    hints.ai_socktype = SOCK_STREAM;
                    hints.ai_flags = AI_PASSIVE;

                    printf("Prior to getaddrinfo()\n");

                    char *address = new char[hoststring.size() +1];
                    address[hoststring.size()] = 0;
                    memcpy(address, hoststring.c_str(), hoststring.size());

                    //getaddrinfo(token, (char *)html_port, &hints, &res);
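                    if (getaddrinfo(address, "80", &hints, &res) != 0){
                        //res is not valid on failure, so give up on this request
                        printf("getaddrinfo() failed for %s\n", address);
                        delete[] address;
                        close(i);
                        FD_CLR(i, &masterfds);
                        continue;
                    }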

                    printf("We are past getaddrinfo()\n");

                    //if (html_socket = socket(PF_INET, SOCK_STREAM, 0) < 0){
                    if ((html_socket = socket(res->ai_family, res->ai_socktype, res->ai_protocol)) < 0){
                        printf("socket connection error\n");
                    }

                    printf("We are past socket()\n");

                    //char* address;
                    //address = new char[256];
                    //strncpy(address, "www.cs.ucr.edu", sizeof("www.cs.ucr.edu"));


                    printf("address: %s\n", address);    
                    struct hostent * host;
                    if ((host = gethostbyname(address)) == NULL){
                        printf("Problem with gethostbyname()\n");
                    }

                    printf("We are past gethostbyname() and about to connect.\n");

                    if ( connect(html_socket, res->ai_addr, res->ai_addrlen) < 0)
                        printf("Unsuccessful completion of connect()");

                    int bytes_sent, bytes_recvd;
                    char recv_buff[1024];
                    bytes_sent = send(html_socket, buffer, n, 0); //forward only the bytes actually read
                    cout << "bytes_sent: " << bytes_sent << endl;

                    //do{
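                        bytes_recvd = recv(html_socket, recv_buff, sizeof(recv_buff) - 1, 0);
                        cout << "bytes_rcvd: " << bytes_recvd << endl;
                        if (bytes_recvd < 0) error("ERROR reading from web server");
                        recv_buff[bytes_recvd] = '\0'; //terminate before printing as a string
                        cout << recv_buff << endl;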

                        //FD_ZERO(&readfds);
                        //FD_SET(newsockfd, &writefds);

                        //n = write(newsockfd,"I got your message\n",20);
                        //n = write(newsockfd,recv_buff,bytes_recvd);
                        n = write(i,recv_buff,bytes_recvd);
                        if (n < 0) error("ERROR writing to socket");
                        bzero(recv_buff, 1024);
                    //}while(bytes_recvd !=0);

                    //release what this request allocated
                    close(html_socket);
                    freeaddrinfo(res);
                    delete[] address;
                }
            }
        }

    }

    close(newsockfd);
    close(sockfd);
    return 0; 
}


string getHostString(const char *buf){
    string hoststring(buf); 
    hoststring = hoststring.substr(11);

    cout << "hoststring: " << hoststring << endl;
    int slashpos;
    slashpos = hoststring.find("/");
    int suffixendpos = hoststring.find("H") - slashpos;
    string suffix = hoststring.substr(slashpos, suffixendpos);
    hoststring = hoststring.substr(0, slashpos);

    cout << "hoststring: " << hoststring << endl;
    cout << "suffix: " << suffix << endl;

    return hoststring;

}

There are 2 answers

Basile Starynkevitch On

Did you read the select(2) and select_tut(2) man pages?

Did you read relevant chapters in e.g. Advanced Linux Programming or Advanced Unix Programming?

Actually, because of the c10k problem and the limit that select places on file descriptor numbers (every descriptor must be below FD_SETSIZE, typically 1024), the select syscall is becoming obsolete, and you should use the poll(2) syscall instead.
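To make the poll(2) suggestion concrete, here is a minimal sketch of how a descriptor list could be watched with poll instead of an fd_set. The helper names watch_fd and wait_readable are hypothetical (they are not from the question's code), and error handling is reduced to "report nothing ready".

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <vector>

// Hypothetical helper: register fd so that poll() reports when it is readable.
void watch_fd(std::vector<pollfd>& fds, int fd) {
    assert(fd >= 0);
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    fds.push_back(p);
}

// Hypothetical helper: wait up to timeout_ms milliseconds and return the
// descriptors that are readable or were hung up by the peer.
std::vector<int> wait_readable(std::vector<pollfd>& fds, int timeout_ms) {
    std::vector<int> ready;
    int rc = poll(fds.data(), fds.size(), timeout_ms);
    if (rc <= 0) return ready;  // timeout or error: nothing to report
    for (const pollfd& p : fds) {
        if (p.revents & (POLLIN | POLLHUP)) ready.push_back(p.fd);
    }
    return ready;
}
```

Unlike select, poll takes the interest list and the results in separate fields (events and revents), so the list does not have to be rebuilt before every call, and descriptor values are not capped by FD_SETSIZE.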

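You must give select a fresh readfds on every pass through your for loop, just before the select call, because the syscall modifies the set it is given. You can rebuild it with explicit FD_ZERO and FD_SET calls; since POSIX defines fd_set as a structure, the readfds = masterfds copy at the top of your loop is also valid, as long as it stays inside the loop.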
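One way to rebuild the read set on each pass, and at the same time answer the "which web server socket belongs to which client" part of the question, is to keep a small table that maps every descriptor to its peer. The sketch below assumes a single read set is enough for both kinds of socket; PairTable, link_pair, unlink_pair and build_read_set are hypothetical names, not part of any library.

```cpp
#include <sys/select.h>
#include <cassert>
#include <vector>

// Hypothetical bookkeeping: peer[fd] is the descriptor on the other side of
// the proxy (client <-> web server), or -1 if fd is not part of a pair.
struct PairTable {
    std::vector<int> peer = std::vector<int>(FD_SETSIZE, -1);
    int listener = -1;  // listening socket, or -1 if none yet
};

// Record that client_fd and server_fd carry the same proxied request.
void link_pair(PairTable& t, int client_fd, int server_fd) {
    assert(client_fd >= 0 && client_fd < FD_SETSIZE);
    assert(server_fd >= 0 && server_fd < FD_SETSIZE);
    t.peer[client_fd] = server_fd;
    t.peer[server_fd] = client_fd;
}

// Forget both ends of the pair that fd belongs to (call after closing them).
void unlink_pair(PairTable& t, int fd) {
    int other = t.peer[fd];
    t.peer[fd] = -1;
    if (other >= 0) t.peer[other] = -1;
}

// Rebuild the read set from scratch; returns the highest descriptor in the
// set (for select's first argument, add 1), or -1 if the set is empty.
int build_read_set(const PairTable& t, fd_set* out) {
    FD_ZERO(out);
    int maxfd = -1;
    if (t.listener >= 0) {
        FD_SET(t.listener, out);
        maxfd = t.listener;
    }
    for (int fd = 0; fd < FD_SETSIZE; ++fd) {
        if (t.peer[fd] >= 0) {
            FD_SET(fd, out);
            if (fd > maxfd) maxfd = fd;
        }
    }
    return maxfd;
}
```

In the main loop you would call build_read_set, pass the result plus one to select, and for every ready descriptor other than the listener read from it and write the bytes to t.peer[fd]. When a read returns 0, close both ends and call unlink_pair. Data then flows in both directions without an inner while loop.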

Don't forget to compile with gcc -Wall -g and to use the debugger. You could also study the source code of existing free software HTTP client libraries or proxies.

Chimera On

I know you are trying to write a single-process server that handles multiple connections in a loop, using select() to determine when a socket descriptor has data to be read. However, sometimes another approach is actually easier and more scalable.

Have you considered using a multi-process server in which after each socket connection is made, you fork() a new process to handle the request? This eliminates the need to worry about how select() works and mapping requests to the correct socket descriptor.
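A rough sketch of the fork-per-connection idea follows. It assumes the caller has already accepted conn_fd; serve_in_child and reap_children are hypothetical helper names, and handler stands for whatever function proxies a single request.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: run handler(conn_fd) in a child process.
// Pass listen_fd = -1 if there is no listening socket to close in the child.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t serve_in_child(int listen_fd, int conn_fd, void (*handler)(int)) {
    assert(conn_fd >= 0 && handler != nullptr);
    pid_t pid = fork();
    if (pid == 0) {
        // Child: it never accepts, so it does not need the listener.
        if (listen_fd >= 0) close(listen_fd);
        handler(conn_fd);
        close(conn_fd);
        _exit(0);
    }
    // Parent: drop its copy of the connection (even if fork failed).
    close(conn_fd);
    return pid;
}

// Hypothetical helper: collect finished children without blocking, so they
// do not linger as zombies. Call it periodically from the accept loop.
void reap_children() {
    while (waitpid(-1, nullptr, WNOHANG) > 0) {
    }
}
```

In the accept loop this would look like serve_in_child(sockfd, newsockfd, handle_request), where handle_request is your existing read/connect/forward code moved into a function that takes the client descriptor. Each child can then use plain blocking reads and a simple while loop.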

See Handle Multiple Connections here as a reasonable example.

In addition to using fork() you could use threading with the POSIX pthreads library which may gain you a bit of efficiency. Here is a good multi-threaded tcp/ip server sample that uses pthreads.
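The threaded variant can be sketched in much the same way. This assumes one detached thread per accepted connection; ConnTask, conn_thread_main and serve_in_thread are hypothetical names, and the program may need to be built with -pthread depending on the toolchain.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cassert>

// Hypothetical per-connection work item handed to the new thread.
struct ConnTask {
    int fd;
    void (*handler)(int);
};

// Thread entry point: run the handler, then release the descriptor and task.
static void* conn_thread_main(void* arg) {
    ConnTask* task = static_cast<ConnTask*>(arg);
    task->handler(task->fd);
    close(task->fd);
    delete task;
    return nullptr;
}

// Hypothetical helper: serve conn_fd on its own detached thread.
// Returns false (and closes conn_fd) if the thread could not be created.
bool serve_in_thread(int conn_fd, void (*handler)(int)) {
    assert(conn_fd >= 0 && handler != nullptr);
    ConnTask* task = new ConnTask{conn_fd, handler};
    pthread_t tid;
    if (pthread_create(&tid, nullptr, conn_thread_main, task) != 0) {
        delete task;
        close(conn_fd);
        return false;
    }
    pthread_detach(tid);  // nobody joins it; resources are freed on exit
    return true;
}
```

Threads share the process's memory, so anything the handlers touch in common (counters, caches, logs) needs its own locking. The descriptor itself is owned by exactly one thread here, which keeps the client-to-server pairing trivial.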