HTTP request parsing for pipelined requests

So I have been trying to make a web server that responds to GET requests for static web pages. So far, I have been using non-blocking sockets, and I use the following to break out of the read() loop:

while(1){
    ssize_t bytesRcv = read(connfd, buff, buff_len);
    if( bytesRcv == 0 ){
        /*  
            Client closed his side of connection, put it in a disconnected state.
            Note : We can still send messages to the client, thus we will parse this
            request and send a message and then close the client.
        */
        client.state = DISCONNECTED | PENDING_REPLY;
        break;
    }
    if( bytesRcv == -1 ){
        if( errno == EAGAIN || errno == EWOULDBLOCK ){
            /*  
                We have read all that we could, now the client is waiting for our reply.
                We will put the client into pending_reply state.
             */
            client.state = PENDING_REPLY;
        }
        /* Nothing was read (EAGAIN or a real error), so leave the loop. */
        break;
    }

    /* Only reached when bytesRcv > 0: terminate and advance past the new data. */
    buff[bytesRcv] = '\0';
    buff += bytesRcv;
    buff_len -= bytesRcv;

    /*  
        If the message ends with two CRLF, we have gotten 1 full request. Break out of the loop
        put the client in pending reply state 
        Note, it is an assumption that the request is not pipelined, therefore, we will have \r\n\r\n
        at the end of the message.
        How do I make it so that pipelined requests can be handled?
    */
    if( strcmp(buff - 4, "\r\n\r\n") == 0 ){
        client.state = PENDING_REPLY;
        break;
    }
}

Specifically, this part is bothersome and error-prone. How can I handle pipelined requests without sacrificing much performance? Also, should I first read ALL of the message, or should I call a parse_http_header() function right after every read() call?

1 answer

Frederik Deweerdt

For the buffer associated with a given connfd, keep track of how much of the buffer you have parsed so far, and look for the \r\n\r\n pattern from that point on. Then you can record where the current request ends and restart parsing from that point for the next, pipelined, request.
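As a rough illustration of that bookkeeping, here is a minimal sketch. The struct conn_buf type, its field names, and the helpers next_request_len() and consume() are all hypothetical names made up for this example, and it assumes requests without bodies (plain GETs) and a strict \r\n\r\n terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-connection buffer; all names here are illustrative. */
struct conn_buf {
    char   data[8192];
    size_t len;      /* bytes currently stored in data */
    size_t scanned;  /* bytes of the current request already searched */
};

/*
 * Returns the length of the first complete request head in cb, including
 * the terminating CRLF CRLF, or 0 if no complete head is buffered yet.
 */
static size_t next_request_len(struct conn_buf *cb)
{
    assert(cb->len <= sizeof cb->data);

    /* Back up 3 bytes so a terminator split across two reads is still found. */
    size_t i = cb->scanned < 3 ? 0 : cb->scanned - 3;

    for (; i + 4 <= cb->len; i++) {
        if (memcmp(cb->data + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    cb->scanned = cb->len;  /* remember how far we looked */
    return 0;
}

/* Drops the first n bytes (one handled request) and keeps whatever follows. */
static void consume(struct conn_buf *cb, size_t n)
{
    assert(n <= cb->len);
    memmove(cb->data, cb->data + n, cb->len - n);
    cb->len -= n;
    cb->scanned = 0;  /* the next request has not been searched yet */
}
```

After each read() you would append the new bytes to data, then call next_request_len() in a loop, handling and consume()-ing each complete request until it returns 0. Whatever is left over is the start of the next pipelined request and simply waits for more data.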

See how PicoHTTPParser handles this: https://github.com/h2o/picohttpparser/blob/066d2b1e9ab820703db0837a7255d92d30f0c9f5/picohttpparser.c#L197-L223

static const char *is_complete(const char *buf, const char *buf_end, size_t last_len, int *ret)
{
    int ret_cnt = 0;
    buf = last_len < 3 ? buf : buf + last_len - 3;

    while (1) {
        CHECK_EOF();
        if (*buf == '\015') {
            ++buf;
            CHECK_EOF();
            EXPECT_CHAR('\012');
            ++ret_cnt;
        } else if (*buf == '\012') {
            ++buf;
            ++ret_cnt;
        } else {
            ++buf;
            ret_cnt = 0;
        }
        if (ret_cnt == 2) {
            return buf;
        }
    }

    *ret = -2;
    return NULL;
}

Note how it takes the length seen by the previous parsing attempt in last_len, so it does not rescan bytes it has already checked. Also note that PicoHTTPParser accepts a bare \n as a line separator, which is something the RFC allows for interoperability reasons:

Although the line terminator for the start-line and fields is the sequence CRLF, a recipient MAY recognize a single LF as a line terminator and ignore any preceding CR.

I would suggest using memcmp or a hand-rolled loop like the one above, since it handles unexpected null bytes (\0) in the input and lets you return an error when parsing, rather than ending up with a timeout because strcmp doesn't 'see' the bytes after the null byte.
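To make the difference concrete, here is a small sketch of a length-based check. The helper names ends_with_crlfcrlf() and has_nul_byte() are hypothetical, and whether to reject a request that contains a null byte is a policy choice assumed here, not something taken from PicoHTTPParser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the len bytes at buf end with CRLF CRLF, else 0.
 * The comparison works on raw bytes with an explicit length, so an
 * embedded NUL earlier in the buffer does not cut it short.
 */
static int ends_with_crlfcrlf(const char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 && memcmp(buf + len - 4, "\r\n\r\n", 4) == 0;
}

/*
 * Returns 1 if any of the len bytes is a NUL. A server could use this
 * to answer with an error instead of waiting for more data.
 */
static int has_nul_byte(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

With an input such as "GET /\0 HTTP/1.1\r\n\r\n", any str* function treats the data as the five-character string "GET /", while the two helpers above still see the full byte count passed in len.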