Is WinHTTP downloading null bytes or am I copying the results buffer incorrectly?


I recently ported a fully working WinInet program to WinHTTP. Here's a function I wrote to wrap an entire GET request into a single line of code:

bool Get(Url url, std::vector<char>& data, ProgressCallbackFunction progressCallback = nullptr)
{
    long cl = -1;
    DWORD clSize = sizeof(cl);
    DWORD readCount = 0;
    DWORD totalReadCount = 0;
    DWORD availableBytes = 0;
    std::vector<char> buf;

    if (_session != NULL)
        throw std::exception("Concurrent sessions are not supported");

    _session = ::WinHttpOpen(_userAgent.c_str(), WINHTTP_ACCESS_TYPE_NO_PROXY, NULL, NULL, NULL);
    auto connection = ::WinHttpConnect(_session, url.HostName.c_str(), url.Port, 0);
    auto request = ::WinHttpOpenRequest(connection, TEXT("GET"), url.GetPathAndQuery().c_str(), NULL, NULL, NULL, WINHTTP_FLAG_REFRESH);

    if (request == NULL)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    auto sendRequest = ::WinHttpSendRequest(request, WINHTTP_NO_ADDITIONAL_HEADERS, NULL, WINHTTP_NO_REQUEST_DATA, NULL, NULL, NULL);
    if (sendRequest == FALSE)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    if (::WinHttpReceiveResponse(request, NULL))
    {
        if (progressCallback != nullptr && progressCallback != NULL)
        {
            if (!::WinHttpQueryHeaders(request, WINHTTP_QUERY_CONTENT_LENGTH | WINHTTP_QUERY_FLAG_NUMBER, WINHTTP_HEADER_NAME_BY_INDEX, reinterpret_cast<LPVOID>(&cl), &clSize, 0))
            {
                cl = -1;    
            }
        }

        while (::WinHttpQueryDataAvailable(request, &availableBytes))
        {
            if (availableBytes)
            {
                buf.resize(availableBytes + 1);
                auto hasRead = ::WinHttpReadData(request, &buf[0], availableBytes, &readCount);
                totalReadCount += readCount;
                data.insert(data.end(), buf.begin(), buf.begin() + readCount);
                buf.clear();

                if (progressCallback != nullptr && progressCallback != NULL)
                {
                    progressCallback(totalReadCount, cl, getProgress(totalReadCount, cl));
                }
            }
            else
                break;
        }
    }
    else
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    ::WinHttpCloseHandle(request);
    ::WinHttpCloseHandle(_session);
    _session = NULL;
    return true;
}

The code works in that it downloads the requested URL. The problem arises when the server doesn't return the Content-Length header (which is most of the time). The code will still download all the data, but there will be embedded null bytes when converted to a string.

The code above is called like this:

Url url(TEXT("http://msdn.microsoft.com/en-us/site/aa384376"));
Client wc;
std::vector<char> results;
wc.Get(url, results);
StdString html(results.begin(), results.end());
StdOut << html << endl;

StdString is a typedef for std::basic_string<TCHAR>, and StdOut is a macro that expands to cout or wcout depending on whether UNICODE is defined.

Because of the embedded nulls, not all of the response is displayed on the console. The output displayed when I run the code with debugging off can be viewed here (Note that the line breaks are simply where the text is wrapped in my console). The first null is seen just after "__in" at the very end and happens right where the "Press any key to continue. . . " output is displayed. Here's a screen cap of the output:

[Image: console output]

Here's a text visualizer screen cap of the value of the html variable showing exactly where the nulls appear in relation to what's viewable:

[Image: text visualizer for html]

Am I doing some bad copying somewhere or is there some nuance of WinHTTP of which I'm unaware?
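
A quick way to tell real null bytes apart from bytes that merely display badly is to inspect the raw buffer before it is converted to a string. The following is a minimal sketch, not part of the original program; countNulls and countHighBytes are hypothetical helper names introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of genuine 0x00 bytes in the downloaded buffer.
std::size_t countNulls(const std::vector<char>& data)
{
    return static_cast<std::size_t>(
        std::count(data.begin(), data.end(), '\0'));
}

// Hypothetical helper: number of bytes with the high bit set (0x80..0xFF),
// i.e. bytes outside the ASCII range. The cast to unsigned char makes the
// comparison independent of whether plain char is signed.
std::size_t countHighBytes(const std::vector<char>& data)
{
    return static_cast<std::size_t>(
        std::count_if(data.begin(), data.end(), [](char c) {
            return static_cast<unsigned char>(c) >= 0x80;
        }));
}
```

Calling both on results right after wc.Get(url, results) would show whether the buffer holds any zero bytes at all, or only non-ASCII bytes that the later conversion might be mishandling.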


There is 1 answer

jvstech On BEST ANSWER

Upon further review of the output, those are not nulls. They're Unicode characters that the console can't display because they're being stored incorrectly (and thus converted incorrectly). I was able to solve the problem in the Get method (and in the calling code) by changing

std::vector<char>

to

std::vector<unsigned char>

and now all is well.
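
Why the element type matters can be sketched as follows, assuming a build where plain char is signed (the MSVC default) and TCHAR is wchar_t. Constructing a wide string from an iterator range converts each element on its own, so a byte such as 0xE9 held in a signed char is a negative value and is sign-extended when widened, while the same byte held in an unsigned char widens to 0x00E9. The names widen and widenBytes are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: widen a byte buffer the same way
// StdString html(results.begin(), results.end()) does when TCHAR is wchar_t.
// Each element is converted to wchar_t individually.
template <typename Byte>
std::wstring widen(const std::vector<Byte>& bytes)
{
    return std::wstring(bytes.begin(), bytes.end());
}

// Hypothetical alternative that keeps std::vector<char> but routes every
// byte through unsigned char first, so values 0x80..0xFF are not
// sign-extended on platforms where plain char is signed.
std::wstring widenBytes(const std::vector<char>& bytes)
{
    std::wstring out;
    out.reserve(bytes.size());
    for (char c : bytes)
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    return out;
}
```

Either approach only maps bytes one-to-one onto code units, which is correct for Latin-1 text. If the response body is UTF-8, a real decoding step (for example MultiByteToWideChar with CP_UTF8 on Windows) would still be needed to get the intended characters.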