How to get bare wsgi stream in Flask?


Using Flask, I'd like to get at the bare wsgi.input reference. Looking at the code, there seem to be two ways to do this, both of which appear in:

werkzeug.wsgi.get_input_stream(environ, safe_fallback=True):
    ...
    if environ.get('wsgi.input_terminated'):
        return stream
    ...
    if content_length is None:
        return safe_fallback and _empty_stream or stream
    ...

Annoyingly I can't figure out how to actually get either of these cases to happen (and they're barely mentioned in the docs).

wsgi.input_terminated: I know I can set the wsgi environment if I'm using a proper server like Apache but how do I do it under the Flask dev server, given that Werkzeug hard codes its wsgi environment in werkzeug.serving.make_environ()?

safe_fallback: I can't figure this one out at all. What is this parameter doing here if the function is only ever called without it? How am I supposed to activate it?

Quite possibly missing something easy here...

3

There are 3 answers

1
ddzialak On

I know this is a pretty old thread, but I've found one way to handle chunked-encoding requests (I need the stream, not the data) in a custom Python application (gunicorn + Flask). Instead of using flask.Flask as the app, I created a subclass:

import flask

class FlaskApp(flask.Flask):
    def request_context(self, environ):
        # Mark the input stream as terminated so Werkzeug hands chunked
        # request bodies through; otherwise flask.request.stream is empty.
        environ['wsgi.input_terminated'] = True
        return super(FlaskApp, self).request_context(environ)

Does anyone have a better way to do this?
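One alternative to subclassing is to set the flag in a small WSGI middleware wrapped around the application. This is only a sketch: `InputTerminatedMiddleware` is a hypothetical name, and it assumes the server has already decoded the chunked body so that the stream really does end at end of input. If that is not the case, reading from it may block.

```python
class InputTerminatedMiddleware:
    """Hypothetical WSGI middleware that flags chunked request bodies as terminated."""

    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        # WSGI exposes the Transfer-Encoding header as HTTP_TRANSFER_ENCODING.
        encoding = environ.get("HTTP_TRANSFER_ENCODING", "")
        if encoding.lower() == "chunked":
            environ["wsgi.input_terminated"] = True
        return self.wsgi_app(environ, start_response)


# Usage with a Flask app object named `app` (assumed to exist elsewhere):
#     app.wsgi_app = InputTerminatedMiddleware(app.wsgi_app)
```

Because it only touches the WSGI environ, the same wrapper works with any WSGI application, not just Flask.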

6
MTKnife On

Here's a slightly more elaborate version of the function @javabrett suggested in his comment to @ddzialak's answer. This one checks for chunked input before setting the flag (I'm not positive that's necessary, but it doesn't seem like a great idea to set a flag you might not need).

@app.before_request
def handle_chunking():
    """
    Sets the "wsgi.input_terminated" environment flag, thus enabling
    Werkzeug to pass chunked requests as streams; this makes the API
    compliant with the HTTP/1.1 standard.  The gunicorn server should set
    the flag, but this feature has not been implemented.
    """

    # Transfer-coding names are case-insensitive, so normalize before comparing.
    transfer_encoding = request.headers.get("Transfer-Encoding", "")
    if transfer_encoding.lower() == "chunked":
        request.environ["wsgi.input_terminated"] = True

The value of "Transfer-Encoding" that actually comes through will probably be in Unicode rather than ASCII, hence "u'chunked'" would be more accurate--but for characters in the ASCII range, the difference is irrelevant (Python will match the strings either way), and Python 3 doesn't need the "u".

EDIT: Note that some WSGI servers no longer require this fix. Specifically, gunicorn has set the wsgi.input_terminated flag itself since version 20.0. However, 20.0 was also the first version to drop Python 2.x support, so if you're stuck on Python 2.x you'll still need this fix.
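Once the flag is in place, the body can be consumed incrementally instead of being loaded in one go. A minimal sketch of that loop follows; `iter_chunks` is a hypothetical helper, and it assumes only a file-like object with a `read(size)` method, which is what `request.stream` provides.

```python
import io


def iter_chunks(stream, chunk_size=8192):
    """Yield successive blocks from a file-like object until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read signals end of input
            break
        yield chunk


# In a Flask view this would look like:
#     for chunk in iter_chunks(request.stream):
#         ...  # process each block as it arrives
```

The loop only terminates if the stream eventually returns an empty read, which is the guarantee the wsgi.input_terminated flag is meant to express.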

0
Josh Peterson On

The solution @ddzialak posted works, but reading the stream with either request.stream.read(chunk_size) or request.get_data() gives the complete stream's content, including headers. Here is what I use to extract the part my tools need:

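import re
from flask import request

str_data = request.get_data()  # raw request body as bytes
pattern = re.compile(r'\r\n\r\n(.+?)\n\r\n', flags=re.DOTALL)  # text between the first blank line and the trailing "\n\r\n"
file_data = pattern.findall(str_data.decode('UTF-8'))[0]  # first match; this is what my tools use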
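To see what the pattern captures, here it is applied to a sample body. The sample is an assumption: the answer does not say what the client sends, so this uses a hypothetical multipart-style part whose payload ends in a newline. `extract_payload` is likewise a hypothetical wrapper that returns None instead of raising when nothing matches.

```python
import re

# Hypothetical body: one multipart-style part whose payload ends in "\n".
sample_body = (
    b"--boundary\r\n"
    b'Content-Disposition: form-data; name="file"\r\n'
    b"\r\n"
    b"hello\n"
    b"\r\n--boundary--\r\n"
)

pattern = re.compile(r"\r\n\r\n(.+?)\n\r\n", flags=re.DOTALL)


def extract_payload(body):
    """Return the text after the part headers, or None if the pattern finds nothing."""
    matches = pattern.findall(body.decode("UTF-8"))
    return matches[0] if matches else None
```

Note that the pattern consumes the payload's final newline, so it relies on the content ending in "\n". A payload without that trailing newline would not match this way.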