How Python's built-in next() function works as used in the Gunicorn server


The following is an excerpt from the handle method of the gunicorn.workers.sync.SyncWorker class:

def handle(self, listener, client, addr):
        req = None
        try:
            if self.cfg.is_ssl:
                client = sock.ssl_wrap_socket(client, self.cfg)
            parser = http.RequestParser(self.cfg, client, addr)
            req = next(parser)
            self.handle_request(listener, req, client, addr)

The line parser = http.RequestParser(self.cfg, client, addr) above returns a gunicorn.http.parser.RequestParser object. RequestParser inherits from gunicorn.http.parser.Parser, which is defined as follows:

class Parser(object):

    mesg_class = None

    def __init__(self, cfg, source, source_addr):
        self.cfg = cfg
        if hasattr(source, "recv"):
            self.unreader = SocketUnreader(source)
        else:
            self.unreader = IterUnreader(source)
        self.mesg = None
        self.source_addr = source_addr

        # request counter (for keepalive connections)
        self.req_count = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Stop if HTTP dictates a stop.
        if self.mesg and self.mesg.should_close():
            raise StopIteration()

        # Discard any unread body of the previous message
        if self.mesg:
            data = self.mesg.body.read(8192)
            while data:
                data = self.mesg.body.read(8192)

        # Parse the next request
        self.req_count += 1
        self.mesg = self.mesg_class(self.cfg, self.unreader, self.source_addr, self.req_count)
        if not self.mesg:
            raise StopIteration()
        return self.mesg

    next = __next__

The class gunicorn.http.parser.Parser above defines both the __iter__ and __next__ special methods, which together make up the iterator protocol. However, parser, the local variable in handle that holds the returned object, is an instance of gunicorn.http.parser.RequestParser, which as far as I can tell is not an iterator: pdb (see the stack trace below) reports its type as <class 'gunicorn.http.parser.RequestParser'>. After the assignment to parser, the method executes req = next(parser), passing parser as the argument.

Question:

Shouldn't the built-in function next() accept only an iterator as its first argument? As I understand it, parser is not an iterator but an instance of the class gunicorn.http.parser.RequestParser. The following is the documentation for next() from the Python docs:

next(iterator)
next(iterator, default)
Retrieve the next item from the iterator by calling its __next__() method. If default is given, it is returned if the iterator is exhausted, otherwise StopIteration is raised.
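As a quick illustration of the documented behavior, here is a minimal sketch that uses a plain list iterator rather than any Gunicorn code (the variable names are made up for the example):

```python
# iter() turns the list into an iterator; next() then pulls one item at a time.
it = iter(["a", "b"])

first = next(it)           # first item
second = next(it)          # second item
fallback = next(it, None)  # iterator is exhausted, so the default is returned

try:
    next(it)               # no default given: StopIteration propagates
    exhausted = False
except StopIteration:
    exhausted = True
```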

I ran pdb to produce the stack trace and confirm what actually happens when the line req = next(parser) executes: it drops into the __next__ method of gunicorn.http.parser.Parser. The __iter__ method, which should return an iterator, is not called at all. In fact, I commented it out and the code still works without complications. Is the __iter__ method actually required after all?

Stack trace:

/home/humbulani/django/env/bin/gunicorn(8)<module>()
-> sys.exit(run())
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py(71)run()
-> WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/app/base.py(264)run()
-> super().run()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/app/base.py(74)run()
-> Arbiter(self).run()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/arbiter.py(226)run()
-> self.manage_workers()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/arbiter.py(602)manage_workers()
-> self.spawn_workers()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/arbiter.py(673)spawn_workers()
-> self.spawn_worker()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/arbiter.py(640)spawn_worker()
-> worker.init_process()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/workers/base.py(144)init_process()
-> self.run()
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/workers/sync.py(126)run()
-> self.run_for_one(timeout)
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/workers/sync.py(70)run_for_one()
-> self.accept(listener)
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/workers/sync.py(32)accept()
-> self.handle(listener, client, addr)
  /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/workers/sync.py(140)handle()
-> req = next(parser) # self.msg attribute
> /home/humbulani/django/env/lib/python3.10/site-packages/gunicorn/http/parser.py(37)__next__()
-> if self.mesg and self.mesg.should_close():

Your response will be highly appreciated.

Answer by chepner:

next does very little; you can imagine its implementation looks like this:

_sentinel = object()

def next(itr, default=_sentinel):
    try:
        return itr.__next__()
    except StopIteration:
        if default is _sentinel:
            raise
        return default

So next(parser) just calls parser.__next__(), because parser is an iterator: it's an instance of RequestParser, which inherits __next__ (and __iter__) from Parser.
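This also explains why commenting out __iter__ did not break the call in handle: next() only needs __next__. A minimal sketch, using a hypothetical OnlyNext class that is not part of Gunicorn, shows where the missing __iter__ would start to matter:

```python
class OnlyNext:
    """Hypothetical class (not from Gunicorn): defines __next__ but no __iter__."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def __next__(self):
        if self.count >= self.limit:
            raise StopIteration
        self.count += 1
        return self.count


obj = OnlyNext(2)
values = [next(obj), next(obj)]  # next() works: it only looks up __next__
after_end = next(obj, "done")    # StopIteration is swallowed, default returned

# A for loop calls iter() first, which needs __iter__, so this raises TypeError.
try:
    for _ in OnlyNext(2):
        pass
    loop_error = None
except TypeError as exc:
    loop_error = exc
```

So __iter__ is not needed by the single next(parser) call, but it would be needed by any code that loops over the parser with for or passes it to iter().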

What next() does not accept is an object that is not an iterator, that is, one whose class does not define __next__. By also defining __iter__, Parser is (like all proper iterators) an iterable as well. An object can be an iterable without being an iterator (a list, for example), but not the other way around.
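The list case can be checked directly. The following is a small sketch using only the standard library; the variable names are illustrative:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]

# A list defines __iter__ but not __next__: iterable, yet not an iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# Passing a non-iterator to next() raises TypeError.
try:
    next(numbers)
    error = None
except TypeError as exc:
    error = exc

# iter() produces an iterator from the list; that iterator is itself iterable,
# and its __iter__ simply returns the same object.
it = iter(numbers)
iterator_is_iterable = isinstance(it, Iterable)
returns_self = iter(it) is it
first = next(it)
```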