Tornado server using most of the CPU while using tornado-sockjs with only two clients

I am using Tornado 4.4.2, PyPy 5.9.0 and Python 2.7.13 on Ubuntu 16.04.3 LTS.

When a new client logs in, a new class instance is created and passed the socket so that the dialog can be maintained. I keep these instances in a global clients[] list. The initial dialog looks like this:

clients = []

class RegisterWebSocket(SockJSConnection):
  # initialize the class and handle on_open (some things left out)

    def on_open(self,info):
        self.ipaddress = info.headers['X-Real-Ip']

    def on_message(self, data):
        coinlist = []
        msg = json.loads(data)
        if 'coinlist' in msg:
            coinlist = msg['coinlist']
        if 'currency' in msg:
            currency = msg['currency']
            tz = pendulum.timezone('America/New_York')
            started = pendulum.now(tz).to_day_datetime_string()
            ws = WebClientUpdater(self, self.clientid, coinlist,currency, 
                 started, self.ipaddress)
            clients.append(ws)

The ws class is shown below. I use a Tornado PeriodicCallback to send each client its specific data every 20 seconds:

class WebClientUpdater(SockJSConnection):

    def __init__(self, ws,id, clist, currency, started, ipaddress):
        super(WebClientUpdater,self).__init__(ws.session)
        self.ws = ws
        self.id = id
        self.coinlist = clist
        self.currency = currency
        self.started = started
        self.ipaddress = ipaddress
        self.location = loc
        self.loop = tornado.ioloop.PeriodicCallback(self.updateCoinList, 
                  20000, io_loop=tornado.ioloop.IOLoop.instance())                                    
        self.loop.start()
        self.send_msg('welcome '+ id)

    def updateCoinList(self):
        pdata = db.getPricesOfCoinsInCurrency(self.coinlist,self.currency)
        self.send(dict(priceforcoins = pdata))

    def send_msg(self,msg):
        self.send(msg)

I also start a 60-second PeriodicCallback at startup to monitor the clients for closed connections and remove them from the clients[] list. It is set up from the startup code, which calls a function like this:

if __name__ == "__main__":
    app = make_app()
    app.listen(options.port) 
    ScheduleSocketCleaning()

and

def ScheduleSocketCleaning():
    def cleanSocketHouse():
        print "checking sockets"
        for x in clients:
            if x.is_closed:
              x = None

    clients[:] = [y for y in clients if not y.is_closed ]

    loop = tornado.ioloop.PeriodicCallback(cleanSocketHouse, 60000,                             
          io_loop=tornado.ioloop.IOLoop.instance())
    loop.start()

If I monitor the server with top, I see that it typically uses 4% CPU, with bursts above 60% right away, but later, say after a few hours, it climbs into the 90% range and stays there.

I have used strace and I see an enormous number of stat calls on the same files, with errors reported in the strace -c summary, but I cannot find any errors in the text file written with -o trace.log. How can I find those errors?

But I also notice that most of the time is consumed in epoll_wait.

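% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 41.61    0.068097           7      9484           epoll_wait
 26.65    0.043617           0    906154      2410 stat
 15.77    0.025811           0    524072           read
 10.90    0.017840         129       138           brk
  2.41    0.003937           9       417           madvise
  2.04    0.003340           0    524072           lseek
  0.56    0.000923           3       298           sendto
  0.06    0.000098           0     23779           gettimeofday
------ ----------- ----------- --------- --------- ----------------
100.00    0.163663               1989527      2410 total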

Notice 2410 errors above.

When I watch the strace output stream while attached to the PID, I just see endless stat calls on the same files.

Can someone advise me on how to better debug this situation? With only two clients and 20 seconds between client updates (there are no other users of the site during this prototype stage), I would expect CPU usage to be less than 1% or thereabouts.

1 answer

Answer by freakish

You need to stop the PeriodicCallbacks, otherwise it's a memory leak. You do that by calling .stop() on the PeriodicCallback object. One way to deal with that is in your periodic cleaning task:

def cleanSocketHouse():
    global clients
    new_clients = []
    for client in clients:
        if client.is_closed:
            # I don't know why you call it loop,
            # .timer would be more appropriate
            client.loop.stop()
        else:
            new_clients.append(client)
    clients = new_clients
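
The cleanup idea can also be tried out without a running server. The following is only a sketch under assumptions: StubTimer and StubClient are hypothetical stand-ins for tornado.ioloop.PeriodicCallback and WebClientUpdater, and clean_socket_house is an illustrative name. It assumes each client exposes is_closed and keeps its timer in .loop, as in the question, and that the timer is stopped with .stop(). Unlike the version above, it mutates the list in place instead of rebinding the global.

```python
class StubTimer(object):
    """Hypothetical stand-in for tornado.ioloop.PeriodicCallback."""

    def __init__(self):
        self.running = True

    def stop(self):
        # PeriodicCallback.stop() cancels further invocations.
        self.running = False


class StubClient(object):
    """Hypothetical stand-in for WebClientUpdater from the question."""

    def __init__(self, is_closed):
        self.is_closed = is_closed
        self.loop = StubTimer()


def clean_socket_house(clients):
    """Stop the timers of closed clients and drop them from the list in place.

    Returns the number of clients that were removed.
    """
    closed = [c for c in clients if c.is_closed]
    for client in closed:
        client.loop.stop()
    # Slice assignment keeps every other reference to the list valid.
    clients[:] = [c for c in clients if not c.is_closed]
    return len(closed)
```

Because the list object itself is kept, other code holding a reference to clients (for example the on_message handler that appends to it) keeps seeing the cleaned list.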

I'm not sure how accurate .is_closed is (some testing is required). The other way is to alter updateCoinList. The .send() method should fail when the client is no longer connected, right? If so, a try/except should do the trick:

def updateCoinList(self):
    global clients
    pdata = db.getPricesOfCoinsInCurrency(self.coinlist,self.currency)
    try:
        self.send(dict(priceforcoins = pdata))
    except Exception:
        # log exception?
        self.loop.stop()
        clients.remove(self)  # you should probably use set instead of list

If .send() doesn't actually fail (for whatever reason; I'm not that familiar with Tornado), then stick to the first solution.
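
On the strace part of the question: the errors column of strace -c counts calls that returned -1. In the log written with -o trace.log these are not labelled with the word "error"; they show up as lines ending in something like "= -1 ENOENT (No such file or directory)". A minimal sketch for tallying them follows. The function name count_failed_calls and the regular expression are illustrative assumptions, and the pattern only catches calls whose first argument is a quoted path (which is the case for stat).

```python
import re
from collections import Counter

# Matches failed calls whose first argument is a quoted path, e.g.
#   stat("/some/path", 0x7ffd0) = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(r'(\w+)\("([^"]*)".*\)\s+= -1 (E[A-Z0-9]+)')


def count_failed_calls(lines):
    """Count (syscall, path, errno name) triples for calls that returned -1."""
    failures = Counter()
    for line in lines:
        match = FAILED_CALL.search(line)
        if match:
            failures[match.groups()] += 1
    return failures


def report(log_path, top=20):
    """Print the most frequent failing calls found in an strace log file."""
    with open(log_path) as log:
        for (call, path, errno_name), n in count_failed_calls(log).most_common(top):
            print("%6d  %s  %s  %s" % (n, call, errno_name, path))
```

Calling report("trace.log") would then list which paths the failing stat calls are hitting, which is usually the first clue to what keeps polling the filesystem.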