Gearman + SQLAlchemy - keep losing MySQL thread


I have a Python script that sets up several Gearman workers. They call methods on SQLAlchemy models that are also used by a Pylons app.

Everything works fine for an hour or two, then the MySQL thread gets lost and all queries fail. I cannot figure out why the thread is getting lost (I get the same results on 3 different servers) when I am defining such a low value for pool_recycle. Also, why wouldn't a new connection be created?

Any ideas of things to investigate?

import gearman
import json
import ConfigParser
import sys
from sqlalchemy import create_engine

class JSONDataEncoder(gearman.DataEncoder):
    @classmethod
    def encode(cls, encodable_object):
        return json.dumps(encodable_object)
    @classmethod
    def decode(cls, decodable_string):
        return json.loads(decodable_string)

# get the ini path and the anypy lib path from the command line
try:
    ini_file = sys.argv[1]
    lib_path = sys.argv[2]
except IndexError:
    raise Exception("ini file path or anypy lib path not set")

# get the config
config = ConfigParser.ConfigParser()
config.read(ini_file)
sqlachemy_url = config.get('app:main', 'sqlalchemy.url')
gearman_servers = config.get('app:main', 'gearman.mysql_servers').split(",")

# add anypy include path
sys.path.append(lib_path)
from mypylonsapp.model.user import User, init_model
from mypylonsapp.model.gearman import task_rates

# sqlalchemy setup, recycle connection every hour
engine = create_engine(sqlachemy_url, pool_recycle=3600)
init_model(engine)

# Gearman Worker Setup
gm_worker = gearman.GearmanWorker(gearman_servers)
gm_worker.data_encoder = JSONDataEncoder()

# register the workers
gm_worker.register_task('login', User.login_gearman_worker)
gm_worker.register_task('rates', task_rates)

# work
gm_worker.work()

There is 1 answer

David On

I've seen this in Ruby, PHP, and Python, regardless of the DB library used. I couldn't find how to fix it the "right" way, which is to use mysql_ping, but there is a SQLAlchemy solution, explained in more detail here: http://groups.google.com/group/sqlalchemy/browse_thread/thread/9412808e695168ea/c31f5c967c135be0
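
As a rough illustration of the ping-before-use idea (what mysql_ping is for), here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere. `ensure_alive` and `connect` are hypothetical names, not part of MySQLdb or SQLAlchemy, and a real MySQL driver would raise its own error class instead of `sqlite3.Error`.

```python
import sqlite3


def connect():
    # Hypothetical connection factory; a real worker would open a MySQL connection here.
    return sqlite3.connect(":memory:")


def ensure_alive(conn, connect_fn):
    """Return conn if it answers a trivial query, otherwise a fresh connection."""
    try:
        conn.execute("SELECT 1").fetchone()
        return conn
    except sqlite3.Error:
        # Stale or closed handle: discard it and open a new one.
        return connect_fn()


conn = connect()
conn.close()  # simulate the server having dropped the connection
conn = ensure_alive(conn, connect)
result = conn.execute("SELECT 1").fetchone()[0]
print(result)
```

The cost of this pattern is one extra round trip per use, which is why a time-based recycle threshold is often preferred for busy workers.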

As someone in that thread points out, setting pool_recycle to True is equivalent to setting it to 1, which means a one-second threshold. A better solution might be to find your MySQL connection timeout value and set the recycle threshold to 80% of it.

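You can get that value from a live server by looking up the timeout variables. Note that idle connections are closed according to wait_timeout; connect_timeout only governs the initial connection handshake. Both are listed on this page: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_connect_timeout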
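
To make the 80% rule concrete, here is a small sketch. It assumes the timeout that matters is MySQL's wait_timeout (28800 seconds by default), which can be read with `SHOW VARIABLES LIKE 'wait_timeout'`. `recycle_seconds` is a hypothetical helper, not a SQLAlchemy function.

```python
def recycle_seconds(server_timeout, percent=80):
    """Return a pool_recycle value that is `percent` of the server's idle timeout."""
    if server_timeout <= 0:
        raise ValueError("server_timeout must be positive")
    # Integer arithmetic avoids float rounding; never go below one second.
    return max(1, server_timeout * percent // 100)


# 28800 is MySQL's default wait_timeout; substitute the value from your own server.
pool_recycle = recycle_seconds(28800)
print(pool_recycle)
```

The result would then be passed to the engine in the question as `create_engine(sqlachemy_url, pool_recycle=pool_recycle)`.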

Edit: It took me a bit to find the authoritative documentation on using pool_recycle: http://www.sqlalchemy.org/docs/05/reference/sqlalchemy/connections.html?highlight=pool_recycle