Too much selects are made in one-to-one relationships in SQLALchemy

250 views Asked by At
import sqlalchemy as db
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Entity(Base):
    __tablename__ = 'entity'
    id = db.Column(db.Integer, primary_key=True, index=True)
    a = db.Column(db.Integer, index=True)
    b = db.Column(db.Integer)

    foos = relationship('Foo')


class Foo(Base):
    __tablename__ = 'foo'
    id = db.Column(db.Integer, primary_key=True, index=True)

    entity_id = db.Column(db.Integer, db.ForeignKey('entity.id'))

    entity = relationship('Entity', uselist=False)
    bars = relationship('Bar')


class Bar(Base):
    __tablename__ = 'bar'
    id = db.Column(db.Integer, primary_key=True, index=True)

    foo_id = db.Column(db.Integer, db.ForeignKey('foo.id'))

    foo = relationship('Foo', uselist=False)

engine = db.create_engine('sqlite:///:memory:', echo=False)
session = sessionmaker(bind=engine)()
Base.metadata.create_all(engine)

def relationship_optimizing():
    engine.echo = True
    entity = Entity(a=1000000, b=10000000000)
    foo = Foo(entity=entity)
    bar = Bar(foo=foo)
    session.add_all([entity, foo, bar])
    session.commit()
    bla = session.query(Entity).filter_by(id=bar.foo.entity_id).one()
    session.commit()

relationship_optimizing()

When I try to access some objects using one-to-one relashionship access chains, I get one SELECT per every dot operation even if it is not necessary to make a query.

Look at the code. When I try to get object "bla", SQLAlchemy generates 3 queries:

SELECT bar.id AS bar_id, bar.foo_id AS bar_foo_id 
FROM bar 
WHERE bar.id = 1

SELECT foo.id AS foo_id, foo.entity_id AS foo_entity_id 
FROM foo 
WHERE foo.id = 1

SELECT entity.id AS entity_id, entity.a AS entity_a, entity.b AS entity_b 
FROM entity 
WHERE entity.id = 1

I tried to use lazy="joined" and lazy="subquery" in all relationships, but first 2 queries are still present. I want to get rid of them. Final query may use joins, but it should be the only one query.

This is a toy example, but in the real project there are too much parasite queries like that when I just access relatioonship fields. My project does a lot of small queries (mostly with one record) so performance is really slow :(

1

There are 1 answers

3
Eevee On BEST ANSWER

The problem is that you're committing after you call add_all. After you commit, SQLAlchemy has no way of knowing that other transactions haven't modified the rows corresponding to your objects, so it "forgets" all of their attributes until the next time you use them.

There's no reason to commit in the middle of your work, and it really defeats the purpose of transactions in the first place. If all you need to do is get ids populated on your objects, use session.flush(), which updates the database to match your objects — without committing.

session.commit() only happens to solve this problem because it (necessarily) flushes first, but for some reason, it's become apocryphal advice to commit instead of flush when all you need is automatic keys.