Query language for python objects

10.3k views Asked by At

I'm looking for a library that lets me run SQL-like queries on python "object databases". With object database I mean a fairly complex structure of python objects and lists in memory. Basically this would be a "reverse ORM" - instead of providing an object oriented interface to a relational database, it would provide a SQL-ish interface to an object database.

C#'s LINQ is very close. Python's list comprehensions are very nice, but the syntax gets hairy when doing complex things (sorting, joining, etc.). Also, I can't (easily) create queries dynamically with list comprehensions.

The actual syntax could either be string based, or use a object-oriented DSL (a la from(mylist).select(...)). Bonus points if the library would provide some kind of indices to speed up search.

Does this exist or do I have to invent it?

7

There are 7 answers

1
Tobu On

Dee is aiming to be SQL (ish; the author prefers relational calculus) for in-memory structures. There's a GROUP BY equivalent and everything.

4
Steven On

If you like list comprehensions, don't forget about their "lazy" counterpart: generator expressions. These should at least to some extent solve the question of dynamically constructing queries. Complement this with itertools, and of course some of the builtin functions that work on iterables, and you could say you have your Python "LINQ to Object" (or at least something very close).

You won't get "pseudo-sql" syntax built in in python as with C# and LINQ, but that is a matter of taste I guess. (personally, I like the fact that it remains Python, uses the same keywords etc. instead of trying to be sql: familiar if and for clauses instead of introducing where and from and so on).

The object-oriented DSL approach seems feasible (and easier to implement than string-based I think). You will find this sort of thing in ORMs like SqlAlchemy, but I don't know if anyone has done that already for "normal" data structures (I guess people don't really feel it's necessary?)

For indices and so on: I guess you'll have to search for actual "object databases"...

0
SingleNegationElimination On

One major difference between what SQL does and what you can do in idiomatic python, in SQL, you tell the evaluator what information you are looking for, and it works out the most efficient way of retrieving that based on the structure of the data it holds. In python, you can only tell the interpreter how you want the data, there's no equivalent to a query planner.

That said, there are a few extra tools above and beyond list comprehensions that help alot.

First, use a structure that closely resembles the declarative nature of SQL. Many of them are builtins. map, filter, reduce, zip, all, any, sorted, as well as the contents of the operator, functools and itertools packages, all offer a fairly concise way of expressing data queries.

0
PaulMcG On

Not quite exactly what you describe, but littletable works with lists of objects in memory, supports joins, pivots, queries. And each result is in turn another littletable, so you can easily perform a succession of filtering steps.

0
paul-g On

I came across this as I was looking for a good library to support a similar use case. Turns out that Pandas works nicely for cases where join/group by type of queries are needed or for time series processing.

0
randomsurfer_123 On

We're launching PythonQL, which does exactly what you mention (its quite similar to C# LINQ). PythonQL demo site

0
PaulMcG On

See if sql4csv might be of help.