Access composite data type using sqlalchemy in postgres

1.1k views Asked by At

I'm trying to pull out composite columns from tiger.geocode function in python using sqlalchemy.
In pure sql form this looks like this:

SELECT   
    g.rating  
    ,ST_X(g.geomout) As lon  
    ,ST_Y(g.geomout) As lat  
    ,(addy).address As stno  
    ,(addy).streetname As street  
    ,(addy).streettypeabbrev As styp  
    ,(addy).location As city  
    ,(addy).stateabbrev As st  
    ,(addy).zip  
FROM geocode(pagc_normalize_address('1 Capitol Square Columbus OH 43215')) As g  
;

This produces the following output:

#   rating  lon lat stno    street  styp    city    st  zip
1   17  -82.99782603089086  39.96172588526335   1   Capital St  Columbus    OH  43215

The issue I'm faced with is how to reference composite columns when querying the object from sqlalchemy (rating, lon, lat, stno, street, styp, city, st, zip)?

Please and Thank you.

1

There are 1 answers

0
univerio On

SQLAlchemy doesn't support set returning functions directly, but its FunctionElements are considered FromClauses, which means you can already treat them as tables; we just need to add the ability to select a particular column from the function. Fortunately, this is straightforward (although not obvious):

from sqlalchemy.sql.base import ColumnCollection
from sqlalchemy.sql.expression import column
from sqlalchemy.sql.functions import FunctionElement

NormAddy = CompositeType(
    "norm_addy",
    [
        Column("address", Integer),
        Column("predirAbbrev", String),
        Column("streetName", String),
        Column("streetTypeAbbrev", String),
        Column("postdirAbbrev", String),
        Column("internal", String),
        Column("location", String),
        Column("stateAbbrev", String),
        Column("zip", String),
        Column("parsed", Boolean),
    ],
)

class geocode(GenericFunction):
    columns = ColumnCollection(
        Column("rating", Integer),
        column("geomout"),  # lowercase column because we don't have the `geometry` type
        Column("addy", NormAddy),
    )

Subclassing from GenericFunction has the added benefit of registering the geocode function globally so that func.geocode will work as expected.

g = func.geocode(func.pagc_normalize_address("1 Capitol Square Columbus OH 43215")).alias("g")
query = session.query(
    g.c.rating,
    func.ST_X(g.c.geomout).label("lon"),
    func.ST_Y(g.c.geomout).label("lat"),
    g.c.addy.address.label("stno"),
    g.c.addy.streetName.label("street"),
    g.c.addy.streetTypeAbbrev.label("styp"),
    g.c.addy.location.label("city"),
    g.c.addy.stateAbbrev.label("st"),
    g.c.addy.zip,
).select_from(g)

Unfortunately this doesn't quite work. There seems to be a bug that makes g.c.addy.address syntax not work on recent versions of SQLAlchemy. We can fix it real quick (although this should really be fixed in sqlalchemy_utils):

from sqlalchemy_utils.types.pg_composite import CompositeElement
import sqlalchemy_utils

class CompositeType(sqlalchemy_utils.CompositeType):
    class comparator_factory(_CompositeType.comparator_factory):
        def __getattr__(self, key):
            try:
                type_ = self.type.typemap[key]
            except KeyError:
                raise AttributeError(key)
            return CompositeElement(self.expr, key, type_)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.typemap = {c.name: c.type for c in self.columns}

Now it works:

print(query.statement.compile(engine))
# SELECT g.rating, ST_X(g.geomout) AS lon, ST_Y(g.geomout) AS lat, (g.addy).address AS stno, (g.addy).streetName AS street, (g.addy).streetTypeAbbrev AS styp, (g.addy).location AS city, (g.addy).stateAbbrev AS st, (g.addy).zip AS zip_1 
# FROM geocode(pagc_normalize_address(%(pagc_normalize_address_1)s)) AS g