Express SQL UNION query Rails way

3.2k views Asked by At

I got a query which works well, but expressed in SQL. I would like to have the same query expressed using ActiveRecord Query Interface (Arel would also do fine). The query should preferably return ActiveRecord::Relation or, at least, it's result should be convertible to an array of Customer models.

The goal is to fetch company's customers having no associated import_logs with remote_type = 'account', as well as customers having an import_log with remote_type = 'account' and status = 'pending'.

A customer can have no import_logs associated at all, or have an import_log per each remote_type, or only for some remote_types. There can be only one associated import_log with specific remote_type value.

This reflects a requirement that a customer can be imported as account or contact or both and import_log tracks the status of the import.

Although import_log has a polymorphic association with customer, this is not relevant to the task.

Existing query:

Customer.find_by_sql(
  <<-SQL
    SELECT
      customers.*
    FROM
      customers
    WHERE
      company_id = #{@company.id}
      AND NOT EXISTS
          ( SELECT *
            FROM import_logs
            WHERE import_logs.importable_id = customers.id
              AND import_logs.importable_type = 'Customer'
              AND import_logs.remote_type = 'account'
          )
    UNION
    SELECT
      customers.*
    FROM
      customers,
      import_logs
    WHERE
      import_logs.importable_id = customers.id AND
      import_logs.importable_type = 'Customer' AND
      company_id = #{@company.id} AND
      import_logs.remote_type = 'account' AND
      import_logs.status = 'pending';
  SQL
)

Relevant portions of ImportLog model:

create_table "import_logs", force: true do |t|
  t.integer  "importable_id"
  t.string   "importable_type"
  t.string   "status",          default: "pending", null: false
  t.string   "remote_type"
  ...
end

add_index "import_logs", ["importable_id", "importable_type", "remote_type"], unique: true ...

class ImportLog < ActiveRecord::Base
  ...
  belongs_to :importable, polymorphic: true
  ...
end

Relevant portions of Customer model:

create_table "customers", force: true do |t|
  t.integer  "company_id"
  ...
end

class Customer < ActiveRecord::Base
  ...
  belongs_to :company
  has_many :import_logs, as: :importable
  ...
end

And of Company model, just in case:

class Company < ActiveRecord::Base
  ...
  has_many :customers
  ...
end
2

There are 2 answers

11
D-side On BEST ANSWER

merge associations

In fact, there is only one assocation that is driven by a query constant.

"customers"."company_id" = #{@company.id}

It's the same as:

.merge(@company.customers)

...and this looks a bit safer and more sensible

Arel tables

We'll need it soon.

customers = Customer.arel_table

NOT EXISTS ... subquery

Arel can do that, the only not-so-obvious thing is how to reference the outside table:

ne_subquery = ImportLog.where(
                importable_type: Customer.to_s,
                  importable_id: customers[:id],
                    remote_type: 'account'
              ).exists.not

This results in a hunk of Arel AST we can feed to Rails' where-statement.

Now both queries become obvious:

first  = @company.customers.where(ne_subquery)
second = @company.customers.joins(:import_logs).merge(
           ImportLog.where(
           # importable_id: customers[:id], # `joins` already does it
           importable_type: Customer.to_s,
               remote_type: 'acoount',
                    status: 'pending'
           )
         )

That is pretty much a 1-to-1 conversion.

Union

That is a tricky part, and the only solution I have found has terribly ugly syntax and outputs a bit different query. Given A union B we can only build select X.* from (A union B) X. The effect is the same.

All right, let's get to it:

Customer.from(
  customers.create_table_alias(
    first.union(second),
    Customer.table_name
  )
)

Of course, to make this query more readable, you should:

  • Place it as a scope in Customer class
  • Split out reusable parts into scopes and associations
0
Nic Nilov On

Based on the code suggested by @D-side I was able to reach a working solution. Here is the initially suggested code:

customers = Customer.arel_table

ne_subquery = ImportLog.where(
  importable_type: Customer.to_s,
  importable_id: customers['id'],
  remote_type: 'account'
).exists.not

first  = @company.customers.where(ne_subquery)
second = @company.customers.joins(:import_logs).merge(
  ImportLog.where(
    importable_type: Customer.to_s,
    remote_type: 'account',
    status: 'pending'
  )
)

Customer.from(
  customers.create_table_alias(
    first.union(second),
    Customer.table_name
  )
)

Running it leads to this error:

PG::ProtocolViolation: ERROR:  bind message supplies 0 parameters, but prepared statement "" requires 1

: SELECT "customers".* FROM ( SELECT "customers".* FROM "customers"  WHERE "customers"."company_id" = $1 \
AND (NOT (EXISTS (SELECT "import_logs".* FROM "import_logs"  WHERE "import_logs"."importable_type" = 'Customer' \
AND "import_logs"."importable_id" = "customers"."id"))) UNION SELECT "customers".* FROM "customers" \
INNER JOIN "import_logs" ON "import_logs"."importable_id" = "customers"."id" \
AND "import_logs"."importable_type" = 'Customer' WHERE "customers"."company_id" = $1 \
AND "import_logs"."importable_type" = 'Customer' AND "import_logs"."remote_type" = 'contact' \
AND "import_logs"."status" = 'pending' ) "customers"

This error I believe is a manifestation of Rails issue #20077 which at this time stands unresolved. Since the issue relates to parameter binding, making that binding more explicit helps. Here is a working solution:

customers = Customer.arel_table

ne_subquery = ImportLog.where(
  importable_type: Customer.to_s,
  importable_id: customers['id'],
  remote_type: 'account'
).exists.not

first  = Customer.where(ne_subquery).where(company_id: @company.id)
second = Customer.joins(:import_logs).merge(
  ImportLog.where(
    importable_type: Customer.to_s,
    remote_type: 'account',
    status: 'pending'
  )
).where(company_id: @company.id)

Customer.from(
  customers.create_table_alias(
    first.union(second),
    Customer.table_name
  )
)

Note the .where(company_id: @company.id) are applied explicitly and first and second queries begin unscoped.