I have a Rails application that stores its configuration in 34 MySQL tables consisting of various objects and associations, for a total of about 900 records altogether. Up to recently the business logic was built on ActiveRecord, but performance was choppy because I didn't have enough control on how many queries were fired. Recently I "ported" the business logic to Dry::Struct
, duplicating all the involved ActiveRecord classes to Dry::Struct
value models and preloading all the configuration objects inside a Configuration
instance: this drastically reduced the number of queries to a small and fixed amount, and improved performance by a noticeable margin because all the association “walking” is done in memory, and there's a lot of that.
So far so good, but the loading of the 34 tables, most of which I need at every request, still took 34 queries and about 160 ms. The strategy was going low level on ActiveRecord, loading all the records in all the tables as plain hashes, and then initializing the structs with those and storing everything in the Configuration
object.
I wanted to improve performance further, so I had the idea of getting all the data in just one query, by making a UNION
of all the fields in all the tables. That makes up a beefy 30-kilobyte SQL query, which is surprisingly executed in just 20 ms. Excellent! Now I just have to unroll this big structure in ~10 ms and I'm a winner!
Well, no. It turns out that just scanning the result array takes 48 ms (20 of which are the SQL query), that time grows up to 78 ms when parsing some JSON fields here and there and preparing the hashes to initialize the structs... and then initializing those takes an additional 89 ms all by itself. I wouldn't believe it if I hadn't measured every step by repeating the algorithm 100 times (after pre-warming memoized values, of course), but there it is. All in all, compared to the previous, much simpler algorithm of loading every table separately, there's no performance gain whatsoever although the single query is efficient.
Here is what the SQL looks like:
SELECT a1, a2, NULL, NULL, NULL, NULL FROM table_a
UNION ALL
SELECT NULL, NULL, b1, b2, NULL, NULL FROM table_b
UNION ALL
SELECT NULL, NULL, NULL, NULL, c1, c2 FROM table_c
which yields a “diagonal” structure like this
"string", 3, NULL, NULL, NULL, NULL -- from table_a
"other string", 5, NULL, NULL, NULL, NULL -- from table_a
NULL, NULL, 1, "{\"json\":true}", NULL, NULL -- from table_b
NULL, NULL, 2, NULL, NULL, NULL -- from table_b
NULL, NULL, NULL, NULL, 7, 10 -- from table_c
NULL, NULL, NULL, NULL, 9, 51 -- from table_c
then the following algorithm unfolds it into the original records:
def preload_all!
ranges = self.class.preload_field_ranges.invert
logger.measure_debug("Preloaded configuration") do
ApplicationRecord.connection.execute(self.class.preload_query).each do |data|
# finding where the first significant column is
pos = data.index { |i| !i.nil? }
# resolving the table name based on where the significant value was found, exiting early
range, table_name = ranges.select { |k, v| break [ k, v ] if pos.in?(k) }
v_class = self.class.tables_to_value_classes[table_name]
values = data[range].map do |i|
case i
when String
# horrible kludge to parse JSON fields, because I wasn't able to inspect AR
# classes to ask them which fields are serialized, any help is appreciated
case
when i[0].in?([ "{", "[" ]) then JSON.parse(i)
else i
end
else i
end
end
# assignments are for clarity, doing these operations inline shaves about 10 ms
ivar = "@#{table_name}"
hash = self.class.ar_attribute_names[table_name].zip(values).to_h
v_model = v_class.new(hash.merge(configuration: self))
vhash = instance_variable_get(ivar) || {}
instance_variable_set(ivar, vhash.tap { |h| h[v_model.id] = v_model })
end
end
self
end
I tried to tighten the code as much as I could, but just removing the v_class.new
part cuts the time in half. Is there any room for improvement?
As a side note, loading the Marshal
ed finished Configuration
object from Redis only takes 10 ms, but I wanted to avoid using Redis to prevent misalignments.
As suggested by none other than solnic himself: