Why does adding a JOIN completely modify the query planner behaviour?

116 views Asked by At

I have two queries:

SELECT "recipes_recipe"."short_name",
    COUNT(DISTINCT "recipes_recipe"."quantity_type") AS "quantity_type_count",
    SUM("measures_measure"."standard") AS "volume",
    CASE WHEN COUNT(DISTINCT "recipes_recipe"."quantity_type") = 1 
      THEN (SUM((T7."standard" * T8."standard")) / SUM(T8."standard"))
      ELSE NULL END AS "weighted_temperature"
FROM "orders_orderitemresult"
INNER JOIN "orders_orderitem" 
  ON ( "orders_orderitemresult"."order_line_id" = "orders_orderitem"."id" )
INNER JOIN "orders_order" 
  ON ( "orders_orderitem"."order_id" = "orders_order"."id" )
INNER JOIN "recipes_recipe" 
  ON ( "orders_orderitem"."recipe_id" = "recipes_recipe"."id" )
LEFT OUTER JOIN "measures_measure" 
  ON ( "orders_orderitemresult"."measured_volume_id" = "measures_measure"."id" )
LEFT OUTER JOIN "measures_measure" T7 
  ON ( "orders_orderitemresult"."temperature_id" = T7."id" )
INNER JOIN "measures_measure" T8 
  ON ( "orders_orderitemresult"."loaded_quantity_id" = T8."id" )
WHERE ("orders_orderitemresult"."deleted" = False
AND "orders_order"."freight_id" IN (
  SELECT U0."id" 
  FROM "freights_freight" U0 
  WHERE (
   U0."deleted" = False
   AND U0."state" = 'completed'
   AND U0."completed_date" > '2015-06-06 00:00:00+00:00'
) ) )
GROUP BY "recipes_recipe"."short_name"
ORDER BY "recipes_recipe"."short_name" ASC

And the same with one more JOIN:

SELECT "recipes_recipe"."short_name",
   COUNT(DISTINCT "recipes_recipe"."quantity_type") AS "quantity_type_count",
   SUM("measures_measure"."standard") AS "volume",
   CASE WHEN COUNT(DISTINCT "recipes_recipe"."quantity_type") = 1
     THEN (SUM((T7."standard" * T8."standard")) / SUM(T8."standard")) 
     ELSE NULL END AS "weighted_temperature"
FROM "orders_orderitemresult"
INNER JOIN "orders_orderitem" 
  ON ( "orders_orderitemresult"."order_line_id" = "orders_orderitem"."id" )
INNER JOIN "orders_order" 
  ON ( "orders_orderitem"."order_id" = "orders_order"."id" )
INNER JOIN "recipes_recipe" 
  ON ( "orders_orderitem"."recipe_id" = "recipes_recipe"."id" )
LEFT OUTER JOIN "measures_measure"
  ON ( "orders_orderitemresult"."measured_volume_id" = "measures_measure"."id" )
LEFT OUTER JOIN "measures_measure" T7
  ON ( "orders_orderitemresult"."temperature_id" = T7."id" )
INNER JOIN "measures_measure" T8
  ON ( "orders_orderitemresult"."loaded_quantity_id" = T8."id" )
LEFT OUTER JOIN "measures_measure" T9
  ON ( "orders_orderitemresult"."converted_volume_id" = T9."id" )
WHERE ("orders_orderitemresult"."deleted" = False
AND "orders_order"."freight_id" IN (
  SELECT U0."id"
  FROM "freights_freight" U0
  WHERE (
   U0."deleted" = False
   AND U0."state" = 'completed'
   AND U0."completed_date" > '2015-06-06 00:00:00+00:00'
)))
GROUP BY "recipes_recipe"."short_name"
ORDER BY "recipes_recipe"."short_name" ASC

Theses queries were generated by Django ORM. The first one takes 3ms while the second one takes 120s (!). I did a VACUUM ANALYZE. The EXPLAIN ANALYZE give the following:

What can caused the query planner to go this way ? Why are the index not used in the second query ?

I tried disabling hash but it used a more complex method.

Edit:

More information about the indexes:

       table_name       |                         index_name                          |           column_names           
------------------------+-------------------------------------------------------------+----------------------------------
 measures_measure       | measures_measure_pkey                                       | id
 orders_orderitem       | orders_orderitem_69dfcb07                                   | order_id
 orders_orderitem       | orders_orderitem_7339046a                                   | completed_date
 orders_orderitem       | orders_orderitem_9ed39e2e                                   | state
 orders_orderitem       | orders_orderitem_da50e9c3                                   | recipe_id
 orders_orderitem       | orders_orderitem_pkey                                       | id
 orders_orderitem       | orders_orderitem_quantity_id_key                            | quantity_id
 orders_orderitem       | orders_orderitem_state_68c8c3135683908e_like                | state
 orders_orderitemresult | orders_orderitemresult_5fc62ccf                             | order_line_id
 orders_orderitemresult | orders_orderitemresult_8424d087                             | creation_date
 orders_orderitemresult | orders_orderitemresult_98ea6344                             | result_type
 orders_orderitemresult | orders_orderitemresult_c00077c4                             | loading_rack_id
 orders_orderitemresult | orders_orderitemresult_converted_volume_id_key              | converted_volume_id
 orders_orderitemresult | orders_orderitemresult_counter_end_id_key                   | counter_end_id
 orders_orderitemresult | orders_orderitemresult_counter_start_id_key                 | counter_start_id
 orders_orderitemresult | orders_orderitemresult_e35d4212                             | loading_arm_id
 orders_orderitemresult | orders_orderitemresult_loaded_quantity_id_key               | loaded_quantity_id
 orders_orderitemresult | orders_orderitemresult_mass_id_key                          | mass_id
 orders_orderitemresult | orders_orderitemresult_measure_temperature_density_id_key   | measure_temperature_density_id
 orders_orderitemresult | orders_orderitemresult_measured_volume_id_key               | measured_volume_id
 orders_orderitemresult | orders_orderitemresult_pkey                                 | id
 orders_orderitemresult | orders_orderitemresult_reference_temperature_density_id_key | reference_temperature_density_id
 orders_orderitemresult | orders_orderitemresult_result_type_19bfde1efbd2fb2e_like    | result_type
 orders_orderitemresult | orders_orderitemresult_target_quantity_id_key               | target_quantity_id
 orders_orderitemresult | orders_orderitemresult_temperature_id_key                   | temperature_id
 recipes_recipe         | recipes_recipe_9bea82de                                     | product_id
 recipes_recipe         | recipes_recipe_bdc611c2                                     | external_code
 recipes_recipe         | recipes_recipe_c76a5e84                                     | active
 recipes_recipe         | recipes_recipe_code_64a829b99ae673f2_like                   | code
 recipes_recipe         | recipes_recipe_code_key                                     | code
 recipes_recipe         | recipes_recipe_external_code_492873921b11a1bf_like          | external_code
 recipes_recipe         | recipes_recipe_long_name_38b0490ebf585c5c_like              | long_name
 recipes_recipe         | recipes_recipe_long_name_key                                | long_name
 recipes_recipe         | recipes_recipe_pkey                                         | id
 recipes_recipe         | recipes_recipe_short_name_5e9f02ca3b6b7843_like             | short_name
 recipes_recipe         | recipes_recipe_short_name_key                               | short_name
1

There are 1 answers

1
kiorky On BEST ANSWER

Use a subselect to limit the scope of the cartesian product which will drastically improve your query cost.

Something like

SELECT "recipes_recipe"."short_name",
   COUNT(DISTINCT "recipes_recipe"."quantity_type") AS "quantity_type_count",
   SUM("measures_measure"."standard") AS "volume",
   CASE WHEN COUNT(DISTINCT "recipes_recipe"."quantity_type") = 1
     THEN (SUM((T7."standard" * T8."standard")) / SUM(T8."standard"))
     ELSE NULL END AS "weighted_temperature",
   SUM(T9."standard") AS "converted_volume"
FROM (
  SELECT
  orders_orderitem.recipe_id,
  orders_orderitem.id as orders_orderitem_id,
  "orders_orderitemresult"."measured_volume_id",
  "orders_orderitemresult"."temperature_id",
  "orders_orderitemresult"."loaded_quantity_id",
  "orders_orderitemresult"."converted_volume_id"
  FROM "orders_orderitemresult"
  INNER JOIN "orders_orderitem"
    ON ( "orders_orderitemresult"."order_line_id" = "orders_orderitem"."id" )
  INNER JOIN "orders_order"
    ON ( "orders_orderitem"."order_id" = "orders_order"."id" )
  WHERE (
  "orders_orderitemresult"."deleted" = False
  AND "orders_order"."freight_id" IN (
    SELECT U0."id"
    FROM "freights_freight" U0
    WHERE (
     U0."deleted" = False
     AND U0."state" = 'completed'
     AND U0."completed_date" > '2015-06-06 00:00:00+00:00'
)))) subselect
INNER JOIN "recipes_recipe"
  ON ( "subselect"."recipe_id" = "recipes_recipe"."id" )
INNER JOIN "measures_measure"
  ON ( "subselect"."measured_volume_id" = "measures_measure"."id" )
INNER JOIN "measures_measure" T7
  ON ( "subselect"."temperature_id" = T7."id" )
INNER JOIN "measures_measure" T8
  ON ( "subselect"."loaded_quantity_id" = T8."id" )
INNER JOIN "measures_measure" T9
  ON ( "subselect"."converted_volume_id" = T9."id" )
GROUP BY "recipes_recipe"."short_name"
ORDER BY "recipes_recipe"."short_name" ASC;

should be better