Trouble with LEFT OUTER JOIN

161 views Asked by At

I'm trying to get a list of products order by the amount sold and by date. I also want to display the products that haven't been sold in the list so I tried doing a subquery first but MYSQL is giving me this message:

Operand should contain 1 column(s)

SELECT product.product_id, 
       product.product_brand_id, 
       product.product_model_id, 
       product.product_subcategory_id, 
       product.product_retail_price, 
       product.product_wholesale_price
FROM product
WHERE product.product_subcategory_id = $subcategory_id 
AND (SELECT SUM(product_sold.product_quantity) AS product_quantity_sold, 
            SUM(product_sold.product_total_price) AS total_price_sold 
     FROM product
     INNER JOIN product_sold 
        ON product.product_id = product_sold.product_id
     INNER JOIN sales 
        ON sales.sales_id = product_sold.product_sales_id 
     WHERE sales.sales_approved = '1' 
       AND sales.sales_approved_time > '$start_timestamp' 
       AND sales.sales_approved_time < '$end_timestamp')

The subquery did not work, So i tried using LEFT OUTER JOIN as suggested by another member with this query:

SELECT product.product_id, 
       product.product_brand_id, 
       product.product_model_id, 
       product.product_subcategory_id, 
       product.product_retail_price, 
       product.product_wholesale_price, 
       SUM(product_sold.product_quantity) AS product_quantity_sold, 
       SUM(product_sold.product_total_price) AS total_price_sold 
FROM product
LEFT OUTER JOIN product_sold ON product.product_id = product_sold.product_id 
                            AND product.product_subcategory_id = $subcategory_id
LEFT OUTER JOIN sales ON sales.sales_id = product_sold.product_sales_id 
WHERE sales.sales_approved = '1' 
  AND sales.sales_approved_time > '$start_timestamp' 
  AND sales.sales_approved_time < '$end_timestamp'
GROUP BY product.product_id 
ORDER BY SUM(product_sold.product_quantity) DESC

But this query with LEFT OUTER JOIN is giving me the list of product sold only, what I want is to also show the products that haven't been sold in the list.

Here is the schema used sqlfiddle.com/#!2/967ee

1

There are 1 answers

3
OGHaza On BEST ANSWER

This query will work, to SELECT more columns from the products table add them to both the SELECT and GROUP BY clauses:

SELECT p.product_id
    ,SUM(IFNULL(ps.product_quantity,0)) AS product_quantity_sold
    ,SUM(IFNULL(ps.product_total_price,0)) AS total_price_sold 
FROM product p
LEFT JOIN product_sold ps ON p.product_id = ps.product_id
LEFT JOIN sales s ON ps.product_sales_id = s.sales_id
WHERE p.product_subcategory_id = $subcategory_id
  AND ( s.sales_id IS NULL
       OR ( s.sales_approved = '1' 
          AND s.sales_approved_time > '$start_timestamp'  
          AND s.sales_approved_time < '$end_timestamp'
          )
       )
GROUP BY p.product_id
ORDER BY SUM(IFNULL(ps.product_quantity,0)) DESC

Explanation

This query would have been much simpler if you could have used product_sold_approved and product_sold_approved_time in the WHERE clause instead of values from the sales table.

You LEFT JOIN product_sold, a LEFT JOIN means you keep all records from the products table, and those that have been sold will get joined to each of the product_sold records. Then you do the same for the sales table.

So at this stage, you have lots of rows that are product + product_sold + sales but you also have all the unsold products product + NULL + NULL. You need to filter out all the joined records where the matching sale fields do not meet your criteria, but you need to leave all the records that failed to join alone.

To achieve this you have a WHERE clause that deals with each set of records separately. WHERE (condition A) OR (condition B).

Condition A deals with our unsold products, WHERE s.sales_id IS NULL - all records that couldn't join to a sale are included in the result set and don't have to match the other criteria.

OR (sales_approved = 1 AND ... AND ...) records where s.sales_id isn't NULL will have to pass this half of the where clause, hence we filter out unwanted sales.

At the end we're left with a result set that contains all records that didn't have any sales + all product/product_sales/sales record that met your criteria. Then we can just GROUP BY product_id AND SUM what's left.

I have IFNULLs in my SQL, these are because if you're summing lots of values and some of them might be NULL, if a single value is NULL then the result of the SUM is NULL. The IFNULL(column,0) protects against this by converting any NULL values to a 0. 1 + NULL = NULL, 1 + 0 = 1. HOWEVER, on reflection I thats probably not needed for the query - remove them and you should notice the change.