I have a small database of various computer parts and prices. There are 2 tables, Parts and Prices.
Parts:
partID desc mfgr timeStamp
---------- ---------- ---------- ----------
1 RAM Crucial 1
2 MOBO MSI 1
3 I7 970 Intel 1
1 RAM Crucial 2
Prices:
productID qty price timeStamp
---------- ---------- ---------- ----------
1 1 50.0 1
1 2 100.0 1
1 3 130.0 1
2 1 140.0 1
3 1 499.99 1
3 1 449.99 2
1 4 150.0 2
2 1 150.0 2
1 1 40.0 2
1 4 200.0 3
I need to grab everything from Parts that has the most recent timestamp, and GROUP_CONCAT(price) for all of the prices that match both the partID and the timestamp. So the output should look something like this:
partID desc mfgr timeStamp GROUP_CONCAT(price)
---------- ---------- ---------- ---------- -------------------
1 RAM Crucial 2 150.0,40
2 MOBO MSI 1 140.0
3 I7 970 Intel 1 499.99
I'm really close, but not quite getting the right results. I have tried
SELECT * FROM Parts INNER JOIN
(SELECT partID, MAX(Parts.timeStamp) as maxTS, GROUP_CONCAT(price) FROM
Parts, Prices WHERE partID = Prices.productID GROUP BY partID) grouped
ON Parts.partID = grouped.partID AND Parts.timeStamp = grouped.maxTS;
But this grabs everything from the pricing table where the part ID matches, whether or not the timestamp also matches.
partID desc mfgr timeStamp partID maxTS GROUP_CONCAT(price)
---------- ---------- ---------- ---------- ---------- ---------- -------------------
2 MOBO MSI 1 2 1 140.0,150.0
3 I7 970 Intel 1 3 1 449.99,499.99
1 RAM Crucial 2 1 2 40.0,50.0,100.0,130
So I changed my command to
SELECT * FROM Parts INNER JOIN
(SELECT partID, MAX(Parts.timeStamp) AS maxTS, GROUP_CONCAT(price)
FROM Parts, Prices
WHERE partID = Prices.productID AND (SELECT MAX(parts.timestamp) FROM Parts) = Prices.timeStamp) grouped
ON Parts.partID = grouped.partID AND Parts.timeStamp = grouped.maxTS;
But this only matches rows from pricing that have the largest timestamp that occurs in parts. (Which is 2)
What am I doing wrong here?
The mistake you are making is that you are grouping price entries before filtering out only the entries that you need. So, your
grouped
sub-query will contain all the prices for apartID
, and there is no way to separate them out later, because they are grouped.The best approach for solving these kind of things is breaking down your queries into pieces that you need.
You say:
So, let's do that. Note that this will pretty much require a sub-query, or "pivot", since RDBMS are not that good in letting you pick a row on a condition that depends on other rows (in this case, selecting rows where a field is maximum among some group). We'll call this sub-query
aux
, and use it to select entries fromparts
that match thepartID
/timeStamp
combination that satisfies the condition:This is using implicit joins, you can rewrite this query using
JOIN
syntax as well. I personally avoidJOIN
syntax unless I need left, or other special joins:Now, you wanna join it with the prices of the same ID/timestamp, but group the prices together (as
GROUP_CONCAT
). The key here is to only select (which happens before grouping) the price entries that match the "latest" part entries.Because the first query produces output that can be directly joined with prices tables, the query just needs to be extended to include price table, and grouping:
This can be re-written using
JOIN
syntax as well:This query is a little bit convoluted, and depending on the data set, it may be beneficial to re-write it in a different way, to make sure database understands (good for the optimizer) what gets filtered first. We can move the filtered "parts" into a sub-query of its own (call it
bux
), and then join that withprices
table:If you check the execution plans between the two, you will see a difference. Selecting which one to use on production would depend on which one performs better.
http://sqlfiddle.com/#!9/f12c8/10/0