Subquery is returning more than one value? Not getting a result set?

537 views Asked by At

Write a SELECT statement that returns two columns: VendorName and LargestInv (LargestInv is the correlation name of the subquery)

Subquery portion: SELECT statement that returns the largest InvoiceTotal from the Invoices table (you will need to perform the JOIN within the subquery in one of the clauses).

Sort the results by LargestInv from largest to smallest.(Subquery Must be in the Select statement)

I have tried this but My subqueries returning more than one value

USE AP

SELECT VendorName, (SELECT MAX(InvoiceTotal) FROM Invoices JOIN Vendors
                        ON Invoices.VendorID = Vendors.VendorID
                    GROUP BY Invoices.VendorID) AS LargestInv
FROM Vendors
2

There are 2 answers

4
Eric Brandt On BEST ANSWER

Your issue is scope.

The sub-query shouldn't be joining to the Vendor table if the goal is a correlated sub-query. The "correlated" part comes from joining the results of the inner query (the sub-query) to the outer query.

As written, you're finding VendorID inside the sub-query and the results aren't correlated to the outer query at all. Hence your error message.

SELECT 
  VendorName, 
  (SELECT MAX(InvoiceTotal) 
   FROM Invoices 
   WHERE Invoices.VendorID = Vendors.VendorID
  ) AS LargestInv
FROM Vendors
ORDER BY LargestInv DESC;

Edit (extended explanation):

A correlated sub-query isn't designed to pull up a full result set, like the sub-query you were writing at first. It's designed to go over to another table and use a value (or values) from the outer query to bring back a single result, one row at a time.

In this case, using the VendorID from the Vendors table, go over to Invoices, calculate a MAX value "WHERE" the VendorID in Invoices matches the VendorID ON THIS ROW, bring that single value back, then, next row, go back and do that again. And again and again.

It's one way to get the data, but it's not usually efficient. Later, though, you'll learn to use correlated sub-queries in (NOT) EXISTS clauses, and in that context they tend to be extremely efficient. Story for another day, but it's one reason the construct is important to know.

So, your way was good, because it was set based and would tend to be more efficient as a sub-query in the FROM clause, but this way, row by row, is important to understand conceptually.

4
Hogan On

This is how I would do it.

SELECT VendorName, LargestInv.MaxI
FROM Vendors
FROM (
  SELECT VendorName, MAX(InvoiceTotal) as MaxI 
  FROM Invoices 
  JOIN Vendors ON Invoices.VendorID = Vendors.VendorID
  GROUP BY VendorName
) AS LargestInv ON LargestInv.VendorName = Vendors.VendorName

Now having more than one in the sub-query won't give you an error and you can look at the results.