Suppose I have Tables:
dbo.Purchases
Id | Value | UserId
1 | 10.00 | 3
2 | 1.00 | 1
3 | 15.50 | 2
4 | 13.40 | 1
dbo.Users
Id (UQ) | Name
1 | Bob
2 | Sarah
3 | Alex
And a VIEW:
dbo.PurchasesWithUsers
SELECT *
FROM dbo.Purchases
LEFT JOIN dbo.Users ON Users.Id = UserId
And I'm going to run SELECT SUM(Value) FROM dbo.PurchasesWithUsers
.
Now ... as a human, I can see that that JOIN doesn't affect that query:
- It's obviously not directly used in the
SUM
. - It's a
LEFT JOIN
so it can't excludePurchase
rows. - It's joining to a column with a
UQ
constraint so it can't duplicatePurchase
rows.
But when I run the query and look at the execution plan, the Engine (MS SQL Server) is still performing the JOIN, which degrades the performance :(.
Is there any way that I can give the engine additional clues that it can work out that it could completely skip the JOIN, whilst still using the VIEW as the thing I'm querying?
Context:
- Obviously the tables are huge which is why the performance impact is material
- The Tables and the View are obviously a little more complex than that, but not actually all that much - the logical simplification is still valid, and the UQ constraints are explicit (as either UQ CONSTRAINTs or UQ indexes).
- The VIEW is being used so that the users can filter on a variety of different options. The Data API process those options and applies the relevant
WHERE
clauses to a singleVIEW
. Alas that means various of the JOINs aren't relevant to the VIEW depending on what filters have been chosen :( - I'm aware that I could materialise and directly index the VIEW, but I'd prefer to avoid that if possible, given that I can see that a simpler query plan could logically exist already.
Unless indexed, a view does not exist as a stored set of data values in a database. The rows and columns of data come from tables referenced in the query defining the view and are produced dynamically when the view is referenced. Because you don't want to "directly index the VIEW", there is no other data to work with and the server is getting the data by using the query behind the view so there is no work around to avoid that
LEFT JOIN
with this approch.