Entity framework doesn't use JOIN in SQL queries

1.6k views Asked by At

I wrote some code for one to many relationships getting Invoice instances from Defect instances.

mydbEntities ef = new mydbEntities (); //mydbEntities is derived from DbContext
ef.Database.Log = s => System.Diagnostics.Debug.WriteLine (s);
Invoice inv = ef.Defects.Where (i => i.Id == 5).SingleOrDefault ().Invoice;

That is part of EDM diagram:

EDM diagram

I'm curious why it did not invoke INNER JOIN, instead of it 2 SQL queries were executed. In the book I'm now reading ("Mastering Entity Framework"), in the same situation an INNER JOIN was invoked.

Output from Debug:

SELECT TOP (2) 
    [Extent1].[Id] AS [Id], 
    [Extent1].[PositionId] AS [PositionId], 
    [Extent1].[InvoiceId] AS [InvoiceId], 
    [Extent1].[Count] AS [Count], 
    [Extent1].[Reason] AS [Reason]
    FROM [dbo].[Defect] AS [Extent1]
    WHERE 5 = [Extent1].[Id]


-- Executing at 6/21/2015 11:21:02 AM +05:00

-- Completed in 1 ms with result: SqlDataReader



Closed connection at 6/21/2015 11:21:02 AM +05:00

Opened connection at 6/21/2015 11:21:02 AM +05:00

SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Number] AS [Number], 
    [Extent1].[InvoiceDate] AS [InvoiceDate]
    FROM [dbo].[Invoice] AS [Extent1]
    WHERE [Extent1].[Id] = @EntityKeyValue1


-- EntityKeyValue1: '1' (Type = Int32, IsNullable = false)

-- Executing at 6/21/2015 11:21:02 AM +05:00

-- Completed in 0 ms with result: SqlDataReader



Closed connection at 6/21/2015 11:21:02 AM +05:00

And one more question: How do 2 queries against 1 with INNER JOIN impact on performance in high-loaded application with large database.

2

There are 2 answers

1
AudioBubble On BEST ANSWER

EntityFramework (and most other Linq to Sql tools) are only lazy up to a certain point. There are a set of methods that force the query to run. Some examples are:

  • Single/SingleOrDefault
  • First/FirstOrDefault
  • ToList
  • ToArray

The trick I use is that each of these methods has an asynchronous version that returns Task<T>. So if your autocomplete in your IDE includes SingleOrDefaultAsync then it's obvious that SingleOrDefault causes the query to be run.

Note that the same behaviour occurs with Linq to Objects; each of the ToList, Single etc. are eager methods, whereas the ones that return IEnumerable are lazy and get executed on the first eager operation.

To answer your question, the Include method fixes this problem by returning the join data. This will improve performance if you use it correctly.

In your example, you are getting the Invoice property from your object, so the correct call would be to use .Include(d => d.Invoice) or .Include("Invoice"). Both of these tell EF to pull back the Invoice information in the first query.

1
Alexander On

You should add .Include(d => d.Invoice) call. Try this:

    mydbEntities ef = new mydbEntities (); //mydbEntities is derived from DbContext
    ef.Database.Log = s => System.Diagnostics.Debug.WriteLine (s);
    Invoice inv = ef.Defects
        .Include(d => d.Invoice)
        .Where (i => i.Id == 5)
        .SingleOrDefault()
        .Invoice;