SQL EXISTS Why does selecting rownum cause inefficient execution plan?

1.6k views Asked by At

Problem

I'm trying to understand why what seems like a minor difference in these two Oracle Syntax Update queries is causing a radically different execution plan.

Query 1:

UPDATE sales s
   SET status = 'DONE', trandate = sysdate
 WHERE EXISTS (Select *    
 FROM tempTable tmp
     WHERE s.key1 = tmp.key1
       AND s.key2 = tmp.key2
       AND s.key3 = tmp.key3)

Query 2:

UPDATE sales s
   SET status = 'DONE', trandate = sysdate
 WHERE EXISTS (Select rownum    
 FROM tempTable tmp
     WHERE s.key1 = tmp.key1
       AND s.key2 = tmp.key2
       AND s.key3 = tmp.key3)

As you can see the only difference between the two is that the subquery in Query 2 returns a rownum instead of the values of every row.

The execution plans for these two couldn't be more different:

  • Query1 - Pulls the total results from both tables and uses a sort and a hashjoin to return the results. This peforms well with a favorable 2,346 cost (despite the use of the EXISTS clause and the cohesive subquery).

  • Query2 - Pulls both table results as well but uses a count and a filter to accomplish the same task and returns an execution plan with an astonishing 77,789,696 cost! I should note that his query just hangs on me so I'm not actually positive this returns the same results (though I believe it should).

From my understanding of the Exists clause it is just a simple boolean check that runs per line of the main table. It doesn't matter if a single row is returned in my EXISTS condition or 100,000 rows... if any results are returned for the row that it is being run, then you've passed the exist check. So why would it matter what my subquery SELECT statement returns?

--------------------EDIT----------------------

Per request, below are the execution plans I'm running in TOAD... please note I edited the table names in my example above for ease - In these plans ALSS_SALES2 = sales above and SALESEXT_TMP = tempTABLE above.

Also should have mentioned but neither of the two tables has indices at this point.. I haven't yet added them to my tempTable and I'm testing with a cheap copy of the sales table which only contains the fields and data but no indices, constraints or security.

Thanks for the assistance everyone!

Query 1 Execution Plan

Query1 Execution Plan

Query 2 Execution Plan

Query2 Execution Plan

------------------------------------------------

Questions

1) Why did the call for rownum cause the execution plan to change?

2) What is it about the Filter that is so incredibally inefficient?

3) Am I missing something fundamental with the way the Exists clause works that is causing this change?

2

There are 2 answers

1
Justin Cave On BEST ANSWER

Posting the actual query plans would be quite helpful.

In general, though, when the optimizer sees a subquery with rownum, that radically limits its ability to transform the query and merge the results from the subquery with the main query because doing so potentially affects the results. That can be a quick way to force Oracle to materialize a subquery if that happens to be more efficient than the plan chosen by the optimizer. In this case, though, it is probably causing the optimizer to forego a transform step that makes the query more efficient.

Occasionally, you'll see someone take a query like

SELECT b.*
  FROM (SELECT <<columns>>
          FROM driving_table
         WHERE <<conditions>>) a,
       b
 WHERE a.id = b.id

and tack on a rownum to the a subquery

SELECT b.*
  FROM (SELECT <<columns>>, rownum
          FROM driving_table
         WHERE <<conditions>>) a,
       b
 WHERE a.id = b.id

in order to force the optimizer to evaluate the a subquery before executing the join. Normally, of course, the optimizer should do this by default if it is more efficient. But if the optimizer makes a mistake, adding rownum can be quicker than figuring out the right set of hints to force a plan or digging in to the underlying problem to figure out the right solution.

Of course, in the particular case that you have a subquery in a WHERE EXISTS where the only use of rownum comes in the SELECT list, we humans can detect that the rownum shouldn't prevent any query transform step that the optimizer would care to use. The optimizer, though, is probably using a more general rule that says that subqueries that reference a function like rownum must be completely executed (this may depend on the exact Oracle version and/or the optimizer settings). So the optimizer is realistically doing a bunch of extra work because it's not smart enough to recognize that the rownum you added cannot possibly affect the results of the query.

0
Wernfried Domscheit On

Just a question, what's the execution plan for this query:

UPDATE sales s
   SET status = 'DONE', trandate = sysdate
 WHERE EXISTS (Select NULL
 FROM tempTable tmp
     WHERE s.key1 = tmp.key1
       AND s.key2 = tmp.key2
       AND s.key3 = tmp.key3);

It visualize what is needed in an EXISTS (...) expression - actually nothing! As already stated Oracle just have to check if anything is returned, not what is returned in Sub-Query.