Today a production issue was reported on a large corporate system that I work on whereby an export was taking over 4 hours to run. This export was previously tested on similar or larger amounts of data and usually completed in under 10 minutes.

This issue was tracked down to a query similar to the following

select *
from action a 
left join action_history h on a.action_id = h.action_id 
where h.user = (select max(user) from action_user u where u.action_id = a.action_id)
and a.action_id = 1234;

Inserting the actual action id in the sub query fixed the problem

select *
from action a 
left join action_history h on a.action_id = h.action_id 
where h.user = (select max(user) from action_user u where u.action_id = 1234)
and a.action_id = 1234;

The same export is now running in a matter of minutes since we patched in the fix. However this issue seems quite concerning as I would have expected any decent Query Execution Engine to treat these queries as equivalent. However when we look at QEP they are very different for the two queries. I have a hunch that something has gone wrong with the ingres query engine since we upgraded to Ingres 10.2 but this cannot be easily tested as all our environments have now been upgraded to 10.2.

I do know that when this query was first introduced extensive testing was done under 10.0 and no performance issues were found. Can anyone confirm whether they would expect these two queries to be treated as equivalent?

0

There are 0 answers