how to improve the performance of my SQL query?

123 views Asked by At

I am a beginner in SQL and I want to query data from an Oracle database using SQL. I have a tablespace that records the positions of many cars. Each record has a mileage and a time. The tablespace has three columns: ‘SENDTIME’, ‘MILEAGE’, and ‘PLATENO’. They represent the time, the mileage (in the timestamp format), and the plate number of the cars respectively. The values in the ‘SENDTIME’ column are in the datetime format. I want to find out when a specific car did not move for more than 30 seconds and less than 300 seconds. So i wrote a sql query:

WITH gpsinfo_cte AS (
  SELECT plateno, sendtime, longitude, latitude, mileage, createdate,
    FIRST_VALUE(sendtime) OVER (PARTITION BY plateno, mileage ORDER BY sendtime) AS first_sendtime,
    LAST_VALUE(sendtime) OVER (PARTITION BY plateno, mileage ORDER BY sendtime) AS last_sendtime
  FROM GPSINFO 
  WHERE plateno = '京AEW302'
)
SELECT /*+ NO_MERGE(gpsinfo_cte) */ plateno, sendtime, longitude, latitude, mileage, createdate 
FROM gpsinfo_cte
WHERE (last_sendtime - first_sendtime) * 24 * 60 *60 < 300
AND (last_sendtime - first_sendtime) * 24 * 60 *60 > 30;

However, it runs rather slowly in oracle database. Acoording to web search results, I tried to use the EXPLAIN PLAN statement to generate the execution plan for your query and store it in a table called PLAN_TABLE. Here's the output:

enter image description here

I still have no idea to improve the performance. Anyone can help? Many thanks!

2

There are 2 answers

0
Paul W On BEST ANSWER

This is probably all that matters:

 FROM GPSINFO 
  WHERE plateno = '京AEW302'

Assuming you have lots (thousands, millions maybe) of cars, and are asking for only one license plate number, you want to use an index to access only table rows for that particular car. Your execution plan shows that this isn't happening:

TABLE ACCESS FULL GPSINFO

Simply create an index on the plateno column and that should resolve your issue:

CREATE INDEX gpsinfo_plateno_idx1 on gpsinfo(plateno) compress 1

As for the rest of your query, you'll have to work on the logic to get the right results, but it is unlikely that you have a noticeable performance problem with it as long as you are working with only one car. In terms of the logic, if I understand rightly you want to know when there is a 30-300 second gap between position recordings with no mileage between them. So you need to compare a row with a neighboring row. For that, use LAG (or LEAD):

SELECT x.*,
       (sendtime - last_sentime) * 86400 seconds_elapsed
  FROM (SELECT x.*,
               LAG(sendtime) OVER (ORDER BY sendtime) last_sendtime,
               LAG(mileage) OVER (ORDER BY sendtime) last_mileage
          FROM gpsinfo x
         WHERE plateno = '京AEW302') x
 WHERE (sendtime - last_sentime) * 86400 BETWEEN 30 AND 300
   AND mileage = last_mileage 

Note: If you are asking for only one plateno value, there is no need to include plateno in a PARTITION BY clause, that's redundant as there is only one. You can certainly use PARTITION BY mileage as an alternative to my mileage = last_mileage logic shown here, but mileage is likely to have many different values and internally grouping by that many values means a lot of tiny (single row) groups and that's not very memory/temp efficient. Yet as I said, the indexing is really your only serious issue.

0
gotqn On

In order to get fastest possible results you need to pre-calculated these values (first and last send time). This can be achieved with:

  • the routine responsible for adding the record - in the context of one plateno it will be faster to calculated and save the first and last time
  • trigger on the table
  • materilized view and process for refreshing its data

All required some dev time and can't be difficult.

I can also offer you to try to first calculate the plateno and then extract its details. In case of sendtime we are interested in the first (smallest) and last (largest) values. So, you can use:

WITH gpsinfo_cte AS (
  SELECT plateno
  FROM GPSINFO 
  WHERE plateno = '京AEW302'
  GROUP BY plateno
  HAVING (MAX(sendtime) - MIN(sendtime)) * 24 * 60 * 60 < 300
    AND (MAX(sendtime) - MIN(sendtime)) * 24 * 60 * 60 > 30
)
SELECT *
FROM GPSINFO
WHERE plateno IN (SELECT plateno FROM gpsinfo_cte);

If this is working, you can add index on plateno and sendtime in order to further optimize it.