Optimising a slow running SQL Server Stored procedure ordered by calculated fields to return a closest match

75 views Asked by At

I have a stored procedure that is painfully slow.

It concatenates a bunch of fields and performs several REPLACE functions but I would expect the problem is mostly related to the calculated decimal fields within the ORDER BY to return addresses that are closest to a given latitude and longitude point.

I guess the query is running across the entire data set - hence the dire speed?

Is there any way such a query can be optimised?

Here's an example of the query that runs:

SELECT TOP 1 [LocatorID], [POI], [Name] AS StreetName, [Settlement] As Town, 
  [COU_Unit] AS County, [Latitude], [Longitude], [Postcode], 
  REPLACE(REPLACE([Code],'County of ', ''),' County', ''), [Source], REPLACE((ISNULL(POI + ', ','') 
  + ISNULL(Name + ', ','') + ISNULL(Settlement + ', ','') + ISNULL(Cou_Unit + ', ','') 
  + ISNULL(Postcode,'')),', , ', ', ') 
 AS DisplayAddress FROM [UKStreetsAndPlaces].[dbo].[OS_Locator] ORDER BY 
 (Longitude + 0.157992) * (Longitude + 0.157992) + (Latitude - 51.176551) * (Latitude - 51.176551) ASC

Here's the actual stored procedure:

USE [UKStreetsAndPlaces]
GO
/****** Object:  StoredProcedure [dbo].[SP_GetAddressFromLatLong]    Script Date: 09/06/2015 09:54:29 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[SP_GetAddressFromLatLong] 
@Latitude decimal(9,6) = 0,     -- Latitude - format of 9 digits with 6 decimal places
@Longitude decimal(9,6)= 0      -- Longitude - format of 9 digits with 6 decimal places

AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    DECLARE @PATINDEX AS INT
    DECLARE @SQLQuery AS VARCHAR(900)
    DECLARE @LongitudeOperator AS VARCHAR(1)
    SET @LongitudeOperator = '-'

    -- Ensure we have the right operator for Longitude (we're safe with Latitude in the UK)
    IF (@Longitude < 0)
        -- Negative, turn to a positive number
        BEGIN       
            SET @LongitudeOperator = '+';
        END

    SET @SQLQuery = 'SELECT TOP 1 [LocatorID], [POI], [Name] AS StreetName, [Settlement] As Town, 
      [COU_Unit] AS County, [Latitude], [Longitude], [Postcode], 
      REPLACE(REPLACE([Code],''County of '', ''''),'' County'', ''''), [Source], REPLACE((ISNULL(POI + '', '','''') 
      + ISNULL(Name + '', '','''') + ISNULL(Settlement + '', '','''') + ISNULL(Cou_Unit + '', '','''') 
      + ISNULL(Postcode,'''')),'', , '', '', '') 
     AS DisplayAddress FROM [UKStreetsAndPlaces].[dbo].[OS_Locator] ORDER BY 
     (Longitude ' + @LongitudeOperator + ' ' + CAST(ABS(@Longitude) AS VARCHAR(20)) + ')'
    + ' * (Longitude ' + @LongitudeOperator + ' ' + CAST(ABS(@Longitude) AS VARCHAR(20)) + ')'
    + ' + (Latitude - ' + CAST(@Latitude AS VARCHAR(20)) + ') * (Latitude - ' + CAST(@Latitude AS VARCHAR(20)) + ') ASC'

    EXECUTE(@SQLQuery)
END
1

There are 1 answers

0
ChrisCurrie On

The solution was to add a spatial index on a geography field (previously created by inserting a new geography field for each row calculated from my latitude and longitude points) and to use the modified query below.

The modified query ran in a few milliseconds the previous query ran in 45 seconds...

Omitting the 'WITH INDEX(CenterGeographySpatialIndex)' (i.e. the name of the index I created) did not seem to impact on the speed.

This query ran in just a few milliseconds (goal achieved):

DECLARE @param nvarchar(50);
set @param = 'POINT(-0.1579925 51.1765506)';
Declare @paramGeog geography = geography::STPointFromText(@param, 4326);
Select top 1 * FROM [dbo].[UKAddressesLookup] WITH (INDEX(CenterGeographySpatialIndex))
WHERE CentreGeography.STDistance(@paramGeog) < 200
ORDER BY CentreGeography.STDistance(@paramGeog);

This 'nearest neighbour' query took 1 minute 36 seconds to run:

my USE UKStreetsAndPlaces
GO
DECLARE @g geography = 'POINT(-0.1579925 51.1765506)';
SELECT TOP(1) DisplayAddress FROM OS_Locator
ORDER BY CentreGeography.STDistance(@g);

This original version took over 5 minutes to run:

SELECT TOP 1 [LocatorID], [POI], [Name] AS StreetName, [Settlement] As Town, 
  [COU_Unit] AS County, [Latitude], [Longitude], [Postcode], 
  REPLACE(REPLACE([Code],'County of ', ''),' County', ''), [Source], REPLACE((ISNULL(POI + ', ','') 
  + ISNULL(Name + ', ','') + ISNULL(Settlement + ', ','') + ISNULL(Cou_Unit + ', ','') 
  + ISNULL(Postcode,'')),', , ', ', ') 
 AS DisplayAddress FROM [UKStreetsAndPlaces].[dbo].[OS_Locator] ORDER BY 
 (Longitude + 0.157992) * (Longitude + 0.157992) + (Latitude - 51.176551) * (Latitude - 51.176551) ASC

Conclusion: From 5 minutes to a few milliseconds... There's a reason Spatial types were created...