I have a stored procedure that is painfully slow.
It concatenates a bunch of fields and performs several REPLACE functions but I would expect the problem is mostly related to the calculated decimal fields within the ORDER BY to return addresses that are closest to a given latitude and longitude point.
I guess the query is running across the entire data set - hence the dire speed?
Is there any way such a query can be optimised?
Here's an example of the query that runs:
SELECT TOP 1 [LocatorID], [POI], [Name] AS StreetName, [Settlement] As Town,
[COU_Unit] AS County, [Latitude], [Longitude], [Postcode],
REPLACE(REPLACE([Code],'County of ', ''),' County', ''), [Source], REPLACE((ISNULL(POI + ', ','')
+ ISNULL(Name + ', ','') + ISNULL(Settlement + ', ','') + ISNULL(Cou_Unit + ', ','')
+ ISNULL(Postcode,'')),', , ', ', ')
AS DisplayAddress FROM [UKStreetsAndPlaces].[dbo].[OS_Locator] ORDER BY
(Longitude + 0.157992) * (Longitude + 0.157992) + (Latitude - 51.176551) * (Latitude - 51.176551) ASC
Here's the actual stored procedure:
USE [UKStreetsAndPlaces]
GO
/****** Object: StoredProcedure [dbo].[SP_GetAddressFromLatLong] Script Date: 09/06/2015 09:54:29 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[SP_GetAddressFromLatLong]
@Latitude decimal(9,6) = 0, -- Latitude - format of 9 digits with 6 decimal places
@Longitude decimal(9,6)= 0 -- Longitude - format of 9 digits with 6 decimal places
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @PATINDEX AS INT
DECLARE @SQLQuery AS VARCHAR(900)
DECLARE @LongitudeOperator AS VARCHAR(1)
SET @LongitudeOperator = '-'
-- Ensure we have the right operator for Longitude (we're safe with Latitude in the UK)
IF (@Longitude < 0)
-- Negative, turn to a positive number
BEGIN
SET @LongitudeOperator = '+';
END
SET @SQLQuery = 'SELECT TOP 1 [LocatorID], [POI], [Name] AS StreetName, [Settlement] As Town,
[COU_Unit] AS County, [Latitude], [Longitude], [Postcode],
REPLACE(REPLACE([Code],''County of '', ''''),'' County'', ''''), [Source], REPLACE((ISNULL(POI + '', '','''')
+ ISNULL(Name + '', '','''') + ISNULL(Settlement + '', '','''') + ISNULL(Cou_Unit + '', '','''')
+ ISNULL(Postcode,'''')),'', , '', '', '')
AS DisplayAddress FROM [UKStreetsAndPlaces].[dbo].[OS_Locator] ORDER BY
(Longitude ' + @LongitudeOperator + ' ' + CAST(ABS(@Longitude) AS VARCHAR(20)) + ')'
+ ' * (Longitude ' + @LongitudeOperator + ' ' + CAST(ABS(@Longitude) AS VARCHAR(20)) + ')'
+ ' + (Latitude - ' + CAST(@Latitude AS VARCHAR(20)) + ') * (Latitude - ' + CAST(@Latitude AS VARCHAR(20)) + ') ASC'
EXECUTE(@SQLQuery)
END
The solution was to add a spatial index on a geography field (previously created by inserting a new geography field for each row calculated from my latitude and longitude points) and to use the modified query below.
The modified query ran in a few milliseconds the previous query ran in 45 seconds...
Omitting the 'WITH INDEX(CenterGeographySpatialIndex)' (i.e. the name of the index I created) did not seem to impact on the speed.
This query ran in just a few milliseconds (goal achieved):
This 'nearest neighbour' query took 1 minute 36 seconds to run:
This original version took over 5 minutes to run:
Conclusion: From 5 minutes to a few milliseconds... There's a reason Spatial types were created...