ST.RelateQuery speed

Jul 31, 2007 at 4:42 PM
I am attempting to reference a point to a provicial boundry polygon to determine which province it it in. I am using the ST.RelateQuery and 'Within' to do so. I have only 12 boundries. It takes 3 seconds to run the query.
I can run the ST.Is WithinDistanceQuery on a table with 411,000 lines with a performance of under a second.

What can I do to improve performance?
One thought is to make a bunch of smaller polygons to import, instead of 12 big ones.
Or perhaps there is a more appropriate query.

Thanks!
Aug 2, 2007 at 5:22 AM
You can start your query by removing all polygons whose bounding box doesn't overlap the point, by using the min/max columns. Basically if the min is higher than and max is lower than the point (in terms of both X and Y). This should minimize some of the processing needed for the relate query.
Coordinator
Aug 2, 2007 at 3:04 PM

deepsheep wrote:
I am attempting to reference a point to a provicial boundry polygon to determine which province it it in. I am using the ST.RelateQuery and 'Within' to do so. I have only 12 boundries. It takes 3 seconds to run the query.
I can run the ST.Is WithinDistanceQuery on a table with 411,000 lines with a performance of under a second.


Could you post both queries to compare them, please?

Best regards,
Ricardo Stuven.
Aug 2, 2007 at 4:58 PM
Sorry, I've done a work around already and didn't post it!

The slow query I was using was based off a post I found in the boards:
ST.RelateQuery('boundaries', 'thegeom', ST.GeomFromText('POINT(' + CAST(ISNULL(@long,0) AS nvarchar(15)) + ' ' + CAST(ISNULL(@lat,0) AS nvarchar(15)) + ' )', - 1), 'Within')

The work around that was far faster was to create a Stored Procedure that does the following:
-check the bounding box
-put the results into a cursor
-use the ST.within function for each reslut

This resulted in an exceution time of less than a second - much more inline with what I expected for a table with just 12 polygons.

I have since also discoved that the shp file I was given had a couple of messed up polygons. If there's intrest, I can try again once I get a corrected file.

Sorry for the lack of code, I don't have internet access on that machine.
Aug 13, 2007 at 12:17 AM
Could you provide some code for your faster stored procedure?
Aug 13, 2007 at 4:49 PM
I got the fixed shp file and this stored procedure was still faster. It was taking 1/3 to 1/2 of the total time of the built in one.


CREATE PROCEDURE dbo.GetProvince
-- Add the parameters for the stored procedure here
@latitude float,
@longitude float,
@return nvarchar(100) OUTPUT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare
@result as int,
@provname as varchar(100),
@geom as varbinary(max)
set @return='NONE'
DECLARE vend_cursor CURSOR
FOR select thegeom, province from boundries where @latitude between thegeomEnvelopeMinY and thegeomEnvelopeMaxY and @longitude between thegeomEnvelopeMinX and thegeomEnvelope_MaxX
OPEN vend_cursor
FETCH NEXT FROM vend_cursor
INTO @geom,@provname

WHILE @@FETCH_STATUS = 0
BEGIN
--Select * from ST.RelateQuery(@geom,-1, st.geomfromtext('POINT('+ cast(@long as nvarchar(15)) ' ' cast(@lat as nvarchar(15)) +')', -1),'within')
select @result=st.within (st.geomfromtext('POINT('+ cast(@longitude as nvarchar(15)) ' ' cast(@latitude as nvarchar(15)) +')', -1),@geom)
print @result
print @provname
if (@result>0 )
set @return=@provname

FETCH NEXT FROM vend_cursor
INTO @geom,@provname
END
CLOSE vend_cursor
DEALLOCATE vend_cursor

END
Aug 13, 2007 at 7:06 PM
OK, I duplicated your sp (needed a few fixes here and there to work, see below), but am now wondering how I may be able to use this in a select query to loop through a table of records & produce a resultset of mismatched provinces (i.e. coordinates in a table whose latitude & longitude fall outside that returned from the sp). Is that not your ultimate goal?

Line 17:
FOR select thegeom, province from boundries where @latitude between thegeomEnvelopeMinY and thegeomEnvelopeMaxY and @longitude between thegeomEnvelopeMinX and thegeomEnvelope_MaxX

Line 25:
select @result=ST.Within(ST.GeomFromText('POINT(' + cast(@longitude as nvarchar(15)) + ' ' + cast(@latitude as nvarchar(15)) + ')', -1),@geom)

Aug 13, 2007 at 7:09 PM
Oops. So much for the fixes. Seems the underscores get stripped. Let me try again with an HTML entity:

Line 17:
FOR select thegeom, province from boundries where @latitude between the_geom_Envelope_MinY and the_geom_Envelope_MaxY and @longitude between the_geom_Envelope_MinX and the_geom_Envelope_MaxX
Aug 13, 2007 at 11:09 PM
My goal was to determine which province my point was within. My SP does that perfectly for my tables. And faster than my testing of st.relatequery equivilants.

I'm not too sure I understand your problem. It sounds like you are trying to do the opposite of what I was doing. So you're trying to find the points that DON'T belong to any province. Is that right? If not, could you explain it again?
Aug 14, 2007 at 12:23 AM
Sure. We're probably doing much the same thing, though I have a front-end, client-run, datagrid AJAX app, so need to construct a SELECT statement in the VBScript associated with this application. What I am programming is a function in this app for clients to query all their records in their non-spatial tables with latitude, longitude, and StateProvince (e.g. Alberta or North Dakota) columns for instances where the latitude & longitue aren't actually in the StateProvince they indicated. The result from such a function is a scrollable, AJAX datagrid (hence has to be a SELECT statement) containing all these "geocoding issues" as a I call them. In other words, this is a simple reverse geocoding query. So, what I have is the following where 'NearcticSpatial' is the spatial table & 'CollectionDataAll' is the client's non-spatial table:

FUNCTION ST.GetSTATEPROVINCE
(@long float, @lat float)
RETURNS TABLE
AS
RETURN
SELECT ST_t.STATEPROVINCE
FROM ST.RelateQuery('NearcticSpatial', 'thegeom', ST.GeomFromText('POINT(' + CAST(ISNULL(@long,0) AS varchar(8)) + ' ' + CAST(ISNULL(@lat,0) AS varchar(8)) + ' )', - 1),'Within') AS STq
INNER JOIN NearcticSpatial AS ST_t
ON STq.oid = STt.oid

and the query...

SELECT * From CollectionDataAll t1 OUTER APPLY ST.GetSTATEPROVINCE(t1.Longitude,t1.Latitude) AS q WHERE (t1.StateProvince <> q.STATEPROVINCE OR q.STATEPROVINCE IS NULL) AND (t1.Latitude IS NOT NULL OR t1.Longitude IS NOT NULL)

That can churn through ~250 records / second.

Do you see anything else I can do here?
Aug 14, 2007 at 12:26 AM
Darned underscores. Let me try that again:

FUNCTION ST.GetSTATEPROVINCE
(@long float, @lat float)
RETURNS TABLE
AS
RETURN
SELECT ST_t.STATEPROVINCE
FROM ST.RelateQuery('NearcticSpatial', 'the_geom', ST.GeomFromText('POINT(' + CAST(ISNULL(@long,0) AS varchar(8)) + ' ' + CAST(ISNULL(@lat,0) AS varchar(8)) + ' )', - 1),'Within') AS ST_q
INNER JOIN NearcticSpatial AS ST_t
ON ST_q.oid = ST_t.oid
Coordinator
Aug 14, 2007 at 3:46 PM
Just a side note: Remember to enclose your code snippets in double curly brackets (see CodePlex Wiki Markup Guide).
Aug 15, 2007 at 8:09 PM
Your applicatin is a bit different than what I'm doing. I'm evaluating points one at a time as they become available, where as you are evaluating all past points.

One idea to try though, If your State label in your points table is the same as the State label in the Boundries table, you could join them into a view. Then you can evaluate the point against the state it should be and get all the ones that don't match. Then you skip all the other states. I'm not sure it will be any faster, but I've had alot of luck with Veiws speeding things up.

Aug 16, 2007 at 4:13 AM
deepsheep,

I think I get what you mean, but that's essentially what the OUTER APPLY is doing. See here: http://aspalliance.com/1065_Understanding_APPLY_Clause_in_SQL_Server_2005. I see do other way around a row-row comparison. I'm also perplexed as to why your new query would have been faster that what you found in the forums...I think that was one of my old posts...isn't the poor performance you were experiencing because you had SELECT * instead of specifying the columns you needed?