This project is read-only.

How to load data from non-spatial sources

This tutorial uses some features will be available in 0.2 Beta release or already available in latest source code. See the differences at the end of the document

MsSqlSpatial can import spatial data directly from ESRI shapefile format and PostGIS tables. However, often you will receive data in a non-spatial form such as fixed-length or comma delimited data with latitude and longitude fields. To take full advantage of MsSqlSpatial abilities, you will want to create geometry columns in your new table and update that column using the longitude/latitude fields you have available.

General Note: All the command statements that follow should be run from the SQL Server Management Studio or any other SQL Server administrative tool you have available. If you are a command line freak, you can use the sqlcmd command line tool packaged with SQL Server.

Getting the data

For this exercise, we will use US zip code tabulation areas (ZCTAs). The techniques here will apply to any data you get actually.

First step is to download the file ZCTAs.zip from US Census website and uncompress it to a location in the server. For this tutorial we'll use: C:\usa_census\zcta5.txt

Importing the data into SQL Server

SQL Server 2005 comes with several methods to import data from a text file.
In this example, we will use OPENROWSET. Since the ZCTAs data is provided in fixed-width format, we will define a XML BCP Format file accordingly. For data in other formats such as DBF, you'll either want to convert it using OPENROWSET, SSIS or tools such as Excel or Access (which allows you to link any tables or do a straight import and export to any ODBC compliant database such as SQL Server 2005), or any third party tool that will allow you to import from one format to another.

Create a XML Format file

Look at the ZCTA table layout spec. Below is the XML BCP Format file I have built for importing this data. Save it in a file somewhere on your server. For our example we'll save it to: C:\usa_census\zcta5.format.xml

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
		   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
	<RECORD>
		<FIELD ID="f01" xsi:type="CharFixed" LENGTH="2"  />
		<FIELD ID="f02" xsi:type="CharFixed" LENGTH="5" />
		<FIELD ID="f03" xsi:type="CharFixed" LENGTH="59" />
		<FIELD ID="f04" xsi:type="CharFixed" LENGTH="9"/>
		<FIELD ID="f05" xsi:type="CharFixed" LENGTH="9" />
		<FIELD ID="f06" xsi:type="CharFixed" LENGTH="14" />
		<FIELD ID="f07" xsi:type="CharFixed" LENGTH="14" />
		<FIELD ID="f08" xsi:type="CharFixed" LENGTH="12" />
		<FIELD ID="f09" xsi:type="CharFixed" LENGTH="12" />
		<FIELD ID="f10" xsi:type="CharFixed" LENGTH="10" />
		<FIELD ID="f11" xsi:type="CharFixed" LENGTH="11" />
		<FIELD ID="f99" xsi:type="CharTerm" TERMINATOR="\r\n" />
	</RECORD>
	<ROW>
		<COLUMN SOURCE="f01" xsi:type="SQLCHAR" LENGTH="2" NAME="state" />
		<COLUMN SOURCE="f02" xsi:type="SQLCHAR" LENGTH="5" NAME="zcta" />
		<COLUMN SOURCE="f03" xsi:type="SQLVARYCHAR" LENGTH="100" NAME="junk" />
		<COLUMN SOURCE="f04" xsi:type="SQLBIGINT" NAME="population_tot" />
		<COLUMN SOURCE="f05" xsi:type="SQLBIGINT" NAME="housing_tot" />
		<COLUMN SOURCE="f06" xsi:type="SQLFLT8" NAME="water_area_meter" />
		<COLUMN SOURCE="f07" xsi:type="SQLFLT8" NAME="land_area_meter" />
		<COLUMN SOURCE="f08" xsi:type="SQLFLT8" NAME="water_area_mile" />
		<COLUMN SOURCE="f09" xsi:type="SQLFLT8" NAME="land_area_mile" />
		<COLUMN SOURCE="f10" xsi:type="SQLFLT8" NAME="latitude" />
		<COLUMN SOURCE="f11" xsi:type="SQLFLT8" NAME="longitude" />
	</ROW>
</BCPFORMAT>


Note I broke out the Name field further into first 5 for zcta and the rest for a junk column you could drop after.

Importing the data

Now copy the data into the a new table using OPENROWSET. Remember this statement works using the SQL Server 2005 service so the files location must be specified relative to the server.

SELECT *
INTO zctas
FROM OPENROWSET(BULK 'C:\usa_census\zcta5.txt',
FORMATFILE = 'C:\usa_census\zcta5.format.xml') AS Z

Creating and populating a geometry column

Create the geometry column

To create the geometry column, use the ST.AddGeometryColumn stored procedure provided by MsSqlSpatial. This will add a geometry column to the specified table as well as adding a record to the ST.GEOMETRY_COLUMNS metadata table and creating useful constraints on the new column.

EXECUTE ST.AddGeometryColumn 'dbo', 'zctas', 'thepoint_lonlat', 4269, 'POINT'

The above code will create a geometry column named thepoint_longlat in the table zctas that validates to make sure the inputs are points in the coordinate systems coded 4269 (NAD83 longlat).

Populate the geometry column using the longitude and latitude columns

UPDATE zctas
SET thepoint_lonlat = ST.Point(longitude, latitude, 4269)

The above code will convert every longitude/latitude pair to a MsSqlSpatial geometry object of spatial reference SRID 4269.

You can't just put any arbitrary SRID in there and expect the system to magically transform to that. The SRID you specify has to be the reference system that your text coordinates representation is in.

Transforming to another spatial reference system

The above is great if you want your geometry in longlat spatial reference system. In many cases, longlat is not terribly useful. For example if you want to do distance queries with your data, you don't want your distance returned back in longlat. You want it in a metric that you normally measure things in.

In the code below, we will create a new geometry column that holds points in the "NAD83 / BC Albers" reference system and then updates that column accordingly.

EXECUTE ST.AddGeometryColumn 'dbo', 'zctas', 'thepoint_meter', 3005, 'POINT'

GO

UPDATE zctas
SET thepoint_meter = ST.Transform(ST.Point(longitude, latitude, 4269), 3005)

Compatibility with MsSqlSpatial versions prior to 0.2

This tutorial used some features will be available in 0.2 Beta release or already available in latest source code. If you are using 0.1.1 or a previous version, take in account the following:
ST.Point function
Replace the occurences of ST.Point function. For example:
ST.Point(longitude, latitude, 4269)

Replace it by:
ST.PointFromText('POINT(' + CAST(longitude AS VARCHAR) + ' ' + CAST(latitude AS VARCHAR) + ')', 4269)
Coordinate reference system
EPSG:3005, "NAD83 / BC Albers" is not available prior to 0.2.x. Choose another that fit your purpose

Acknowledgement

The present tutorial was adapted from this work. Thanks to Paragon Corporation for their authorization.

Last edited Sep 22, 2007 at 2:34 AM by rstuven, version 3

Comments

ChrisProto Jan 8, 2008 at 10:00 PM 
For SQL Server, casting the float as decimal(10,1) does not work as intended because that will allow only one digit after the decimal point! Instead, use this: convert(varchar, convert(decimal(9,6), latitude))
This defines a total of 9 digits, with 6 following the decimal point.

rstuven Sep 22, 2007 at 2:36 AM 
Corrected. Thank you!

milovanderlinden Sep 20, 2007 at 2:23 PM 
When creating points from real values, a new problem appears. reals are rounded!!!

Therefor, the correct function when your lat/lon values are stored as real is:

ST.PointFromText('POINT(' + CAST(CAST(locX AS DECIMAL(10,1)) AS VARCHAR) + ' ' + CAST(CAST(locY AS DECIMAL(10,1)) AS VARCHAR) + ')', 28992)

milovanderlinden Sep 20, 2007 at 2:21 PM 
ST.PointFromText('POINT(' + CAST(longitude AS VARCHAR) + ', ' + CAST(latitude AS VARCHAR) + ')', 4269)

should be

ST.PointFromText('POINT(' + CAST(longitude AS VARCHAR) + ' ' + CAST(latitude AS VARCHAR) + ')', 4269)

There was a , between the longitude and latitude that shouldn't be there