I wanted to do a cool experiment on SSIS that I haven’t tried before. I am really interested in this whole mapping component on SSRS, and I know that most likely the biggest challenge of companies is how to geocode addresses that are already in their databases. So what I wanted to do was to try and retrieve lat/long information from public geocoding web services within SSIS.

Please note that the exercise below is purely for experimentation. Please read the restrictions of each of the geocoding services, and go with whichever is appropriate for your purpose.

Also this is a simplistic code to show you how to get the latitude/longitude information. On your own versions, test thoroughly and add the appropriate exception handlers.

Ok, let’s start.

Step 1 – Look for the appropriate geocoding service.

There are few publicly available ones. Some are limited in content, some are very restricted. Check out the summary here – http://en.wikipedia.org/wiki/Geocoding

For this experiment I am going to use Google Maps v3 API since it’s publicly available, and V3 doesn’t need the Maps API key anymore! Please make sure you read the restrictions of using the Google Maps v3 API before deciding to use it in your applications/systems.

Note that for v3, there are only 2 available output formats: json and xml. Previous version of the API supported other output formats like kml and csv. I am going to stick with XML. (I’ll leave JSON for the next experiment).

Here is an example URL and result: http://maps.googleapis.com/maps/api/geocode/xml?address=Pacific+West+Outlet,Gilroy,California,US&sensor=false

Try to study the structure, it will help later on with the parsing for latitude and longitude.

Step 2 – Prep your package

Create your SSIS package. Drop a data flow task, and add your source. In my case my source is an OLE DB source pointing to my SQL Server table that has address fields already.


 

Step 3 – Code the transformation inside your Script Component

a. Still inside the data flow task, drop a Script Component task that you will use for Transform. Drag the data flow path from your source to the script component.

b. Specify your input parameters, and specify Latitude, Longitude and GeocodeURL as Read/Write. I am only adding the GeocodeURL because I want to be able to spot check my numbers later on.


c. Edit your script in the Script Component. Create a class that will contain your GeoCoding class. Note I’m using C# in my scripts.

using System;
using System.Collections.Generic;
using System.Text;
//added these
using System.Data;
using System.Net;
using System.Web;
using System.Xml;
//  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
//  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
//  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
//  PARTICULAR PURPOSE.
namespace sqlmusings
{
    public interface IGeoLocation
    {
        double latitude { get; set; }
        double longitude { get; set; }
        string geocodeurl { get; set; }
    }
    public struct GeoLocation : IGeoLocation
    {
        private double _latitude;
        private double _longitude;
        private string _geocodeurl;
        public GeoLocation(double latitude, double longitude, string geocodeurl)
        {
            _latitude = latitude;
            _longitude = longitude;
            _geocodeurl = geocodeurl;
        }
        public double latitude
        {
            get { return _latitude; }
            set { _latitude = value; }
        }
        public double longitude
        {
            get { return _longitude; }
            set { _longitude = value; }
        }
        public string geocodeurl
        {
            get {return _geocodeurl; }
            set { _geocodeurl = value; }
        }
    }
    public class GeoCode
    {
        const string _googleUri = "http://maps.googleapis.com/maps/api/geocode/xml?address=";
        //sample
        //http://maps.googleapis.com/maps/api/geocode/xml?address=1128+West+Hastings+Street,+Vancouver+British+Columbia+Canada&sensor=true
        private static Uri GetGeoCodeURI(string address)
        {
            address = HttpUtility.UrlEncode(address);
            string uri = String.Format("{0}{1}&sensor=false", _googleUri, address);
            return new Uri(uri);
        }
        public static GeoLocation GetCoordinates(string address)
        {
            WebClient wc = new WebClient();
            Uri uri = GetGeoCodeURI(address);
            try
            {
                string geoCodeInfo = wc.DownloadString(uri);
                XmlDocument xmlDoc = new XmlDocument();
                xmlDoc.LoadXml(geoCodeInfo);
                string status = xmlDoc.DocumentElement.SelectSingleNode("status").InnerText;
                double geolat = 0.0;
                double geolong = 0.0;
                XmlNodeList nodeCol = xmlDoc.DocumentElement.SelectNodes("result");
                foreach (XmlNode node in nodeCol)
                {
                    geolat = Convert.ToDouble(node.SelectSingleNode("geometry/location/lat").InnerText, System.Globalization.CultureInfo.InvariantCulture);
                    geolong = Convert.ToDouble(node.SelectSingleNode("geometry/location/lng").InnerText, System.Globalization.CultureInfo.InvariantCulture);
                }
                return new GeoLocation(geolat, geolong, uri.ToString());
            }
            catch
            {
                return new GeoLocation(0.0, 0.0, "http://");
            }
        }
    }
}

d. Still inside the script editor of the Script Component, now switch to your main.cs. Add the code that calls and uses your Geocoding class.

/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using SC_65a998228f6546ed965116b9f8b76953.csproj;
//  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
//  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
//  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
//  PARTICULAR PURPOSE.
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }
    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        /*
          Add your code here
        */
        sqlmusings.GeoLocation geolocation = sqlmusings.GeoCode.GetCoordinates(Row.AddressLine1 + "," + Row.City + "," + Row.State + "," + Row.Country);
        Row.Latitude = Convert.ToDecimal(geolocation.latitude);
        Row.Longitude = Convert.ToDecimal(geolocation.longitude);
        Row.GeoCodeURL = geolocation.geocodeurl;
   }
}

e. Add your destination. In my case, I’ll just drag a multicast, and add a data viewer, because I just want to see the results.



f. Test your package.



There you go, we’re done with our cool SSIS experiment!
Cool eh?

References:

http://rickcaminiti.com/misc/google-geocoding-ssis/
http://blog-of-darius.blogspot.com/2011/02/geocoding-with-c-and-google-geocoding.html

VN:F [1.9.22_1171]
Rating: 8.9/10 (15 votes cast)
VN:F [1.9.22_1171]
Rating: +4 (from 4 votes)
Geocode Locations Using Google Maps v3 API and SSIS, 8.9 out of 10 based on 15 ratings  
Be Sociable, Share!
  • Tweet