Location plays an
inportant role in many aspects of our daily lives. In Business
Intelligence, every so often you will encounter geospatial data. From
a BI perspective, this geospatial data is different from traditional
data stacked in source systems. It will even look different, or even
unreadable when you query it. But there's a growing awareness that
location is valuable and useful for business purposes. So as a BI
specialist you can deliver added business value by using this data in
reporting, analytics and presentation.
What is geospatial
data anyway? Geospatial data is used to locate an item or event on,
below or over the earth's surface. Whether it's the location of a
store, the perimeter of a building or the route of a delivery truck,
if you can pinpoint it anywhere on earth it is geospatial data. It
doesn't even have to represent a real-world object. County borders,
property boundaries and environmental zones are clearly geospatial
objects.Though you can't literally point to them and say “there it
is” you can print them on a map.
The geospatial
location of an object is often stored as one of three geometric
representations: a point, line or polygon. In essence, all geography
consists of coordinates. A point is simply a coordinate pair. Lines
are made of 2 or more coordinate pairs linked in a specific order,
while polygons are made of 1 or more connecting lines. There are more
elaborate type objects like arcs, multigeometries, raster grids and
TINs (and there's also 3D).
But let's keep it
simple for now. In BI, geospatial data can be encountered in roughly three
different ways.
Geocoding
Any object that
contains a reference to an address, town, zipcode, county, state or
country can be located on the map using that reference. There are
several geocoding services out there that are happy to provide you
with the location of objects. You simply enter your reference into
the geocoding service and it will return the location as point, line
or polygon. You don't need to maintain geography. Of course your
reference needs to be complete to return a valid result. For example,
if the name of a state is misspelled, the geocoding service might
return no or an incorrect result.
Below is an example of a geocoding request to the Google Maps Geocoding API with the output it generates. You can see the location coordinates returned by the service, as well as some additional geographical info.
https://maps.googleapis.com/maps/api/geocode/json?address=bosscheweg+56+boxtel+the+netherlands
{
"results" : [
{
"address_components" : [
{
"long_name" : "56",
"short_name" : "56",
"types" : [ "street_number" ]
},
{
"long_name" : "Bosscheweg",
"short_name" : "Bosscheweg",
"types" : [ "route" ]
},
{
"long_name" : "Boxtel",
"short_name" : "Boxtel",
"types" : [ "locality", "political" ]
},
{
"long_name" : "Boxtel",
"short_name" : "Boxtel",
"types" : [ "administrative_area_level_2", "political" ]
},
{
"long_name" : "Noord-Brabant",
"short_name" : "NB",
"types" : [ "administrative_area_level_1", "political" ]
},
{
"long_name" : "Netherlands",
"short_name" : "NL",
"types" : [ "country", "political" ]
},
{
"long_name" : "5283 WB",
"short_name" : "5283 WB",
"types" : [ "postal_code" ]
}
],
"formatted_address" : "Bosscheweg 56, 5283 WB Boxtel, Netherlands",
"geometry" : {
"location" : {
"lat" : 51.60054179999999,
"lng" : 5.323087699999999
},
"location_type" : "ROOFTOP",
"viewport" : {
"northeast" : {
"lat" : 51.60189078029149,
"lng" : 5.324436680291502
},
"southwest" : {
"lat" : 51.59919281970849,
"lng" : 5.321738719708497
}
}
},
"place_id" : "ChIJWxp4d1noxkcRm32m9pvecrs",
"types" : [ "street_address" ]
}
],
"status" : "OK"
}
Geocoding services
also work the other way around. You can give them a location (e.g. a
coordinate pair) and request for the return of an address, town,
county, state or country. This is called reverse geocoding.
Professional geocoding services provide additional functionality like
returning the distance or shortest route between two objects. Both
Google (Google
Maps Geocoding API ) and Microsoft (Geocode
Dataflow API) provide a geocode service, but there are other free
and/or paid services. Free geocoding services can have a limitation
in the number of feature requests you can place with them.
In addition,
geocoding services are integrated in off-the-shelf BI solutions. An
object can be presented on the map using on-the-fly geocoding. You
simply throw your location reference onto the map component and it
automatically adds location through a geocoding service.
Coordinate pairs
Another common form
of geospatial data is the usage of coordinate pairs to store the
location of an object. In a table, this is noticeable by the presence
of two columns which each carry one part of the coordinate pair. One
column contains an X-value or latitude, the other contains an Y-value
or longitude. A big advantage of coordinate pairs is that both
columns are number type columns that can easily be read or edited by
anyone. The downside is that coordinate pairs can only be used for
point locations. But for many uses, like store locations, addresses
or incident reports, this is more than enough.
Coordinate pairs are
also the lifeblood of big data. Every post, tweet, photo or video
made with your smartphone can (and probably is) geographically tagged
by adding the latitude and longitude recorded by your phones GPS
unit. So as a business intelligence specialist you will be running
into this data once in a while.
Using
coordinate pairs is fairly straightforward. Most BI tools for
reporting and dashboarding will support some kind of mapping
component that accept two columns for location reference. Even
spreadsheet software has plugins enabling the use of maps with
coordinate pairs. The usage of coordinate pairs is inherently limited
to plotting placemarkers on the map. These placemarkers can carry
additional information through use of different sizes, colors and
fonts.
Spatial attributes
We're entering the real spatial domain now. Some data sources contain
spatial information stored in table attribute columns. These columns
will contain actual points, lines and polygons. From the outside, the
columns will probably look like to contain some unreadable BLOB type
information. But in reality, these columns contain what we call
spatial attributes. Below are examples of a geometry column in Oracle, MS SQL Server and PostgreSQL.
These spatial attributes can be used for complex geospatial analysis
through querying or with use of specialized geospatial tooling.
Nowadays, all major RDBMS vendors support spatial information through
a special datatype, which allows you to define attribute columns
specifically for storage of spatial information. Besides the
datatype, these products will also offer functions and procedures for
definining spatial relationships and spatial indexes.
When using spatial attributes, some basic knowledge of geospatial
techniques is required. Spatial attributes will contain more relevant
information than the point, line or polygon alone, like a defined
coordinate system, map projection, rectangular bouding box, precision
settings and some other stuff. Storage can be an issue too as spatial
data can grow quite large.
All this makes handling of true spatial attributes more of a
specialty, but it also gives you the most possibilities in terms of
analysis and defining relationships between datasets. I will elaborate on the possibilities in future posts.
In conclusion
These three types of geospatial data all have their pros and cons.
But they don't mutually exclude one another. In fact, they complement
and even merge with one another on a regular basis. If you obtain the
coordinate pair of an object with a request to a geocoding service,
it is fairly simple to subsequently store the coordinate pair in your
system. And coordinate pairs can be translated into spatial point
attributes with the same amount of ease. Coordinate pairs can also be derived from spatial attributes. Furthermore, mapping
components can also use spatial attributes for visualisation
purposes. So whatever geospatial data you encounter, you always have
options.
If you have any comments or suggestions, feel free to do so below!