The Growing Need for Spatial ETL
On Data, Technologies and Convergence
Spatial ETL (extracting, transforming, and loading) tools have been around for over a decade. Yet only in the past few years has their true strategic value emerged in geospatial initiatives around the globe. Now more than ever, it’s the data that’s pushing the capabilities of Spatial ETL tools into a whole new dimension. This article describes how the future of these critical data moving tools is being shaped by emerging data model, source, and format requirements alongside the trends in web and traditional IT technologies.
Figure 1. The number of formats that Spatial ETL tools have had to add support for continues to increase, as demonstrated by this timeline of supported formats in FME, a Spatial ETL tool built by Safe Software.
When many people think about Spatial ETL, they usually think about translating data between formats. A common response is “Spatial ETL is used to move data from format X to format Y!” Certainly, as Figure 1 indicates, Spatial ETL tools must support an ever-increasing number of formats in order to continue to meet the diverse and evolving needs of organizations around the world.
While format support is most definitely required for moving data (if you can’t read or write data, then you can’t apply the full power of Spatial ETL), Spatial ETL is much more than just format translation technology. This point was really driven home to the people at Safe Software by a recent poll of its users. It showed that the number one use of the company’s Spatial ETL tool was to convert data from ESRI Shape to ESRI Shape! For MapInfo users, the number one destination was MapInfo, and the same pattern was found for AutoCAD and Microstation users! This was a very humbling discovery.
Figure 2. Traditionally, Spatial ETL tools simply extracted, transformed, and loaded data from one format to another. Today’s Spatial ETL tools must be able to provide full data transformation capabilities including format translation, data model transformation, and data integration, along with full distribution capabilities. They must also be able to support a wide variety of data types, from CAD to vector to 3D and so on.
The Backbone of Spatial ETL
So if Spatial ETL today isn’t just about format translation, then what is it about? As the user poll clearly indicated, Spatial ETL is about reconciling data model differences. This reconciliation between the source data model and the destination data model is performed by the transformation capabilities in Spatial ETL. These transformations can be as simple as a coordinate reprojection or more sophisticated, such as combining multiple data sources while changing both attribution and geometry structures.
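To make the idea of a data model transformation concrete, here is a minimal sketch in Python that combines the two kinds of change mentioned above: renaming attributes from a source schema to a destination schema, and reprojecting coordinates. The field names (`ROAD_NM`, `SPD_LIM`) and the feature layout are hypothetical, and the reprojection shown is the standard spherical Web Mercator formula, chosen only because it is short. It is an illustration, not how FME or any other Spatial ETL tool implements this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical Web Mercator

# Hypothetical mapping from source attribute names to destination names.
FIELD_MAP = {"ROAD_NM": "name", "SPD_LIM": "speed_limit_kmh"}


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Reproject a longitude/latitude pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def transform_feature(source):
    """Map a source feature onto the destination data model.

    Attributes not listed in FIELD_MAP are dropped; every coordinate
    is reprojected.
    """
    attributes = {
        FIELD_MAP[key]: value
        for key, value in source["attributes"].items()
        if key in FIELD_MAP
    }
    coordinates = [
        lonlat_to_web_mercator(lon, lat) for lon, lat in source["coordinates"]
    ]
    return {"attributes": attributes, "coordinates": coordinates}
```

Even in this toy form, the format of the data never changes (it is a Python dictionary on both sides), yet the result is usable by a system that expects a different schema and coordinate system, which is the Shape-to-Shape situation the poll revealed.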
Ultimately, users need to be able to use data with the tools of their choice, and these tools need to be able to consume data in both the correct format and the appropriate data model. Spatial ETL enables this effective communication of spatial information so users can leverage the power of their spatial data assets.
Figure 3. ETL tools are facilitating the convergence between traditional IT and GIS.
Convergence
In the early days of Spatial ETL, the tools were limited to one type of spatial data. Some focused on vector formats, which encompassed GIS and CAD, while others focused on raster formats. Today’s Spatial ETL tools must answer to all worlds, as convergence is becoming a common requirement at multiple levels.
First, there’s a convergence of different spatial data types because state-of-the-art spatial/GIS systems now support multiple types of spatial data. For example, Oracle and ESRI database technologies now support vector, raster, tabular, point cloud and 3D data types at the database level. With these database technologies comes a whole new set of applications that can, for the first time, exploit multiple types of spatial data.
Populating these databases is a challenge for users since they must retrieve data from traditional spatial sources and push it into these new databases in order to effectively exploit the new set of spatial tools available to them. To satisfy this requirement, users need a Spatial ETL solution that supports multiple types of spatial data in order to be capable of extracting data from multiple sources, transforming it into the organization’s chosen data model, and loading it into the database (see Figure 2).
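The extract, transform, and load steps described above can be sketched with nothing but the Python standard library. This is an illustrative toy under stated assumptions: the two inline sources, their field names, and the `assets` table are all hypothetical, and an in-memory SQLite database stands in for the spatial databases named in the article.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources that describe the same kind of asset differently.
CSV_SOURCE = "id,name,x,y\n1,Hydrant A,10.0,20.0\n2,Hydrant B,11.5,21.0\n"
JSON_SOURCE = '[{"ident": 3, "label": "Hydrant C", "pos": [12.0, 22.5]}]'


def extract_csv(text):
    """Extract: read rows from CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: read records from JSON text."""
    return json.loads(text)


def transform_csv(row):
    """Transform: map a CSV row onto the chosen (id, name, x, y) model."""
    return (int(row["id"]), row["name"], float(row["x"]), float(row["y"]))


def transform_json(record):
    """Transform: map a JSON record onto the same (id, name, x, y) model."""
    x, y = record["pos"]
    return (int(record["ident"]), record["label"], float(x), float(y))


def load(conn, rows):
    """Load: write the unified rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assets "
        "(id INTEGER PRIMARY KEY, name TEXT, x REAL, y REAL)"
    )
    conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load over both sources; return row count."""
    rows = [transform_csv(r) for r in extract_csv(CSV_SOURCE)]
    rows += [transform_json(r) for r in extract_json(JSON_SOURCE)]
    load(conn, rows)
    return len(rows)
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` loads all three records into one table. The point of the sketch is the shape of the work: each source needs its own transform into the organization’s chosen model before a single load step can run.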
Secondly, there’s a convergence between traditional IT and what could historically be considered the mapping or GIS department. With the advent of Google Earth, users are now exposed to spatial data in ways that only a few years ago were impossible. The support for spatial data types within corporate databases such as Oracle and ESRI Geodatabase further facilitates this convergence, as now there is a single data store for all corporate data. This has begun to blur the traditional line between mapping and IT applications. Nowadays, standard IT systems incorporate maps as a new way for users to visualize and analyze their data. In turn, GIS and mapping tools are beginning to leverage standard IT data, thereby providing extra value to organizations.
Constructive Solid Geometries (CSG)
CSG enables a model with very complicated geometry to be built out of relatively simple objects. It is typically used in 3D systems for BIM and CAD. The illustrations below are from Wikipedia and show how a very complex object can be constructed from very simple objects.
New Data Types
The geospatial industry has traditionally focused on exterior spaces such as countries, counties, cities, and parcels. Up until recently, geospatial data would end at the building footprint and not contain true 3D geometries but simply elevation, or 2 ½ D data. In the past year, the geospatial community’s interest in Building Information Model (BIM) data has increased substantially. This is reflected in the fact that Oracle and ESRI have extended their databases to support this growing requirement for 3D data storage.
BIM and 3D data have historically been the domain of the Architecture, Engineering and Construction (AEC) community. The AEC community has great experience working with BIM models and is able to create entire 3D models of building construction projects, greatly improving the efficiency of the construction process. Convergence between the world of BIM (interior) data and the world of GIS (exterior) data promises a marriage that will open up a whole new set of opportunities. Imagine making a city’s entire infrastructure, and not just the exterior infrastructure, available at the fingertips of its occupants, builders, and emergency responders! This marriage can revolutionize emergency response and improve many other facets of city living. For example, a firefighter will know exactly how much hose is required to reach a specific area within a building on fire. This potentially life-saving information is a direct result of combining the power of GIS and BIM. Google and Microsoft have also inspired people’s imaginations by building 3D views of cities. Currently, these models are exterior views of buildings which construct a model of the “cityscape.” The future possibilities for convergence are clear: complete cities with both “interior” and “exterior” information available in an integrated environment.
The key to making the marriage of BIM, 3D and GIS work is the data model transformation capabilities of Spatial ETL tools. This adds a whole new dimension to the development process for Spatial ETL tools like FME. Embracing a new data type requires Spatial ETL vendors to first identify leading data sources and targets to ensure the most significant impact in the new market. In its first foray into 3D, Safe Software found that one of the best sources of 3D building data is a format called IFC (Industry Foundation Classes). This format is an open specification developed by the International Alliance for Interoperability (IAI) to facilitate interoperability in the building industry. According to Safe Software, the best data targets are the leading databases with which they were already familiar, such as Oracle and ESRI Geodatabase, and Adobe PDF for its impressive support for 3D data.
When Safe Software began developing support for 3D data, they first analyzed the 3D models of several different geospatial systems so that they had some level of confidence in the new 3D data model that they had created within their technology. Since they had added support for new data types before, they were not surprised to find out that some of their initial IFC to PDF translation tests were missing data!
It turns out that IFC supports a 3D concept called Constructive Solid Geometries (CSG, see box section) which the other systems don’t support. The challenge was to resolve this data model conflict so that the data could be moved from a system that supports CSG to systems that do not. To be effective, the solution also had to ensure that CSG objects were not lost when moving data between two CSG-capable systems.
This geometric requirement is yet another example of the data model inconsistencies that must be resolved by Spatial ETL tools in order to achieve effective communication from one system to another. It also illustrates that the data model transformation which occurs in Spatial ETL tools is often not just about attribution, but also about geometric representation.
Boolean
Boolean difference: The subtraction of one object from another.
Boolean intersection: The portion common to both objects.
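The Boolean operations in the box section can be illustrated with a small Python sketch. It is only one of several ways to model CSG: here a solid is represented by a point-membership test, and each Boolean operation builds a new test out of two existing ones. The class and function names (`Solid`, `box`, `sphere`, `voxelize`) are hypothetical. The `voxelize` helper shows one crude way a CSG model could be evaluated into an explicit set of cells for a target that has no CSG concept; it is an assumption made for illustration, not a description of how FME resolves the IFC case.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Solid:
    """A solid described only by a test for whether a point lies inside it."""

    contains: Callable[[Point], bool]

    def union(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) or other.contains(p))

    def difference(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) and not other.contains(p))

    def intersection(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) and other.contains(p))


def box(lo: Point, hi: Point) -> Solid:
    """Axis-aligned box between two opposite corners."""
    return Solid(lambda p: all(l <= c <= h for l, c, h in zip(lo, p, hi)))


def sphere(center: Point, radius: float) -> Solid:
    """Sphere with the given centre and radius."""
    return Solid(
        lambda p: sum((c - q) ** 2 for c, q in zip(p, center)) <= radius ** 2
    )


def voxelize(solid: Solid, lo: Point, hi: Point, steps: int) -> List[Point]:
    """Return the centres of grid cells that fall inside the solid."""
    size = [(h - l) / steps for l, h in zip(lo, hi)]
    cells = []
    for i in range(steps):
        for j in range(steps):
            for k in range(steps):
                centre = (
                    lo[0] + (i + 0.5) * size[0],
                    lo[1] + (j + 0.5) * size[1],
                    lo[2] + (k + 0.5) * size[2],
                )
                if solid.contains(centre):
                    cells.append(centre)
    return cells
```

For example, `box((-1, -1, -1), (1, 1, 1)).difference(sphere((0, 0, 0), 1.2))` describes a cube with a spherical hollow. A CSG-capable target could store that recipe directly, while a non-CSG target would need some evaluated form such as the cell list from `voxelize`, with the attendant loss of the original construction history.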
Extending Spatial ETL to the Web
No discussion surrounding Spatial ETL would be complete without addressing the role that Spatial ETL plays within upcoming Spatial Data Infrastructure (SDI) projects. As already expressed, the role of Spatial ETL is always about getting data from one or more data stores into a form that can immediately be used. Traditional Spatial ETL has typically been a batch process which is run periodically. While there have been instances where it has been used to transform spatial data in a live process, these projects have been infrequent because of the burden they imposed on the GIS department.
Web service technologies and new Spatial ETL server solutions have now come together, making dynamic, or on-the-fly, Spatial ETL available for the first time. Dynamic Spatial ETL enables web services to serve data to users in a data model that is totally different from the data model of the underlying data. It is even possible for a single web site to provide different user communities with different views of the data through the power of dynamic Spatial ETL. This is fundamental for an SDI project to be effective, as different user communities require different views of the world and have different priorities for what they need to see.
Until this technology emerged, SDI initiatives were reminiscent of the Model T Ford, in which customers could have any color they wanted as long as it was black. But as we learned with traditional Spatial ETL: when it comes to data, one size doesn’t fit all.
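A minimal Python sketch of this idea follows. A single stored data set is remodeled at request time according to a per-community view definition. The stored records, the community names, and the view layout are all hypothetical, and a real Spatial ETL server would sit behind a web service rather than a plain function call.

```python
# One underlying data model, stored once.
STORED_FEATURES = [
    {"id": 1, "kind": "hydrant", "pressure_kpa": 450, "street": "Main St"},
    {"id": 2, "kind": "park", "pressure_kpa": None, "street": "Oak Ave"},
]

# Hypothetical view definitions: which feature kinds each community sees,
# and how stored field names map to the names that community expects.
VIEWS = {
    "fire_service": {
        "kinds": {"hydrant"},
        "fields": {
            "id": "hydrant_id",
            "pressure_kpa": "pressure_kpa",
            "street": "location",
        },
    },
    "public": {
        "kinds": {"hydrant", "park"},
        "fields": {"kind": "category", "street": "street"},
    },
}


def serve(community):
    """Remodel the stored features on the fly for one user community."""
    view = VIEWS[community]
    return [
        {out_name: feature[src_name]
         for src_name, out_name in view["fields"].items()}
        for feature in STORED_FEATURES
        if feature["kind"] in view["kinds"]
    ]
```

Here `serve("fire_service")` returns only hydrants with pressure values, while `serve("public")` returns every feature with a reduced set of attributes. Nothing is copied or pre-converted: each request is answered by transforming the same underlying records.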
The need for specific data models for distinct communities is best demonstrated by the INSPIRE project in the European Union. This project has the challenge of building a pan-European SDI that will serve users in multiple countries. To be effective, the INSPIRE SDI must be able to serve the same data to users who speak different languages, and thus must also serve the data in different languages. If ever there was a need for a single system to remodel data on-the-fly, then this is it.
While there are standards that all Spatial ETL servers must support, there is also a proliferation of web protocols, or formats. Here, once again, the winning approach is for Spatial ETL tools to support as many different web formats as possible, for example Open Geospatial Consortium (OGC) protocols, GeoRSS, GeoJSON, KML, and GML.
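As a small illustration of moving between two of the web formats named above, the following Python sketch converts a single GeoJSON Point feature into a KML Placemark using only the standard library. The function name is hypothetical and the sketch handles only Point geometries with an optional `name` property. It relies on two facts about the formats: both order coordinates as longitude then latitude, and KML 2.2 elements live in the `http://www.opengis.net/kml/2.2` namespace.

```python
import json
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def geojson_point_to_kml(geojson_text):
    """Convert one GeoJSON Point feature (as text) to a KML document string."""
    feature = json.loads(geojson_text)
    geometry = feature["geometry"]
    if geometry["type"] != "Point":
        raise ValueError("this sketch only handles Point geometries")
    lon, lat = geometry["coordinates"][:2]

    # Serialize KML elements with a default namespace instead of a prefix.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")

    # GeoJSON allows "properties" to be null, so fall back to an empty dict.
    properties = feature.get("properties") or {}
    name = ET.SubElement(placemark, f"{{{KML_NS}}}name")
    name.text = str(properties.get("name", ""))

    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    coordinates = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
    coordinates.text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")
```

Each additional format pair needs the same kind of mapping between structures, which is why broad format support is a large and ongoing engineering effort for Spatial ETL vendors.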
A Spatial ETL vendor’s role is not to try to predict which formats are going to succeed in the marketplace, but rather to support as many of the leading technologies as possible, thereby allowing the market to decide. When it comes to web technologies, advancements move very quickly and are incredibly exciting to watch.
While one major use of Spatial ETL server technologies is to make data available to web applications and users, conversely, web technologies can be used as their own source for spatial data and services. An example of this is MapQuest’s recent release of a new web API that provides users with a set of routing and mapping capabilities. There are many other new web services being released all the time. In fact, the OGC has just announced its Web Processing Service (WPS) standard which will enable more and more web services to become available.
The Future for Spatial ETL
Throughout the industry, we are seeing an explosion of data sources in a wide variety of areas, from new data types to databases and web services. At the same time, we are seeing a great increase in the number of applications that require access to that data. With this proliferation of applications and data, the need for Spatial ETL will keep growing because, now more than ever, it is all about the data.
Don Murray (don.murray@safe.com) is President of Safe Software.
Have a look at www.safe.com


















