Whether you are setting out to optimize delivery routes, analyse catchment areas, deploy geomarketing techniques or define the ideal sectorization for your sales force, geocoding is without question the place to start. Geocoding generates the basic units of information for building the geographic dimension of your business and fully exploiting your applications and business processes.
Let’s start with a definition:
Geocoding is an operation that consists of assigning geographic coordinates (X,Y / latitude and longitude) to a postal address so as to be able to locate it in geographical space.
Any object that already has an associated postal address, or that can be associated with a postal address, can be geocoded.
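To make the definition concrete, here is a minimal Python sketch of the idea: an address is normalised and looked up in a reference table that maps addresses to coordinates. The table, the addresses and the coordinates are invented for illustration, and the function names `normalise` and `geocode` are assumptions rather than part of any particular product.

```python
import re

# Hypothetical reference table: normalised address -> (latitude, longitude).
# Addresses and coordinates are made up for illustration only.
REFERENCE = {
    "1 example street exampletown": (48.0, 2.0),
    "25 sample avenue sampleville": (45.5, 4.5),
}


def normalise(address: str) -> str:
    """Lower-case the address, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(cleaned.split())


def geocode(address: str):
    """Return (lat, lon) for a known address, or None when nothing matches."""
    return REFERENCE.get(normalise(address))
```

A real engine goes well beyond exact lookup, but the input and output are the same: an address goes in, and either a coordinate pair or a rejection comes out.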
Taken in isolation, geocoding is a data-enhancing operation that processes input data to match it with stored standard address data using a geocoding engine designed specifically for that purpose. The three main components of this treatment are:
This is a rather simplified description, but it will serve perfectly to reflect and clarify the three main points (developed below) you need to keep in mind before finally taking the step of purchasing and using a geocoding solution.
Even with the most powerful geocoding engine on the market and the most comprehensive reference database available, low-quality input data will prevent the algorithm from making intelligent and accurate matches between the addresses it is presented with and those it can find in the database. When it encounters an
At GEOCONCEPT, we know from experience that businesses tend to overrate the quality of their address data files. That’s why we usually recommend carrying out an audit of the databases to be geocoded, followed by a thorough rationalisation of the existing data to bring it into line with established standards for the various address fields that make up the exact postal location, town or city name, country…
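As an illustration of what such an audit might check, the following Python sketch flags missing fields, malformed post codes and digits in a city name. The field names, the rules and the five-digit post code pattern (the French format) are assumptions to adapt to your own data. This is not a description of GEOCONCEPT's audit process.

```python
import re

# Assumed field names; rename to match your own address file.
REQUIRED_FIELDS = ("street", "postcode", "city", "country")
# Assumption: five-digit post codes, as used in France. Adjust per country.
POSTCODE_PATTERN = re.compile(r"^\d{5}$")


def audit_record(record: dict) -> list:
    """Return the list of problems found in one address record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")
    postcode = str(record.get("postcode") or "").strip()
    if postcode and not POSTCODE_PATTERN.match(postcode):
        problems.append("malformed postcode")
    city = str(record.get("city") or "").strip()
    if city and any(ch.isdigit() for ch in city):
        problems.append("digits in city name (value in wrong field?)")
    return problems


def audit(records: list) -> dict:
    """Count clean records and how often each kind of problem occurs."""
    summary = {"total": len(records), "clean": 0, "problems": {}}
    for record in records:
        issues = audit_record(record)
        if not issues:
            summary["clean"] += 1
        for issue in issues:
            summary["problems"][issue] = summary["problems"].get(issue, 0) + 1
    return summary
```

Running `audit` over a sample of the file gives a quick, if rough, indication of how much clean-up is needed before geocoding.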
Once this is done, if you decide to take the wise and proactive step of reducing geocoding engine error and rejection rates sustainably, the best solution is to act upstream, at the point where data is first collected, by putting input constraints and address standardisation aids in place in your applications and address-gathering forms. This way, you can side-step a host of problems caused by misspelt or wildly imaginative city or street names, incomprehensible abbreviations, post codes that don’t exist or that are entered in the wrong field, and above all, input of
The choice of database is hugely important. Most geocoding solution publishers work with several suppliers, and allow their customers to select the database that most suits their needs as regards precision,
You can also try using so-called ‘free’ databases, built by communities that collect and share data so that it can be used free of charge. Thanks to its long history and worldwide coverage, OpenStreetMap is without doubt the most comprehensive offering of this type. It is all the more reliable for France because, since 2016, the IGN has made its aerial coverage available to the OpenStreetMap community for the whole of the French territory.
Google Maps can also be used as a database. It rates highly in terms of quality and its data is constantly being updated, but users quickly find it heavy-going when it is used intensively for geocoding. Another disadvantage worth considering is that if you opt for Google Maps as your database, you will not be able to keep the geographic coordinates collected, nor exploit them in other applications where you might need them. These data remain the property of Google.
So our advice is: when choosing a database, check not only that it contains all the data you need for your business perimeter or project, but also the terms and conditions for using the data, the status of any geocoding results you obtain, and the cost of updates.
There are two main types of geocoding solution:
In either type of solution, you should have the option to perform all three of the following: on-the-fly geocoding of single addresses, which for the sake of practicality can be configured so that each address is geocoded automatically at the point of entry in the source application; batch geocoding (batch mode), which is essential when handling large volumes of data; and incremental geocoding, which shortens processing times by geocoding only the addresses that have been added recently to the source database. It should be possible to automate all of these processes completely and run them in the back office, or to run them on demand.
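The difference between batch and incremental geocoding can be sketched in a few lines of Python. The record layout (dictionaries with an `address` key and a `coords` key added once processed) and the function names are assumptions for illustration. The `geocode` argument stands for whatever function or service call returns coordinates, or `None` on rejection.

```python
def geocode_batch(records, geocode):
    """Geocode every record in place; return the number of successful matches."""
    matched = 0
    for record in records:
        record["coords"] = geocode(record["address"])
        if record["coords"] is not None:
            matched += 1
    return matched


def geocode_incremental(records, geocode):
    """Geocode only records not processed before; return how many were handled."""
    # A record counts as new if it has no "coords" key at all. Records that
    # were rejected earlier (coords is None) are deliberately not retried here.
    pending = [record for record in records if "coords" not in record]
    geocode_batch(pending, geocode)
    return len(pending)
```

On-the-fly geocoding is then simply a single call to `geocode` from the entry form. The incremental variant saves time because addresses that have already been processed are never sent to the engine again.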
However you intend to deploy and use it, a geocoding engine should offer configuration options and an interface for handling errors manually and processing rejected address records. Without these interactive control tools, your geocoding engine will be a black box: you will have no way to inspect or adjust its strategies for address recognition, close-match handling and machine learning. To side-step this «black box» effect, the tolerance criteria of the GEOCONCEPT geocoding engine can be modified as necessary, and its address-handling strategies can be fine-tuned to produce the desired result. Our engine also features a geocoding wizard that displays a recognition score, suggests corrections for rejected records, and gives the user the option of saving any corrections applied. This means you can capitalise on acquired knowledge while keeping control over how strictly the algorithm applies its configured criteria to the data being processed.
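To show how a recognition score, an adjustable tolerance and saved corrections fit together, here is a generic Python sketch. It is not the GEOCONCEPT engine. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the class name, the `min_score` parameter and the reference table are all assumptions.

```python
from difflib import SequenceMatcher


class ScoredGeocoder:
    """Toy geocoder with a tunable tolerance and a memory of corrections."""

    def __init__(self, reference, min_score=0.85):
        self.reference = reference  # normalised address -> (lat, lon)
        self.min_score = min_score  # lower values accept looser matches
        self.corrections = {}       # rejected input -> corrected address

    def best_match(self, address):
        """Return (candidate, score) for the closest reference address."""
        query = self.corrections.get(address, address).lower().strip()
        scored = (
            (candidate, SequenceMatcher(None, query, candidate).ratio())
            for candidate in self.reference
        )
        return max(scored, key=lambda pair: pair[1], default=(None, 0.0))

    def geocode(self, address):
        """Return (coords, score); coords is None when the record is rejected."""
        candidate, score = self.best_match(address)
        if candidate is None or score < self.min_score:
            return None, score
        return self.reference[candidate], score

    def remember(self, rejected, corrected):
        """Save a manual correction so the same input matches next time."""
        self.corrections[rejected] = corrected
```

Raising `min_score` makes the matcher stricter and increases rejections. Lowering it accepts more close matches at the risk of wrong ones. Calling `remember` after a manual fix means the same faulty input is recognised on the next run.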
Now that you know what to watch out for when choosing a geocoding solution, you are ready to take the next step:
Alexandra GOMEZ, Product Owner