Naoki Ueda
Founder and CEO
Locazing Inc., Japan
nao@LocaPort.com
Imagine if your MP3 player could
store only 20 songs, or if you were
unable to send or receive an email
because it had a large file attached.
Data compression is what you would
need. Today, data compression
technologies are a popular, common,
and fundamental part of Information
Technology and of our daily lives.
In 1948, Dr. Claude E. Shannon formulated
the theory of data compression.
Since then, a variety of data compression
theories and methods have been
developed. Today, they are an integral
part of application software. They are
rarely mentioned explicitly, but they
certainly contribute a great deal of
convenience to our daily lives. There
are various 'ready-to-use' compression
tools available for many different
types of usage.
You may have heard of some of them,
such as the MP3, AAC, and WMA formats,
which were designed for audio data
compression; JPEG, for picture
compression; and ZIP, for general file
compression. Data compression
technologies always work behind the
scenes to improve the usability,
comfort, and convenience of hardware
and related services.
Today, Location-Based Services (LBS)
are focused on expanding their markets
and becoming as popular as other IT
services. In the mobile phone markets
of advanced countries in particular,
almost every mobile phone is equipped
with GPS and is LBS-ready. Yet people
will not use these services unless they
are as easy and comfortable to use as
MP3 players and music download services.
New data compression technologies
designed especially for LBS/GIS data
are now in high demand, for the
following reasons:
Existing data compression technologies are insufficient to meet the demands of LBS/GIS
GIS data compression improves customers' experience of LBS
GIS data compression boosts LBS
EXISTING DATA COMPRESSION TECHNOLOGIES ARE INSUFFICIENT TO MEET THE DEMANDS OF LBS/GIS
Many data compression algorithms and
methods are available. Each of them is
designed and optimised for a
specialised purpose. Most compression
tools combine multiple compression
algorithms to maximise compression
efficiency for their target data.
Compression always exploits the
characteristics and inherent redundancy
of the target data.
EXAMPLES OF DATA
TYPE AND COMPRESSION
METHODS
Compression of audio data exploits the
limits of human perception and cuts
out sounds that can hardly be heard.
In this case, the data compression is
not 100% reversible but ensures adequate
quality to allow pleasant listening.
This is called 'Lossy' data compression.
MP3 audio - MPEG-1 Audio Layer III - is
based on this principle and combines
many algorithms to balance compression
efficiency against quality loss. By
contrast, for file data, the compression
must be 100% reversible. This is called
'Lossless' data compression. If the file
is a text file, such as an English
article or computer program code, it
is likely to contain repeating patterns
of words and phrases. If there are
repeated patterns, an approach called
Universal Coding can be used for
lossless compression.
LZW coding, which is used in the
GIF image format, is one example of
Universal Coding. For visual image
data, the fact that a pixel and its
neighbours have similar colours is used
as part of the compression mechanism.
When data has some relation to
neighbouring data, Data Relativity
Coding may be used. DPCM - Differential
Pulse-Code Modulation - is one example
of Data Relativity Coding: the
difference between one data point and
the next is usually a small number,
which takes fewer bits to store. When
some values appear with much higher
probability than others, the method
called 'Entropy Coding' is suitable.
The famous Huffman coding is one such
method. The JPEG image format uses
algorithms of this kind.
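As a minimal sketch of how Data
Relativity Coding and Entropy Coding
fit together, the Python example below
applies a plain first-order difference
(a simplified stand-in for DPCM) to a
slowly changing signal and measures the
empirical entropy before and after.
The sample values and function names
are illustrative assumptions, not taken
from any particular codec.

```python
import math
from collections import Counter


def difference_encode(samples):
    """Store each sample as the difference from the previous one."""
    previous = 0
    residuals = []
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def difference_decode(residuals):
    """Rebuild the original samples by accumulating the differences."""
    value = 0
    samples = []
    for residual in residuals:
        value += residual
        samples.append(value)
    return samples


def entropy_bits(values):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical slowly rising signal: every sample is distinct,
# but almost every difference is the same small number.
signal = [100, 101, 102, 103, 104, 105, 106, 107]
residuals = difference_encode(signal)

print("entropy of raw samples :", entropy_bits(signal))
print("entropy of differences :", entropy_bits(residuals))
```

The lower the entropy of the
differences, the fewer bits an entropy
coder such as Huffman coding needs per
value.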
GIS DATA IS DIFFERENT
FROM AUDIO, VISUAL, OR
TEXT/BINARY DATA
First, GIS data has a layer structure. It
often represents lines, areas, or both:
for example, a navigation route (a line)
or the boundary of a park (an area).
Such GIS data consist of multiple
geographic points or locations. Each
point consists of latitude, longitude,
and optionally altitude, time, or other
parameters.
Second, if the data represent routes or
areas, any geographic point and the
next point are probably 'close' to each
other and stay within a local region.
However, the coordinate scale, latitude
and longitude, spans the whole globe.
Thus, every point repeats almost the
same leading digits, and such data sets
contain large redundancy that can be
compressed easily. Even a
differential-based compression method
used on its own may work well.
An accompanying figure shows an example
of differential-based compression.
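A minimal sketch of differential-based
compression for a short route follows.
Coordinates are scaled to integers (the
1e5 scale factor is an assumption,
giving roughly metre-level resolution)
and each point is stored as the
difference from the previous one. The
function names and sample coordinates
are hypothetical.

```python
SCALE = 100_000  # assumed scale: 5 decimal places of a degree


def delta_encode(points, scale=SCALE):
    """Turn (lat, lon) floats into integer differences between points."""
    deltas = []
    prev_lat = prev_lon = 0
    for lat, lon in points:
        ilat, ilon = round(lat * scale), round(lon * scale)
        deltas.append((ilat - prev_lat, ilon - prev_lon))
        prev_lat, prev_lon = ilat, ilon
    return deltas


def delta_decode(deltas, scale=SCALE):
    """Accumulate the differences to rebuild the (lat, lon) points."""
    points = []
    lat = lon = 0
    for dlat, dlon in deltas:
        lat += dlat
        lon += dlon
        points.append((lat / scale, lon / scale))
    return points


# Hypothetical three-point route within one city block.
route = [(35.68124, 139.76712), (35.68150, 139.76745), (35.68190, 139.76801)]
deltas = delta_encode(route)
print(deltas)  # [(3568124, 13976712), (26, 33), (40, 56)]
```

Only the first point carries the full
global coordinate; every later point
shrinks to a one- or two-digit pair.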
Third, GIS data require either
'Lossless' or 'Lossy' compression,
depending on the situation and purpose.
'Lossless' compression is needed when
you have to pinpoint an exact location,
route, or shape of a path or area.
'Lossy' compression may fit when the
'route' is more important than each of
the points it consists of. In this
case, the following options may be
considered (Fig. 1):
Thinning out the points the route consists of
Leaving only the intersection points
Decreasing the precision of the points
In addition, unlike other types of
data, GIS 'route' data can be recovered
by Map-Matching technologies, which
correct position errors by fitting
points onto the road shapes on the map.
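Two of the lossy options described in
this section, thinning and precision
reduction, can be sketched as follows.
The function names, the keep-every-n
rule, and the sample coordinates are
assumptions for illustration. Keeping
only intersection points would need
road-network data and is not shown.

```python
def thin(points, keep_every=2):
    """Keep every n-th point, always retaining the final point."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    kept = list(points[::keep_every])
    # Make sure the route still ends where the original ended.
    if points and (len(points) - 1) % keep_every != 0:
        kept.append(points[-1])
    return kept


def reduce_precision(points, decimals=4):
    """Round each coordinate to fewer decimal places."""
    return [(round(lat, decimals), round(lon, decimals)) for lat, lon in points]


# Hypothetical six-point route.
route = [
    (35.681236, 139.767125),
    (35.681301, 139.767240),
    (35.681377, 139.767356),
    (35.681452, 139.767469),
    (35.681530, 139.767581),
    (35.681604, 139.767699),
]

thinned = thin(route, keep_every=2)      # points 0, 2, 4 plus the last one
coarse = reduce_precision(route, decimals=4)
print(len(route), "->", len(thinned), "points after thinning")
print(coarse[0])
```

Both steps discard information, so they
suit cases where the overall route
matters more than each recorded point.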
LBS may be used with handheld computers
or mobile phones. Generally speaking,
data compression algorithms are 'heavy'
processes for computers. PC-based LBS
have the power to handle them, but
mobile-device-based LBS must also cope
with these compute-intensive processes.
'Heavy' processing slows down systems
and drains batteries quickly. Thus, GIS
data compression should be a 'light'
process for such devices.
In addition, it is preferable that
compressed data has 'later adding
capability'. In real-time locating,
data arrive periodically and need to be
added to the previous data. If the GIS
data compression method requires
decompressing, adding, and compressing
again every time, it will consume a lot
of hardware resources.
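One possible way to obtain 'later
adding capability' is to store each
point as a difference from the previous
one in a variable-length byte stream,
so that a new point is simply encoded
and appended. The sketch below is an
illustrative assumption (the class
name, the 1e5 scale factor, and the
zigzag/varint byte layout are mine); it
is not the LocaPort™ algorithm.

```python
class AppendableTrack:
    """Delta-encoded track that accepts new points without re-encoding."""

    def __init__(self, scale=100_000):
        self.scale = scale
        self.data = bytearray()   # compressed bytes, only ever extended
        self._last = (0, 0)       # last point as scaled integers

    @staticmethod
    def _zigzag_varint(n):
        # Zigzag maps small negative and positive numbers to small
        # unsigned numbers; varint then uses 7 bits per byte.
        z = (n << 1) if n >= 0 else ((-n << 1) - 1)
        out = bytearray()
        while z >= 0x80:
            out.append((z & 0x7F) | 0x80)
            z >>= 7
        out.append(z)
        return bytes(out)

    def append(self, lat, lon):
        """Encode one new point and add it to the end of the stream."""
        ilat, ilon = round(lat * self.scale), round(lon * self.scale)
        self.data += self._zigzag_varint(ilat - self._last[0])
        self.data += self._zigzag_varint(ilon - self._last[1])
        self._last = (ilat, ilon)

    def points(self):
        """Decode the whole stream back into (lat, lon) floats."""
        values = []
        z = shift = 0
        for byte in self.data:
            z |= (byte & 0x7F) << shift
            if byte & 0x80:
                shift += 7
            else:
                values.append((z >> 1) if z % 2 == 0 else -((z + 1) >> 1))
                z = shift = 0
        result = []
        lat = lon = 0
        for dlat, dlon in zip(values[0::2], values[1::2]):
            lat += dlat
            lon += dlon
            result.append((lat / self.scale, lon / self.scale))
        return result


# Hypothetical real-time fixes arriving one at a time.
track = AppendableTrack()
for fix in [(35.68124, 139.76712), (35.68150, 139.76745), (35.68190, 139.76801)]:
    track.append(*fix)
print(len(track.data), "bytes for", len(track.points()), "points")
```

Appending only needs the last point
kept in memory, which keeps the work
per update small on a mobile device.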
As explained above, GIS data has
unique characteristics, so existing data
compression methods that are optimised
for other types of data do not
cover the demands of GIS data compression.
GIS DATA COMPRESSION
IMPROVES CUSTOMERS'
EXPERIENCE OF LBS
Usability, or the human interface, is
the key for any product or service
aimed at consumers. In other words,
people only use what is easy and
comfortable to use. Just as MP3 makes
your music more accessible, GIS data
compression could improve your
customers' experience.
First, GIS-compressed data saves storage
capacity. Users may use LBS on
handheld devices or mobile phones
with limited storage capacity. Saving
data storage space immediately
increases available memory.
An accompanying figure shows an example of GIS data compression.
Second, it improves communication
speed - an important factor in
user comfort, especially for
network-based LBS. Many IT services -
whether LBS or non-LBS - have started
to consider reducing data traffic to
enhance performance, because the
slowest element determines overall
performance in network-based services.
Thus, many Web services have started
offering the JSON data format along
with the XML format to improve the
performance of their services. JSON is
much 'lighter' to process and typically
yields a smaller data size than XML.
In addition, if customers use LBS on
mobile devices, GIS data compression
may save them money, assuming that
the cost of data communication
depends on data traffic.
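To illustrate the size difference
between JSON and XML mentioned above,
the sketch below serialises the same
three route points as compact JSON and
as simple element-based XML using
Python's standard library. The field
names (route, point, lat, lon) and the
coordinates are made up for
illustration; real services define
their own schemas, and the exact ratio
depends on the schema chosen.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical three-point route.
route = [(35.68124, 139.76712), (35.68150, 139.76745), (35.68190, 139.76801)]


def to_json(points):
    """Serialise points as compact JSON (no extra whitespace)."""
    payload = {"route": [{"lat": lat, "lon": lon} for lat, lon in points]}
    return json.dumps(payload, separators=(",", ":"))


def to_xml(points):
    """Serialise the same points as element-based XML."""
    root = ET.Element("route")
    for lat, lon in points:
        point = ET.SubElement(root, "point")
        ET.SubElement(point, "lat").text = repr(lat)
        ET.SubElement(point, "lon").text = repr(lon)
    return ET.tostring(root, encoding="unicode")


json_text = to_json(route)
xml_text = to_xml(route)
print(len(json_text), "characters as JSON")
print(len(xml_text), "characters as XML")
```

In this layout the XML closing tags
repeat every field name, which is where
most of the extra size comes from.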
GIS DATA COMPRESSION
BOOSTS LBS
LBS will evolve in two directions. One is
"M-LBS" and the other is "C-LBS".
Mash-up LBS
What I call 'M-LBS' is 'Mash-up LBS'.
Generally speaking, IT services are
moving away from independent services
and towards 'mashing up' their
services. The era of mash-up services
is just arriving. Commercial companies
are shifting from content providers to
platform providers. The best example of
mash-up in LBS is the Google 'Mapplets'
gadget. By connecting Google Maps™ to
other information providers, you can
show any location-based information on
Google Maps™.
Platform vs Content
Google does not provide the service
content itself. Instead, Google
provides a platform - i.e., materials
for building services. Thousands of
amateur developers called "mash-uppers"
and many commercial companies create
services by combining parts called
APIs, or Application Programming
Interfaces. Some are offered by Google,
some are from other Web-service
companies, and some are the original
creations of individuals. As a result,
thousands of LBS contents became
available for Google Maps™ in a short
period. Not every commercial company
that makes LBS contents can provide
contents in this manner. In addition,
many 'Mapplets' are used by many
customers, so the contents that people
use most come to the top of the list.
In short, Google only provides a
platform for mash-uppers, and as a
result many excellent LBS contents are
added to Google Maps™. Because of the
Google Maps™ API, most of the early
mash-up services that people created
are Google Maps-based LBS. So,
surprisingly, LBS play a popular part
in the field called 'Web 2.0'.
Complex GIS Data interaction LBS
Second, what I call 'C-LBS' is
'Complex-data LBS'. In the era of the mash-up,
Web services need to exchange data. If
both services are LBS, then the GIS data
should be exchanged or
shared. Today, only a few
services have started exporting
GIS data to, or importing it
from, other services, and
even in these cases they
only share GIS data of a single
point, probably a 'destination',
or two points, probably
'start and goal'.
If LBS platform services
start exchanging complex
GIS data, such as routes,
directions, or areas, then
service interaction will
become richer than
interaction based on a single
point. Figure 3 shows the
concept of M-LBS and C-LBS.
For example:
A navigation system
exports your 'route', and then
a city guide service shows
only those restaurants that
are along your route.
A real estate search service
exports a 'list of the houses
you chose', and then search
services give you a list of
sports gyms close to each of
the houses.
Travel planning services
export 'your travel schedule',
and then an ad service gives
you coupons of shops that
you may visit later.
There must be more creative
combinations of LBS
services. The key is the
portability of GIS data. In
an independent GIS or LBS, GIS
data stay within the system
database. In an inter-LBS
mash-up, GIS data must be
available outside the system,
and must be sharable and
exchangeable with other
mash-up services. In
other words, GIS data
need to be "portable".
There are already
some file formats that
enable data exchange
among LBS. The formats
called KML, KMZ, GML, and
GeoJSON are commonly used
to share data among GIS systems.
KMZ is a ZIP-compressed
KML file. However,
KMZ is not efficient. Our
experiments found that
replacing the GIS data in KML
with a GIS-compressed
format, such as the Google
Maps™ API Encoded Polyline
Algorithm or the LocaPort™
GIS data compression
algorithm, is much more
effective than KMZ. This is
simply because the ZIP method
is not suitable for GIS data,
as explained before.
CONCLUSION
Once practical GIS data compression
becomes popular
and available, LBS will rise to
a new mash-up-based stage.
GIS systems - the foundation
of LBS - should support GIS
data compression and
decompression for the services
that run on them. With
thousands of mashed-up LBS
contents, there will
doubtlessly be very creative
ones that nobody could even
imagine today. LBS and its
market will then expand at
an increasing rate. In the
near future, GIS data compression
technologies will
definitely play an important
role behind the scenes.