MapCruncher: Integrating the World’s Geographic Information

Jeremy Elson, Jon Howell, and John R. Douceur
Microsoft Research Redmond

[email protected]

ABSTRACT

Current large-scale interactive web mapping services such as Virtual Earth and Google Maps use large distributed systems for delivering data. However, creation and editorial control of their content is still largely centralized. The Composable Virtual Earth project’s goal is to allow seamless interoperability of geographic data from arbitrary, distributed sources.

MapCruncher is a first step in this direction. It lets users easily create new interactive map data that can be layered on top of existing imagery such as road maps and aerial photography. MapCruncher geographically registers and reprojects the user’s map into a standard coordinate system. It then emits metadata that makes it easy for anyone on the Internet to find the published map data and import it. Interactive maps then become distributed, seamlessly composable building blocks – similar to images in the early days of the Web.

Categories and Subject Descriptors

H.2.8 [Database Management]: Database Applications – spatial databases and GIS; D.2.6 [Software Engineering]: Programming Environments – graphical environments; D.2.12 [Software Engineering]: Interoperability; G.4 [Mathematical Software].

Keywords

Interactive maps, composition, mashups, geographic coordinate systems, graphical interactive georeferencing, map projections, approximate reprojection, decentralized publishing, image tiling.

1. INTRODUCTION

In the relatively short time since the introduction of online mapping sites like Google Maps [8] and Microsoft Virtual Earth [16], hundreds of user-created “mashups” have appeared. These mashups cover a wide diversity of subjects. For example, Seattle Bus Monster [19] plots public transportation routes in Seattle; chicagocrime.org [2] highlights dangerous areas of Chicago; RunwayFinder [18] summarizes weather and airspace surrounding general aviation airports; housingmaps.com [11] shows real estate prices. These specialized sites each display the data from their particular application domain on top of maps and aerial imagery supplied by Google or Microsoft.

While useful individually, mashups can be far more useful when integrated with each other. Today, however, mashups are largely independent. For example, to find cheap real estate in a low-crime neighborhood, or find the public transportation near a general-aviation airport, users must visit each mashup individually and manually integrate the results. The goal of Microsoft Research’s Composable Virtual Earth (CVE) project is to find new ways of constructing geographic Web mashups so that they can seamlessly interoperate.

Existing mashups are implemented largely in imperative code – JavaScript that runs in the client. This design gives site designers enormous flexibility, which led to the explosion of creative and innovative mashups. Well-known standards for describing geographically-tagged points, lines, and raster graphics had already existed for many years (e.g., GML [17], GeoRSS [6]); however, the sudden appearance of the mashups suggests many applications are not well-served by these standards. The combination of HTML and JavaScript allowed developers to go beyond creating layers, to create applications. In other words, mashup developers are using imperative code to customize exactly how their application operates, rather than simply creating layers declaratively, whose user interactivity would be limited to “on” and “off.”

A key design goal of CVE is to offer a mashup framework that is sufficiently structured to enable composition, yet sufficiently flexible to admit innovation. This interoperability balancing act is common in distributed systems design, from domain-specific frameworks such as the Flux OSKit [4], x-Kernel [15], and stackable file systems [9], to application-agnostic schemes such as Placeless active properties [3]. We plan to exploit the geographical domain constraints to best achieve this balance.

1.1 MapCruncher

As a first step, we created MapCruncher, a tool that allows users to add custom raster overlays onto the existing road and aerial imagery provided by Virtual Earth or Google Maps. Overlays are typically detailed maps, such as a bicycle route map, building floor plan, or campus map. The resulting web site is an interactive web map that features both the user’s maps and the standard imagery. Like the underlying maps, user maps are pre-rendered into small image tiles at a variety of zoom levels, allowing the client to efficiently request the portions of a large virtual image that are needed for display.

MapCruncher first assists in registering the foreign map into the same (Mercator) coordinate system used by existing online map sites. Users select correspondence points between their own maps and existing maps, using road intersections or other recognizable landmarks. Once enough points are selected, MapCruncher estimates the transformation from the original map’s coordinate system into Mercator by finding the best fit coefficients of a second-degree polynomial; while inexact, the error is typically small enough not to affect the results. MapCruncher then reprojects the original map and renders correctly registered and zoomed image tiles that can be seamlessly integrated with existing imagery.
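The best-fit step described in Section 1.1 can be illustrated with a small sketch. It fits the six coefficients of a bivariate second-degree polynomial to a set of point correspondences by ordinary least squares, one fit per output coordinate. The function names, the use of the normal equations, and the synthetic data are assumptions made for illustration; this section of the paper does not say which numerical method MapCruncher uses.

```python
def quad_terms(x, y):
    """Six basis terms of a bivariate second-degree polynomial."""
    return [x * x, x * y, x, y * y, y, 1.0]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(points, targets):
    """Least-squares coefficients from the normal equations (A^T A) c = A^T t.

    points:  (x, y) positions of the correspondences in one map.
    targets: one coordinate (x or y) of the matching positions in the other map.
    Assumes at least six correspondences that are not degenerate.
    """
    rows = [quad_terms(x, y) for x, y in points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def evaluate(coeffs, x, y):
    """Apply fitted coefficients to a new point."""
    return sum(c * t for c, t in zip(coeffs, quad_terms(x, y)))


# Hypothetical ground-truth mapping, used only to generate test correspondences.
def truth(x, y):
    return 0.5 * x * x - 0.25 * x * y + 2.0 * x + 0.1 * y * y - y + 5.0


grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
coeffs = fit_quadratic(grid, [truth(x, y) for x, y in grid])
print(abs(evaluate(coeffs, 1.5, 0.5) - truth(1.5, 0.5)) < 1e-6)
```

With more correspondences than coefficients, the fit averages out small placement errors rather than passing exactly through every point. A production implementation would probably prefer a better-conditioned solver (for example, QR factorization) over the normal equations used here.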
Mashups created with MapCruncher do not restrict the developer’s freedom to write arbitrary JavaScript that customizes the experience of their end-users, satisfying one of our design constraints for CVE. However, MapCruncher also emits metadata about the mashup, such as its geographic bounds, the file naming scheme for the tiles, and a brief description of the data as entered by the user. Because this data is semantically meaningful, it facilitates later discovery and integration of the imagery into other applications. In addition, much of this data is encoded as specially constructed strings that enable ordinary web search engines to find mashups matching geographic criteria. This combination of composability and discoverability takes us a step closer to our goal of a system that is capable of more seamless integration of geographic data on the web.

In the next section, we briefly describe the history of geographic mashups on the web. In Section 3, we review some of the difficulties in creating mashups using raster overlays. Section 4 describes approximate reprojection, the central technique used by MapCruncher to simplify the creation of raster mashups. In Section 5, we describe how this idea can be used to efficiently generate a database of image tiles. We review deployment issues and briefly describe a few sample applications in Section 6. Finally, we conclude in Section 7 with some thoughts on how MapCruncher might lead us towards an integrated and composable Virtual Earth.

2. INTERACTIVE WEB MAPS AND THE RISE OF MASHUPS

Online mapping services have existed for years. Until recently, they all had the same general architecture: maps would be custom-rendered from the underlying data on demand, in response to users’ viewing requests. The only way to move the viewport was by clicking discrete buttons (e.g., “north”). The web service would then render a new custom map in a slightly different position.

Starting in 2005, Google, Microsoft, Yahoo!, and MapQuest began to offer a new class of online, interactive maps. These map services pre-rendered a standard set of map tiles covering the entire coverage area. A sophisticated JavaScript program running in the user’s browser dynamically downloads the set of tiles that cover the user’s desired map viewport. The client positions and crops those tiles on the screen to produce a map with exactly the desired size and extent.

Interactive maps have several advantages over their on-demand predecessors. Perhaps most importantly, user interaction is significantly more intuitive. Because the final step of assembling tiles into an image is done on the client, it’s possible to support fluid panning of a seemingly infinitely-sized image. Pre-rendered services typically have higher quality images as well; because tiles are no longer rendered in real time, slow enhancements such as anti-aliased fonts can be used.

Shifting so much of the map’s implementation to the client also had an unexpected effect. Soon after the release of Google Maps (the first such public service), web hackers learned to create Google Maps mashups. A mashup is a combination of maps with other geographically interesting data, such as those listed in the Introduction. Geographic mashups gained popularity quickly. Most major online mapping sites released official APIs that allowed web developers to create mashups “legally.”

For the most part, geographical mashups so far have consisted of drawing fairly simple shapes on top of the online maps: for example, layers of pushpins (houses for sale) or polylines (bus routes). Largely ignored, however, is the practice of superimposing an entire image layer onto the underlying imagery. We speculate that this is because raster overlays are difficult to construct, as we will explore in the next section.

The difficulty in constructing raster overlays is unfortunate, because they can be quite useful. Figure 1 shows an example. The left pane shows the image of the University of California, Los Angeles as seen in either Google Maps or Microsoft Virtual Earth. The aerial imagery shows a densely built area, but the street atlas has no data describing any of the buildings or the campus’ internal roads. However, UCLA publishes a detailed campus map. The right pane of Figure 1 shows the same area after we used MapCruncher to generate a raster overlay. The map can be panned and zoomed, just as was possible before the overlay was added.

Figure 1. (left) Base imagery of the UCLA campus; (right) UCLA’s campus map superimposed, using MapCruncher

The main contribution of MapCruncher is that it makes accessible to casual users a task that had typically been the domain of geographic-information-systems (GIS) professionals.

3. CHALLENGES TO THE CASUAL MAP MASHER

In this section, we consider the difficulties encountered in taking an arbitrary map (say, a PDF map of a university campus) and
turning it into an interactive map layer. That is, we’d like to superimpose our map onto the road and aerial photography already provided by online mapping sites, such that the two maps can be viewed together, as in Figure 1.

Map overlays have existed for as long as maps have existed, so it may seem surprising that a new tool was necessary to accomplish a seemingly well-known task. In fact, our original intent was not to create a tool, but to create a mashup using existing tools. In this section, we describe some of the hurdles we encountered and how they motivated us to build a new tool to overcome them.

3.1 Reprojection of Unknown Map Projections

The Earth is round, but maps and the computer screens that display them are flat. Maps that depict very small extents of the Earth relative to their level of detail, such as building blueprints, can make the simplifying assumption that the Earth is as flat as the map that depicts it. However, maps of larger extent cannot ignore the curvature of the Earth. A cartographer must therefore select a method to convert the position of points on the three-dimensional Earth’s surface to the two-dimensional map. The mathematical functions used for this purpose are called map projections [20].

One spatial relationship or another is lost whenever the three-dimensional Earth is projected into a two-dimensional representation. Consequently, an astonishing variety of map projections have been invented. Each projection makes different tradeoffs, typically maintaining high fidelity in some aspect of the Earth’s representation (e.g., the shape of objects) by giving up fidelity in some other aspect (e.g., apparent relative sizes of objects). Cartographers select the best projection based on a map’s intended use. Most map projections are parameterized, to enable them to be fine-tuned to the location, size, and aspect ratio of the extent of the map.

For two maps to be superimposed correctly, as is our goal, they must both be drawn using the same projection. In the world of traditional GIS systems, this problem is usually easy to solve. Most spatial data comes annotated with metadata describing which projection was used to draw it, along with the projection’s parameters. This information can be used to perform a mathematically exact transformation of a map into any other projection.

For casual mashups, the situation is more difficult. The vast majority of maps available on the Web have been stripped of the metadata that describes the map projection. For maps that do have metadata, it is often in a format that cannot be automatically parsed (for example, a text file describing the projection in English). Consequently, it is nearly impossible to precisely or automatically reproject a typical map found on the Web.

This is a problem for a user who wishes to create an overlay. Most maps are not drawn using the same projection as is used by the major interactive online map services. Microsoft’s and Google’s mapping sites, for example, use the Mercator projection. (Mercator is used because it is conformal. Conformal projections do not distort features’ shapes, making it possible to overlay street maps on undistorted aerial photography.) In contrast, most other maps are not expected to be used as overlays for photographs, so instead use one of the many projections that produce less scale distortion. It is hard to guess exactly which projection a map uses by inspection because there are so many projections. For example, the USGS¹ produces maps depicting each of the 50 United States using custom projection parameters tailored to each state.

¹ The United States Geological Survey (USGS) is the official mapping agency for the United States.

Unfortunately, this problem is not well solved by any of the numerous tools available that aid in the production of map overlays. After a week or two of tinkering with various test maps, we concluded the existing tools were all either too simple or too complex. The simple tools were limited to linear transformations such as scaling, translation and rotation. Our test maps did not use the Mercator projection, so the simple tools could not warp them sufficiently to produce good alignment at all points. The complex tools could perform arbitrary reprojections, but required complete specification of the projection, which was unavailable for our test maps.

MapCruncher addresses this problem using approximate reprojections. As we will see in Section 4, MapCruncher allows users to point out correspondences between the two maps, then estimates how to reproject the user’s map into Mercator without a model of the source map’s projection. Although less accurate than an exact reprojection, this design choice fills a useful niche between the low-end and high-end tools.

3.2 Management of Large Datasets

The simple, intuitive pan-and-zoom interface provided by online maps makes it easy to forget that they are providing access to immense repositories of data. Microsoft’s Virtual Earth platform has nearly 200 terabytes (on the order of 10^14 bytes) of imagery. While a casual user is unlikely to ever create such a large dataset, we’ve found that even modest maps can overwhelm normal desktop image processing tools.

For example, consider the map of neighborhood bicycle routes produced by King County, Washington. Two of the authors commute to work by bicycle, so this map was of particular interest. We tried to overlay it on several interactive maps (Google Maps [8], Google Earth [7], and Microsoft Virtual Earth [16]) using previously existing tools. All of them required that we provide the overlay as a single rasterized image (e.g., a PNG).

The 2005 edition of the King County bicycle map is a 30" x 36" poster. If rendered at a zoom level large enough that its smallest features are easily readable, it is a 3-gigapixel image. Despite considerable effort, we could not find a PDF rendering program under Windows or Linux capable of producing an output image of that size. Their failure modes were diverse and often amusing. Some ran out of RAM (3 GB was available). Others filled the disk with temporary files. Some simply froze the computer.

Even if we had succeeded in creating such a large image from our source PDF, other roadblocks would have awaited us. Similar limitations existed in the tools available both for registration of the image to a reference map and for cutting it into browser-compatible 256x256 pixel tiles. Our early failure in the seemingly simple task of creating a bicycle-map overlay was among our motivations to write MapCruncher.
MapCruncher was designed with enormous output images in mind. As we will describe in Section 5, our tool uses the same strategy as the large interactive map sites: instead of producing a single image, MapCruncher renders a large number of small (256x256) image tiles. This allows browsers to navigate through large custom overlays just as they do the underlying road maps and aerial photography: efficiently downloading just the sub-images they need, on demand. In contrast, most other overlay generators require the user to download the entire overlay image before displaying any of it. This is impractical for our 3-gigapixel test map.

MapCruncher also handles large source maps gracefully, generating each 256x256 tile individually, directly from just the portion of the source map that it requires. Again, this is in contrast to other tile generators that require the entire source map to be rendered in advance, even though the image may be giga- or even tera-pixels in size.

3.3 Mashing Without Programming

In the earliest days of the Web, content production was an engineering discipline. Writing HTML is similar in some ways to computer programming. Like programming, it is inaccessible to people who do not happen to be experts in the field – that is, inaccessible to most people who want to create content. Various HTML authoring tools quickly appeared, making it easier for non-experts to write web pages without needing an intimate understanding of the underlying technology.

The situation today is similar with the creation of mashups, both geographic and otherwise. They are difficult to create without first learning JavaScript, HTML, XML, esoteric APIs, map projections, and geographic coordinate systems. Our first attempt at creating a bicycle-route mashup was slowed by the requirement that we learn many new disciplines, from web APIs to map projections to online maps’ coordinate systems and naming schemes.

One of our motivations for writing MapCruncher was to make geographic mashups accessible to non-experts – including people who would not have been able to create a mashup without it. As we will see in Section 6, MapCruncher lets beginners create point-and-click mashups, while still allowing advanced users to customize arbitrarily.

4. APPROXIMATE REPROJECTION

In this section, we describe how MapCruncher reprojects (changes the shape of) and registers (correctly positions) the user’s map such that it correctly overlays the existing Mercator-projected maps and aerial photography.

MapCruncher differs from traditional GIS systems, which generally perform mathematically exact map reprojections. GIS software usually includes a large library of commonly used projection families; the user is asked to select the one that was used to draw the source map. The user must then enter the numerical parameters that specify the exact projection. The nature of these parameters depends on the projection.

Unfortunately, as we described in Section 3.1, most maps used by our target audience have unknown projections. A user who is not a GIS expert may not even know what a projection is. Consequently, we designed MapCruncher to estimate the transformation. First, we ask the user to identify some landmark that can be found both on the user’s map and also on the Virtual Earth map or aerial imagery; we call this identification a “correspondence.” After obtaining several correspondences, we find the coefficients to a polynomial function that best fits them. A transformation with a 2nd-degree polynomial can look very similar to the transformation from many projections into Mercator.

“But wait!” a GIS professional might insist. “Polynomials may look similar to the right answer, but to reproject correctly, you need trigonometry. And asking for user input by pointing out map landmarks is horribly prone to error!” This is true – and the users who have spatial data annotated with all the metadata required to do an exact transformation are likely to use GIS tools, not MapCruncher. While not exact, we’ve found polynomials produce excellent results in a wide variety of maps. By analogy, the existence of AutoCAD does not obviate the value of Microsoft Paint.

In Section 4.1, we describe the process of gathering enough data from the user to reproject the user’s map. In Section 4.2, we describe how MapCruncher uses that input to produce a usable map overlay.

4.1 Georeferencing

The first step in creating an overlay with MapCruncher is specifying a number of correspondence points between the user’s map (the “source map”) and the existing road maps and aerial photography (the “reference map”). Because the reference maps are, themselves, already registered to the Earth’s coordinate system², each correspondence identifies the real latitude and longitude of a point on the source map.

² Specifically, the “WGS84” datum.

Figure 2. Establishing a correspondence between a source map and the reference map

MapCruncher provides a simple interface for specifying correspondence points. The MapCruncher GUI, shown in Figure 2, has two viewing panes. The source pane displays the source map, which can be panned and zoomed to arbitrary locations and zoom levels. The reference pane displays the reference map, using imagery from Microsoft Virtual Earth. The
reference map can be panned and zoomed independently of the source map.

The user employs the two panes to find a location on the source map and a location on the reference map that visually correspond to each other. Any landmark that appears in both maps can be used as a correspondence. For example, in Figure 2, we are registering a building floor plan to Virtual Earth’s aerial photograph of the same building. In this example, the corners of the buildings are readily visible in both views and make excellent references for a correspondence. For some source maps, it may be more convenient to use Virtual Earth’s road-map view instead of aerial photography. Street intersections make excellent correspondences for many maps.

Each pane includes crosshairs that identify the center of the pane. The user indicates a location by panning the image until a feature is under the crosshairs. Once the same feature is under the crosshairs in both panes, the user clicks a button labeled “Add Point”. This process is repeated until there are enough correspondences. In practice, between two and twenty are required, depending on the source map.

We discovered that in maps that cover a large geographic extent, establishing 20 correspondences can be time-consuming. To speed the process, MapCruncher can helpfully guess the spot on the reference map that corresponds to an arbitrary source map point. As soon as the first two correspondences are defined, the user can “lock” the views of the source and reference maps together. When one locked map is panned or zoomed, the other follows suit.

With just two or three points, the lock is based on a poor approximation, but it is usually good enough that it greatly assists the user in establishing additional correspondences. Using locked views, the user can zoom rapidly to a new location in the source map, and the reference map follows along. Often, only a little nudging of the (unlocked) reference map is required to find the exact matching point. Thus, the third and following correspondences in a mashup become much less tedious to define. As each new point is added, the reprojection approximation improves.

4.1.1 Error display

Sometimes, the user accidentally establishes a correspondence between points on the source and reference maps that do not actually correspond. A common instance is an “off-by-one-block” error (see Figure 3).

Figure 3. Establishing a correspondence between a source map and the reference map

The reprojection process will dutifully attempt to distort the source map to satisfy the erroneous correspondence. However, if the reprojection is overconstrained and there are enough correct correspondences, the result will mostly respect the majority. MapCruncher uses the distance between the reprojected point and the user-placed point to find outliers. It computes the magnitude of disagreement for each correspondence, sorts by decreasing disagreement, and presents the list to the user (see Figure 4).

Figure 4. Correspondences sorted by disagreement

The observed amount of disagreement provides the user with a quick suggestion of which points might have been placed incorrectly. The user can then revisit the top few “suspicious” correspondences to ensure they’re in the right place.

As an additional aid to the user, MapCruncher plots a vector from where the user placed a correspondence point towards where the majority suggests the point should have been placed (see Figure 5). In this case, the correct source map position (left side) corresponding to the marked reference map position (right side) is one block south of the point selected by the user. The disagreement vector points south, suggesting “perhaps the point belongs somewhere down there.”

Figure 5. Disagreement vector points toward likely correct location.

4.2 Reprojection

After the user has created correspondences, the next step is to generalize them, relating the entire source map to global coordinates. Mathematically, we need to produce a function that captures the relationship between image coordinates on the source map and image coordinates of the Mercator-projected reference map.
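Given any such function, the outlier ranking described in Section 4.1.1 can be sketched as follows. The names `rank_by_disagreement`, `disagreement_vector`, and `reproject` are hypothetical, as is the simple linear mapping used as a stand-in for a fitted reprojection; the sketch only illustrates the "compute distance, sort by decreasing disagreement" step that the text describes.

```python
import math


def disagreement_vector(correspondence, reproject):
    """Vector from the user-placed source point to the point the fit predicts."""
    (sx, sy), (rx, ry) = correspondence
    px, py = reproject(rx, ry)
    return px - sx, py - sy


def rank_by_disagreement(correspondences, reproject):
    """Return (disagreement, index) pairs, most suspicious first.

    correspondences: list of ((sx, sy), (rx, ry)) pairs entered by the user.
    reproject: callable mapping reference coordinates to predicted source
               coordinates (assumed to have been fitted already).
    """
    scored = []
    for index, corr in enumerate(correspondences):
        dx, dy = disagreement_vector(corr, reproject)
        scored.append((math.hypot(dx, dy), index))
    return sorted(scored, reverse=True)


# Stand-in for a fitted reprojection (an assumption for this example).
def reproject(rx, ry):
    return 2.0 * rx + 10.0, 2.0 * ry + 20.0


points = [
    ((10.0, 20.0), (0.0, 0.0)),
    ((30.0, 20.0), (10.0, 0.0)),
    ((10.0, 0.0), (0.0, 10.0)),   # placed 40 units away from the prediction
    ((30.0, 40.0), (10.0, 10.0)),
]
ranked = rank_by_disagreement(points, reproject)
print(ranked[0])
```

In this example the third correspondence (index 2) tops the list with a disagreement of 40 units, and its disagreement vector points toward where the fit expects the point to be. In a real session the misplaced point also pulls the fit toward itself, so the ranking is a hint for the user to review rather than a definitive verdict.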
The mathematically exact relationship between two maps is solve for the affine reprojection parameters as described above.
determined by (1) the projection of each map and (2) the Suppose we have two correspondences A and B, each comprised
parameters of that projection. The projection of the reference map of points (As,Ar) and (Bs,Br) on the source and reference maps,
and its parameters are known (in our case, Mercator). Therefore, respectively. To synthesize the third correspondence, we find on
one possible approach (which we do not employ) is to try to fit each map a point C that forms a right isosceles triangle with A
various selections of projection and parameters to the user-entered and B.
correspondence data to discover a best fit. Given the fitted model
for the source map projection and the known reference projection, 4.2.3 Quadratic reprojection
the function is determined. To accommodate maps where the constraints of affine
reprojection introduce significantly visible error, we also provide
Unfortunately, the set of projections in which source maps may be polynomial reprojection, in particular the subclass quadratic
drawn is quite large, and the process of fitting parameters to each reprojection. A quadratic reprojection takes the form:
projection is diverse and involved. An alternative approach that
we use in our application is to ignore the precise projections, and sx = c01rx2 + c01rxry+ c02rx + c03ry2 + c04ry + c05
instead use an approximation to model the entire class of potential
reprojections. sy = c11rx2 + c11rxry+ c12rx + c13ry2 + c14ry + c15
Like a projection, an approximate reprojection is a class of functions selectable by parameters. MapCruncher includes two classes of reprojections: (1) affine reprojections, including both general affine reprojections and the restricted subclass of rigid reprojections, and (2) bivariate polynomial reprojections, specifically the subclass of quadratic reprojections. These will be discussed in the following sections.

4.2.1 Affine reprojection
The affine reprojection is a linear relationship between the source and reference coordinate systems:

s_x = c_{00} r_x + c_{01} r_y + c_{02}
s_y = c_{10} r_x + c_{11} r_y + c_{12}

An advantage of the affine reprojection is that it has only six parameters, which can be inferred with as few as three correspondences. (Each correspondence provides two constraint equations, one in x and one in y.) In Section 4.2.5, we discuss how these parameters are estimated.

A limitation of affine reprojection is that it preserves straight lines. If the source map is in, for example, a conic projection, then exact reprojection will change straight lines in the source map into curved lines in the reference projection. Affine reprojection cannot produce this effect, and will therefore introduce errors into maps where this effect is noticeable.

4.2.2 Rigid reprojection
A restricted subclass of affine reprojection is rigid reprojection. A rigid reprojection constrains the affine reprojection to only allow translation, scaling, and rotation, eliminating asymmetric scaling and skew. If both source map and reference map obey conformal projections (a common property which is true of Mercator), then the best affine reprojection will always be rigid.

The advantage of a rigid reprojection is that it has only four degrees of freedom instead of six, and can thus be determined with only two user-provided correspondences rather than three.

MapCruncher includes a simple mechanism by which the implementation of affine reprojection may be reused to implement rigid reprojection. As described above, affine reprojection requires three correspondences, whereas rigid reprojection requires only two. MapCruncher synthesizes a third correspondence and uses the resulting three correspondences to

By introducing terms of higher degree than the linear terms of affine reprojection, the quadratic reprojection can better approximate an exact reprojection, including some curvature. The curvature is still not perfect, because exact reprojection generally involves trigonometric functions rather than polynomials. In practice, however, we have found that the quadratic reprojection usually suffices. For most source maps, reprojection error is dominated by sources other than the limitations of our quadratic model.

The disadvantage of quadratic reprojection relative to affine reprojection is that it requires six user-entered correspondence points to completely constrain its parameters. These parameters are inferred in the same manner as those for affine reprojection, as discussed in Section 4.2.5.

4.2.4 Higher-degree polynomials
Of course, the technique used for quadratic reprojections can be extended to polynomials of higher degree. We have found in practice that quadratics are sufficient for most applications. Higher-degree polynomials might better approximate the exact

Figure 6. Reprojecting from a conic projection requires bending straight lines.
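The contrast that Figure 6 illustrates, between an affine model that preserves straight lines and a quadratic model that can bend them, can be made concrete with a small sketch. This is not MapCruncher's code. The function names are hypothetical, the six-term quadratic basis (r_x^2, r_x r_y, r_y^2, r_x, r_y, 1) is an assumption consistent with six correspondences fully constraining the model, and the fit solves the normal equations directly rather than using the SVD-based procedure described in Section 4.2.5.

```python
def monomials(rx, ry, degree):
    """Basis terms in reference coordinates: 3 for affine, 6 for quadratic."""
    terms = [rx, ry, 1.0]
    if degree == 2:
        terms = [rx * rx, rx * ry, ry * ry] + terms
    return terms


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(correspondences, degree=1):
    """Least-squares coefficients, one row per source axis (x, then y).

    correspondences: list of ((rx, ry), (sx, sy)) pairs.
    Assumption of this sketch: normal equations (A^T A) c = A^T s, not SVD.
    """
    rows = [monomials(rx, ry, degree) for (rx, ry), _ in correspondences]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(r[i] * s[axis] for r, (_, s) in zip(rows, correspondences))
               for i in range(k)]
        coeffs.append(solve(ata, atb))
    return coeffs


def reproject(coeffs, rx, ry):
    """Map a reference point to source coordinates using fitted coefficients."""
    degree = 2 if len(coeffs[0]) == 6 else 1
    terms = monomials(rx, ry, degree)
    return tuple(sum(c * t for c, t in zip(row, terms)) for row in coeffs)


# Usage: a gently curved stand-in for an exact reprojection, sampled on a 3x3 grid.
curved = lambda rx, ry: (rx + 0.1 * ry * ry, ry)
grid = [((x, y), curved(x, y)) for x in range(3) for y in range(3)]
affine, quadratic = fit(grid, degree=1), fit(grid, degree=2)
# The quadratic fit can follow the curvature; the affine fit can only average it out.
print(reproject(affine, 1, 2), reproject(quadratic, 1, 2), curved(1, 2))
```

With exactly three (affine) or six (quadratic) well-placed correspondences the same routine reduces to solving a square system; with more, it averages out error across the redundant points.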
trigonometric projection for some maps where curvature is exaggerated, but we have only rarely encountered such situations.

The top image in Figure 6 shows a source map in conic projection. The bottom image shows the map reprojected into Mercator, based on eleven manually identified correspondences. Because the image covers a large longitudinal extent, its curvature is noticeable in the reprojection. Even in this extreme case, the quadratic reprojection is sufficient for the scales of interest.

4.2.5 Parameter fitting and Error Minimization
The preceding subsections describe formulas and their parameters, but not how the parameters are determined. If a user provides the exact number of correspondences necessary for the reprojection (three correspondences for affine or six correspondences for quadratic), the parameter values can be determined with a simple matrix inverse. The resulting reprojection will place the specified correspondence points of the reprojected map at the exact locations on the reference map that the user has identified.

A user may choose to provide more correspondence points than strictly necessary. There are several reasons for this: the user may be concerned about the possibility of errors in the source map; the user may have some uncertainty about which locations in the source map correspond to which locations in the reference map; or the user may be unsure of where points should be optimally placed to minimize distortion of the reprojected map. When additional correspondences are specified, it is not generally possible to satisfy all correspondences simultaneously. Instead, MapCruncher produces a reprojection that places the specified correspondence points of the reprojected map at locations near those on the reference map that the user has identified. In particular, it attempts to minimize the mean squared distance between the reprojected correspondence points and the reference points. In other words, the parameters are determined using a linear least-squares fit. In practice, our system employs singular value decomposition (SVD) [1] to implement the fitting procedure.

4.2.6 Automatic selection
MapCruncher can reproject a source map with as few as two correspondence points established, using rigid reprojection. When a third point is added, the application begins using a general affine reprojection. As more points are added, the approximation is improved by using parameter fitting to average out error. Once there are at least n correspondences, our application switches to a quadratic reprojection.

The minimum value of the threshold n is six, since that many correspondences are required to determine a quadratic reprojection. We chose to use n = 7, because with only six points there is no redundant information, so tiny errors can cause the application to generate a quadratic reprojection with undesirable distortions. In contrast, the same six points overspecify an affine reprojection, providing sufficient redundancy to average out error.

MapCruncher allows the user to disable quadratic reprojection, in cases where affine reprojections with more than 7 correspondences are desired. This behavior is useful for source maps where it is important that straight lines not be curved, such as building floorplans.

5. TILE RENDERING
Once enough correspondences have been established, MapCruncher has sufficient information to determine the source-map pixel corresponding to every latitude and longitude covered by the source map. When the program is used interactively in the “locked” mode (i.e., the source map and reference map moving in tandem), reprojected tiles are rendered and cached on demand each time the user looks at a new area of the world.

However, in the final mashup, we decided against on-demand rendering, for several reasons. First, rendering can take a long time. This is a particular problem for slow computers, mashups that have a large number of input source maps, and mashups that have complex PDFs as source maps. Because storage is cheap and responsiveness of web applications is important, it makes more sense for MapCruncher to exhaustively pre-render all image tiles, just like Virtual Earth and Google Maps. Second, on-demand rendering places a much higher complexity burden on the user. It would require special configuration of the web server, which is often not possible for people without administrative access to one, and difficult for beginners. On-demand rendering would also limit the number of compatible web server implementations and server operating systems. In contrast, pre-rendered tiles are just data: they can be served from any Plain Old Web Server (see Section 6.2).

For these reasons, MapCruncher allows users to pre-render a

Figure 7. Identifying the set of tiles that cover a reprojected map. (Panel annotations: boundaries in source coordinates are projected into boundaries in reference coordinates and used to select tiles which contain the region of the source map.)
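The tile selection that Figure 7 depicts can be sketched as follows. The sketch assumes the reference tile pyramid is the common spherical-Mercator scheme with 2^zoom tiles per side; that assumption, the function names, and the sample boundary are illustrative and are not taken from MapCruncher. The boundary points stand for the source-map boundary after it has been carried into reference (latitude, longitude) coordinates.

```python
import math


def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    """Tile (x, y) containing a point in a 2^zoom by 2^zoom Mercator tile grid."""
    n = 2 ** zoom
    lat = math.radians(lat_deg)
    x = (lon_deg + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n
    # Clamp so points on the far edge still fall in the last tile.
    return min(n - 1, max(0, int(x))), min(n - 1, max(0, int(y)))


def tiles_covering(boundary, zoom):
    """All tiles in the axis-aligned tile range spanned by (lat, lon) boundary points."""
    xs, ys = zip(*(lat_lon_to_tile(lat, lon, zoom) for lat, lon in boundary))
    return [(x, y)
            for y in range(min(ys), max(ys) + 1)
            for x in range(min(xs), max(xs) + 1)]


# Usage: repeat the selection at each zoom level the user wants to render.
boundary = [(49.0, -124.8), (49.0, -116.9), (42.0, -116.9), (42.0, -124.8)]
for zoom in (1, 2, 3):
    print(zoom, tiles_covering(boundary, zoom))
```

Taking the rectangular tile range is a simplification: it may include a few tiles the reprojected map does not actually touch, which costs some extra rendering but never omits a needed tile for a boundary sampled densely enough.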
Figure 8. Identifying the set of tiles that cover a reprojected map. (Panel annotations: 1. Tile boundary transformed into source map coordinates. 2. Transformed boundary is axis-aligned to select a region of source map to sample.)
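The two steps annotated in Figure 8, together with the final sampling step described in Section 5.2, can be sketched as below. All names are hypothetical. The `to_source` argument stands in for the fitted reprojection function, and sampling points along the tile edges (rather than only the four corners) and padding by one pixel are assumptions of this sketch about how a curved boundary might be bounded.

```python
import math


def preimage_bbox(tile_bounds, to_source, samples=8, pad=1.0):
    """Axis-aligned source-map box around a reprojected reference tile boundary.

    tile_bounds: (x0, y0, x1, y1) in reference coordinates.
    to_source: function (rx, ry) -> (sx, sy), standing in for the reprojection.
    """
    x0, y0, x1, y1 = tile_bounds
    pts = []
    for i in range(samples + 1):
        t = i / samples
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        # Walk all four edges; a curved reprojection can bulge between corners.
        pts += [to_source(x, y0), to_source(x, y1),
                to_source(x0, y), to_source(x1, y)]
    xs, ys = zip(*pts)
    return (math.floor(min(xs) - pad), math.floor(min(ys) - pad),
            math.ceil(max(xs) + pad), math.ceil(max(ys) + pad))


def bilinear(image, sx, sy):
    """Blend the four pixels surrounding (sx, sy); image is a list of rows."""
    h, w = len(image), len(image[0])
    sx = min(max(sx, 0.0), w - 1.0)
    sy = min(max(sy, 0.0), h - 1.0)
    x0, y0 = int(sx), int(sy)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = sx - x0, sy - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


# Usage: bound the pre-image of one tile, then sample a rasterized patch.
box = preimage_bbox((0.0, 0.0, 1.0, 1.0), lambda rx, ry: (2 * rx + 1, 3 * ry))
patch = [[0.0, 10.0], [20.0, 30.0]]
print(box, bilinear(patch, 0.5, 0.5))
```

Only the region inside the returned box would need to be rasterized for the tile, which is what keeps the per-tile memory cost small.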

database of image tiles. Users can first select the maximum zoom level for which tiles are produced. Each additional zoom level increases the spatial resolution of tiles by a factor of two in each dimension, and therefore increases the total storage requirements by a factor of four.

5.1 Determining geographic extent of source map
The geographic extent of the source map is determined by applying the inverse of the reprojection function to the boundaries of the source map. The inverse function maps from source map coordinates to reference map coordinates, so this process produces a boundary in reference coordinates that corresponds to the boundary of the source map. The points on the reference boundary are converted into tile coordinates to select the set of tiles that contain the entire reprojected source image (see Figure 7). This tile selection process is repeated for each zoom level for which the user desires to output tiles.

5.2 Selecting region of source map to sample
In theory, the best-fit reprojection function is all that is needed to produce a complete set of rendered tiles: it allows us to find the source-map pixel that corresponds to every possible reference-map pixel. However, there are many choices in the implementation of tile rendering that can have dramatic effects on its efficiency and resource requirements.

There are two straightforward approaches by which rendering could be done, neither of which we use. First, one could use the reprojection function (along with information about the location and zoom level of the tile being rendered) to map each individual pixel’s location to a location in the source map; render the area of the source map defined by the extent of the pixel; and use the result of the rendering to assign visibility and color to the pixel. This approach is prohibitively expensive in terms of the computational cost per pixel.

A second inefficient approach is to first render the entire source map at the scale dictated by the tile set’s zoom level. Then, for each pixel in a final rendered tile, find the corresponding pixel in the enormous, rendered source map. This approach, used by many overlay tools, is computationally efficient and conceptually simple because the source map needs to be rendered only once. However, it is prohibitively memory-intensive when rendering maps at high zoom levels. This is because rasterizing a vector image such as a PDF source map requires memory proportional to the size of the raster. For many source maps, rasterizing the entire map at a high zoom level can result in a gigapixel or terapixel image.

MapCruncher’s approach is to render the pre-image of each tile one at a time. This approach is efficient in both computation and memory. For each final rendered tile to be generated, it determines the section of the source map needed to generate the tile, and renders only that part of the source map. To determine the section, the boundary of the reference tile in reference coordinates is transformed through the reprojection function to produce a boundary in the source map coordinate system (arrow 1 in Figure 8). An axis-aligned bounding box is drawn around the transformed tile boundary (as shown in the figure). The region is axis-aligned because most source map formats are amenable to sampling in such regions. The region is also slightly enlarged to account for projections with high curvature.

Once this target region is computed, we ask the underlying PDF renderer to produce a sample image of only the portion of the source map needed to render the final tile. This is memory-efficient because it only requires rasterization of small (approx. 300x300 pixel) images. Of course, at high zoom levels, these images may cover a minute portion of the source map. MapCruncher uses a PDF renderer licensed from Foxit Software [5], which cleverly stores the list of image vectors in the PDF so that most of them can be pruned (not rendered) when viewing a tiny region, making the pre-image approach even more computationally efficient.

Finally, this small region of the rasterized source-map image is sampled to produce the final rendered tile. For each of the 256x256 pixels in the final tile, the reprojection function is used to find the four nearest pixels in the source-map image. These four pixels are combined using bilinear interpolation.

6. DEPLOYMENT
One of our guiding principles in writing MapCruncher was that it should minimize the specialized knowledge required by the user as much as practical. Therefore, it was important that MapCruncher not only create map image tiles, but automatically emit a fully working web application that gives users instant gratification of seeing their creation come alive.

6.1 Sample Web Page
When MapCruncher renders output tiles, it also creates a sample web page that shows the user’s map layers overlaid on top of Virtual Earth’s street maps and aerial imagery. The sample page also includes a “Find…” box, allowing users to search for businesses (using Virtual Earth’s yellow pages service) and overlay pushpins right on top of their custom maps. The new “VE3D” digital globe is also supported, instantly draping the
user’s map tiles on top of a three-dimensional rendering of Earth that can be viewed from any position and angle. VE3D uses a digital elevation map that is compatible with MapCruncher tiles, so bicycle routes can actually be seen going up and over mountains (see Figure 9).

To some, it might seem that this sample web page is unnecessary: surely anyone who bothered to create a mashup will also bother to write their own web page to display it! By way of counterargument, consider Microsoft’s basic HTML editor, FrontPage. When a user opens FrontPage, it titles the default blank document “New Page 1”, a string that appears 6 million times in the MSN Search index as of this writing.

6.2 “Plain Old Web Server” Requirement
Another important constraint in our design was that the rendered mashup can be served by a “POWS”: a Plain Old Web Server. That is, we do not depend on the availability of any special server features, such as the ability to execute CGI scripts, interpret server-side includes, or configure custom error documents. Dependence on these features would limit our audience to technical users who have administrative access to a web server.

MapCruncher requires nothing from a web server other than its most basic function: return a file if it exists and a 404 error code if it does not exist. This means that users can create public mashups even without owning a web server; they can simply upload the output directory to any public web service. This includes both beginner-oriented services such as GeoCities and more advanced offerings such as Amazon S3. In both of these examples, server-side execution and custom web configuration are not available.

6.3 Applications
MapCruncher has a wide variety of uses. Three of our favorites are described here and available on the web.

6.3.1 Pacific Northwest Bicycling Guide
Our most ambitious mashup to date is the Pacific Northwest Bicycling Guide [14], a seamless combination of bicycle route maps from 7 counties and 8 municipalities around Washington and Oregon. Overlaying bicycle maps on top of the underlying street maps is quite valuable. Bicycle maps typically do not show the smaller off-trail roads, making it difficult to plan an end-to-end trip without the overlay. The seamless integration of aerial photography can also clear up ambiguities in sometimes casually drawn bicycle maps. For example, we used it to discover that a pedestrian overpass was available on a trail not clearly depicted as crossing a major highway. The “Find a business…” feature of Virtual Earth also makes it easy to, say, find an ice cream shop along your route on a hot day.

6.3.2 National Park Service Maps
The United States’ National Park Service publishes maps of more than 200 National Parks in the public domain [10]. Each is annotated with a rich set of data, including hiking trails, the names of many small lakes and rivers, geological formations, etc. In contrast, vendors of the street-map data found in most online mapping sites simply depict the park as a large blank area with the park name.

Using MapCruncher, it’s easy to combine the rich annotations found in the park maps with the aerial and satellite photography provided by Virtual Earth [13]. It’s also easy to leverage Virtual Earth’s other features to produce new composite services, for example, getting driving directions from your home to the ranger station, drawn right on top of the park map.

6.3.3 Do-It-Yourself Aerial Photography
Virtual Earth and Google are both adding and updating imagery as quickly as they can; it's a top priority for them. However, for the foreseeable future, there will always be people who want high-quality aerial photography in areas that do not yet have coverage. Previously, there was no way for users to add their own photography. MapCruncher makes this easy for the first time.

Two members of the MapCruncher team, coincidentally, are private pilots. While on a flight 4,000 feet over the small town of Forks, Washington, we had the idea of using new aerial

Figure 9. Bike trails on tiles emitted by MapCruncher draped over VE3D terrain.
photography as a source image instead of a map. We circled for several minutes, taking a few snapshots out the side window with an old digital camera.

On the ground, we imported the photos into MapCruncher, using distinctive landmarks shared by both our photos and the Virtual Earth reference photos. The results were surprisingly good [12]. While seams between the images are visible, the polynomial fitting function was able to effectively ortho-rectify large portions of our photos. (Most of them had severe perspective distortion due to being shot at an oblique angle.)

Despite these problems, there was a dramatic increase in image quality, especially relative to the time and financial cost of our project. In May of 2006, Virtual Earth’s coverage of Forks was 1m/pixel, 12-year-old, black-and-white USGS aerial photography; Google’s was 8m/pixel satellite photography. After one hour in a small airplane and a few hours on the ground, we had modern, full-color, 0.5m/pixel photography of a market so small that it’s unlikely to be re-photographed by Microsoft or Google in the near future.

7. A COMPOSABLE VIRTUAL EARTH
Most of the mashups we’ve seen to date are interesting because the whole is greater than the sum of the parts. For example, having a bicycle map integrated with a street map is more useful than either one individually. To get the most utility from mashups, it’s not enough to combine users’ maps with Virtual Earth. We also need a way to make them easily composable with each other.

Ideally, mashups will no longer be thought of as individual sites, disconnected from the rest of the world. Instead, the building blocks of mashups (the layers of rasters, points, and lines that underlie them) should be composable and interchangeable. We envision a world where mashups have more structure, so that the bicycle layer we render can easily import the Doppler weather data you’ve rendered, and can be imported into the web site that features hiking layers. If people publish their applications and the underlying data in a semantically meaningful way, a nearly infinite set of innovative and diverse applications is sure to follow.

MapCruncher tries to take a step in this direction by cleanly separating the imperative code that runs the mashup from declarative code that describes the raster layer being imported. Specifically, each time MapCruncher renders tiles, it also describes those tiles (their geographic position, rendering depth, and so forth) in an XML file specially seeded with strings that can be found by search engines. With enough people creating MapCruncher layers, we can collectively create an enormous database of interesting data layers, all geographically registered to compatible coordinate systems and instantly searchable using existing search engines.

Who knows what kind of interesting mega-mashups might follow?

8. ACKNOWLEDGMENTS
The authors would like to extend their sincere thanks to Danyel Fisher, Steve Lombardi, Karen Luecking, Joe Schwartz, Chandu Thota, and the many testers who provided us valuable feedback.

9. REFERENCES
[1] H. Abdi. "Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD)." In N.J. Salkind (Ed.): Encyclopedia of Measurement and Statistics. Thousand Oaks, Oct 2006.
[2] chicagocrime.org, http://www.chicagocrime.org/map/.
[3] W. K. Edwards, A. LaMarca. Balancing Generality and Specificity in Document Management Systems, Interact'99.
[4] B. Ford, G. Back, G. Benson, J. Lepreau, A. Lin, O. Shivers. The Flux OSKit: A Substrate for OS and Language Research, 16th SOSP, Oct 1997.
[5] Foxit Software, http://www.foxitsoftware.com/.
[6] GeoRSS. Graphically Encoded Objects for RSS feeds, http://www.georss.org/.
[7] Google. Google Earth, http://earth.google.com/.
[8] Google. Google Maps, http://maps.google.com/.
[9] J. S. Heidemann, G. J. Popek. File-System Development with Stackable Layers, ACM TOCS 12 (1), Feb 1994.
[10] Harpers Ferry Center. National Parks Service Maps, http://www.nps.gov/carto/.
[11] housingmaps.com, http://www.housingmaps.com/.
[12] MapCruncher team. Do-It-Yourself Aerial Photography, http://research.microsoft.com/mapcruncher/Gallery/Forks/
[13] MapCruncher team. National Park Maps, http://research.microsoft.com/mapcruncher/Gallery/NationalParks/
[14] MapCruncher team. Pacific Northwest Bicycling Guide, http://research.microsoft.com/mapcruncher/Gallery/NWBike/
[15] N. C. Hutchinson, L. L. Peterson. The x-Kernel: an Architecture for Implementing Network Protocols, IEEE Transactions on Software Engineering 17 (1), pp. 64-76, Jan 1991.
[16] Microsoft. Microsoft Virtual Earth, http://www.microsoft.com/virtualearth/default.mspx.
[17] Open Geospatial Consortium. Geography Markup Language, version 3.1.1.
[18] RunwayFinder – a flight planning tool for pilots, http://www.runwayfinder.com/.
[19] Seattle Bus Monster, http://www.busmonster.com/.
[20] J. Snyder. Map Projections: A Working Manual, United States Government Printing Office, Feb 1983.
