Showing posts with label projection. Show all posts

Monday, September 5, 2011

Map Projections

A reader pointed out to me recently that the pyshp documentation or wiki should include something about map projections.  And he is right.  Many programmers working with shapefiles are not necessarily geospatial professionals but have found themselves working with geodata on some project.

It is very difficult to just "scratch the surface" of GIS.  You don't have to dig very deep into this field before you uncover some of the eccentricities of geographic data. Map projections are one such feature that is easy to understand at a basic level but has huge implications for geospatial programmers.

Map projections are conceptually straightforward and intuitive.  If you try to take any three-dimensional object and flatten it onto a plane, such as your screen or a sheet of paper, the object is distorted.  (Remember the orange peel experiment from 7th grade geography?)  You can manipulate this distortion to preserve chosen properties such as area, scale, bearing, distance, or shape, but never all of them at once.

I won't go into the details of map projections as there are thousands of web pages and online videos devoted to the subject.  But there are some things you need to know for dealing with them programmatically.  First of all, most geospatial data formats don't even contain any information about map projections.  This lack of metadata is mostly a matter of geospatial cultural history, with some technical reasons mixed in.  Furthermore, while the concept of map projections is easy to grasp, the math to transform a coordinate from one projection to another is quite complex.  The end result is that most data libraries don't deal with projections in any way.
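To give a feel for that math, here is a minimal sketch of about the simplest case there is: the spherical Mercator projection, forward and inverse.  The function names and the sphere radius constant are my own choices for illustration.  Real reprojection between arbitrary systems also has to handle ellipsoids and datum shifts, which this sketch ignores.

```python
import math

# Sphere radius in meters. This is the value Web Mercator uses; it is an
# assumption of this sketch, not something stored in a shapefile.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to spherical Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def mercator_to_lonlat(x, y):
    """Invert the projection: spherical Mercator meters back to degrees."""
    lon_deg = math.degrees(x / EARTH_RADIUS_M)
    lat_deg = math.degrees(
        2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2
    )
    return lon_deg, lat_deg


# The same location expressed in two coordinate systems.
x, y = lonlat_to_mercator(-122.4192, 37.7793)
print(x, y)
print(mercator_to_lonlat(x, y))
```

Even this easy case needs a logarithm and a tangent, and it breaks down at the poles.  Most projections are considerably messier.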

But now, thanks to modern software and the Internet making data exchange easier and more common, nearly every data format, both image and vector, has tacked on a metadata format that defines the projection.  For shapefiles this is the .prj projection file, which shares the shapefile's base name and takes the .prj extension.  In this file, if it exists, you will find a string defining the projection in a format called well-known text, or WKT.  And here's a gotcha that blew my mind as a programmer a long time ago: if you don't have that projection definition, and you don't know who created the data, there is no way you are ever going to figure it out.  The coordinates in the file are just numbers and offer no clue to the projection.  You don't run into this problem much any more, but it used to be quite common because GIS shops typically produced maps and not data.  All your coworkers knew the preferred projection for your shop, so nobody bothered to create a bunch of metadata.  But now, modern GIS software won't even let you load a shapefile into a map without forcing you to choose a projection if it's not already defined.  And that's a good thing.
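Checking for that file from Python takes only the standard library.  The following is a minimal sketch, not part of pyshp: the helper name read_prj and the path are hypothetical, and it assumes the .prj file sits beside the .shp with a lowercase extension.

```python
import os


def read_prj(shapefile_path):
    """Return the WKT stored beside a shapefile, or None if there is no .prj."""
    base, _ext = os.path.splitext(shapefile_path)
    prj_path = base + ".prj"
    if not os.path.exists(prj_path):
        return None
    with open(prj_path) as prj_file:
        return prj_file.read().strip()


# Hypothetical path, used only for illustration.
wkt = read_prj("test/point.shp")
if wkt is None:
    print("No .prj file found: the projection is unknown")
else:
    print(wkt)
```

A None result is the situation described above: the coordinates alone will not tell you what you are looking at.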

If you do need to deal with projections programmatically you basically have one choice: the PROJ4 library.  It is one of the few free libraries, if not the only library period, that comprehensively deals with reprojecting geospatial data.  Fortunately it has bindings for just about every language out there and is incorporated into many libraries, including OGR.  There is a Python project called pyproj which provides Python bindings.
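As a rough illustration of what pyproj looks like in use, here is a sketch that reprojects a single point.  It assumes a reasonably recent pyproj (2.2 or later, which provides the Transformer class and the always_xy option).  The wrapper name reproject_point and the two EPSG codes, WGS84 longitude/latitude and Web Mercator, are just choices for the example.

```python
def reproject_point(lon, lat, src_crs="EPSG:4326", dst_crs="EPSG:3857"):
    """Reproject one point with pyproj (install it with `pip install pyproj`)."""
    # Imported inside the function so this sketch can be loaded even on a
    # machine where pyproj is not installed.
    from pyproj import Transformer

    # always_xy=True keeps coordinates in (x, y) = (longitude, latitude)
    # order for both input and output, whatever the CRS definition says.
    transformer = Transformer.from_crs(src_crs, dst_crs, always_xy=True)
    return transformer.transform(lon, lat)
```

Calling reproject_point(-122.4192, 37.7793) returns the same location as an (x, y) pair in Web Mercator meters.  If you are converting many points, build the Transformer once and reuse it rather than recreating it per call as this sketch does.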

So be aware that projections are not trivial and can often add a lot of complexity to what would otherwise be a simple programming project.  And also know that pyshp does nothing to work with map projections.  I did an earlier post on how to create a .prj file for a shapefile and why I chose not to include this functionality in the library itself.

Here are some other resources related to map projections.

SpatialReference.org - a clearinghouse for projection definitions

PROJ4 - the #1 map projection library

OGR2OSM - Python script to convert OGR vector formats to the OpenStreetMap format with projection support

PyProj - Python bindings for the PROJ4 library

GDAL - Python bindings to GDAL, which contains OGR and PROJ4, allowing you to reproject raster and vector data

Saturday, February 12, 2011

Create a .prj Projection File for a Shapefile

An example of a cordiform map projection, a.k.a. heart-shaped projection. Happy Valentine's!

If you create a shapefile with ESRI software, or receive one from someone who did, you may see a ".prj" file included along with the shp, shx, and dbf files.  In fact, the prj file is one of up to 9 possible "official" file extensions for various indexes and other meta files.  Most of these file formats are proprietary.  There are an additional two formats created by the open source community to work around the closed formats created by ESRI for spatial indexing.

The shapefile format itself does not allow for specifying the map projection of the data.  When ESRI created the shapefile format, everyone worked with data in only one projection.  If you tried to load a layer in a different projection into your GIS, weird things would happen.  Not too long ago, as hardware capability increased according to Moore's Law, GIS software packages developed the ability to reproject geospatial layers on the fly.  You could now load in layers in any projection, and as long as you told the software what projections were involved, the map would come together nicely.

ArcGIS 8.x allowed you to manually assign each layer a projection.  This information was stored in the prj file.  The prj file contains a WKT (Well-Known Text) string which has all the parameters for the map projection.  The format is quite simple and was defined by the Open GIS Consortium (now the Open Geospatial Consortium, or OGC).

But there are several thousand "commonly-used" map projections, which were standardized by the European Petroleum Survey Group (EPSG).  And there's no way to accurately detect the projection from the coordinates in the shapefile.  For these reasons the Python Shapefile Library does not currently handle prj files.

If you need a prj file, the easiest thing to do is write one yourself. The following example creates a simple point shapefile and then the corresponding prj file using the WGS84 "unprojected" WKT.

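import shapefile as sf

# Written against the pyshp 1.x API. In pyshp 2.x, pass the target path to
# Writer() and call w.close() instead of w.save().
filename = 'test/point'

# create the shapefile
w = sf.Writer(sf.POINT)
# pyshp expects x (longitude) first, then y (latitude)
w.point(-122.4192, 37.7793)
w.field('FIRST_FLD')
# one value per field defined above
w.record('First')
w.save(filename)

# create the PRJ file using the WGS84 geographic ("unprojected") WKT
wkt = 'GEOGCS["WGS 84",'
wkt += 'DATUM["WGS_1984",'
wkt += 'SPHEROID["WGS 84",6378137,298.257223563]]'
wkt += ',PRIMEM["Greenwich",0],'
wkt += 'UNIT["degree",0.0174532925199433]]'
with open("%s.prj" % filename, "w") as prj:
    prj.write(wkt)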

I've thought about adding the ability to optionally write prj files, but the list of "commonly-used" WKT strings is over 0.5 MB and would be bigger than the shapefile library itself.  I may eventually work something out though.

The easiest thing to do right now is just figure out what WKT string you need for your data and write a file after you save your shapefile.  If you need a list of map projection names, EPSG codes, and corresponding WKT strings, you can download it from the geospatialpython GitHub "Learn" repository here.
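If you find yourself doing this often, the step can be wrapped in a small helper.  This is only a sketch and not part of pyshp: the name write_prj is hypothetical, and the only WKT it ships with is the WGS84 string from the example above.  For anything else you would look up the right string yourself.

```python
import os

# WGS84 geographic ("unprojected") WKT, the same string used in the example above.
WGS84_WKT = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],'
    'UNIT["degree",0.0174532925199433]]'
)


def write_prj(shapefile_path, wkt):
    """Write wkt to a .prj file that shares the shapefile's base name."""
    # Accepts 'test/point' as well as 'test/point.shp'.
    base, _ext = os.path.splitext(shapefile_path)
    prj_path = base + ".prj"
    with open(prj_path, "w") as prj_file:
        prj_file.write(wkt)
    return prj_path
```

After saving the shapefile, a call such as write_prj('test/point', WGS84_WKT) drops test/point.prj next to the other files, provided the test directory already exists.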

A word of warning if you are new to GIS and shapefiles: the prj file is just metadata about your shapefile.  Changing the projection reference in the prj file will not change the actual projection of the geometry and will just confuse your GIS software.