Pages

Showing posts with label subset. Show all posts
Showing posts with label subset. Show all posts

Friday, November 4, 2011

Deleting Shapefile Features

Sometimes, usually as a server-based operation, you need to delete all of the features in a shapefile. All you want left is the shapefile type, the dbf schema, and maybe the overall bounding box. This shapefile stub can then be updated by other processes. Pyshp currently doesn't have an explicit "delete" method. But because pyshp converts everything to native Python data types (strings, lists, dicts, numbers) you can usually manipulate things fairly easily. The solution is very similar to merging shapefiles but instead you are writing back to the same file instead of a new copy. There's only one hitch in this operation resulting from a minor difference in the pyshp Reader and Writer objects. In the reader the "bbox" property returns a static array of [xmin, ymin, xmax, ymax]. The Writer also has a "bbox" property but it is a method that is called when you save the shapefile. The Writer calculates the bounding box on the fly by reading all of the shapes just before saving. But in this case there are no shapes so the method would throw an error. So what we do is just override that method with a lambda function to return whatever bbox we want whether it be the original bbox or a made up one.
import shapefile 
# Read the shapefile we want to clear out
r = shapefile.Reader("myshape") 
# Create a Writer of the same type to save out as a blank
w = shapefile.Writer(r.shapeType) 
# This line will give us the same dbf schema 
w.fields = r.fields 
# Use the original bounding box in the header 
w.bbox = lambda: r.bbox 
# Save the featureless, attribute-less shapefile
w.save("myshape") 
Instead of using the original bounding box we could just populate it with 0's and let the update process repopulate it:
w.bbox = lambda: [0.0, 0.0, 0.0, 0.0]
Note coordinates in a shapefile must always be floating-point numbers. Sometimes you may not want to delete all of the features. You may want to select certain features by attribute or using a spatial operation.

Tuesday, August 23, 2011

Point in Polygon 2: Walking the line

Credit: SpikedMath.com
This post is a follow-up to my original article on testing if a point is inside a polygon.  Reader Sebastian V. pointed out the ray-casting alogrithm I used does not test to see if the point is on the edge of the polygon or one of the verticies.  He was even nice enough to send a PHP script he found which uses an indentical ray-casting method and includes a vertex and edge test as well.

These two checks are relatively simple however whether they are necessary is up to you and how you apply this test.  There are some cases where a boundary point would not be considered for inclusion.  Either way now you have an option.  This function could even be modified to optionally check for boundary points.

# Improved point in polygon test which includes edge
# and vertex points

def point_in_poly(x,y,poly):

   # check if point is a vertex
   if (x,y) in poly: return "IN"

   # check if point is on a boundary
   for i in range(len(poly)):
      p1 = None
      p2 = None
      if i==0:
         p1 = poly[0]
         p2 = poly[1]
      else:
         p1 = poly[i-1]
         p2 = poly[i]
      if p1[1] == p2[1] and p1[1] == y and x > min(p1[0], p2[0]) and x < max(p1[0], p2[0]):
         return "IN"
      
   n = len(poly)
   inside = False

   p1x,p1y = poly[0]
   for i in range(n+1):
      p2x,p2y = poly[i % n]
      if y > min(p1y,p2y):
         if y <= max(p1y,p2y):
            if x <= max(p1x,p2x):
               if p1y != p2y:
                  xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
               if p1x == p2x or x <= xints:
                  inside = not inside
      p1x,p1y = p2x,p2y

   if inside: return "IN"
   else: return "OUT"

# Test a vertex for inclusion
poligono = [(-33.416032,-70.593016), (-33.415370,-70.589604),
(-33.417340,-70.589046), (-33.417949,-70.592351),
(-33.416032,-70.593016)]
lat= -33.416032
lon= -70.593016

print point_in_poly(lat, lon, poligono)

# test a boundary point for inclusion
poly2 = [(1,1), (5,1), (5,5), (1,5), (1,1)]
x = 3
y = 1
print point_in_poly(x, y, poly2)
You can download this script here.

Saturday, December 18, 2010

Subsetting a Shapefile by Attributes

If you want to select only certain features in one shapefile and export them to another you have two options.  You can select features spatially or by the database attributes.  You can subset by attributes using the Python Shapefile Library in just a few lines of code.  In this example I use a building footprint shapefile which spans three counties and extract building footprints from just one of the counties.  The county name is one of the attributes.  The first step is to create a shapefile reader for the original 41 megabyte building footprint shapefile, Next we create a shapefile writer as a target for extracted features.  We copy the database fields from the first shapefile to the second.  We then make the selection based on attributes.  Next the features in this selection are added to the writer.  Finally the new the shapefile is written.

import shapefile

# Create a reader instance
r = shapefile.Reader("Building_Footprint")
# Create a writer instance
w = shapefile.Writer(shapeType=shapefile.POLYGON)
# Copy the fields to the writer
w.fields = list(r.fields)
# Grab the geometry and records from all features 
# with the correct county name 
selection = [] 
for rec in enumerate(r.records()):
   if rec[1][1].startswith("Hancock"):
      selection.append(rec) 
# Add the geometry and records to the writer
for rec in selection:
   w._shapes.append(r.shape(rec[0]))
   w.records.append(rec[1])
# Save the new shapefile
w.save("HancockFootprints") 

I originally used python list comprehensions for the two loops in this example.  They usually run faster than "for" loops. However some basic testing showed them to be about the same speed in this case and a little harder to read.  If your selection were more complex you probably want to use a for loop anyway to select by multiple attributes or other filters.

As usual the code for this example can be found on the "geospatialpython" Google Code project in the source tree. The shapefile can be found on the same site in the download section.