Swimming Pool Detection and Classification Using Deep Learning
ArcGIS Pro showing residential parcels with pools highlighted in blue. Note
that some parcels with pools are missing, indicating outdated data. The goal
of this project was to identify all such parcels.
Doing this through GIS and AI would significantly reduce the expensive
human labor involved in updating the records through field visits to
each property.
ArcGIS Pro includes tools for labeling and exporting training data
For training, we used the Adam optimizer with a one-cycle learning rate
schedule. We also employed discriminative learning rates while fine-
tuning the model. All of these techniques are provided by the
fast.ai library.
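As a rough illustration of what these two techniques do, here is a sketch of the schedule shapes (the function names and parameter defaults below are our own, not fast.ai's actual implementation):

```python
import math

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.3, div=25.0, final_div=1e4):
    """One-cycle schedule: the learning rate warms up from lr_max/div to
    lr_max over the first pct_start of training, then anneals down to
    lr_max/final_div, following cosine curves in both phases."""
    warm = int(total_steps * pct_start)
    if step < warm:
        t = step / max(1, warm)
        lo = lr_max / div
        return lo + (lr_max - lo) * (1 - math.cos(math.pi * t)) / 2
    t = (step - warm) / max(1, total_steps - warm)
    lo = lr_max / final_div
    return lo + (lr_max - lo) * (1 + math.cos(math.pi * t)) / 2

def discriminative_lrs(lr_max, n_groups=3, factor=3.0):
    """Discriminative learning rates: earlier (more generic) layer groups
    get smaller rates, while the head trains at the full rate."""
    return [lr_max / factor ** (n_groups - 1 - i) for i in range(n_groups)]
```

In fast.ai itself these correspond to calling `fit_one_cycle` and passing a `slice` of learning rates so each layer group gets its own rate.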
Imagery
An important consideration when training deep learning models is
choosing the imagery. Using the most current and spatially accurate
satellite imagery is important, and the resolution at which to perform
training and inference, as well as which bands to use, can be critical.
Satellite imagery often includes bands beyond the visible spectrum. It
might seem obvious to use all available bands for training the model.
However, there are certain advantages to using just three bands, and
that worked quite well for us in practice.
First, the RGB bands are always available no matter which satellite /
sensor is used. In theory, we could train a model on imagery from one
satellite / sensor and deploy it on another. This strategy could also be
used for data and test time augmentation and further improve model
performance.
The next logical step was to keep three bands but try a band
combination other than RGB. The USA NAIP Imagery: Color
Infrared layer uses the Near-Infrared, Red and Green bands, which makes
pools stand out as distinctly blue patches because water strongly
absorbs near-infrared light. An example of the NAIP Color Infrared
imagery (a false color composite) is below.
We can easily locate the blue patches where the swimming pools are. We
then chipped out these images from the NAIP infrared imagery and
trained our model, finally seeing improved results. We were able to get
decent results with around 2,000 NAIP infrared images, but the model
still missed some pools. At this point, we recalled the golden rule of
deep learning: more data. We did heavy data augmentation, taking 50
random jitters around each pool location. Using this technique, we were
able to turn those 2,000 images into 100,000. Upon training the
complete model again, the validation loss went down to around 18. We
tried training further, but the model started to overfit after that.
Let’s see some of the results after training completely on NAIP
Infrared Imagery.
Results of the model trained on infrared bands from NAIP imagery.
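The jittering step described above can be sketched as follows (a minimal illustration; the chip size, shift range and function name are our own assumptions):

```python
import random

def jitter_boxes(pool_x, pool_y, chip_size=224, n_jitters=50, max_shift=60, seed=0):
    """Generate n_jitters chip bounding boxes, each a randomly shifted
    chip-sized window around one labeled pool location. Keeping max_shift
    well below half the chip size guarantees the pool stays inside every chip."""
    rng = random.Random(seed)
    half = chip_size // 2
    boxes = []
    for _ in range(n_jitters):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        x0, y0 = pool_x - half + dx, pool_y - half + dy
        boxes.append((x0, y0, x0 + chip_size, y0 + chip_size))
    return boxes
```

Each box is then used to clip a training chip from the imagery, so one labeled pool yields fifty slightly different training examples.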
Inferencing
Once the model was fully trained and giving good results, we wanted to
test it on a larger area than the small image chips used for training
and validation. We created a script to export a larger area of the
NAIP imagery and find all pools within it. This was done by splitting
the larger image into the smaller, fixed-size chips the model requires.
All these chips were passed as a batch to the model, and the
predictions were gathered, combined and visualized.
Below is the result of that visualization.
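The chipping-and-batching step can be sketched like this (a simplified version; the padding strategy and names are our own assumptions):

```python
import numpy as np

def chip_image(img, chip_size=224):
    """Split a large H x W x C image into model-sized chips. The image is
    zero-padded on the right and bottom so every pixel is covered, and each
    chip's origin is recorded so predictions can be mapped back later."""
    h, w, c = img.shape
    pad_h = (chip_size - h % chip_size) % chip_size
    pad_w = (chip_size - w % chip_size) % chip_size
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)))
    chips, origins = [], []
    for y in range(0, padded.shape[0], chip_size):
        for x in range(0, padded.shape[1], chip_size):
            chips.append(padded[y:y + chip_size, x:x + chip_size])
            origins.append((x, y))
    return np.stack(chips), origins
```

The stacked array can then be fed to the model as a single batch, and each chip's detections offset by its origin to recover map positions.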
In addition, we ran predictions twice: first on the actual chip, and
then on a center crop from the original chip. We selected the center
crop so that pools positioned at the edges of the smaller chips now
appeared at the center. This simple strategy allowed the missing pools
to be detected. Extending
this approach to five different center crops enabled us to increase the
recall (fraction of correctly detected pools over the total amount of
actual pools) without negatively affecting the precision (fraction of
correctly detected pools among all detected pools).
The mechanism of inner cropping.
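One way to implement the inner-cropping idea is with a second chip grid offset by half a chip (a sketch under assumed names; the five-crop variant mentioned above generalizes this with more offsets):

```python
def inference_origins(width, height, chip_size=224):
    """Chip origins for two inference passes: a base grid, plus a grid
    shifted by half a chip in both x and y. A pool cut by a chip edge in
    the base pass lands near the center of some chip in the shifted pass."""
    half = chip_size // 2
    base = [(x, y) for y in range(0, height, chip_size)
                   for x in range(0, width, chip_size)]
    shifted = [(x, y) for y in range(half, height, chip_size)
                      for x in range(half, width, chip_size)]
    return base, shifted
```

Detections from both passes are mapped back to map coordinates and duplicates merged, which is what raises recall without hurting precision.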
Deep learning is great at what it does, but it can still make silly
mistakes at times. For instance, we occasionally got false positives
for pools on freeways as well as on hills! Many of the false positives
were of low confidence and got filtered out, but some were
high-confidence false positives, perhaps as a result of overfitting.
There were false positives specifically over large water bodies, which
actually contained water and appeared blue in the NAIP imagery. We
considered several options to remove these false positives, like
training the network to detect a second class of water bodies, but that
seemed like overkill for the problem.
Now that we had a good pool detector, we wanted to find the parcels
containing swimming pools that were not being assessed correctly.
The Join Features tool in ArcGIS Online came in handy and we were
able to create information products like feature layers of the
unassessed pools as well as web maps for visualizing the results.
Surprisingly, we were able to identify approximately 600 new pools
that were not marked correctly in the database.
The red parcels are the ones that are not being correctly assessed for having a
pool, based on our data
Once we got good results detecting pools, we went a step further and
classified them as clean or green (i.e. neglected pools, sometimes also
referred to as ‘zombie pools’). Green pools often contain algae and can
be breeding grounds for mosquitoes and other insects. Mosquito
Control agencies need a simple solution that helps them locate such
pools and drive field activity and mitigation efforts.
Distributed inferencing
One of the things that bothered us was the relatively long time
inference used to take. It took us approximately 10 minutes on
Google Cloud Platform to perform pool detection on the complete City
of Redlands, which could be a problem for a live demo. We then got our
hands dirty with distributed computing on GPUs. We wrote an
inference script which used Python’s subprocess module to spawn a
process per GPU and run inference on pre-downloaded chips. On a
single p2.16xlarge AWS instance, we were able to run inference on the
entire city, covering an area over 100,000 sq km, within 50 seconds.
At this speed we can detect pools in San Bernardino, the largest
county in the US, in under an hour.
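A multi-GPU fan-out like this can be sketched with the subprocess module as follows (the worker script name and CLI flags here are hypothetical, not our actual script):

```python
import os
import subprocess

def build_worker_cmds(chip_dirs, script="detect_pools.py"):
    """Build one (command, env) pair per worker: worker i is pinned to
    GPU i via CUDA_VISIBLE_DEVICES and handles its own directory of
    pre-downloaded chips. (detect_pools.py is a hypothetical single-GPU
    inference script.)"""
    cmds = []
    for gpu, chip_dir in enumerate(chip_dirs):
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        cmds.append((["python", script, "--chips", chip_dir], env))
    return cmds

def run_workers(cmds):
    """Launch all workers in parallel, then wait; returns exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in cmds]
    return [p.wait() for p in procs]
```

On a 16-GPU instance this means 16 concurrent inference processes, each seeing exactly one device.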
Deployment
A primary goal of this project was to apply the latest research in deep
learning to solve real-world problems, be it updating outdated county
records or galvanizing mosquito abatement drives.
Esri has also recently introduced (as beta) an Image Visit configurable
app template that lets image analysts visually inspect the results of an
object detection workflow and categorize them as correct detections or
errors. A live demo of the configured web app is here. This information
could then also be fed into better training or to filter the results and
prioritize field activities.
Image Visit app to enable visual inspection of neglected pools detected by
deep learning model.
That’s where the field mobility capabilities of the ArcGIS platform can
be put to use. Workforce for ArcGIS allows for the creation of
assignments for mobile workers, such as inspectors in mosquito control
agencies, and helps drive field activity. We used the recently
introduced apps module
in ArcGIS API for Python to automate creation of Workforce
assignments for field workers. These assignments make it easy for field
workers to stay organized, report progress, and remain productive
while conducting mosquito abatement drives based on the results of
the neglected pool detection analysis.
Pool inspection assignments for field workers in Workforce for ArcGIS
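The detection-to-assignment step can be sketched as plain-Python payload building (the field names below are illustrative and only mirror the kind of parameters the `arcgis.apps.workforce` module takes; actually creating assignments requires an authenticated GIS connection):

```python
def detections_to_assignments(detections, min_score=0.75):
    """Turn high-confidence green-pool detections into assignment
    records ready to hand to a Workforce project. The dict keys here
    are illustrative, not the exact workforce API signature."""
    records = []
    for d in detections:
        if d["score"] < min_score:
            continue  # skip low-confidence detections
        records.append({
            "geometry": {"x": d["x"], "y": d["y"]},
            "description": "Neglected pool (confidence %.2f)" % d["score"],
            "priority": "high" if d["score"] > 0.9 else "medium",
        })
    return records
```

Each record then becomes one inspection assignment that a dispatcher can assign to a field worker.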