Tutorial Nearest Neighbor Analysis Using QGIS
The topics covered by this tutorial are: importing a CSV file to QGIS; understanding and using the Distance Matrix tool; and using table joins to merge the result of the analysis with the source data. Let's get started.
In this tutorial, we will walk through the process and answer this question: given the locations of all known significant earthquakes, find the nearest populated place for each location where an earthquake happened. We will be using the Natural Earth Populated Places dataset along with NOAA's National Geophysical Data Center's Significant Earthquake Database.
Follow the instructions in the related tutorial to import the Significant Earthquake CSV file to QGIS. Also open the Natural Earth populated places layer using Layer > Add Vector Layer.
In the screenshot, each green point represents the location of a significant earthquake and each blue point represents the location of a populated place. We need a way to find the nearest point from the populated places layer for each of the points in the earthquake layer.
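To make the goal concrete, here is a minimal sketch in plain Python of what "nearest point" means, independent of QGIS. It assumes coordinates are longitude and latitude in degrees and uses the haversine great-circle formula. The function names and the sample records are hypothetical, and the coordinates are approximate values chosen only for illustration.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points (degrees)."""
    r = 6371.0  # mean Earth radius in km (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_place(quake, places):
    """Return (place_id, distance_km) for the place closest to the quake."""
    best = min(places, key=lambda p: haversine_km(quake[1], quake[2], p[1], p[2]))
    return best[0], haversine_km(quake[1], quake[2], best[1], best[2])

# Hypothetical sample records: (id, lon, lat), approximate coordinates
places = [("Tokyo", 139.69, 35.69), ("Lima", -77.04, -12.05), ("Rome", 12.48, 41.90)]
quake = ("Q1", 142.37, 38.32)

name, dist = nearest_place(quake, places)
print(name, round(dist), "km")
```

This brute-force search compares every earthquake against every place, which is fine for a few thousand points but slow for very large layers.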
We will use a tool called Distance Matrix for this analysis. Open the tool from Vector > Analysis Tools > Distance Matrix.
Here, select the earthquake layer as the Input point layer and the populated places layer as the Target point layer. You also need to select a unique ID field from each of these layers; the values of these fields are what identify each feature in the results. In this analysis, we are looking for only the single nearest point, so check the box next to Use only the nearest (k) target points and enter 1. Name your output file matrix.csv and click OK.
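The effect of the "nearest (k) target points" option can be sketched in plain Python as follows. This is not the QGIS implementation: it assumes simple planar (Euclidean) distances, a hypothetical helper named distance_matrix, made-up sample points, and an in-memory buffer standing in for matrix.csv.

```python
import csv
import heapq
import io
import math

def distance_matrix(inputs, targets, k=1):
    """Yield (input_id, target_id, distance) for the k nearest targets of each input."""
    for in_id, x, y in inputs:
        # Pair every target with its planar distance to the current input point
        scored = ((math.hypot(x - tx, y - ty), t_id) for t_id, tx, ty in targets)
        for dist, t_id in heapq.nsmallest(k, scored):
            yield in_id, t_id, dist

# Hypothetical points in an arbitrary planar coordinate system: (id, x, y)
quakes = [("Q1", 0.0, 0.0), ("Q2", 10.0, 10.0)]
towns = [("A", 1.0, 0.0), ("B", 9.0, 9.0), ("C", 50.0, 50.0)]

buffer = io.StringIO()  # stands in for matrix.csv
writer = csv.writer(buffer)
writer.writerow(["InputID", "TargetID", "Distance"])
writer.writerows(distance_matrix(quakes, towns, k=1))
print(buffer.getvalue())
```

With k=1 there is exactly one output row per input point; raising k produces k rows per input, ordered from nearest to farthest.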
Once your file is generated, you can view it in Notepad or any text editor. QGIS can import CSV files as well, so we will add it to QGIS and view it there. Click Layer > Add Vector Layer. Navigate to the path where you saved matrix.csv and click OK.
Right-click on the table layer and select Open Attribute Table.
Now you will be able to see the content of our results. The InputID field contains the value of the unique field you chose from the Earthquake layer. The TargetID field contains the name of the feature from the Populated Places layer that was closest to the earthquake point. The Distance field is the distance between the two points.
This is very close to the result we were looking for. For some uses, this table would be sufficient. I will demonstrate how we can use Table Joins to integrate these results into our original Earthquake layer. See the related tutorial for more details on Table Joins. Right-click on the Earthquake layer and select Properties.
We want to join the data from our analysis result (matrix.csv) to this layer. We need to select a field from each of the layers that has the same values. Select the fields as shown below.
You will see the join appear in the Joins tab. Click OK.
You will see that for every Earthquake feature, we now have an attribute which is the nearest neighbor (closest populated place) and the distance to the nearest neighbor.
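Conceptually, the table join is a lookup keyed on the shared field. The sketch below imitates it in plain Python. The CSV contents, the quake_id field name, and the attribute values are all made up for illustration, and it assumes that features without a match receive empty values.

```python
import csv
import io

# Hypothetical contents of matrix.csv (values are made up for illustration)
matrix_csv = """InputID,TargetID,Distance
Q1,Town A,12.5
Q2,Town B,88.0
"""

# Hypothetical earthquake attribute rows; "quake_id" stands in for the unique field
earthquakes = [
    {"quake_id": "Q1", "magnitude": 7.1},
    {"quake_id": "Q2", "magnitude": 6.4},
    {"quake_id": "Q3", "magnitude": 5.9},
]

# Index the analysis result by the join field so each lookup is a dictionary access
lookup = {row["InputID"]: row for row in csv.DictReader(io.StringIO(matrix_csv))}

joined = []
for quake in earthquakes:
    match = lookup.get(quake["quake_id"])
    joined.append({
        **quake,
        "nearest_place": match["TargetID"] if match else None,
        "distance": float(match["Distance"]) if match else None,
    })

for row in joined:
    print(row)
```

The join only works when the values in the two fields match exactly, which is why the same unique field must be used in the Distance Matrix step and in the join.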
A useful thing to note is that you can even perform the analysis with only one layer. Select the same layer as both Input and Target. The result would be the nearest neighbor from the same layer instead of from a different layer as we used here. Let me know in the comments how you have used this tool and what kind of cool applications you can think of for it.
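The single-layer case has one subtlety worth sketching: every point is at distance zero from itself, so the point itself has to be skipped when searching. The plain-Python sketch below does this explicitly. It assumes planar coordinates, at least two points, and a hypothetical helper named nearest_within; whether QGIS performs this exclusion for you is worth checking in your version.

```python
import math

def nearest_within(points):
    """Map each point id to (nearest_other_id, distance), skipping the point itself.

    Assumes at least two points with distinct ids.
    """
    result = {}
    for pid, x, y in points:
        others = [
            (math.hypot(x - ox, y - oy), oid)
            for oid, ox, oy in points
            if oid != pid  # exclude the point itself, which is always at distance 0
        ]
        dist, oid = min(others)
        result[pid] = (oid, dist)
    return result

# Hypothetical points in an arbitrary planar coordinate system: (id, x, y)
points = [("A", 0.0, 0.0), ("B", 3.0, 4.0), ("C", 10.0, 0.0)]
neighbors = nearest_within(points)
for pid, (oid, dist) in neighbors.items():
    print(pid, "->", oid, round(dist, 2))
```

Note that the relationship is not symmetric: here C's nearest neighbor is B, but B's nearest neighbor is A.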