Task 2
Problem Statement
In modern urban planning, the efficient design of road networks is crucial for fostering
connectivity and sustainable development. Night-time satellite imagery offers a unique
perspective, with illuminated areas often indicative of urban centers. In this task, you have to
develop an algorithm to extract the cities from such a satellite image.
You are given a simulated night-time satellite image: a 64x64 grid of squares in which white dots
mark places where there is light. Your job is to create an algorithm that finds these dots
and locates the cities on the 64x64 grid by clustering them.
(NOTE: Do not use libraries such as scikit-learn for clustering.)
After identifying the cities through clustering, calculate the distances between them, measured
between the center points of the cities. Present the distances in a clear format, such as a table.
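As an illustration of the distance step, here is a minimal sketch that assumes the city centers are already available as (row, col) pairs and that straight-line (Euclidean) distance is the intended measure. The function names `distance_table` and `format_table` are hypothetical and chosen only for this example.

```python
import numpy as np

def distance_table(centers):
    """Return an (n, n) array of Euclidean distances between city centers."""
    centers = np.asarray(centers, dtype=float)
    # Broadcasting gives every pairwise difference, shape (n, n, 2).
    diff = centers[:, None, :] - centers[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def format_table(dist):
    """Render the distance matrix as a plain-text table with numbered cities."""
    n = len(dist)
    lines = ["City".ljust(8) + "".join(f"{j + 1:>8}" for j in range(n))]
    for i in range(n):
        row = "".join(f"{dist[i, j]:>8.2f}" for j in range(n))
        lines.append(f"{i + 1:<8}" + row)
    return "\n".join(lines)

# Made-up centers, purely for demonstration.
demo_centers = [(0, 0), (3, 4), (6, 8)]
demo_dist = distance_table(demo_centers)
print(format_table(demo_dist))
```

The matrix is symmetric with zeros on the diagonal, so only the upper or lower triangle needs to be read.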
Resources for the task
1. How are images stored?
Think of an image like a grid of tiny colored squares. Each square is called a "pixel". These
pixels come together to form the image you see on your screen. Each pixel has information about
its color. The images given in this task are grayscale images.
In a grayscale image, each pixel is represented by a single value that denotes its brightness or
intensity. This value typically ranges from 0 to 255, where 0 represents black and 255 represents
white. The values in between represent varying shades of gray.
To explore further, read "How Images Are Stored in a Computer | Grayscale & RGB".
Python offers various libraries for working with images. One popular library is Pillow. It allows
you to open, manipulate, and save many different image file formats.
Here you will use it to convert the image into a 2D NumPy array. This conversion allows
for easy manipulation and analysis of the image data using the functionality provided by NumPy.
For a guided example, see "A Guide to Converting PIL Images to NumPy Arrays in Python".
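The conversion could look roughly like the sketch below. The helper name `load_grayscale` and the file name in the usage comment are assumptions, since the handout does not name the image file.

```python
import numpy as np
from PIL import Image

def load_grayscale(path):
    """Open an image file and return it as a 2D uint8 array of shape (rows, cols)."""
    with Image.open(path) as img:
        # "L" is Pillow's 8-bit grayscale mode: one value per pixel, 0 to 255.
        return np.array(img.convert("L"))

# Usage (hypothetical file name):
# pixels = load_grayscale("night_image.png")
# print(pixels.shape)
```

Converting to mode "L" first means the result is a 2D array even if the file was saved as RGB.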
Now that your image has been transformed into an array, you should be able to pinpoint
the coordinates corresponding to the lights. Clustering these points then helps
identify the cities depicted in the provided images.
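One way to pick out the lit pixels is sketched below. The brightness cutoff of 128 is an assumption (the handout only says the dots are white), and `light_coordinates` is a hypothetical helper name.

```python
import numpy as np

def light_coordinates(pixels, threshold=128):
    """Return an (N, 2) array of (row, col) positions brighter than `threshold`."""
    # np.argwhere lists the indices of all True entries in row-major order.
    return np.argwhere(pixels > threshold)

# Tiny 4x4 stand-in for the 64x64 image.
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1, 2] = 255
demo[3, 0] = 200
points = light_coordinates(demo)
print(points.tolist())  # [[1, 2], [3, 0]]
```

Note that the coordinates come out as (row, col), i.e. (y, x), which matters if they are later plotted.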
You need to implement the K-means algorithm yourself, from scratch. Do not use the
scikit-learn implementation.
Pseudo Code:
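The usual K-means loop (assign each point to its nearest centroid, move each centroid to the mean of its points, repeat until the centroids stop moving) might be sketched in NumPy as follows. This is one possible shape, not a reference solution: the random initialization from data points, the `seed` and `max_iters` parameters, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` (N, 2) into k groups; return (centroids, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random (needs k <= N).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iters):
        # Distance from every point to every centroid, shape (N, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points; leave it in place if it has none.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the centroids that are actually returned.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)
```

Because the starting centroids are random, different seeds can give different cluster numbering, and on harder data different clusterings, so running several seeds and keeping the best result is a common precaution.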
The Elbow Method is a popular technique used in cluster analysis to determine the optimal
number of clusters in a dataset. It's a graphical approach that helps in finding the point of
diminishing returns when adding more clusters to the data.
Read more:
https://www.analyticsvidhya.com/blog/2021/01/in-depth-intuition-of-k-means-clustering-algorithm-in-machine-learning/
Pseudo Code
1. Initialize an empty list to store the within-cluster sum of squares (WCSS) for each
   value of k
2. For k = 1 to max_clusters:
       Perform clustering using the K-means algorithm with k clusters
       Calculate the within-cluster sum of squares (WCSS) for the clustering result
       Append the WCSS value to the list
3. Plot the WCSS values against the number of clusters (k)
4. Identify the "elbow point" where the rate of decrease in WCSS slows down
5. Return the optimal number of clusters (k) corresponding to the elbow point
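The steps above can be sketched in Python as shown below. Several things here are assumptions rather than part of the task: the elbow is picked automatically as the k with the largest second difference of the WCSS curve (a simple heuristic, whereas the handout suggests reading it off a plot), and the compact K-means helper uses a deterministic farthest-point start purely so that the example gives the same answer on every run. The names `kmeans_wcss` and `elbow` are hypothetical.

```python
import numpy as np

def kmeans_wcss(points, k, max_iters=100):
    """Run a small K-means and return its within-cluster sum of squares."""
    points = np.asarray(points, dtype=float)
    # Assumption: farthest-point start, used only to keep the demo repeatable.
    centroids = [points[0]]
    while len(centroids) < k:
        gaps = np.linalg.norm(points[:, None] - np.array(centroids)[None], axis=2)
        centroids.append(points[gaps.min(axis=1).argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(max_iters):
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # WCSS: squared distance from each point to its nearest centroid, summed.
    dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
    return float((dists.min(axis=1) ** 2).sum())

def elbow(points, max_clusters):
    """Return (k_best, wcss_list); needs max_clusters >= 3 for the heuristic."""
    wcss = [kmeans_wcss(points, k) for k in range(1, max_clusters + 1)]
    # Heuristic: the curve bends most where its second difference is largest.
    # Entry j of np.diff(wcss, n=2) is centered on k = j + 2.
    k_best = int(np.argmax(np.diff(wcss, n=2))) + 2
    return k_best, wcss

# Three made-up, well-separated "cities" of four lights each.
demo_points = [(r + dr, c + dc)
               for r, c in [(0, 0), (30, 0), (0, 30)]
               for dr in (0, 1) for dc in (0, 1)]
k_best, wcss = elbow(demo_points, max_clusters=6)
print(k_best)  # 3
```

The returned `wcss` list can also be plotted against k (for example with matplotlib's `plt.plot`) to check the automatic choice by eye.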