We often use a technique called data smoothing to clean up data and make it better suited for statistical analysis. During the smoothing process we define a set of ranges, called bins, and place each data value into the bin whose range contains it. This is called the binning method. Below is a worked example of binning, followed by a Python program that implements it.
Binning Example
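Let’s take a series of numbers and find its maximum and minimum values. Decide on the number of bins, depending on how many groups the analysis needs. Create these groups so that together they span the range from the minimum to the maximum, and assign each number to its group. The upper value of a group is excluded and belongs to the next group. The one exception is the last group, which includes its upper value so that the maximum still has a group.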
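The steps above can be sketched as a small helper that derives the group boundaries from the data. This is only an illustration under one assumption: that the groups have equal width, which the text does not require. The names `equal_width_edges` and `n_groups` are hypothetical and chosen for this sketch.

```python
def equal_width_edges(values, n_groups):
    """Return n_groups + 1 boundaries that split [min, max] evenly."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    low, high = min(values), max(values)
    width = (high - low) / n_groups  # assumption: all groups share one width
    # Compute each edge from low to avoid accumulating rounding error,
    # and pin the last edge to the exact maximum
    edges = [low + i * width for i in range(n_groups)]
    edges.append(high)
    return edges

numbers = [12, 32, 10, 17, 19, 28, 22, 26, 29, 16]
edges = equal_width_edges(numbers, 4)
print(edges)  # [10.0, 15.5, 21.0, 26.5, 32]
```

Exact equal-width boundaries are often fractional, as here. Rounding them to whole numbers (15, 21, 27) keeps the groups easy to read and, for this data, does not change which group any number falls into.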
Example
Given numbers: 12, 32, 10, 17, 19, 28, 22, 26, 29, 16
Number of groups: 4
Max value: 32
Min value: 10
So the groups are: (10-15), (15-21), (21-27), (27-32)
Output
Putting the numbers into bins gives the following result:
12 -> (10-15)
32 -> (27-32)
10 -> (10-15)
17 -> (15-21)
19 -> (15-21)
28 -> (27-32)
22 -> (21-27)
26 -> (21-27)
29 -> (27-32)
16 -> (15-21)
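The assignment in this example can be reproduced with a short sketch. It takes the group boundaries 10, 15, 21, 27, 32 directly from the example and uses the standard library's `bisect_right` to find the group. The function name `group_index` is hypothetical, and treating the last group as closed on the right is an assumption made so that 32 lands in (27-32), as it does in the result above.

```python
from bisect import bisect_right

numbers = [12, 32, 10, 17, 19, 28, 22, 26, 29, 16]
edges = [10, 15, 21, 27, 32]  # boundaries of the four groups in the example

def group_index(value, edges):
    """Return the index of the group that holds value.

    Every group is half-open, [low, high), except the last one,
    which also includes its upper edge.
    """
    if not edges[0] <= value <= edges[-1]:
        raise ValueError(f"{value} is outside {edges[0]}-{edges[-1]}")
    # bisect_right counts how many edges are <= value
    index = bisect_right(edges, value) - 1
    # Only the maximum lands one past the last group; pull it back
    return min(index, len(edges) - 2)

groups = {}
for n in numbers:
    i = group_index(n, edges)
    groups.setdefault((edges[i], edges[i + 1]), []).append(n)

for bounds, members in sorted(groups.items()):
    print(bounds, members)
```

A boundary value such as 15 goes to (15-21) rather than (10-15), which matches the rule that the upper value belongs to the next group.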
Binning Program
This program defines two functions. The first creates the bins from a lower bound, a width, and a quantity, storing each bin as a pair of lower and upper bounds. The second assigns an input value to a bin and returns that bin's index in the list. The program then prints the bin that each input value is assigned to and counts how many values go to each bin.
Example
from collections import Counter

def Binning_method(lower_bound, width, quantity):
    # Build (low, high) bins of equal width starting at lower_bound.
    # The "+ 1" in the range stop adds one extra bin, so the list
    # holds quantity + 1 bins in total.
    binning = []
    for low in range(lower_bound, lower_bound + quantity * width + 1, width):
        binning.append((low, low + width))
    return binning

def bin_assign(v, b):
    # Return the index of the bin with low <= v < high.
    # Returns None if v lies outside every bin.
    for i in range(0, len(b)):
        if b[i][0] <= v < b[i][1]:
            return i

the_bins = Binning_method(lower_bound=50, width=4, quantity=10)
print("The Bins: \n", the_bins)

weights_of_objects = [89.2, 57.2, 63.4, 84.6, 90.2, 60.3,
                      88.7, 65.2, 79.8, 80.2, 93.5, 79.3,
                      72.5, 59.2, 77.2, 67.0, 88.2, 73.5]

print("\nBinned Values:\n")
binned_weight = []
for val in weights_of_objects:
    index = bin_assign(val, the_bins)
    print(val, "-with index-", index, ":", the_bins[index])
    binned_weight.append(index)

# Count how many values fell into each bin index
freq = Counter(binned_weight)
print("\nCount of values in each index: ")
print(freq)
Output
Running the above code gives us the following result:
The Bins:
 [(50, 54), (54, 58), (58, 62), (62, 66), (66, 70), (70, 74), (74, 78), (78, 82), (82, 86), (86, 90), (90, 94)]

Binned Values:

89.2 -with index- 9 : (86, 90)
57.2 -with index- 1 : (54, 58)
63.4 -with index- 3 : (62, 66)
84.6 -with index- 8 : (82, 86)
90.2 -with index- 10 : (90, 94)
60.3 -with index- 2 : (58, 62)
88.7 -with index- 9 : (86, 90)
65.2 -with index- 3 : (62, 66)
79.8 -with index- 7 : (78, 82)
80.2 -with index- 7 : (78, 82)
93.5 -with index- 10 : (90, 94)
79.3 -with index- 7 : (78, 82)
72.5 -with index- 5 : (70, 74)
59.2 -with index- 2 : (58, 62)
77.2 -with index- 6 : (74, 78)
67.0 -with index- 4 : (66, 70)
88.2 -with index- 9 : (86, 90)
73.5 -with index- 5 : (70, 74)

Count of values in each index:
Counter({9: 3, 7: 3, 3: 2, 10: 2, 2: 2, 5: 2, 1: 1, 8: 1, 6: 1, 4: 1})
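Because every bin in this program has the same width, the index can also be computed arithmetically instead of by scanning the list as `bin_assign` does. The following is a minimal sketch of that alternative, not part of the program above. The name `bin_index` and its parameters are hypothetical, and `n_bins=11` reflects the eleven bins shown in the output.

```python
import math
from collections import Counter

def bin_index(value, lower_bound, width, n_bins):
    """Return the index of the bin [low, low + width) holding value, or None."""
    index = math.floor((value - lower_bound) / width)
    if 0 <= index < n_bins:
        return index
    return None  # value falls outside every bin

weights_of_objects = [89.2, 57.2, 63.4, 84.6, 90.2, 60.3,
                      88.7, 65.2, 79.8, 80.2, 93.5, 79.3,
                      72.5, 59.2, 77.2, 67.0, 88.2, 73.5]

# Same setup as the program: bins start at 50, each 4 wide, 11 bins in total
indexes = [bin_index(w, lower_bound=50, width=4, n_bins=11)
           for w in weights_of_objects]
freq = Counter(indexes)
print(indexes)
print(freq)
```

For this data the sketch yields the same indexes as the loop-based version. Since it returns `None` for a value outside all bins, a caller should check for `None` before using the result to index the bin list.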