ML-Notes - 4 and 5 - 16 Marks

1. Perform K-Means Clustering with K=2 using the Euclidean distance.

Use A (2,10) and C (8,4) as the initial centroids. Perform two iterations, calculate the new centroids after each iteration, and show the following:
1. Distance computations
2. Cluster assignments
3. Updated centroids
4. Final cluster assignment after 2 iterations
Data Point X Y
A 2 10
B 2 5
C 8 4
D 5 8
E 7 5
F 6 4
Solution:
• K = 2 clusters
• Initial centroids:
o Cluster 1 (Centroid 1) = A = (2, 10)
o Cluster 2 (Centroid 2) = C = (8, 4)
• Distance metric: Euclidean distance
• Iterations: 2
• Data Points:
Data Point X Y
A 2 10
B 2 5
C 8 4
D 5 8
E 7 5
F 6 4
Iteration 1
Step 1: Distance Computations
Compute the Euclidean distance between each point and the centroids:
Distance = √((x2 − x1)² + (y2 − y1)²)
Let:
• Centroid 1 = (2, 10)
• Centroid 2 = (8, 4)
Point Distance to C1 (2,10) Distance to C2 (8,4)
A (2,10) √0 = 0.00 √(36+36) = √72 ≈ 8.49
B (2,5) √(0+25) = 5.00 √(36+1) = √37 ≈ 6.08
C (8,4) √(36+36) = 8.49 √0 = 0.00
D (5,8) √(9+4) = √13 ≈ 3.61 √(9+16) = √25 = 5.00
E (7,5) √(25+25) = √50 ≈ 7.07 √(1+1) = √2 ≈ 1.41
F (6,4) √(16+36) = √52 ≈ 7.21 √(4+0) = √4 = 2.00

Step 2: Cluster Assignments


Assign each point to the nearest centroid:
Point Cluster Assignment
A Cluster 1
B Cluster 1
C Cluster 2
D Cluster 1
E Cluster 2
F Cluster 2

Step 3: Updated Centroids


Cluster 1 (Points A, B, D):
New X = (2 + 2 + 5)/3 = 9/3 = 3
New Y = (10 + 5 + 8)/3 = 23/3 ≈ 7.67
⇒ New Centroid 1 = (3.0, 7.67)
Cluster 2 (Points C, E, F):
New X = (8 + 7 + 6)/3 = 21/3 = 7
New Y = (4 + 5 + 4)/3 = 13/3 ≈ 4.33
⇒ New Centroid 2 = (7.0, 4.33)
Iteration 2
Step 1: Distance Computations (with updated centroids)
• Centroid 1 = (3.0, 7.67)
• Centroid 2 = (7.0, 4.33)
Point Distance to C1 (3.0, 7.67) Distance to C2 (7.0, 4.33)
A (2,10) √(1 + 5.44) = √6.44 ≈ 2.54 √(25 + 32.11) = √57.11 ≈ 7.56
B (2,5) √(1 + 7.11) = √8.11 ≈ 2.85 √(25 + 0.44) = √25.44 ≈ 5.04
C (8,4) √(25 + 13.44) = √38.44 ≈ 6.20 √(1 + 0.11) = √1.11 ≈ 1.05
D (5,8) √(4 + 0.11) = √4.11 ≈ 2.03 √(4 + 13.44) = √17.44 ≈ 4.18
E (7,5) √(16 + 7.11) = √23.11 ≈ 4.81 √(0 + 0.44) = √0.44 ≈ 0.66
F (6,4) √(9 + 13.44) = √22.44 ≈ 4.74 √(1 + 0.11) = √1.11 ≈ 1.05
Step 2: Cluster Assignments (Iteration 2)

Point Cluster Assignment

A Cluster 1
B Cluster 1
C Cluster 2
D Cluster 1
E Cluster 2
F Cluster 2
Same as Iteration 1 → Converged
Final Cluster Assignment after 2 Iterations
• Cluster 1: A, B, D
• Cluster 2: C, E, F
Final Centroids
• Centroid 1: (3.0, 7.67)
• Centroid 2: (7.0, 4.33)
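The two iterations above can be verified with a short NumPy sketch (an illustrative addition, not part of the original notes; the variable names are arbitrary):

```python
import numpy as np

# Data points from the question (label, x, y)
points = {"A": (2, 10), "B": (2, 5), "C": (8, 4),
          "D": (5, 8), "E": (7, 5), "F": (6, 4)}
X = np.array(list(points.values()), dtype=float)
labels = list(points.keys())

# Initial centroids: A and C, as specified
centroids = np.array([points["A"], points["C"]], dtype=float)

for it in range(2):
    # Distance computations: Euclidean distance of every point to each centroid
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    # Cluster assignments: nearest centroid
    assign = dists.argmin(axis=1)
    # Updated centroids: mean of the points in each cluster
    centroids = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])
    print(f"Iteration {it + 1}")
    for lbl, a in zip(labels, assign):
        print(f"  {lbl} -> Cluster {a + 1}")
    print("  Centroids:", np.round(centroids, 2))
```

Running it prints the same assignments (A, B, D → Cluster 1; C, E, F → Cluster 2) and centroids (3.0, 7.67) and (7.0, 4.33) in both iterations.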

2. Explain the functioning of unsupervised learning and its two primary approaches. Compare the strengths and weaknesses of Hierarchical Clustering with respect to K-Means and K-Medoids clustering. Also, identify the specific situations where Hierarchical Clustering would be more advantageous than K-Means or K-Medoids, and provide practical examples where this might be applicable.
Unsupervised Learning:
Unsupervised learning trains models on unlabeled data: the algorithm receives only input features and must discover structure in them (groups, associations, or lower-dimensional representations) on its own, without target outputs.
Two Primary Approaches in Unsupervised Learning
1. Clustering
o Objective: Group data points into clusters such that items in the same group are
more similar to each other than to those in other groups.
o Common algorithms: K-Means, K-Medoids, Hierarchical Clustering, DBSCAN,
etc.
2. Dimensionality Reduction
o Objective: Reduce the number of variables in the dataset while preserving
important information.
o Common algorithms: PCA (Principal Component Analysis), t-SNE,
Autoencoders, etc.

Comparison: K-Means vs. K-Medoids vs. Hierarchical Clustering:


Aspect | K-Means | K-Medoids | Hierarchical Clustering
Centroid Type | Mean of cluster points | Actual data point (medoid) | No centroid; uses tree structure
Distance Metric | Euclidean (default) | Any (more flexible) | Any (supports various linkages)
Sensitivity to Outliers | High | Low | Moderate to Low
Cluster Shape | Prefers spherical clusters | Can handle arbitrary shapes | Can handle complex shapes
Efficiency | Fast, especially for large data | Slower (due to pairwise distances) | Slower, especially for large datasets
Scalability | Good (O(nk)) | Poorer (O(n²) or worse) | Poor (O(n² log n))
Number of Clusters (k) | Must be pre-defined | Must be pre-defined | Not required; can be chosen post hoc
Deterministic? | No (depends on initial seeds) | No | Yes (deterministic)
Interpretability | Moderate | High (uses actual points) | High (via dendrograms)
Strengths and Weaknesses
K-Means
Strengths:
• Simple and fast on large datasets.
• Works well when clusters are well-separated and spherical.
Weaknesses:
• Assumes equal-sized, spherical clusters.
• Sensitive to outliers and noise.
• Requires k to be specified.

K-Medoids
Strengths:
• More robust to noise and outliers.
• Can use any distance metric (e.g., Manhattan, cosine).
Weaknesses:
• Slower than K-Means on large datasets.
• Still requires k to be specified.
Hierarchical Clustering
Strengths:
• No need to pre-define the number of clusters.
• Builds a hierarchy of clusters (dendrogram).
• Effective for nested or non-convex clusters.
Weaknesses:
• Computationally expensive on large datasets.
• Cannot easily undo previous clustering decisions.
• Sensitive to linkage and distance metric choice.
When is Hierarchical Clustering More Advantageous?
Hierarchical clustering is better than K-Means/K-Medoids in scenarios where:
• The number of clusters is unknown and flexible grouping is desired.
• The data has a nested or hierarchical structure.
• You want interpretability via dendrograms.
• Clusters are of unequal size, shape, or density.
Practical Examples
Application | Why Hierarchical Clustering?
Gene expression analysis | Reveals nested gene groupings with biological significance.
Document or text clustering | Captures topic hierarchies (e.g., topics and subtopics).
Customer segmentation in marketing | Useful when customer behaviors are hierarchically structured.
Anomaly detection in networks | Captures complex behavior not well separated into k clusters.
Taxonomy of species or organisms | Natural fit due to inherent hierarchical classification.
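A minimal sketch of the contrast discussed above, assuming a small illustrative 2-D dataset: hierarchical (agglomerative) clustering builds the full dendrogram first and lets the number of clusters be chosen afterwards, while K-Means needs k and initial seeds up front.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

# Illustrative 2-D data with two compact groups (values chosen only for the demo)
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])

# Hierarchical clustering: build the dendrogram once, then cut it post hoc
Z = linkage(X, method="ward")                 # Ward linkage on Euclidean distance
labels_hier = fcluster(Z, t=2, criterion="maxclust")

# K-Means: k must be fixed in advance and results depend on the seeds
labels_km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("Hierarchical labels:", labels_hier)
print("K-Means labels:     ", labels_km)
```

The same linkage matrix Z could be cut at t=3 or t=4 without rerunning the algorithm, which is the practical advantage highlighted in the table above.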
3. Consider the data set
Cust-id Annual Income Spending Score
1 15000 39
2 16000 81
3 17000 6
4 18000 77
5 19000 40
6 20000 76

Perform K-Means clustering on the given data set with K = 2. Calculate the centroids of the two clusters after applying K-Means.
Solution:
K-Means Clustering on the dataset with K = 2. Steps:
1. Initialize centroids (we’ll select 2 points from the data).
2. Compute distances and assign clusters.
3. Calculate new centroids.
4. Iterate once more and show final centroids.
Dataset
Cust-id Annual Income Spending Score
1 15000 39
2 16000 81
3 17000 6
4 18000 77
5 19000 40
6 20000 76

Step 1: Choose Initial Centroids (Random or Deterministic)


Select:
• Centroid 1: Customer 1 → (15000, 39)
• Centroid 2: Customer 2 → (16000, 81)
Step 2: Distance Calculation (Euclidean)
Compute distance of each point to the two centroids.
Formula:
d = √((x2 − x1)² + (y2 − y1)²)
Cust-id Point (X,Y) To Centroid 1 (15000, 39) To Centroid 2 (16000, 81) Cluster
1 (15000, 39) 0 √((1000)^2 + (42)^2) ≈ 1000.88 1
2 (16000, 81) √(1000^2 + 42^2) ≈ 1000.88 0 2
3 (17000, 6) √(2000^2 + 33^2) ≈ 2000.27 √(1000^2 + 75^2) ≈ 1002.81 1
4 (18000, 77) √(3000^2 + 38^2) ≈ 3000.24 √(2000^2 + 4^2) ≈ 2000.00 2
5 (19000, 40) √(4000^2 + 1^2) ≈ 4000.00 √(3000^2 + 41^2) ≈ 3000.28 1
6 (20000, 76) √(5000^2 + 37^2) ≈ 5000.14 √(4000^2 + 5^2) ≈ 4000.00 2

Cluster Assignment After Iteration 1

Note: with the raw (unscaled) values the income term dominates the Euclidean distance, so the table above actually places customers 3 and 5 nearer to Centroid 2. The intended grouping below corresponds to clustering on the spending score (equivalently, K-Means on min-max-scaled features), which produces exactly these clusters and the final centroids reported at the end.
• Cluster 1: Cust 1, 3, 5
• Cluster 2: Cust 2, 4, 6
Step 3: Recalculate Centroids
Cluster 1: Customers 1, 3, 5
Points:
• (15000, 39), (17000, 6), (19000, 40)
Centroid X = (15000 + 17000 + 19000)/3 = 51000/3 = 17000
Centroid Y = (39 + 6 + 40)/3 = 85/3 ≈ 28.33
→ New Centroid 1: (17000, 28.33)

Cluster 2: Customers 2, 4, 6
Points:
• (16000, 81), (18000, 77), (20000, 76)
Centroid X = (16000 + 18000 + 20000)/3 = 54000/3 = 18000
Centroid Y = (81 + 77 + 76)/3 = 234/3 = 78.00
→ New Centroid 2: (18000, 78.00)
Final Output
Final Clusters:
• Cluster 1: Cust-ids 1, 3, 5
• Cluster 2: Cust-ids 2, 4, 6
Final Centroids:
• Cluster 1 Centroid: (17000, 28.33)
• Cluster 2 Centroid: (18000, 78.00)
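A short NumPy sketch (not from the original notes) that reproduces the reported clusters and centroids; it assumes the two features are min-max scaled first, as discussed in the note above, so that the income values do not dominate the distance:

```python
import numpy as np

# Customers: (annual income, spending score)
data = np.array([[15000, 39], [16000, 81], [17000, 6],
                 [18000, 77], [19000, 40], [20000, 76]], dtype=float)

# Min-max scale both features so income does not dominate the Euclidean distance
mins, maxs = data.min(axis=0), data.max(axis=0)
X = (data - mins) / (maxs - mins)

# Initial centroids: customers 1 and 2 (rows 0 and 1), as in the worked solution
centroids = X[[0, 1]].copy()

for _ in range(2):
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    centroids = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])

print("Cluster of each customer (1-based):", assign + 1)
# Undo the scaling to report centroids in the original units
print("Centroids:", np.round(centroids * (maxs - mins) + mins, 2))
```

With this scaling the assignments are [1, 2, 1, 2, 1, 2] and the centroids come out as (17000, 28.33) and (18000, 78.00), matching the result above.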

4. Given the transaction dataset as below and the thresholds of minimum support = 2 and
minimum confidence = 50%, apply the Apriori algorithm to find all frequent itemsets and
generate the corresponding association rules.
TID ITEMSETS
T1 A, B
T2 B, D
T3 B, C
T4 A, B, D
T5 A, C
T6 B, C
T7 A, C
T8 A, B, C, E
T9 A, B, C
Solution:
Apply the Apriori algorithm to the transaction dataset with:
• Minimum support = 2
• Minimum confidence = 50%
1. Find all frequent itemsets.
2. Generate association rules from those itemsets that meet the minimum confidence
threshold.
Step 0: Understand the Dataset
Transactions:
TID Items
T1 A, B
T2 B, D
T3 B, C
T4 A, B, D
T5 A, C
T6 B, C
T7 A, C
T8 A, B, C, E
T9 A, B, C

Step 1: Generate L1 – Frequent 1-itemsets


Count support of individual items:
Item Support Count
A 6 (T1, T4, T5, T7, T8, T9)
B 7 (T1, T2, T3, T4, T6, T8, T9)
C 6 (T3, T5, T6, T7, T8, T9)
D 2 (T2, T4)
E 1 (T8) ❌ Not frequent

Frequent 1-itemsets:
L1 = {A, B, C, D}
Step 2: Generate L2 – Candidate 2-itemsets from L1
C2 candidates:
• {A,B}, {A,C}, {A,D}, {B,C}, {B,D}, {C,D}

Support count:
Itemset Transactions Support
A, B T1, T4, T8, T9 4
A, C T5, T7, T8, T9 4
A, D T4 1❌
B, C T3, T6, T8, T9 4
B, D T2, T4 2
C, D — 0❌
Frequent 2-itemsets:
L2 = {A,B}, {A,C}, {B,C}, {B,D}
Step 3: Generate L3 – Candidate 3-itemsets from L2
Possible candidates:
• {A, B, C}, {A, B, D}, {B, C, D}
Support count:
Itemset Transactions Support
A, B, C T8, T9 2✅
A, B, D T4 1❌
B, C, D — 0❌
Frequent 3-itemsets:
L3 = {A, B, C}

Final Frequent Itemsets (Support ≥ 2)


Itemset Support Count
{A} 6
{B} 7
{C} 6
{D} 2
{A, B} 4
{A, C} 4
{B, C} 4
{B, D} 2
{A, B, C} 2
Step 4: Generate Association Rules (Confidence ≥ 50%)
Formula:
Confidence (X⇒Y)=Support(X∪Y)/Support(X)
From {A, B}
• A → B = 4/6 = 66.7% ✅
• B → A = 4/7 = 57.1% ✅
From {A, C}
• A → C = 4/6 = 66.7% ✅
• C → A = 4/6 = 66.7% ✅
From {B, C}
• B → C = 4/7 = 57.1% ✅
• C → B = 4/6 = 66.7% ✅
From {B, D}
• B → D = 2/7 = 28.6% ❌
• D → B = 2/2 = 100% ✅
From {A, B, C}
• A, B → C = 2/4 = 50% ✅
• A, C → B = 2/4 = 50% ✅
• B, C → A = 2/4 = 50% ✅
Final Association Rules (Confidence ≥ 50%)
Rule Support Confidence
A→B 4 66.7% ✅
B→A 4 57.1% ✅
A→C 4 66.7% ✅
C→A 4 66.7% ✅
B→C 4 57.1% ✅
C→B 4 66.7% ✅
D→B 2 100% ✅
A, B → C 2 50% ✅
A, C → B 2 50% ✅
B, C → A 2 50% ✅
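For reference, a small self-contained Apriori sketch (an illustrative addition, not the notes' own code) that reproduces the frequent itemsets and rules above for these nine transactions:

```python
from itertools import combinations

# Transactions from the question
transactions = [
    {"A", "B"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"}, {"A", "C"},
    {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"}, {"A", "B", "C"},
]
min_support, min_conf = 2, 0.5

def support(itemset):
    # Number of transactions containing every item of the itemset
    return sum(itemset <= t for t in transactions)

items = sorted({i for t in transactions for i in t})
frequent = {}
level = [frozenset([i]) for i in items]
while level:
    # Keep only candidates meeting the minimum support
    level = [s for s in level if support(s) >= min_support]
    frequent.update({s: support(s) for s in level})
    # Join frequent k-itemsets to form candidate (k+1)-itemsets (no extra pruning)
    level = list({a | b for a in level for b in level if len(a | b) == len(a) + 1})

# Association rules X -> Y with confidence = support(X ∪ Y) / support(X)
for s in sorted(frequent, key=len):
    if len(s) < 2:
        continue
    for r in range(1, len(s)):
        for lhs in map(frozenset, combinations(sorted(s), r)):
            conf = frequent[s] / support(lhs)
            if conf >= min_conf:
                print(f"{set(lhs)} -> {set(s - lhs)}  support={frequent[s]}  confidence={conf:.1%}")
```

The printed rules are exactly the ten rules in the final table; candidates such as B → D (28.6%) fall below the confidence threshold and are not emitted.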

5. Enumerate the architecture of a Neural Network and explain the key components such
as the input layer, hidden layer and output layer.
Architecture of a Neural Network
A Neural Network (NN) is a computational model inspired by the structure and function of the
human brain. It consists of layers of interconnected nodes (neurons) that process data through
weighted connections.
Basic Structure
A typical neural network includes:
1. Input Layer
2. One or more Hidden Layers
3. Output Layer
Input Layer → Hidden Layer(s) → Output Layer

Each layer is made up of neurons, and connections between neurons carry weights that are
adjusted during training.
Key Components :
1. Input Layer
• Function: Receives the input data (features).
• Structure: Each neuron in this layer corresponds to one feature of the input vector.
• No computation: It simply passes data to the next layer.
Example:
For image recognition, the input layer might have 784 neurons (for 28x28 pixel grayscale
images).
2. Hidden Layer(s)
• Function: Perform computations to extract patterns from the data.
• Can be multiple layers, hence the term deep learning for deep networks.
• Each neuron applies a weighted sum and passes it through an activation function like:
o ReLU (Rectified Linear Unit) – common in deep networks
o Sigmoid – used in binary classification
o Tanh – zero-centered activation
Role:
• Capture complex, nonlinear relationships.
• More layers and neurons = more capacity to model complex data.
3. Output Layer
• Function: Produces the final result (prediction).
• Structure: Number of neurons depends on the task:
o 1 neuron for binary classification
o n neurons for n-class classification (with softmax activation)
o Regression might use a linear neuron (no activation)
Activation Examples:
• Softmax – for multi-class classification
• Sigmoid – for binary classification
• Linear – for regression problems
Additional Key Components
Component | Description
Weights | Numeric values associated with each connection; they determine importance.
Biases | Extra parameters that allow shifting activation functions.
Activation Functions | Introduce non-linearity so networks can model complex data.
Loss Function | Measures the difference between predicted and true values.
Optimizer | Algorithm that adjusts weights to minimize loss (e.g., SGD, Adam).
Summary of Flow
1. Input features go into the input layer.
2. Data moves through hidden layers, each applying weights, biases, and activations.
3. The output layer produces predictions.
4. During training, backpropagation adjusts weights using the loss to improve accuracy.
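A minimal forward-pass sketch of this input → hidden → output structure (illustrative layer sizes and random weights, not tied to any specific example in these notes):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):               # hidden-layer activation
    return np.maximum(0, z)

def sigmoid(z):            # output activation for binary classification
    return 1 / (1 + np.exp(-z))

# Illustrative sizes: 4 input features, one hidden layer of 5 neurons, 1 output
n_in, n_hidden, n_out = 4, 5, 1
W1, b1 = rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden)   # input -> hidden weights and biases
W2, b2 = rng.normal(size=(n_hidden, n_out)), np.zeros(n_out)     # hidden -> output weights and biases

x = rng.normal(size=n_in)          # one input vector (the "input layer")
h = relu(x @ W1 + b1)              # hidden layer: weighted sum + activation
y_hat = sigmoid(h @ W2 + b2)       # output layer: prediction in (0, 1)
print("prediction:", y_hat)
```

Training would then compute a loss on y_hat and use backpropagation to adjust W1, b1, W2, b2, as described in the flow above.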

6. Assume the neurons use the sigmoid activation function for the forward and backward
pass. The target output is 0.5 and the learning rate is 1.

a) Compute the following for Forward Pass:


• The net input and output of each hidden neuron.
• The net input and output of the output neuron.
• The output error.
b) Compute the following for Backward Pass:
• The error signal for the output neuron.
• The error signal for each hidden neuron.
• Update all the weights in the network using gradient descent and the backpropagation rule.
• Show all intermediate steps.
Solution:
A simple neural network with:
• 2 input neurons
• 2 hidden neurons
• 1 output neuron
• Sigmoid activation for all neurons
• Learning rate (η) = 1
• Target output = 0.5
Initial Values:
• Input vector: x1 = 0.4, x2 = 0.6
• Weights:
o Input to Hidden:
 w1,1 = 0.1, w2,1 = 0.2 (to Hidden Neuron 1)
 w1,2 = −0.1, w2,2 = 0.1 (to Hidden Neuron 2)
o Hidden to Output:
 wh1 = 0.7, wh2 = −0.5
• Biases:
o Hidden: bh1 = 0.1, bh2 = −0.3
o Output: bo = 0.05
(a) Forward Pass
Compute net input and output of each hidden neuron:
Hidden Neuron 1:
neth1 = (0.4)(0.1) + (0.6)(0.2) + 0.1 = 0.04 + 0.12 + 0.1 = 0.26
outh1 = σ(0.26) = 1/(1 + e^(−0.26)) ≈ 0.5646
Hidden Neuron 2:
neth2 = (0.4)(−0.1) + (0.6)(0.1) − 0.3 = −0.04 + 0.06 − 0.3 = −0.28
outh2 = σ(−0.28) = 1/(1 + e^(0.28)) ≈ 0.4300
Compute net input and output of output neuron:
neto = (0.5646)(0.7) + (0.4300)(−0.5) + 0.05 = 0.3952 − 0.2150 + 0.05 = 0.2302
outo = σ(0.2302) = 1/(1 + e^(−0.2302)) ≈ 0.5573
Compute Output Error:
Target = 0.5, Predicted = 0.5573
Error = (1/2)(0.5 − 0.5573)² ≈ 0.00164
(b) Backward Pass
Error signal for output neuron:
Let δo be the error signal.
δo = (target − output) ⋅ σ′(neto)
σ′(neto) = outo ⋅ (1 − outo) = 0.5573(1 − 0.5573) ≈ 0.2467
δo = (0.5 − 0.5573)(0.2467) = −0.0573 ⋅ 0.2467 ≈ −0.0141
Error signal for hidden neurons:
Let δh1 and δh2 be the error signals.
δh1 = σ′(neth1) ⋅ (wh1 ⋅ δo) = 0.5646(1 − 0.5646)(0.7)(−0.0141) = 0.2458 ⋅ (−0.00987) ≈ −0.00243
δh2 = σ′(neth2) ⋅ (wh2 ⋅ δo) = 0.4300(1 − 0.4300)(−0.5)(−0.0141) = 0.2451 ⋅ 0.00705 ≈ 0.00173
Weight Updates
Hidden to Output:
Using gradient descent (with δo defined as (target − output)⋅σ′, the update is w_new = w + Δw):
Δw = η ⋅ δo ⋅ output of hidden neuron
• Δwh1 = 1 ⋅ (−0.0141) ⋅ 0.5646 ≈ −0.0080
• Δwh2 = 1 ⋅ (−0.0141) ⋅ 0.4300 ≈ −0.0061
• Δbo = 1 ⋅ (−0.0141) = −0.0141
Updated Weights:
• wh1 = 0.7 + (−0.0080) = 0.6920
• wh2 = −0.5 + (−0.0061) = −0.5061
• bo = 0.05 + (−0.0141) = 0.0359
Input to Hidden:
Δw = η ⋅ δh ⋅ x
Δw1,1 = 1 ⋅ (−0.00243) ⋅ 0.4 ≈ −0.00097
Δw2,1 = 1 ⋅ (−0.00243) ⋅ 0.6 ≈ −0.00146
Δbh1 = −0.00243
Δw1,2 = 1 ⋅ 0.00173 ⋅ 0.4 ≈ 0.00069
Δw2,2 = 1 ⋅ 0.00173 ⋅ 0.6 ≈ 0.00104
Δbh2 = 0.00173
Updated Weights:
• w1,1 = 0.1 + (−0.00097) = 0.0990
• w2,1 = 0.2 + (−0.00146) = 0.1985
• bh1 = 0.1 + (−0.00243) = 0.0976
• w1,2 = −0.1 + 0.00069 = −0.0993
• w2,2 = 0.1 + 0.00104 = 0.1010
• bh2 = −0.3 + 0.00173 = −0.2983
Final Updated Parameters (after 1 pass)
Weights:
Connection Updated Value
w1,1 0.0990
w2,1 0.1985
w1,2 -0.0993
w2,2 0.1010
wh1 0.6920
wh2 -0.5061

Biases:
Neuron Updated Value
bh1 0.0976
bh2 -0.2983
bo 0.0359
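The forward and backward pass above can be checked with a short Python sketch (an added illustration, not part of the original notes; it uses the same convention, δ = (target − output)·σ′ and w ← w + η·δ·input):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Inputs, weights, and biases from the problem statement
x1, x2 = 0.4, 0.6
w11, w21 = 0.1, 0.2          # inputs -> hidden neuron 1
w12, w22 = -0.1, 0.1         # inputs -> hidden neuron 2
wh1, wh2 = 0.7, -0.5         # hidden -> output
bh1, bh2, bo = 0.1, -0.3, 0.05
target, eta = 0.5, 1.0

# Forward pass
outh1 = sigmoid(x1 * w11 + x2 * w21 + bh1)      # ≈ 0.5646
outh2 = sigmoid(x1 * w12 + x2 * w22 + bh2)      # ≈ 0.4300
outo = sigmoid(outh1 * wh1 + outh2 * wh2 + bo)  # ≈ 0.5573
error = 0.5 * (target - outo) ** 2              # ≈ 0.00164

# Backward pass: deltas use the pre-update hidden-to-output weights
do = (target - outo) * outo * (1 - outo)        # ≈ -0.0141
dh1 = outh1 * (1 - outh1) * wh1 * do            # ≈ -0.00243
dh2 = outh2 * (1 - outh2) * wh2 * do            # ≈ +0.00173

# Updates: w <- w + eta * delta * input
wh1 += eta * do * outh1; wh2 += eta * do * outh2; bo += eta * do
w11 += eta * dh1 * x1;   w21 += eta * dh1 * x2;  bh1 += eta * dh1
w12 += eta * dh2 * x1;   w22 += eta * dh2 * x2;  bh2 += eta * dh2

print(round(outo, 4), round(error, 5))
print([round(v, 4) for v in (w11, w21, w12, w22, wh1, wh2, bh1, bh2, bo)])
```

The printed values match the tables above: output ≈ 0.5573, error ≈ 0.00164, and updated parameters 0.0990, 0.1985, −0.0993, 0.1010, 0.6920, −0.5061, 0.0976, −0.2983, 0.0359.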

7. A neuron receives the following inputs and weights as


Input (xi) Weight (wi)
x1 = 0.5 w1 = 0.6
x2 = 0.8 w2 = -0.4
x3 = 0.3 w3 = 0.9
x4 = 0.6 w4 = 0.1
The bias is 0.2. Calculate the net input (z) to the neuron and apply ReLU activation
function on it.
Solution:
Given:
Input xi Weight wi

x1=0.5 w1=0.6

x2=0.8 w2=−0.4

x3=0.3 w3=0.9

x4=0.6 w4=0.1

• Bias (b) = 0.2


Step 1: Calculate the Net Input z
z=(x1⋅w1)+(x2⋅w2)+(x3⋅w3)+(x4⋅w4)+b
z=(0.5⋅0.6)+(0.8⋅−0.4)+(0.3⋅0.9)+(0.6⋅0.1)+0.2
z=0.3+(−0.32)+0.27+0.06+0.2=0.51
Step 2: Apply ReLU Activation
The ReLU function is defined as:
ReLU(z) = max(0, z)
ReLU(0.51) = 0.51
Final Answer
• Net input z = 0.51
• ReLU activation output = 0.51
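A quick check of the same computation (an illustrative addition, using NumPy for the dot product):

```python
import numpy as np

x = np.array([0.5, 0.8, 0.3, 0.6])   # inputs
w = np.array([0.6, -0.4, 0.9, 0.1])  # weights
b = 0.2                              # bias

z = np.dot(x, w) + b        # 0.3 - 0.32 + 0.27 + 0.06 + 0.2 = 0.51
out = max(0.0, z)           # ReLU(z) = max(0, z)
print(round(float(z), 2), round(float(out), 2))   # 0.51 0.51
```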

8. Illustrate the backpropagation algorithm with an example.


• A step-by-step illustration of the backpropagation algorithm using a simple neural network, covering the forward and backward passes and the gradient updates with the Mean Squared Error (MSE) loss function.
Example: 1 Hidden Layer Neural Network
Structure:
• Inputs: x1 = 0.05, x2 = 0.10
• Hidden layer: 2 neurons
• Output layer: 1 neuron
• Activation: Sigmoid
• Loss Function: Mean Squared Error (MSE)
Initial Weights and Biases
Layer | Weights | Bias
Input → Hidden | w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30 | b1 = 0.35
Hidden → Output | w5 = 0.40, w6 = 0.45 | b2 = 0.60
Target Output y = 0.01
Learning Rate η = 0.5
STEP 1: Forward Pass
Compute Hidden Layer Outputs
• h1 = σ(z1), z1 = x1⋅w1 + x2⋅w2 + b1
• h2 = σ(z2), z2 = x1⋅w3 + x2⋅w4 + b1
z1 = 0.05⋅0.15 + 0.10⋅0.20 + 0.35 = 0.3775
h1 = σ(0.3775) = 1/(1 + e^(−0.3775)) ≈ 0.59327
z2 = 0.05⋅0.25 + 0.10⋅0.30 + 0.35 = 0.3925
h2 = σ(0.3925) ≈ 0.59688
Compute Output
z3 = h1⋅w5 + h2⋅w6 + b2 = 0.59327⋅0.40 + 0.59688⋅0.45 + 0.60 = 1.1059
ŷ = σ(1.1059) ≈ 0.7514
Compute Loss (MSE)
E = (1/2)(y − ŷ)² = (1/2)(0.01 − 0.7514)² ≈ 0.2748
STEP 2: Backward Pass (Backpropagation)
Output Layer Gradient
δoutput = (ŷ − y) ⋅ ŷ(1 − ŷ) = (0.7514 − 0.01)(0.7514)(1 − 0.7514) ≈ 0.7414 ⋅ 0.1868 ≈ 0.1385
Update weights from hidden to output:
Δw5 = −η ⋅ δoutput ⋅ h1 = −0.5 ⋅ 0.1385 ⋅ 0.59327 ≈ −0.0411
Δw6 = −η ⋅ δoutput ⋅ h2 = −0.5 ⋅ 0.1385 ⋅ 0.59688 ≈ −0.0413
Δb2 = −η ⋅ δoutput = −0.5 ⋅ 0.1385 = −0.0692
Updated Weights:
• w5 = 0.40 − 0.0411 = 0.3589
• w6 = 0.45 − 0.0413 = 0.4087
• b2 = 0.60 − 0.0692 = 0.5308
Hidden Layer Gradient
δh1 = δoutput ⋅ w5 ⋅ h1(1 − h1) = 0.1385 ⋅ 0.40 ⋅ 0.59327 ⋅ (1 − 0.59327) ≈ 0.0134
δh2 = δoutput ⋅ w6 ⋅ h2(1 − h2) = 0.1385 ⋅ 0.45 ⋅ 0.59688 ⋅ (1 − 0.59688) ≈ 0.0150
Update weights from input to hidden:
Δw1 = −η ⋅ δh1 ⋅ x1 = −0.5 ⋅ 0.0134 ⋅ 0.05 ≈ −0.00033
Δw2 = −0.5 ⋅ 0.0134 ⋅ 0.10 ≈ −0.00067
Δw3 = −0.5 ⋅ 0.0150 ⋅ 0.05 ≈ −0.00037
Δw4 = −0.5 ⋅ 0.0150 ⋅ 0.10 ≈ −0.00075
Δb1 = −η ⋅ (δh1 + δh2) ≈ −0.5 ⋅ (0.0134 + 0.0150) = −0.0142
Updated Weights:
• w1 = 0.15 − 0.00033 = 0.14967
• w2 = 0.20 − 0.00067 = 0.19933
• w3 = 0.25 − 0.00037 = 0.24963
• w4 = 0.30 − 0.00075 = 0.29925
• b1 = 0.35 − 0.0142 = 0.3358
Summary
Weight New Value
w1 0.14967
w2 0.19933
w3 0.24963
w4 0.29925
w5 0.3589
w6 0.4087
b1 0.3358
b2 0.5308
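A quick Python check of this example (an added sketch, not part of the original notes); it confirms the hidden-layer deltas and the one-step updates listed in the summary:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

x1, x2, y, eta = 0.05, 0.10, 0.01, 0.5
w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
w5, w6, b2 = 0.40, 0.45, 0.60

# Forward pass
h1 = sigmoid(x1 * w1 + x2 * w2 + b1)          # ≈ 0.59327
h2 = sigmoid(x1 * w3 + x2 * w4 + b1)          # ≈ 0.59688
y_hat = sigmoid(h1 * w5 + h2 * w6 + b2)       # ≈ 0.7514

# Backward pass (deltas use the pre-update w5, w6)
d_out = (y_hat - y) * y_hat * (1 - y_hat)     # ≈ 0.1385
d_h1 = d_out * w5 * h1 * (1 - h1)             # ≈ 0.0134
d_h2 = d_out * w6 * h2 * (1 - h2)             # ≈ 0.0150

# Gradient-descent updates, w <- w − η·δ·input
w5, w6, b2 = w5 - eta * d_out * h1, w6 - eta * d_out * h2, b2 - eta * d_out
w1, w2 = w1 - eta * d_h1 * x1, w2 - eta * d_h1 * x2
w3, w4 = w3 - eta * d_h2 * x1, w4 - eta * d_h2 * x2
b1 = b1 - eta * (d_h1 + d_h2)

print(round(d_h1, 4), round(d_h2, 4))                          # 0.0134 0.015
print([round(v, 5) for v in (w1, w2, w3, w4, w5, w6, b1, b2)])
```

The output matches the summary table: w1 ≈ 0.14967, w2 ≈ 0.19933, w3 ≈ 0.24963, w4 ≈ 0.29925, w5 ≈ 0.3589, w6 ≈ 0.4087, b1 ≈ 0.3358, b2 ≈ 0.5308.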
