Assignment 2: Part 1 (4 PTS) - Conceptual Understanding of ANN Workflow
Bias: It is the systematic error of a machine learning model, that is, the difference
between the model's average prediction and the ground truth. A model with high bias
tends to underfit and miss the relevant relations between features and target outputs.
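Variance: It is the variability of the model's prediction across different training sets.
With high variance, a small change in the training data produces a large change in the
fitted model. High variance leads the model to overfit: it fits the noise in the training
data and performs poorly on unseen data.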
Dealing with bias and variance means dealing with underfitting and overfitting. As model
complexity changes, they typically move in opposite directions: when one decreases, the
other tends to increase. The bias-variance tradeoff is the task of keeping both low enough
to avoid underfitting and overfitting. We need to choose a model that has both low bias
and low variance.
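The underfitting and overfitting behaviour described above can be seen in a small experiment. This is a minimal sketch, not the submitted code: the sine target, the noise level, the sample sizes, and the polynomial degrees are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical ground-truth function (an assumption for this sketch).
    return np.sin(2 * np.pi * x)

# Noisy training and test samples drawn from the same process.
x_train = rng.uniform(0, 1, 30)
y_train = true_f(x_train) + rng.normal(0, 0.3, 30)
x_test = rng.uniform(0, 1, 200)
y_test = true_f(x_test) + rng.normal(0, 0.3, 200)

def train_test_mse(degree):
    """Fit a polynomial of the given degree and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

# Degree 1 is too rigid (high bias); degree 9 is very flexible (higher variance).
errors = {d: train_test_mse(d) for d in (1, 4, 9)}
for d, (tr, te) in errors.items():
    print(f"degree={d}: train MSE={tr:.3f}, test MSE={te:.3f}")
```

The training error can only go down as the degree grows, so it is the gap between training and test error that signals overfitting.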
Mathematical:
Let Y = f(X) + ϵ, where ϵ is noise with E[ϵ] = 0 and Var(ϵ) = σ², and let g be the estimator
fitted on a random training set. The bias of g at a test point xₒ can be written as:
Bias[g(xₒ)] = E[g(xₒ)] − f(xₒ), where E[g(xₒ)] is the expected prediction and f(xₒ) is the actual value
The variance of the estimator at the test point xₒ can be written as:
Var[g(xₒ)] = E[(g(xₒ) − E[g(xₒ)])²]
The expected squared error at xₒ decomposes as follows, writing f and g for f(xₒ) and g(xₒ):
Err(xₒ) = E[(Y − g)²]
= E[(f + ϵ − g)²]
= E[ϵ²] + E[(f − g)²] + 2·E[(f − g)ϵ]
= E[(ϵ − 0)²] + E[(f − E[g] + E[g] − g)²] + 2·E[fϵ] − 2·E[gϵ]
= E[(ϵ − E[ϵ])²] + E[(f − E[g] + E[g] − g)²] + 0 − 0   (since E[ϵ] = 0 and ϵ is independent of g)
= Var(ϵ) + E[(g − E[g])²] + E[(E[g] − f)²] + 2·E[(g − E[g])(E[g] − f)]
= Var(ϵ) + Var(g) + Bias(g)² + 2·{E[g]² − f·E[g] − E[g]² + f·E[g]}   (f is deterministic, so E[gf] = f·E[g])
= σ² + Var(g) + Bias(g)²
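So the expected error is the sum of the irreducible noise σ², the variance, and the squared bias.
Making the model more flexible typically lowers the bias but raises the variance, and vice versa,
so the estimator that balances the two is the one that minimizes the error.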
Programmatically:
Submitted as Python code.
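For reference, the decomposition can also be checked numerically. The following is a minimal sketch and not necessarily the submitted code: the sine target, the fixed training inputs, the polynomial estimator, the test point x0 = 0.3, and the trial count are all assumptions made for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical true function (an assumption for this sketch).
    return np.sin(2 * np.pi * x)

def decompose(degree, x0=0.3, sigma=0.3, n_train=30, n_trials=2000, seed=0):
    """Monte Carlo estimate of bias^2, variance, noise, and total error at x0."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0, 1, n_train)  # fixed inputs; only the noise is redrawn
    preds = np.empty(n_trials)
    sq_errs = np.empty(n_trials)
    for t in range(n_trials):
        y = f(x) + rng.normal(0, sigma, n_train)       # new training set
        g_x0 = np.polyval(np.polyfit(x, y, degree), x0)  # estimator g at x0
        y0 = f(x0) + rng.normal(0, sigma)              # independent test label
        preds[t] = g_x0
        sq_errs[t] = (y0 - g_x0) ** 2
    bias_sq = (preds.mean() - f(x0)) ** 2   # (E[g] - f)^2
    variance = preds.var()                  # E[(g - E[g])^2]
    return bias_sq, variance, sigma ** 2, sq_errs.mean()

results = {d: decompose(d) for d in (1, 9)}
for d, (bias_sq, var, noise, err) in results.items():
    print(f"degree={d}: bias^2={bias_sq:.4f}, var={var:.4f}, "
          f"noise={noise:.4f}, sum={bias_sq + var + noise:.4f}, Err={err:.4f}")
```

The printed sum and Err should agree up to sampling error, while the low-degree fit carries most of its error as bias and the high-degree fit carries more of it as variance.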
Gradient Descent:
It is an optimization algorithm that works iteratively to find the model parameters that
minimize a cost (error) function. Starting from a random point on the function, we repeatedly
step in the direction of the negative gradient until we reach a local minimum.
Mathematical Function:
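$$f(m, b) = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - (m x_i + b)\right)^2$$

$$f'(m, b) = \begin{bmatrix} \frac{df}{dm} \\ \frac{df}{db} \end{bmatrix} = \begin{bmatrix} \frac{1}{N} \sum_{i=1}^{N} -2 x_i \left(y_i - (m x_i + b)\right) \\ \frac{1}{N} \sum_{i=1}^{N} -2 \left(y_i - (m x_i + b)\right) \end{bmatrix}$$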
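To make the iteration explicit, one gradient descent step on f(m, b) can be written as below. The learning rate α and the step index t are symbols introduced here for illustration; they do not appear in the original formulas.

$$m_{t+1} = m_t - \alpha \, \frac{df}{dm}(m_t, b_t), \qquad b_{t+1} = b_t - \alpha \, \frac{df}{db}(m_t, b_t)$$

For this quadratic cost, a sufficiently small α ensures that a step never increases f; too large a value can make the iterates diverge.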
Program:
Submitted as code.
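For reference, a minimal sketch of gradient descent on the mean squared error of a line y = mx + b follows. It is not necessarily the submitted code: the function name `gradient_descent`, the learning rate, the step count, the zero starting point, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gradient_descent(x, y, lr=0.05, n_steps=2000):
    """Minimize f(m, b) = (1/N) * sum((y_i - (m*x_i + b))^2) by gradient descent."""
    m, b = 0.0, 0.0  # starting point (assumed; a random start would also work)
    n = len(x)
    for _ in range(n_steps):
        residual = y - (m * x + b)
        grad_m = (-2.0 / n) * np.sum(x * residual)  # df/dm
        grad_b = (-2.0 / n) * np.sum(residual)      # df/db
        # Step against the gradient, scaled by the learning rate.
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

# Synthetic data from a hypothetical line y = 3x + 1 with small Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, 100)

m, b = gradient_descent(x, y)
print(f"estimated m={m:.3f}, b={b:.3f}")
```

Because this cost is convex, the local minimum found here is also the global one, so the result should match the closed-form least-squares fit (for example from `np.polyfit(x, y, 1)`).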