Machine Learning – Mtech (SEM-I)
Unit – 1
Workflow for using machine learning in predictive modeling → Supervised Learning Process
Machine Learning
Types of ML: (Explain each and example of models)
Supervised
Unsupervised
Reinforcement Learning
Artificial neuron (Perceptron) VS Biological neuron
Biological neurons are interconnected nerve cells in the brain that are involved in the processing and
transmitting of chemical and electrical signals. McCulloch and Pitts described such a nerve cell as a simple logic
gate with binary outputs; multiple signals arrive at the dendrites, they are then integrated into the cell body,
and, if the accumulated signal exceeds a certain threshold, an output signal is generated that will be passed on
by the axon.
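As an illustration (not part of the original notes), a tiny Python sketch of a McCulloch–Pitts style neuron; the AND-gate threshold of 2 is just an assumed example:

def mcp_neuron(inputs, threshold):
    # sum the incoming binary signals and fire (output 1) only if the
    # accumulated signal reaches the threshold, as in the McCulloch-Pitts model
    return 1 if sum(inputs) >= threshold else 0

# with two binary inputs and threshold 2 this behaves like an AND gate
print(mcp_neuron([1, 1], threshold=2))   # 1
print(mcp_neuron([1, 0], threshold=2))   # 0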
Artificial neuron
An artificial neuron applies a decision function, 𝜎(z), to a linear combination of the input values, x, and a corresponding weight vector, w, where z is the so-called net input: z = w1x1 + w2x2 + ... + wnxn
The perceptron learning rule (a similar algorithm is used for Adaline): each weight is updated as wj := wj + Δwj, with Δwj = 𝜂(y(i) − 𝑦^(i))xj(i), where
𝜂 is the learning rate,
y(i) is the true class label, and 𝑦^(𝑖) is the predicted class label.
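A minimal NumPy sketch of this rule (the toy data, 10 epochs, and 𝜂 = 0.1 are illustrative assumptions, not from the notes):

import numpy as np

rng = np.random.default_rng(1)
# hypothetical toy data: two linearly separable clusters
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

w, b, eta = np.zeros(2), 0.0, 0.1            # weights, bias, learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        y_hat = 1 if xi @ w + b >= 0 else 0  # unit-step decision on the net input z = w.x + b
        update = eta * (target - y_hat)      # perceptron rule: eta * (y(i) - y^(i))
        w += update * xi                     # delta wj = eta * (y - y^) * xj
        b += update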
Adaline neuron vs Perceptron neuron
Program for Perceptron and Adaline
Perceptron
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
data = load_iris()
x, y = data.data, data.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
model = Perceptron(max_iter=500, tol=2e-1, eta0=0.2)   # eta0 is the learning rate
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
acc = accuracy_score(y_test, y_pred)
plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train, label='Train')
plt.scatter(x_test[:, 0], x_test[:, 1], c=y_test, label='Test')
plt.show()

Adaline
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
data = load_iris()
x, y = data.data, data.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
model = SGDClassifier(max_iter=500, loss='squared_error', learning_rate='constant', eta0=0.2)  # eta0 gives the step size
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
acc = accuracy_score(y_test, y_pred)
plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train, label='Train')
plt.scatter(x_test[:, 0], x_test[:, 1], c=y_test, label='Test')
plt.show()
Gradient Descent - Characteristics
• Deterministic: Produces the same updates for the same dataset.
• Slow Updates: Since every update processes the entire dataset, it is computationally expensive, especially for large datasets.
• Smooth Convergence: Provides a smooth path towards the minimum, but can still get stuck in local minima on non-convex cost surfaces.
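A minimal NumPy sketch of batch gradient descent for a linear model with squared-error loss (the toy data, 50 epochs, and 𝜂 = 0.01 are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # hypothetical features
y = X @ np.array([2.0, -1.0]) + 0.5           # hypothetical targets

w, b, eta = np.zeros(2), 0.0, 0.01            # weights, bias, learning rate

for epoch in range(50):
    y_hat = X @ w + b                         # predictions for the WHOLE dataset
    error = y - y_hat
    w += eta * X.T @ error / len(y)           # one deterministic update per epoch
    b += eta * error.mean()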
Stochastic Gradient Descent - Characteristics
• Faster Updates: Updates are made after processing each sample, making it faster for large datasets.
• Noisy Updates: Updates fluctuate due to randomness, which can help escape local minima.
• Non-Deterministic: Each run may produce slightly different results because of random sampling.
• Less Memory Intensive: Only a single sample is required in memory at any time
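For contrast, the same toy problem updated one randomly chosen sample at a time (again an illustrative sketch, not from the notes):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # hypothetical features
y = X @ np.array([2.0, -1.0]) + 0.5            # hypothetical targets

w, b, eta = np.zeros(2), 0.0, 0.01

for epoch in range(50):
    for i in rng.permutation(len(y)):          # shuffling makes the updates noisy/non-deterministic
        error = y[i] - (X[i] @ w + b)          # error on a SINGLE sample
        w += eta * error * X[i]                # immediate, cheap weight update
        b += eta * error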
Gradient Descent VS Stochastic Gradient Descent (Advantages and Disadvantages)
Feature Scaling
Feature scaling is a technique used to normalize the range of independent variables (features) in a dataset. It ensures that no single feature dominates the others simply because of its larger scale (so the optimization is not skewed towards large-scale features). This is especially important in scale-variant algorithms such as gradient descent, k-NN, and SVM. Common methods: Min-Max Scaling, Standardization (Z-score).
Need of Feature Scaling
• Slow Convergence: Larger ranges in feature values can cause the gradient descent to take more time to converge, as the algorithm makes smaller steps in some dimensions and larger ones in others.
• Risk of Getting Stuck: With unscaled features, gradient descent may “oscillate” across the cost surface, increasing the risk of missing the optimal solution.
• Difficulties with Optimal Learning Rates: Without scaling, finding a learning rate that works for all features becomes challenging. A learning rate that works well for one feature might not be suitable for another with a larger range.
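A short scikit-learn sketch of the two common methods named above (the small array X is an assumed example):

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])                    # features on very different scales

X_minmax = MinMaxScaler().fit_transform(X)      # Min-Max Scaling: each feature rescaled to [0, 1]
X_std = StandardScaler().fit_transform(X)       # Standardization (Z-score): mean 0, std 1 per feature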
**Perceptron Problem, Activation Functions
**GD & SGD Algo, graph and formulas - Loss Function
UNIT – 2
Under Fitting and Over Fitting in model
In machine learning, underfitting and overfitting are two common problems related to model performance and
generalization
Variance VS Bias
Using k-fold cross-validation to assess model performance – advantages, code (sketch below)
- holdout cross-validation
- k-fold cross-validation
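A hedged sketch of k-fold cross-validation with scikit-learn (the Perceptron model and k = 5 are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.model_selection import cross_val_score

x, y = load_iris(return_X_y=True)
model = Perceptron(max_iter=500)

# 5-fold cross-validation: the data is split into 5 folds; each fold is used
# once as the validation set while the remaining 4 folds are used for training
scores = cross_val_score(model, x, y, cv=5)
print(scores.mean(), scores.std())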
Types of Bias
Bias arises, for example, when one class or multiple classes are over-represented in a dataset.
Class Imbalance – Challenges, Dealing with imbalance
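Two common ways of dealing with imbalance, sketched with scikit-learn (the toy 90/10 dataset is an assumption; other techniques such as SMOTE also exist):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

# hypothetical imbalanced dataset: 90 samples of class 0, 10 of class 1
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(2, 1, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

# Option 1: reweight classes inversely to their frequency during training
model = LogisticRegression(class_weight='balanced').fit(X, y)

# Option 2: randomly oversample the minority class up to the majority size
X_min, y_min = X[y == 1], y[y == 1]
X_min_up, y_min_up = resample(X_min, y_min, replace=True, n_samples=90, random_state=42)
X_bal = np.vstack([X[y == 0], X_min_up])
y_bal = np.concatenate([y[y == 0], y_min_up])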
Metrics: Confusion Matrix
True Positive Rate / Recall
True Negative Rate
False Positive Rate
False Positive (FP) (Type I Error): A False Positive occurs when the model incorrectly predicts the positive class.
False Negative (FN) (Type II Error): A False Negative occurs when the model incorrectly predicts the negative class.
**Problems on confusion matrix
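A short sketch of reading these quantities off a confusion matrix with scikit-learn (the label vectors are made-up examples):

from sklearn.metrics import confusion_matrix, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # hypothetical true labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)                # True Positive Rate / Recall (same as recall_score)
tnr = tn / (tn + fp)                # True Negative Rate / Specificity
fpr = fp / (fp + tn)                # False Positive Rate (Type I errors)
print(tpr, recall_score(y_true, y_pred))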