1. Introduction and Basics of Machine Learning
Example: Teaching a spam filter to classify emails as spam or not based on previous emails.
Bias-Variance Tradeoff
● Bias: The error due to overly simplistic assumptions in the model (often leads to
underfitting).
● Variance: The error due to a model's sensitivity to small fluctuations in the training data
(often leads to overfitting).
● Goal: Find a balance between bias and variance to minimize the total error and achieve
good generalization (see the decomposition after this list).
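For squared-error loss, this balance is captured by the standard bias-variance decomposition:
Expected test error = Bias² + Variance + Irreducible noise
A more flexible model lowers bias but raises variance, and vice versa, which is why model complexity is tuned rather than maximized.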
2. Supervised Learning
What is Supervised Learning?
Supervised learning trains a model on labeled data, where each example has an input and a
known output (target). The goal is to learn a function that maps inputs to outputs.
Regression
● Linear Regression: Predicts a continuous target variable as a weighted sum of input
features.
○ Example: Predict house price based on size, number of bedrooms, etc.
○ Formula: y = w₀ + w₁x₁ + w₂x₂ + ⋯ + wₙxₙ
● Polynomial Regression: Extends linear regression by adding polynomial terms to
capture non-linear relationships.
● Logistic Regression: Used for binary classification (output is 0 or 1). Models the
probability that an input belongs to a class using a sigmoid function (see the sketch after this list).
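A minimal scikit-learn sketch of both models; the tiny arrays X, y_cont, and y_bin are made-up placeholder data, not from these notes:
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[50], [80], [120], [160]])   # e.g., house size in square meters
y_cont = np.array([150, 220, 310, 400])    # e.g., price in $1000s (continuous target)
y_bin = np.array([0, 0, 1, 1])             # e.g., "expensive" yes/no (binary target)

lin = LinearRegression().fit(X, y_cont)    # learns the weights w0..wn
print(lin.intercept_, lin.coef_)           # w0 and w1

log = LogisticRegression().fit(X, y_bin)   # sigmoid over a weighted sum of features
print(log.predict_proba([[100]]))          # class probabilities for a new input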
Classification
● k-Nearest Neighbors (k-NN): Classifies based on the majority class of the k closest
training examples in the feature space.
● Support Vector Machines (SVM): Finds the best boundary (hyperplane) that
separates classes with maximum margin.
● Decision Trees: Splits data based on feature thresholds to form a tree where leaves
represent class labels.
● Random Forests: An ensemble of decision trees trained on different subsets of data
and features, voting for the final class (a short training sketch follows this list).
● Gradient Boosting Machines: Builds models sequentially to correct errors of previous
models (e.g., XGBoost).
● Neural Networks: Layers of interconnected nodes that can model complex
relationships.
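A minimal sketch of training one of the classifiers above (a random forest) with scikit-learn; the Iris dataset stands in for real data, and any labeled table works the same way:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)             # each tree sees a bootstrap sample of the data
print(clf.score(X_test, y_test))      # mean accuracy on held-out data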
Performance Metrics
● Accuracy: Fraction of correct predictions.
● Precision: How many predicted positives are actually positive.
● Recall: How many actual positives are correctly predicted.
● F1-score: Harmonic mean of precision and recall.
● ROC Curve & AUC: Plots true positive rate vs false positive rate; AUC summarizes
performance.
● Confusion Matrix: Table showing true positives, false positives, true negatives, and false
negatives (a sketch computing these metrics follows this list).
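A minimal sketch computing these metrics with scikit-learn; y_true, y_pred, and y_score are toy values chosen for illustration:
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.3]   # predicted probabilities for class 1

print(accuracy_score(y_true, y_pred))       # fraction of correct predictions
print(precision_score(y_true, y_pred))      # TP / (TP + FP)
print(recall_score(y_true, y_pred))         # TP / (TP + FN)
print(f1_score(y_true, y_pred))             # harmonic mean of precision and recall
print(confusion_matrix(y_true, y_pred))     # [[TN, FP], [FN, TP]]
print(roc_auc_score(y_true, y_score))       # area under the ROC curve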
Hyperparameter Tuning
Hyperparameters (like number of trees, max depth, learning rate) are settings chosen before
training rather than learned from the data.
Techniques:
● Grid Search: Try all combinations exhaustively.
● Randomized Search: Randomly sample combinations to save time (see the sketch after this list).
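A minimal sketch of grid search with scikit-learn's GridSearchCV; the parameter values are illustrative, not recommendations:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5)       # tries all 6 combinations with 5-fold CV
search.fit(X, y)
print(search.best_params_, search.best_score_)
# RandomizedSearchCV has the same interface but samples n_iter combinations instead.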
3. Unsupervised Learning
What is Unsupervised Learning?
It finds patterns or structure in data without labeled outputs.
Clustering
● K-means Clustering: Assigns data points to k clusters by minimizing the squared distance
between points and their cluster centers (see the sketch after this list).
● Hierarchical Clustering: Builds a tree of clusters by either merging (agglomerative) or
splitting (divisive) clusters.
● DBSCAN: Density-based clustering that groups points that are closely packed and
marks points in low-density areas as noise.
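A minimal k-means sketch with scikit-learn; the four 2-D points and the choice k=2 are assumptions for illustration:
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [1.5, 2], [8, 8], [8, 9]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # cluster assignment for each point
print(km.cluster_centers_)  # learned cluster centers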
Dimensionality Reduction
● Principal Component Analysis (PCA): Projects data into fewer dimensions while
preserving as much variance as possible (see the sketch after this list).
● t-SNE: Visualizes high-dimensional data by reducing dimensions while preserving local
structure.
● Autoencoders: Neural networks trained to compress and then reconstruct data,
learning efficient representations.
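A minimal PCA sketch with scikit-learn, projecting the 4-dimensional Iris features down to 2 components:
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)              # shape (150, 4) -> (150, 2)
print(pca.explained_variance_ratio_)     # fraction of variance kept per component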
Association Rule Learning
● Apriori Algorithm: Finds frequent itemsets in data to identify association rules (e.g.,
market basket analysis; see the sketch below).
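A minimal pure-Python sketch of the counting idea behind Apriori (not the full algorithm): count item-pair co-occurrences in toy transactions and keep pairs above a support threshold:
from itertools import combinations
from collections import Counter

transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk", "butter"}]
min_support = 2   # keep pairs appearing in at least 2 transactions

pair_counts = Counter(pair for t in transactions
                      for pair in combinations(sorted(t), 2))
frequent = {p: c for p, c in pair_counts.items() if c >= min_support}
print(frequent)   # e.g., ('bread', 'milk') appears in 2 transactions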
4. Reinforcement Learning
Basics of RL
An agent interacts with an environment through actions, receives rewards, and learns a policy
to maximize cumulative rewards.
Markov Decision Processes (MDP)
Framework defining states, actions, transition probabilities, and rewards.
Q-Learning
A value-based method where the agent learns a function Q(s,a) that estimates the expected
return of taking action a in state s.
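A minimal sketch of the standard tabular Q-learning update; the state/action sizes and the sampled transition are placeholder values:
# Update rule: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99                  # learning rate and discount factor

s, a, r, s_next = 0, 1, 1.0, 2            # one observed transition (toy values)
td_target = r + gamma * Q[s_next].max()   # bootstrapped estimate of the return
Q[s, a] += alpha * (td_target - Q[s, a])  # move Q(s,a) toward the target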
Deep Reinforcement Learning
Combines deep neural networks with RL (e.g., Deep Q-Networks) to handle high-dimensional
inputs like images.
5. Deep Learning
Neural Networks Basics
Composed of layers of neurons (nodes). Each neuron receives inputs, multiplies them by
weights, adds a bias, applies an activation function, and passes the output to the next layer.
Activation Functions
● ReLU (Rectified Linear Unit): Outputs input if positive, else zero.
○ f(x)=max(0,x)
○ Popular for hidden layers.
● Sigmoid: Outputs between 0 and 1, useful for probabilities.
○ f(x) = 1 / (1 + e⁻ˣ)
● Tanh: Outputs between -1 and 1, zero-centered (all three are sketched below).
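A minimal NumPy sketch of all three activations:
import numpy as np

def relu(x):    return np.maximum(0, x)       # f(x) = max(0, x)
def sigmoid(x): return 1 / (1 + np.exp(-x))   # f(x) = 1 / (1 + e^-x)
def tanh(x):    return np.tanh(x)             # zero-centered, outputs in (-1, 1)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), tanh(x))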
Feedforward Neural Networks
Information flows from input to output layer through hidden layers. Used for
regression/classification on tabular data.
Backpropagation and Gradient Descent
● Backpropagation: Calculates gradients of loss with respect to weights.
● Gradient Descent: Updates weights to minimize loss (a one-variable sketch follows this list).
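A minimal one-variable sketch of gradient descent on the toy loss L(w) = (w - 3)², whose gradient 2(w - 3) stands in for what backpropagation would compute for every weight in a network:
w, lr = 0.0, 0.1
for step in range(50):
    grad = 2 * (w - 3)    # gradient of the loss at the current weight
    w -= lr * grad        # update rule: w <- w - learning_rate * gradient
print(w)                  # converges toward the minimizer w = 3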
Convolutional Neural Networks (CNNs)
Designed for images. Uses convolutional layers that apply filters to detect edges, shapes,
textures. Followed by pooling layers to reduce spatial size.
Recurrent Neural Networks (RNNs), LSTM, GRU
Designed for sequential data (time series, text). RNNs have loops to maintain a state. LSTM
and GRU are special units that handle long-term dependencies better by controlling
information flow.
Transfer Learning
Using a pretrained model on a new but related task. Saves training time and improves
performance.
Generative Adversarial Networks (GANs)
Two networks: Generator creates fake data, Discriminator tries to distinguish real vs fake.
Trains both networks simultaneously to improve data generation.
TensorFlow
● Type: Open-source deep learning framework developed by Google.
● Focus: Building and training large-scale deep learning models.
● Features:
○ Flexible computation graphs for complex model building.
○ Supports CPUs, GPUs, and TPUs.
○ High-level APIs (Keras) built on top for easier model design.
○ TensorBoard for visualization.
● Use Cases:
○ Deep learning projects like image recognition, NLP, reinforcement learning.
○ Production-level deployment with TensorFlow Serving and TensorFlow Lite for
mobile.
● Example:
import tensorflow as tf

# input_dim, X_train, and y_train are placeholders for your own data.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)),
    tf.keras.layers.Dense(10, activation='softmax')        # 10-class output
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',      # integer class labels
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
PyTorch
● Type: Open-source deep learning framework developed by Facebook.
● Focus: Dynamic computation graphs, flexibility, and ease of use for research.
● Features:
○ Dynamic graph allows modification on-the-fly, great for debugging.
○ Strong Python integration, intuitive API.
○ TorchVision for computer vision tasks.
● Use Cases:
○ Research and experimentation in deep learning.
○ Rapid prototyping and complex model architectures.
○ Production-ready with TorchScript and deployment tools.
● Example:
import torch
import torch.nn as nn

# input_dim is a placeholder for the number of input features.
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(input_dim, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)   # raw logits: CrossEntropyLoss applies softmax internally

model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
# Training loop would follow here
Keras
● Type: High-level neural network API written in Python.
● Focus: User-friendly API to build and train deep learning models.
● Features:
○ Runs on top of TensorFlow (the standard backend today; the older Theano and CNTK backends are discontinued).
○ Simplifies model building with Sequential and Functional APIs.
○ Easy to use for beginners and prototyping.
● Use Cases:
○ Quick prototyping of neural networks.
○ Ideal for beginners in deep learning.
○ Production-ready since it integrates well with TensorFlow ecosystem.
● Example:
from tensorflow import keras

# input_dim, X_train, and y_train are placeholders for your own data.
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
Summary Table
● Scikit-learn: Best for classical ML algorithms. Strengths: easy to use, extensive algorithms. Typical use cases: tabular data, prototyping.
● TensorFlow: Best for deep learning. Strengths: scalable, production-ready. Typical use cases: large-scale DL, production.
● PyTorch: Best for deep learning research. Strengths: flexible, dynamic graphs. Typical use cases: research, experimentation.
● Keras: Best for deep learning beginners/prototyping. Strengths: simple, high-level API. Typical use cases: quick prototyping, education.