AIML105
This course material includes expanded examples, additional context, and deeper
insights to ensure comprehensive preparation for the AIML105 examination.
Machine Learning (ML) is a branch of Artificial Intelligence that allows systems to learn
and make decisions without being explicitly programmed. This is achieved by training
algorithms on data to identify patterns and make predictions.
Real-Life Example: Spam email detection uses ML algorithms like Naive Bayes to
classify emails as spam or legitimate based on patterns in the content, such as specific
keywords or email structure.
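Below is a minimal sketch of this idea using scikit-learn's MultinomialNB; the tiny email
corpus and labels are fabricated for illustration.

# Minimal spam-classifier sketch with scikit-learn (assumed installed).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = [
    "win a free prize now",          # spam
    "meeting agenda for tomorrow",   # legitimate
    "claim your free reward today",  # spam
    "project report attached",       # legitimate
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = legitimate

# Turn raw text into keyword counts: the "patterns in the content".
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

model = MultinomialNB()
model.fit(X, labels)

test = vectorizer.transform(["free prize waiting for you"])
print(model.predict(test))  # expected: [1], i.e. spam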
• Overfitting: When a model learns the noise in the training data along with the
underlying signal, leading to excellent training performance but poor generalization
on new data.
Preventing Overfitting: common techniques include regularization, cross-validation,
and data augmentation.
Example: Imagine training a house price prediction model with square footage as the
only feature. An overfitted model might predict perfectly for training data but fail on
unseen properties.
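A quick way to see this effect is to fit polynomials of different degrees to a handful of
synthetic price points; the data below is made up, and the high-degree fit stands in for
an overfitted model.

# Overfitting sketch: a degree-7 polynomial interpolates the training
# noise exactly, then extrapolates badly on an unseen property.
import numpy as np

rng = np.random.default_rng(0)
sqft = np.linspace(0.5, 2.5, 8)  # square footage, in thousands
price = 100_000 * sqft + rng.normal(0, 20_000, sqft.size)

simple = np.polyfit(sqft, price, deg=1)  # plausible straight-line model
wiggly = np.polyfit(sqft, price, deg=7)  # fits the training noise exactly

new_sqft = 3.0  # an unseen, larger property
print(np.polyval(simple, new_sqft))  # reasonable extrapolation
print(np.polyval(wiggly, new_sqft))  # typically far off the real trend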
Feature Scaling
Feature scaling ensures that numerical data is on a similar scale, preventing large
values from dominating small ones during model training.
Example: In a dataset with weight (in kg) and height (in cm), scaling ensures both
features contribute equally to a BMI prediction model.
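A minimal standardization sketch with scikit-learn's StandardScaler; the weight and
height values are illustrative.

# Rescale weight (kg) and height (cm) to zero mean and unit variance
# so neither column dominates during training.
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([
    [70.0, 175.0],  # [weight_kg, height_cm]
    [55.0, 160.0],
    [90.0, 182.0],
])

scaled = StandardScaler().fit_transform(data)
print(scaled.mean(axis=0))  # ~0 for both columns
print(scaled.std(axis=0))   # ~1 for both columns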
Classification vs Regression
Classification predicts discrete labels (e.g., spam vs. legitimate email), while
regression predicts continuous values (e.g., a house price).
Logistic Regression
Despite its name, logistic regression is a classification algorithm: it models the
probability of a binary outcome by passing a linear combination of features through
the sigmoid function.
Real-Life Example: Predicting customer churn (yes or no) based on behaviour metrics
like time spent on the platform and number of purchases.
Additional Context: Logistic regression is widely used due to its simplicity and
interpretability. It works well for binary outcomes but can be extended to multi-class
problems using techniques like One-vs-Rest (OvR).
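A minimal churn sketch with scikit-learn's LogisticRegression; the feature values and
labels below are fabricated.

# Features per customer: [hours_on_platform, num_purchases].
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5, 0], [1.0, 1], [8.0, 12], [6.5, 9], [0.2, 0], [7.0, 10]])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = churned, 0 = retained

model = LogisticRegression()
model.fit(X, y)

# predict_proba gives a churn probability, not just a hard yes/no.
print(model.predict_proba([[0.8, 1]]))  # likely a high churn probability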
K-Means Clustering
An unsupervised algorithm that groups data into ‘k’ clusters by minimizing the
intra-cluster distance.
Steps:
1. Choose ‘k’ initial centroids.
2. Assign each data point to its nearest centroid.
3. Recompute each centroid as the mean of its assigned points.
4. Repeat steps 2 and 3 until the assignments stop changing.
A runnable sketch follows the steps.
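# K-means sketch with scikit-learn on synthetic 2-D points.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 2], [1, 4], [1, 0],
                   [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(points)

print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # final centroid positions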
Cross-Validation
This technique evaluates model performance by splitting data into training and
validation subsets multiple times to ensure the model generalizes well.
Example: In a 5-fold cross-validation, data is split into 5 parts. Each part is used as
validation data while the others train the model, cycling through all parts.
Additional Context: Cross-validation helps prevent data leakage and ensures that the
model’s performance is not skewed by a particular split of training and testing data.
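A minimal 5-fold example with scikit-learn's cross_val_score, using the bundled iris
dataset for convenience.

# cv=5 splits the data into 5 folds; each fold serves once as validation.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # average estimate of generalization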
Pooling
Pooling layers in convolutional neural networks downsample feature maps, keeping
the most salient activations while reducing computation.
Example: In facial recognition, pooling retains critical features like eyes or nose while
discarding unnecessary details like background.
Additional Context: Modern architectures like ResNet often use global average
pooling to reduce feature maps to a single value per channel before classification.
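To make the mechanics concrete, here is a plain-NumPy sketch of 2x2 max pooling on
a small synthetic feature map; real networks use library layers such as
torch.nn.MaxPool2d.

# Each output cell keeps only the strongest activation in its 2x2 window.
import numpy as np

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [0, 2, 9, 8],
    [3, 1, 7, 5],
])

h, w = feature_map.shape
pooled = feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 2]
#  [3 9]]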
Data Augmentation
Data augmentation artificially expands a training set by applying label-preserving
transformations, such as flipping, rotating, or cropping images, which improves
robustness and helps prevent overfitting.
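A NumPy-only sketch of simple label-preserving transforms on a synthetic "image";
production pipelines typically use libraries such as torchvision or albumentations.

import numpy as np

image = np.arange(16).reshape(4, 4)  # stand-in for a real image

flipped = np.fliplr(image)  # horizontal flip
rotated = np.rot90(image)   # 90-degree rotation
noisy = image + np.random.default_rng(0).normal(0, 0.1, image.shape)

# Each variant keeps the original label, multiplying the training data.
print(flipped, rotated, noisy, sep="\n\n")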
Hyperparameter Optimization
Hyperparameters are settings not learned by the model (e.g., learning rate, number
of layers). Optimization ensures the best model performance.
Example: Finding the optimal learning rate and batch size for training a deep neural
network.
Additional Context: Early stopping can also be used to halt training when
performance on a validation set stops improving, saving computational resources.
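A minimal grid-search sketch with scikit-learn's GridSearchCV; an SVM's C and gamma
stand in here for hyperparameters like learning rate and batch size, which scikit-learn
estimators do not expose.

# Grid search exhaustively tries every combination in the grid and keeps
# the one with the best cross-validated score.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # best combination found
print(search.best_score_)   # its mean cross-validated accuracy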
Optimization Algorithms
• Stochastic Gradient Descent (SGD): Updates parameters using gradients computed
on small batches; simple and memory-efficient, but sensitive to the learning rate.
• Adam Optimizer: Extends SGD with momentum and per-parameter adaptive learning
rates for faster convergence.
Comparison: SGD is lightweight and often generalizes well when tuned carefully,
whereas Adam typically converges faster with little tuning at the cost of extra
optimizer state per parameter.
Additional Context: Optimizers like RMSprop are particularly useful for handling
non-stationary objectives, such as those encountered in reinforcement learning.
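A short PyTorch sketch (assuming torch is installed) showing that swapping optimizers
changes only the update rule, not the training loop.

import torch

model = torch.nn.Linear(3, 1)  # tiny stand-in model

sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
adam = torch.optim.Adam(model.parameters(), lr=0.001)  # adaptive rates

x = torch.randn(8, 3)
y = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()

for optimizer in (sgd, adam):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # fresh forward pass each time
    loss.backward()
    optimizer.step()  # one parameter update under each rule
    print(type(optimizer).__name__, loss.item())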
1. Explain overfitting and underfitting with examples. How can they be prevented?
Answer: Overfitting occurs when a model captures noise along with the signal,
leading to excellent performance on training data but poor generalization to new
data. Example: A model that memorizes exact data points in a training dataset. To
prevent overfitting, use regularization, cross-validation, or data augmentation.
Underfitting happens when a model is too simple, failing to capture the underlying
data patterns. Example: A linear regression model applied to non-linear data.
Solutions include increasing model complexity or training longer.
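Since the answer names regularization as a remedy, here is a sketch contrasting plain
least squares with Ridge's L2 penalty on synthetic data where only one feature
actually matters.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))
y = X[:, 0] + rng.normal(0, 0.1, 20)  # only feature 0 carries signal

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

# Ridge shrinks the coefficients on the irrelevant features toward zero.
print(np.abs(plain.coef_[1:]).mean())
print(np.abs(ridge.coef_[1:]).mean())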
2. Describe k-means clustering and give a real-world example.
Answer: K-means clustering is an unsupervised algorithm that divides data into ‘k’
clusters by iteratively updating centroids and assigning data points based on
proximity. Real-world example: Segmenting customers by purchasing patterns in
e-commerce to target specific marketing campaigns.
3. How does grid search help with hyperparameter optimization?
Answer: Grid search exhaustively evaluates every combination of hyperparameter
values in a predefined grid and selects the combination with the best validation score.
Additional Context: Random search is a simpler alternative for scenarios with a limited
computational budget, often yielding comparable results to grid search.
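A minimal random-search sketch with scikit-learn's RandomizedSearchCV, sampling
from continuous distributions instead of trying a fixed grid.

from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Sample 10 random (C, gamma) combinations rather than the full grid.
param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e0)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=10, cv=5,
                            random_state=0)
search.fit(X, y)

print(search.best_params_)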
4. What is anomaly detection, and how is it performed in unsupervised learning?
Answer: Anomaly detection identifies patterns or behaviours that deviate from the
norm. In unsupervised learning, clustering algorithms like DBSCAN or statistical
methods detect outliers without labelled data. Example: Detecting unusual login
patterns or excessive file access in cybersecurity to identify potential breaches.
Additional Context: Deep learning methods like autoencoders can learn compressed
representations of normal data, flagging deviations as anomalies.
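A minimal outlier-detection sketch with scikit-learn's DBSCAN on synthetic 2-D data;
points assigned the label -1 fit no dense cluster and are treated as anomalies.

import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
normal = rng.normal(0, 0.5, size=(50, 2))       # dense "normal behaviour"
outliers = np.array([[5.0, 5.0], [-4.0, 6.0]])  # unusual activity
data = np.vstack([normal, outliers])

labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(data)
print(np.where(labels == -1)[0])  # indices flagged as anomalies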