Machine Learning Basics
1. Machine Learning:
Why: ML enables computers to learn from experience (data) to perform tasks more accurately.
2. Supervised Learning:
Why: It's used for making predictions based on input data, e.g., predicting housing
prices based on features like size, location, etc.
3. Regression:
Why: It's useful when we want to predict a continuous outcome, such as predicting
house prices, stock prices, etc.
4. Classification:
Why: It's used when we want to classify data into categories, such as spam vs. non-spam emails, identifying handwritten digits, etc.
5. Unsupervised Learning:
Why: It's used for tasks where we don't have labeled data or when we want to
explore the structure of the data.
6. Clustering:
Why: It's useful for tasks like customer segmentation, image segmentation, etc.
7. Dimensionality Reduction:
Why: It's useful for visualizing high-dimensional data, reducing computational costs, and removing noise.
8. Deep Learning:
Why: It's used for tasks like image recognition, natural language processing, and
many more, often achieving state-of-the-art performance.
9. Reinforcement Learning:
Why: It's used in scenarios where an agent learns to optimize its actions over time to
maximize cumulative rewards.
Classification models are used when the output variable is a category or a class. For
example, classifying emails as spam or not spam, predicting whether a tumor is malignant or
benign.
Regression models are used when the output variable is a continuous value. For example,
predicting house prices, stock prices, etc.
It's important to choose the appropriate type of model based on the nature of the problem and the
type of output you want.
Classification Models:
1. Logistic Regression
Definition: Logistic regression is a statistical method for
analyzing a dataset in which there are one or more
independent variables that determine an outcome. It models
the probability of a binary outcome.
Why use it?: Logistic regression is suitable for binary classification tasks. It works well when the log-odds of the outcome are approximately linear in the input features. For example, predicting whether an email is spam or not spam based on features like the sender, subject, and content.
How does it work?: Logistic regression uses the logistic
function to model the probability of the binary outcome given
the input features. It calculates the weighted sum of the input
features and applies the logistic function to it, which maps the
output to a value between 0 and 1.
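A minimal sketch of this, assuming a toy one-feature dataset and plain gradient descent on the log-loss (all names and values here are illustrative, not from any library):

```python
import math

# Toy 1D dataset: negative x -> class 0, positive x -> class 1
X = [-2.0, -1.0, 1.0, 2.0]
Y = [0, 0, 1, 1]

def sigmoid(z):
    # logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Fit weight w and bias b by gradient descent on the log-loss
w, b = 0.0, 0.0
lr = 0.5
for _ in range(1000):
    # average gradients of the log-loss over the dataset
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(X, Y)) / len(X)
    grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(X, Y)) / len(X)
    w -= lr * grad_w
    b -= lr * grad_b

# Threshold the predicted probability at 0.5 to get class labels
preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in X]
```

On this separable toy data the fitted model recovers the labels exactly; real use would involve a library such as scikit-learn rather than hand-rolled gradient descent.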
2. Decision Trees
Definition: Decision trees are a non-parametric supervised
learning method used for classification and regression. They
split the data into subsets based on the most significant
attribute at each node.
Why use it?: Decision trees are useful for both categorical
and numerical data and provide a clear decision-making
process. They work well when the data has non-linear
relationships. For example, in healthcare, decision trees can
be used to predict whether a patient has a certain disease
based on symptoms.
How does it work?: Decision trees split the data based on
the features that provide the best separation of classes at
each node. This process continues recursively until the data is
split into pure subsets or a stopping criterion is met.
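The split-selection step can be sketched on a one-feature toy dataset using Gini impurity; `gini` and `best_split` are hypothetical helper names, not library functions:

```python
# Toy 1D dataset: feature value and class label
xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]

def gini(labels):
    # Gini impurity: 0 when a subset is "pure" (contains one class only)
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(xs, ys):
    # Try midpoints between consecutive sorted feature values and keep
    # the threshold whose size-weighted impurity is lowest
    order = sorted(set(xs))
    best_t, best_score = None, float("inf")
    for a, b in zip(order, order[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t
```

Here the best threshold is 6.5, which separates the two classes perfectly; a full tree repeats this search recursively on each resulting subset.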
3. Random Forests
Definition: Random forests are an ensemble learning method
that constructs multiple decision trees during training and
outputs the mode of the classes (classification).
Why use it?: Random forests improve upon decision trees by
reducing overfitting and increasing accuracy. They work well
for high-dimensional datasets and are robust to outliers. For
example, predicting customer churn in a subscription-based
service by analyzing customer behavior data.
How does it work?: Random forests train multiple decision trees on bootstrap samples of the data (often with a random subset of features at each split) and combine their outputs by majority vote for classification. This ensemble approach reduces variance and improves the overall performance of the model.
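A toy sketch of the two ingredients, bootstrap sampling and majority voting, with a deliberately crude one-split "tree" (`train_stump` is a hypothetical stand-in for real tree learning):

```python
import random
from collections import Counter

random.seed(0)
# Toy 1D dataset: (feature value, class label)
data = list(zip([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))

def train_stump(sample):
    # Crude one-split "tree": threshold at the mean x of the sample
    xs = [x for x, _ in sample]
    t = sum(xs) / len(xs)
    return lambda x: 1 if x > t else 0

stumps = []
for _ in range(25):
    # Bootstrap sample: draw n points with replacement
    boot = [random.choice(data) for _ in data]
    stumps.append(train_stump(boot))

def forest_predict(x):
    # Classification forests output the mode (majority vote) of the trees
    votes = Counter(s(x) for s in stumps)
    return votes.most_common(1)[0][0]
```

Each stump sees a slightly different dataset, so their individual errors tend to cancel out in the vote.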
4. Support Vector Machines (SVM)
Definition: SVM is a supervised learning model used for
classification. It finds the hyperplane that best separates data
into classes by maximizing the margin between classes.
Why use it?: SVM is effective in high-dimensional spaces and
is particularly useful when the number of features exceeds the
number of samples. It works well with both linear and non-linear data. For example, SVMs can be used in image
classification tasks to classify objects in images.
How does it work?: SVM constructs a hyperplane in the
feature space that best separates the classes. It maximizes
the margin between the closest points of different classes
(support vectors). For non-linear data, SVM can use a kernel
trick to map the data into a higher-dimensional space where a
hyperplane can be found.
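A minimal linear-SVM sketch, assuming toy 1D data and subgradient descent on the hinge loss (a real implementation would use a dedicated solver and optionally kernels):

```python
# Toy linearly separable 1D data with labels in {-1, +1}
data = [(-2.0, -1), (-1.0, -1), (1.0, 1), (2.0, 1)]

w, b = 0.0, 0.0
lr, lam = 0.1, 0.01  # learning rate and regularization strength

# Subgradient descent on the hinge loss: only points inside the
# margin (y * (w*x + b) < 1) push the separating hyperplane around;
# the lam term keeps the margin as wide as possible
for _ in range(200):
    for x, y in data:
        if y * (w * x + b) < 1:
            w += lr * (y * x - lam * w)
            b += lr * y
        else:
            w -= lr * lam * w

# Classify by which side of the hyperplane a point falls on
preds = [1 if w * x + b > 0 else -1 for x, _ in data]
```

The margin-violation condition is exactly the "support vector" idea: points comfortably outside the margin do not affect the solution.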
5. Neural Networks
Definition: Neural networks are algorithms designed to
recognize patterns, modeled loosely after the human brain.
Why use it?: Neural networks are highly flexible and can
learn complex patterns in data. They are suitable for large-scale problems with high-dimensional data. For example, in
sentiment analysis of product reviews, neural networks can
classify reviews as positive or negative based on text data.
How does it work?: Neural networks consist of layers of
interconnected nodes (neurons) that transmit signals between
each other. Each layer applies transformations to the input
data, gradually extracting features and learning
representations. The final layer produces the output, which
can be used for classification.
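A tiny forward pass illustrates these layered transformations; the weights below are set by hand (not trained) so that a two-layer network computes XOR, a function no single linear layer can represent:

```python
def relu(z):
    # a common non-linear activation: pass positives, zero out negatives
    return max(0.0, z)

# Hand-picked weights (not learned) for a 2-input, 2-hidden-unit network
def forward(x1, x2):
    h1 = relu(x1 + x2)          # hidden unit: "at least one input is on"
    h2 = relu(x1 + x2 - 1.0)    # hidden unit: "both inputs are on"
    return h1 - 2.0 * h2        # output layer combines the features: XOR
```

Each layer re-represents the input (here, as the two "on" features), and the final layer turns that representation into the class score.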
Regression Models:
1. Linear Regression
Definition: Linear regression is a statistical method used to
model the relationship between a dependent variable and one
or more independent variables by fitting a linear equation to
observed data.
Why use it?: Linear regression is straightforward and
interpretable. It works well when the relationship between the
independent and dependent variables is linear. For example,
predicting house prices based on features like square footage,
number of bedrooms, and location.
How does it work?: Linear regression calculates the
coefficients of the linear equation that best fits the data. It
minimizes the sum of the squared differences between the
observed and predicted values.
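For a single feature, the least-squares coefficients have a closed form and can be computed directly (toy data where y = 2x, chosen for illustration):

```python
# Toy dataset following y = 2 * x exactly
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for simple linear regression:
# slope = cov(x, y) / var(x), intercept = mean_y - slope * mean_x
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
```

These formulas are what minimizing the sum of squared differences works out to in the one-feature case; with many features, libraries solve the equivalent matrix problem.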
2. Decision Trees (for Regression)
Definition: Decision trees can also be used for regression
tasks. Instead of predicting classes, they predict continuous
values at leaf nodes.
Why use it?: Decision trees are useful for regression when
the relationship between features and target variables is non-linear. For example, predicting the price of a used car based
on its age, mileage, and condition.
How does it work?: Decision trees recursively split the data
based on the features that provide the best separation of
values. The predicted value at a leaf node is the average of
the target values of the samples in that node.
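A sketch of the regression split criterion on toy data, choosing the threshold that minimizes the sum of squared errors around each side's mean (`sse` and `best_split` are illustrative helper names):

```python
# Toy 1D dataset: feature value and continuous target
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]

def sse(values):
    # sum of squared errors around the mean: the regression split criterion
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    order = sorted(set(xs))
    best_t, best_score = None, float("inf")
    for a, b in zip(order, order[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = sse(left) + sse(right)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

t = best_split(xs, ys)
left = [y for x, y in zip(xs, ys) if x <= t]
pred_left = sum(left) / len(left)  # leaf prediction: mean target on that side
```

Here the best threshold is 6.5 and the left leaf predicts 5.0, the mean of its samples' targets.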
3. Random Forests (for Regression)
Definition: Random forests can be applied to regression
tasks as well, where they output the mean prediction of the
individual trees.
Why use it?: Random forests are robust to overfitting and
handle high-dimensional datasets well, making them suitable
for regression tasks with many input variables. For example,
predicting the sales volume of a product based on various
marketing factors.
How does it work?: Random forests train multiple decision
trees on random subsets of the data and average their
predictions. This ensemble approach reduces variance and
improves the overall performance of the model.
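A toy sketch of bagging for regression, averaging deliberately crude one-split stumps (`train_stump` is a hypothetical stand-in for full tree learning):

```python
import random

random.seed(0)
# Toy 1D dataset: (feature value, continuous target)
data = list(zip([1, 2, 3, 10, 11, 12], [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]))

def train_stump(sample):
    # Crude one-split regression tree: threshold at the mean x of the
    # sample, leaf values are the mean target on each side
    xs = [x for x, _ in sample]
    t = sum(xs) / len(xs)
    left = [y for x, y in sample if x <= t] or [y for _, y in sample]
    right = [y for x, y in sample if x > t] or [y for _, y in sample]
    lv, rv = sum(left) / len(left), sum(right) / len(right)
    return lambda x: lv if x <= t else rv

# Train each stump on its own bootstrap sample
stumps = [train_stump([random.choice(data) for _ in data]) for _ in range(25)]

def forest_predict(x):
    # Regression forests output the mean of the individual tree predictions
    return sum(s(x) for s in stumps) / len(stumps)
```

Averaging smooths out the noise each stump picks up from its bootstrap sample, which is the variance reduction described above.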
4. Support Vector Machines (SVM) (for Regression)
Definition: SVM can also be used for regression (SVR), where it fits a function to the data while ignoring errors smaller than a specified margin (the epsilon-tube).
Why use it?: SVM regression is useful when dealing with
datasets with high dimensionality or when there are outliers.
For example, predicting stock prices based on historical data.
How does it work?: SVM regression fits a function so that as many points as possible lie within an epsilon-wide tube around it, penalizing only the points that fall outside the tube. A regularization term simultaneously keeps the function as flat as possible.
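The standard SVR formulation is built on the epsilon-insensitive loss; a minimal sketch of just that loss (the tube width `eps` here is an illustrative value):

```python
def epsilon_insensitive_loss(y_true, y_pred, eps=1.0):
    # SVR's loss: errors smaller than eps cost nothing; larger errors
    # are penalized linearly by how far they exceed the tube
    return max(0.0, abs(y_true - y_pred) - eps)

print(epsilon_insensitive_loss(10.0, 10.5))  # inside the tube -> 0.0
print(epsilon_insensitive_loss(10.0, 13.0))  # outside the tube -> 2.0
```

Because small errors are ignored entirely, only points outside the tube (the support vectors) shape the fitted function, which is what makes SVR robust to minor noise.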
5. Neural Networks (for Regression)
Definition: Neural networks can be used for regression tasks
by predicting a continuous value as the output.
Why use it?: Neural networks are capable of learning
complex relationships in data and are suitable for regression
tasks where the relationship is non-linear. For example,
predicting the temperature based on weather variables like
humidity, pressure, and wind speed.
How does it work?: Neural networks consist of layers of
interconnected nodes (neurons) that transmit signals between
each other. Each layer applies transformations to the input
data, gradually extracting features and learning
representations. The final layer produces the output, which
can be a continuous value for regression tasks.
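A hand-set (untrained) example shows how a linear output layer yields a continuous value: this two-hidden-unit network computes |x| exactly, via relu(x) + relu(-x):

```python
def relu(z):
    return max(0.0, z)

# Weights chosen by hand (not learned), purely to illustrate the
# architecture: two hidden units feed a linear (unsquashed) output
def net(x):
    h1 = relu(1.0 * x)           # hidden unit 1: fires for positive x
    h2 = relu(-1.0 * x)          # hidden unit 2: fires for negative x
    return 1.0 * h1 + 1.0 * h2   # linear output layer -> continuous value
```

The only structural difference from the classification case is the output layer: no threshold or softmax, just a real-valued combination of the last hidden layer.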