
UNIT I
Introduction to Machine Learning and Supervised Learning

Code: U18CST7002
Presented by: Nivetha R
Department: CSE
Learning Multiple Classes

• Definition: A classification task with more than two classes.
• Example:
• Classes: Family Car, Sports Car, Luxury Sedan.
• Goal: Learn the boundaries separating each class.
• Handling Doubt:
• Doubt Cases: when no hypothesis, or more than one hypothesis, predicts 1 for an instance.
• Rejection: the classifier rejects instances in doubt regions for further human review, as sketched below.
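As a concrete illustration, here is a minimal Python sketch of the doubt-handling rule above. The three one-vs-rest threshold hypotheses and their cut-off values are hypothetical, not taken from the slides.

```python
# Sketch: one hypothesis per class; an instance is labeled only when exactly
# one hypothesis outputs 1, otherwise it falls in a doubt region.
def h_family(x):  return int(20000 <= x["price"] <= 40000)   # hypothetical rule
def h_sports(x):  return int(x["engine_power"] >= 200)       # hypothetical rule
def h_luxury(x):  return int(x["price"] > 60000)             # hypothetical rule

def classify(x):
    votes = [name for name, h in [("Family Car", h_family),
                                  ("Sports Car", h_sports),
                                  ("Luxury Sedan", h_luxury)] if h(x) == 1]
    # Exactly one vote: confident prediction; zero or several: reject.
    return votes[0] if len(votes) == 1 else "REJECT (doubt region)"

print(classify({"price": 30000, "engine_power": 120}))  # Family Car
print(classify({"price": 70000, "engine_power": 250}))  # two hypotheses fire -> reject
```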
Example 1

Consider the problem of assigning the label “family car” (indicated by “1”) or “not family car” (indicated by “0”) to cars. Given the following training set for the problem, and assuming that the hypothesis space is defined by (p1 ≤ price ≤ p2) AND (e1 ≤ engine power ≤ e2), find the version space for the problem.
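Before the worked solution, here is a sketch of how one part of the version space, the most specific hypothesis S (the tightest rectangle around the positive examples), could be computed. The training data below is hypothetical, standing in for the table on the slide.

```python
# Hypothetical training data: (price, engine power) pairs.
positives = [(32000, 120), (38000, 150), (45000, 180)]   # label 1
negatives = [(15000, 60), (70000, 300)]                   # label 0

# S = tightest rectangle (p1 <= price <= p2) AND (e1 <= power <= e2)
# that contains every positive example.
p1 = min(p for p, e in positives)
p2 = max(p for p, e in positives)
e1 = min(e for p, e in positives)
e2 = max(e for p, e in positives)
S = (p1, p2, e1, e2)

def predict(h, x):
    p1, p2, e1, e2 = h
    price, power = x
    return int(p1 <= price <= p2 and e1 <= power <= e2)

# S is consistent with the training set if it labels every negative 0.
assert all(predict(S, x) == 0 for x in negatives)
print("Most specific hypothesis S:", S)
```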
Example 1-solution

Regression

• Dependent Variable (y):
• The variable we are trying to predict or explain.
• Also known as the response variable.
• Independent Variable (x):
• The variable used to predict the dependent variable.
• Also known as the predictor variable or feature.
• Linear Relationship:
• The relationship between the dependent and independent variables is modeled as a straight line.
• The general form of the linear equation: y = mx + c
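A minimal sketch of fitting y = mx + c by least squares with NumPy; the data points are invented for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # roughly y = 2x

m, c = np.polyfit(x, y, deg=1)             # degree-1 polynomial = straight line
print(f"y = {m:.2f}x + {c:.2f}")
print("prediction at x = 6:", m * 6 + c)
```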
Regression

Video: https://www.youtube.com/watch?v=CtsRRUddV2s
Regression

Regression is a type of supervised learning where the output is a numeric value, not a Boolean class.

• Interpolation: finding a function f(x) that passes through all data points when there is no noise: r = f(x).
• Extrapolation: predicting values outside the range of the training set.
• Regression with Noise: the output includes random noise: r = f(x) + ε, where ε is the noise term.
Regression

Polynomial Regression:
1. Extends linear regression by considering polynomial relationships between the dependent and independent variables.
2. Can model non-linear data more accurately, as the sketch below shows.
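A short sketch on synthetic data, comparing a straight-line fit with a degree-3 polynomial fit to a cubic signal:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = x**3 - 2 * x + rng.normal(0, 1, size=x.shape)   # cubic signal + noise

linear = np.polyfit(x, y, deg=1)   # underfits the curvature
cubic  = np.polyfit(x, y, deg=3)   # matches the underlying function

# Compare the mean squared residual of the two fits.
for name, coeffs in [("linear", linear), ("cubic", cubic)]:
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"{name} fit MSE: {mse:.2f}")
```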
Model Selection and Generalization
• Model Selection: Choosing the best model based on
performance metrics.
• Generalization: The model's ability to perform well on
new, unseen data.
The Problem of Learning
• Hypothesis Elimination: each training example eliminates the hypotheses that are inconsistent with it (in the ideal case, about half of those remaining).
• Ill-Posed Problem: with a limited number of training examples, the solution is not unique; multiple hypotheses remain consistent with the data.
• Inductive Bias: the set of assumptions made so that there is a unique solution with the given data.
Examples:
• Assuming the shape of a rectangle for family cars.
• Assuming a linear function in linear regression.
Model Selection and Generalization
• Hypothesis Class Capacity: the class of functions that can be learned depends on the capacity of the hypothesis class.
Model Selection:
• Inductive Bias: necessary for learning, and impacts model selection.
• Goal: choose the hypothesis class H that generalizes well to new data.
• Generalization: the ability of a model to make accurate predictions on new, unseen data.
Complexity and Generalization:
• Underfitting: H is too simple (e.g., fitting a line to third-order polynomial data).
• Overfitting: H is too complex (e.g., fitting a sixth-order polynomial to noisy data).
• Optimal Complexity: match the complexity of H with the complexity of the underlying function.
Model Selection and Generalization
• Triple Trade-Off (Dietterich 2003)
Factors:
• Complexity of the hypothesis class H.
• Amount of training data.
• Generalization error on new examples.
Balancing:
• Increasing the amount of training data reduces generalization error.
• Increasing model complexity first reduces, then increases, generalization error.
Validation and Cross-Validation:
• Validation Set: used to test generalization ability.
• Cross-Validation: dividing the data into training and validation sets multiple times to ensure robust model selection.
Model Selection and Generalization
Steps for Model Selection (sketched in code below):
1. Divide Data: split the dataset into training, validation, and test sets.
2. Fit Models: train models on the training set for different hypothesis classes Hi.
3. Evaluate Models: use the validation set to measure generalization error.
4. Select Best Model: choose the model with the least validation error.
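A sketch of this procedure on synthetic data, using polynomial degrees 1-6 as the hypothesis classes Hi:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 90)
y = x**3 - 2 * x + rng.normal(0, 2, size=x.shape)

# Step 1: divide the data into training, validation, and test sets.
idx = rng.permutation(len(x))
train, val, test = idx[:50], idx[50:70], idx[70:]

def mse(coeffs, i):
    return np.mean((np.polyval(coeffs, x[i]) - y[i]) ** 2)

# Step 2: fit one model per hypothesis class on the training set.
fits = {d: np.polyfit(x[train], y[train], deg=d) for d in range(1, 7)}

# Steps 3-4: evaluate on the validation set and select the least error.
best = min(fits, key=lambda d: mse(fits[d], val))
print("validation MSE by degree:",
      {d: round(mse(c, val), 2) for d, c in fits.items()})

# The held-out test set gives an unbiased estimate for the chosen model.
print(f"selected degree {best}; test MSE = {mse(fits[best], test):.2f}")
```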
Test Set
• Purpose: Evaluate the final model's performance on unseen
data.
• Importance: Avoids overfitting to validation set, providing
an unbiased estimate of model performance.
Analogy:
• Training Set: Problems solved in class.
• Validation Set: Exam questions.
• Test Set: Real-world problems.
Bayes Theorem in ML
• Bayes' theorem is used in machine learning where we need to predict classes precisely and accurately.
• It is used to calculate conditional probabilities in machine learning applications that involve classification tasks.
• A simplified version of Bayes' theorem (Naïve Bayes classification) is also used to reduce computation time and average project cost.
• It helps determine the probability of an event given prior knowledge.
• It is used to calculate the probability of one event occurring given that another event has already occurred.
• It relates conditional probability and marginal probability.
• It is extensively applied in finance, health and medicine, research and surveys, and the aeronautical sector.
Bayes Theorem in ML
• In machine learning, we try to determine the best hypothesis from some hypothesis space H, given the observed training data X.
• In Bayesian learning, the best hypothesis means the most probable hypothesis, given the data X plus any initial knowledge about the prior probabilities of the various hypotheses in H.
• Prior knowledge can be combined with observed data to determine the final probability of a hypothesis.
• In Bayesian learning, prior knowledge is provided by asserting:
– a prior probability for each candidate hypothesis, and
– a probability distribution over the observed data for each possible hypothesis.
• Bayesian methods can accommodate hypotheses that make probabilistic predictions.
• New instances can be classified by combining the predictions of multiple hypotheses, weighted by their probabilities.
Bayesian Classification
• Uses Bayes' theorem to calculate the probability of each class given the input data.
• Selects the class with the highest posterior probability.
• Hence, Bayes' theorem can be written as:
posterior = (likelihood × prior) / marginal
i.e., P(C|x) = P(x|C) P(C) / P(x)
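A small numerical sketch of this computation, with made-up priors and likelihoods for a two-class (spam/ham) problem:

```python
priors = {"spam": 0.4, "ham": 0.6}        # P(C), hypothetical values
likelihoods = {"spam": 0.8, "ham": 0.1}   # P(x | C) for an observed feature x

marginal = sum(likelihoods[c] * priors[c] for c in priors)           # P(x)
posteriors = {c: likelihoods[c] * priors[c] / marginal for c in priors}

print(posteriors)                          # spam: ~0.84, ham: ~0.16
print("predicted class:", max(posteriors, key=posteriors.get))
```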
Bayes Decision Theory
Losses and Risk

• Loss: the cost associated with making an incorrect decision. It quantifies the penalty for errors in decision-making.
Example: in medical diagnosis, predicting no disease when the patient has one (a false negative) can delay treatment. The loss could include additional medical costs and health deterioration.
• Risk: the expected loss, which considers both the probability of the various outcomes and their associated losses. Minimizing risk factors in both the likelihood and the cost of each decision.
Example: in loan approval, the bank uses this risk to decide on loan approvals, aiming to minimize financial losses.
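A sketch of risk-based decision making: the expected risk of an action is its loss under each class, weighted by the class posteriors, i.e. R(action | x) = Σk loss(action, Ck) P(Ck | x). The loss matrix and posterior values below are hypothetical loan-approval numbers.

```python
posteriors = {"repays": 0.85, "defaults": 0.15}   # P(C_k | x) for one applicant

# loss[action][true class]: denying a good customer costs lost interest;
# approving a defaulter costs the principal (values are illustrative).
loss = {
    "approve": {"repays": 0.0, "defaults": 10.0},
    "deny":    {"repays": 1.0, "defaults": 0.0},
}

# Expected risk of each action, then pick the minimum-risk action.
risk = {a: sum(loss[a][c] * posteriors[c] for c in posteriors) for a in loss}
print(risk)                                        # approve: 1.5, deny: 0.85
print("chosen action:", min(risk, key=risk.get))   # deny minimizes expected loss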
Discriminant Functions
Discriminant functions are mathematical functions used
to distinguish between different classes in a dataset.
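For example, with the Bayesian choice g_i(x) = log P(x|C_i) + log P(C_i), classification assigns x to the class with the largest discriminant. A minimal sketch, assuming made-up one-dimensional Gaussian class models:

```python
import math

classes = {
    # class: (mean, std, prior) for a single feature; all values hypothetical
    "family": (30000.0, 5000.0, 0.5),
    "luxury": (70000.0, 10000.0, 0.5),
}

def g(x, mean, std, prior):
    # log of a Gaussian density plus log prior
    log_likelihood = (-0.5 * ((x - mean) / std) ** 2
                      - math.log(std * math.sqrt(2 * math.pi)))
    return log_likelihood + math.log(prior)

x = 42000.0   # a car's price
scores = {c: g(x, *params) for c, params in classes.items()}
print("chosen class:", max(scores, key=scores.get))
```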
Association Rules
• Association rules are used to find interesting relationships or patterns between variables in large datasets.
• An association rule is an implication of the form X→Y, where X is the antecedent and Y is the consequent of the rule.
• Example: market basket analysis, where the goal is to discover associations between products purchased together by customers.
Keys used to measure Association Rules
• Support:
• The support of an association rule X→Y measures how frequently the items X and Y appear together in the dataset.
• Support(X→Y) = (number of transactions containing both X and Y) / (total number of transactions)
• Confidence:
• The confidence of an association rule X→Y measures the probability that Y is purchased given that X is purchased.
• Confidence(X→Y) = Support(X ∪ Y) / Support(X)
• Lift:
• Lift (also known as interest) measures the strength of an association rule compared to the random co-occurrence of X and Y.
• Lift(X→Y) = Confidence(X→Y) / Support(Y)
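A short sketch computing all three measures for the rule {milk} → {bread} on a tiny hypothetical transaction set:

```python
transactions = [{"milk", "bread"}, {"milk", "diapers"},
                {"milk", "bread", "diapers"}, {"bread", "eggs"}]
n = len(transactions)
X, Y = {"milk"}, {"bread"}

support_XY = sum(1 for t in transactions if X | Y <= t) / n   # P(X and Y)
support_X  = sum(1 for t in transactions if X <= t) / n       # P(X)
support_Y  = sum(1 for t in transactions if Y <= t) / n       # P(Y)

confidence = support_XY / support_X
lift = confidence / support_Y

print(f"support={support_XY:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
# support=0.50, confidence=0.67, lift=0.89 -> lift < 1: a weak association here
```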
Apriori algorithm
• The Apriori algorithm is a popular method for mining frequent itemsets and generating association rules. It operates in two main steps (a minimal sketch follows below):
• Finding Frequent Itemsets:
• The algorithm first identifies itemsets that have sufficient support.
• It uses the fact that any subset of a frequent itemset must also be frequent to reduce the search space.
• Generating Rules:
• Once frequent itemsets are identified, the algorithm generates rules by dividing the itemsets into antecedents and consequents and calculating their confidence.
• Rules that do not meet the minimum confidence threshold are discarded.
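A minimal, illustrative implementation of both steps (a from-scratch sketch, not an optimized version of the algorithm):

```python
from itertools import combinations

def apriori(transactions, min_support=0.5, min_confidence=0.6):
    """Minimal Apriori: find frequent itemsets, then generate rules."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 1: frequent itemsets, level by level (1-itemsets, 2-itemsets, ...).
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    level, k = {s for s in items if support(s) >= min_support}, 1
    while level:
        frequent.update({s: support(s) for s in level})
        # Join frequent k-itemsets into (k+1)-itemset candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune: every k-subset of a candidate must itself be frequent.
        level = {c for c in candidates
                 if support(c) >= min_support
                 and all(frozenset(s) in frequent for s in combinations(c, k))}
        k += 1

    # Step 2: rules X -> Y with confidence = support(X ∪ Y) / support(X).
    rules = []
    for itemset in (s for s in frequent if len(s) > 1):
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = frequent[itemset] / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), conf))
    return frequent, rules

transactions = [{"milk", "bread"}, {"milk", "diapers"},
                {"milk", "bread", "diapers"}, {"bread"}]
freq, rules = apriori(transactions, min_support=0.5, min_confidence=0.7)
for X, Y, c in rules:
    print(f"{X} -> {Y} (confidence {c:.2f})")
```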
References
1. Ethem Alpaydin, “Introduction to Machine Learning”, Second Edition, MIT Press, 2013.
2. Tom M. Mitchell, “Machine Learning”, McGraw-Hill Education, 2013.
3. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, CRC Press, 2009.
4. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin, “Learning from Data”, AMLBook, 2012.
5. K. P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
6. M. Mohri, A. Rostamizadeh, and A. Talwalkar, “Foundations of Machine Learning”, MIT Press, 2012.
