Introduction To Machine Learning and Logistic Regression
Machine Learning: Classification & Regression
Presented by: K M Abir Mahmud, CEO, Skoder
25/09/2024 1
What is Logistic Regression?
Logistic Regression is a statistical method used for binary
classification that models the probability of a binary
outcome based on one or more predictor variables.
Key Points:
•Used for binary classification (e.g., yes/no, 0/1).
•Outputs probabilities using the sigmoid function.
•Assumes a linear relationship between the independent variables and the log-odds of the dependent variable.
How Logistic Regression Works
- Uses the logistic function (sigmoid) to map predicted values to probabilities.
- The decision boundary is determined by the threshold (commonly 0.5).
Formula (logistic function):
σ(z) = 1 / (1 + e^(−z)),  where z = β₀ + β₁x₁ + … + βₙxₙ
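As an illustration, the logistic function can be written in a few lines of plain Python (a minimal sketch, independent of any ML library):

```python
import math

def sigmoid(z):
    """Map any real-valued input to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(-4))  # close to 0
print(sigmoid(0))   # exactly 0.5
print(sigmoid(4))   # close to 1
```

With a 0.5 threshold, any input whose linear score z is positive is classified as 1, and any negative score as 0.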
How Logistic Regression Looks
Classification vs Regression
Decision Boundary
Explanation:
•The decision boundary separates the two classes.
•Logistic Regression finds the best boundary (line or curve) to classify the data.
Key Assumptions of Logistic Regression
Assumptions:
- Binary dependent variable.
- Independent variables can be continuous or categorical.
- No multicollinearity between predictor variables.
- Large sample size is preferred.
The Cost Function
Cost Function Explanation:
Logistic Regression uses the Log Loss (Cross-Entropy Loss)
function to measure the performance of the model.
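A minimal sketch of the log loss in plain Python (the `eps` clipping is a common implementation detail to avoid log(0), not part of the mathematical definition):

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Average cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip probabilities away from 0 and 1
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Confident correct predictions give a small loss ...
print(log_loss([1, 0], [0.9, 0.1]))
# ... while confident wrong predictions are penalised heavily.
print(log_loss([1, 0], [0.1, 0.9]))
```

This asymmetry is why log loss trains better probability estimates than squared error for classification: being confidently wrong costs far more than being mildly wrong.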
Gradient Descent in Logistic Regression
Concept:
Gradient Descent minimizes the cost function by iteratively adjusting the model parameters.
At each iteration, the parameters are updated in the direction that reduces the cost, until they settle near their optimal values.
For logistic regression, this method is applied to the log loss cost function described above.
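The idea can be sketched in plain Python on a toy one-feature dataset (the data points, learning rate, and iteration count are illustrative choices, not prescribed values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D data: small x values belong to class 0, large x values to class 1.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    # Gradient of the average log loss; the per-example error is (p - y).
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

# After training, the decision boundary sits between the two clusters.
print(sigmoid(w * 2.0 + b))  # probability well below 0.5
print(sigmoid(w * 7.0 + b))  # probability well above 0.5
```

Note how the gradient of the log loss reduces to the simple "error times input" form, which is what makes this update cheap to compute.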
Applications of Logistic Regression
Applications:
•Medical diagnosis (e.g., predicting disease presence).
•Credit scoring (e.g., predicting default).
•Marketing (e.g., predicting customer purchase likelihood).
Basic Implementation in Python
Age Salary Purchased
22 25000 0
25 32000 0
47 28000 1
52 90000 1
46 68000 1
56 76000 1
48 69000 0
55 80000 1
60 83000 1
62 87000 1
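One way to fit a logistic regression model to the table above is with scikit-learn (a sketch assuming scikit-learn is installed; the scaling step is an added choice to keep the large salary values from dominating the fit):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Features (Age, Salary) and target (Purchased) from the table above.
X = [[22, 25000], [25, 32000], [47, 28000], [52, 90000], [46, 68000],
     [56, 76000], [48, 69000], [55, 80000], [60, 83000], [62, 87000]]
y = [0, 0, 1, 1, 1, 1, 0, 1, 1, 1]

# Standardize, then fit; the pipeline applies both steps to new data too.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

print(model.predict([[30, 40000], [58, 85000]]))  # predicted class labels
print(model.predict_proba([[58, 85000]]))         # [P(class 0), P(class 1)]
```

On a dataset this small the fit is only a demonstration; in practice the data would be split into training and test sets before judging the model.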
Interpretation of Results
Key Metrics:
Accuracy: Percentage of correct predictions.
Confusion Matrix: Shows true positives, true negatives,
false positives, and false negatives.
Classification Report: Provides precision, recall, and F1-score.
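All three metrics are available in scikit-learn; the true labels and predictions below are hypothetical, for illustration only:

```python
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Hypothetical true labels and model predictions.
y_true = [0, 0, 1, 1, 1, 0, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1, 1]

print(accuracy_score(y_true, y_pred))        # fraction of correct predictions
print(confusion_matrix(y_true, y_pred))      # rows: true class, columns: predicted
print(classification_report(y_true, y_pred)) # precision, recall, F1 per class
```

The confusion matrix here counts 2 true negatives, 1 false positive, 1 false negative, and 4 true positives, giving an accuracy of 6/8 = 0.75.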
Let's Discuss …
Please feel free to share your thoughts.
K-Nearest Neighbors and Naive Bayes Algorithms
A Comparative Study of Two Popular Machine Learning Algorithms
Agenda:
• Introduction to KNN
• Introduction to Naive Bayes
• Key Differences
• Pros and Cons
• Use Cases
K-Nearest Neighbors (KNN)
•Key Points:
•Instance-based, non-parametric learning algorithm.
•Classifies a data point based on the majority vote of its neighbors.
•Distance metrics like Euclidean, Manhattan, etc., are used.
K-Nearest Neighbors (KNN)
•Key Points:
•Choose a value for K (number of neighbors).
•Calculate the distance between the test data and all the training data points.
•Identify the K-nearest neighbors.
•Assign the class based on the majority label among neighbors.
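The four steps above can be sketched in plain Python (the toy points and k = 3 are illustrative choices):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Steps 1-2: Euclidean distance from the query to every training point.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Step 3: keep the k closest neighbours.
    neighbours = sorted(dists)[:k]
    # Step 4: majority vote among their labels.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train_X = [[1, 1], [2, 1], [1, 2], [8, 8], [9, 8], [8, 9]]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, [2, 2]))  # "A"
print(knn_predict(train_X, train_y, [8, 7]))  # "B"
```

Because all training points must be scanned for every query, prediction cost grows with the dataset, which is the slowness noted later in the comparison.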
Naive Bayes
•Key Points:
• Probabilistic classifier based on Bayes' Theorem.
• Assumes independence among features.
• Types: Gaussian, Multinomial, Bernoulli.
•Formula (Bayes' Theorem): P(C | X) = P(X | C) · P(C) / P(X)
Naive Bayes
•Key Points:
•Calculate prior and likelihood probabilities.
•Use Bayes’ Theorem to compute the posterior probability.
•Predict the class with the highest posterior probability.
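The three steps above can be sketched in plain Python for a single categorical feature (the tiny weather/activity dataset is invented for illustration):

```python
from collections import Counter, defaultdict

# Tiny categorical dataset: one feature ("weather") and a class label.
data = [("sunny", "play"), ("sunny", "play"), ("rainy", "stay"),
        ("rainy", "stay"), ("sunny", "stay"), ("rainy", "play")]

# Step 1: class counts for the priors, feature counts for the likelihoods.
priors = Counter(label for _, label in data)
likelihood = defaultdict(Counter)
for feat, label in data:
    likelihood[label][feat] += 1

def posterior(feat):
    """Step 2: unnormalised P(class | feat) ∝ P(feat | class) * P(class)."""
    scores = {}
    for label, count in priors.items():
        p_prior = count / len(data)
        p_like = likelihood[label][feat] / count
        scores[label] = p_like * p_prior
    return scores

# Step 3: predict the class with the highest posterior score.
scores = posterior("sunny")
print(max(scores, key=scores.get))  # "play"
```

With several features, the "naive" independence assumption lets the likelihoods simply be multiplied together, which is what keeps the method fast.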
Key Differences Between KNN and Naive Bayes

| Feature          | KNN                            | Naive Bayes                          |
| Type             | Non-parametric, instance-based | Probabilistic, parametric            |
| Assumptions      | Based on distance metrics      | Assumes feature independence         |
| Performance      | Slow for large datasets        | Fast for large datasets              |
| Interpretability | Easy to interpret              | Requires understanding probabilities |
Pros and Cons of KNN
•Pros:
•Simple to understand and implement.
•Works well with complex decision boundaries.
•Cons:
•Slow during predictions (especially with large datasets).
•Sensitive to noisy data and irrelevant features.
Use Cases of KNN
•Key Points:
•Image recognition
•Recommendation systems
•Anomaly detection
•Example: Use KNN for a movie recommendation system based
on user preferences.
Pros and Cons of Naive Bayes
•Pros:
•Fast and efficient for large datasets.
•Performs well even with a naive assumption of feature independence.
•Cons:
•Assumes independence among features, which may not be true.
•Less effective with highly correlated features.
Use Cases of Naive Bayes
•Key Points:
•Spam filtering
•Sentiment analysis
•Text classification
•Example: Naive Bayes for email spam detection based on word
frequencies.
Classification of Simple Dataset
weight texture color label
150 1 1 Apple
130 1 1 Apple
180 0 0 Orange
170 0 0 Orange
160 1 0 Apple
120 1 1 Apple
200 0 0 Orange
210 0 0 Orange
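As a practice sketch, the fruit table above can be classified with K-Nearest Neighbors via scikit-learn (assuming scikit-learn is installed; k = 3 is an illustrative choice):

```python
from sklearn.neighbors import KNeighborsClassifier

# The dataset from the slide: weight plus two binary encoded features.
X = [[150, 1, 1], [130, 1, 1], [180, 0, 0], [170, 0, 0],
     [160, 1, 0], [120, 1, 1], [200, 0, 0], [210, 0, 0]]
y = ["Apple", "Apple", "Orange", "Orange", "Apple", "Apple", "Orange", "Orange"]

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

# A light, smooth fruit and a heavy, rough one.
print(clf.predict([[140, 1, 1], [190, 0, 0]]))  # Apple, Orange
```

Note that the unscaled weight column dominates the distance here; scaling the features first would let texture and color contribute more evenly.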
Practice Coding
Github Repository:
https://github.com/skabirgithub/edge_ai.git
Let's Discuss …
Please feel free to share your thoughts.