
Chapter 6: Classification and Prediction

This chapter focuses on techniques used to classify data and make predictions based on learned models.
The main topics include:

● Bayesian Classification
● Instance-Based Methods
● Classification Accuracy

Bayesian Classification: Why?

Bayesian classification is a probabilistic approach used in machine learning and statistics. Key points are:

1. Probabilistic learning:
○ It calculates explicit probabilities for a hypothesis.
○ Useful for practical learning problems where probabilistic approaches provide reliable results.
2. Incremental:
○ Bayesian methods allow incremental learning.
○ With each training sample, the probability of a hypothesis being correct can be updated.
3. Probabilistic prediction:
○ Allows multiple hypotheses, weighted based on their probabilities.
4. Standard:
○ Even when computationally intractable, Bayesian methods provide a standard of optimal decision-making against which other methods can be measured.
Key Takeaways:

● Bayesian classification assigns a sample to the class with the highest posterior probability.
● It simplifies to maximizing the product of likelihood P(X∣Ci) and prior P(Ci).
● Computing likelihood directly can be difficult in practice.


Naïve Bayesian Classifier: Comments

Advantages

1. Easy to implement:
○ The algorithm is straightforward and simple to apply.
2. Good results in most cases:
○ Despite its assumptions, the Naïve Bayesian Classifier performs well in many practical
applications.

Disadvantages

1. Class Conditional Independence Assumption:


○ Assumes that the features (attributes) are conditionally independent given the class.
○ In real-world scenarios, this assumption often does not hold, reducing accuracy.
2. Dependencies Between Variables:
○ Practically, dependencies exist among variables.
○ Example:
■ Hospitals: Patient profiles depend on age, family history, symptoms, and diseases.
■ Dependencies such as fever and cough jointly indicating a disease like lung cancer cannot be captured by the Naïve Bayesian Classifier.
3. Inability to Model Dependencies:
○ The classifier fails to account for interdependencies among variables.

Solution: Bayesian Belief Networks

To address dependencies:

● Use Bayesian Belief Networks.

Bayesian Networks

Overview:

● A Bayesian Belief Network allows a subset of variables to be conditionally independent.

Key Points:

1. Graphical Model:
○ Represents causal relationships among variables.
○ Nodes: Random variables.
○ Links: Dependencies (arrows indicate relationships).
2. Features:
○ Specifies joint probability distributions.
○ Handles dependencies between variables.
○ Has no loops or cycles (acyclic graph).
3. Example:
○ X and Y are parents of Z.
○ Y is the parent of P.
○ There is no direct dependency between Z and P.
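For the example network above, the joint distribution factors as P(X, Y, Z, P) = P(X) P(Y) P(Z | X, Y) P(P | Y). A minimal sketch of that factorization using plain Python dictionaries follows; all CPT numbers are made-up illustrations.

```python
# Sketch: joint probability from CPTs for the example network
# X -> Z, Y -> Z, Y -> P (boolean variables; CPT values are hypothetical).
P_X = {True: 0.3, False: 0.7}
P_Y = {True: 0.6, False: 0.4}
P_Z_given_XY = {  # P(Z=True | X, Y)
    (True, True): 0.9, (True, False): 0.5,
    (False, True): 0.4, (False, False): 0.1,
}
P_P_given_Y = {True: 0.8, False: 0.2}  # P(P=True | Y)

def joint(x, y, z, p):
    """P(X=x, Y=y, Z=z, P=p) = P(x) * P(y) * P(z|x,y) * P(p|y)."""
    pz = P_Z_given_XY[(x, y)] if z else 1 - P_Z_given_XY[(x, y)]
    pp = P_P_given_Y[y] if p else 1 - P_P_given_Y[y]
    return P_X[x] * P_Y[y] * pz * pp

print(joint(True, True, True, False))  # one cell of the joint distribution
```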

Learning Bayesian Networks

Several Cases for Learning Bayesian Networks

1. Given Network Structure and All Variables Observable:


○ Only the Conditional Probability Tables (CPTs) need to be learned.
○ This is the simplest case because the structure (relationships) is already provided.
2. Network Structure Known, Some Hidden Variables:
○ Use methods like gradient descent to learn parameters.
○ This approach is similar to training neural networks.
3. Network Structure Unknown, All Variables Observable:
○ Requires searching through the model space to reconstruct the network structure (graph
topology).
○ Algorithms optimize based on observed data to determine the best graph structure.
4. Unknown Structure and All Hidden Variables:
○ This is the most complex case.
○ Currently, no good algorithms exist to effectively solve this problem due to its complexity.

Instance-Based Methods

What are Instance-Based Methods?

● Definition:
Instance-based learning stores training examples and delays processing (lazy evaluation) until a
new instance needs to be classified.
● Key Feature:
The method compares a new instance to existing training examples to make a classification decision.

Key Approach: k-Nearest Neighbor (k-NN)

1. Representation:
○ Instances are treated as points in a Euclidean space (geometric space).
2. How it Works:
○ Given a new instance, calculate the distance between the new instance and all training
examples.
○ Find the k closest neighbors (based on Euclidean distance).
○ Assign the class based on the majority class of the nearest neighbors.
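A minimal k-NN classification sketch, assuming numeric features and Euclidean distance; the 2-D toy points are hypothetical.

```python
# Minimal k-NN classification sketch using Euclidean distance (illustrative).
import math
from collections import Counter

def knn_classify(query, train_points, train_labels, k=3):
    # Distance from the query to every stored training example (lazy evaluation).
    dists = [(math.dist(query, p), label) for p, label in zip(train_points, train_labels)]
    dists.sort(key=lambda t: t[0])
    # Majority vote among the k closest neighbors.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D toy data.
points = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0)]
labels = ["A", "A", "B", "B"]
print(knn_classify((1.1, 1.0), points, labels, k=3))  # -> "A"
```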

Memory-Based Reasoning (MBR)

Concept:

MBR mirrors human reasoning by identifying similar examples from the past and applying what is
known/learned to solve a new problem.

1. Examples in Daily Life:


○ Recognizing traffic patterns/routes.
○ Meeting new people by recalling past experiences.
○ Trying new food based on past preferences.
2. Terminology:
In MBR, the similar examples are referred to as neighbors.

Applications of MBR

1. Fraud Detection:
○ Identify fraudulent activities by comparing with known cases.
2. Customer Response Prediction:
○ Predict customer behavior based on historical patterns.
3. Medical Treatments:
○ Suggest treatments by matching symptoms with similar past cases.
4. Classifying Responses:
○ Process free-text responses and assign codes, often used in natural language processing
tasks.

The k-Nearest Neighbor (k-NN) Algorithm

Key Concept

● k-NN is an instance-based learning algorithm where:


○ All training instances are represented as points in an n-dimensional space.
○ The classification or prediction of a new instance is based on the distance to its nearest
neighbors.

Target Functions in k-NN

1. Discrete Target Function (Classification):


○ k-NN returns the most common class (majority vote) among the k-nearest training examples.
○ Example: Predict if an object is "yes" or "no" based on the majority class.
2. Continuous Target Function (Regression/Numerical Prediction):
○ k-NN returns the mean value of the k-nearest neighbors.
○ Example: Predicting a numerical value like temperature, house price, etc.
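The two cases differ only in how the retrieved neighbors' target values are combined. A short sketch of both combination rules, assuming the neighbor values have already been found:

```python
# Combining the k nearest neighbors' target values (sketch).
from collections import Counter

def combine_discrete(neighbor_labels):
    # Classification: majority vote.
    return Counter(neighbor_labels).most_common(1)[0][0]

def combine_continuous(neighbor_values):
    # Regression / numeric prediction: mean of the neighbors' values.
    return sum(neighbor_values) / len(neighbor_values)

print(combine_discrete(["yes", "no", "yes"]))   # -> "yes"
print(combine_continuous([21.0, 23.5, 22.0]))   # -> 22.166...
```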
Robustness to Noise

● k-NN is robust to noisy data because averaging over the k-nearest neighbors smooths the impact of outliers.

Summary

The k-NN algorithm is:

● Non-parametric (no assumptions about the data distribution).


● Lazy learning (stores the training data and delays computation until prediction).
● Used for both classification and regression tasks.

MBR Challenges

1. Choosing Appropriate Historical Data


○ Ensuring relevant and high-quality data is used for training the system.
2. Efficient Representation of Training Data
○ Lazy learning delays computation until prediction.
○ Efficient indexing methods are critical for quick retrieval of neighbors.
3. Choosing the Number of Neighbors (k)
○ Small k:
■ Too noisy; sensitive to outliers.
○ Large k:
■ Approaches the size of the dataset, reducing specificity and leading to the same result
for all cases.
Discussion on the k-NN Algorithm

1. Choosing the Distance Function


○ Common measures:
■ Euclidean Distance: Most widely used.
■ Manhattan Distance: Sum of absolute differences.
2. Impact of Variable Units
○ Units of variables can distort distance calculations.
○ Solution:
■ Scaling/Normalization: Standardize variables to eliminate unit dependency.
■ Methods:
■ z-scores: Scale values to have mean 0 and standard deviation 1.
■ Min-max scaling: Normalize values between 0 and 1.
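A short sketch of both scaling methods with NumPy; the age/income columns are hypothetical values chosen to show how differing units would otherwise dominate the distance.

```python
# Sketch: removing unit effects before computing distances (hypothetical values).
import numpy as np

X = np.array([[35, 72000.0],      # e.g. [age in years, income in dollars]
              [52, 48000.0],
              [29, 95000.0]])

# z-scores: each column scaled to mean 0 and standard deviation 1.
z_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Min-max scaling: each column mapped to [0, 1].
minmax_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(z_scaled)
print(minmax_scaled)
```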

Choosing the Combination Function

1. Categorical Target Variables (Classification)


○ Majority Rule: Class with the highest frequency among neighbors.
○ Weighted Voting:
■ Neighbors closer to the query point have higher weights.
2. Numerical Target Variables (Regression)
○ Average: Mean of the values of the k-nearest neighbors.
○ Weighted Average:
■ Assign higher weights to closer neighbors (inverse of the distance).

Weighted k-NN

1. Why Weight Neighbors?


○ Not all neighbors contribute equally.
○ Neighbors closer to the query point should have a greater impact.
2. Weight Formula:
w = 1 / d(x_q, x_i)^2
Here:
○ d(x_q, x_i) is the distance between the query point x_q and the training instance x_i.
3. Distance-Weighted Nearest Neighbor
○ Assign weights inversely proportional to distance.
○ Helps improve prediction accuracy for both:
■ Classification (categorical).
■ Regression (numerical).
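A sketch of distance-weighted k-NN implementing w = 1 / d(x_q, x_i)^2 for both cases; the toy points and the small epsilon guarding against a zero distance are illustrative assumptions.

```python
# Sketch of distance-weighted k-NN with w = 1 / d(x_q, x_i)^2 (illustrative).
import math
from collections import defaultdict

def weighted_knn(query, train_points, train_values, k=3, classify=True):
    dists = sorted((math.dist(query, p), v) for p, v in zip(train_points, train_values))
    neighbors = dists[:k]
    if classify:
        # Weighted voting: closer neighbors count more.
        votes = defaultdict(float)
        for d, label in neighbors:
            votes[label] += 1.0 / (d ** 2 + 1e-12)  # epsilon avoids division by zero
        return max(votes, key=votes.get)
    # Weighted average for numeric targets.
    weights = [1.0 / (d ** 2 + 1e-12) for d, _ in neighbors]
    return sum(w * v for w, (_, v) in zip(weights, neighbors)) / sum(weights)

points = [(1.0, 1.0), (1.5, 1.2), (4.0, 4.0)]
print(weighted_knn((1.1, 1.0), points, ["A", "A", "B"], k=3))                      # classification
print(weighted_knn((1.1, 1.0), points, [10.0, 12.0, 40.0], k=3, classify=False))   # regression
```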

Weighing Variables

1. Purpose:
○ Similar to clustering, weighing variables helps address relevance during scaling or
standardization.
2. Need:
○ Some variables require higher weights (e.g., standardized income vs. standardized age).
3. Outcome:
○ Balances the impact of variables during distance-based methods.

Curse of Dimensionality

1. Problem:
○ High-dimensional data can cause distances to be dominated by irrelevant attributes.
2. Solutions:
○ Stretch axes or eliminate irrelevant attributes.
○ Use different scaling factors.
○ Cross-validation to determine the best factor.
3. Key Strategy:
○ Assign small weights to irrelevant variables, eliminating them (zero weight) if necessary.

Lazy Learning

1. Concept:
○ No explicit training phase; all historical data acts as the training set.
○ Minimal training cost.
2. Drawbacks:
○ Classifying a new instance requires real-time computation, which can be time-consuming.
3. Key Operation:
○ Finding the k-nearest neighbors for new observations.

MBR Strengths

1. Advantages:
○ Uses data "as is" with distance functions and combination functions.
○ Adapts easily with new data.
○ Produces results without lengthy training.
2. Real-Time Utility:
○ Efficient for applications requiring constant updates.


Classification Accuracy

Classification accuracy is a measure of how accurately the model predicts the output label.

Holdout Estimation

● Concept:
○ When data is limited, the holdout method splits the dataset into:
■ Training Set: Builds the model.
■ Testing Set: Evaluates model performance.
○ Typical split: 1/3 for testing and the rest for training.
● Problem:
○ Samples may not be representative.
○ Example: A class might be missing in the test set.
● Advanced Method:
○ Stratification: Ensures balanced representation of each class in both training and test sets.
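A minimal holdout sketch, assuming scikit-learn is available; the feature and label arrays are hypothetical toy data.

```python
# Sketch: holdout split with stratification (scikit-learn assumed available).
from sklearn.model_selection import train_test_split

X = [[0.2], [1.1], [0.9], [3.2], [2.8], [0.4], [2.9], [1.0], [3.0], [0.3]]  # toy features
y = [0, 0, 0, 1, 1, 0, 1, 0, 1, 0]                                          # toy labels

# 1/3 held out for testing; stratify=y keeps the class proportions roughly
# equal in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, stratify=y, random_state=42)

print(len(X_train), len(X_test))
```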

Repeated Holdout Method

● Definition: The holdout estimate becomes more reliable by repeating the process multiple times
using different subsamples of data.
● Steps:
○ In each iteration, a portion of data is randomly selected for training.
■ Stratification: Ensures equal representation of each class in samples.
○ Error rates from different iterations are averaged to calculate an overall error rate.
● Issues:
○ Test Set Overlap: Repeated random selection may lead to overlapping test sets.
○ Question: Can overlapping be avoided?

Cross-Validation

1. Definition:
○ Cross-validation avoids overlapping test sets by systematically splitting the data into subsets.
○ It is an improvement over the simple holdout method.
2. Steps:
○ Step 1: Split data into k subsets of equal size.
○ Step 2: Use each subset in turn for testing, while the remaining subsets are used for training.
○ This is called k-fold cross-validation.
3. Key Points:
○ Stratification ensures equal class proportions in each fold.
○ The error estimates from each iteration are averaged to produce the overall error estimate.
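A sketch of stratified k-fold cross-validation with averaged error estimates, assuming scikit-learn; the data and the choice of a 3-NN classifier are hypothetical.

```python
# Sketch: stratified k-fold cross-validation (scikit-learn assumed available).
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # hypothetical features
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # hypothetical labels

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
errors = []
for train_idx, test_idx in skf.split(X, y):
    clf = KNeighborsClassifier(n_neighbors=3).fit(X[train_idx], y[train_idx])
    errors.append(1.0 - clf.score(X[test_idx], y[test_idx]))

# Overall error estimate = average of the per-fold error rates.
print(f"estimated error rate: {np.mean(errors):.3f}")
```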

More on Cross-Validation

1. Stratified Ten-Fold Cross-Validation:


○ Why 10 folds?:
■ Extensive experiments show that 10 folds provide the most accurate estimate.
■ Supported by theoretical evidence.
○ Stratification:
■ Reduces variance by ensuring balanced class representation in each fold.
2. Improvement:
○ Repeated Stratified Cross-Validation:
■ Ten-fold cross-validation is repeated multiple times.
■ Results are averaged to further reduce variance.

Cross-Validation Example

1. Scenario:
○ Data size: 1000 records.
○ k = 10 folds:
■ Randomize data to eliminate biases.
■ Split data into 10 equal subsets (folds) of 100 records each.
2. Process:
○ Fold 1: Used as test set; remaining 9 folds are for training.
○ Fold 2: Used as test set; remaining 9 folds for training.
○ Repeat this process until all folds have been used for testing.
3. Result:
○ Each record gets tested once.
○ Final error estimate is the average of all fold results.
Leave-One-Out Cross-Validation (LOOCV)

1. Definition:
○ A specific form of cross-validation where the number of folds equals the number of training
instances.
○ For n training instances, the classifier is trained n times.
2. Key Points:
○ Makes the best use of data.
○ No random subsampling is involved.
○ It is computationally expensive because the model is trained multiple times.
■ Exception: Nearest Neighbor (NN) methods.
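A short LOOCV sketch, again assuming scikit-learn; the 30-instance dataset is hypothetical and deliberately small, since n models must be trained.

```python
# Sketch: leave-one-out cross-validation (n folds for n instances).
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))                # hypothetical data; LOOCV trains 30 models
y = (X[:, 0] > 0).astype(int)

correct = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = KNeighborsClassifier(n_neighbors=3).fit(X[train_idx], y[train_idx])
    correct += int(clf.predict(X[test_idx])[0] == y[test_idx][0])

print(f"LOOCV accuracy: {correct / len(y):.3f}")
```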

The Bootstrap

1. Definition:
○ A resampling technique where instances are selected with replacement.
○ Bootstrap creates multiple datasets from the original data for training/testing.
2. Process:
○ Sample a dataset of n instances repeatedly to create a new training set.
○ Instances not selected form the test set.
3. Difference from Cross-Validation:
○ Cross-validation uses without replacement sampling, while bootstrap uses with replacement.

The 0.632 Bootstrap

1. Concept:
○ A particular instance has a probability of (1 - 1/n) of not being picked in a single draw.
○ Over n draws with replacement, the probability that it is never picked is (1 - 1/n)^n, which approaches e^-1 ≈ 0.368 for large n.
2. Implications:
○ Around 63.2% of the instances are included in the training data.
○ The remaining 36.8% of the instances appear in the test data.

Example of Bootstrap
1. Dataset Size:
○ The original dataset has 1000 observations.
2. Process:
○ Create a training set by sampling with replacement 1000 times.
○ The size of the training set remains 1000, but:
■ Some observations appear multiple times.
■ Some observations do not appear at all.
3. Test Set:
○ Observations not appearing in the training set form the test set.
○ The size of the test set is variable.
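A quick sketch of one bootstrap sample over 1000 observations, checking the roughly 63.2% / 36.8% split empirically with NumPy.

```python
# Sketch: one bootstrap sample from 1000 observations and its out-of-bag test set.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
indices = np.arange(n)

# Training set: n draws with replacement (some indices repeat, some never appear).
train_idx = rng.choice(indices, size=n, replace=True)
# Test set: the out-of-bag observations that were never drawn.
test_idx = np.setdiff1d(indices, train_idx)

print(f"unique in training: {len(np.unique(train_idx)) / n:.3f}")  # ~0.632
print(f"out-of-bag (test):  {len(test_idx) / n:.3f}")              # ~0.368
```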

Bagging and Boosting

Bagging

1. Definition:
○ Bagging stands for Bootstrap Aggregating.
○ It improves accuracy by reducing variance in model predictions.
2. Steps:
○ Generate t bootstrap samples (with replacement) from the original dataset.
○ Train a new classifier C_t for each bootstrap sample S_t.
○ For classification problems:
■ Combine predictions using the majority vote rule.
○ For regression problems:
■ Combine results by averaging predictions.
3. Outcome:
○ The final classifier C* is an aggregated version of all the individual classifiers.
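A hand-rolled bagging sketch for the classification case, assuming scikit-learn decision trees as the base learners; the data and t = 25 samples are hypothetical.

```python
# Sketch: bagging by hand -- t bootstrap samples, one classifier each, majority vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                      # hypothetical data
y = (X[:, 0] ** 2 + X[:, 1] > 0.5).astype(int)

t = 25
classifiers = []
for _ in range(t):
    idx = rng.choice(len(X), size=len(X), replace=True)   # bootstrap sample S_t
    classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

def bagged_predict(x_new):
    votes = np.array([clf.predict(x_new) for clf in classifiers])  # shape (t, n_queries)
    # Majority vote across the t classifiers (mean > 0.5 for binary labels).
    return (votes.mean(axis=0) > 0.5).astype(int)

print(bagged_predict(np.array([[0.0, 1.0], [0.0, -2.0]])))
```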

Boosting

1. Definition:
○ Boosting builds multiple classifiers sequentially, where each classifier pays more attention to
misclassified examples.
2. Steps:
○ Learn a series of classifiers.
○ Misclassified examples in each iteration receive higher weight for the next classifier.
3. Characteristics:
○ Boosting works well with decision trees or Bayesian classifiers.
○ Requires linear time and constant space.

Key Points

1. Boosting improves classification accuracy by:


○ Focusing on misclassified examples in subsequent iterations.
○ Combining multiple weak classifiers into a strong ensemble model.
2. The final hypothesis is a weighted combination of all classifiers, with more accurate classifiers
receiving higher weights.
3. Boosting is iterative and adapts the model during each step.
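A boosting sketch using AdaBoost from scikit-learn (assumed available); by default its weak learner is a depth-1 decision tree, and the toy data here is hypothetical.

```python
# Sketch: boosting with AdaBoost over decision stumps (scikit-learn assumed).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))                  # hypothetical data
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Each round re-weights the training data so misclassified examples get more
# attention; the final prediction is a weighted vote of the weak classifiers
# (the default weak learner is a depth-1 decision tree, i.e. a stump).
boosted = AdaBoostClassifier(n_estimators=50).fit(X, y)

print(f"training accuracy: {boosted.score(X, y):.3f}")
```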

Evaluating Numeric Prediction

1. Mean Squared Error (MSE):

● Penalizes larger errors more heavily due to the squaring of differences.
● Easy to manipulate mathematically.
● Widely used for regression problems.

2. Root Mean Squared Error (RMSE):

● Gives an interpretable error in the same scale as the original values.
● Sensitive to outliers (like MSE).

3. Mean Absolute Error (MAE):

● Less sensitive to outliers compared to MSE.
● Represents the average "absolute" error.

4. Relative Error:

● Definition: Expresses errors as a percentage of the target value, useful for understanding errors in
relative terms.
● Example:
○ If an error of 50 occurs while predicting 500, the relative error is 10%.

Key Differences:

1. MSE and RMSE: Both emphasize large errors due to squaring, but RMSE makes the error more
interpretable.
2. MAE: Less sensitive to large outliers as it does not square the errors.
3. Relative Error: Provides a percentage-based understanding of error magnitude, useful for scaled
evaluation.
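A short sketch computing all four measures with NumPy; the actual and predicted values are hypothetical.

```python
# Sketch: computing MSE, RMSE, MAE, and relative error (hypothetical values).
import numpy as np

actual = np.array([500.0, 200.0, 350.0, 80.0])
predicted = np.array([550.0, 190.0, 300.0, 100.0])
errors = predicted - actual

mse = np.mean(errors ** 2)                                    # squares emphasize large errors
rmse = np.sqrt(mse)                                           # same scale as the original values
mae = np.mean(np.abs(errors))                                 # average absolute error
relative = np.mean(np.abs(errors) / np.abs(actual)) * 100     # percentage of the target value

print(f"MSE={mse:.1f}  RMSE={rmse:.1f}  MAE={mae:.1f}  relative error={relative:.1f}%")
```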

Lift Charts: Explanation and Practical Use

Definition

A lift chart is a visual tool used to compare the performance of predictive models or decisions. It shows the
"lift," or improvement, of targeting a subset of the population (using a model or strategy) versus random
selection.

Why Use Lift Charts?

1. Costs are unknown: In practice, you may not always have exact cost figures for decision-making.
2. Comparing Scenarios: Instead of relying solely on cost-based analysis, decisions are compared
based on how effective they are relative to the baseline.

How a Lift Chart Works

1. X-Axis: Proportion of the population targeted (e.g., 10%, 20%, etc.).


2. Y-Axis: Proportion of positive responses achieved (e.g., purchases, clicks, or any positive outcome).
3. Baseline (Random): A straight line representing expected results if no model is used (random
selection).
4. Lift Curve: The curve showing the actual performance of the model.
○ A steeper lift curve means the model is better at identifying promising subsets of the
population.

Practical Interpretation

● The higher the curve above the baseline, the better the predictive model or decision strategy.
● Decision-makers can visually identify the trade-offs:
○ How much of the population to target.
○ What percentage of responses to expect.
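A sketch of the numbers behind a lift chart: sort the population by model score, then compare the cumulative fraction of responders captured against the fraction targeted. The scores and outcomes below are simulated, hypothetical data.

```python
# Sketch: cumulative gains / lift from model scores (hypothetical scores and outcomes).
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random(1000)                          # model's predicted probability of response
responded = rng.random(1000) < scores * 0.4        # toy outcomes correlated with the scores

order = np.argsort(-scores)                        # target highest-scoring prospects first
cum_responses = np.cumsum(responded[order]) / responded.sum()

for frac in (0.1, 0.2, 0.3, 0.5):
    n_targeted = int(frac * len(scores))
    captured = cum_responses[n_targeted - 1]
    # Lift = fraction of responders captured / fraction of population targeted;
    # the random-selection baseline has lift 1.0 at every point.
    print(f"target {frac:.0%} of population -> capture {captured:.0%} of responders "
          f"(lift {captured / frac:.2f})")
```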
