6720 Labs Chapter 7


Uploaded by

sweetie05

Data Mining Review Questions / XLMiner Labs

Chapter 7: k-Nearest Neighbors (k-NN)

1. Personal Loan Acceptance. Universal Bank is a relatively young bank growing
rapidly in terms of overall customer acquisition. Universal Bank wants to convert
its liability customers (depositors) into personal loan customers (while retaining
them as depositors). A campaign that the bank ran last year for liability
customers showed a healthy conversion rate of over 9%. This has
encouraged the retail marketing department to devise smarter campaigns with
better target marketing. The goal of our analysis is to model the previous
campaign's customer behavior to analyze what combination of factors makes a
customer more likely to take out a personal loan.
The file UniversalBank.xls contains data on 5,000 customers. The data include
demographic information (age, income, etc.), the customer's relationship with
the bank (mortgage, securities account, etc.), and the customer's response to
the last personal loan campaign (variable = Personal Loan). Among the 5,000
customers, only 480 (9.6%) accepted the personal loan offer in the last
campaign (textbook reference: 7.1).
Partition the data into training (60%) and validation (40%) sets.
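
The lab itself uses XLMiner for partitioning. For readers working outside XLMiner, the following is a minimal sketch of a random 60% : 40% split in plain Python. The `partition` helper, the seed, and the stand-in list of customer IDs are assumptions made for illustration; they are not part of the lab or of the UniversalBank.xls file.

```python
import random

def partition(records, fractions=(0.6, 0.4), seed=1):
    """Shuffle records and cut them into consecutive parts sized by `fractions`."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    parts, start = [], 0
    for i, frac in enumerate(fractions):
        # The last part takes whatever remains, so rounding never drops a record.
        if i == len(fractions) - 1:
            end = len(shuffled)
        else:
            end = start + round(frac * len(shuffled))
        parts.append(shuffled[start:end])
        start = end
    return parts

# Stand-in for the 5,000 customer rows (IDs only, not the real data).
customer_ids = list(range(1, 5001))
train, valid = partition(customer_ids)
print(len(train), len(valid))
```

Passing `fractions=(0.5, 0.3, 0.2)` to the same helper gives a three-way training, validation, and test split.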
a. Perform a k-NN classification with all input variables except ID and ZIP
CODE using k = 1. (Remember to transform categorical variables with more
than two categories into dummy variables.) Specify the success class as
1 (loan accepted), and use the default cutoff value of 0.5. How would
the following new customer be classified using your model: Age=40,
Experience=10, Income=84, Family=2, CCAvg=2, Education_1=0,
Education_2=1, Education_3=0, Mortgage=0, Securities Account=0, CD
Account=0, Online=1, and Credit Card=1?
b. What is the choice of k that balances between overfitting and ignoring the
predictor information? (Hint: Run k-NN for k values 1 to 10).
c. Using the Confusion Matrix for the validation data in Part b, how many
customers were classified correctly? How many customers were classified
incorrectly?
d. Classify the new customer using the best k.

e. Repartition the data; this time into training, validation, and test sets
(50% : 30% : 20%). Apply the k-NN method with the k chosen above.
Compare the Confusion Matrix of the test set with that of the training and
validation sets. Comment on the differences and their reason. What is
your assessment of the performance of this model?
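
Parts a through e can be sketched outside XLMiner as follows. This is an illustrative, standard-library-only sketch and not the lab's method: the data are synthetic (two made-up predictors standing in for Income and CCAvg, with an invented acceptance rule), the helper names are hypothetical, predictors are z-score normalized using training statistics, and a share of class-1 neighbours exactly equal to the 0.5 cutoff is assigned to class 0 by assumption.

```python
import math
import random

def zscore_params(rows):
    """Column means and standard deviations of the given rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def knn_predict(train_X, train_y, x, k):
    """Class 1 if more than half of the k nearest training rows are class 1."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    share_of_ones = sum(train_y[i] for i in nearest) / k
    return 1 if share_of_ones > 0.5 else 0  # assumption: a tie goes to class 0

def confusion(train_X, train_y, X, y, k):
    """Counts keyed by (actual class, predicted class)."""
    matrix = {(a, p): 0 for a in (1, 0) for p in (1, 0)}
    for x, actual in zip(X, y):
        matrix[(actual, knn_predict(train_X, train_y, x, k))] += 1
    return matrix

def error_rate(matrix):
    return (matrix[(1, 0)] + matrix[(0, 1)]) / sum(matrix.values())

# Synthetic customers: NOT the UniversalBank data, only a runnable stand-in.
rng = random.Random(7)
rows, labels = [], []
for _ in range(500):
    income, cc_avg = rng.uniform(10, 200), rng.uniform(0, 10)
    rows.append([income, cc_avg])
    labels.append(1 if income + 10 * cc_avg + rng.gauss(0, 30) > 180 else 0)

# 50% : 30% : 20% split; the rows were generated in random order.
train_rows, valid_rows, test_rows = rows[:250], rows[250:400], rows[400:]
train_y, valid_y, test_y = labels[:250], labels[250:400], labels[400:]

# Normalize every partition with statistics from the training rows only.
means, stds = zscore_params(train_rows)
train_X = scale(train_rows, means, stds)
valid_X = scale(valid_rows, means, stds)
test_X = scale(test_rows, means, stds)

# Part b: try k = 1..10 and keep the k with the lowest validation error
# (the smaller k wins a tie).
valid_errors = {k: error_rate(confusion(train_X, train_y, valid_X, valid_y, k))
                for k in range(1, 11)}
best_k = min(valid_errors, key=lambda k: (valid_errors[k], k))
print("best k:", best_k)

# Parts c and e: confusion matrices for each partition with the chosen k.
for name, X, y in [("training", train_X, train_y),
                   ("validation", valid_X, valid_y),
                   ("test", test_X, test_y)]:
    print(name, confusion(train_X, train_y, X, y, best_k))

# Part d: classify a new customer (Income=84, CCAvg=2 in this two-column toy).
new_x = scale([[84, 2]], means, stds)[0]
print("new customer class:", knn_predict(train_X, train_y, new_x, best_k))
```

With k = 1, every training record is its own nearest neighbour, so the training confusion matrix shows no errors as long as no two records share identical predictor values. That is why the training partition gives an optimistic picture, and why the test partition, which played no role in choosing k, is the fairer basis for judging the model.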
