
1) Identify The Dependent Variable in The Above Data: Ans

The document discusses the results of various classification models - Logistic Regression, Classification Tree, Random Forests, and Neural Networks - run on a dataset to predict credit risk, including accuracy scores on both training and validation data. It asks which classifier provides the best model and significant variables, and how many records would need to be reviewed to find all potential defaulters. Finally, it provides customer details and asks if the bank should approve him for a loan.

Uploaded by

Prateep Kandru
Copyright
© All Rights Reserved

1) Identify the dependent variable in the above data:

Ans: RESPONSE

2) What is the baseline for the dataset?


Ans: 490 / (490 + 210) = 70.00%
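
The baseline arithmetic above can be checked with a short sketch. It assumes that 490 and 210 are the record counts of the two classes of RESPONSE, and that the baseline means the accuracy of always predicting the larger class; the variable names are illustrative only.

```python
# Class counts taken from the answer above (assumption: 490 records in
# one class of RESPONSE and 210 in the other).
majority_count = 490
minority_count = 210

# Baseline accuracy: always predict the larger class.
baseline = max(majority_count, minority_count) / (majority_count + minority_count)

print(f"Baseline accuracy: {baseline:.2%}")
```

Any classifier in the later questions is only useful if its accuracy on unseen data exceeds this figure.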

3) Seed and split used by our group, according to the convention communicated earlier
Ans: Seed = (18123 + 18073 + 18194 + 18087) / 4 = 18119.25, taken as 18119
Split = 70:30
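
As an illustration of the seed and split, the sketch below averages the four numbers, truncates the result to an integer seed, and uses it for a reproducible 70:30 split. The dataset size of 1000 rows and the use of Python's `random` module are assumptions made for the example; the original analysis may have used a different tool.

```python
import random

# The four numbers from the answer above.
numbers = [18123, 18073, 18194, 18087]
mean_value = sum(numbers) / len(numbers)   # 18119.25
seed = int(mean_value)                     # truncated to 18119

# Assumption: the dataset has 1000 rows; replace with the real row count.
n_rows = 1000

# Shuffle the row indices reproducibly, then cut at 70%.
rng = random.Random(seed)
indices = list(range(n_rows))
rng.shuffle(indices)

cut = n_rows * 7 // 10                     # integer arithmetic avoids float rounding
train_idx, test_idx = indices[:cut], indices[cut:]

print(seed, len(train_idx), len(test_idx))
```

Fixing the seed means every rerun produces the same training and validation rows, so accuracy figures from different classifiers remain comparable.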

4) Running various classifiers (Logistic Regression, Classification Tree with Cross validation,
Random Forests and Neural Nets) and noting the Classification Accuracy on both training and test
datasets and the significant variables in the model
Ans: Classification Accuracy

Classifier                                   Training Dataset   Validation Dataset
Logistic Regression                          78.7%              76%
Classification Tree with Cross validation
Random Forests                               75.4%              79.96%
Neural Nets

Significant Variables:

5) AUC obtained from the classifiers

Ans:

Classifier                                   AUC
Logistic Regression                          80.85%
Classification Tree with Cross validation
Random Forests
Neural Nets
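
For reference, AUC can be computed directly from predicted scores as the probability that a randomly chosen positive record is scored higher than a randomly chosen negative one. The sketch below is a minimal pure-Python version of that definition; the function name `auc` and the labels and scores are hypothetical and are not taken from the models above.

```python
def auc(labels, scores):
    """Share of (positive, negative) pairs where the positive record has
    the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one record of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive class) and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]

print(f"AUC: {auc(labels, scores):.2%}")
```

The pairwise loop is quadratic in the number of records, which is fine for a dataset of this size; a library routine would normally be used in practice.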

6) Which classifier gives the best model? Note down the significant variables from this model. Your
model must fulfil the assumptions required for developing that model.
Ans:

7) If you wish to find all potential defaulters, what is the minimum number of records you need to
sift through, based on your model?
Ans:
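
One way to approach question 7 is to rank the records by the model's predicted probability of default, highest first, and find the position of the lowest-ranked actual defaulter; reviewing records down to that position captures every defaulter. The sketch below assumes this interpretation. The function name `records_to_review` and the data are hypothetical, not results from the models above.

```python
def records_to_review(is_defaulter, default_prob):
    """Number of top-ranked records that must be reviewed to reach every
    actual defaulter when records are sorted by predicted default
    probability, highest first. Within tied probabilities, non-defaulters
    are placed first, so the count is the worst case."""
    ranked = sorted(zip(default_prob, is_defaulter),
                    key=lambda pair: (-pair[0], pair[1]))
    last_position = 0
    for position, (_, flag) in enumerate(ranked, start=1):
        if flag:
            last_position = position
    return last_position


# Hypothetical validation records: 1 marks an actual defaulter.
is_defaulter = [0, 1, 0, 1, 0, 0, 1, 0]
default_prob = [0.10, 0.85, 0.40, 0.65, 0.20, 0.55, 0.35, 0.05]

print(records_to_review(is_defaulter, default_prob))
```

A model that ranks perfectly would need only as many records as there are defaulters; the further the result exceeds that count, the weaker the ranking.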

8) A customer approaches the bank for credit. His details are as follows:
Checking Account > 200 DM;
History: Delay in Paying Off;
Savings Account: Greater than 1000 DM;
Purpose of Credit: New Car;
Amount: 1000;
Employment: 4-7 Years;
Instalment Rate: 3;
Marital Status: Male Married;
Co-Applicant: Applicant has a guarantor;
Present Residence: 2: 2-3 years;
Real Estate: Applicant owns no property;
Age: 35;
Other Instalments: No;
Residence: No;
Number of Credits: 2;
Job: Skilled Employee;
Number of Dependents: 2;
Telephone: Owns a phone;
Foreign: No.
Should the bank give him a loan or not?
