Hanoi University of Science and Technology

CASE STUDY REPORT

German Credit
Group 4
Major: Logistics and Supply Chain Management
Course: Applied Statistics

Members:
Trần Minh Thư 20233212
Nguyễn Trần Vân Anh 20233119
Nguyễn Quý Trang 20233221
Hoàng Mạnh Dũng 20233143
Nguyễn Trọng Thành Khôi 20233173

Applied Statistics Project: German Credit
Group project

2024-1-10

Tran Minh Thu 20233212
Nguyen Tran Van Anh 20233119
Nguyen Quy Trang 20233221
Hoang Manh Dung 20233143
Nguyen Trong Thanh Khoi 20233173

1. Introduction
Understanding the factors influencing the credit amount offered to borrowers is crucial for lending institutions to make
sound financial decisions. The approved credit amount is shaped by various factors, including a borrower’s financial
stability, employment status, and demographic profile. A comprehensive analysis of these variables can help lenders
refine risk assessment models and tailor credit products more effectively.
This study delves into the primary factors that impact the credit amount provided to borrowers. By examining these
variables, the research aims to deliver insights into how lending decisions are formed and how they can be optimized.

Objective

The main objective of this project is to analyze and predict the credit amount allocated to borrowers, which serves as
the dependent variable in this study. The research focuses on identifying and quantifying the relationships between
the credit amount and several independent variables within the dataset.

2. Data description
Independent Variables The independent variables that may influence the credit amount include:

1. Duration of Credit (month) (quantitative data): the length of the credit repayment period.
2. Purpose (qualitative data): the reason for obtaining the credit, with eleven levels (0-10).
3. Instalment per cent (qualitative data): coded in four levels (1-4).
4. Guarantors (qualitative data): the presence or absence of guarantors for the credit (1 - No; 2 - Assistant;
3 - Guarantor).
5. Length of current employment (qualitative data): an ordered variable with 5 levels (1-5).
6. Sex & Marital Status (qualitative data): the gender and marital status of the borrower.
7. Age (years) (quantitative data): the borrower's age, which may indicate earning potential or financial stability.
8. Occupation (qualitative data): the type of work the borrower engages in, which reflects income level and
stability.
9. Number of dependents (quantitative data): the number of people financially dependent on the borrower.
10. Creditability (qualitative data): whether the borrower is considered creditworthy (0 - Non_Creditability; 1 - Creditability).

Variable selection Before describing the data, we remove Purpose, Guarantors, Length of current employment, Sex
& Marital Status and Number of dependents, because we consider them too general and unlikely to have enough impact
on Credit Amount to be worth analysing. The following variables are kept for the analysis:

1. Creditability (C)
2. Duration of Credit (DOC)
3. Instalment per cent (IPC)
4. Age (years) (A)
5. Occupation (O)
6. Credit Amount (CA)

After removing the excluded variables, the new dataset looks as follows.

## # A tibble: 6 x 6
##       C   DOC   IPC     A     O    CA
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     1    18     4    21     3  1049
## 2     1     9     2    36     3  2799
## 3     1    12     2    23     2   841
## 4     1    12     3    39     2  2122
## 5     1    12     4    38     2  2171
## 6     1    10     1    48     2  2241
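The selection and renaming step can be sketched in base R. This is only an illustration, not the code used for the report: the six rows are the ones printed above, the long column names for Creditability and Occupation are assumed to follow the pattern of the names used in the conclusion, and the `Purpose` column holds `NA` placeholders because its values are not shown in the report.

```r
# Toy stand-in for the raw data: the six rows printed in the report.
# The long column names are assumptions modelled on the conclusion.
raw <- data.frame(
  Creditability              = c(1, 1, 1, 1, 1, 1),
  Duration.of.Credit..month. = c(18, 9, 12, 12, 12, 10),
  Purpose                    = rep(NA_integer_, 6),  # placeholder, values not shown
  Instalment.per.cent        = c(4, 2, 2, 3, 4, 1),
  Age..years.                = c(21, 36, 23, 39, 38, 48),
  Occupation                 = c(3, 3, 2, 2, 2, 2),
  Credit.Amount              = c(1049, 2799, 841, 2122, 2171, 2241)
)

# Keep only the six variables used in the analysis, in the report's order.
keep <- c("Creditability", "Duration.of.Credit..month.", "Instalment.per.cent",
          "Age..years.", "Occupation", "Credit.Amount")
German_credit <- raw[, keep]

# Replace the long names with the abbreviations used in the report.
names(German_credit) <- c("C", "DOC", "IPC", "A", "O", "CA")

print(head(German_credit))
```

On the full data set the same two steps (subset the columns, then assign `names()`) would yield the six-column table shown above.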

Data description

##        C            DOC            IPC              A               O
##  Min.   :0.0   Min.   : 4.0   Min.   :1.000   Min.   :19.00   Min.   :1.000
##  1st Qu.:0.0   1st Qu.:12.0   1st Qu.:2.000   1st Qu.:27.00   1st Qu.:3.000
##  Median :1.0   Median :18.0   Median :3.000   Median :33.00   Median :3.000
##  Mean   :0.7   Mean   :20.9   Mean   :2.973   Mean   :35.54   Mean   :2.904
##  3rd Qu.:1.0   3rd Qu.:24.0   3rd Qu.:4.000   3rd Qu.:42.00   3rd Qu.:3.000
##  Max.   :1.0   Max.   :72.0   Max.   :4.000   Max.   :75.00   Max.   :4.000
##        CA
##  Min.   :  250
##  1st Qu.: 1366
##  Median : 2320
##  Mean   : 3271
##  3rd Qu.: 3972
##  Max.   :18424

Names of variables of the data set “German_credit”

## [1] "C" "DOC" "IPC" "A" "O" "CA"

Data description

For each variable, we use the functions summary(), hist() and boxplot() to analyse and visualize the data.

1. Creditability This is binary data, so measures of central tendency and dispersion are not meaningful. For that
reason we do not use summary() to analyse this variable; the functions table() and hist() are used
instead.

## C
## 0 1
## 300 700
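As a small check on the table above, the counts can be turned into shares. The sketch below uses only the two counts printed by table(); the variable names are illustrative.

```r
# Counts taken from the table() output above.
counts <- c("0" = 300, "1" = 700)

# Share of each class.
shares <- counts / sum(counts)
print(shares)

# For a 0/1 variable the mean is the share of ones,
# which is why summary() reports Mean = 0.7 for C.
mean_c <- sum(as.numeric(names(counts)) * counts) / sum(counts)
print(mean_c)
```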

[Figure: Histogram of C. x-axis: 0.0 to 1.0; y-axis: Frequency]
2. Duration_of_Credit (month) This is a numeric variable measured in months, so we summarize it with the function
summary().

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     4.0    12.0    18.0    20.9    24.0    72.0

We use hist() to visualize the data.

[Figure: Histogram of DOC. x-axis: DOC (0 to 60 and above); y-axis: Frequency]

3. Instalment per cent This variable takes only four coded values (1-4), so we tabulate it with the function table() instead of summary().

## IPC
## 1 2 3 4
## 136 231 157 476
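Because IPC has only four values, the frequency table is enough to rebuild the whole column and recover its summary statistics. The sketch below does this from the four counts printed above; the object names are illustrative.

```r
# Rebuild the IPC column from the frequency table printed above.
ipc <- rep(1:4, times = c(136, 231, 157, 476))

# The reconstructed vector reproduces the summary statistics of IPC
# reported in the data description (mean 2.973, median 3).
ipc_summary <- summary(ipc)
print(ipc_summary)

# Share of borrowers at each level.
ipc_share <- table(ipc) / length(ipc)
print(ipc_share)
```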

We visualize the data with the function hist().

[Figure: Histogram of IPC. x-axis: IPC (1.0 to 4.0); y-axis: Frequency]

4. Age (years)

[Figure: Histogram of A. x-axis: A (20 to 70); y-axis: Frequency]

Most borrowers are young, and none is younger than 19, presumably because only applicants who meet the legal
minimum age can take out a loan.

5. Occupation

[Figure: Histogram of O. x-axis: O (1.0 to 4.0); y-axis: Frequency]

This is qualitative data with 4 levels; the majority of observations are concentrated at level 3.

6. Credit_Amount This variable is continuous, so we summarize it with the function summary().

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     250    1366    2320    3271    3972   18424
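The summary already hints at a long right tail: the mean (3271) is well above the median (2320). A rough way to quantify this is the common 1.5 × IQR rule, sketched below from the printed quartiles. This is an approximation, since boxplot() works with hinges computed from the raw data rather than with these rounded quartiles.

```r
# Quartiles of Credit Amount, copied from the summary() output above.
q1 <- 1366
q3 <- 3972
ca_max <- 18424

# Interquartile range and the upper limit of the usual 1.5 * IQR rule.
iqr <- q3 - q1
upper_fence <- q3 + 1.5 * iqr

print(c(IQR = iqr, upper_fence = upper_fence))

# TRUE means that at least the maximum would be flagged as an outlier.
print(ca_max > upper_fence)
```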

We use the function hist() to visualize the data.

[Figure: Histogram of CA. x-axis: 0 to 20000; y-axis: Frequency]

3. Simple regression
In this part, we consider the relationships between the variables pairwise and fit simple
regressions where it makes sense.
We use pairs(), cor() and summary(lm()) to get a general view of the relationships between the variables.

[Figure: Scatterplot matrix (pairs plot) of C, DOC, IPC, A, O and CA]

##               C         DOC         IPC           A           O          CA
## C    1.00000000 -0.21492667 -0.07240394  0.09127195 -0.03273500 -0.15474015
## DOC -0.21492667  1.00000000  0.07474882 -0.03754986  0.21090973  0.62498846
## IPC -0.07240394  0.07474882  1.00000000  0.05727075  0.09775539 -0.27132228
## A    0.09127195 -0.03754986  0.05727075  1.00000000  0.01538303  0.03227268
## O   -0.03273500  0.21090973  0.09775539  0.01538303  1.00000000  0.28539307
## CA  -0.15474015  0.62498846 -0.27132228  0.03227268  0.28539307  1.00000000

The charts and correlations point to several candidate simple linear relationships with Credit Amount: Duration of
Credit (0.625, the only moderately strong one), Occupation (0.285), Instalment per cent (-0.271) and
Creditability (-0.155).
Factors such as experience and the number of years working in the current job are excluded due to their lack of
analytical value. Additionally, the correlation between Age and Credit Amount is minimal (0.0323), so Age is left
out of the simple regressions.
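In a simple linear regression, R-squared equals the squared correlation between the predictor and the response, so the correlations above already tell us how much variance each single-predictor model can explain. A short sketch, using the CA row of the matrix (object names are illustrative):

```r
# Correlations with Credit Amount, copied from the CA row of the matrix above.
r_with_ca <- c(C = -0.15474015, DOC = 0.62498846, IPC = -0.27132228,
               A = 0.03227268, O = 0.28539307)

# In a simple linear regression, R-squared is the squared correlation.
r_squared <- r_with_ca^2

# Sort from most to least explanatory.
print(round(sort(r_squared, decreasing = TRUE), 4))
```

These squared values agree with the Multiple R-squared figures reported by summary(lm()) in the subsections that follow.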

Duration of Credit and Credit Amount

$$\widehat{\text{Credit Amount}} = \beta_0 + \beta_1 \times \text{Duration of Credit}$$

[Figure: Scatterplot of CA against DOC]

## [1] 0.6249885
##
## Call:
## lm(formula = CA ~ DOC)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5151.7 -1260.0  -432.9   653.2 13805.0
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  213.169    139.569   1.527    0.127
## DOC          146.299      5.784  25.292   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2205 on 998 degrees of freedom
## Multiple R-squared:  0.3906, Adjusted R-squared:  0.39
## F-statistic: 639.7 on 1 and 998 DF,  p-value: < 2.2e-16

The model given is:

$$\widehat{\text{Credit Amount}} = 213.169 + 146.299 \times \text{Duration of Credit}$$

The multiple R-squared is 0.3906, i.e. 39.06% of the variation in Credit Amount is explained by
Duration of Credit. Each additional month of duration is associated with about 146 more units of credit.
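To make the fitted line concrete, the sketch below plugs a few durations into it and re-derives the t and F statistics from the printed coefficients. The durations 12, 24 and 36 months are chosen for illustration only, and small differences from the printed statistics are expected because the coefficients are rounded.

```r
# Coefficients copied from the lm(CA ~ DOC) output above.
b0 <- 213.169
b1 <- 146.299
se_b1 <- 5.784

# Predicted credit amount for a given duration in months.
predict_ca <- function(months) b0 + b1 * months

# Example durations (chosen for illustration only).
pred <- predict_ca(c(12, 24, 36))
print(pred)

# The t value is the estimate divided by its standard error, and with a
# single predictor the F-statistic is the square of that t value.
t_b1 <- b1 / se_b1
print(c(t_value = t_b1, F_statistic = t_b1^2))
```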

[Figure: CA plotted against DOC]

Creditability and Credit Amount

$$\widehat{\text{Credit Amount}} = \beta_0 + \beta_1 \times \text{Creditability}$$

[Figure: Credit Amount Distribution by Creditability. x-axis: Creditability (No, Yes); y-axis: Credit Amount]

## [1] -0.1547401
##
## Call:
## lm(formula = CA ~ C)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3505.1 -1765.6  -858.4   771.8 14485.9
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   3938.1      161.1  24.447  < 2e-16 ***
## C             -952.7      192.5  -4.948  8.8e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2790 on 998 degrees of freedom
## Multiple R-squared:  0.02394, Adjusted R-squared:  0.02297
## F-statistic: 24.48 on 1 and 998 DF,  p-value: 8.795e-07

The model given is:

$$\widehat{\text{Credit Amount}} = 3938.1 - 952.7 \times \text{Creditability}$$

The multiple R-squared is 0.02394, i.e. only 2.394% of the variation in Credit Amount is explained by
Creditability. -> Although the coefficient is statistically significant, Creditability alone has very little explanatory power.
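With a 0/1 predictor, the two fitted values of the regression are simply the mean credit amounts of the two groups. The sketch below reads them off the printed coefficients and checks that, weighted by the group sizes from the table() output for C, they give back the overall mean of CA (3271). Object names are illustrative.

```r
# Coefficients copied from the lm(CA ~ C) output above.
b0 <- 3938.1   # intercept
b1 <- -952.7   # coefficient of C

# With a 0/1 predictor the fitted values are the two group means.
mean_ca_c0 <- b0        # C = 0
mean_ca_c1 <- b0 + b1   # C = 1

# Group sizes from the table() output for C.
n <- c(c0 = 300, c1 = 700)

# Weighting the group means by group size recovers the overall mean of CA.
overall <- (n[["c0"]] * mean_ca_c0 + n[["c1"]] * mean_ca_c1) / sum(n)
print(c(C0 = mean_ca_c0, C1 = mean_ca_c1, overall = overall))
```

Read this way, the model says that borrowers with Creditability = 1 received, on average, about 953 less than those with Creditability = 0.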

[Figure: Scatterplot of CA against C. x-axis: 0.0 to 1.0; y-axis: CA]

Instalment per cent and Credit Amount

$$\widehat{\text{Credit Amount}} = \beta_0 + \beta_1 \times \text{Instalment per cent}$$

[Figure: Scatterplot of CA against IPC]

## [1] -0.2713223
##
## Call:
## lm(formula = CA ~ IPC)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4021.0 -1659.6  -854.5   788.9 13802.0
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5306.57     244.18  21.732   <2e-16 ***
## IPC          -684.60      76.87  -8.905   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2718 on 998 degrees of freedom
## Multiple R-squared:  0.07362, Adjusted R-squared:  0.07269
## F-statistic: 79.31 on 1 and 998 DF,  p-value: < 2.2e-16

The model given is:

$$\widehat{\text{Credit Amount}} = 5306.57 - 684.60 \times \text{Instalment per cent}$$

The multiple R-squared is 0.07362, i.e. only 7.362% of the variation in Credit Amount is explained by
Instalment per cent. -> Although the coefficient is statistically significant, Instalment per cent alone has little explanatory power.
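Since IPC takes only the values 1 to 4, the fitted line produces just four distinct predictions. The sketch below lists them and also checks a general property of least squares: the line passes through the point of means, so the fitted value at mean(IPC) = 2.973 should be close to mean(CA) = 3271. All numbers are copied from the outputs above; object names are illustrative.

```r
# Coefficients copied from the lm(CA ~ IPC) output above.
b0 <- 5306.57
b1 <- -684.60

# Fitted credit amount at each of the four IPC levels.
levels_ipc <- 1:4
fitted_ca <- b0 + b1 * levels_ipc
print(setNames(fitted_ca, levels_ipc))

# A least-squares line passes through the point of means, so the fitted
# value at mean(IPC) = 2.973 should be close to mean(CA) = 3271.
at_mean <- b0 + b1 * 2.973
print(at_mean)
```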

[Figure: CA plotted against IPC]

Occupation and Credit Amount

$$\widehat{\text{Credit Amount}} = \beta_0 + \beta_1 \times \text{Occupation}$$

[Figure: Scatterplot of CA against O. x-axis: 1.0 to 4.0; y-axis: CA]

## [1] 0.2853931
##
## Call:
## lm(formula = CA ~ O)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3993.1 -1851.8  -777.6   776.4 13801.9
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)     -308        390  -0.790     0.43
## O               1232        131   9.407   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2707 on 998 degrees of freedom
## Multiple R-squared:  0.08145, Adjusted R-squared:  0.08053
## F-statistic: 88.49 on 1 and 998 DF,  p-value: < 2.2e-16

The model given is:

$$\widehat{\text{Credit Amount}} = -308 + 1232 \times \text{Occupation}$$

The multiple R-squared is 0.08145, i.e. only 8.145% of the variation in Credit Amount is explained by
Occupation. -> Although the coefficient is statistically significant, Occupation alone has little explanatory power.
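The negative intercept may look odd, but it refers to Occupation = 0, a value that does not occur in the data (the levels run from 1 to 4). The sketch below evaluates the fitted equation only at the observed levels; object names are illustrative.

```r
# Coefficients copied from the lm(CA ~ O) output above.
b0 <- -308
b1 <- 1232

# Occupation only takes the values 1 to 4, so O = 0 (where the
# intercept applies) lies outside the observed range.
occupation <- 1:4
fitted_ca <- b0 + b1 * occupation
print(setNames(fitted_ca, paste0("O=", occupation)))
```

Within the observed range every fitted credit amount is positive, so the intercept is best read as an extrapolation rather than as a meaningful prediction.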

[Figure: CA plotted against O]

4. Multiple regression
We fit a linear model that explains Credit Amount, the response, with the predictors C, DOC, IPC, A and O.

$$\widehat{\text{Credit Amount}} = \beta_0 + \beta_1 \times C + \beta_2 \times DOC + \beta_3 \times IPC + \beta_4 \times A + \beta_5 \times O$$

##
## Call:
## lm(formula = CA ~ C + DOC + IPC + A + O)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5805.4 -1096.3  -230.2   665.3 13252.0
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   69.478    377.685   0.184 0.854084
## C           -312.785    137.292  -2.278 0.022923 *
## DOC          141.084      5.313  26.553  < 2e-16 ***
## IPC         -865.190     55.192 -15.676  < 2e-16 ***
## A             18.965      5.420   3.499 0.000487 ***
## O            816.053     96.034   8.498  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1932 on 994 degrees of freedom
## Multiple R-squared:  0.534,  Adjusted R-squared:  0.5316
## F-statistic: 227.8 on 5 and 994 DF,  p-value: < 2.2e-16

$$\widehat{\text{Credit Amount}} = 69.478 - 312.785 \times C + 141.084 \times DOC - 865.190 \times IPC + 18.965 \times A + 816.053 \times O$$

All five independent variables (C, DOC, IPC, A, O) are statistically significant, with p-values below 0.05; only the intercept is not.
R² = 0.534 suggests a moderately good model fit: about 53.4% of the variance in CA is explained by the
independent variables in the model.
-> The model captures important determinants of Credit Amount with a moderate level of explanatory power.
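As a worked example of how the fitted equation is used, the sketch below computes the fitted value and residual for the first borrower in the printed head of the data set (C = 1, DOC = 18, IPC = 4, A = 21, O = 3, CA = 1049), and recomputes the adjusted R-squared from R² = 0.534 with n = 1000 observations and p = 5 predictors. Because the inputs are rounded, the recomputed adjusted value can differ from the printed 0.5316 in the last digit. Object names are illustrative.

```r
# Coefficients copied from the multiple regression output above.
beta <- c("(Intercept)" = 69.478, C = -312.785, DOC = 141.084,
          IPC = -865.190, A = 18.965, O = 816.053)

# First borrower in the printed head of the data set.
x <- c("(Intercept)" = 1, C = 1, DOC = 18, IPC = 4, A = 21, O = 3)
observed <- 1049

# The fitted value is the sum of coefficient * predictor value.
fitted_value <- sum(beta * x[names(beta)])
residual <- observed - fitted_value
print(c(fitted = fitted_value, residual = residual))

# Adjusted R-squared from R-squared, n = 1000 observations, p = 5 predictors.
r2 <- 0.534
n <- 1000
p <- 5
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(adj_r2, 4))
```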

CONCLUSION

1. Correlation Analysis:
The correlation matrix revealed one moderately strong pairwise relationship: Duration.of.Credit..month. and
Credit.Amount are positively correlated (0.625), indicating that longer credit durations are
often associated with higher credit amounts.
The visualization also helped identify weaker dependencies among variables, such as the negative correlation between Instalment.per.cent and Credit.Amount (-0.271).
Multiple Regression Analysis:

The regression model with Credit.Amount as the dependent variable and the predictors Creditability, Duration.of.Credit..month.,
Instalment.per.cent, Age..years. and Occupation explained approximately 53.4% of the variance (R² = 0.534).
Significant predictors included:
• Duration.of.Credit..month.: Positive association with credit amount, indicating that longer credit periods are linked
to higher credit.
• Age..years.: Positive association, showing that older individuals tend to receive slightly higher credit amounts.
• Instalment.per.cent: Negative association, implying that higher instalment percentages go with lower credit
amounts.
• Occupation: Positive association with credit amount.
• Creditability: Negative association with credit amount.

Visualization Insights:
The histograms revealed the distribution of key variables like Creditability and Occupation. Most credits were considered
reliable (Creditability = 1, 700 of 1000 observations), and occupation level 3 dominated the dataset.

Strengths:
Exploratory Analysis: Thorough statistical and visual exploration of variables provided valuable insights into distributions
and relationships.
Regression Model: The use of a multiple regression model allowed for understanding the contribution of key predictors to
credit amount, supported by statistically significant coefficients.
Correlation Visualization: The correlation plot provided a comprehensive overview of variable relationships.
Limitations:
Data Structure and Preprocessing:
• The dataset includes categorical variables like Purpose and Sex...Marital.Status, which were not used in the
regression modeling.
• Potential outliers in variables like Credit.Amount were not addressed, possibly affecting model accuracy.
Model Performance:
• Although the model explains 53.4% of the variance, the unexplained share suggests that other predictors not included in the analysis may
play a significant role in determining credit amount.
• The residuals are strongly right-skewed (in the multiple regression they range from -5805.4 to 13252.0, with a median of -230.2), which hints at non-linearity or heteroscedasticity not accounted for.

Feature Selection: The regression only used five predictors. A broader feature set, including categorical variables
converted to dummy variables, could enhance the model's explanatory power.
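The dummy-variable conversion mentioned above can be sketched with model.matrix(). The occupation codes below are hypothetical values chosen for illustration, not rows from the data set.

```r
# Hypothetical occupation codes for six borrowers (illustration only).
occupation <- factor(c(1, 2, 3, 4, 3, 3), levels = 1:4)

# model.matrix() expands the factor into 0/1 dummy columns,
# using level 1 as the reference category.
dummies <- model.matrix(~ occupation)
print(dummies)
```

In lm(), wrapping a coded variable in factor() inside the formula triggers the same expansion, so each level gets its own coefficient instead of a single slope.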
