Project - 8: Finance & Risk Analytics - India Credit Risk
Contents
1 Project Objective
2 Data Analysis – Step-by-Step Approach
2.1 Exploratory Data Analysis
2.1.1 Basic data exploration
2.1.2 Look out for outliers and missing values
2.1.3 Check for multicollinearity & treat it
2.2 Build Models and Compare Them to Get to the Best One
2.2.1 KNN
2.2.2 Logistic Regression
2.2.3 Naive Bayes
2.2.4 Boosting
2.2.5 Bagging
2.2.6 Data model performance analysis
3 Conclusion
4 Appendix A – Source Code
1 PROJECT OBJECTIVE
To build predictive models on the employee mode-of-transport data ('Cars.csv') and compare the performance of all the models.
2.1.2 Missing value treatment
Missing-values ratio: Data columns with too many missing values are unlikely to carry useful information. Columns whose number of missing values exceeds a given threshold can therefore be removed.
We observe that the variable 'Deposits (accepted by commercial banks)' has all observations blank in the raw data. We repeated the same check on the test data, found the same problem column, and removed it. We also removed the column 'Num', as it is an ID variable and holds no significance for model building.
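For illustration, the missing-values-ratio rule can be sketched as follows (a minimal Python/pandas sketch; the column names and the 0.5 threshold are hypothetical, not from the project):

```python
import pandas as pd
import numpy as np

def drop_sparse_columns(df: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Drop columns whose fraction of missing values exceeds `threshold`."""
    missing_ratio = df.isna().mean()          # per-column fraction of NaNs
    keep = missing_ratio[missing_ratio <= threshold].index
    return df[keep]

# Toy frame: 'deposits' is entirely blank, mirroring the column removed above.
raw = pd.DataFrame({
    "net_worth": [1.2, 3.4, 2.2, 0.9],
    "deposits":  [np.nan, np.nan, np.nan, np.nan],
    "sales":     [10.0, np.nan, 7.5, 8.1],
})
clean = drop_sparse_columns(raw, threshold=0.5)
print(list(clean.columns))   # 'deposits' (100% missing) is dropped
```

Fully blank columns such as the deposits variable above fall out immediately, while columns with only occasional gaps (like 'sales') are retained for imputation.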
> dim(data_clean)
[1] 934 51
Distribution of the target (default) variable:
0 1
917 17
Other ratios created:
Income from financial services Ratio = Income from financial services / Total assets
The dimensions (rows and columns) of the raw and test data after transformation are shown below:
> dim(test_data_transformed)
[1] 183 51
> dim(data_transformed)
[1] 934 51
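The ratio construction itself is simple column arithmetic; a hedged Python sketch of the 'Income from financial services / Total assets' ratio (values and column names are hypothetical):

```python
import pandas as pd

# Hypothetical columns standing in for the report's
# "Income from financial services" and "Total assets" variables.
df = pd.DataFrame({
    "income_fin_services": [12.0, 8.0, 0.0],
    "total_assets":        [240.0, 160.0, 100.0],
})
# Derived ratio feature, analogous to the ratios created in the project.
df["income_fin_services_ratio"] = (
    df["income_fin_services"] / df["total_assets"]
)
print(df["income_fin_services_ratio"].tolist())  # [0.05, 0.05, 0.0]
```

Scaling raw financials by total assets in this way makes firms of very different sizes comparable, which is the point of building ratio features before modelling.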
Capping at the 5th and 95th percentiles means that values below the 5th-percentile value are replaced by the 5th-percentile value, and values above the 95th-percentile value are replaced by the 95th-percentile value.
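This capping rule (winsorization) can be sketched in Python (illustrative data; the 5/95 cut-offs follow the text):

```python
import numpy as np

def cap_outliers(x: np.ndarray, lower_pct: float = 5,
                 upper_pct: float = 95) -> np.ndarray:
    """Winsorize: clamp values below the lower / above the upper percentile."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

# One extreme value (100) among otherwise small observations.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], dtype=float)
capped = cap_outliers(x)
print(capped.max())   # the extreme 100 is pulled down to the 95th percentile
```

Unlike dropping outlier rows, capping keeps every observation while limiting the leverage any single extreme value can exert on the model.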
2.1.6 Multicollinearity
Correlation
Multicollinearity has inflated the VIF values of the correlated variables. We will use a stepwise variable-reduction function based on VIF: it repeatedly removes the variable with the highest VIF until all remaining VIFs fall below a chosen threshold.
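A minimal sketch of such a stepwise VIF filter (shown in Python with NumPy for illustration; the project's R implementation and its threshold may differ):

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R^2_j), regressing column j on the remaining columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

def stepwise_vif_drop(X: np.ndarray, names: list, threshold: float = 5.0):
    """Repeatedly drop the predictor with the largest VIF above the threshold."""
    X, names = X.copy(), list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break                      # all remaining VIFs are acceptable
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
# Third column nearly duplicates 'a', creating severe multicollinearity.
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])
_, kept = stepwise_vif_drop(X, ["a", "b", "a_copy"], threshold=5.0)
print(kept)  # one of the two collinear columns is removed
```

Dropping one variable at a time matters: removing the single worst offender can bring several other VIFs back below the threshold, so re-computing after each removal avoids discarding more variables than necessary.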
2.2.1 KNN
3 CONCLUSION
4 APPENDIX A – SOURCE CODE
> #==============================================================================
> #
> # PROJECT - 5 : Predicting mode of Transport (ML)
> #
> #==============================================================================
> # Environment Set up
> # Setup Working Directory
> setwd("C:/Users/Ishan/Documents")
> getwd()
[1] "C:/Users/Ishan/Documents"
> # Install
>
> #install.packages("DMwR")
> #install.packages("class")
> #install.packages("VIF")
> #install.packages("GGally")
> #install.packages("mctest")
>
> # adding library
> library(ggplot2)
> library(DataExplorer)
> library(gower)
> library(rpart)
> #library(dplyr)
> library(plotrix)
> #library(rpart.plot)
> #library(randomForest)
> library(readxl)
> library(readr)
> #library(rattle)
> #library(ROCR)
> #library(ineq)
> #library(ROSE)
> #library(RColorBrewer)
> #library(data.table)
> #library(scales)
> library(corrplot)
> #library(caTools)
> #library(MASS)
> #library(clusterGeneration)
> library(caret)
> library(car)
> library(DMwR)
> library(class)
> library(carData)
> library(lattice)
> library(VIF)
> library(mctest)
> library(e1071)
> library(glmnet)
> library(xgboost)
> library(data.table)
> library(ipred)
>
>
> # Read Input File
> Cars <- read_csv("E:/analytics program great lakes/PROJECT 5/Cars.csv")
Parsed with column specification:
cols(
Age = col_double(),
Gender = col_character(),
Engineer = col_double(),
MBA = col_double(),
`Work Exp` = col_double(),
Salary = col_double(),
Distance = col_double(),
license = col_double(),
Transport = col_character()
)
> #attach(Cars)
>
> #Exploratory Data Analysis
>
> # Find out Total Number of Rows and Columns
> dim(Cars)
[1] 444 9
>
>
> # Find out Names of the Columns (Features)
> names(Cars)
[1] "Age" "Gender" "Engineer" "MBA" "Work Exp" "Salary" "Distance" "license" "Transport"
>
>
> # Find out Class of each Feature, along with internal structure
> summary(Cars)
Age Gender Engineer MBA Work Exp Salary
Min. :18.00 Length:444 Min. :0.0000 Min. :0.0000 Min. : 0.0 Min. : 6
1st Qu.:25.00 Class :character 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.: 3.0 1st Qu.: 9
Median :27.00 Mode :character Median :1.0000 Median :0.0000 Median : 5.0 Median :13
Mean :27.75 Mean :0.7545 Mean :0.2528 Mean : 6.3 Mean :16
3rd Qu.:30.00 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 8.0 3rd Qu.:15
Max. :43.00 Max. :1.0000 Max. :1.0000 Max. :24.0 Max. :57
NA's :1
license Transport
Min. :0.0000 Length:444
1st Qu.:0.0000 Class :character
Median :0.0000 Mode :character
Mean :0.2342
3rd Qu.:0.0000
Max. :1.0000
> str(Cars)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 444 obs. of 9 variables:
$ Age : num 28 23 29 28 27 26 28 26 22 27 ...
$ Gender : chr "Male" "Female" "Male" "Female" ...
$ Engineer : num 0 1 1 1 1 1 1 1 1 1 ...
$ MBA : num 0 0 0 1 0 0 0 0 0 0 ...
$ Work Exp : num 4 4 7 5 4 4 5 3 1 4 ...
$ Salary : num 14.3 8.3 13.4 13.4 13.4 12.3 14.4 10.5 7.5 13.5 ...
$ Distance : num 3.2 3.3 4.1 4.5 4.6 4.8 5.1 5.1 5.1 5.2 ...
$ license : num 0 0 0 0 0 1 0 0 0 0 ...
$ Transport: chr "Public Transport" "Public Transport" "Public Transport" "Public Transport"
- attr(*, "spec")=
.. cols(
.. Age = col_double(),
.. Gender = col_character(),
.. Engineer = col_double(),
.. MBA = col_double(),
.. `Work Exp` = col_double(),
.. Salary = col_double(),
.. Distance = col_double(),
.. license = col_double(),
.. Transport = col_character()
.. )
>
>
> # Check top and bottom Rows of the Dataset
> head(Cars,5)
# A tibble: 5 x 9
Age Gender Engineer MBA `Work Exp` Salary Distance license Transport
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 28 Male 0 0 4 14.3 3.2 0 Public Transport
2 23 Female 1 0 4 8.3 3.3 0 Public Transport
3 29 Male 1 0 7 13.4 4.1 0 Public Transport
4 28 Female 1 1 5 13.4 4.5 0 Public Transport
5 27 Male 1 0 4 13.4 4.6 0 Public Transport
> tail(Cars,5)
# A tibble: 5 x 9
Age Gender Engineer MBA `Work Exp` Salary Distance license Transport
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 40 Male 1 0 20 57 21.4 1 Car
2 38 Male 1 0 19 44 21.5 1 Car
3 37 Male 1 0 19 45 21.5 1 Car
4 37 Male 0 0 19 47 22.8 1 Car
5 39 Male 1 1 21 50 23.4 1 Car
>
>
> plot_str(Cars)
>
> # Histogram View
>
> plot_histogram(Cars)
>
> # Density view
>
> plot_density(Cars)
>
>
>
> ## The columns converted into factors
> Cars$Engineer<-as.factor(Cars$Engineer)
> Cars$MBA<-as.factor(Cars$MBA)
> Cars$license<-as.factor(Cars$license)
> Cars$Gender<-as.factor(Cars$Gender)
> Cars$Transport<-as.factor(Cars$Transport)
>
> summary(Cars)
Age Gender Engineer MBA Work Exp Salary Distance
Min. :18.00 Female:128 0:109 0 :331 Min. : 0.0 Min. : 6.50 Min. : 3.20
1st Qu.:25.00 Male :316 1:335 1 :112 1st Qu.: 3.0 1st Qu.: 9.80 1st Qu.: 8.80
Median :27.00 NA's: 1 Median : 5.0 Median :13.60 Median :11.00
Mean :27.75 Mean : 6.3 Mean :16.24 Mean :11.32
3rd Qu.:30.00 3rd Qu.: 8.0 3rd Qu.:15.72 3rd Qu.:13.43
Max. :43.00 Max. :24.0 Max. :57.00 Max. :23.40
Transport
2Wheeler : 83
Car : 61
Public Transport:300
> boxplot(Cars$Age ~Cars$MBA, main ="Age Vs MBA")
> boxplot(Cars$Salary ~Cars$Engineer, main = "Salary vs Eng.")
> boxplot(Cars$Salary ~Cars$MBA, main = "Salary vs MBA.")
> table(Cars$license,Cars$Transport)
Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)
A-priori probabilities:
Y
2Wheeler Car Public Transport
0.1891026 0.1378205 0.6730769
Conditional probabilities:
Age
Y [,1] [,2]
2Wheeler 25.00000 2.658753
Car 35.60465 3.546558
Public Transport 26.76190 2.853769
Gender
Y Female Male
2Wheeler 0.4576271 0.5423729
Car 0.2093023 0.7906977
Public Transport 0.2571429 0.7428571
Engineer
Y 0 1
2Wheeler 0.2542373 0.7457627
Car 0.1627907 0.8372093
Public Transport 0.2380952 0.7619048
MBA
Y 0 1
2Wheeler 0.7627119 0.2372881
Car 0.7906977 0.2093023
Public Transport 0.7238095 0.2761905
Salary
Y [,1] [,2]
2Wheeler 2.404017 0.3803774
Car 3.456899 0.4691048
Public Transport 2.522866 0.3143948
Distance
Y [,1] [,2]
2Wheeler 12.16780 3.200185
Car 15.31628 3.864301
Public Transport 10.42286 3.027442
license
Y 0 1
2Wheeler 0.7796610 0.2203390
Car 0.2790698 0.7209302
Public Transport 0.8904762 0.1095238
>
> #Prediction on the train dataset
> NB_Predictions<-predict(Naive_Bayes_Model,Cars2datatrain)
> table(NB_Predictions,Cars2datatrain$Transport)
Public Transport 37 8 194
Overall Statistics
Accuracy : 0.7981
95% CI : (0.7492, 0.8412)
No Information Rate : 0.6731
P-Value [Acc > NIR] : 6.615e-07
Kappa : 0.5485
Statistics by Class:
Overall Statistics
Accuracy : 0.8092
95% CI : (0.7313, 0.8725)
No Information Rate : 0.6794
P-Value [Acc > NIR] : 0.0006484
Kappa : 0.5748
Statistics by Class:
>
> # KNN
> setwd("C:/Users/Ishan/Documents")
> getwd()
[1] "C:/Users/Ishan/Documents"
> Cars <- read_csv("E:/analytics program great lakes/PROJECT 5/Cars.csv")
Parsed with column specification:
cols(
Age = col_double(),
Gender = col_character(),
Engineer = col_double(),
MBA = col_double(),
`Work Exp` = col_double(),
Salary = col_double(),
Distance = col_double(),
license = col_double(),
Transport = col_character()
)
> Cars$Engineer<-as.factor(Cars$Engineer)
> Cars$MBA<-as.factor(Cars$MBA)
> Cars$license<-as.factor(Cars$license)
> Cars$Gender<-as.factor(Cars$Gender)
> Cars$Transport<-as.factor(Cars$Transport)
> Cars<-na.exclude (Cars)
> Cars2<- Cars[,-5]
> Cars2$Salary <- log(Cars2$Salary)
>
> #test and train data
> set.seed(44)
> carindex<-createDataPartition(Cars2$Transport, p=0.7,list = FALSE)
> Cars2datatrain<-Cars2[carindex,]
> Cars2datatest<-Cars2[-carindex,]
> prop.table(table(Cars2datatrain$Transport))
312 samples
7 predictor
3 classes: '2Wheeler', 'Car', 'Public Transport'
k Accuracy Kappa
2 0.7302419 0.4281271
3 0.7726815 0.4952761
4 0.7630040 0.4715395
5 0.7596774 0.4502415
6 0.7533266 0.4343293
7 0.7630040 0.4423694
8 0.7695565 0.4590623
9 0.7597782 0.4193741
10 0.7594758 0.4089674
11 0.7594758 0.4013071
12 0.7530242 0.3777808
13 0.7626008 0.3941709
14 0.7658266 0.3973498
15 0.7658266 0.3939163
16 0.7627016 0.3891776
17 0.7594758 0.3774473
18 0.7627016 0.3890871
19 0.7530242 0.3569895
20 0.7562500 0.3697960
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was k = 3.
> KNN_predictions <- predict(fit.knn,Cars2datatrain)
> table(KNN_predictions, Cars2datatrain$Transport)
Overall Statistics
Accuracy : 0.8686
95% CI : (0.826, 0.904)
No Information Rate : 0.6731
P-Value [Acc > NIR] : 1.579e-15
Kappa : 0.7177
Statistics by Class:
Car 3 15 3
Public Transport 13 0 80
> confusionMatrix(table(KNN_predictions, Cars2datatest$Transport))
Confusion Matrix and Statistics
Overall Statistics
Accuracy : 0.7863
95% CI : (0.7061, 0.853)
No Information Rate : 0.6794
P-Value [Acc > NIR] : 0.004613
Kappa : 0.547
Statistics by Class:
0 1
382 61
> sum(Cars2$CarUsage == 1)/nrow(Cars2)
[1] 0.1376975
> Cars2$CarUsage<-as.factor(Cars2$CarUsage)
> set.seed(44)
> carindex<-createDataPartition(Cars2$Transport, p=0.7,list = FALSE)
> Cars2datatrain<-Cars2[carindex,]
> Cars2datatest<-Cars2[-carindex,]
> prop.table(table(Cars2datatrain$Transport))
> str(Cars2datatrain)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 312 obs. of 8 variables:
$ Age : num 28 23 26 28 22 27 25 27 24 27 ...
$ Gender : Factor w/ 2 levels "Female","Male": 2 1 2 2 2 2 1 2 2 2 ...
$ Engineer: Factor w/ 2 levels "0","1": 1 2 2 2 2 2 2 2 2 2 ...
$ MBA : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Salary : num 2.66 2.12 2.51 2.67 2.01 ...
$ Distance: num 3.2 3.3 4.8 5.1 5.1 5.2 5.2 5.3 5.4 5.5 ...
$ license : Factor w/ 2 levels "0","1": 1 1 2 1 1 1 1 2 1 2 ...
$ CarUsage: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "na.action")= 'exclude' Named int 145
..- attr(*, "names")= chr "145"
> Cars2dataSMOTE<-SMOTE(CarUsage~., as.data.frame(Cars2datatrain), perc.over = 250,perc.under =
> prop.table(table(Cars2dataSMOTE$CarUsage))
0 1
0.5 0.5
> ##Create control parameter for GLM
> outcomevar<-'CarUsage'
> regressors<-c("Age","Salary","Distance","license","Engineer","MBA","Gender")
> trainctrl<-trainControl(method = 'repeatedcv',number = 10,repeats = 3)
> Cars2glm<-train(Cars2dataSMOTE[,regressors],Cars2dataSMOTE[,outcomevar],method = "glm", family =
+ "binomial",trControl = trainctrl)
Warning messages:
1: glm.fit: fitted probabilities numerically 0 or 1 occurred
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
3: glm.fit: fitted probabilities numerically 0 or 1 occurred
> summary(Cars2glm$finalModel)
Call:
NULL
Deviance Residuals:
Min 1Q Median 3Q Max
-3.3474 -0.0194 0.0000 0.0445 2.2655
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -58.0133 12.1697 -4.767 1.87e-06 ***
Age 1.7266 0.3616 4.775 1.80e-06 ***
Salary 1.2720 1.4542 0.875 0.3817
Distance 0.1321 0.1539 0.858 0.3908
license1 1.9338 1.0768 1.796 0.0725 .
Engineer1 -0.1014 1.2338 -0.082 0.9345
MBA1 -2.1554 0.9435 -2.285 0.0223 *
GenderMale -1.0655 0.9267 -1.150 0.2502
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Overall
Age 100.00
MBA1 46.94
license1 36.52
GenderMale 22.75
Salary 16.89
Distance 16.54
Engineer1 0.00
> plot(varImp(object = Cars2glm), main="Variable Importance for Logistic Regression")
> carusageprediction<-predict.train(object = Cars2glm,Cars2datatest[,regressors],type = "raw")
> Cars2datatest$CarUsage<-as.factor(Cars2datatest$CarUsage)
> confusionMatrix(carusageprediction,Cars2datatest$CarUsage, positive='1')
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 104 0
1 9 18
Accuracy : 0.9313
95% CI : (0.8736, 0.9681)
No Information Rate : 0.8626
P-Value [Acc > NIR] : 0.010562
Kappa : 0.7605
Sensitivity : 1.0000
Specificity : 0.9204
Pos Pred Value : 0.6667
Neg Pred Value : 1.0000
Prevalence : 0.1374
Detection Rate : 0.1374
Detection Prevalence : 0.2061
Balanced Accuracy : 0.9602
'Positive' Class : 1
> #str(carusageprediction)
> #str(Cars2datatest$CarUsage)
>
> #summary(carusageprediction)
> #summary(Cars2datatest$CarUsage)
>
> carusagepreddata<-Cars2datatest
> carusagepreddata$predictusage<-carusageprediction
> trainctrlgn<-trainControl(method = 'cv',number = 10,returnResamp = 'none')
> Cars2glmnet<-train(CarUsage~Age+Salary+Distance+license, data = Cars2dataSMOTE,
+ method = 'glmnet', trControl = trainctrlgn)
> Cars2glmnet
glmnet
258 samples
4 predictor
2 classes: '0', '1'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 232, 232, 233, 232, 232, 233, ...
Resampling results across tuning parameters:
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were alpha = 0.55 and lambda = 0.08382152.
> varImp(object = Cars2glmnet)
glmnet variable importance
Overall
Salary 100.00
Age 26.38
license1 18.50
Distance 0.00
> plot(varImp(object = Cars2glmnet), main="Variable Importance for Logistic Regression - Post Regularization")
> carusagepredictiong<-predict.train(object = Cars2glmnet,Cars2datatest[,regressors],type = "raw")
> confusionMatrix(carusagepredictiong,Cars2datatest$CarUsage, positive='1')
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 105 0
1 8 18
Accuracy : 0.9389
95% CI : (0.8832, 0.9733)
No Information Rate : 0.8626
P-Value [Acc > NIR] : 0.004479
Kappa : 0.7829
Sensitivity : 1.0000
Specificity : 0.9292
Pos Pred Value : 0.6923
Neg Pred Value : 1.0000
Prevalence : 0.1374
Detection Rate : 0.1374
Detection Prevalence : 0.1985
Balanced Accuracy : 0.9646
'Positive' Class : 1
>
>
> # Boosting
> setwd("C:/Users/Ishan/Documents")
> getwd()
[1] "C:/Users/Ishan/Documents"
> Cars <- read_csv("E:/analytics program great lakes/PROJECT 5/Cars.csv")
Parsed with column specification:
cols(
Age = col_double(),
Gender = col_character(),
Engineer = col_double(),
MBA = col_double(),
`Work Exp` = col_double(),
Salary = col_double(),
Distance = col_double(),
license = col_double(),
Transport = col_character()
)
> Cars$Engineer<-as.factor(Cars$Engineer)
> Cars$MBA<-as.factor(Cars$MBA)
> Cars$license<-as.factor(Cars$license)
> Cars$Gender<-as.factor(Cars$Gender)
> Cars$Transport<-as.factor(Cars$Transport)
> Cars<-na.exclude (Cars)
> Cars2<- Cars[,-5]
> Cars2$Salary <- log(Cars2$Salary)
> set.seed(44)
> carindex<-createDataPartition(Cars2$Transport, p=0.7,list = FALSE)
> Cars2datatrain<-Cars2[carindex,]
> Cars2datatest<-Cars2[-carindex,]
> Cars2datatrain$license<-as.factor(Cars2datatrain$license)
> Cars2datatest$license<-as.factor(Cars2datatest$license)
> Cars2train.car<-Cars2datatrain[Cars2datatrain$Transport %in% c("Car", "Public Transport"),]
> Cars2train.twlr<-Cars2datatrain[Cars2datatrain$Transport %in% c("2Wheeler", "Public Transport"),]
> Cars2train.car$Transport<-as.character(Cars2train.car$Transport)
> Cars2train.car$Transport<-as.factor(Cars2train.car$Transport)
> Cars2train.twlr$Transport<-as.character(Cars2train.twlr$Transport)
> Cars2train.twlr$Transport<-as.factor(Cars2train.twlr$Transport)
> prop.table(table(Cars2train.car$Transport))
cb.print.evaluation(period = print_every_n)
# of features: 7
niter: 50
nfeatures : 7
xNames : Age GenderMale Engineer1 MBA1 Salary Distance license1
problemType : Classification
tuneValue :
nrounds max_depth eta gamma colsample_bytree min_child_weight subsample
1 50 1 0.3 0 0.6 1 1
obsLevels : 2Wheeler Public Transport Car
param :
list()
> predictions_xgb<-predict(Cars2xgb,Cars2datatest)
> confusionMatrix(predictions_xgb,Cars2datatest$Transport)
Confusion Matrix and Statistics
Reference
Prediction 2Wheeler Car Public Transport
2Wheeler 12 0 25
Car 1 18 6
Public Transport 11 0 58
Overall Statistics
Accuracy : 0.6718
95% CI : (0.5843, 0.7512)
No Information Rate : 0.6794
P-Value [Acc > NIR] : 0.614462
Kappa : 0.4182
Statistics by Class:
Transport = col_character()
)
> Cars$Engineer<-as.factor(Cars$Engineer)
> Cars$MBA<-as.factor(Cars$MBA)
> Cars$license<-as.factor(Cars$license)
> Cars$Gender<-as.factor(Cars$Gender)
> Cars$Transport<-as.factor(Cars$Transport)
> Cars<-na.exclude (Cars)
> Cars2<- Cars[,-5]
> Cars2$Salary <- log(Cars2$Salary)
> set.seed(44)
> #test and train data
> carindex <- createDataPartition(Cars2$Transport, p=0.70, list=FALSE)
> Cars2datatrain <- Cars2[ carindex,]
> Cars2datatest <- Cars2[-carindex,]
> Cars2.bagging <- bagging(Transport ~.,
+ data=Cars2datatrain,
+ control=rpart.control(maxdepth=5, minsplit=4))
> Cars2datatrain$pred.class <- predict(Cars2.bagging, Cars2datatrain)
> table(Cars2datatrain$Gender,Cars2datatrain$pred.class)
2.59525470695687 0 0 1
2.60268968544438 0 0 2
2.61006979274201 0 0 10
2.61739583283408 0 0 6
2.62466859216316 0 0 7
2.63188884013665 0 0 10
2.66025953726586 0 0 1
2.66722820658195 0 0 3
2.67414864942653 0 0 1
2.68102152871429 0 0 15
2.68784749378469 0 0 4
2.69462718077007 0 0 7
2.70136121295141 1 0 5
2.70805020110221 2 0 0
2.73436750941958 0 0 1
2.74727091425549 0 1 6
2.75366071235426 0 0 1
2.76000994003292 0 1 3
2.76631910922619 0 2 1
2.8094026953625 0 0 1
2.81540871942271 0 1 0
2.82137888640921 0 0 1
2.82731362192903 0 5 1
2.83321334405622 0 4 0
2.87919845729804 0 0 3
2.9338568698359 1 0 2
2.98061863574394 0 0 1
3.03013370027132 0 0 2
3.03495298670727 0 0 1
3.07269331469012 0 0 1
3.08190996979504 0 0 1
3.13549421592915 1 0 0
3.16968558067743 1 0 2
3.17387845893747 0 0 1
3.21486780347066 0 0 1
3.25424296870549 0 0 1
3.35689712276558 0 0 2
3.3603753871419 0 1 2
3.49650756146648 0 1 0
3.52636052461616 0 2 0
3.55248682920838 0 1 0
3.55534806148941 0 1 0
3.58351893845611 0 1 0
3.60004824040732 0 0 1
3.61091791264422 0 2 0
3.66356164612965 0 1 0
3.68637632389582 0 1 0
3.71113006304876 0 1 0
3.71357206670431 0 1 0
3.73766961828337 0 2 0
3.75887182593397 0 1 0
3.76120011569356 0 2 0
3.78418963391826 0 1 0
3.80666248977032 0 4 0
3.85014760171006 0 2 0
3.87120101090789 0 1 0
3.93182563272433 0 2 0
3.95124371858143 0 1 0
3.98898404656427 0 1 0
4.00733318523247 0 1 0
4.04305126783455 0 1 0
> Cars2.bagging <- bagging(Transport ~.,
+ data=Cars2datatest,
+ control=rpart.control(maxdepth=5, minsplit=4))
> Cars2datatest$pred.class <- predict(Cars2.bagging, Cars2datatest)
>
> table(Cars2datatest$Gender,Cars2datatest$pred.class)
>
> # Missing Value & Multicollinearity
> # Setup Working Directory
> setwd("C:/Users/Ishan/Documents")
> getwd()
[1] "C:/Users/Ishan/Documents"
>
> # adding library
> library(DataExplorer)
> library(gower)
> library(rpart)
> library(plotrix)
> library(readr)
> library(car)
> library(DMwR)
> library(class)
> library(carData)
> library(lattice)
> library(corrplot)  # needed for corrplot() below
>
>
> # Read Input File
> Cars <- read_csv("E:/analytics program great lakes/PROJECT 5/Cars.csv")
Parsed with column specification:
cols(
Age = col_double(),
Gender = col_character(),
Engineer = col_double(),
MBA = col_double(),
`Work Exp` = col_double(),
Salary = col_double(),
Distance = col_double(),
license = col_double(),
Transport = col_character()
)
> ## The columns converted into factors
> Cars$Engineer<-as.factor(Cars$Engineer)
> Cars$MBA<-as.factor(Cars$MBA)
> Cars$license<-as.factor(Cars$license)
> Cars$Gender<-as.factor(Cars$Gender)
> Cars$Transport<-as.factor(Cars$Transport)
>
>
> Cars$Engineer<-as.numeric(Cars$Engineer)
> Cars$MBA<-as.numeric(Cars$MBA)
> Cars$license<-as.numeric(Cars$license)
> Cars$Gender<-as.numeric(Cars$Gender)
> Cars$Transport<-as.numeric(Cars$Transport)
>
> str(Cars)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 444 obs. of 9 variables:
$ Age : num 28 23 29 28 27 26 28 26 22 27 ...
$ Gender : num 2 1 2 1 2 2 2 1 2 2 ...
$ Engineer : num 1 2 2 2 2 2 2 2 2 2 ...
$ MBA : num 1 1 1 2 1 1 1 1 1 1 ...
$ Work Exp : num 4 4 7 5 4 4 5 3 1 4 ...
$ Salary : num 14.3 8.3 13.4 13.4 13.4 12.3 14.4 10.5 7.5 13.5 ...
$ Distance : num 3.2 3.3 4.1 4.5 4.6 4.8 5.1 5.1 5.1 5.2 ...
$ license : num 1 1 1 1 1 2 1 1 1 1 ...
$ Transport: num 3 3 3 3 3 3 1 3 3 3 ...
- attr(*, "spec")=
.. cols(
.. Age = col_double(),
.. Gender = col_character(),
.. Engineer = col_double(),
.. MBA = col_double(),
.. `Work Exp` = col_double(),
.. Salary = col_double(),
.. Distance = col_double(),
.. license = col_double(),
.. Transport = col_character()
.. )
> #missing value
> anyNA(Cars)
[1] TRUE
> Cars[!complete.cases(Cars), ]
# A tibble: 1 x 9
Age Gender Engineer MBA `Work Exp` Salary Distance license Transport
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 28 1 1 NA 6 13.7 9.4 1 3
> Cars<-na.exclude (Cars)
> accounts_n<-Cars
> str(accounts_n)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 443 obs. of 9 variables:
$ Age : num 28 23 29 28 27 26 28 26 22 27 ...
$ Gender : num 2 1 2 1 2 2 2 1 2 2 ...
$ Engineer : num 1 2 2 2 2 2 2 2 2 2 ...
$ MBA : num 1 1 1 2 1 1 1 1 1 1 ...
$ Work Exp : num 4 4 7 5 4 4 5 3 1 4 ...
$ Salary : num 14.3 8.3 13.4 13.4 13.4 12.3 14.4 10.5 7.5 13.5 ...
$ Distance : num 3.2 3.3 4.1 4.5 4.6 4.8 5.1 5.1 5.1 5.2 ...
$ license : num 1 1 1 1 1 2 1 1 1 1 ...
$ Transport: num 3 3 3 3 3 3 1 3 3 3 ...
- attr(*, "na.action")= 'exclude' Named int 145
..- attr(*, "names")= chr "145"
> corrplot(cor(accounts_n))
> model1 <- glm(Transport ~ ., data= accounts_n)
> summary(model1)
Call:
glm(formula = Transport ~ ., data = accounts_n)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.1081 -0.2535 0.2071 0.4631 1.1748
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.965013 0.529462 1.823 0.069 .
Age 0.087019 0.021419 4.063 5.76e-05 ***
Gender 0.369621 0.077090 4.795 2.24e-06 ***
Engineer -0.007704 0.079025 -0.097 0.922
MBA 0.129177 0.078727 1.641 0.102
`Work Exp` -0.039439 0.026128 -1.509 0.132
Salary -0.008249 0.009595 -0.860 0.390
Distance -0.052808 0.010545 -5.008 8.03e-07 ***
license -0.561159 0.095571 -5.872 8.59e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> vif(model1)
Age Gender Engineer MBA `Work Exp` Salary Distance license
7.892948 1.071875 1.015424 1.032638 15.735548 8.871978 1.274628 1.447236
> model2 <- glm(Transport ~ ., data= accounts_n[,-5])
> summary(model2)
Call:
glm(formula = Transport ~ ., data = accounts_n[, -5])
Deviance Residuals:
Min 1Q Median 3Q Max
-2.0607 -0.2625 0.2145 0.4680 1.0776
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.488038 0.400920 3.712 0.000233 ***
Age 0.064015 0.015072 4.247 2.65e-05 ***
Gender 0.373212 0.077167 4.836 1.84e-06 ***
Engineer -0.004950 0.079120 -0.063 0.950141
MBA 0.115982 0.078355 1.480 0.139541
Salary -0.018465 0.006811 -2.711 0.006973 **
Distance -0.051120 0.010501 -4.868 1.58e-06 ***
license -0.545627 0.095155 -5.734 1.84e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Call:
glm(formula = Transport ~ ., data = accounts_n[, -c(5, 6)])
Deviance Residuals:
Min 1Q Median 3Q Max
-2.0817 -0.2194 0.2039 0.4808 1.1676
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.259173 0.284592 7.938 1.75e-14 ***
Age 0.031048 0.008969 3.462 0.00059 ***
Gender 0.381131 0.077671 4.907 1.31e-06 ***
Engineer -0.007425 0.079688 -0.093 0.92581
MBA 0.109572 0.078888 1.389 0.16555
Distance -0.058510 0.010215 -5.728 1.90e-08 ***
license -0.605413 0.093236 -6.493 2.29e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> vif(model3)
Age Gender Engineer MBA Distance license
1.360182 1.069320 1.014748 1.018978 1.175392 1.353631
> model4 <- glm(Transport ~ ., data= accounts_n[,-c(5,1)])
> summary(model4)
Call:
glm(formula = Transport ~ ., data = accounts_n[, -c(5, 1)])
Deviance Residuals:
Min 1Q Median 3Q Max
-2.0259 -0.1351 0.2098 0.4732 1.1094
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.877943 0.236105 12.189 < 2e-16 ***
Gender 0.383198 0.078624 4.874 1.54e-06 ***
Engineer 0.009012 0.080581 0.112 0.911
MBA 0.100695 0.079787 1.262 0.208
Salary 0.004875 0.004102 1.189 0.235
Distance -0.053903 0.010684 -5.045 6.66e-07 ***
license -0.532477 0.096945 -5.493 6.74e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1