Final - Project - Model 1
27/02/2023
library(psych)
library(rstatix)
##
## Attaching package: ’rstatix’
RE <- c(data1$RE)
class.df <- data.frame(Alt, EBIT, NWC, Sale, equity, RE)
View(class.df)
summary(data1)
Correlation test
Correlation matrices
cor_mat(class.df)
## # A tibble: 6 x 7
## rowname Alt EBIT NWC Sale equity RE
## * <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Alt 1 0.21 0.15 0.058 -0.4 0.081
## 2 EBIT 0.21 1 0.19 0.21 -0.017 -0.027
## 3 NWC 0.15 0.19 1 0.12 -0.063 -0.1
## 4 Sale 0.058 0.21 0.12 1 0.08 -0.043
## 5 equity -0.4 -0.017 -0.063 0.08 1 -0.047
## 6 RE 0.081 -0.027 -0.1 -0.043 -0.047 1
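The same kind of matrix can be produced with base R alone. The following is a minimal sketch, assuming a data frame of numeric columns; `demo_df` holds simulated stand-in values, not the project data, and `signif(..., 2)` is used only because the values above appear to be shown to two significant digits.

```r
# Simulated stand-in for class.df (hypothetical values, not the project data)
set.seed(1)
demo_df <- data.frame(
  Alt  = rnorm(100, mean = 8, sd = 6),
  EBIT = rnorm(100, mean = 0.1, sd = 0.1),
  NWC  = rnorm(100, mean = 0.2, sd = 0.2)
)

# Pearson correlation matrix, shortened to two significant digits for reading
cor_demo <- signif(cor(demo_df, method = "pearson"), 2)
print(cor_demo)
```

Each off-diagonal entry is the pairwise Pearson correlation, so the matrix is symmetric with ones on the diagonal.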
[Figure: correlation plot; variable labels EBIT, NWC, Sale, equity, RE]
identify_outliers(data1, Alt)
## 51 ANET UN Equity 21.20 0.13 0.13 0.51 0.00 0.31 TRUE FALSE
## 52 CDNS UW Equity 24.27 0.21 0.29 0.68 0.01 0.40 TRUE FALSE
## 53 ENPH UQ Equity 22.25 0.10 0.14 0.66 0.04 -0.09 TRUE FALSE
## 54 EPAM UN Equity 33.88 0.36 0.45 1.07 0.01 0.29 TRUE TRUE
## 55 INTU UW Equity 25.08 0.16 0.65 0.62 0.02 0.70 TRUE TRUE
## 56 MPWR UW Equity 58.58 0.27 0.38 0.76 0.00 0.14 TRUE TRUE
## 57 NVDA UW Equity 40.40 0.28 0.51 0.58 0.02 0.52 TRUE TRUE
## 58 TXN UW Equity 17.22 0.11 0.23 0.74 0.05 1.62 TRUE FALSE
## 59 TYL UN Equity 24.01 0.14 0.11 0.34 0.06 0.19 TRUE FALSE
## 60 VRSN UW Equity -4.95 0.11 0.27 0.67 0.06 -8.31 TRUE FALSE
## 61 CDNS UW Equity 19.39 0.23 0.20 0.68 0.01 0.46 TRUE FALSE
## 62 MPWR UW Equity 31.51 0.29 0.32 0.76 0.00 0.47 TRUE TRUE
## 63 ADBE UW Equity 18.05 0.07 0.09 0.65 0.01 0.25 TRUE FALSE
## 64 ANET UN Equity 26.23 0.15 0.12 0.65 0.00 -0.45 TRUE TRUE
## 65 CDNS UW Equity 22.40 0.21 0.19 0.69 0.02 0.48 TRUE FALSE
## 66 FICO UN Equity 18.37 0.06 0.16 0.96 0.09 0.86 TRUE FALSE
## 67 MPWR UW Equity 49.34 0.25 0.38 0.87 0.00 0.78 TRUE TRUE
## 68 NVDA UW Equity 40.60 0.28 0.48 0.65 0.02 1.66 TRUE TRUE
## 69 VRSN UW Equity -5.36 0.10 0.12 0.82 0.08 2.47 TRUE FALSE
identify_outliers(data1, EBIT)
## 32 HPQ UN Equity 2.48 0.32 0.00 1.45 0.41 0.00 TRUE FALSE
## 33 ZBRA UW Equity 5.97 0.41 0.12 0.77 0.17 1.80 TRUE FALSE
identify_outliers(data1, NWC)
identify_outliers(data1, Sale)
## 38 JBL UN Equity 2.67 0.05 0.07 1.89 0.64 0.05 TRUE TRUE
## 39 AMD UW Equity 34.78 0.44 0.10 1.32 0.00 -0.57 TRUE FALSE
## 40 CDW UW Equity 5.61 0.16 0.42 1.58 0.26 0.00 TRUE FALSE
## 41 HPQ UN Equity 2.84 0.31 0.03 1.64 0.27 0.00 TRUE FALSE
## 42 JBL UN Equity 2.75 0.11 0.14 1.76 0.41 0.12 TRUE TRUE
## 43 ACN UN Equity 8.12 0.12 -0.17 1.30 0.02 0.07 TRUE FALSE
## 44 AMD UW Equity 15.34 0.44 0.10 1.32 0.00 0.18 TRUE FALSE
## 45 CDW UW Equity 4.44 0.23 0.55 1.58 0.26 0.00 TRUE FALSE
## 46 HPQ UN Equity 2.68 0.34 -0.05 1.63 0.45 0.00 TRUE FALSE
## 47 JBL UN Equity 2.62 0.19 0.16 1.70 0.46 0.07 TRUE FALSE
## 48 STX UW Equity 1.87 0.11 0.07 1.30 0.39 6.38 TRUE FALSE
## 49 CDW UW Equity 5.55 0.10 0.40 1.81 0.25 0.00 TRUE TRUE
## 50 HPQ UN Equity 2.48 0.32 0.00 1.45 0.41 0.00 TRUE FALSE
## 51 JBL UN Equity 3.18 0.22 0.18 1.79 0.24 0.09 TRUE TRUE
identify_outliers(data1, equity)
## 37 HPE UN Equity 0.85 0.22 0.03 0.50 0.74 0.28 TRUE TRUE
## 38 HPQ UN Equity 2.68 0.34 -0.05 1.63 0.45 0.00 TRUE FALSE
## 39 IBM UN Equity 5.20 0.07 -0.14 0.43 0.46 -0.01 TRUE FALSE
## 40 JBL UN Equity 2.62 0.19 0.16 1.70 0.46 0.07 TRUE FALSE
## 41 WDC UW Equity 3.20 0.11 0.22 0.72 0.54 0.08 TRUE FALSE
## 42 GEN UW Equity 2.40 0.03 0.05 0.21 0.89 0.55 TRUE TRUE
## 43 HPE UN Equity 1.07 0.22 0.10 0.51 0.68 0.29 TRUE TRUE
## 44 STX UW Equity 0.35 0.19 0.00 0.98 0.45 11.15 TRUE FALSE
## 45 WDC UW Equity 2.30 0.16 0.12 0.50 0.60 0.10 TRUE FALSE
identify_outliers(data1, RE)
## 42 FTNT UW Equity 6.99 0.01 0.02 0.64 0.00 7.61 TRUE TRUE
## 43 IT UN Equity 3.58 0.12 0.30 0.56 0.20 5.18 TRUE TRUE
## 44 KLAC UW Equity 6.26 0.13 0.58 0.63 0.12 1.60 TRUE FALSE
## 45 NVDA UW Equity 41.28 0.18 0.53 0.63 0.02 2.90 TRUE TRUE
## 46 ON UW Equity 3.21 0.07 0.05 0.61 0.27 1.41 TRUE FALSE
## 47 PTC UW Equity 5.79 0.05 0.06 0.43 0.13 47.06 TRUE TRUE
## 48 QRVO UW Equity 6.94 0.25 0.47 0.49 0.18 -0.90 TRUE FALSE
## 49 STX UW Equity 2.60 0.04 0.03 1.18 0.35 2.70 TRUE TRUE
## 50 TYL UN Equity 21.56 0.13 0.08 0.43 0.00 -1.34 TRUE FALSE
## 51 VRSN UW Equity -6.01 0.13 0.09 0.72 0.07 1.71 TRUE FALSE
## 52 ZBRA UW Equity 9.60 0.18 0.07 0.83 0.07 1.79 TRUE FALSE
## 53 FICO UN Equity 12.74 0.04 0.12 0.84 0.12 1.40 TRUE FALSE
## 54 IBM UN Equity 4.42 0.09 -0.09 0.43 0.46 1.23 TRUE FALSE
## 55 TXN UW Equity 17.22 0.11 0.23 0.74 0.05 1.62 TRUE FALSE
## 56 VRSN UW Equity -4.95 0.11 0.27 0.67 0.06 -8.31 TRUE TRUE
## 57 AKAM UW Equity 3.82 0.20 0.09 0.43 0.15 1.31 TRUE FALSE
## 58 ANSS UW Equity 11.79 -0.02 -0.15 0.30 0.03 -0.88 TRUE FALSE
## 59 ENPH UQ Equity 14.63 0.10 0.14 0.66 0.04 1.32 TRUE FALSE
## 60 FFIV UW Equity 4.33 0.09 0.16 0.51 0.08 3.58 TRUE TRUE
## 61 FSLR UW Equity 8.26 0.12 -0.10 0.39 0.04 4.35 TRUE TRUE
## 62 FTNT UW Equity 5.60 0.01 0.01 0.56 0.02 6.60 TRUE TRUE
## 63 IT UN Equity 5.23 0.10 0.11 0.64 0.12 5.67 TRUE TRUE
## 64 KEYS UN Equity 7.76 0.04 0.03 0.67 0.06 1.21 TRUE FALSE
## 65 KLAC UW Equity 5.83 0.17 0.57 0.73 0.15 1.90 TRUE TRUE
## 66 NVDA UW Equity 15.26 0.31 0.45 0.61 0.02 1.27 TRUE FALSE
## 67 ON UW Equity 5.60 0.07 0.05 0.70 0.11 1.22 TRUE FALSE
## 68 PTC UW Equity 6.00 0.14 -0.12 0.41 0.13 34.71 TRUE TRUE
## 69 STX UW Equity 1.87 0.11 0.07 1.30 0.39 6.38 TRUE TRUE
## 70 TYL UN Equity 6.55 0.14 0.11 0.34 0.06 -4.25 TRUE TRUE
## 71 VRSN UW Equity -4.47 0.12 0.20 0.67 0.06 1.87 TRUE FALSE
## 72 ZBRA UW Equity 7.99 0.38 0.11 0.91 0.04 1.98 TRUE TRUE
## 73 AKAM UW Equity 4.66 0.17 0.10 0.44 0.24 1.44 TRUE FALSE
## 74 ANSS UW Equity 15.68 0.03 -0.12 0.31 0.04 -0.80 TRUE FALSE
## 75 FFIV UW Equity 5.73 -0.05 0.10 0.54 0.03 3.09 TRUE TRUE
## 76 FSLR UW Equity 5.82 0.15 -0.11 0.32 0.01 4.59 TRUE TRUE
## 77 FTNT UW Equity 5.20 0.01 0.01 0.71 0.03 7.59 TRUE TRUE
## 78 IT UN Equity 6.56 0.10 -0.02 0.75 0.12 6.29 TRUE TRUE
## 79 KEYS UN Equity 7.21 0.05 -0.01 0.63 0.10 2.05 TRUE TRUE
## 80 KLAC UW Equity 7.05 0.26 0.56 0.75 0.09 2.01 TRUE TRUE
## 81 NVDA UW Equity 40.60 0.28 0.48 0.65 0.02 1.66 TRUE FALSE
## 82 PTC UW Equity 5.73 0.08 -0.10 0.33 0.11 24.52 TRUE TRUE
## 83 STX UW Equity 0.35 0.19 0.00 0.98 0.45 11.15 TRUE TRUE
## 84 TYL UN Equity 9.06 0.13 0.05 0.39 0.08 -6.69 TRUE TRUE
## 85 VRSN UW Equity -5.36 0.10 0.12 0.82 0.08 2.47 TRUE TRUE
## 86 ZBRA UW Equity 5.97 0.41 0.12 0.77 0.17 1.80 TRUE FALSE
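The two logical columns at the end of each row are, per the rstatix documentation, `is.outlier` and `is.extreme`: a value is an outlier if it lies more than 1.5 × IQR outside the quartiles, and extreme if it lies more than 3 × IQR outside them. The sketch below applies that rule in base R; `flag_outliers` and `x_demo` are hypothetical names and values introduced for illustration.

```r
# Flag outliers with the 1.5 * IQR rule and extreme points with the 3 * IQR rule
flag_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  data.frame(
    value      = x,
    is.outlier = x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr,
    is.extreme = x < q[1] - 3 * iqr   | x > q[2] + 3 * iqr
  )
}

# Hypothetical values, not the project data
x_demo <- c(2, 3, 3, 4, 4, 5, 5, 6, 9, 40)
flags <- flag_outliers(x_demo)

# Show only the rows flagged as outliers
print(flags[flags$is.outlier, ])
```

Because the fences are built from quartiles, a heavily skewed variable such as RE will have many rows flagged even when those values are genuine.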
Test Normality
shapiro_test(data1$Alt)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$Alt 0.776 1.08e-28
shapiro_test(data1$EBIT)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$EBIT 0.961 4.37e-12
shapiro_test(data1$NWC)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$NWC 0.960 2.88e-12
shapiro_test(data1$Sale)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$Sale 0.830 1.50e-25
shapiro_test(data1$equity)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$equity 0.752 7.22e-30
shapiro_test(data1$RE)
## # A tibble: 1 x 3
## variable statistic p.value
## <chr> <dbl> <dbl>
## 1 data1$RE 0.280 4.00e-44
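The six tests above can also be run in one pass. This is a minimal sketch using base R's `shapiro.test()` on simulated data; `demo_vars` and its columns are hypothetical stand-ins, not the project data.

```r
# Simulated stand-in: one roughly normal column and one right-skewed column
set.seed(42)
demo_vars <- data.frame(
  normal_var = rnorm(200),
  skewed_var = rexp(200)
)

# Run the Shapiro-Wilk test on every column and collect W and the p-value
shapiro_summary <- t(sapply(demo_vars, function(col) {
  res <- shapiro.test(col)
  c(statistic = unname(res$statistic), p.value = res$p.value)
}))
print(shapiro_summary)
```

A small p-value rejects the hypothesis of normality, which is the pattern seen for all six variables in the output above.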
Next, we plot Alt against each explanatory variable to inspect the pairwise relationships.
# scatter plot
plot(data1$Alt, data1$EBIT)
[Figure: scatter plot of data1$EBIT (y-axis) against data1$Alt (x-axis)]
# scatter plot
plot(data1$Alt, data1$NWC)
[Figure: scatter plot of data1$NWC (y-axis) against data1$Alt (x-axis)]
# scatter plot
plot(data1$Alt, data1$Sale)
[Figure: scatter plot of data1$Sale (y-axis) against data1$Alt (x-axis)]
# scatter plot
plot(data1$Alt, data1$equity)
[Figure: scatter plot of data1$equity (y-axis) against data1$Alt (x-axis)]
# scatter plot
plot(data1$Alt, data1$RE)
[Figure: scatter plot of data1$RE (y-axis) against data1$Alt (x-axis)]
The scatter plots suggest a relationship between Alt and the explanatory variables (EBIT, NWC, Sale, equity, and RE). Let's check the correlations:
Correlation test
# Correlation
cor.test(data1$Alt,data1$EBIT)
##
## Pearson’s product-moment correlation
##
## data: data1$Alt and data1$EBIT
## t = 5.5406, df = 638, p-value = 4.412e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1390699 0.2869966
## sample estimates:
## cor
## 0.2142615
# Correlation
cor.test(data1$Alt,data1$NWC)
##
## Pearson’s product-moment correlation
##
## data: data1$Alt and data1$NWC
## t = 3.9106, df = 638, p-value = 0.0001019
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.07640358 0.22779825
## sample estimates:
## cor
## 0.1529985
# Correlation
cor.test(data1$Alt,data1$Sale)
##
## Pearson’s product-moment correlation
##
## data: data1$Alt and data1$Sale
## t = 1.4655, df = 638, p-value = 0.1433
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.01966676 0.13481806
## sample estimates:
## cor
## 0.05792239
# Correlation
cor.test(data1$Alt,data1$equity)
##
## Pearson’s product-moment correlation
##
## data: data1$Alt and data1$equity
## t = -10.999, df = 638, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4624391 -0.3320195
## sample estimates:
## cor
## -0.3992471
# Correlation
cor.test(data1$Alt,data1$RE)
##
## Pearson’s product-moment correlation
##
## data: data1$Alt and data1$RE
## t = 2.0613, df = 638, p-value = 0.03968
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.003860737 0.157843147
## sample estimates:
## cor
## 0.08133727
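The numbers reported by `cor.test()` follow directly from the sample correlation and the sample size. As a worked check, the sketch below recomputes the t statistic and the 95 percent confidence interval for the Alt and EBIT pair from the values printed above; the interval uses the Fisher z-transformation, which is the method documented for Pearson's test in R.

```r
r  <- 0.2142615   # sample correlation between Alt and EBIT, from the output above
df <- 638         # degrees of freedom reported by cor.test (n - 2)
n  <- df + 2

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_val <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, lower = ci[1], upper = ci[2]))
```

The same arithmetic explains why the Sale correlation is not significant: its interval, reported above, includes zero.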
The fitted regression is the least-squares estimate ("best guess") of the coefficients in the model
Alt = β0 + β1 · EBIT + β2 · NWC + β3 · Sale + β4 · equity + β5 · RE + ε
p_mod <- lm(Alt ~ EBIT + NWC + Sale + equity + RE, data = data1)
summary(p_mod)
##
## Call:
## lm(formula = Alt ~ EBIT + NWC + Sale + equity + RE, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.964 -3.372 -1.288 1.764 45.715
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.92781 0.66756 10.378 < 2e-16 ***
## EBIT 14.22059 2.84917 4.991 7.77e-07 ***
## NWC 3.70314 1.38313 2.677 0.00761 **
## Sale 0.87019 0.73486 1.184 0.23680
## equity -16.23060 1.47342 -11.016 < 2e-16 ***
## RE 0.20713 0.09188 2.254 0.02452 *
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 6.67 on 634 degrees of freedom
## Multiple R-squared: 0.2184, Adjusted R-squared: 0.2122
## F-statistic: 35.42 on 5 and 634 DF, p-value: < 2.2e-16
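Several figures in this summary are linked by simple arithmetic. The sketch below recomputes the EBIT t value, the adjusted R-squared, and the F-statistic from the printed values; because the inputs are already rounded, the results agree with the output only to about two decimal places.

```r
# t value = Estimate / Std. Error (EBIT row of the coefficient table)
estimate  <- 14.22059
std_error <- 2.84917
t_value   <- estimate / std_error

# Sample size and number of predictors implied by the degrees of freedom
p  <- 5          # predictors: EBIT, NWC, Sale, equity, RE
n  <- 634 + p + 1
r2 <- 0.2184     # Multiple R-squared from the output above

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic on p and n - p - 1 degrees of freedom
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

print(round(c(t_value = t_value, adj_r2 = adj_r2, f_stat = f_stat), 3))
```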
From the model output, the estimated intercept (β0) is 6.92781, and the estimated slopes are 14.22059 for EBIT, 3.70314 for NWC, 0.87019 for Sale, −16.23060 for equity, and 0.20713 for RE. That much precision is unnecessary, so we round to two decimal places, giving the fitted model
Alt = 6.93 + 14.22 · EBIT + 3.70 · NWC + 0.87 · Sale − 16.23 · equity + 0.21 · RE
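• EBIT coefficient (14.22): holding NWC, Sale, equity, and RE fixed, a one-unit increase in EBIT is associated with an average increase of about 14.22 in Alt;
• equity coefficient (−16.23): holding the other variables fixed, a one-unit increase in equity is associated with an average decrease of about 16.23 in Alt;
• Sale coefficient (0.87): not statistically significant at the 5% level (p = 0.237);
• Intercept (6.93): the predicted Alt when all five explanatory variables equal zero.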
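To make the fitted equation concrete, the sketch below plugs one set of inputs into the coefficients reported in the summary output above. The `firm` values are hypothetical and chosen only for illustration; they are not a row of the project data.

```r
# Coefficient estimates copied from the summary(p_mod) output
coefs <- c(
  `(Intercept)` = 6.92781,
  EBIT          = 14.22059,
  NWC           = 3.70314,
  Sale          = 0.87019,
  equity        = -16.23060,
  RE            = 0.20713
)

# Hypothetical firm (illustrative values only)
firm <- c(EBIT = 0.10, NWC = 0.20, Sale = 0.70, equity = 0.10, RE = 0.30)

# Predicted Alt = intercept + sum of coefficient * value
pred_alt <- coefs[["(Intercept)"]] + sum(coefs[names(firm)] * firm)
print(pred_alt)
```

With a fitted model object available, `predict(p_mod, newdata = as.data.frame(t(firm)))` would give the same kind of prediction without copying coefficients by hand.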
[Figure: Residuals vs Fitted plot for lm(Alt ~ EBIT + NWC + Sale + equity + RE); x-axis: Fitted values, y-axis: Residuals; observations 422, 486, and 614 are labelled]
[Figure: Q-Q Residuals plot for lm(Alt ~ EBIT + NWC + Sale + equity + RE); x-axis: Theoretical Quantiles, y-axis: Standardized residuals; observations 422, 486, and 614 are labelled]
shapiro.test(p_mod$residuals)
##
## Shapiro-Wilk normality test
##
## data: p_mod$residuals
## W = 0.81856, p-value < 2.2e-16
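The residual checks used here can be tried on data where the answer is known in advance. The following is a minimal sketch on simulated data with deliberately right-skewed errors; `sim_fit`, `x1`, `x2`, and `y` are hypothetical names, and the setup is an assumption made for illustration rather than a reconstruction of the project model.

```r
# Simulate a linear model whose errors are right-skewed (centred exponential)
set.seed(123)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + (rexp(n) - 1)

sim_fit <- lm(y ~ x1 + x2)

# Diagnostic plots: residuals vs fitted (which = 1) and normal Q-Q (which = 2)
plot(sim_fit, which = 1)
plot(sim_fit, which = 2)

# Shapiro-Wilk test on the residuals; a small p-value rejects normality
sw <- shapiro.test(residuals(sim_fit))
print(sw)
```

A Q-Q plot that bends upward in the right tail, together with a small Shapiro-Wilk p-value, is the same pattern the project residuals show.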
hist(p_mod$residuals)
[Figure: Histogram of p_mod$residuals; x-axis: p_mod$residuals, y-axis: Frequency]