R Class 21

The document presents two solutions involving linear regression analyses in R. Solution 1 analyzes the relationship between failure time, material type, and rainfall, revealing that material type is significant while rainfall is not. Solution 2 examines the impact of state, class, gender, and age on payment amounts, with significant coefficients for certain states and the intercept.

Uploaded by

sarthakgarg0401

SOLUTION 1

> d1<-read.csv(file.choose(),header = T);d1 # to open data file in R

X Failure.Time Material.Type Rainfall
1 1 0.093496420 1 0.2049910
2 2 0.064299790 2 0.2459758
3 3 0.037432940 3 0.1037756
4 4 0.036485400 4 0.3138880
5 5 0.080959110 1 0.2020806
6 6 0.002198732 2 0.2545738
7 7 0.028674680 3 0.2701437
8 8 0.032782080 4 0.3307581
9 9 0.074037110 1 0.2911203
10 10 0.059623200 2 0.2500904
11 11 0.030189960 3 0.1641475
12 12 0.030789370 4 0.4265417
13 13 0.071302530 1 0.1216770
14 14 0.046903810 2 0.3412739
15 15 0.033226010 3 0.3929279
16 16 0.038243150 4 0.3974997
17 17 0.064787050 1 0.2993994
18 18 0.042787140 2 0.3332971
19 19 0.033838870 3 0.2108672
20 20 0.025070570 4 0.2722977

# to fit the linear Model

> fit<-lm(d1$Failure.Time~d1$Material.Type+d1$Rainfall)
> fit

Call:
lm(formula = d1$Failure.Time ~ d1$Material.Type + d1$Rainfall)

Coefficients:
(Intercept) d1$Material.Type d1$Rainfall
0.081086 -0.014499 0.005591

> summary(fit)

Call:
lm(formula = d1$Failure.Time ~ d1$Material.Type + d1$Rainfall)

Residuals:
Min 1Q Median 3Q Max
-0.051313 -0.006694 0.002246 0.008590 0.025763
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.081086 0.012659 6.406 6.52e-06 *** pvalue not greater than 0.05
d1$Material.Type -0.014499 0.003575 -4.055 0.000822 *** pvalue not greater than 0.05
d1$Rainfall 0.005591 0.046745 0.120 0.906200 pvalue greater than 0.05
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01627 on 17 degrees of freedom

Multiple R-squared: 0.5326, Adjusted R-squared: 0.4776
F-statistic: 9.687 on 2 and 17 DF, p-value: 0.001556

a) From analysing the R output, we see that the fitted linear model is:

$\hat{y}_i = 0.081086 - 0.014499\,x_{1} + 0.005591\,x_{2}$

where $x_1$ is the ‘material type’ variable, and $x_2$ is the ‘rainfall’ variable.

FUTURE TRACK Edutech Pvt Ltd | 52, First floor Mall Road, Kingsway camp, Delhi-09|+9910024949, 011 -45024949.
b) The R output shows that the ‘material type’ parameter is significantly different from zero at the 0.1% level: its p-value (0.000822) is below 0.05, so H0 (the coefficient is zero) is rejected. [1]
The ‘rainfall’ parameter is not significantly different from zero: its p-value (0.9062) is greater than 0.05, so H0 is not rejected.
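As a cross-check on parts (a) and (b), the same model can be fitted with the `data =` argument instead of `d1$` prefixes, and the p-values can be read programmatically. This is a minimal sketch that assumes the 20 rows printed above are the complete data set; the names `p_values` and `significant` are introduced here for illustration only.

```r
# Data transcribed from the listing above (assumed to be the full data set).
d1 <- data.frame(
  Failure.Time = c(0.093496420, 0.064299790, 0.037432940, 0.036485400,
                   0.080959110, 0.002198732, 0.028674680, 0.032782080,
                   0.074037110, 0.059623200, 0.030189960, 0.030789370,
                   0.071302530, 0.046903810, 0.033226010, 0.038243150,
                   0.064787050, 0.042787140, 0.033838870, 0.025070570),
  Material.Type = rep(1:4, times = 5),
  Rainfall = c(0.2049910, 0.2459758, 0.1037756, 0.3138880,
               0.2020806, 0.2545738, 0.2701437, 0.3307581,
               0.2911203, 0.2500904, 0.1641475, 0.4265417,
               0.1216770, 0.3412739, 0.3929279, 0.3974997,
               0.2993994, 0.3332971, 0.2108672, 0.2722977)
)

# Same model as above, written with the formula + data interface.
fit <- lm(Failure.Time ~ Material.Type + Rainfall, data = d1)

# Column "Pr(>|t|)" of the coefficient table holds the p-values.
p_values <- coef(summary(fit))[, "Pr(>|t|)"]
significant <- p_values < 0.05   # TRUE where H0 is rejected at the 5% level

print(round(coef(fit), 6))
print(significant)
```

Using `data = d1` also gives shorter coefficient names (`Material.Type` rather than `d1$Material.Type`), which makes the output easier to read.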

(iii) (a)
# plot the residuals
> plot(residuals(fit))

(b) The residuals exhibit a fairly random scatter around zero (independent) apart from the 6th point.

# remove the sixth row

> d1new<-d1[-6,];d1new # the negative index drops row 6 and keeps all other rows

X Failure.Time Material.Type Rainfall

1 1 0.09349642 1 0.2049910
2 2 0.06429979 2 0.2459758
3 3 0.03743294 3 0.1037756
4 4 0.03648540 4 0.3138880
5 5 0.08095911 1 0.2020806
7 7 0.02867468 3 0.2701437
8 8 0.03278208 4 0.3307581
9 9 0.07403711 1 0.2911203
10 10 0.05962320 2 0.2500904
11 11 0.03018996 3 0.1641475
12 12 0.03078937 4 0.4265417
13 13 0.07130253 1 0.1216770
14 14 0.04690381 2 0.3412739
15 15 0.03322601 3 0.3929279
16 16 0.03824315 4 0.3974997
17 17 0.06478705 1 0.2993994
18 18 0.04278714 2 0.3332971
19 19 0.03383887 3 0.2108672
20 20 0.02507057 4 0.2722977

(b) The residuals plot indicates that data point 6 is an outlier.

refit<-lm(d1new$Failure.Time~d1new$Material.Type+d1new$Rainfall);refit
Call:
lm(formula = d1new$Failure.Time ~ d1new$Material.Type + d1new$Rainfall)
Coefficients:
(Intercept) d1new$Material.Type d1new$Rainfall
0.086629 -0.015576 0.005152

summary(refit)

Call:
lm(formula = d1new$Failure.Time ~ d1new$Material.Type + d1new$Rainfall)

Residuals:
Min 1Q Median 3Q Max
-0.0144060 -0.0082529 -0.0003768 0.0071557 0.0213878

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.086629 0.008094 10.703 1.06e-08 ***
d1new$Material.Type -0.015576 0.002275 -6.846 3.93e-06 ***
d1new$Rainfall 0.005152 0.029621 0.174 0.864
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01031 on 16 degrees of freedom

Multiple R-squared: 0.7756, Adjusted R-squared: 0.7475
F-statistic: 27.64 on 2 and 16 DF, p-value: 6.438e-06
The adjusted $R^2$ statistic for the model fitted to the data with the outlier removed is 0.7475. This shows an
improved fit relative to the model fitted to all 20 data points, which had an adjusted $R^2$ statistic of 0.4776.
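The two adjusted R-squared values can be reproduced from the multiple R-squared values in the summaries. A small sketch, assuming the usual definition 1 - (1 - R²)(n - 1)/(n - p - 1) with p = 2 covariates; the helper name `adj_r2` is hypothetical.

```r
# Adjusted R-squared from multiple R-squared, sample size n and p covariates.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

all_points <- adj_r2(0.5326, n = 20, p = 2)  # model fitted to all 20 points
no_outlier <- adj_r2(0.7756, n = 19, p = 2)  # model with row 6 removed

print(round(c(all_points = all_points, no_outlier = no_outlier), 4))
```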

> gfit<-glm(d1new$Failure.Time~d1new$Material.Type+d1new$Rainfall,family= Gamma)

> gfit

Call: glm(formula = d1new$Failure.Time ~ d1new$Material.Type + d1new$Rainfall,

family = Gamma)

Coefficients:
(Intercept) d1new$Material.Type d1new$Rainfall
6.1921 6.7286 0.4814

Degrees of Freedom: 18 Total (i.e. Null); 16 Residual

Null Deviance: 2.998
Residual Deviance: 0.485 AIC: -125.5

(b) The fitted model is:

# extract the coefficients

coef(gfit)

(Intercept) d1new$Material.Type d1new$Rainfall

6.1920558 6.7286452 0.4814427

$\hat{\eta} = 1/\hat{\mu} = 6.1920558 + 6.7286452\,x_{1} + 0.4814427\,x_{2}$

where x1 is the ‘material type’ variable, and x2 is the ‘rainfall’ variable.
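Because the Gamma family uses the inverse link by default, the linear predictor has to be inverted to get a fitted mean failure time. The sketch below does this by hand from the coefficients above; the function name `fitted_mean` and the example inputs (material type 1, rainfall 0.2) are chosen here purely for illustration.

```r
# Coefficients of the Gamma GLM, copied from coef(gfit) above.
b <- c(intercept = 6.1920558, material = 6.7286452, rainfall = 0.4814427)

# Inverse link: eta = 1/mu, so mu = 1/eta.
fitted_mean <- function(material, rainfall) {
  eta <- b[["intercept"]] + b[["material"]] * material + b[["rainfall"]] * rainfall
  1 / eta
}

mu_type1 <- fitted_mean(material = 1, rainfall = 0.2)
mu_type4 <- fitted_mean(material = 4, rainfall = 0.2)
print(c(type1 = mu_type1, type4 = mu_type4))
```

Note that a positive coefficient on the inverse scale corresponds to a lower fitted mean, which agrees in direction with the negative ‘material type’ coefficient in the linear model.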

# review the model fit

> summary(gfit)
Call:
glm(formula = d1new$Failure.Time ~ d1new$Material.Type + d1new$Rainfall,
family = Gamma)

Deviance Residuals:
Min 1Q Median 3Q Max
-0.26231 -0.14156 -0.03338 0.12850 0.25185

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.1921 2.6184 2.365 0.031 *
d1new$Material.Type 6.7286 0.8359 8.050 5.12e-07 ***
d1new$Rainfall 0.4814 10.6244 0.045 0.964
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.03085425)

Null deviance: 2.99847 on 18 degrees of freedom

Residual deviance: 0.48503 on 16 degrees of freedom
AIC: -125.54

Number of Fisher Scoring iterations: 4

Reviewing the model fit output from R, the ‘rainfall’ parameter is not significantly different to zero, whereas the
‘material type’ parameter is significant at the 0.1% level.

SOLUTION 2
> # Solution 2
> ac<-read.csv(file.choose(),header=T);ac
STATE CLASS GENDER AGE PAID
1 STATE 01 C6 M 43 2364.696
2 STATE 01 F6 M 43 18787.967
3 STATE 01 F6 M 43 27115.745
4 STATE 02 C1 M 43 15288.492
5 STATE 02 C11 M 43 2265.707
(There is no need to copy the complete data if the data set is large.)

> reg<-lm(ac$PAID~ac$STATE+ac$CLASS+ac$GENDER+ac$AGE)
> reg
(alternative: reg<-lm(PAID~.,data=ac) # the dot means all other columns of ac; write the response as PAID, not ac$PAID, so that it is excluded from the covariates)

Call:
lm(formula = ac$PAID ~ ac$STATE + ac$CLASS + ac$GENDER + ac$AGE)

Coefficients:
(Intercept) ac$STATESTATE 02 ac$STATESTATE 03 ac$STATESTATE 04 ac$STATESTATE 06
19818.12 -2306.41 -580.09 -689.08 440.79
ac$STATESTATE 07 ac$STATESTATE 10 ac$STATESTATE 12 ac$STATESTATE 14 ac$STATESTATE 15
-1254.29 2275.25 -752.99 -404.69 -4791.86
ac$STATESTATE 17 ac$CLASSC11 ac$CLASSC6 ac$CLASSF6 ac$GENDERM
-883.67 -11743.95 -14833.37 -225.16 -1193.01
ac$AGE
15.76
> summary(reg)

Call:
lm(formula = ac$PAID ~ ac$STATE + ac$CLASS + ac$GENDER + ac$AGE)

Residuals:
Min 1Q Median 3Q Max
-10462 -2276 119 1611 36377

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19818.12 1391.58 14.242 < 2e-16 ***
ac$STATESTATE 02 -2306.41 658.69 -3.502 0.000477 ***
ac$STATESTATE 03 -580.09 761.36 -0.762 0.446242
ac$STATESTATE 04 -689.08 702.04 -0.982 0.326495
ac$STATESTATE 06 440.79 752.27 0.586 0.558010
ac$STATESTATE 07 -1254.29 837.22 -1.498 0.134318
ac$STATESTATE 10 2275.25 885.44 2.570 0.010284 *
ac$STATESTATE 12 -752.99 850.10 -0.886 0.375897
ac$STATESTATE 14 -404.69 842.90 -0.480 0.631216
ac$STATESTATE 15 -4791.86 623.56 -7.685 2.87e-14 ***
ac$STATESTATE 17 -883.67 704.58 -1.254 0.209982
ac$CLASSC11 -11743.95 430.60 -27.274 < 2e-16 ***
ac$CLASSC6 -14833.37 410.84 -36.105 < 2e-16 ***
ac$CLASSF6 -225.16 517.68 -0.435 0.663670
ac$GENDERM -1193.01 215.50 -5.536 3.69e-08 ***
ac$AGE 15.76 24.10 0.654 0.513418
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3943 on 1401 degrees of freedom

Multiple R-squared: 0.6886, Adjusted R-squared: 0.6852
F-statistic: 206.5 on 15 and 1401 DF, p-value: < 2.2e-16

Explanation

R-Squared: 68.86% of the variation in the claims paid is explained by state, rating class, gender and age.

Adjusted R-Squared: 68.52%. It adjusts for the number of terms in the model, so we use it to compare the
goodness-of-fit of regression models that contain differing numbers of independent variables.

p-value of the model: the p-value is < 2.2e-16, which is less than 0.05, and hence the null hypothesis “There is no
significant linear relationship between the given independent variables X and the dependent variable Y” is rejected
at the 5% level of significance. Using this model to predict the dependent variable is better than simply using its
mean as the predictor.

p-value of the coefficients: While the model is significant overall, some of the individual variables may be
insignificant. State 1, rating class C1 and gender female are taken as the base levels, and their effects are absorbed
into the intercept. Relative to state 1, the coefficients of state 2 and state 15 (negative) and state 10 (positive) are
significantly different from zero at the 5% significance level. Similarly, rating classes C11 and C6 have
significantly negative coefficients relative to C1, indicating that the claims paid for those two rating classes are
significantly lower than for C1. Males have significantly lower claims paid than females at the 5% significance
level.
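To make the role of the base levels concrete, the sketch below builds one fitted value by hand from the coefficients in the summary. The chosen profile (state 2, class C11, male, age 43) and the variable names are picked here for illustration; each non-base level simply adds its coefficient to the intercept.

```r
# Coefficients copied from summary(reg) above.
intercept <- 19818.12   # state 1, class C1, female, before the age term
state_02  <- -2306.41
class_c11 <- -11743.95
gender_m  <- -1193.01
age_coef  <- 15.76

# Fitted claim for a 43-year-old male in state 2 with rating class C11.
fitted_claim <- intercept + state_02 + class_c11 + gender_m + age_coef * 43
print(fitted_claim)

# The same profile at the base levels differs only by the dropped dummy terms.
base_claim <- intercept + age_coef * 43
print(base_claim - fitted_claim)
```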

> anova(reg)
Analysis of Variance Table

Response: ac$PAID
Df Sum Sq Mean Sq F value Pr(>F)
ac$STATE 10 9.8027e+09 9.8027e+08 63.0629 < 2.2e-16 ***
ac$CLASS 3 3.7869e+10 1.2623e+10 812.0763 < 2.2e-16 ***
ac$GENDER 1 4.7271e+08 4.7271e+08 30.4106 4.158e-08 ***
ac$AGE 1 6.6423e+06 6.6423e+06 0.4273 0.5134
Residuals 1401 2.1778e+10 1.5544e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From the ANOVA table, we can infer that except Age, all other variables are significant in prediction of claims paid

(iii)
> summary(glmmodel)

Call:
glm(formula = indices$Sensex_direction ~ indices$BM + indices$CD +
indices$EN + indices$FM + indices$FI + indices$HC + indices$IN +
indices$IT + indices$TE + indices$UT, family = binomial(link = "logit"))

Deviance Residuals:
Min 1Q Median 3Q Max
-2.27544 -0.00117 0.00000 0.01354 1.75651

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.0086 0.7315 -1.379 0.16796
indices$BM 7.7977 16.5255 0.472 0.63703
indices$CD -87.5335 42.6785 -2.051 0.04027 *
indices$EN 93.9675 38.3193 2.452 0.01420 *
indices$FM 41.1745 20.1436 2.044 0.04095 *
indices$FI 172.8807 60.8192 2.843 0.00448 **
indices$HC -6.4294 13.9394 -0.461 0.64463
indices$IN 4.1735 18.2152 0.229 0.81877
indices$IT 78.3494 30.9307 2.533 0.01131 *
indices$TE 29.9111 13.4184 2.229 0.02581 *
indices$UT -14.4767 23.0602 -0.628 0.53015
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 223.213 on 163 degrees of freedom

Residual deviance: 32.905 on 153 degrees of freedom
AIC: 54.905

Number of Fisher Scoring iterations: 11

The sectors that have significantly impacted the direction of Sensex returns at the 5%
significance level are CD, EN, FI, FM, IT and TE. Only FI is also significant at the 1%
significance level.
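With a logit link, the linear predictor is turned into a fitted probability with the logistic function, which R provides as `plogis()`. The sketch below evaluates it at the intercept alone, i.e. assuming all sector returns are zero; the probability refers to whichever outcome is coded as 1 in `Sensex_direction`, which the output above does not state.

```r
# Intercept copied from summary(glmmodel) above.
eta0 <- -1.0086

# Inverse logit: p = exp(eta) / (1 + exp(eta)).
p0 <- plogis(eta0)
print(round(p0, 4))

# The odds are exp(eta); a coefficient b multiplies the odds by exp(b)
# for a one-unit increase in that covariate.
odds0 <- exp(eta0)
print(round(odds0, 4))
```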

> # Plot of residuals vs. fitted values
> plot(reg$fitted.values,reg$residuals,col=c("blue","red"))

The plot is used to detect non-linearity, unequal error variances, and outliers.
The residuals do not bounce randomly around the 0 line. This suggests that the
assumption that the relationship is linear is not reasonable.
The residuals do not form a horizontal band around the 0 line. This suggests that the
variances of the error terms are not equal, i.e. the errors exhibit heteroscedasticity.
A few residuals stand out from the basic random pattern of residuals. This suggests
that there are outliers.

# QQ Plot
> qqnorm(reg$residuals)

A Q-Q plot is a scatterplot created by plotting two sets of quantiles against one
another. If both sets of quantiles came from the same distribution, we should see the
points forming a line that is roughly straight. Here they do not, indicating a departure of the
residuals from normality. Thus linear regression on the untransformed claims may not be a
good fit to the data, which is the reason for looking for a better model.

(iv) Kurtosis is not in the course, but it has been asked in the IAI exam.

# Checking for the normality of Auto Claims Paid vs. Logarithm of Auto Claims Paid
# Writing functions for skewness and (excess) kurtosis

skew<-function(x)mean((x-mean(x))^3)/sd(x)^3 # [2]
kurt<-function(x)(mean((x-mean(x))^4)/sd(x)^4)-3 # [2]
skew(AutoClaims$PAID) # [0.5]
## [1] 2.619422
kurt(AutoClaims$PAID) # [0.5]
## [1] 9.20876
skew(log(AutoClaims$PAID)) # [0.5]
## [1] 0.4528057
kurt(log(AutoClaims$PAID)) # [0.5]
## [1] -0.787689
The skewness and kurtosis of log(claims) are closer to zero than those of the actual
claims paid, which suggests that linear regression may be appropriate with the log of
claims paid as the dependent variable. [1]
[7]
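The effect of the log transform on skewness and kurtosis can be seen on simulated data. This is only an illustrative sketch: the `claims` vector below is a hypothetical right-skewed (lognormal) sample, not the AutoClaims data, and the two helper functions are the same ones defined above.

```r
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
kurt <- function(x) mean((x - mean(x))^4) / sd(x)^4 - 3

set.seed(42)
# Hypothetical claim amounts: lognormal, so log(claims) is normal by construction.
claims <- rlnorm(2000, meanlog = 8, sdlog = 1)

raw    <- c(skew = skew(claims),      kurt = kurt(claims))
logged <- c(skew = skew(log(claims)), kurt = kurt(log(claims)))

# Both measures should move towards 0 after taking logs.
print(round(rbind(raw, logged), 3))
```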

# Using Natural Logarithm of the claims paid

model2<-lm(log(PAID)~.,data = AutoClaims)
summary(model2)

anova(model2)
Analysis of Variance Table
Response: log(PAID)
Df Sum Sq Mean Sq F value Pr(>F)
STATE 10 246.26 24.626 107.6969 < 2.2e-16 ***
CLASS 3 690.09 230.031 1006.0090 < 2.2e-16 ***
GENDER 1 10.88 10.882 47.5908 7.927e-12 ***
AGE 1 0.12 0.123 0.5384 0.4632
Residuals 1401 320.35 0.229
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## Call:
## lm(formula = log(PAID) ~ ., data = AutoClaims)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.96098 -0.34264 -0.05047 0.36828 1.08237
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.896308 0.168777 58.635 < 2e-16 ***
## STATESTATE 02 -0.154804 0.079889 -1.938 0.0529 .
## STATESTATE 03 0.110585 0.092342 1.198 0.2313
## STATESTATE 04 0.049554 0.085147 0.582 0.5607
## STATESTATE 06 0.116190 0.091239 1.273 0.2031
## STATESTATE 07 0.142721 0.101543 1.406 0.1601
## STATESTATE 10 0.098014 0.107391 0.913 0.3616
## STATESTATE 12 0.027982 0.103105 0.271 0.7861
## STATESTATE 14 0.090316 0.102231 0.883 0.3771
## STATESTATE 15 -0.645918 0.075628 -8.541 < 2e-16 ***
## STATESTATE 17 0.004611 0.085455 0.054 0.9570
## CLASSC11 -1.203098 0.052225 -23.037 < 2e-16 ***
## CLASSC6 -1.988743 0.049829 -39.911 < 2e-16 ***
## CLASSF6 -0.034909 0.062787 -0.556 0.5783
## GENDERM -0.180923 0.026136 -6.922 6.75e-12 ***
## AGE 0.002145 0.002923 0.734 0.4632
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4782 on 1401 degrees of freedom
## Multiple R-squared: 0.7473, Adjusted R-squared: 0.7446
## F-statistic: 276.2 on 15 and 1401 DF, p-value: < 2.2e-16
Key Differences
1. R-Squared and Adjusted R-Squared improved, and hence the model is a better
fit compared to the initial model. [1.5]

2. While the significance level of a few factor coefficients relative to the
base categories changed, the overall set of significant variables did not change, which can
be inferred from the ANOVA table. [1.5]
[6]
v)
# Using Interaction effects in the model
model3<-lm(PAID~.+STATE:CLASS+STATE:GENDER+CLASS:GENDER,data = AutoClaims)
summary(model3) # [5]
##
## Call:
## lm(formula = PAID ~ . + STATE:CLASS + STATE:GENDER + CLASS:GENDER,
## data = AutoClaims)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13110.3 -1475.9 -377.5 1250.5 20442.8
##
## Coefficients: (3 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24373.09 3236.06 7.532 9.08e-14 ***
## STATESTATE 02 -7868.94 3269.51 -2.407 0.016227 *
## STATESTATE 03 -3695.87 3422.76 -1.080 0.280427
## STATESTATE 04 6883.00 3545.51 1.941 0.052425 .
## STATESTATE 06 5428.58 3262.57 1.664 0.096363 .
## STATESTATE 07 -979.80 1184.59 -0.827 0.408314
## STATESTATE 10 7340.08 3546.35 2.070 0.038664 *
## STATESTATE 12 1048.26 3382.21 0.310 0.756659
## STATESTATE 14 -2796.37 3538.40 -0.790 0.429494
## STATESTATE 15 -14038.72 3164.04 -4.437 9.86e-06 ***
## STATESTATE 17 -4266.43 3640.98 -1.172 0.241490
## CLASSC11 -15321.38 3534.31 -4.335 1.56e-05 ***
## CLASSC6 -20741.82 3262.53 -6.358 2.79e-10 ***
## CLASSF6 4075.85 3386.90 1.203 0.229026
## GENDERM -5508.80 1305.95 -4.218 2.62e-05 ***
## AGE 23.06 19.35 1.192 0.233581
## STATESTATE 02:CLASSC11 4114.13 3666.38 1.122 0.262008
## STATESTATE 03:CLASSC11 2022.51 3802.06 0.532 0.594847
## STATESTATE 04:CLASSC11 -7719.04 3904.17 -1.977 0.048229 *
## STATESTATE 06:CLASSC11 -6209.18 3743.46 -1.659 0.097412 .
## STATESTATE 07:CLASSC11 -1049.15 1827.01 -0.574 0.565899
## STATESTATE 10:CLASSC11 -6795.03 3956.71 -1.717 0.086144 .
## STATESTATE 12:CLASSC11 -3756.14 3939.41 -0.953 0.340517
## STATESTATE 14:CLASSC11 1668.66 3927.62 0.425 0.671012
## STATESTATE 15:CLASSC11 7776.13 3581.01 2.171 0.030066 *
## STATESTATE 17:CLASSC11 2227.20 4011.07 0.555 0.578806
## STATESTATE 02:CLASSC6 5677.35 3416.32 1.662 0.096777 .
## STATESTATE 03:CLASSC6 2122.55 3605.64 0.589 0.556177
## STATESTATE 04:CLASSC6 -8231.17 3692.27 -2.229 0.025957 *
## STATESTATE 06:CLASSC6 -6151.60 3417.36 -1.800 0.072066 .
## STATESTATE 07:CLASSC6 NA NA NA NA
## STATESTATE 10:CLASSC6 -8117.07 3687.43 -2.201 0.027884 *

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3111 on 1361 degrees of freedom
## Multiple R-squared: 0.8116, Adjusted R-squared: 0.804
## F-statistic: 106.6 on 55 and 1361 DF, p-value: < 2.2e-16
Interpretation [5]
anova(model3)
Analysis of Variance Table
Response: PAID
Df Sum Sq Mean Sq F value Pr(>F)
STATE 10 9.8027e+09 9.8027e+08 101.2664 < 2.2e-16 ***
CLASS 3 3.7869e+10 1.2623e+10 1304.0318 < 2.2e-16 ***
GENDER 1 4.7271e+08 4.7271e+08 48.8334 4.350e-12 ***
AGE 1 6.6423e+06 6.6423e+06 0.6862 0.407612
STATE:CLASS 27 7.8659e+09 2.9133e+08 30.0960 < 2.2e-16 ***
STATE:GENDER 10 2.7570e+08 2.7570e+07 2.8482 0.001634 **
CLASS:GENDER 3 4.6128e+08 1.5376e+08 15.8841 3.733e-10 ***
Residuals 1361 1.3175e+10 9.6801e+06
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

1. R-Squared and Adjusted R-Squared increased to above 80%, and hence the model is a
better fit compared to the earlier models. [2]

2. The interaction effects between a few classes and states came out to be very
significant (State 6 and Class F6 came out to be significantly negative). State 15 was
significantly negative when the main effects alone were considered, but its
interactions with Class C6 and Class C11 significantly offset that negative effect. The
interaction coefficient between State 15 and Class F6 is not significant, indicating
that the claims paid are significantly lower when the state is 15 and the class is F6
than for the other rating classes. The interaction effects therefore make it possible
to dig deeper into the relationships. Similarly, the main effect of Gender (male) is
significantly negative relative to females, but that is offset to some extent for some
states (2, 3, 7, 15) and for some rating classes (C6), whereas it is further negative in
the case of F6. So the differences can be brought out more clearly by considering the
interaction effects, improving the predictive power of the model. [1]

3. The ANOVA table for the model suggests that, except for age, all the main effects and
their interaction effects are significant at the 5% significance level, indicating their
contribution to the predictive power of the model. [2]
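The joint contribution of the three interaction terms can also be checked with a single partial F-test, using only the residual sums of squares already shown in the two ANOVA tables. This is a sketch under the assumption that the main-effects model and `model3` are fitted to the same 1417 rows; the variable names are chosen here for illustration.

```r
# Residual sum of squares and degrees of freedom, copied from the ANOVA tables.
rss_main <- 2.1778e10; df_main <- 1401   # main effects only
rss_int  <- 1.3175e10; df_int  <- 1361   # with the three interaction terms

# Partial F statistic for adding all interaction terms at once.
extra_df <- df_main - df_int             # 27 + 10 + 3 = 40
f_stat <- ((rss_main - rss_int) / extra_df) / (rss_int / df_int)
p_val  <- pf(f_stat, extra_df, df_int, lower.tail = FALSE)

print(round(f_stat, 2))
print(p_val)
```

With both models available in the session, `anova()` called on the two fitted model objects would give the same comparison directly.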

Solution 3:
# Load the data file

indices<-read.csv(file.choose(),header=T);indices

Compute the Pearson correlation coefficients and find the most correlated and least
correlated pairs

correlation<-cor(indices[,3:12], method = "pearson") # [1]

correlation<-round(correlation,3) # [1]

correlation # [1]
## BM CD EN FM FI HC IN IT TE UT
## BM 1.000 0.882 0.823 0.534 0.819 0.605 0.898 0.447 0.646 0.861
## CD 0.882 1.000 0.783 0.581 0.878 0.637 0.915 0.434 0.696 0.847
## EN 0.823 0.783 1.000 0.446 0.745 0.506 0.799 0.359 0.622 0.793
## FM 0.534 0.581 0.446 1.000 0.485 0.510 0.511 0.303 0.410 0.520
## FI 0.819 0.878 0.745 0.485 1.000 0.502 0.902 0.349 0.623 0.838
## HC 0.605 0.637 0.506 0.510 0.502 1.000 0.588 0.525 0.489 0.530
## IN 0.898 0.915 0.799 0.511 0.902 0.588 1.000 0.370 0.676 0.882
## IT 0.447 0.434 0.359 0.303 0.349 0.525 0.370 1.000 0.291 0.317
## TE 0.646 0.696 0.622 0.410 0.623 0.489 0.676 0.291 1.000 0.669
## UT 0.861 0.847 0.793 0.520 0.838 0.530 0.882 0.317 0.669 1.000

(ii) Read the pairs off the correlation matrix manually:
min_cor_pair "IT TE"
max_cor_pair "CD IN"
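Part (ii) can also be done in code rather than by eye. The sketch below uses a 4 x 4 sub-matrix of the correlations printed above (sectors CD, IN, IT, TE) to keep the example short; the helper `pair_at` is hypothetical, and the assumption is that the same call would work on the full `correlation` matrix.

```r
labs <- c("CD", "IN", "IT", "TE")
cm <- matrix(c(1.000, 0.915, 0.434, 0.696,
               0.915, 1.000, 0.370, 0.676,
               0.434, 0.370, 1.000, 0.291,
               0.696, 0.676, 0.291, 1.000),
             nrow = 4, byrow = TRUE, dimnames = list(labs, labs))

# Return the pair of labels at the min (or max) off-diagonal correlation.
pair_at <- function(cm, pick) {
  off <- cm
  off[!upper.tri(off)] <- NA          # keep each pair once and drop the diagonal
  idx <- which(off == pick(off, na.rm = TRUE), arr.ind = TRUE)[1, ]
  paste(rownames(cm)[idx["row"]], colnames(cm)[idx["col"]])
}

min_cor_pair <- pair_at(cm, min)
max_cor_pair <- pair_at(cm, max)
print(c(min = min_cor_pair, max = max_cor_pair))
```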

(iii)
Perform a Principal component analysis of the sectoral return values
PCA_corr<-princomp(indices[,3:12])
summary(PCA_corr) # [4]
## Importance of components:
## Comp.1 Comp.2 Comp.3 Comp.4
## Standard deviation 0.2106142 0.06763728 0.05825406 0.04631607
## Proportion of Variance 0.7294045 0.07522554 0.05580143 0.03527414
## Cumulative Proportion 0.7294045 0.80463008 0.86043151 0.89570565
## Comp.5 Comp.6 Comp.7 Comp.8
## Standard deviation 0.04255566 0.03735608 0.03326497 0.03093065
## Proportion of Variance 0.02977884 0.02294646 0.01819564 0.01573154
## Cumulative Proportion 0.92548448 0.94843094 0.96662658 0.98235812
## Comp.9 Comp.10
## Standard deviation 0.023803102 0.022500987
## Proportion of Variance 0.009316657 0.008325228
## Cumulative Proportion 0.991674772 1.000000000

Alternatively instead of using princomp, the student can use prcomp as well.

PCA_corr_1<-prcomp(indices[,3:12])
summary(PCA_corr_1)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6
## Standard deviation 0.2113 0.06784 0.05843 0.04646 0.04269 0.0377
## Proportion of Variance 0.7294 0.07523 0.05580 0.03527 0.02978 0.0225
## Cumulative Proportion 0.7294 0.80463 0.86043 0.89571 0.92548 0.9483
## PC7 PC8 PC9 PC10
## Standard deviation 0.03337 0.03103 0.02388 0.02257
## Proportion of Variance 0.01820 0.01573 0.00932 0.00833
## Cumulative Proportion 0.96663 0.98236 0.99167 1.00000

proportion of total variation explained by the first two principal components

sum(PCA_corr$sdev[1:2]^2)/sum(PCA_corr$sdev^2)
## [1] 0.8046301 [1]

OR Alternatively
sum(PCA_corr_1$sdev[1:2]^2)/sum(PCA_corr_1$sdev^2)
## [1] 0.8046301
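The two functions report slightly different standard deviations (for example 0.2106 versus 0.2113 for the first component above) but identical proportions of variance. The sketch below illustrates why, using the built-in `USArrests` data purely as a stand-in for the sector returns: `princomp` divides by n and `prcomp` by n - 1, so the standard deviations differ by a constant factor that cancels in the proportions.

```r
x <- USArrests            # built-in data set, used only as a stand-in
n <- nrow(x)

pc_a <- princomp(x)       # covariance with divisor n
pc_b <- prcomp(x)         # covariance with divisor n - 1

# Proportion of variance explained by each component.
prop_a <- pc_a$sdev^2 / sum(pc_a$sdev^2)
prop_b <- pc_b$sdev^2 / sum(pc_b$sdev^2)

# Every ratio of standard deviations equals sqrt(n / (n - 1)).
sd_ratio <- unname(pc_b$sdev / pc_a$sdev)
print(round(sd_ratio, 6))
print(round(cumsum(unname(prop_a)), 4))
```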

vi)
Pairwise correlations of the transformed components

round(cor(PCA_corr$scores),3) # [3]

## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9

## Comp.1 1 0 0 0 0 0 0 0 0
## Comp.2 0 1 0 0 0 0 0 0 0
## Comp.3 0 0 1 0 0 0 0 0 0
## Comp.4 0 0 0 1 0 0 0 0 0
## Comp.5 0 0 0 0 1 0 0 0 0
## Comp.6 0 0 0 0 0 1 0 0 0
## Comp.7 0 0 0 0 0 0 1 0 0
## Comp.8 0 0 0 0 0 0 0 1 0
## Comp.9 0 0 0 0 0 0 0 0 1
## Comp.10 0 0 0 0 0 0 0 0 0
## Comp.10
## Comp.1 0
## Comp.2 0
## Comp.3 0
## Comp.4 0
## Comp.5 0
## Comp.6 0
## Comp.7 0
## Comp.8 0
## Comp.9 0
## Comp.10 1

OR Alternatively
round(cor(PCA_corr_1$x),3)
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
## PC1 1 0 0 0 0 0 0 0 0 0
## PC2 0 1 0 0 0 0 0 0 0 0
## PC3 0 0 1 0 0 0 0 0 0 0
## PC4 0 0 0 1 0 0 0 0 0 0
## PC5 0 0 0 0 1 0 0 0 0 0
## PC6 0 0 0 0 0 1 0 0 0 0
## PC7 0 0 0 0 0 0 1 0 0 0
## PC8 0 0 0 0 0 0 0 1 0 0
## PC9 0 0 0 0 0 0 0 0 1 0
## PC10 0 0 0 0 0 0 0 0 0 1

Interpretation
The pairwise correlations between the components after PCA are zero, because PCA is a
way of dealing with highly correlated variables. If N variables are highly correlated,
then they will all load on the same principal component (eigenvector), and that
component will be uncorrelated with the other components (all the components are
orthogonal). Hence the correlations between the components are zero. [2]

(vi)
Scree Plot
screeplot(PCA_corr,type = "l")

Interpretation: The number of significant components is 1, as the scree plot drops sharply from the first to the
second component and is almost flat from the second component onwards. [1]

Solution 4

> budget = read.csv("marketingbudget.csv")

> plot(budget$Spend,budget$Sales)

The above scatter plot shows a positive linear relationship between marketing Spend and Sales data.

ii)
> cor = cor(budget$Sales,budget$Spend)
> cor
[1] 0.9701669

iii)
> cor.test(budget$Spend,budget$Sales,method="pearson",alternative = "greater")
Pearson's product-moment correlation
data: budget$Spend and budget$Sales
t = 30.476, df = 58, p-value < 2.2e-16
alternative hypothesis: true correlation is greater than 0
95 percent confidence interval:
0.9542479 1.0000000
sample estimates:
cor
0.9701669
The p-value is less than 2.2 × 10^-16, showing very strong evidence against the null hypothesis. Thus, we reject
the hypothesis that the Pearson correlation coefficient is equal to 0 and conclude that it is positive.
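The t statistic reported by `cor.test` can be reproduced from the sample correlation alone. A small sketch, using the standard result t = r·sqrt(n − 2)/sqrt(1 − r²) with n = 60 observations (df = 58); the variable names are chosen here for illustration.

```r
r <- 0.9701669   # sample correlation from part (ii)
n <- 60          # number of observations (df = n - 2 = 58)

# Test statistic for H0: rho = 0.
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# One-sided p-value for H1: rho > 0.
p_one_sided <- pt(t_stat, df = n - 2, lower.tail = FALSE)

print(round(t_stat, 3))
print(p_one_sided < 2.2e-16)
```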

iv)
> reg = lm(Sales ~ Spend, data = budget)
> summary(reg)
Call:
lm(formula = Sales ~ Spend, data = budget)
Residuals:
Min 1Q Median 3Q Max
-25331.9 -6783.1 -844.5 7965.9 25320.1
Coefficients:
Estimate Std. Error t value Pr(>|t|)

(Intercept) 3431.5592 3245.9169 1.057 0.295
Spend 10.5310 0.3455 30.476 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10650 on 58 degrees of freedom
Multiple R-squared: 0.9412, Adjusted R-squared: 0.9402
F-statistic: 928.8 on 1 and 58 DF, p-value: < 2.2e-16
From the output, the estimate of parameter sigma is 10,650.
v)
> abline(reg)

vi)
From the R output, the proportion of total variability of the responses explained by the model is 94.12%. [1]

viii)
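> es = resid(reg)
> t.test(es,conf.level = 0.99)$conf.int
[1] -3630.146 3630.146
attr(,"conf.level")
[1] 0.99
From the above, the 99% confidence interval for the mean of the error terms is (-3630.15, 3630.15). Note that this
is an interval for the mean of the errors, not for sigma, which cannot be negative.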
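The interval returned by `t.test` is symmetric about zero because the residuals of a least-squares fit with an intercept average to zero. Its half-width can be approximated from the residual standard error alone, as the sketch below shows; it uses the rounded value 10650 from the summary, so the result only matches the output above approximately.

```r
n <- 60
s <- 10650                               # residual standard error (58 df), rounded

# sd() of the residuals divides by n - 1 rather than n - 2.
sd_res <- s * sqrt((n - 2) / (n - 1))

# Half-width of the 99% t interval for the mean of the residuals.
half_width <- qt(0.995, df = n - 1) * sd_res / sqrt(n)
print(round(c(-half_width, half_width), 1))
```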

ix) Based on the results in both part (vii) and part (viii), the errors seem to be close to zero on average and the
confidence interval for the mean of the residuals contains 0. Hence the model seems to be a good fit.

x)
Let H0: Beta = 10 and H1: Beta not equal to 10
> b1 = (coef(reg))[['Spend']] # [1]
> n = 60
> s = sqrt(sum(es^2)/(n-2))
> SE = s/sqrt(sum((budget$Spend-mean(budget$Spend))^2)) # [2]
> t = (b1-10)/SE # [1]
> pt(t,58,lower.tail = FALSE) # [1]
[1] 0.06489565

> pvalue = 2*pt(t,58,lower.tail = FALSE) # [1]
> pvalue # [1]
[1] 0.1297913

xi)
There is insufficient evidence to reject the null hypothesis at the 5% level of significance, so the data are
consistent with a slope equal to 10.
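An equivalent way to reach this conclusion is to check whether 10 lies inside the 95% confidence interval for the slope. The sketch below builds that interval from the rounded estimate and standard error in the summary table; the variable names are chosen here for illustration.

```r
b1    <- 10.5310   # slope estimate from summary(reg)
se_b1 <- 0.3455    # its standard error
df    <- 58        # residual degrees of freedom

# 95% confidence interval: estimate +/- t quantile * standard error.
ci <- b1 + c(-1, 1) * qt(0.975, df) * se_b1
contains_10 <- ci[1] < 10 && 10 < ci[2]

print(round(ci, 3))
print(contains_10)   # TRUE means H0: Beta = 10 is not rejected at the 5% level
```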

xii)
> y = 3431.5592 + (b1*4500)
> y
[1] 50821.17
With a marketing spend of INR 4,500, the Sales would be INR 50,821.
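Instead of typing the intercept by hand, the same prediction can be obtained with `predict()`. Since the marketing budget file is not reproduced here, the sketch below uses a simulated, hypothetical `budget` data frame; it only demonstrates that `predict()` and the manual formula agree, not the figure quoted above.

```r
set.seed(1)
# Hypothetical stand-in for marketingbudget.csv (60 rows, made-up values).
budget <- data.frame(Spend = seq(1000, 15000, length.out = 60))
budget$Sales <- 3400 + 10.5 * budget$Spend + rnorm(60, sd = 10000)

reg <- lm(Sales ~ Spend, data = budget)

# Prediction at a spend of 4,500 via predict() ...
by_predict <- predict(reg, newdata = data.frame(Spend = 4500))

# ... and by hand from the fitted coefficients.
by_hand <- coef(reg)[["(Intercept)"]] + coef(reg)[["Spend"]] * 4500

print(c(predict = unname(by_predict), by_hand = by_hand))
```

Using the fitted coefficients (or `predict()`) avoids the small rounding error that comes from copying the intercept from the printed summary.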

3 pages
Chapter 5
No ratings yet
Chapter 5
11 pages
Study of Averages (Measures of Central Tendency) (1) Sheet-2
No ratings yet
Study of Averages (Measures of Central Tendency) (1) Sheet-2
119 pages
Module 10 - Central Tendency
No ratings yet
Module 10 - Central Tendency
6 pages
Solutions Chapter 11
No ratings yet
Solutions Chapter 11
9 pages
Sol - PQ220 6234F.Ch 11
No ratings yet
Sol - PQ220 6234F.Ch 11
13 pages
R Class 15
No ratings yet
R Class 15
3 pages
CM1A Nov 24 QP - 0
No ratings yet
CM1A Nov 24 QP - 0
10 pages
Isye4031 Regression and Forecasting Practice Problems 2 Fall 2014
No ratings yet
Isye4031 Regression and Forecasting Practice Problems 2 Fall 2014
5 pages
University of Delhi: Semester Examination 2023-MAY-JUNE:REGULAR Statement of Marks / Grades
No ratings yet
University of Delhi: Semester Examination 2023-MAY-JUNE:REGULAR Statement of Marks / Grades
2 pages
JHC BAQF Brochure
No ratings yet
JHC BAQF Brochure
19 pages
Arch 2012 Iss1 Shapiro Paper
No ratings yet
Arch 2012 Iss1 Shapiro Paper
11 pages
Test of Interest Rate
No ratings yet
Test of Interest Rate
5 pages
Validitas Kuesioner Pengetahuan: Reliability Statistics
No ratings yet
Validitas Kuesioner Pengetahuan: Reliability Statistics
3 pages
Pearson's Correlation
No ratings yet
Pearson's Correlation
45 pages
Najah Mubashira Final STT 351 Project
No ratings yet
Najah Mubashira Final STT 351 Project
7 pages
08.0 PP 52 67 Location and Dispersion
No ratings yet
08.0 PP 52 67 Location and Dispersion
16 pages
Prob Set 1
No ratings yet
Prob Set 1
5 pages
Round 1 - Opening Closing Ranks - JAM MS 2025
No ratings yet
Round 1 - Opening Closing Ranks - JAM MS 2025
1 page
CH 61
No ratings yet
CH 61
1 page
Chapter 8 PPT New Period 3
No ratings yet
Chapter 8 PPT New Period 3
12 pages
Activity 3
No ratings yet
Activity 3
3 pages
Class Interval (F) (X) (FX)
No ratings yet
Class Interval (F) (X) (FX)
3 pages
04.0 PP Xi Xii Preface To Second Edition
No ratings yet
04.0 PP Xi Xii Preface To Second Edition
2 pages
Práctica Estadística
No ratings yet
Práctica Estadística
4 pages
Punjab College, Sambrial: Assignment Subject: Business Statistics Q1
No ratings yet
Punjab College, Sambrial: Assignment Subject: Business Statistics Q1
2 pages
Idem - Adyanto Armando Purba
No ratings yet
Idem - Adyanto Armando Purba
3 pages
CO2024 Tutorial Estimation
No ratings yet
CO2024 Tutorial Estimation
3 pages
Answers To Exercises
No ratings yet
Answers To Exercises
4 pages
Quiz PPNC
No ratings yet
Quiz PPNC
4 pages
Tugas2 Regresi Linear Berganda - Ipynb - Colab
No ratings yet
Tugas2 Regresi Linear Berganda - Ipynb - Colab
3 pages
Solutions Manual to accompany Introduction to Linear Regression Analysis
From Everand
Solutions Manual to accompany Introduction to Linear Regression Analysis
Douglas C. Montgomery
1/5 (1)
Computer Solved Differential Equations
From Everand
Computer Solved Differential Equations
Joe J.
No ratings yet
Projects With Microcontrollers And PICC
From Everand
Projects With Microcontrollers And PICC
Guillermo Perez Guillen
5/5 (1)
Times Tables
From Everand
Times Tables
Darrell Butters
No ratings yet

R Class 21

Uploaded by

R Class 21

Uploaded by


Residual standard error: 0.01627 on 17 degrees of freedom

$\hat{y}_i = 0.081086 - 0.014499\,x_1 + 0.005591\,x_2$
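As a rough illustration of how the fitted linear equation is used, the sketch below plugs in one hypothetical observation. Here $x_1$ is taken to be Material.Type (entered as a numeric code) and $x_2$ to be Rainfall, following the order of terms in the `lm` formula; the values $x_1 = 1$ and $x_2 = 0.2$ are assumed for the example and are not a row of the data.

```latex
\begin{align*}
\hat{y} &= 0.081086 - 0.014499\,x_1 + 0.005591\,x_2 \\
        &= 0.081086 - 0.014499(1) + 0.005591(0.2) \\
        &= 0.081086 - 0.014499 + 0.0011182 \\
        &\approx 0.0677
\end{align*}
```

Because Material.Type enters as a single numeric term, each one-unit step in the material code lowers the predicted failure time by the same 0.014499, an assumption worth keeping in mind when reading the coefficient.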

#remove the sixth row

X Failure.Time Material.Type Rainfall

Residual standard error: 0.01031 on 16 degrees of freedom

> gfit<-glm(d1new$Failure.Time~d1new$Material.Type+d1new$Rainfall,family= Gamma)

Call: glm(formula = d1new$Failure.Time ~ d1new$Material.Type + d1new$Rainfall,

Degrees of Freedom: 18 Total (i.e. Null); 16 Residual

(b) The fitted model is:

(Intercept) d1new$Material.Type d1new$Rainfall

$\hat{\eta} = 1/\hat{\mu} = 6.1920558 + 6.7286452\,x_1 + 0.4814427\,x_2$
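Since the Gamma family in `glm` uses the inverse link by default, the linear predictor is the reciprocal of the mean, so a predicted failure time is obtained by inverting it. The sketch below works through one hypothetical case; the inputs $x_1 = 1$ (Material.Type) and $x_2 = 0.2$ (Rainfall) are assumed values chosen for illustration only.

```latex
\begin{align*}
\hat{\eta} &= 6.1920558 + 6.7286452(1) + 0.4814427(0.2) \\
           &= 6.1920558 + 6.7286452 + 0.0962885 \\
           &\approx 13.0170 \\[4pt]
\hat{\mu}  &= \frac{1}{\hat{\eta}} \approx \frac{1}{13.0170} \approx 0.0768
\end{align*}
```

Under the inverse link a positive coefficient raises $\hat{\eta}$ and therefore lowers the predicted mean, so the signs read in the opposite direction from those of the linear model.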

review the model fit

(Dispersion parameter for Gamma family taken to be 0.03085425)

Null deviance: 2.99847 on 18 degrees of freedom

Number of Fisher Scoring iterations: 4

Residual standard error: 3943 on 1401 degrees of freedom

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 223.213 on 163 degrees of freedom

Number of Fisher Scoring iterations: 11

# Using Natural Logarithm of the claims paid

correlation<-cor(indices[,3:12], method = "pearson") [1]

proportion of total variation explained by the first two principal components
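The proportion referred to here can be written in terms of the component variances (eigenvalues) $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$. The sketch below is the general formula, not the author's working. The simplification on the second line assumes the components were extracted from the correlation matrix of the $p = 10$ columns selected by `indices[,3:12]`, which the output shown does not confirm.

```latex
\begin{align*}
\text{proportion explained by the first two components}
  &= \frac{\lambda_1 + \lambda_2}{\sum_{j=1}^{p} \lambda_j} \\
  &= \frac{\lambda_1 + \lambda_2}{10}
     \quad \text{(correlation-matrix PCA, where } \textstyle\sum_j \lambda_j = p = 10\text{)}
\end{align*}
```

In `princomp` output each $\lambda_j$ is the square of the reported standard deviation for component $j$.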

## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9

> budget = read.csv("marketingbudget.csv")
