Ayushi Patel A044 R Software

The document contains the results of regression analyses and hypothesis tests conducted on various datasets. It includes: 1) A multiple regression of sales on advertising and competition with an R^2 of about 98%, showing a good fit. 2) A multiple regression of points scored on passing yards, rushing yards, and points allowed with a lower R^2 of about 56%, indicating a more modest relationship. 3) A two-tailed t-test of the null that the mean weight of turtles equals 310 pounds, which is rejected at the 5% level (p ≈ 0.014). 4) A paired t-test that fails to reject, at the 1% level, the null of no difference between male and female weights. 5) An ANOVA that rejects the null that the strengths of the raw materials are the same.

Uploaded by

Ayushi Patel

AYUSHI PATEL

A044
40525200045

R Software – Internal Assignment (Semester 2)

1. Regression line of Sales on Advertisement & Competition

i. > sales=c(27,23,31,45,47,42,39,45,57,59,73,84)
> adv=c(20,20,25,28,29,28,31,34,35,36,41,45)
> comp=c(10,15,15,15,20,25,35,35,20,30,20,20)
> reg=lm(sales~adv+comp)
> reg

Call:
lm(formula = sales ~ adv + comp)

Coefficients:
(Intercept)          adv         comp
   -18.7958       2.5248      -0.5449
ii. > names(reg)
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"

> fitted.values(reg)
       1        2        3        4        5        6        7        8
26.25119 23.52657 36.15062 43.72506 43.52525 38.27582 40.40102 47.97545
       9       10       11       12
58.67412 55.74969 73.82298 83.92222

> residuals(reg)
          1           2           3           4           5           6
 0.74881225 -0.52657047 -5.15062467  1.27494281  3.47474925  3.72417737
          7           8           9          10          11          12
-1.40102059 -2.97545311 -1.67411578  3.25030794 -0.82298082  0.07777582

iii. > summary(reg)

Call:
lm(formula = sales ~ adv + comp)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1506 -1.4693 -0.2244  1.7688  3.7242

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -18.7958     3.8520  -4.880 0.000872 ***
adv           2.5248     0.1295  19.495 1.14e-08 ***
comp         -0.5449     0.1230  -4.432 0.001643 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.978 on 9 degrees of freedom
Multiple R-squared: 0.9779, Adjusted R-squared: 0.973
F-statistic: 199.2 on 2 and 9 DF, p-value: 3.539e-08

R2 = 0.9779, i.e. about 98%, which shows how well the model fits. The
independent variables (advertisement and competition) together explain
about 98% of the variation in the dependent variable (sales).
Adjusted R2 = 0.973, i.e. about 97%, which is a better measure of
goodness of fit because it accounts for the number of predictors. It is
almost the same as R2, which means the model is good.
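
The two figures above can be reproduced from the residuals. The following is a minimal sketch using the same data as part i; the names n, k, sse, sst, r2 and adj_r2 are introduced here for illustration and do not appear in the assignment.

```r
# Data and model from part i of the assignment
sales <- c(27,23,31,45,47,42,39,45,57,59,73,84)
adv   <- c(20,20,25,28,29,28,31,34,35,36,41,45)
comp  <- c(10,15,15,15,20,25,35,35,20,30,20,20)
reg   <- lm(sales ~ adv + comp)

n   <- length(sales)                   # number of observations
k   <- 2                               # number of predictors (adv, comp)
sse <- sum(residuals(reg)^2)           # unexplained (residual) sum of squares
sst <- sum((sales - mean(sales))^2)    # total sum of squares

r2     <- 1 - sse / sst                            # share of variation explained
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)     # penalised for k predictors

print(c(r2 = r2, adj_r2 = adj_r2))
```

The adjustment matters little here because k is small relative to the explained variation, which is why the two values are so close.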

iv. VIF
> install.packages("car")
> library(car)
Loading required package: carData
Warning messages:
1: package ‘car’ was built under R version 4.0.5
2: package ‘carData’ was built under R version 4.0.3
> VIF=vif(reg)
> VIF
     adv     comp
1.221978 1.221978

Variance Inflation Factor (VIF) measures multicollinearity. For
advertisement and competition VIF = 1.22, which is close to 1 and
therefore negligible. We conclude that there is no multicollinearity
problem.
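
As a cross-check that does not need the car package, the VIF of a predictor can be computed as 1/(1 - R2) from an auxiliary regression of that predictor on the other predictors. The sketch below assumes the same adv and comp vectors as part i; aux and vif_adv are illustrative names, not part of the assignment.

```r
# Predictors from part i of the assignment
adv  <- c(20,20,25,28,29,28,31,34,35,36,41,45)
comp <- c(10,15,15,15,20,25,35,35,20,30,20,20)

# Auxiliary regression: how well does comp explain adv?
aux <- lm(adv ~ comp)

# VIF = 1 / (1 - R^2 of the auxiliary regression)
vif_adv <- 1 / (1 - summary(aux)$r.squared)
print(vif_adv)
```

With only two predictors the auxiliary R2 is the squared correlation between them, so both predictors share the same VIF, which is consistent with the identical values reported by vif(reg).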

2. Regression of points scored on pass yards, rush yards, and points allowed

v. > ptsscored=c(38,42,38,27,30,40,45,30,37,26,51,40,27,28,31,35)
> passyard=c(256,326,314,304,313,352,358,303,375,249,478,295,377,243,273,281)
> rushyards=c(106,127,77,142,126,94,198,49,139,118,98,174,94,60,154,99)
> ptsallowed=c(28,37,27,23,14,43,10,23,21,14,54,33,30,22,22,18)
> reg1=lm(ptsscored~passyard+rushyards+ptsallowed)
> reg1

Call:
lm(formula = ptsscored ~ passyard + rushyards + ptsallowed)

Coefficients:
(Intercept)     passyard    rushyards   ptsallowed
    7.37778      0.03859      0.06653      0.30279

vi. > names(reg1)
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"

> fitted.values(reg1)
       1        2        3        4        5        6        7        8
32.78613 39.60938 32.79190 35.51943 32.07711 40.23380 37.39258 29.29342
       9       10       11       12       13       14       15       16
37.45385 29.07537 48.69239 40.32905 37.26220 27.40734 34.81887 30.25717

> residuals(reg1)
          1           2           3           4           5           6
  5.2138729   2.3906192   5.2081010  -8.5194349  -2.0771111  -0.2337994
          7           8           9          10          11          12
  7.6074176   0.7065770  -0.4538513  -3.0753728   2.3076051  -0.3290500
         13          14          15          16
-10.2621987   0.5926626  -3.8188669   4.7428298

vii. > summary(reg1)

Call:
lm(formula = ptsscored ~ passyard + rushyards + ptsallowed)

Residuals:
     Min       1Q   Median       3Q      Max
-10.2622  -2.3267   0.1794   2.9787   7.6074

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.37778    8.35058   0.884   0.3943
passyard     0.03859    0.02993   1.289   0.2216
rushyards    0.06653    0.03796   1.753   0.1051
ptsallowed   0.30279    0.16220   1.867   0.0866 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.425 on 12 degrees of freedom
Multiple R-squared: 0.5582, Adjusted R-squared: 0.4478
F-statistic: 5.054 on 3 and 12 DF, p-value: 0.01717

R2 = 0.5582, i.e. about 56%, which shows the goodness of fit. The
independent variables explain only about 56% of the variation in the
dependent variable (points scored), which is a moderate fit.
Adjusted R2 = 0.4478, i.e. about 45%, which is a better representation
of how good the model is. Since it is only about 45%, the model is not
that good.
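
The overall F-statistic in the summary is tied directly to R2. The sketch below recomputes it from the rounded values printed above (R2 = 0.5582, 16 observations, 3 predictors), so it matches the reported F only approximately; f_stat and p_val are illustrative names.

```r
# Values read off the summary(reg1) output above
r2 <- 0.5582   # Multiple R-squared (rounded)
n  <- 16       # number of observations
k  <- 3        # number of predictors

# F = (explained variation per predictor) / (unexplained variation per residual df)
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# Upper-tail probability of F with k and n-k-1 degrees of freedom
p_val <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

print(c(F = f_stat, p = p_val))
```

This is why a model can be jointly significant at 5% while, as in the coefficient table, no single predictor reaches that level on its own.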

viii. VIF
> library(car)
> VIF=vif(reg1)
> VIF
  passyard  rushyards ptsallowed
  1.636215   1.164668   1.739894

VIF measures multicollinearity. For passyard, rushyards, and
ptsallowed the VIF is 1.64, 1.16, and 1.74 respectively, all of
which are low. Thus we conclude there is no multicollinearity.

3. Ho: mu = 310, HA: mu ≠ 310, two-tailed test at 5% alpha

> n=25 ; xbar=300 ; mu=310 ; sd=18.5
> SE=sd/sqrt(n-1)
> SE
[1] 3.776297
> tcal=(xbar-mu)/SE
> tcal
[1] -2.648097
> PV=2*pt(abs(tcal),df=n-1,lower.tail=F)
> PV
[1] 0.01408115
Since the p-value (0.0141) is less than alpha (0.05), we reject the null
hypothesis and conclude that the mean weight of this species of turtle
differs from 310 pounds.
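
The same decision can be checked with the critical-value approach. The sketch below also compares two standard-error conventions: sd/sqrt(n-1), as used above, and sd/sqrt(n), which is the usual form when the standard deviation was computed with the n-1 divisor. The source does not say how 18.5 was obtained, so which convention applies is an assumption; se_a, se_b and the related names are illustrative.

```r
n <- 25; xbar <- 300; mu <- 310; s <- 18.5; alpha <- 0.05

se_a <- s / sqrt(n - 1)   # convention used in the assignment
se_b <- s / sqrt(n)       # usual convention if s uses the n-1 divisor

t_a <- (xbar - mu) / se_a
t_b <- (xbar - mu) / se_b

# Two-tailed critical value of t with n-1 degrees of freedom
t_crit <- qt(1 - alpha / 2, df = n - 1)

# Two-tailed p-values under each convention
p_a <- 2 * pt(abs(t_a), df = n - 1, lower.tail = FALSE)
p_b <- 2 * pt(abs(t_b), df = n - 1, lower.tail = FALSE)

print(c(t_a = t_a, t_b = t_b, t_crit = t_crit, p_a = p_a, p_b = p_b))
```

Under either convention the absolute t-value exceeds the critical value, so the choice of standard error does not change the decision at the 5% level.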

4. Weight of men = mu1
Weight of women = mu2
Ho: mu1 = mu2, HA: mu1 > mu2, paired t-test at 1% alpha level

> wtmales=c(67.8,60,63.4,76,89.4,73.3,67.3,61.3,62.4)
> wtfemales=c(38.9,61.2,73.3,21.8,63.4,64.6,48.4,48.8,48.5)
> pttest=t.test(wtmales,wtfemales,mu=0,paired=T,alter='greater')
> pttest

        Paired t-test

data:  wtmales and wtfemales
t = 2.726, df = 8, p-value = 0.013
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 5.368299      Inf
sample estimates:
mean of the differences
               16.88889

Since the p-value (0.013) is greater than alpha (0.01), we fail to
reject the null hypothesis and conclude that, at the 1% level, there is
not enough evidence that men weigh more than women.
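
A paired t-test is a one-sample t-test on the pairwise differences. The sketch below computes the statistic by hand from the same two vectors and compares it with t.test; d, t_stat, p_val and res are illustrative names.

```r
wtmales   <- c(67.8,60,63.4,76,89.4,73.3,67.3,61.3,62.4)
wtfemales <- c(38.9,61.2,73.3,21.8,63.4,64.6,48.4,48.8,48.5)

d <- wtmales - wtfemales                 # pairwise differences
m <- length(d)                           # number of pairs

t_stat <- mean(d) / (sd(d) / sqrt(m))    # one-sample t on the differences
p_val  <- pt(t_stat, df = m - 1, lower.tail = FALSE)   # one-sided, HA: mean(d) > 0

# Built-in paired test for comparison
res <- t.test(wtmales, wtfemales, paired = TRUE, alternative = "greater")

print(c(t = t_stat, p = p_val))
```

The p-value falls between 0.01 and 0.05, so the result would be significant at 5% but not at the 1% level chosen for this test.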

5. Ho: there is no significant difference between the strengths of raw materials 1, 2, 3, and 4
HA: at least one of the raw material strengths is different

> st1=c(11.715501,11.981569,8.0439292,10.55816,14.079463,10.776867,7.8602695,11.889672,11.942314,13.177454)
> st2=c(10.566155,13.455359,7.4188405,12.031314,7.7766332,10.748939,10.72698,4.4772914,6.8038204,5.3718922)
> st3=c(10.283346,12.177732,10.559808,9.6551865,8.7902748,10.862457,10.378184,10.188052,11.62452,12.305905)
> st4=c(6.903486,8.9901103,6.9712734,9.1603896,8.6784264,11.443832,10.780441,5.66676,10.776041,9.0087649)
> d=stack(list(b1=st1,b2=st2,b3=st3,b4=st4))
> names(d)
[1] "values" "ind"
> av1=aov(values~ind,data=d)
> av1
> summary(av1)
            Df Sum Sq Mean Sq F value Pr(>F)
ind          3  43.62  14.540   3.303 0.0311 *
Residuals   36 158.47   4.402
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

At alpha = 5% we reject the null hypothesis because the p-value
(0.031) is less than 0.05, and conclude that at least one of the
strengths is different.
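
The F value in the ANOVA table is the ratio of the between-group mean square to the within-group mean square. The sketch below rebuilds it by hand from the four samples and compares it with aov; groups, ssb, ssw, f_stat and p_val are illustrative names.

```r
st1 <- c(11.715501,11.981569,8.0439292,10.55816,14.079463,10.776867,7.8602695,11.889672,11.942314,13.177454)
st2 <- c(10.566155,13.455359,7.4188405,12.031314,7.7766332,10.748939,10.72698,4.4772914,6.8038204,5.3718922)
st3 <- c(10.283346,12.177732,10.559808,9.6551865,8.7902748,10.862457,10.378184,10.188052,11.62452,12.305905)
st4 <- c(6.903486,8.9901103,6.9712734,9.1603896,8.6784264,11.443832,10.780441,5.66676,10.776041,9.0087649)

groups <- list(b1 = st1, b2 = st2, b3 = st3, b4 = st4)
d      <- stack(groups)          # long format: columns "values" and "ind"
k      <- length(groups)         # number of groups
n      <- nrow(d)                # total number of observations
grand  <- mean(d$values)         # grand mean

# Between-group and within-group sums of squares
ssb <- sum(sapply(groups, function(g) length(g) * (mean(g) - grand)^2))
ssw <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))

f_stat <- (ssb / (k - 1)) / (ssw / (n - k))
p_val  <- pf(f_stat, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

# F value reported by aov() for comparison
f_aov <- summary(aov(values ~ ind, data = d))[[1]][["F value"]][1]

print(c(SSB = ssb, SSW = ssw, F = f_stat, p = p_val))
```

A significant F only says that some group mean differs; identifying which one would need a follow-up comparison, which the assignment does not perform.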
