Mini Project - Golf: by Vishnu Vinod V.K
The results of the tests, with distances measured to the nearest yard,
are contained in the data set “Golf”. Prepare a Managerial Report:
1. Formulate and present the rationale for a hypothesis test that Par
Inc. could use to compare the driving distances of the current and new
golf balls.
2. Analyze the data to provide the hypothesis testing conclusion. What
is the p-value for your test? What is your recommendation for Par
Inc.?
3. Provide descriptive statistical summaries of the data for each model.
4. What is the 95% confidence interval for the population mean of
each model, and what is the 95% confidence interval for the difference
between the means of the two populations?
5. Do you see a need for larger sample sizes and more testing with the
golf balls? Discuss.
SOLUTION
Loading dataset
golf <- read.csv("Golf.csv")
Sample size: 40
No. of samples: 2
[1] 40 2
The summary of the given data shows that the mean and median are very
close, which suggests that the data are approximately normally
distributed.
[1] 8.752985
[1] 9.896904
Also, the 5-point summaries and standard deviations for both columns
suggest that there is no large difference in the driving distances of
balls with and without the coating.
[1] 76.61474
[1] 97.94872
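As a small cross-check, the two variances printed above are simply the squares of the two standard deviations reported earlier. The sketch below assumes the first value in each pair belongs to `Current` and the second to `New`; the variable names are illustrative and not taken from the original script.

```r
# Standard deviations as reported in the output above
# (assumed order: Current first, New second)
sd_current <- 8.752985
sd_new     <- 9.896904

# Variance is the square of the standard deviation
var_current <- sd_current^2
var_new     <- sd_new^2

# Ratio of the two variances (New relative to Current)
var_ratio <- var_new / var_current

print(round(c(current = var_current, new = var_new, ratio = var_ratio), 2))
```

With the raw data loaded, the same quantities would normally come from `sd()` and `var()` applied to each column.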
HISTOGRAM AND BOXPLOT
From the histograms we can see that both variables are nearly normally
distributed.
OBSERVATIONS
Sample size: 40
Number of samples: 2
Unpaired variables.
DOF = 40+40-2 = 78
Null Hypothesis:
H0: µold - µnew = 0 (The new coating has no effect on driving
distances)
Alternate Hypothesis:
Ha: µold - µnew ≠ 0 (The new coating has an effect on driving
distances)
data: golf$Current
t = 195.29, df = 39, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
267.4757 273.0743
sample estimates:
mean of x
270.275
data: golf$New
t = 170.94, df = 39, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
264.3348 270.6652
sample estimates:
mean of x
267.5
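The two 95% confidence intervals above can be reproduced from the summary statistics alone, which may help when the raw file is not at hand. This is a minimal sketch: `ci_mean` is a hypothetical helper written for illustration, and the inputs are the means, standard deviations, and sample size reported in this document.

```r
# Hypothetical helper: t-based confidence interval for a single mean,
# computed from summary statistics (mean m, standard deviation s, size n)
ci_mean <- function(m, s, n, level = 0.95) {
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * s / sqrt(n)
  c(lower = m - half_width, upper = m + half_width)
}

n <- 40
ci_current <- ci_mean(270.275, 8.752985, n)  # Current balls
ci_new     <- ci_mean(267.5,   9.896904, n)  # New balls

print(round(rbind(Current = ci_current, New = ci_new), 4))
```

The two intervals overlap substantially, which is consistent with the lack of a clear difference between the two ball types.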
T-TEST CONCLUSION
TWO TAILED TWO SAMPLE INDEPENDENT T TEST
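In this scenario, the p-value is 0.094, which is greater than 0.05.
(0.094 corresponds to the one-tailed value for these data; the two-tailed
p-value is about 0.188, which is also greater than 0.05.)
Hence, we fail to reject the Null Hypothesis.
Thus, there is no statistically significant evidence of a change in driving
distances due to the new coating.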
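As a rough cross-check of this conclusion, the two-sample t statistic can be recomputed from the summary statistics reported earlier. The sketch below assumes a pooled-variance (equal variance) test, which is what the stated DOF of 78 implies; the variable names are illustrative. It also produces the 95% confidence interval for the difference between the two means.

```r
# Summary statistics reported in this document
n1 <- 40;        n2 <- 40          # sample sizes (Current, New)
m1 <- 270.275;   m2 <- 267.5       # sample means
v1 <- 76.61474;  v2 <- 97.94872    # sample variances

# Pooled-variance two-sample t-test (assumption: equal variances)
df     <- n1 + n2 - 2
sp2    <- ((n1 - 1) * v1 + (n2 - 1) * v2) / df   # pooled variance
se     <- sqrt(sp2 * (1 / n1 + 1 / n2))          # std. error of the difference
t_stat <- (m1 - m2) / se

# Two-sided and one-sided (Current > New) p-values
p_two <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
p_one <- pt(t_stat, df, lower.tail = FALSE)

# 95% confidence interval for the difference in means
ci_diff <- (m1 - m2) + c(-1, 1) * qt(0.975, df) * se

print(round(c(t = t_stat, df = df, p_two_sided = p_two, p_one_sided = p_one), 4))
print(round(ci_diff, 3))
```

Under these assumptions the one-sided p-value comes out close to 0.094 and the two-sided value is about twice that; both exceed 0.05, and the interval for the difference contains zero. With the raw data, `t.test(golf$Current, golf$New, var.equal = TRUE)` should give the same two-sided result.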
The difference in means may also be partly attributable to the higher
variance of the `New` balls compared to the `Current` balls.
The variance of the `New` balls' driving distances (97.95) is 28% higher
than the variance of the `Current` balls' driving distances (76.61).
We are unsure of the sampling error present in the data.
Statistically, there is no significant effect of the new coating on driving
distances. However, it is suggested to check the effect on the weight and
other characteristics, such as the size and shape, of the new balls.
Also, the given sample is from only one golf course. It is advisable that
the test be performed on different golf courses.
Type I error (α): Probability of rejecting the null hypothesis when it is true.
The probability of a Type I error in hypothesis testing is predetermined by the
significance level.
Type II error (β): Probability of failing to reject the null when it is false.
The Type II error calculation depends on the population mean, which is unknown.
> abs(qt(0.05,38))
[1] 1.685954
n = 40
delta = 2.775
sd = 13.74397
sig.level = 0.05
power = 0.14274
alternative = two.sided
Basically, the power of the test is the probability that we make the right
decision when the null is not correct (i.e. we correctly reject it).
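The power figure reported above appears to come from base R's `power.t.test()`. The sketch below is an assumption about how that output could be reproduced, using the `n`, `delta`, `sd`, and `sig.level` values shown in the output and the default two-sample, two-sided setting; the original call is not shown in the source.

```r
# Power of a two-sample, two-sided t-test for the observed difference
# (assumed call; inputs copied from the output shown above)
pw <- power.t.test(n = 40, delta = 2.775, sd = 13.74397,
                   sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")

power_now <- pw$power        # probability of detecting a 2.775-yard difference
beta_now  <- 1 - power_now   # corresponding Type II error probability

print(round(c(power = power_now, beta = beta_now), 4))
```

A power this low means that a true difference of this size would go undetected most of the time with 40 balls per group.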
Let us assume that we want both the Type I error and the Type II error
probabilities to equal 0.05 (i.e. a power of 0.95).
Assuming the sample standard deviation is equal to the population standard
deviation, we can calculate the sample size needed as below:
n = 197.3383
delta = 5
sd = 13.74397
sig.level = 0.05
power = 0.95
alternative = two.sided
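The sample-size output above can likewise be sketched with `power.t.test()` by leaving `n` unspecified and supplying the target power. As before, the exact call is an assumption reconstructed from the printed parameters, and the reported `n` is read as the size per group.

```r
# Required sample size per group for 95% power to detect a 5-yard difference
# (assumed call; inputs copied from the output shown above)
ss <- power.t.test(delta = 5, sd = 13.74397,
                   sig.level = 0.05, power = 0.95,
                   type = "two.sample", alternative = "two.sided")

n_required   <- ss$n                 # fractional sample size per group
n_per_group  <- ceiling(n_required)  # round up to whole golf balls

print(c(n = n_required, n_rounded_up = n_per_group))
```

Rounding up gives roughly 198 balls per model, compared with the 40 tested, which supports the case for larger samples.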
CONCLUSION