BSRM Final Assignment

assignment

Uploaded by

Harinder Singh

Answer 1. (a) Hypothesis Question.

Eco-Tourism Resort Dilemma: In the lush tropical rainforest of Southeast
Asia, eco-tourism resorts often encounter an unusual problem: the presence
of large flocks of exotic birds near their main viewing platforms can
sometimes disrupt the peaceful environment that tourists expect.
Experience has shown that 50% of these birds tend to leave the area on
their own. Recently, a new strategy is being tested to see if it can
increase this percentage and encourage more birds to leave the viewing
platforms.

To evaluate the effectiveness of this new strategy, the resort manager
decided to conduct a study. On 100 occasions, this new technique was
implemented, and it successfully prompted the birds to leave 62 times.
The manager wants to determine if this new strategy results in a higher
percentage of birds leaving the area compared to the usual 50%.

1. (b) Hypothesis Testing:

(i) Null Hypothesis H₀ - The proportion of birds leaving the area is
50%.
(ii) Alternative Hypothesis Hₐ - The proportion of birds leaving the
area after implementation of the new strategy is more than 50%.
(iii) P-value: 0.01072 (this equals the upper-tail area at Z = 2.3, the
continuity-corrected statistic; the uncorrected Z = 2.4 in 1(d)
gives 0.0082, and both are below 0.05)
(iv) Since the p-value is less than 0.05, we reject the null hypothesis.
There is strong statistical evidence that the new strategy
results in more than 50% of the birds leaving the viewing
platforms.

(v) Conclusion: Based on the study conducted, we can confidently say
that the new strategy is effective. More than 50% of the birds
leave the viewing platforms when the new method is used, which
is a statistically significant improvement over the usual behavior.
This suggests that the strategy helps maintain a peaceful
environment for tourists more effectively than before.

1.(c) Step-by-Step Calculation:

Given Data:
o Sample size (n): 100
o Number of successes (x): 62
o Hypothesized population proportion (p0): 0.50
Calculate the Sample Proportion (p^):

p^= x/n = 62/100 = 0.62

Hypothesized Population Proportion (p0):

p0 = 0.50

Calculate the Standard Error (SE):

SE = Square root{p0*(1−p0)/n} = Square root{0.5*(1−0.5)/100} = 0.05

Calculate the Test Statistic (Z):

Z = (p^ − p0)/SE = (0.62−0.50)/ 0.05 = 2.4

Therefore, Test Statistic: Z = 2.4

This test statistic quantifies how far the observed sample proportion (0.62)
is from the hypothesized population proportion (0.50). In this case, a Z-
value of 2.4 indicates that the sample proportion is 2.4 standard errors
away from the hypothesized proportion.

Result from R: Test Statistic : Z = 2.4
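
The arithmetic in 1(c) can be cross-checked outside R. The following is a
minimal Python sketch (Python is an assumption here, since the assignment
itself used R), and the variable names are illustrative rather than taken
from the original script.

```python
import math

# Data given in 1(c)
n = 100      # sample size
x = 62       # number of successes
p0 = 0.50    # hypothesized population proportion

# Sample proportion
p_hat = x / n

# Standard error under the null hypothesis (uses p0, not p_hat)
se_null = math.sqrt(p0 * (1 - p0) / n)

# Test statistic: distance of p_hat from p0 in standard-error units
z = (p_hat - p0) / se_null

print(f"p_hat = {p_hat:.2f}, SE = {se_null:.2f}, Z = {z:.2f}")
```

The standard error here is computed from p0 because the test statistic is
evaluated under the assumption that the null hypothesis is true.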

1.(d) Calculating the P-Value: For a one-tailed test where we are testing
if the sample proportion is greater than the hypothesized proportion:

P-Value = 1 − pnorm(Z)
Logic - The p-value is the probability that the observed test statistic (or
one more extreme) would occur if the null hypothesis were true. For a
one-tailed test where we are testing if the sample proportion is greater
than the hypothesized proportion, the p-value can be found using the
cumulative distribution function (CDF) of the standard normal distribution.
Mathematically, the p-value for a one-tailed test is:

P-Value = 1 − Φ(Z) ; Where Φ(Z) is the CDF of the standard normal
distribution at the Z value.
Look up the Z value in the Z-table or use a calculator to find the area to
the left of the Z value (which is Φ(Z)).
For Z = 2.4: Φ(2.4) ≈ 0.9918
Therefore, the p-value is:
P-Value = 1 − 0.9918 = 0.0082
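
As a cross-check on the table lookup, the sketch below evaluates 1 − Φ(Z)
with Python's standard-library NormalDist, which plays the role of R's
pnorm here. It also evaluates the same tail area at a continuity-corrected
Z. That correction is an assumption about how the 0.01072 in 1(b)(iii) may
have been produced; the assignment does not state it.

```python
from statistics import NormalDist

n = 100
p_hat = 0.62
p0 = 0.50
se_null = 0.05  # standard error under the null, from 1(c)

std_normal = NormalDist()  # mean 0, standard deviation 1

# Uncorrected one-tailed p-value: area to the right of Z
z = (p_hat - p0) / se_null
p_value = 1 - std_normal.cdf(z)

# Assumption: continuity correction of 0.5/n applied to the numerator
z_corrected = (p_hat - p0 - 0.5 / n) / se_null
p_value_corrected = 1 - std_normal.cdf(z_corrected)

print(f"uncorrected: Z = {z:.2f}, p-value = {p_value:.4f}")
print(f"corrected:   Z = {z_corrected:.2f}, p-value = {p_value_corrected:.5f}")
```

Both tail areas fall below 0.05, so the decision to reject the null
hypothesis does not depend on which of the two is used.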

1.(e) Explanation of the p-value

The p-value represents the probability of observing a test statistic as
extreme as, or more extreme than, the one obtained from your sample
data, assuming the null hypothesis is true.

In simpler terms, it quantifies how likely your sample results are given the
null hypothesis. When the p-value is very small (less than 0.05), it
indicates that such extreme results are highly unlikely under the null
hypothesis. This small area beyond the test statistic suggests that the
observed sample proportion significantly deviates from what we would
expect if the null hypothesis were true.

Therefore, a small p-value leads us to reject the null hypothesis because
the observed data would be very rare under the null hypothesis. The test
statistic's magnitude directly influences the p-value; the larger the test
statistic, the smaller the p-value, reflecting a greater deviation from the
null hypothesis and thereby strengthening the evidence against it.

Answer 2. Confidence Intervals

2.(a) To compute the confidence intervals for the sample proportion, we
use the given formula for the confidence interval of a proportion.

For the question in 1(a) above, the data are as follows:

o Sample size (n): 100
o Number of successes (x): 62
o Calculate the Sample Proportion (p^):
p^ = x/n = 62/100 = 0.62
o Calculate the Standard Error (SE), using the sample proportion p^
rather than p0:
SE = Square root{p^*(1−p^)/n} = Square root{0.62*(1−0.62)/100} = 0.0485

90% Confidence Interval: For a 90% confidence interval, the Z-value is
1.645.

CI90% = p^ ± 1.645* Square root{p^*(1−p^)/n}

= 0.62 ± 1.645*0.0485

= 0.62 ± 0.0798

Therefore, 90% Confidence Interval = (0.5402, 0.6998)

95% Confidence Interval: For a 95% confidence interval, the Z-value is
1.96.

CI95% = p^ ± 1.96* Square root{p^*(1−p^)/n}

= 0.62 ± 1.96*0.0485

= 0.62 ± 0.0951

Therefore, 95% Confidence Interval = (0.5249, 0.7151)

99% Confidence Interval: For a 99% confidence interval, the Z-value is
2.576.

CI99% = p^ ± 2.576* Square root{p^*(1−p^)/n}

= 0.62 ± 2.576*0.0485

= 0.62 ± 0.1249

Therefore, 99% Confidence Interval = (0.4951, 0.7449)

These intervals provide ranges within which the true population proportion
is likely to fall, with the specified level of confidence.
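
The three intervals can be reproduced with a short Python sketch (an
assumption, since the assignment used R). It uses the Z-values quoted
above and the standard error based on the sample proportion. Because the
standard error is not rounded to 0.0485 first, the 99% bounds can differ
from the hand calculation in the fourth decimal place.

```python
import math

n = 100
x = 62
p_hat = x / n

# Standard error for a confidence interval uses the sample proportion
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)

# Critical values as quoted in the text
z_values = {"90%": 1.645, "95%": 1.96, "99%": 2.576}

intervals = {}
for level, z in z_values.items():
    margin = z * se_hat                      # margin of error
    intervals[level] = (p_hat - margin, p_hat + margin)
    lower, upper = intervals[level]
    print(f"{level} CI: ({lower:.4f}, {upper:.4f})")
```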

2.(b) Length of the Intervals:

o Length of 90% CI: 0.6998 − 0.5402 = 0.1596

o Length of 95% CI: 0.7151 − 0.5249 = 0.1902

o Length of 99% CI: 0.7449 − 0.4951 = 0.2498

Comparison:

o Shortest Interval: 90% Confidence Interval (0.1596)

o Widest Interval: 99% Confidence Interval (0.2498)

Intuitive Explanation: The width of a confidence interval is related to
the confidence level:

o Higher Confidence Level: To be more confident that the true
population proportion lies within the interval, we use a wider range.
(When you zoom out with a camera, you capture a broader view of, say,
a football field, including several players and more of the action.
This is like a 99% confidence interval, where the interval is wide,
covering more possibilities, so you capture the whole scene with
higher certainty.) This is why the 99% confidence interval is the
widest.

o Lower Confidence Level: If we accept a lower level of confidence, we
can use a narrower interval and still be reasonably sure that the
true proportion falls within it. (When you zoom in closely, you
capture a precise, detailed image of a small area, like the player
with the ball. This is akin to a 90% confidence interval, where the
interval is narrow, giving a specific range, but there is a higher
chance you might miss some of the action happening outside this
narrow frame.) Therefore, the 90% confidence interval is the shortest.

In summary, as the confidence level increases, the interval width
increases to provide higher certainty that the true population parameter is
captured within the interval. Conversely, a lower confidence level gives a
narrower interval, but with less certainty.
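
The link between confidence level and width can be made explicit: the
width is 2 × z* × SE, and the critical value z* grows with the confidence
level. The Python sketch below (an assumption, since the assignment used
R) derives z* from the level with the standard-library inverse normal CDF
instead of a table, so the widths may differ slightly from the rounded
hand values.

```python
import math
from statistics import NormalDist

p_hat, n = 0.62, 100
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)

widths = []
for level in (0.90, 0.95, 0.99):
    # Two-sided critical value: leaves (1 - level)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    width = 2 * z_star * se_hat
    widths.append(width)
    print(f"level = {level:.2f}, z* = {z_star:.3f}, width = {width:.4f}")
```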

Note: In statistical analysis, there's an intrinsic connection between
confidence levels and levels of significance. The confidence level is
calculated as Confidence Level = 1 − Level of Significance. This
relationship ties together hypothesis testing and confidence intervals.
For instance, a 95% confidence level corresponds to a significance level
(α) of 0.05.

When conducting hypothesis testing, particularly with a "not equal to"
alternative hypothesis, any null value that falls outside the obtained
confidence interval leads to the rejection of the null hypothesis.
Essentially, if the null hypothesis value does not lie within the confidence
interval, it indicates that the sample data provide enough evidence to
reject the null hypothesis in favor of the alternative. This interconnection
ensures that both the hypothesis test results and confidence intervals
provide consistent conclusions about the population parameter under
study.
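
This correspondence can be checked on the bird data with a minimal Python
sketch (an assumption, since the assignment used R). It runs a two-sided
z-test of p0 = 0.50 and compares the decision at each significance level
with whether 0.50 falls outside the matching confidence interval. Note
that the test uses the standard error under p0 while the interval uses the
one based on p^, so the agreement is approximate in general, although the
two agree for this data set.

```python
import math
from statistics import NormalDist

n, x, p0 = 100, 62, 0.50
p_hat = x / n
std_normal = NormalDist()

# Two-sided z-test, standard error computed under the null value p0
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

# Confidence intervals, standard error computed from the sample proportion
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)

results = []
for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    z_star = std_normal.inv_cdf(1 - alpha / 2)
    lower, upper = p_hat - z_star * se_hat, p_hat + z_star * se_hat
    null_outside_ci = not (lower <= p0 <= upper)
    test_rejects = p_two_sided < alpha
    results.append((level, null_outside_ci, test_rejects))
    print(f"level = {level:.2f}: 0.50 outside CI = {null_outside_ci}, "
          f"test rejects = {test_rejects}")
```

At the 99% level the interval contains 0.50 and the two-sided test does
not reject at α = 0.01, while at 90% and 95% both approaches reject.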

Answer 3.
(a) Let, MonthlyExpenditure = Y
Age = X1
Income = X2
HoursOfExcercise = X3
ScreenTime = X4
(b) From the given data and using RStudio, the linear regression model of monthly
expenditure on fitness against age, income, hours of exercise and screen time is as
follows:
Y= -0.93891 + 1.82695*X1 + 1.70428*X2 - 0.06169*X3 + 0.12477*X4
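
To show how the fitted equation is used, the Python sketch below (an
assumption, since the model was fitted in R) wraps the coefficients in a
prediction function. The function name and the input values are
hypothetical, and the units of each variable are not stated in the
assignment, so the output only illustrates the arithmetic.

```python
# Coefficients copied from the fitted model above
INTERCEPT = -0.93891
B_AGE = 1.82695        # X1
B_INCOME = 1.70428     # X2
B_EXERCISE = -0.06169  # X3
B_SCREEN = 0.12477     # X4


def predict_monthly_expenditure(age, income, hours_of_exercise, screen_time):
    """Return the fitted value of Y for one individual (hypothetical helper)."""
    return (INTERCEPT
            + B_AGE * age
            + B_INCOME * income
            + B_EXERCISE * hours_of_exercise
            + B_SCREEN * screen_time)


# Hypothetical inputs, chosen only to demonstrate the calculation
base = predict_monthly_expenditure(30, 50, 5, 4)
older = predict_monthly_expenditure(31, 50, 5, 4)

print(f"predicted Y = {base:.5f}")
print(f"change in Y for one extra unit of age = {older - base:.5f}")
```

Each coefficient is the change in predicted Y for a one-unit increase in
that predictor with the other three held fixed.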
(c) The p-values obtained are as follows:
For X1, p-value = 2.02 × 10^-8
For X2, p-value < 2 × 10^-16
For X3, p-value = 0.378
For X4, p-value = 0.039
(d). The ‘R’ screenshot for reference is as follows:

(e) From the teachings in class, we know that a p-value measures the
probability of obtaining results at least as extreme as those observed, assuming
that the null hypothesis is true. The lower the p-value, the stronger the evidence
against the null hypothesis. A p-value of 0.05 or lower is generally considered
statistically significant.
For X1, p-value = 2.02 × 10^-8 and it is < 0.05
For X2, p-value < 2 × 10^-16 and it is < 0.05
For X3, p-value = 0.378 and it is > 0.05
For X4, p-value = 0.039 and it is < 0.05
Thus, from the p-values obtained above, we can draw the following conclusions
regarding the significance of the predictors:
(i) The age and monthly income of the individuals (both p-values are far
below 0.05) are highly significant predictors of their monthly expenditure on
fitness. There is strong evidence of a relationship between the dependent variable,
monthly expenditure, and the independent variables age and monthly income.
(ii) The p-value of hours of exercise is greater than 0.05; hence it is not
statistically significant and is not a reliable predictor of the monthly
expenditure of the individuals.
(iii) The p-value of screen time is less than 0.05, so it bears a statistically
significant relation with the monthly expenditure of the individuals. Hence, it is
a reliable predictor of monthly expenditure, although the evidence is weaker
than for age and income.
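
The decision rule applied above can be written as a short Python sketch
(an assumption, since the analysis was done in R). The p-value for X2 is
reported only as an upper bound, so the bound itself is used here.

```python
ALPHA = 0.05  # significance level used in the text

# p-values from part (c); X2 is reported as "< 2e-16", so 2e-16 is a bound
p_values = {
    "X1 (Age)": 2.02e-08,
    "X2 (Income)": 2e-16,
    "X3 (Hours of exercise)": 0.378,
    "X4 (Screen time)": 0.039,
}

# A predictor is flagged as significant when its p-value is below ALPHA
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, flag in significant.items():
    verdict = "significant" if flag else "not significant"
    print(f"{name}: p = {p_values[name]:.3g} -> {verdict} at alpha = {ALPHA}")
```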
