Confidence Interval

Uploaded by

awankanwal568

Regression and correlation q3

To calculate the linear correlation coefficient, denoted by r, we first need the means, sample
standard deviations, and sum of products for age (A) and weight (W):

mean(A) = (2 + 3 + 5 + 7 + 8) / 5 = 5

mean(W) = (14 + 20 + 32 + 42 + 44) / 5 = 30.4

std(A) = sqrt(((2-5)^2 + (3-5)^2 + (5-5)^2 + (7-5)^2 + (8-5)^2) / 4) = sqrt(26 / 4) = 2.55

std(W) = sqrt(((14-30.4)^2 + (20-30.4)^2 + (32-30.4)^2 + (42-30.4)^2 + (44-30.4)^2) / 4)
       = sqrt(699.2 / 4) = 13.22

sum(A*W) = (2*14 + 3*20 + 5*32 + 7*42 + 8*44) = 894

Using these values, we can calculate the linear correlation coefficient:

r = (1/4) * (sum(A*W) - 5*mean(A)*mean(W)) / (std(A)*std(W))
  = (1/4) * (894 - 760) / (2.55 * 13.22) = 0.994

Therefore, there is a very strong positive linear correlation between age and weight.
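
The correlation calculation can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name pearson_r and the variable names are illustrative choices, not part of the original solution. It works from sums of squared deviations, so the (n - 1) factors in the standard deviations cancel.

```python
import math

# Data from the problem: ages (years) and weights (kg) of five children.
ages = [2, 3, 5, 7, 8]
weights = [14, 20, 32, 42, 44]


def pearson_r(xs, ys):
    """Pearson correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(ages, weights)
print(round(r, 3))
```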

To find the equation of the regression line of age on weight, we use the formula:
y = b0 + b1*x

where y is the predicted value of age, x is the observed value of weight, b0 is the intercept, and
b1 is the slope. The values of b0 and b1 can be calculated as:

b1 = (sum(A*W) - 5*mean(A)*mean(W)) / (sum(W^2) - 5*mean(W)^2)
   = (894 - 760) / (5320 - 4620.8) = 134 / 699.2 = 0.1916

b0 = mean(A) - b1*mean(W) = 5 - (134 / 699.2)*30.4 = -0.826

Therefore, the equation of the regression line of age on weight is:

age = -0.826 + 0.1916*weight

To estimate the weight of a six year old child from this line, we set age = 6 and solve for
weight:

6 = -0.826 + 0.1916*weight

weight = (6 + 0.826) / 0.1916 = 35.6

Therefore, the approximate weight of a six year old child is 35.6 kilograms. However, it is
important to note that this is only an estimate based on the given data, and actual weights may
vary depending on other factors such as height, body composition, and overall health.
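
The least-squares fit of age on weight can be sketched as follows. The helper name fit_line is hypothetical, and the weight for a six year old is obtained by inverting the fitted line, which is one reasonable reading of the question rather than the only one.

```python
# Data from the problem: ages (years) and weights (kg) of five children.
ages = [2, 3, 5, 7, 8]
weights = [14, 20, 32, 42, 44]


def fit_line(xs, ys):
    """Least-squares line y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Regression of age (y) on weight (x).
b0, b1 = fit_line(weights, ages)

# Invert age = b0 + b1*weight to estimate the weight at age 6.
weight_at_6 = (6 - b0) / b1
print(round(b0, 3), round(b1, 4), round(weight_at_6, 1))
```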
Confidence Interval & Hypothesis Testing

1. To construct a confidence interval for the true mean of bill payments, we can use
the following formula:

CI = x̄ ± z*(σ/√n)

where:

CI is the confidence interval

x̄ is the sample mean (RM 205.42)

z* is the z-score corresponding to the desired confidence level of 97.8%. We can find this using
a standard normal distribution table or calculator. For a two-sided interval with a confidence
level of 97.8%, each tail has area 0.011, so the z-score is 2.29.

σ is the population standard deviation (RM 8.92)

n is the sample size (475)

Substituting the values, we get:

CI = 205.42 ± 2.29*(8.92/√475)

CI = 205.42 ± 0.94

CI = (204.48, 206.36)

Therefore, we are 97.8% confident that the true mean bill payment for all customers falls within
the range of RM 204.48 to RM 206.36.
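
A z-interval of this form can be computed directly, with the z-score taken from the standard normal quantile function instead of a table. This is a minimal sketch; the function name z_interval is an illustrative choice, and statistics.NormalDist requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Two-sided confidence interval for a mean with known population sigma."""
    # Each tail holds (1 - confidence) / 2 of the probability.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Bill payment problem: sample mean RM 205.42, sigma RM 8.92, n = 475, 97.8% level.
low, high = z_interval(205.42, 8.92, 475, 0.978)
print(round(low, 2), round(high, 2))
```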
2. To construct a confidence interval for the true average duration for all games, we
can use the following formula:

CI = x̄ ± t*(s/√n)

where:

- CI is the confidence interval

- x̄ is the sample mean

- t* is the t-score corresponding to the desired confidence level of 95% with (n-1) degrees of
freedom. We can find this using a t-distribution table or calculator. For a two-sided interval
with 9 degrees of freedom and a confidence level of 95%, the t-score is 2.262.

- s is the sample standard deviation

- n is the sample size

First, we need to calculate the sample mean and standard deviation from the given data:

x̄ = (94 + 110 + 70 + 115 + 122 + 99 + 86 + 103 + 80 + 91) / 10 = 970 / 10 = 97.0

s = √[Σ(xi - x̄)² / (n - 1)] = √[2342 / 9] = 16.13

Substituting the values into the formula, we get:

CI = 97.0 ± 2.262*(16.13/√10)

CI = 97.0 ± 11.54

CI = (85.46, 108.54)

Therefore, we are 95% confident that the true average duration for all games falls within the
range of 85.46 to 108.54 minutes.
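
The same t-interval can be computed from the raw data. This is a minimal sketch: the function name t_interval is illustrative, and because the Python standard library has no t-distribution quantile function, the table value t* = 2.262 is passed in by hand.

```python
import math
from statistics import mean, stdev


def t_interval(data, t_star):
    """Two-sided t-interval for a mean; t_star is the table value for n - 1 df."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / math.sqrt(n)
    return xbar, s, (xbar - margin, xbar + margin)


# Game durations in minutes, as listed in the problem.
durations = [94, 110, 70, 115, 122, 99, 86, 103, 80, 91]
xbar, s, (low, high) = t_interval(durations, t_star=2.262)
print(xbar, round(s, 2), round(low, 2), round(high, 2))
```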
3. To construct a confidence interval for the mean Rockwell hardness, we can use the
following formula:

CI = x̄ ± t*(s/√n)

where:

- CI is the confidence interval

- x̄ is the sample mean (48.50)

- t* is the t-score corresponding to the desired confidence level of 90% with (n-1) degrees of
freedom. We can find this using a t-distribution table or calculator. For a two-sided interval
with 11 degrees of freedom and a confidence level of 90%, the t-score is 1.796.

- s is the sample standard deviation (the square root of the sample variance 2.25, which is 1.5)

- n is the sample size (12)

Substituting the values into the formula, we get:

CI = 48.50 ± 1.796*(√2.25/√12)

CI = 48.50 ± 0.78

CI = (47.72, 49.28)

Therefore, we are 90% confident that the true mean Rockwell hardness falls within the range
of 47.72 to 49.28.

4. To construct a confidence interval for the true mean time for all mixing processes,
we can use the following formula:

CI = x̄ ± z*(σ/√n)

where:

- CI is the confidence interval


- x̄ is the sample mean (3.75 hours)

- z* is the z-score corresponding to the desired confidence level of 95%. We can find this
using a standard normal distribution table or calculator. For a two-sided interval with a
confidence level of 95%, the z-score is 1.96.

- σ is the population standard deviation (0.7986 hours)

- n is the sample size (45)

Substituting the values, we get:

CI = 3.75 ± 1.96*(0.7986/√45)

CI = 3.75 ± 0.23

CI = (3.52, 3.98)

Therefore, we are 95% confident that the true mean time for all mixing processes falls within
the range of 3.52 to 3.98 hours.

5. To find a 95% confidence interval on the population mean time that engineering
students spend watching television per night, we can use the following formula:

CI = x̄ ± t*(s/√n)

where:

- CI is the confidence interval

- x̄ is the sample mean (obtained by summing up all the observations and dividing by the
sample size): (2 + 1.5 + 3 + 2 + 3.5 + 1 + 0.5 + 3 + 2 + 4) / 10 = 22.5 / 10 = 2.25

- t* is the t-score corresponding to the desired confidence level of 95% with (n-1) degrees of
freedom. We can find this using a t-distribution table or calculator. For a two-sided interval
with 9 degrees of freedom and a confidence level of 95%, the t-score is 2.262.

- s is the sample standard deviation, which can be calculated using the formula:

s = sqrt((1/(n-1)) * Sum((xi - x̄)^2))

where xi is the i-th observation, x̄ is the sample mean, and n is the sample size.

Substituting the values, we get:

s = sqrt((1/(10-1)) * ((2-2.25)^2 + (1.5-2.25)^2 + (3-2.25)^2 + (2-2.25)^2 + (3.5-2.25)^2 +
(1-2.25)^2 + (0.5-2.25)^2 + (3-2.25)^2 + (2-2.25)^2 + (4-2.25)^2)) = sqrt(11.125 / 9)

s = 1.1118

Substituting the values into the formula for the confidence interval, we get:

CI = 2.25 ± 2.262*(1.1118/√10)

CI = 2.25 ± 0.80

CI = (1.45, 3.05)

Therefore, we are 95% confident that the true population mean time engineering students
spend watching television per night falls within the range of 1.45 to 3.05 hours.

6. To test whether the mean time to stop a vehicle with ABS during heavy rain is
exactly 25.4 seconds, we can perform a one-sample t-test. The null and alternative
hypotheses are:

H0: µ = 25.4 (the mean time to stop a vehicle with ABS during heavy rain is exactly 25.4
seconds)

Ha: µ ≠ 25.4 (the mean time to stop a vehicle with ABS during heavy rain is not exactly 25.4
seconds)

We can use a significance level of 𝛼 = 0.03, which means we want a 97% confidence level
for our conclusion.
The test statistic for a one-sample t-test is:

t = (x̄ - µ) / (s / √n)

where x̄ is the sample mean, µ is the hypothesized population mean, s is the sample standard
deviation, and n is the sample size.

Substituting the values given in the problem, we get:

t = (26.02 - 25.4) / (2.3 / √40) = 0.62 / 0.3637 = 1.70

Using a t-distribution table or calculator with 39 degrees of freedom (n - 1), we find that the
p-value for a two-tailed test with a t-statistic of 1.70 is approximately 0.096. This means that,
assuming the true population mean is 25.4, there is about a 9.6% probability of obtaining a
sample mean at least as far from 25.4 as 26.02.

Since the p-value (0.096) is greater than the significance level (0.03), we fail to reject the
null hypothesis. This means that there is not sufficient evidence to conclude that the mean time
to stop a vehicle with ABS during heavy rain differs from 25.4 seconds. However, failing to
reject H0 does not prove that the mean is exactly 25.4 seconds; it only shows that the data are
consistent with that value.
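
A two-tailed p-value for a one-sample t-test can be approximated without a table. The sketch below assumes only the Python standard library, so it integrates the t density numerically with Simpson's rule; the function names are illustrative, and a statistics package would normally supply the t CDF directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def two_sided_p(xbar, mu0, s, n):
    """t statistic and two-tailed p-value for H0: mu = mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, 2 * t_upper_tail(abs(t), n - 1)


# ABS stopping-time problem: x-bar = 26.02, mu0 = 25.4, s = 2.3, n = 40.
t_stat, p_value = two_sided_p(26.02, 25.4, 2.3, 40)
print(round(t_stat, 2), round(p_value, 3))
```

The null hypothesis is rejected only when the printed p-value is below the chosen significance level.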

7. To test En. Ali's claim, we need to perform a hypothesis test. The null and
alternative hypotheses for this test are:

H0: μ ≤ 3.9 (the true population mean weight of fish caught at the XYZ fishing centre is less
than or equal to 3.9 kg)

Ha: μ > 3.9 (the true population mean weight of fish caught at the XYZ fishing centre is
greater than 3.9 kg)

We will use a significance level of 𝛼 = 0.05, which means we want to be 95% confident in
our conclusion.
Since we are not given the population standard deviation, we will use the sample standard
deviation as an estimate. Suppose En. Ali takes a random sample of 10 fish caught at the
XYZ fishing centre and obtains the following weights (in kg):

5.6, 4.2, 4.5, 4.7, 4.4, 5.3, 3.7, 3.1, 4.8, 3.4

The sample mean is x̄ = (5.6+4.2+4.5+4.7+4.4+5.3+3.7+3.1+4.8+3.4)/10 = 43.7/10 = 4.37 kg

The sample standard deviation is s = sqrt[((5.6-4.37)²+(4.2-4.37)²+(4.5-4.37)²+(4.7-4.37)²+
(4.4-4.37)²+(5.3-4.37)²+(3.7-4.37)²+(3.1-4.37)²+(4.8-4.37)²+(3.4-4.37)²)/(10-1)]
= sqrt[5.721/9] = 0.80 kg

Using the one-sample t-test, the t-statistic for this test is:

t = (x̄ - μ) / (s / √n) = (4.37 - 3.9) / (0.80 / √10) = 1.86

The degrees of freedom for this test are 9 (n-1). Using a t-distribution table or calculator, the
p-value for a one-tailed test with 9 degrees of freedom and a t-statistic of 1.86 is
approximately 0.048.

Since the p-value (0.048) is less than the significance level (0.05), we can reject the null
hypothesis. Therefore, we have evidence to support En. Ali's claim that the average weight of
fish caught at the XYZ fishing centre is higher than his average catch of 3.9 kg, at the 0.05
significance level. Because the p-value is only slightly below 0.05, this evidence is not strong.
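
The right-tailed test on the fish weights can be sketched from the raw data. The function name one_sample_t is illustrative, and the critical value 1.833 is the standard table value for a one-tailed test at the 0.05 level with 9 degrees of freedom, entered by hand rather than computed.

```python
import math
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """t statistic for a one-sample t-test against the hypothesized mean mu0."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / math.sqrt(n))


# Fish weights in kg, as listed in the problem.
fish = [5.6, 4.2, 4.5, 4.7, 4.4, 5.3, 3.7, 3.1, 4.8, 3.4]

t_stat = one_sample_t(fish, 3.9)
T_CRIT = 1.833  # table value: one-tailed, alpha = 0.05, 9 degrees of freedom
reject = t_stat > T_CRIT  # right-tailed test: reject H0 when t exceeds the critical value
print(round(t_stat, 2), reject)
```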

8. To determine if the weight of a vehicle that can be raised by a jack is more than
100kg, we will perform a one-sample t-test.

The null hypothesis is that the mean weight that a jack can raise is at most 100kg (H0: μ ≤ 100).

The alternative hypothesis is that the mean weight that a jack can raise is greater than 100kg
(Ha: μ > 100).

Using a significance level of 𝛼 = 0.03, we can find the critical value for a one-tailed test from
the t-distribution with 124 degrees of freedom (125 - 1):

t_critical = 1.90 (approximately)

The test statistic can be calculated as:

t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(sample size))

t = (102.2 - 100) / (15.17 / sqrt(125))

t = 1.62

Since t (1.62) is less than t_critical (1.90), we fail to reject the null hypothesis and conclude
that there is not sufficient evidence to suggest that a jack can raise more than a 100kg vehicle
at a significance level of 𝛼 = 0.03.
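
Printed t-tables rarely list 𝛼 = 0.03 or 124 degrees of freedom, so a critical value like this one usually has to be computed. The sketch below is one way to do that with the Python standard library alone, by bisecting on a numerically integrated upper-tail probability; the function names are illustrative, and a statistics package would normally provide the quantile directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df, lo=0.0, hi=50.0):
    """Upper-tail critical value t with P(T > t) = alpha, found by bisection."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


crit = t_critical(0.03, 124)
t_stat = (102.2 - 100) / (15.17 / math.sqrt(125))
print(round(crit, 2), round(t_stat, 2), t_stat > crit)
```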

9. To determine if the average lifetime of the calculator is at least 2 years, we will
perform a one-sample t-test.

The null hypothesis is that the mean lifetime of the calculator is at least 2 years (H0: μ ≥ 2).

The alternative hypothesis is that the mean lifetime of the calculator is less than 2 years
(Ha: μ < 2).

Using a significance level of 𝛼 = 0.025 (one-tailed, left-tailed test), we can find the critical
value from the t-distribution table with 17 degrees of freedom (18 - 1):

t_critical = -2.11

The test statistic can be calculated as:

t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(sample size))

t = (1.84 - 2) / (0.17 / sqrt(18))

t = -3.99

Since t (-3.99) is less than t_critical (-2.11), we reject the null hypothesis and conclude that
there is sufficient evidence to suggest that the average lifetime of the calculator is less than 2
years with a significance level of 𝛼 = 0.025.
10. We can use a one-sample t-test to determine whether the average score is
significantly greater than 70.

The null hypothesis is that the average score is not greater than 70, i.e.,

H0: μ ≤ 70

The alternative hypothesis is that the average score is greater than 70, i.e.,

Ha: μ > 70

We can use a t-test with a significance level of 0.05 and 99 degrees of freedom (since we
have a sample size of 100 and are estimating one parameter) to determine whether we can
reject the null hypothesis.

The test statistic is calculated as follows:

t = (x̄ - μ) / (s / sqrt(n))

where x̄ is the sample mean, μ is the population mean (in this case, 70), s is the sample
standard deviation, and n is the sample size.

Plugging in the values, we get:

t = (71.8 - 70) / (8.9 / sqrt(100)) = 2.02

Looking up the critical value of t for 99 degrees of freedom and a one-tailed test with a
significance level of 0.05, we get a value of 1.660. Since our calculated t-value (2.02) is
greater than the critical value of t (1.660), we can reject the null hypothesis and conclude that
the average score is significantly greater than 70 at a significance level of 0.05.

11. Null hypothesis H0: the average weight of each bag is 8kg.

Alternative hypothesis Ha: the average weight of each bag is less than 8kg.

We will use a one-tailed (left-tailed) t-test with a significance level of 0.01 and 49 degrees of
freedom (n-1), since the population standard deviation is estimated by the sample standard
deviation.

The test statistic is calculated as:

t = (x̄ - μ) / (s / √n) = (7.8 - 8) / (0.5 / √50) = -2.83

where x̄ is the sample mean, μ is the population mean, s is the sample standard deviation, and
n is the sample size.

Using a t-distribution table with 49 degrees of freedom and a one-tailed test at a significance
level of 0.01, we find the critical value to be -2.405.

Since the calculated test statistic (-2.83) is less than the critical value (-2.405), we reject the
null hypothesis and conclude that there is sufficient evidence to suggest that the average
weight of each bag is less than 8kg at a significance level of 0.01.

12. To test the claim that the average distance driven per year is more than 20,000 km,
we will conduct a one-sample t-test with a significance level of 0.05.

The null hypothesis is that the true mean distance driven per year is equal to or less than
20,000 km:

H0: μ ≤ 20000

The alternative hypothesis is that the true mean distance driven per year is greater than
20,000 km:

Ha: μ > 20000

We have a sample size of n = 30, a sample mean of x̄ = 22500 km, and a sample standard
deviation of s = 5500 km.

The t-statistic is calculated as:

t = (x̄ - μ) / (s / sqrt(n))

where μ = 20000 km.

Plugging in the numbers, we get:

t = (22500 - 20000) / (5500 / sqrt(30))

t = 2.49

Using a t-table with 29 degrees of freedom (n-1), we find that the critical value for a
one-tailed test at a 0.05 level of significance is 1.699. Since our calculated t-statistic (2.49) is
greater than the critical value (1.699), we reject the null hypothesis.

Therefore, we can conclude that there is sufficient evidence to suggest that car owners drive
more than 20,000 km per year on average.

13. This is a one-tailed hypothesis test with the null hypothesis H0: µ ≥ 500 and the
alternative hypothesis Ha: µ < 500.

We can use a one-sample t-test since the sample size is large enough (n=60) to assume the
sample mean is approximately normally distributed.

The test statistic can be calculated as:

t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(sample size))

t = (X̄ - 500) / (s / sqrt(n))

t = (X̄ - 500) / (20 / sqrt(60))

t = (X̄ - 500) / 2.582

where X̄ is the sample mean and s is the sample standard deviation.

Using a significance level of 0.03 and 59 degrees of freedom (n-1), the critical value for a
one-tailed test is approximately -1.92.

If the calculated t-value is less than the critical value, we can reject the null hypothesis and
conclude that the true mean breaking strength is less than 500kg. In terms of the sample mean,
the rejection region is:

X̄ < 500 - 1.92*2.582 = 495.04

The sample mean is not stated here, so the decision is left in terms of X̄. If X̄ is below about
495kg, we reject the null hypothesis at a 0.03 level of significance. Otherwise we fail to reject
it, and there is not enough evidence to support the claim that the mean breaking strength of the
new bungee jumping cords is less than 500kg.
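
For a left-tailed test, the decision rule can be restated as a cutoff on the sample mean itself. The sketch below does that arithmetic; the function name rejection_cutoff is hypothetical, and the critical value magnitude 1.92 (𝛼 = 0.03, 59 degrees of freedom) is an approximate value entered by hand, not one computed by the code.

```python
import math


def rejection_cutoff(mu0, s, n, t_crit):
    """Largest sample mean that still rejects H0: mu >= mu0 in a left-tailed t-test.

    t_crit is the positive magnitude of the one-tailed critical value.
    """
    return mu0 - t_crit * s / math.sqrt(n)


# Bungee cord problem: mu0 = 500 kg, s = 20 kg, n = 60.
# Assumption: |t_crit| is about 1.92 for alpha = 0.03 with 59 degrees of freedom.
cutoff = rejection_cutoff(500, 20, 60, t_crit=1.92)
print(round(cutoff, 2))
```

Any sample mean below the printed cutoff leads to rejecting H0; a sample mean at or above it does not.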
14. We can use a one-sample t-test to test the hypothesis that the average content of
containers of a particular brand of cooking oil is 10 liters.

The null and alternative hypotheses are:

- H0: µ = 10 (the mean content of the containers is 10 liters)

- Ha: µ ≠ 10 (the mean content of the containers is not 10 liters)

We can calculate the t-statistic using the formula:

t = (x̄ - µ) / (s / sqrt(n))

where x̄ is the sample mean, µ is the hypothesized population mean (10 liters), s is the
sample standard deviation, and n is the sample size.

Plugging in the values from the sample, we get:

x̄ = 10.06

s = 0.23

n = 10

Then, the t-statistic can be calculated as:

t = (10.06 - 10) / (0.23 / sqrt(10)) = 0.06 / 0.0727 = 0.82

Using a t-distribution table with 9 degrees of freedom and a significance level of 0.01
(two-tailed test), the critical values are ±3.250.

Since our calculated t-value (0.82) lies between the critical values (-3.250 and 3.250), we fail
to reject the null hypothesis and conclude that there is not sufficient evidence to suggest that
the mean content of the containers is different from 10 liters at a 0.01 level of significance.
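
A two-tailed test against a table critical value can be sketched from the summary statistics alone. The function name two_sided_t_test is illustrative, and the critical value 3.250 is the table value quoted in the solution, passed in by hand.

```python
import math


def two_sided_t_test(xbar, mu0, s, n, t_crit):
    """Return (t, reject) for H0: mu = mu0 against Ha: mu != mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    # Two-tailed rule: reject only when |t| exceeds the critical value.
    return t, abs(t) > t_crit


# Cooking oil problem: x-bar = 10.06, mu0 = 10, s = 0.23, n = 10, alpha = 0.01.
t_stat, reject = two_sided_t_test(10.06, 10, 0.23, 10, t_crit=3.250)
print(round(t_stat, 2), reject)
```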
