
Lab 6 For Students

A chemical engineer claims the population mean yield of a batch process is 500 grams/ml, but a sample of 25 batches shows a mean of 518 grams/ml and a standard deviation of 40 grams/ml. Using a two-tailed t-test at a 5% significance level, the calculated t-statistic of 2.25 exceeds the critical t-value of 2.0639, leading to the rejection of the claim that the population mean is 500 grams/ml. Additionally, confidence intervals for a sample of 90 observations indicate that the true population mean lies within specific ranges, and the validity of these intervals is supported by the Central Limit Theorem.

Uploaded by

Prza

PROBLEM 2.

A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each month. If the computed t-value falls between −t0.05 and t0.05, he is satisfied with this claim. What conclusion should he draw from a sample that has a mean x̄ = 518 grams per milliliter and a sample standard deviation s = 40 grams? Assume the distribution of yields to be approximately normal.

SOLUTION 2. A chemical engineer claims that the population mean μ=500 grams/ml.
A sample of 25 batches gives:
- Sample mean = 518
- Sample standard deviation s=40

We assume the population is normally distributed, and we’re doing a two-tailed test at the 5% significance level.
We are told:
If the t-value falls between −t₀.₀₅ and t₀.₀₅, we accept the claim.

Let's write what we know:

Sample mean (x̄) = 518
Claimed population mean (μ) = 500
Sample SD (s) = 40
Sample size (n) = 25
Significance level (α) = 0.05

Let's first determine the degrees of freedom (df). It will be:

Degrees of freedom (df) = n − 1 = 25 − 1 = 24

Let's now calculate the standard error:

Standard error (SE) = s / √n = 40 / √25 = 8

The standard error tells us how much the sample means would vary if we kept taking samples of 25 batches over and over again.
Think of it like this:
"If we repeat this test 100 times, how much would the results bounce around just by chance?"
A smaller SE means our sample mean is more stable and trustworthy.
Now, let's compute the t-statistic:

t-statistic = (x̄ − μ) / SE = (518 − 500) / 8 = 2.25

The t-statistic tells us how far our sample mean is from the claimed population mean, measured in standard errors.
So in this case:
“The sample mean is 2.25 standard errors higher than the claimed mean of 500.”
This number helps us decide if the difference is small and acceptable (normal chance) or too big to ignore.
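The standard error and the t-statistic can also be reproduced outside Excel. The following is a minimal Python sketch that uses only the numbers from the problem statement; the variable names are illustrative and do not come from the lab sheet.

```python
import math

# Values from the problem statement
sample_mean = 518.0   # x-bar
claimed_mean = 500.0  # mu under the engineer's claim
sample_sd = 40.0      # s
n = 25                # batches sampled

# Standard error of the mean: s / sqrt(n)
standard_error = sample_sd / math.sqrt(n)

# t-statistic: distance between x-bar and mu, in units of standard error
t_statistic = (sample_mean - claimed_mean) / standard_error

print(f"SE = {standard_error:.2f}, t = {t_statistic:.2f}")
```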

Now, we need to find the critical t-value. A critical t-value is a cutoff point.
It tells us:
"How far from the claimed mean is too far?"
We use it to decide whether the difference between the sample mean and the claimed mean is small enough to be normal, or large enough to reject the claim.

For that we are using the Excel formula: =T.INV(probability,deg_freedom)

Where:
- probability → it is the probability (from 0 to 1) to the left of the requested t-value
- deg_freedom → degrees of freedom, in the t-test it is n−1

What will be the probability in our case?

We perform a two-sided test with a 5% error (α = 0.05). In a two-tailed test, that error is split in half:
- 2.5% on the left side
- 2.5% on the right side

But the function T.INV(probability, df) looks for the probability to the left of the boundary, that's why we are using:

Probability = 1 − α/2 = 1 − 0.025 = 0.975

Now, what does α = 0.05 mean?

It means we accept a 5% chance of rejecting the claim even though the claim is true.
But because we are checking both tails (both ends of the curve, lower and upper), we divide this error equally between the two sides:

This gives us 2.5% for each tail.

Imagine a bell-shaped curve (normal or t-distribution). For a 95% confidence level:
The middle 95% is where we accept the result.
The outer 5% is where we reject the result:
- 2.5% on the left side
- 2.5% on the right side
That’s where the value 0.025 comes from.
But to find the critical value on the right side, we need the total area to the left of it, which is:

1 − 0.025 = 0.975

Because we are saying:

“Give me the t-value where 97.5% of the curve is to the left of it (and 2.5% is to the right).”
We are looking for this:

t-value = T.INV(0.975, 24) = 2.0639
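For readers without Excel, the idea behind T.INV can be sketched in plain Python: integrate the t density numerically to get the area to the left of a point, then search for the point whose left-hand area equals the requested probability. This is only an illustration under stated assumptions. The function names t_pdf, t_cdf and t_inv are made up here, the integration uses Simpson's rule, and nothing is implied about how Excel computes the value internally.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Area to the left of x: 0.5 plus the signed Simpson integral over [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_inv(probability: float, df: int) -> float:
    """Bisection search for the t-value with the given area to its left."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical = t_inv(0.975, 24)
print(f"critical t-value = {critical:.4f}")
```

The printed value should agree closely with the 2.0639 used in this solution. In practice a statistics library would replace the hand-written integration.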

Now, we are given that if the t-value falls between −t₀.₀₅ and t₀.₀₅, we accept the claim. We have that:

Critical t-value = ±2.0639

So if our calculated t-statistic is greater than 2.0639 or less than −2.0639, it's too extreme to be due to chance.
To accept the claim we need −t₀.₀₅ < t-statistic < t₀.₀₅, i.e. −2.0639 < t-statistic < 2.0639.
Note that 2.0639 leaves 2.5% in each tail. If t₀.₀₅ is instead read as the value with 5% in the upper tail (df = 24), the cutoff is 1.711 and the conclusion below is the same.

But since our calculated t-statistic = 2.25 is greater than 2.0639, we can conclude that the sample mean of 518 g/ml is significantly higher than the claimed mean of 500 g/ml.
Therefore, we reject the claim that the population mean is 500 g/ml.
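The accept/reject rule itself is a single comparison. A small sketch, with a hypothetical helper name and the critical value 2.0639 taken from the text rather than computed:

```python
def two_tailed_decision(t_statistic: float, t_critical: float) -> str:
    """Return 'reject' if |t| exceeds the critical value, otherwise 'accept'."""
    return "reject" if abs(t_statistic) > t_critical else "accept"

t_statistic = (518 - 500) / (40 / 25 ** 0.5)  # same numbers as in the solution
t_critical = 2.0639                            # T.INV(0.975, 24), as quoted in the text

print(two_tailed_decision(t_statistic, t_critical))
```

Using the absolute value covers both tails at once, so a sample mean far below 500 would be rejected by the same rule.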
Why do we use the t-statistic?
We use the t-statistic when:
- We want to compare the sample mean to a claimed population mean.
- The population standard deviation σ is unknown.
- We are working with a small sample size (usually n<30).

The goal is to check if the difference between the sample mean and the population mean is statistically significant or just happened by chance.
So basically, we’re asking:
“Is the sample mean far enough from the claimed mean to doubt the claim?”
What does the t-statistic tell us?
The t-statistic tells us how many standard errors the sample mean is away from the population mean μ.

- A large t-value (positive or negative) suggests a big difference → we may reject the claim.
- A small t-value means the sample mean is close to the claimed mean → the difference could just be random variation.
What are degrees of freedom (df)?
Degrees of freedom tell us how much information is available for estimating variation.
For a one-sample t-test, the formula is:

df = n − 1

Where:
- n is the number of observations (sample size).
- We subtract 1 because we’re using the sample mean to estimate the population mean, which "uses up" one degree of freedom.
Think of it this way:
If you know the values of n−1 data points and the mean, you can calculate the last data point. So only n−1 values are free to vary.
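The idea that the last value is not free can be checked directly. In this sketch the five yields are made-up numbers used only for illustration; they are not lab data.

```python
# Hypothetical sample of n = 5 values (not from the lab sheet)
values = [510.0, 495.0, 530.0, 520.0, 535.0]
n = len(values)
mean = sum(values) / n

# Pretend the last value is unknown: the mean and the other n - 1 values determine it
known = values[:-1]
recovered_last = n * mean - sum(known)

print(recovered_last, values[-1])
```

Once the mean is fixed, the last observation carries no new information, which is why the variance estimate divides by n − 1.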



PROBLEM 3. A random sample of 90 observations produced a mean x̄ = 25.9 and a standard deviation s = 2.7.
a) Find a 95% confidence interval for the population mean.
b) Find a 90% confidence interval for μ.
c) Find a 99% confidence interval for μ.
d) What happens to the width of a confidence interval as the value of the confidence coefficient is increased while the sample size is held fixed?
e) Would your confidence intervals of parts a-c be valid if the distribution of the original population was not normal? Explain.

SOLUTION 3. A random sample of 90 observations produced a mean x̄ = 25.9 and standard deviation s = 2.7.
We want to calculate confidence intervals for the population mean using different confidence levels.
We want to estimate the true mean of a population based on a sample.
But since we don’t have data for the entire population, we use a confidence interval to give a range where we believe the true mean is located.

From the problem statement we know the following:

- Sample mean x̄ = 25.9
- Sample standard deviation s = 2.7
- Sample size n = 90
- Degrees of freedom df = n − 1 = 90 − 1 = 89

Before determining confidence intervals, let's compute the sample standard error for the mean.

The sample standard deviation (s) tells us how much individual data points vary within a single sample.
But what we really care about when estimating the population mean is:
How much would the sample mean vary if we took many samples from the population?
That’s where Standard Error (SE) comes in.
So, we already know the sample standard deviation (s = 2.7) and the sample size (n = 90).
The standard error (SE) tells us how much the sample mean is expected to vary from the true population mean.

Standard error (SE) = s / √n = 2.7 / √90 ≈ 0.2846

This means that if we took many samples of size 90, their means would typically differ from the true mean by about 0.2846.

Since we do not know the population standard deviation, we are going to use the t-distribution for determining each confidence interval.

A confidence interval is a range of values that we believe, with a certain level of confidence, contains the true population mean (μ).

If the population standard deviation is unknown (which is common), and the sample is small (n < 30) or the population is not perfectly normal, we are using this formula:

CI = x̄ ± t(α/2, df) × s / √n

a) 95% Confidence Interval


Now let's write or compute everything that is needed for the confidence interval formula.

Sample mean = 25.9
Standard error (SE) = 0.2846
Level of confidence = 0.95
Alpha = 1 - Level of confidence = 0.05
Alpha/2 = 0.025
df = n - 1 = 89

We need the t-value for a 95% confidence level. Since the interval is two-sided, each tail gets 0.025 (because 0.05 total alpha → 0.025 per tail). In Excel we are using:
=T.INV.2T(probability,deg_freedom)
Here probability is the total alpha (0.05); the function splits it between the two tails itself.

t-value = T.INV.2T(0.05, 89) ≈ 1.9870

Now let's calculate the margin of error:

Margin of error (ME) = t-value × SE ≈ 1.9870 × 0.2846 ≈ 0.5655

and then let's determine the confidence interval (CI):

Lower Interval Limit = 25.9 − 0.5655 ≈ 25.33
Upper Interval Limit = 25.9 + 0.5655 ≈ 26.47

So the wanted confidence interval is [25.33, 26.47], which means that we are 95% confident that the true population mean is between 25.33 and 26.47.
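The same interval can be checked with a few lines of Python. This is a sketch under one assumption: the t critical value 1.9870 for df = 89 is typed in by hand (it is what T.INV.2T(0.05, 89) or a t-table gives), because the Python standard library has no t-distribution.

```python
import math

x_bar, s, n = 25.9, 2.7, 90

# Assumed critical value: two-tailed, alpha = 0.05, df = 89
t_value = 1.9870

standard_error = s / math.sqrt(n)   # s / sqrt(n)
margin = t_value * standard_error   # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"SE = {standard_error:.4f}, ME = {margin:.4f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```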

b) 90% Confidence Interval


Now let's write or compute everything that is needed for the confidence interval formula.
Sample mean = 25.9
Standard error (SE) = 0.2846
Level of confidence = 0.90
Alpha = 1 - Level of confidence = 0.10
Alpha/2 = 0.05
df = n - 1 = 89
t-value = T.INV.2T(0.10, 89) ≈ 1.6622
Margin of error ME ≈ 1.6622 × 0.2846 ≈ 0.4731
Lower Interval Limit ≈ 25.43
Upper Interval Limit ≈ 26.37

The 90% confidence interval is [25.43, 26.37], which means that we are 90% confident that the true population mean is between 25.43 and 26.37.

c) 99% Confidence Interval


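Now let's write or compute everything that is needed for the confidence interval formula.

Sample mean = 25.9
Standard error (SE) = 0.2846
Level of confidence = 0.99
Alpha = 1 - Level of confidence = 0.01
Alpha/2 = 0.005
df = n - 1 = 89
t-value = T.INV.2T(0.01, 89) ≈ 2.6322
Margin of error ME ≈ 2.6322 × 0.2846 ≈ 0.7491
Lower Interval Limit ≈ 25.15
Upper Interval Limit ≈ 26.65

The 99% confidence interval is [25.15, 26.65], which means that we are 99% confident that the true population mean is between 25.15 and 26.65.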


d) How does confidence level affect the width?
As we increase the confidence level:
- We want to be more certain,
- So we have to widen the interval to “capture” the true mean more often.
- Higher confidence level → wider interval
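The widening can be seen by computing all three intervals side by side. In this sketch the t critical values for df = 89 are assumptions entered by hand (as they would come from T.INV.2T or a t-table); only the sample figures come from the problem.

```python
import math

x_bar, s, n = 25.9, 2.7, 90
standard_error = s / math.sqrt(n)

# Assumed two-tailed t critical values for df = 89, keyed by confidence level
t_critical = {0.90: 1.6622, 0.95: 1.9870, 0.99: 2.6322}

widths = {}
for level, t_value in sorted(t_critical.items()):
    margin = t_value * standard_error
    widths[level] = 2 * margin  # interval width = upper limit - lower limit
    print(f"{level:.0%}: [{x_bar - margin:.2f}, {x_bar + margin:.2f}], "
          f"width = {widths[level]:.2f}")
```

Because the sample mean and the standard error stay fixed, the width depends only on the t-value, and the t-value grows with the confidence level.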

e) What if the population is not normal?


This is where the Central Limit Theorem (CLT) comes into play. The CLT says:
Even if the original population is not normally distributed, the sampling distribution of the mean will be approximately normal if the sample size is large (n ≥ 30 is a common rule).
Here, n = 90, which is large enough.
So, the confidence intervals are still approximately valid, even if the population isn’t normal.
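A small simulation can make the CLT argument concrete. The sketch below is not part of the lab: it assumes an exponential population (strongly skewed, so clearly not normal), draws many samples of size 90, and checks how the sample means behave. The seed and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_mean(n: int) -> float:
    # Exponential population with mean 1 and SD 1: right-skewed, not normal
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(90) for _ in range(5000)]

center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, about 95% of values lie within 1.96 SDs of the center
inside = sum(abs(m - center) <= 1.96 * spread for m in means) / len(means)

print(f"mean of sample means = {center:.3f}")
print(f"SD of sample means   = {spread:.3f} (theory: {1 / 90 ** 0.5:.3f})")
print(f"share within 1.96 SD = {inside:.3f}")
```

If the sample means are close to normal, the spread should be near σ/√n and the share within 1.96 standard deviations should be near 0.95.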


PROBLEM 4. A random sample of 50 consumers taste-tested a new snack food. Their responses were coded (0: do not like; 1: like; 2: indifferent) and recorded as follows:

Snack Data
1 0 0 1 2 0 1 1 0 0
0 1 0 2 0 2 2 0 0 1
1 0 0 0 0 1 0 2 0 0
0 1 0 0 1 0 0 1 0 1
0 2 0 0 1 1 0 0 0 1

a) Use an 80% confidence interval to estimate the proportion of consumers who like the snack food.
b) Provide a statistical interpretation for the confidence interval you constructed in part a.

SOLUTION 4. So, we have that a random sample of 50 consumers tasted a snack.
Each consumer's response is recorded as:
- 0 = Does not like
- 1 = Likes
- 2 = Indifferent
We want to calculate an 80% confidence interval for the proportion of consumers who like the snack (coded as 1).

We have the 50 responses in a single column (only the 1s are “likes”). Let's calculate how many people like the snack:

Likes = 15

Now we need to calculate the sample proportion:

Sample proportion (p̂) = Likes / n = 15 / 50 = 0.30

When we need to calculate the standard error of our sample proportion, which tells us how much our sample might vary from the truth, we use this formula:

SE = √(p̂ × (1 − p̂) / n)

Why? Because we're working with only a sample, not the whole population, so we expect some natural variation.
To calculate the standard error we are using this Excel formula:

=SQRT(p̂ * (1 - p̂ ) / 50)

Standard error of SP (SE) ≈ 0.0648

When we want to estimate the range of likely values for the true population proportion (π), based on our sample, we use this formula:

CI = p̂ ± Z × SE

We have already calculated the sample proportion (p̂) and the standard error of the sample proportion, so now we need to find the Z-value for an 80% confidence level.

We want to be 80% sure that our true value (the proportion of people who like the snack) is in a certain interval.
But if 80% is in the middle, how much remains in the tails of the distribution? The answer is:

α = 1 − 0.80 = 0.20

Since our interval is a two-tailed confidence interval with 80% confidence, we have two tails, so it will be:

α/2 = 0.10

This means that 10% goes to the left tail, 10% goes to the right tail, and 80% is in the middle.

We want to find the Z-value which leaves 10% in the right tail, i.e. 90% below itself (because everything to the left of that point needs to be 90%).

In Excel we are using the formula: =NORM.S.INV(0.90)

Z-value ≈ 1.2816

This means that we are 1.28 standard deviations from the mean, and we take 80% of the data (40% left, 40% right).

Now, let's just calculate the margin of error (ME):

Margin of error (ME) = Z × SE ≈ 1.2816 × 0.0648 ≈ 0.0831

and after that the confidence interval:

Lower Interval Limit = 0.30 − 0.0831 ≈ 0.217
Upper Interval Limit = 0.30 + 0.0831 ≈ 0.383

So, the confidence interval is [0.217, 0.383], which means that we are 80% sure that the real proportion of consumers who like the snack is between 21.7% and 38.3%.
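The whole calculation for the snack data can be sketched in Python with the standard library. The list below is the Snack Data column as read from the sheet; `NormalDist().inv_cdf` plays the role of NORM.S.INV here, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Snack Data column: 0 = does not like, 1 = likes, 2 = indifferent
responses = [
    1, 0, 0, 1, 2, 0, 1, 1, 0, 0,
    0, 1, 0, 2, 0, 2, 2, 0, 0, 1,
    1, 0, 0, 0, 0, 1, 0, 2, 0, 0,
    0, 1, 0, 0, 1, 0, 0, 1, 0, 1,
    0, 2, 0, 0, 1, 1, 0, 0, 0, 1,
]

n = len(responses)
likes = responses.count(1)          # only the 1s count as "likes"
p_hat = likes / n                   # sample proportion
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

confidence = 0.80
# Area to the left of the upper cutoff: 1 - alpha/2 = 0.90
z_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_value * standard_error
lower, upper = p_hat - margin, p_hat + margin

print(f"likes = {likes}, p_hat = {p_hat:.2f}, SE = {standard_error:.4f}, z = {z_value:.4f}")
print(f"80% CI: [{lower:.3f}, {upper:.3f}]")
```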

PROBLEM 5. A problem with a telephone line that prevents a customer from receiving or making calls is upsetting to both the customer and the telephone company. The data contains samples of 20 problems reported to two different offices of a telephone company and the time to clear these problems (in minutes) from the customers’ lines:

Central Office I Time to Clear Problems (minutes)
1.48 1.75 0.78 2.85 0.52 1.6 4.15 3.97 1.48 3.1
1.02 0.53 0.93 1.6 0.8 1.05 6.32 3.93 5.45 0.97

Central Office II Time to Clear Problems (minutes)
7.55 3.75 0.1 1.1 0.6 0.52 3.3 2.1 0.58 4.02
3.75 0.65 1.92 0.6 1.53 4.23 0.08 1.48 1.65 0.72

Assuming that the population variances from both offices are equal, is there evidence of a difference in the mean waiting time between the two offices? (Use α = 0.05)

SOLUTION 5. You are given two sets of 20 values:

- Office I: times in minutes for 20 problems
- Office II: same for another 20 problems
We want to know: Is there a significant difference between the two average times?

                        Office 1    Office 2
Sample size (n)         20          20
Sample mean (x̄)         2.214       2.0115
Sample SD (s)           1.718       1.892

We want to calculate a confidence interval (CI) for the true population mean of Office I and Office II using:

CI = x̄ ± t(α/2, df) × s / √n

                        Office 1    Office 2
df = n-1                19          19
Significance level α    0.05        0.05
t-value                 2.0930      2.0930
Standard error          0.384       0.423
Margin of error         0.804       0.885
Lower Interval Limit    1.41        1.13
Upper Interval Limit    3.02        2.90

We are 95% confident that the true mean time it takes Office I to resolve a customer problem is between 1.41 and 3.02 minutes.
We are 95% confident that the true mean time it takes Office II to resolve a customer problem is between 1.13 and 2.90 minutes.

The two intervals overlap almost completely, so both offices take about the same amount of time to solve a problem. Based on this data, we cannot conclude that one office is definitely slower or faster than the other.
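The problem statement assumes equal population variances, and the usual tool for that setting is the pooled-variance two-sample t-test rather than a comparison of two separate intervals. The sketch below is an addition, not the lab's own worked method. It uses the 20 values per office from the data above, and the critical value 2.0244 (two-tailed, α = 0.05, df = 38) is an assumption taken from a standard t-table.

```python
import math
import statistics

office_1 = [1.48, 1.75, 0.78, 2.85, 0.52, 1.60, 4.15, 3.97, 1.48, 3.10,
            1.02, 0.53, 0.93, 1.60, 0.80, 1.05, 6.32, 3.93, 5.45, 0.97]
office_2 = [7.55, 3.75, 0.10, 1.10, 0.60, 0.52, 3.30, 2.10, 0.58, 4.02,
            3.75, 0.65, 1.92, 0.60, 1.53, 4.23, 0.08, 1.48, 1.65, 0.72]

n1, n2 = len(office_1), len(office_2)
mean1, mean2 = statistics.fmean(office_1), statistics.fmean(office_2)
var1, var2 = statistics.variance(office_1), statistics.variance(office_2)  # sample variances

# Pooled variance: weighted average of the two sample variances
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# Standard error of the difference in means, then the t-statistic
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_statistic = (mean1 - mean2) / se_diff

t_critical = 2.0244  # assumed: two-tailed critical value for alpha = 0.05, df = 38
reject = abs(t_statistic) > t_critical

print(f"mean1 = {mean1:.4f}, mean2 = {mean2:.4f}, pooled variance = {pooled_var:.4f}")
print(f"t = {t_statistic:.4f}, df = {df}, reject equal means: {reject}")
```

With these data the t-statistic lies well inside ±2.0244, which agrees with the conclusion that no difference in mean clearing time between the two offices can be established.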
