Formula Sheet

The document discusses key concepts related to sample distributions and hypothesis testing. It defines key terms like mean, standard deviation, z-value, and sampling distribution. It also covers topics like central limit theorem, sampling distribution of proportions, confidence intervals, hypothesis testing, and ANOVA. Some key points are: 1. The sampling distribution of the mean x-bar is normally distributed with mean μ and standard deviation σ/√n according to the central limit theorem. 2. Hypothesis testing involves a null and alternative hypothesis, a test statistic, a p-value, and decisions to reject or fail to reject the null based on the significance level. 3. Confidence intervals estimate population parameters like the mean μ

Uploaded by

Uoloht Putin

Sample Distribution

1. Mean = µ = Σxi / N = sum of all values / number of values

2. S.D. = σ = √( Σ(xi − µ)² / N ) = square root of (sum of squared deviations from the mean /
number of values)

3. If a population is normal with mean µ and standard deviation σ, the sampling distribution of
the sample mean x bar is also normally distributed, with

µ xbar = µ

σ xbar = σ / √n
4. The Z value for the sampling distribution of x bar is Z = (x bar − µ xbar) / σ xbar
where σ xbar = σ / √n

Question
Suppose a population has mean µ = 8 and standard deviation σ = 3. Suppose a random sample
of size n = 36 is selected. What is the probability that the sample mean is between 7.8 and 8.2 ?
Mean = µ = 8
SD= σ= 3
n = 36
Since n ≥ 30, the sampling distribution of x bar can be treated as approximately normal (central limit theorem)
µ xbar = µ = 8
σ xbar = σ / √n = 3/6 = 0.5
P(7.8 < x bar < 8.2) = P((7.8 − 8)/0.5 < Z < (8.2 − 8)/0.5) = P(−0.4 < Z < 0.4)
The Z table gives an area of 0.1554 between 0 and 0.4; the curve is symmetrical, hence 2 × 0.1554 = 0.3108
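The probability in this question can be cross-checked numerically. The following is a minimal sketch using Python's standard-library NormalDist; the variable names are illustrative and not part of the original sheet.

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Standard error of the mean: sigma / sqrt(n) = 3 / 6 = 0.5
std_error = sigma / n ** 0.5

# Sampling distribution of x bar, treated as normal
xbar_dist = NormalDist(mu, std_error)

# P(7.8 < x bar < 8.2) as a difference of cumulative probabilities
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(round(prob, 4))
```

The result agrees with the table-based value of 0.3108 to about four decimal places.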
Sampling of a Finite Population-
Finite population multiplier = √((N − n)/(N − 1))
Standard error of the mean from a finite population = σ xbar = (σ/√n) × √((N − n)/(N − 1))
Modified Z formula: Z = (x bar − µ) / σ xbar
Sampling Distribution of Proportion-
The sample proportion is the fraction of successes in n binomial trials. It is the number of
successes, x, divided by the number of trials, n.
Sample proportion = p^ = x/n = no. of successes / no. of trials
Standard deviation of p^ = √(p(1 − p)/n)
Z score = (p^ − p) / √(p(1 − p)/n)
Question
Suppose that 25% of all Indian consumers in a given income and lifestyle category are interested in buying a
particular brand of car. A random sample of 100 Indian consumers in the category of interest is
to be selected. What is the probability that at least 20% of those in the sample will express an
interest in that brand of car?
P(p^ ≥ 0.20) = ?
np = 100 × 0.25 = 25
n(1 − p) = 100 × 0.75 = 75
Since both numbers are greater than 5, we may use the normal approximation to the distribution of p^
Mean of p^ = p = 0.25 (equivalently, expected number of successes np = 25)
Standard deviation of p^ = √(p(1 − p)/n)
= √(0.25 × (1 − 0.25)/100) = 0.0433
P(p^ ≥ 0.20) = P(Z ≥ (p^ − p)/√(p(1 − p)/n)) = P(Z ≥ (0.20 − 0.25)/0.0433)
= P(Z ≥ −1.15) = 0.8749
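As a rough check of this proportion question, the normal approximation can be evaluated directly. This is a sketch with illustrative variable names; it does not round z to two decimals, so the result comes out slightly above the table-based 0.8749.

```python
from statistics import NormalDist

p, n = 0.25, 100

# Standard deviation of the sample proportion: sqrt(p(1 - p)/n)
se_phat = (p * (1 - p) / n) ** 0.5

# Z score for a sample proportion of 0.20
z = (0.20 - p) / se_phat

# P(p^ >= 0.20) = P(Z >= z) under the normal approximation
prob_at_least_20 = 1 - NormalDist().cdf(z)
print(round(se_phat, 4), round(z, 2), round(prob_at_least_20, 3))
```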
Estimation
Level of confidence = probability that the unknown population parameter falls within the interval
Denoted 100(1 − α)%
Ex- If we say that the population mean µ falls within the interval from a to b with 95% confidence (i.e., α
= 0.05), then mathematically, P(a < µ < b) = 0.95
1. Estimating µ (mean) from Large Samples When σ (SD) is Known
• The probability that the interval
x bar − Z(α/2) × σ/√n < µ < x bar + Z(α/2) × σ/√n
contains µ is 1 − α

Question
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know
from past testing that the population standard deviation is 0.35 ohms. Determine a 95% confidence
interval for the true mean resistance of the population.
n = 11
Mean = x bar = 2.20
SD = σ = 0.35
1 − α = 0.95
α = 0.05, α/2 = 0.025, Z(0.025) = 1.96
Using the formula x bar ± Z(α/2) × σ/√n:

2.20 − 1.96 × 0.35/√11 < µ < 2.20 + 1.96 × 0.35/√11

1.9932 < µ < 2.4068
We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
Question
A survey was taken of U.S. companies that do business with firms in India. One of the questions on
the survey was: Approximately how many years has your company been trading with firms in India?
A random sample of 44 responses to this question yielded a mean of 10.455 years. Suppose the
population standard deviation for this question is 7.7 years. Using this information, construct a 90%
confidence interval for the mean number of years that a company has been trading in India for the
population of U.S. companies trading with firms in India. [Ans: (8.545, 12.365)]
N = 44
Mean = 10.455
SD = 7.7
1-alpha = 0.90
Alpha = 0.10
Alpha/2 = 0.05
90% confidence ; Z(0.05) = 1.645

10.455 − 1.645 × 7.7/√44 < µ < 10.455 + 1.645 × 7.7/√44
8.545 < µ < 12.365
Question
A study is conducted in a company that employs 800 engineers. A random sample of 50 engineers
reveals that the average sample age is 34.3 years. Historically the population standard deviation of the
age of the company’s engineers is approximately 8 years. Construct a 98% confidence interval to
estimate the average age of all the engineers in the company. [Ans: (31.66, 36.94)]
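N = 800
n = 50
Mean = 34.3
SD = 8
α = 0.02
α/2 = 0.01
Z(0.01) = 2.33
Since the sample is drawn from a finite population of N = 800, the finite population multiplier is applied

34.3 − 2.33 × (8/√50) × √(750/799) ≤ µ ≤ 34.3 + 2.33 × (8/√50) × √(750/799)
31.75 ≤ µ ≤ 36.85
(Without the finite population multiplier the interval is 31.66 ≤ µ ≤ 36.94, which matches the answer quoted in the question.)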
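The three known-σ interval questions above follow the same recipe, which can be sketched as one small function. The function name z_interval and its parameters are hypothetical, and the critical value is computed with the standard-library NormalDist rather than read from a table, so results may differ from the hand calculations in the last decimal place.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence, population_size=None):
    """Confidence interval for the mean when sigma is known.

    If population_size is given, the finite population multiplier
    sqrt((N - n)/(N - 1)) is applied to the standard error.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z(alpha/2)
    std_error = sigma / n ** 0.5
    if population_size is not None:
        std_error *= ((population_size - n) / (population_size - 1)) ** 0.5
    margin = z * std_error
    return xbar - margin, xbar + margin

circuits = z_interval(2.20, 0.35, 11, 0.95)
trading = z_interval(10.455, 7.7, 44, 0.90)
engineers = z_interval(34.3, 8, 50, 0.98, population_size=800)
print(circuits, trading, engineers)
```

Leaving out population_size in the engineers example gives the interval without the finite population multiplier.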

Estimating µ from Small Samples


when sample size < 30
• When the sample size is small (less than 30) and the population standard deviation is unknown, the t-distribution (or
Student’s t-distribution) is more appropriate

• The t-distribution is symmetric and unimodal with mean 0, and flatter than the Z (standard normal) distribution

t = (x bar − µ) / (s/√n)


DOF = n − 1
Question
A random sample of n = 25 taken from a normal population has ¯x = 50 and s = 8. Form a 95%
confidence interval for µ.
Since n = 25 < 30 and σ is unknown, the t-distribution is used
n = 25
DOF = 24
Mean = 50
s = 8
α = 0.05
t(α/2, DOF) = t(0.025, 24) = 2.064
Confidence Interval

50 − 2.064 × 8/√25 < µ < 50 + 2.064 × 8/√25


= [46.698, 53.302]

Question
The owner of a large equipment rental company wants to make a rather quick estimate of the average
number of days a piece of ditchdigging equipment is rented out per person per time. The company has
records of all rentals, but the amount of time required to conduct an audit of all accounts would be
prohibitive. The owner decides to take a random sample of rental invoices. Fourteen different rentals
of ditch diggers are collected randomly from the files, yielding the following data. She uses these data
to construct a 99% confidence interval to estimate the average number of days that a ditch digger is
rented and assumes that the number of days per rental is normally distributed in the population. Data:
3 1 3 2 5 1 2 1 4 2 1 3 1 1

n = 14
α = 0.01
Data: 3 1 3 2 5 1 2 1 4 2 1 3 1 1
Mean = 30/14 = 2.14
s = 1.29
DOF = 13
t(0.005, 13) = 3.012

2.14 − 3.012 × 1.29/√14 < µ < 2.14 + 3.012 × 1.29/√14


[1.10, 3.18]
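The sample mean, sample standard deviation and interval for the ditch digger data can be reproduced as follows. This is a sketch only: the standard library has no t-distribution, so the critical value 3.012 is taken from the table value quoted above as an assumption rather than computed.

```python
from statistics import mean, stdev

days = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]

# Table value t(0.005, 13) from the worked solution; assumed, not computed
t_crit = 3.012

n = len(days)
xbar = mean(days)   # sample mean
s = stdev(days)     # sample standard deviation (divides by n - 1)

margin = t_crit * s / n ** 0.5
ci = (xbar - margin, xbar + margin)
print(round(xbar, 2), round(s, 2), tuple(round(v, 2) for v in ci))
```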

Determining Sample Size for the Mean


E = z × σ/√n (sampling error, or margin of error)
n = z² × σ² / E²
Question
If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?
n = z² × σ² / E²
σ = 45, E = 5
α = 0.10; α/2 = 0.05; z = 1.645
n = (1.645² × 45²)/5² = 219.19, rounded up to 220
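The sample-size formula can be wrapped in a short helper. This is a sketch; the function name sample_size_for_mean is hypothetical, and the result is rounded up to the next whole number so that the margin of error is not exceeded.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

n_required = sample_size_for_mean(45, 5, 0.90)
print(n_required)
```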

Hypothesis Testing
A hypothesis is a claim (assumption) about a population parameter.
• If the null hypothesis is true, then no corrective action would be necessary.
• If the null hypothesis is not true, then some corrective action would be necessary.
Type I Error
• Rejecting a true null hypothesis
• The probability of committing a Type I error is called α, the level of significance, i.e., α = P(Reject
H0 | H0 is true)
Type II Error
• Failing to reject a false null hypothesis
• The probability of committing a Type II error is called β, i.e., β = P(Fail to reject H0 | H0 is false)

• When the p-value is less than α, reject H0.

Z Test
Sample size is at least 30
Z = (x bar − µ) / (σ/√n) (σ is known)
T Test
t = (x bar − µ) / (s/√n) (sample SD s is known in this case, σ is unknown)
P Value calculation
Ex- A random sample of 100 yields a sample mean of only 999, where the hypothesized mean is 1000. Assume the population standard deviation
is known (say 5).
p = P(x bar ≤ 999) = P( (x bar − µ)/(σ/√n) ≤ (999 − 1000)/(5/√100) ) = P(Z ≤ −2) = 0.0228
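The p-value in this example is the left-tail area of the standard normal curve at the computed Z. A minimal sketch, assuming a hypothesized mean of 1000 as used in the calculation and with illustrative variable names:

```python
from statistics import NormalDist

mu0, sigma, n, xbar = 1000, 5, 100, 999

# Test statistic: (x bar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / n ** 0.5)

# Left-tailed p-value: P(Z <= z)
p_value = NormalDist().cdf(z)
print(z, round(p_value, 4))
```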
1-Tailed and 2-Tailed Tests
If action is to be taken if a parameter is less than some value a, then the alternative
hypothesis is that the parameter is less than a, and the test is a left-tailed test.
H0 : µ ≥ 50 vs. H1 : µ < 50

If action is to be taken if a parameter is greater than some value a, then the
alternative hypothesis is that the parameter is greater than a, and the test is a right-tailed test.
H0 : µ ≤ 50 vs. H1 : µ > 50
If action is to be taken if a parameter is either greater than or less than some value a, then the
alternative hypothesis is that the parameter is not equal to a, and the test is a two-tailed test.
H0 : µ = 50 vs. H1 : µ ≠ 50
Question

A test of breaking strength of six ropes manufactured by a company showed a mean breaking
strength of 6425 lb and a standard deviation of 120 lb. However, the manufacturer claimed a mean
breaking strength of 7500 lb.

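1. Can we support the manufacturer’s claim at a level of significance of 0.10?
2. What assumption did you make for this problem?

n = 6
Mean = x bar = 6425
s = 120 lb
Claim: µ = 7500
α = 0.10
H0 : µ = 7500 vs. H1 : µ < 7500
Since n = 6 < 30 and σ is unknown, we use the t test with DOF = n − 1 = 5
t0 = (x bar − µ) / (s/√n)
= (6425 − 7500) / (120/√6) = −1075/48.99 = −21.94
Critical value: −t(0.10, 5) = −1.476
Since t0 < −1.476, we reject H0; the data do not support the manufacturer’s claim
Assumption: breaking strength is normally distributed in the population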
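As a cross-check of the rope question, the test statistic can be computed directly. This sketch assumes a one-sample, left-tailed t test with n − 1 = 5 degrees of freedom (six observations, σ unknown) and takes the critical value 1.476 for t(0.10, 5) from a standard t table rather than computing it.

```python
n, xbar, s, claimed_mu = 6, 6425, 120, 7500

# t statistic: (x bar - claimed mean) / (s / sqrt(n))
t_stat = (xbar - claimed_mu) / (s / n ** 0.5)

# Table value t(0.10, 5); assumed, not computed here
t_crit = 1.476

# Left-tailed test: reject H0 when t_stat falls below -t_crit
reject_h0 = t_stat < -t_crit
print(round(t_stat, 2), reject_h0)
```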

Completing the Anova Table


Question
No. of cases is 26, 3 predictor variables, variance of response variable = 14, coefficient of
determination = 0.90

Source        DF               SS               MS             F
Regression    p = 3            315              315/3 = 105    105/1.59 = 66.04
Error         n − p − 1 = 22   350 − 315 = 35   35/22 = 1.59
Total         n − 1 = 25       350
Variance of response variable = SS total / (n − 1)
14 = SS total / 25
SS total = 350
R² = SS reg / SS total
0.90 = SS reg / 350
SS reg = 315
Mean Square = SS / DOF
F = MS regression / MS error
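The steps used to fill in the table can be sketched in a few lines. Variable names are illustrative; because the mean squares are not rounded before dividing, F comes out as about 66.0 rather than the hand-rounded figure.

```python
n_cases, p = 26, 3
var_y, r_squared = 14, 0.90

# Variance of the response = SS total / (n - 1)
ss_total = var_y * (n_cases - 1)

# R^2 = SS regression / SS total
ss_reg = r_squared * ss_total
ss_err = ss_total - ss_reg

# Degrees of freedom and mean squares (SS / DOF)
df_reg, df_err = p, n_cases - p - 1
ms_reg = ss_reg / df_reg
ms_err = ss_err / df_err

f_stat = ms_reg / ms_err
print(ss_total, round(ss_reg), round(ss_err), round(ms_reg, 2), round(ms_err, 2), round(f_stat, 2))
```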

ANOVA
Question

A company has three manufacturing plants, and company officials want to determine whether there
is a difference in the average age of workers at the three locations. The following data are the ages
of five randomly selected workers at each plant. Determine whether there is a significant difference
in the mean ages of the workers at the three plants. Use α = .0
• The number of degrees of freedom associated with SST is (n-1). n total observations in all r groups,
less one degree of freedom lost with the calculation of the grand mean

• The number of degrees of freedom associated with SSTR is (r-1). r sample means, less one degree
of freedom lost with the calculation of the grand mean

• The number of degrees of freedom associated with SSE is (n-r). n total observations in all groups,
less one degree of freedom lost with the calculation of the sample mean from each of r groups
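For the three-plant question (r = 3 groups of five workers, n = 15 observations), the degrees of freedom listed above can be tabulated with a small helper. The function name is hypothetical, and only the group sizes stated in the question are used.

```python
def anova_degrees_of_freedom(group_sizes):
    """Degrees of freedom for one-way ANOVA from the size of each group."""
    n = sum(group_sizes)   # total observations
    r = len(group_sizes)   # number of groups
    return {"SST": n - 1, "SSTR": r - 1, "SSE": n - r}

df = anova_degrees_of_freedom([5, 5, 5])
print(df)
```

The SSTR and SSE degrees of freedom add up to the SST degrees of freedom, which is a quick consistency check on any ANOVA table.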

Chi Square

Question

Suppose a business researcher wants to determine whether type of gasoline preferred is
independent of a person’s income. She takes a random survey of gasoline purchasers, asking them
one question about gasoline preference and a second question about income. The respondent is to
check whether he or she prefers (1) regular gasoline, (2) premium gasoline, or (3) extra premium
gasoline. The respondent also is to check his or her income brackets as being (1) less than $30,000,
(2) $30,000 to $49,999, (3) $50,000 to $99,999, or (4) more than $100,000.
Question

A survey of morning beverage market shows that the primary breakfast beverage for 17% of
Americans is milk. A milk producer in Wisconsin, where milk is plentiful, believes that the figure is
higher for Wisconsin. To test this idea, she contacts a random sample of 550 Wisconsin residents and
asks which primary beverage they consumed for breakfast that day. Suppose 115 replied that milk
was the primary beverage. Using a level of significance of .05, test the idea that the milk figure is
higher for Wisconsin

H0 : p = .17

H1 : p > .17

Alpha = .05
If Z > 1.645, reject H0.
If Z ≤ 1.645, do not reject H0.

p^ = 115/550 = .209

Z = (p^ − P) / √(PQ/n) = (.209 − .17) / √((.17)(.83)/550) = 2.44

Since Z = 2.44 > 1.645, reject H0
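The milk question can be checked with a short right-tailed test for a proportion. This is a sketch with illustrative variable names; the p-value line is an addition for reference and is not part of the original solution.

```python
from statistics import NormalDist

p0, n, successes = 0.17, 550, 115

# Sample proportion
p_hat = successes / n

# Z = (p^ - P) / sqrt(PQ/n), with Q = 1 - P
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5

# Right-tailed test at alpha = .05: critical value 1.645
reject_h0 = z > 1.645
p_value = 1 - NormalDist().cdf(z)
print(round(p_hat, 3), round(z, 2), reject_h0)
```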
