hypothesis testing
Population
In statistics, we generally want to study a population.
You can think of a population as a collection of persons,
things, or objects under study.
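Population
Determine the population:
We want to know the average (mean) amount of money first year college
students spend at Christ University on school supplies (excluding books).
We randomly survey 100 first year students at the college. Three of those
students spent Rs 150, Rs 200, and Rs 225, respectively.
Ans: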
The population is all first-year students attending Christ University this
term.
Key Terms
Sample
To study the population, we select a sample.
The idea of sampling is to select a portion (or subset) of the
larger population and study that portion (the sample) to gain
information about the population.
Sample
Determine the sample:
We want to know the average (mean) amount of money first year college
students spend at Christ University on school supplies (excluding books).
We randomly survey 100 first year students at the college. Three of those
students spent Rs 150, Rs 200, and Rs 225, respectively.
Ans:
The sample is the 100 first-year students surveyed at the college.
Parameter
A parameter is a number that is a property of the population.
Because it takes a lot of time and money to examine an entire
population, sampling is a very practical technique.
Parameter
Determine the parameter:
We want to know the average (mean) amount of money first year college
students spend at Christ University on school supplies (excluding books).
We randomly survey 100 first year students at the college. Three of those
students spent Rs 150, Rs 200, and Rs 225, respectively.
Ans:
The parameter is the average (mean) amount of money spent (excluding
books) by first year college students at Christ University this term.
Statistic
A statistic is a number that represents a property of the
sample.
Statistic
Determine the statistic:
We want to know the average (mean) amount of money first year college
students spend at Christ University on school supplies (excluding books).
We randomly survey 100 first year students at the college. Three of those
students spent Rs 150, Rs 200, and Rs 225, respectively.
Ans:
The statistic is the average (mean) amount of money spent (excluding
books) by first year college students in the sample.
Relation Between a Population and its Samples
Population – Parameter
Sample – Statistic
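The pairing above can be illustrated with a small sketch. The population here is simulated (5,000 hypothetical students with spending drawn from a normal distribution with mean Rs 200 and standard deviation Rs 40); none of these numbers come from a real survey, and they are assumptions chosen only to make the example run.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: spending (Rs) of 5,000 first-year students.
# These values are simulated for illustration, not real survey data.
population = [random.gauss(200, 40) for _ in range(5000)]

# Parameter: a property of the whole population (usually unknown in practice).
parameter = statistics.mean(population)

# Sample: a random subset of 100 students, drawn without replacement.
sample = random.sample(population, 100)

# Statistic: the same property, computed from the sample only.
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two printed values will usually be close but not equal; that gap is the sampling error that hypothesis testing has to account for.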
Categories of Statistical Analysis
Some examples of claims that can be tested:
▪ The average rate of inflation in the 1970s was greater than the average rate of inflation in
the 1990s.
▪ An increase in the proportion of workers belonging to labor unions increases the wage rate
in a state, ceteris paribus.
The process that enables a decision maker to test the validity (or
significance) of a claim by analyzing the difference between the value of
the sample statistic and the corresponding hypothesized population
parameter value is called hypothesis testing.
Hypothesis Testing
• Is also called significance testing
• Tests a claim about a parameter using evidence (data in a sample)
• The technique is introduced by considering a one-sample z test
• The procedure is broken into five steps
• Each element of the procedure must be understood
Step 1: State the Null Hypothesis (H0) and Alternative Hypothesis (H1 or Ha)
The problem: In the 1970s, 20–29 year old men in India had a mean μ body weight
of 170 pounds. Standard deviation σ was 40 pounds. We test whether mean body
weight in the population now differs.
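Stated symbolically, and assuming the two-sided reading of "differs", the hypotheses for this problem and the one-sample z statistic they lead to can be written as below. The sample size n and the sample mean x̄ are not given in this part of the notes, so they are left as symbols.

```latex
H_0 : \mu = 170 \qquad \text{versus} \qquad H_1 : \mu \neq 170

z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad \mu_0 = 170, \quad \sigma = 40
```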
Level of significance: a result that is significant at the 95% level is
reported by statistical software as ".05" rather than "95%". The value .05 is
the level of significance α: if H0 is true, there is a five percent (.05)
chance of rejecting it by mistake. It is not the probability that the finding
itself is true or false.
Compare the calculated value of the test statistic with the critical value (also called the standard table
value of the test statistic). The decision rules for the null hypothesis are as follows:
• Accept (that is, do not reject) H0 if the test statistic value falls within the area of acceptance.
• Reject H0 otherwise.
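The decision rule can be sketched as a small function. This is a minimal illustration for a z test only; the function name `decide` and its `tail` argument are hypothetical, and the "accept" outcome is worded as "do not reject H0".

```python
from statistics import NormalDist


def decide(z, alpha=0.05, tail="two-sided"):
    """Apply the critical-value decision rule to a z test statistic."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "two-sided":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical   # rejection region in both tails
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical        # rejection region in the upper tail
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical        # rejection region in the lower tail
    else:
        raise ValueError("tail must be 'two-sided', 'right' or 'left'")
    return "reject H0" if reject else "do not reject H0"


print(decide(2.5))                 # two-sided test at alpha = 0.05
print(decide(1.7, tail="right"))   # right-tailed test at alpha = 0.05
```

The same z value can lead to different decisions depending on the tail: 1.7 lies beyond the one-tailed critical value at 5% but inside the two-tailed acceptance region.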
HYPOTHESIS TESTING FOR POPULATION PARAMETERS WITH LARGE SAMPLES
Hypothesis Testing for Single Population Mean
A packaging device is set to fill detergent powder packets with a mean weight of 5 kg, with a standard
deviation of 0.21 kg. The weight of packets can be assumed to be normally distributed. The weight of
packets is known to drift upwards over a period of time due to machine fault, which is not tolerable. A
random sample of 100 packets is taken and weighed. This sample has a mean weight of 5.03 kg. Can we
conclude that the mean weight produced by the machine has increased? Use a 5 per cent level of
significance.
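One way to work this problem is a right-tailed one-sample z test with H0: μ = 5 and H1: μ > 5, since the concern is an upward drift. The sketch below assumes the stated standard deviation of 0.21 kg is the population value σ; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 5.0      # hypothesized mean weight (kg)
sigma = 0.21   # population standard deviation (kg), as stated in the problem
n = 100        # sample size
xbar = 5.03    # observed sample mean (kg)
alpha = 0.05   # level of significance

standard_error = sigma / sqrt(n)             # 0.021
z = (xbar - mu0) / standard_error            # about 1.43
critical = NormalDist().inv_cdf(1 - alpha)   # about 1.645 (right-tailed)

reject_h0 = z > critical
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject_h0}")
```

Because z (about 1.43) is below the critical value (about 1.645), H0 is not rejected: at the 5 per cent level the sample does not show that the mean weight has increased.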
Confidence Level    Z Score (two-tailed)    Level of Significance
90%                 1.645                   10%
95%                 1.96                    5%
99%                 2.576                   1%
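The z scores in this table are two-tailed critical values of the standard normal distribution. As a sketch, they can be recomputed with Python's standard library, alongside the one-tailed values needed for directional tests. The dictionary names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution

two_tailed = {}
one_tailed = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # Two-tailed: alpha is split equally between the two tails.
    two_tailed[confidence] = std_normal.inv_cdf(1 - alpha / 2)
    # One-tailed: all of alpha sits in a single tail.
    one_tailed[confidence] = std_normal.inv_cdf(1 - alpha)

for confidence in two_tailed:
    print(f"{confidence:.0%}: two-tailed z = {two_tailed[confidence]:.3f}, "
          f"one-tailed z = {one_tailed[confidence]:.3f}")
```

Note that 1.645 is the two-tailed value at 10% significance and also the one-tailed value at 5% significance.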
The mean lifetime of a sample of 400 fluorescent light bulbs produced by a
company is found to be 1600 hours with a standard deviation of 150 hours. Test the
hypothesis that the mean life time of the bulbs produced in general is higher than
the mean life of 1570 hours at α = 0.01 level of significance.
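A possible worked solution uses a right-tailed z test with H0: μ = 1570 and H1: μ > 1570. The sketch assumes that, with n = 400, the sample standard deviation of 150 hours can stand in for σ; it reports a p-value instead of comparing against a critical value.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 1570     # hypothesized mean lifetime (hours)
xbar = 1600    # sample mean lifetime (hours)
s = 150        # sample standard deviation (hours), used in place of sigma
n = 400        # sample size (large, so the z approximation is assumed)
alpha = 0.01   # level of significance

standard_error = s / sqrt(n)         # 150 / 20 = 7.5
z = (xbar - mu0) / standard_error    # 30 / 7.5 = 4.0

# Right-tailed p-value: probability of a z at least this large if H0 is true.
p_value = 1 - NormalDist().cdf(z)

reject_h0 = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.6f}, reject H0: {reject_h0}")
```

With z = 4.0 the p-value is far below 0.01, so H0 is rejected: the data support a mean lifetime above 1570 hours.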
An ambulance service claims that it takes, on the average, 8.9 minutes to reach its destination in
emergency calls. To check on this claim, the agency which licenses ambulance services has then
timed on 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.8
minutes. Does this constitute evidence that the figure claimed is too low at the 1 per cent
significance level?
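This problem can be sketched as a right-tailed z test with H0: μ = 8.9 and H1: μ > 8.9, because the question is whether the claimed figure is too low. The sketch assumes n = 50 is large enough to use the sample standard deviation in place of σ. It also computes the smallest sample mean that would lead to rejection, which is an illustrative extra and not part of the original problem.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 8.9      # claimed mean response time (minutes)
xbar = 9.3     # observed sample mean (minutes)
s = 1.8        # sample standard deviation (minutes), used in place of sigma
n = 50         # number of timed emergency calls
alpha = 0.01   # level of significance

standard_error = s / sqrt(n)
z = (xbar - mu0) / standard_error            # about 1.57
critical = NormalDist().inv_cdf(1 - alpha)   # about 2.326 (right-tailed)

# Smallest sample mean (in minutes) that would fall in the rejection region.
critical_mean = mu0 + critical * standard_error

reject_h0 = z > critical
print(f"z = {z:.3f}, critical value = {critical:.3f}")
print(f"reject H0 only if the sample mean exceeds {critical_mean:.2f} minutes")
print(f"reject H0: {reject_h0}")
```

Since z (about 1.57) is below 2.326, H0 is not rejected: at the 1 per cent level the sample is not sufficient evidence that the claimed 8.9 minutes is too low.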
Type I Error (False Positive Error)
Cost Assessment
• Type I Error / False Positive: Costs (actual costs plus shepherd credibility)
  associated with scrambling the townsfolk to kill the non-existing wolf
• Type II Error / False Negative: Replacement cost for the sheep eaten by the
  wolf, and replacement cost for hiring a new shepherd
Null Hypothesis: Person is not guilty of the crime
• Type I Error / False Positive: Person is judged as guilty when the person
  actually did not commit the crime (convicting an innocent person)
• Type II Error / False Negative: Person is judged not guilty when they
  actually did commit the crime (letting a guilty person go free)
Cost Assessment
• Type I Error: Social costs of sending an innocent person to prison and
  denying them their personal freedoms (which, in our society, is considered
  an almost unbearable cost)
• Type II Error: Risks of letting a guilty criminal roam the streets and
  commit future crimes
Null Hypothesis: Medicine A cures Disease B
• Type I Error / False Positive (H0 true, but rejected as false): Medicine A
  cures Disease B, but is rejected as false
• Type II Error / False Negative (H0 false, but accepted as true): Medicine A
  does not cure Disease B, but is accepted as true
Cost Assessment
• Type I Error: Lost opportunity cost for rejecting an effective drug that
  could cure Disease B
• Type II Error: Unexpected side effects (maybe even death) for using a drug
  that is not effective
Hence, many textbooks and instructors will say that a Type I
(false positive) error is worse than a Type II (false negative) error.
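How often each kind of error occurs can be estimated by simulation. The sketch below reuses μ0 = 170 and σ = 40 from the body-weight problem; the sample size of 64, the alternative true mean of 180, and the number of trials are hypothetical choices made only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is repeatable

mu0, sigma = 170, 40   # hypothesized mean and known sd (body-weight problem)
n = 64                 # hypothetical sample size
alpha = 0.05           # level of significance
trials = 2000          # hypothetical number of simulated studies

critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value


def rejects(true_mean):
    """Draw one sample from N(true_mean, sigma) and test H0: mu = mu0."""
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / sqrt(n))
    return abs(z) > critical


# Type I error: H0 is true (true mean equals mu0) but the test rejects it.
type_i_rate = sum(rejects(mu0) for _ in range(trials)) / trials

# Type II error: H0 is false (true mean is 180) but the test fails to reject.
type_ii_rate = sum(not rejects(180) for _ in range(trials)) / trials

print(f"estimated Type I error rate:  {type_i_rate:.3f}")
print(f"estimated Type II error rate: {type_ii_rate:.3f}")
```

The Type I rate should land near α, because α is exactly the rejection probability when H0 is true. The Type II rate depends on how far the true mean is from μ0 and on the sample size.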