
Statistical Inferences, Hypothesis Testing-1

Uploaded by

Samarpan Roy

Lecture-12: Statistical Inferences and Hypothesis Testing
QTDM-II
Descriptive v/s Inferential Statistics
Simple Random Sampling
• Each possible sample of size n has an equal chance of being selected.

From Yesterday’s Session


Mean of Sampling Distribution of Mean
Consider a population containing 4 elements: (3, 7, 11, 15). Samples of size
two are to be drawn from this population. Calculate the following:
1. Population Mean
2. Sample Mean
3. Population Variance
4. Sample Variance
5. Mean of Sampling Distribution of Mean
6. Variance of Sampling Distribution of Mean
7. Standard Deviation or Standard Error of Sampling Distribution of Mean
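The seven quantities in this exercise can be checked by enumerating every possible sample. The sketch below is only an illustration: the slides do not say whether sampling is with or without replacement, so both cases are computed, and the helper name `sampling_distribution` is hypothetical.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, pvariance, variance

population = [3, 7, 11, 15]
n = 2  # sample size

# Items 1 and 3: population mean and population variance (divisor N)
pop_mean = mean(population)
pop_var = pvariance(population)

def sampling_distribution(samples):
    """Return the sample means, plus the mean, variance and
    standard error of the distribution of those means."""
    means = [mean(s) for s in samples]      # item 2: one mean per sample
    var_of_means = pvariance(means)         # item 6
    return means, mean(means), var_of_means, sqrt(var_of_means)

# Assumption A: sampling WITH replacement, N**n = 16 ordered samples
samples_wr = list(product(population, repeat=n))
means_wr, mean_wr, var_wr, se_wr = sampling_distribution(samples_wr)

# Assumption B: sampling WITHOUT replacement, C(4, 2) = 6 samples
samples_wor = list(combinations(population, n))
means_wor, mean_wor, var_wor, se_wor = sampling_distribution(samples_wor)

# Item 4: sample variance (divisor n - 1) of each with-replacement sample
sample_vars_wr = [variance(s) for s in samples_wr]

print("population mean:", pop_mean, " population variance:", pop_var)
print("with replacement   :", mean_wr, var_wr, round(se_wr, 4))
print("without replacement:", mean_wor, round(var_wor, 4), round(se_wor, 4))
```

In both cases the mean of the sampling distribution equals the population mean. With replacement, the variance of the sample means equals the population variance divided by n; without replacement it is smaller because of the finite population correction.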
Standard Error of an Estimate
Standard error of the estimate [Standard Deviation of a Statistic]
The value of a sample statistic varies from sample to sample, and the sample values are generally not equal to the population
parameter. We therefore need to measure how much, on average, the values of the sample statistic vary around the population
parameter. To do so, we calculate the standard deviation of the sampling distribution, which is known as the standard error
of that statistic. Thus, the standard error of a statistic can be defined as:
"The standard deviation of the sampling distribution of a statistic is known as the standard error, and it is denoted by SE."
Therefore, the standard error of the sample mean is given by
SE(x̄) = σ / √n
where σ is the population standard deviation and n is the sample size (for sampling with replacement or from a large population).
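In practice the population standard deviation is usually unknown, and the standard error of the mean is then commonly estimated as s / √n, where s is the sample standard deviation. A minimal sketch, assuming that estimator; the function name and the data are hypothetical:

```python
from math import sqrt
from statistics import stdev

def standard_error_of_mean(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n),
    where s is the sample standard deviation (divisor n - 1)."""
    n = len(sample)
    return stdev(sample) / sqrt(n)

# Hypothetical measurements, for illustration only
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
print(f"estimated SE of the mean: {standard_error_of_mean(sample):.4f}")
```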
Hypothesis Testing

Statistics
    Descriptive Statistics
    Inferential Statistics
        Estimation
            Point Estimation
            Interval Estimation
        Testing of Hypothesis
            Parametric Test
            Non-parametric Test

Estimation: Used when the value of a population parameter is unknown and we want to infer it from a sample statistic.

• Point estimation: A single value calculated from the sample that estimates the parameter. There are different methods of
point estimation, namely the method of moments, the maximum likelihood method, the least squares method, etc. With the help of
these methods, population characteristics are broadly estimated by their sample counterparts.

• Interval estimation: A range of values within which the parameter is expected to fall, with a certain degree of
confidence. It is the determination of a range of values within which the numerical characteristic of the population is very
likely to lie. Interval estimation is carried out once the sampling distribution of the relevant statistic is known; this is
discussed along with testing of hypothesis.
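The difference between the two kinds of estimate can be shown with a small sketch. It assumes the normal approximation (a z-based interval) for the mean; for small samples a t-based interval would usually be preferred. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def point_and_interval_estimate(sample, confidence=0.95):
    """Return (point estimate, (lower, upper)) for the population mean,
    using the normal approximation."""
    n = len(sample)
    x_bar = mean(sample)                     # point estimate
    se = stdev(sample) / sqrt(n)             # estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return x_bar, (x_bar - z * se, x_bar + z * se)

# Hypothetical data, for illustration only
sample = [48.2, 51.5, 49.9, 50.7, 52.1, 47.8, 50.3, 49.4]
x_bar, (lower, upper) = point_and_interval_estimate(sample)
print(f"point estimate: {x_bar:.2f}")
print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")
```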
STEPS for Hypothesis Testing
1. Formation of Hypotheses
2. Selection of Test Statistic and its Sampling Distribution
3. Choice of Level of Significance (α): The confidence with which an
experimenter rejects or fails to reject the Null Hypothesis depends on the
significance level adopted. The level of significance, usually denoted
by α, is the probability that the test statistic falls in the rejection
region (the region outside the confidence or acceptance region) when
the Null Hypothesis is true.
4. Specification of Critical Region
5. Calculation of the value of Test Statistic under H0
6. Decision to be taken
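The six steps can be traced in a short sketch. It assumes a two-tailed one-sample z-test with a known population standard deviation; every number is hypothetical and chosen only to make the steps concrete.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses (hypothetical): H0: mu = 50 versus H1: mu != 50
mu0 = 50.0

# Step 2: test statistic z = (x_bar - mu0) / (sigma / sqrt(n)),
# which follows N(0, 1) under H0 when sigma is known
sigma, n, x_bar = 5.0, 36, 52.0

# Step 3: level of significance
alpha = 0.05

# Step 4: critical region for a two-tailed test: |z| > z_crit
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: value of the test statistic under H0
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 6: decision
reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject_h0}")
```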
Step-1: Formation of Hypotheses
Step-2: Selection of Test Statistic & its Sampling Distribution

Chi-Square Test: Testing Equality of Variances
The chi-square value is often used to judge
the significance of a population variance, i.e.,
we can use the test to judge whether a random
sample has been drawn from a normal
population with mean (μ) and with a
specified variance (σp²). The test is based
on the chi-square distribution.
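The slide does not reproduce the formula, so the sketch below assumes the standard statistic for this test, χ² = (n − 1)s² / σp² with n − 1 degrees of freedom, where s² is the sample variance. The data, the hypothesized variance and the function name are hypothetical; the critical value is read from a chi-square table instead of being computed.

```python
from statistics import variance

def chi_square_variance_statistic(sample, sigma0_sq):
    """Return (chi_square, degrees_of_freedom) for
    H0: population variance = sigma0_sq."""
    n = len(sample)
    s_sq = variance(sample)              # sample variance, divisor n - 1
    return (n - 1) * s_sq / sigma0_sq, n - 1

# Hypothetical sample of 10 observations and hypothesized variance
sample = [20.1, 19.4, 21.3, 18.8, 20.7, 19.9, 22.0, 18.5, 20.4, 19.6]
chi_sq, df = chi_square_variance_statistic(sample, sigma0_sq=1.0)

# Upper 5% point of the chi-square distribution with 9 degrees of
# freedom, taken from a standard table (right-tailed test)
critical_value = 16.919
print(f"chi-square = {chi_sq:.3f}, df = {df}")
print("reject H0:", chi_sq > critical_value)
```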
Step-3: Choice of Level of Significance (α)

• Confidence Level or Acceptance Region:

The confidence level is the percentage of confidence intervals, constructed from repeated random samples, that would
contain the true population parameter. The width of the confidence interval tells us how certain (or uncertain) we are
about the true figure in the population. The half-width, stated as a plus-or-minus figure, is called the margin of error.
A 0% confidence level means we have no faith at all that a repeated survey would give the same results. A 100%
confidence level means there is no doubt at all that a repeated survey would give the same results. In reality, we would
never publish the results of a survey in which we had no confidence at all that our statistics were accurate. A 100%
confidence level does not exist in statistics unless we survey the entire population, and even then we probably could not
be 100 percent sure that the survey was free of some kind of error or bias.

• Confidence Coefficient: The confidence coefficient is the confidence level stated as a proportion, rather
than as a percentage. For example, if you had a confidence level of 99%, the confidence coefficient
would be 0.99.

• Rejection Region: The critical region, also known as the rejection region, is the set of values of the test statistic
for which the null hypothesis is rejected; i.e., if the observed test statistic falls in the critical region, we
reject the null hypothesis in favour of the alternative hypothesis.

• Level of Significance: The level of significance is the probability threshold used to decide whether a result is
statistically significant, i.e., whether the null hypothesis is rejected. It is denoted by the Greek letter α (alpha)
and equals the probability of a Type I error:
Significance Level = P(Type I error) = α
Observations become less likely the farther they lie from the mean. Results are reported
as "significant at x%". Significant at 5% means that the p-value is less than 0.05 (p < 0.05). Similarly,
significant at 1% means that the p-value is less than 0.01.
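The comparison between a p-value and α can be sketched as follows, assuming a test statistic that follows the standard normal distribution under H0. The function names and the example value z = 2.4 are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """p-value of a standard normal test statistic for a
    'left', 'right' or 'two'-tailed test."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

def is_significant(p_value, alpha=0.05):
    """A result is 'significant at alpha' when p < alpha."""
    return p_value < alpha

p = p_value_from_z(2.4)  # hypothetical observed statistic
print(f"p-value = {p:.4f}")
print("significant at 5%:", is_significant(p, 0.05))
print("significant at 1%:", is_significant(p, 0.01))
```

A result can be significant at 5% and still not significant at 1%, which is why the chosen α should be stated alongside the conclusion.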
Step-4, 5 and 6

Step-4: Specification of Critical Region


Step-5: Calculation of the value of Test Statistic under H0
The value of the test statistic is computed using the formula appropriate to the chosen test; that value is then used to make the decision.
Step-6: Decision to be taken
Decision Rule
For right-tailed test: Reject H0 if the computed test statistic is > critical value.
For left-tailed test: Reject H0 if the computed test statistic is < critical value.
For two-tailed test: Reject H0 if the computed test statistic is extreme, either larger than an upper critical value or smaller
than a lower critical value.
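The decision rule for the three kinds of test can be written as one small function. This sketch assumes the test statistic is standard normal under H0, so critical values come from the standard normal distribution; for t or chi-square tests the critical values would come from those distributions instead. The function name is hypothetical.

```python
from statistics import NormalDist

def reject_h0(test_statistic, alpha, tail):
    """Apply the decision rule for a z-type test statistic."""
    std_normal = NormalDist()
    if tail == "right":
        return test_statistic > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return test_statistic < std_normal.inv_cdf(alpha)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return test_statistic > upper or test_statistic < -upper
    raise ValueError("tail must be 'left', 'right' or 'two'")

# The same statistic can lead to different decisions depending on the tail
print(reject_h0(1.7, alpha=0.05, tail="right"))
print(reject_h0(1.7, alpha=0.05, tail="two"))
```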
Risks in Decision Making

Risks in Decision Making Using Hypothesis Testing

Using hypothesis testing involves the risk of reaching an incorrect conclusion. We might wrongly reject a true null
hypothesis, H0, or, conversely, we might wrongly fail to reject a false null hypothesis. These types of risk are called Type I
and Type II errors.
• A Type I error occurs if we reject the null hypothesis when it is true and should not be rejected. A Type I error is a
"false alarm." The probability of a Type I error occurring is α.
• A Type II error occurs if we do not reject the null hypothesis when it is false and should be rejected. A Type II error
represents a "missed opportunity" to take some corrective action. The probability of a Type II error occurring is β.

Probability of Type I And Type II Errors

• The level of significance (α) of a statistical test is the probability of committing a Type I error.
• The β risk is the probability of committing a Type II error.
• The complement of the probability of a Type I error (1 – α) is called the confidence coefficient. The confidence
coefficient is the probability that we will not reject the null hypothesis when it is true and should not be rejected.
• The complement of the probability of a Type II error (1 – β) is called the power of a statistical test. The power of a
statistical test is the probability that we will reject the null hypothesis when it is false and should be rejected.
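The four probabilities can be computed directly for a simple case. The sketch below assumes a right-tailed one-sample z-test with known σ and a specific true mean under the alternative; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Probability of NOT rejecting H0: mu = mu0 in a right-tailed
    z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # boundary of rejection region
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # true mean in standard-error units
    return std_normal.cdf(z_crit - shift)

alpha = 0.05
beta = type_ii_error(mu0=50.0, mu1=52.0, sigma=5.0, n=36, alpha=alpha)

print(f"P(Type I error)        alpha     = {alpha:.3f}")
print(f"confidence coefficient 1 - alpha = {1 - alpha:.3f}")
print(f"P(Type II error)       beta      = {beta:.3f}")
print(f"power                  1 - beta  = {1 - beta:.3f}")
```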
Complements of Type I And Type II Errors

• The confidence coefficient (1 – α) is the probability that you will not reject the null hypothesis when it is true and
should not be rejected.
• The power of a statistical test (1 – β) is the probability that you will reject the null hypothesis when it is false and
should be rejected.

Practically, for a fixed sample size, it is impossible to minimize both errors simultaneously. An attempt to reduce α
would raise β, and vice versa. So the obvious question is which error is more serious and which one
we can choose to live with. This is subjective and depends on the type of research and the subject
concerned.

• One way to reduce the probability of making a Type II error is by increasing the sample size.
• Large samples generally permit us to detect even very small differences between the hypothesized values
and the actual population parameters. For a given level of α, increasing the sample size decreases β and
therefore increases the power of the statistical test to detect that the null hypothesis, H0, is false.
• However, there is always a limit, and this affects the decision of how large a sample we can select. For
any given sample size, we must consider the trade-offs between the two possible types of errors.
• Because we can directly control the risk of a Type I error, we can reduce this risk by selecting a smaller
value for α. For example, if the negative consequences associated with making a Type I error are
substantial, we could select α = 0.01 instead of 0.05.
• However, when we decrease α, we increase β, so reducing the risk of a Type I error results in an
increased risk of a Type II error. Conversely, to reduce β we could select a larger value for α. Therefore, if
it is important to try to avoid a Type II error, we can select α of 0.05 or 0.10 instead of 0.01.
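Both effects described in these points, the sample size and the choice of α, can be tabulated with a small sketch. It assumes a right-tailed one-sample z-test with known σ; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed_z(mu0, mu1, sigma, n, alpha):
    """Power (1 - beta) of a right-tailed z-test of H0: mu = mu0
    when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    beta = std_normal.cdf(z_crit - shift)
    return 1 - beta

mu0, mu1, sigma = 50.0, 52.0, 5.0  # hypothetical values
for alpha in (0.01, 0.05, 0.10):
    for n in (10, 30, 100):
        power = power_right_tailed_z(mu0, mu1, sigma, n, alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  beta={1 - power:.3f}  power={power:.3f}")
```

Reading down a block of the output shows β falling as n grows for a fixed α; comparing blocks shows β falling as α is raised for a fixed n.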