ERM 4a Final
DATA COLLECTION
&
DATA ANALYSIS
Why a Manager Needs to Know About Statistics
Source of Data:
The Researcher should keep in mind two types of data:
1. Primary
2. Secondary
Primary data: Data which are collected afresh and for the first time, and thus
happen to be original in character.
Secondary data: Data which have already been collected by someone else and
which have already been passed through the statistical process.
The distinction between primary and secondary data can be made clearer on the
basis of documents:
1. Primary data: Documented as record
2. Secondary data: Documented as report
Exploring the Data Contd….
The researcher has to decide which type of data to use
for the study and, accordingly, has to select the
particular type of database.
Aids of Observation: Diaries, note-books, schedules, photographs and maps are the
commonly used devices for observation.
Observation method has various limitations
It is an expensive method.
The information provided by this method is very limited.
Sometimes unforeseen factors may interfere with the observational task.
At times, the fact that some people are rarely accessible to direct observation
makes it difficult to collect data effectively by this method.
Generally, controlled observation takes place in experiments that are carried out in a
laboratory or under controlled conditions,
whereas uncontrolled observation is resorted to in exploratory research.
Interview Method
The interview method is one of the important methods of primary data collection.
It is a conversation between the interviewer and the respondent: oral-verbal questions are asked and
corresponding oral-verbal responses are given to the queries made.
Definition of interviews:
PV Young : The interview may be regarded as a systematic method by which one person enters
more or less legitimately into the inner life of another who is generally a stranger to him.
Hsin Pao Yang: The interview is a technique of field work which is used to watch the behaviour of
an individual or individuals, to record statements, to observe the concrete results of social or
group interactions.
CA Master : In a formal interview pre-determined questions are asked and the answers are
collected in a certain way.
The interviews can be conducted personally or through telephones.
The concept of interview, usually understood as face -to- face encounter, can be extended to
include telephone interviews and in today’s context, video interviews.
Interview Method
The interview method of collecting data involves presentation of oral-verbal
stimuli and reply in terms of oral-verbal responses.
This method can be used through personal interviews and, if possible, through
telephone interviews.
Personal interviews: Personal interview method requires a person known as the
interviewer asking questions generally in a face-to-face contact to the other
person or persons.
At times the interviewee may also ask certain questions and the interviewer
responds to these, but usually the interviewer initiates the interview and collects
the information.
This sort of interview may be in the form of direct personal investigation or it
may be indirect oral investigation.
Direct personal investigation: He has to be on the spot and has to meet people
from whom data have to be collected.
This method is particularly suitable for intensive investigations.
Interview Method
Indirect oral examination can be conducted under which the interviewer has to
cross-examine other persons who are supposed to have knowledge about the
problem under investigation and the information, obtained is recorded.
Most of the commissions and committees appointed by government to carry on
investigations make use of this method.
Major advantages of personal interviews:
1. More information and that too in greater depth can be obtained.
2. There is greater flexibility under this method as the opportunity to restructure
questions is always there, specially in case of unstructured interviews.
3. Observation method can as well be applied to recording verbal answers to
various questions.
4. Personal information can as well be obtained easily under this method.
5. The interviewer can collect supplementary information about the respondent’s
personal characteristics and environment which is often of great value in
interpreting results.
Interview Method
Weaknesses of personal interviews:
1. It is a very expensive method, specially when large and widely spread
geographical sample is taken.
2. There remains the possibility of the bias of interviewer as well as that of the
respondent; there also remains the headache of supervision and control of
interviewers.
3. Certain types of respondents such as important officials or executives or people
in high income groups may not be easily approachable under this method and to
that extent the data may prove inadequate.
4. The presence of the interviewer on the spot may over-stimulate the respondent,
sometimes even to the extent that he may give imaginary information just to
make the interview interesting.
5. Under the interview method the organization required for selecting, training and
supervising the field-staff is more complex with formidable problems.
6. Interviewing at times may also introduce systematic errors.
Interview Method
Telephone interviews: This method of collecting information consists in
contacting respondents on telephone itself. It is not a very widely used
method, but plays important part in industrial surveys, particularly in
developed regions.
The chief merits of such a system are:
1. It is more flexible in comparison to mailing method.
2. It is faster than other methods i.e., a quick way of obtaining information.
3. It is cheaper than personal interviewing method; here the cost per response is relatively low.
4. Recall is easy; callbacks are simple and economical.
5. There is a higher rate of response than what we have in mailing method; the non-response is
generally very low.
6. Replies can be recorded without causing embarrassment to respondents.
7. Interviewer can explain requirements more easily.
8. At times, access can be gained to respondents who otherwise cannot be contacted for one reason
or the other.
9. No field staff is required.
10. Representative and wider distribution of sample is possible
Interview Method Contd….
Telephone interviews
Demerits of collecting information are:
1. Little time is given to respondents for considered answers; interview
period is not likely to exceed five minutes in most cases.
2. Surveys are restricted to respondents who have telephone facilities.
3. Extensive geographical coverage may get restricted by cost considerations.
4. It is not suitable for intensive surveys where comprehensive answers are
required to various questions.
5. Possibility of the bias of the interviewer is relatively more.
6. Questions have to be short and to the point; probes are difficult to handle.
COLLECTION OF DATA THROUGH QUESTIONNAIRES
This method of data collection is quite popular, particularly in case of big
enquiries. It is being adopted by private individuals, research workers, private
and public organisations and even by governments.
In this method a questionnaire is sent (usually by post) to the persons
concerned with a request to answer the questions and return the questionnaire.
A questionnaire consists of a number of questions printed or typed in a
definite order on a form or set of forms.
The questionnaire is mailed to respondents who are expected to read and
understand the questions and write down the reply in the space meant for the
purpose in the questionnaire itself. The respondents have to answer the
questions on their own.
The method of collecting data by mailing the questionnaires to respondents is
most extensively employed in various economic and business surveys.
COLLECTION OF DATA THROUGH QUESTIONNAIRES
Contd…
The opening questions should be such as to arouse human
interest. The following type of questions should generally
be avoided as opening questions in a questionnaire:
1. Questions that put too great a strain on the memory or
intellect of the respondent;
2. Questions of a personal character;
3. Questions related to personal wealth, etc.
Questionnaire Design
General Considerations
The first rule is design the questionnaire to fit the medium
Examples:
Multiple Choice
1. Where do you live?
North
South
East
West
Numeric Open End
2. How much did you spend on groceries this week? ……………..
Questionnaire Design
Text Open End
3. How can our company improve its working conditions?
Rating Scales and Agreement Scales are two types of questions that some researchers treat
as multiple choice questions and others treat as numeric open end questions.
Rating Scales
4. How would you rate this product?
Excellent
Good
Fair
Poor
5. On a scale where “10” means you have a great amount of interest in a subject and “1” means you
have none at all, how would you rate your interest in each of the following topics?
Domestic politics …
Foreign Affairs …
Science and Health …
Business …
Questionnaire Design
Agreement Scale
6. How much do you agree with each of the following statements?
S. No  Particulars                                        Strongly agree  Agree  Disagree  Strongly disagree
1      My manager provides constructive criticism
2      Our medical plan provides adequate coverage
3      I would prefer to work longer hours on fewer days
A Sample Questionnaire
A study for telephone services company to find the expectations of customers using telephone booths at
Hyderabad and their profiles. The format of the questionnaire used in this study is presented below:
Questionnaire
Study on customer expectations and profiles of PCO booths at Hyderabad
Address of Telephone Booth:
Customer’s personal profile
1.Name :
2.Age :
a. Up to 17 years b. 18-24 years
c. 25-40 years d. 41-50 years
e. 51- 60 years f. More than 60 years
3.Gender
a. Male …… b. Female …..
4.Monthly household income
a. Less than Rs. 10,000 b. Rs. 10,000 – 20,000
c. Rs. 20,000 – 30,000 d. Rs. 30,000 – 50,000 e. More than Rs. 50,000
5.Occupation
a. Service sector b. Government c. Public d. Private
e. Business f. Student / housewife g. Others (specify) ………
SOME OTHER METHODS OF DATA COLLECTION
Particularly used by big business houses in modern times.
1. Warranty cards: Warranty cards are usually postal sized cards which are used by dealers of consumer durables
to collect information regarding their products. The consumer is asked to fill in the card and post it back to the dealer.
2. Distributor or store audits: Performed by distributors as well as manufacturers through their salesmen at regular
intervals to estimate market size, market share, seasonal purchasing pattern and so on. The data are obtained in
such audits not by questioning but by observation.
3. Pantry audit technique: It is used to estimate consumption of the basket of goods at the consumer level. It is to
find out what types of consumers buy certain products and certain brands, the assumption being that the contents
of the pantry accurately portray consumer’s preferences.
4. Consumer panel: An extension of the pantry audit approach on a regular basis is known as a ‘consumer panel’,
where a set of consumers agree to maintain detailed daily records of their
consumption, which are made available to the investigator on demand.
5. Use of mechanical devices: Mechanical devices have been widely used to collect information by
indirect means. The eye camera, pupilometric camera, psychogalvanometer, motion picture camera and
audiometer are the principal devices so far developed and commonly used by modern big business houses, mostly
in the developed world, for the purpose of collecting the required information.
6. Projective techniques: Projective techniques (sometimes called indirect interviewing techniques)
play an important role in motivational research and in attitude surveys.
7. Depth interviews: Depth interviews are held to explore the needs, desires and feelings of respondents. Unless the
researcher has specialized training, depth interviewing should not be attempted.
8. Content-analysis : Content-analysis consists of analysing the contents of documentary materials such as books,
magazines, newspapers and the contents of all other verbal materials.
COLLECTION OF SECONDARY DATA
Secondary data means data that are already available i.e., they refer to the data which have
already been collected and analyzed by someone else.
When the researcher utilizes secondary data, then he has to look into various sources from
where he can obtain them.
Secondary data may either be published data or unpublished data.
Usually published data are available in:
a.Various publications of the central, state or local governments;
b.Various publications of foreign governments or of international bodies and their subsidiary
organizations;
c.Technical and trade journals;
d.Books, magazines and newspapers;
e.Reports and publications of various associations connected with business and industry,
banks, stock exchanges, etc.;
f.Reports prepared by research scholars, universities, economists, etc. in different fields;
g.Public records and statistics, historical documents, and other sources of published
information.
COLLECTION OF SECONDARY DATA Contd….
The sources of unpublished data are many: It may be found in diaries, letters, unpublished
biographies and autobiographies and also may be available with scholars and research workers,
trade associations, labour bureaus and other public/private individuals and organisations.
Researcher must be very careful in using secondary data. By way of caution, the researcher,
before using secondary data, must see that they possess following characteristics:
1.Reliability of data: Reliability can be tested by finding out
(a) Who collected the data? (b) What were the sources of data?
(c) Were they collected by using proper methods (d) At what time were they collected?
(e) Was there any bias of the compiler? (f) What level of accuracy was desired? Was it
achieved ?
2.Suitability of data: The data that are suitable for one enquiry may not necessarily be found suitable
in another enquiry.
3.Adequacy of data: If the level of accuracy achieved in data is found inadequate for the purpose of
the present enquiry, they will be considered as inadequate and should not be used by the researcher.
From all this we can say that it is very risky to use the already available data. The already
available data should be used by the researcher only when he finds them reliable, suitable and
adequate.
Description and analysis of Data
Technically speaking, description implies editing, coding, classification and
tabulation of collected data so that they are amenable to analysis.
The term analysis refers to the computation of certain measures along with
searching for patterns of relationship that exist among data-groups.
Thus, “in the process of analysis, relationships or differences supporting or
conflicting with original or new hypotheses should be subjected to statistical
tests of significance to determine with what validity data can be said to indicate
any conclusions”.
Editing: Checking the filled questionnaires. It is routine work, but it has to be carried out
with utmost care and devotion.
Coding: Reducing the mass of data into manageable proportions. It is an operation which requires
judgment and skill, particularly for developing the coding frame.
Classification and tabulation: Summarizing data into tabular form. Tabulation is a common tool
used for summarizing the data so that they are amenable to interpretation.
Description Operations
Editing: Editing of data is a process of examining the collected raw data
(specially in surveys) to detect errors and omissions and to
correct these when possible. It involves a careful scrutiny of the
completed questionnaires and/or schedules.
Field editing:
• Consists in the review of the reporting forms by the investigator for
completing (translating or rewriting)
• This type of editing is necessary in view of the fact that individual
writing styles often can be difficult for others to decipher.
Central editing:
• It should take place when all forms or schedules have been
completed and returned to the office. This type of editing implies
that all forms should get a thorough editing by a single editor in a
small study and by a team of editors in case of a large inquiry.
Description Operations Contd…..
Coding:
•Coding refers to the process of assigning numerals or other symbols to
answers so that responses can be put into a limited number of categories or
classes.
•Coding is necessary for efficient analysis and through it the several replies
may be reduced to a small number of classes which contain the critical
information required for analysis.
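As a minimal sketch of coding, the snippet below assigns numerals to the four rating-scale answers used in the questionnaire examples. The coding frame (4 = Excellent down to 1 = Poor), the code 9 for unusable answers and the sample replies are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical coding frame: one numeral per answer category.
CODE_FRAME = {"Excellent": 4, "Good": 3, "Fair": 2, "Poor": 1}
MISSING = 9  # assumed code for blank or unusable answers


def code_response(answer):
    """Return the numeric code for a raw answer, or MISSING if it does not fit the frame."""
    return CODE_FRAME.get(answer.strip().title(), MISSING)


raw_answers = ["good", "Excellent", " fair", "", "Poor", "good"]  # made-up replies
coded = [code_response(a) for a in raw_answers]
counts = Counter(coded)  # frequency of each code, ready for tabulation

print(coded)
print(counts)
```

Once replies are reduced to a few codes in this way, the frequency count is the starting point for classification and tabulation.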
Classification:
•Most research studies result in a large volume of raw data which must be
reduced into homogeneous groups if we are to get meaningful relationships.
1.Classification according to attributes: Data are classified on the basis of common
characteristics which can either be descriptive (such as literacy, sex, honesty, etc.) or numerical
(such as weight, height, income, etc.).
vi. Availability of finance: In practice, size of the sample depends upon the
amount of money available for the study purposes. This factor should be
kept in view while determining the size of sample for large samples result
in increasing the cost of sampling estimates.
and drawing conclusions there from. Most research studies result in a large
volume of raw data which must be suitably reduced so that the same can be
read easily and can be used for further analysis. Clearly the science of
statistics cannot be ignored by any research worker.
The important statistical measures that are used to summarize the
survey/research data are:
1. Measures of central tendency or statistical averages
2. Measures of dispersion
3. Measures of asymmetry (skewness)
4. Measures of relationship
Some Important Definitions
A Population (Universe) is the whole collection of things under
consideration
Population: use parameters to summarize features
Sample: use statistics to summarize features
Data: Discrete or Continuous
Summary Measures
Central tendency: Mean, Median, Mode, Geometric Mean
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
IMPORTANT STATISTICAL
MEASURES
Measures of Central Tendency(Statistical averages)
Mean, Median, Mode, Geometric Mean, Harmonic Mean
Quartiles
Measure of Variation/dispersion
Range, Semi Inter-quartile Range, Mean Deviation, Variance, Standard
Deviation and Coefficient of Variation
Measures of Skewness / Shape (Measure Asymmetry)
Symmetric, Skewed
Measures of Kurtosis/Peakedness
Lepto kurtic / Platy Kurtic / Meso kurtic
Points of Central Tendency
Measures of central tendency (or statistical averages) tell us the point about which
items have a tendency to cluster. Such a measure is considered as the most
representative figure for the entire mass of data. Measure of central tendency is also
known as statistical average. Mean, median and mode are the most popular averages.
Mean, also known as arithmetic average
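The three popular averages can be computed with Python's standard `statistics` module. This is only a sketch; the income figures are invented for illustration, with one extreme value included to show how it pulls the mean away from the median.

```python
import statistics

incomes = [12, 15, 15, 18, 20, 22, 95]  # made-up monthly incomes, in thousands

mean = statistics.mean(incomes)      # arithmetic average
median = statistics.median(incomes)  # middle value of the sorted data
mode = statistics.mode(incomes)      # most frequent value

print(mean, median, mode)
```

Here the mean lies well above the median because of the single large income, which is why the median is often preferred for skewed data.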
•Range
•Standard Deviation
•Mean deviation
Measures of Dispersion
In statistics, the measures of dispersion help to interpret the variability of data,
i.e. to know how homogeneous or heterogeneous the data are. In simple
terms, they show how squeezed or scattered the values of the variable are.
The types of absolute measures of dispersion are:
1.Range: It is simply the difference between the maximum value and the
minimum value in a data set. Example: 1, 3, 5, 6, 7 => Range = 7 − 1 = 6
2.Variance: Subtract the mean from each value in the set, square each difference,
add the squares and finally divide the sum by the total number of values in the data set to
get the variance. Variance (σ²) = ∑(X − μ)² / N
3.Standard Deviation: The square root of the variance is known as the standard
deviation, i.e. S.D. = σ = √(σ²).
4.Quartiles and Quartile Deviation: The quartiles are values that divide a list of
numbers into quarters. The quartile deviation is half of the distance between the
third and the first quartile.
5.Mean and Mean Deviation: The average of numbers is known as the mean, and
the arithmetic mean of the absolute deviations of the observations from a measure
of central tendency is known as the mean deviation (also called mean absolute
deviation).
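A short sketch of the range, quartile deviation and mean deviation for the data set used in the range example. Quartiles can be defined in several ways; the snippet relies on the default ("exclusive") method of `statistics.quantiles`, so another textbook convention may give a slightly different quartile deviation.

```python
import statistics

data = [1, 3, 5, 6, 7]  # the data set from the range example

value_range = max(data) - min(data)

# Quartiles by the module's default "exclusive" method (an assumption of this sketch).
q1, _, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

mean = statistics.mean(data)
mean_deviation = sum(abs(x - mean) for x in data) / len(data)  # mean absolute deviation about the mean

print(value_range, quartile_deviation, mean_deviation)
```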
What is Variance?
Variance is a measure of dispersion. A measure of dispersion is a quantity that is
used to check the variability of data about an average value. Data can be of two
types - grouped and ungrouped. When data is expressed in the form of class
intervals it is known as grouped data. On the other hand, if data consists of
individual data points, it is called ungrouped data. The sample and population
variance can be determined for both kinds of data.
Variance Definition
Population Variance - All the members of a group are known as the population.
When we want to find how each data point in a given population varies or is
spread out then we use the population variance. It is used to give the squared
distance of each data point from the population mean.
Sample Variance - If the size of the population is too large then it is difficult to
take each data point into consideration. In such a case, a select number of data
points are picked from the population to form a sample that can describe the
entire group. The sample variance is computed from the squared distances of the
sample points from the sample mean; the sum of these squares is divided by n − 1
rather than n, so that the result is an unbiased estimate of the population variance.
A general definition of variance is that it is the expected value of the squared
differences from the mean.
Variance Example
Suppose we have the data set {3, 5, 8, 1} and we want to find the population
variance. The mean is (3 + 5 + 8 + 1) / 4 = 4.25. Then by using the
definition of variance we get [(3 − 4.25)² + (5 − 4.25)² + (8 − 4.25)² + (1 − 4.25)²] /
4 = 26.75 / 4 = 6.6875. Thus, variance ≈ 6.69.
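The same calculation can be checked with the standard `statistics` module, which also makes the difference between the population divisor (N) and the sample divisor (n − 1) visible. The snippet is a sketch on the data set of the example above.

```python
import statistics

data = [3, 5, 8, 1]

population_variance = statistics.pvariance(data)  # sum of squared deviations divided by N
sample_variance = statistics.variance(data)       # same sum divided by n - 1
population_sd = statistics.pstdev(data)           # positive square root of the population variance

print(population_variance)  # 6.6875
print(sample_variance)
print(population_sd)
```

For the same data the sample variance is always larger than the population variance, because the same sum of squares (26.75) is divided by 3 instead of 4.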
Standard Deviation
Standard deviation is the positive square root of the variance. It is one of the
basic methods of statistical analysis. Standard Deviation is commonly
abbreviated as SD and denoted by the symbol 'σ’ and it tells about how much
data values are deviated from the mean value. If we get a low standard deviation
then it means that the values tend to be close to the mean whereas a high standard
deviation tells us that the values are far from the mean value.
i.Test of a hypothesis concerning some single value for the given data (such as one-
sample sign test).
ii.Test of a hypothesis concerning no difference among two or more sets of data (such as
two-sample sign test, Fisher-Irwin test, Rank sum test, etc.).
iii.Test of a hypothesis of a relationship between variables.
iv.Test of a hypothesis concerning variation in the given data i.e., test analogous to
ANOVA .
v.Tests of randomness of a sample based on the theory of runs viz., one sample runs
test.
vi.Test of hypothesis to determine if categorical data shows dependency or if two
classifications are independent viz., the chi-square test. The chi-square test can as well
be used to make comparison between theoretical populations and actual data when
categories are used.
Types of Parametric Tests for Hypothesis Testing
1. T-Test
1. It is a parametric test of hypothesis testing based on Student’s T
distribution.
2. It is essentially, testing the significance of the difference of the mean values
when the sample size is small (i.e, less than 30) and when the population
standard deviation is not available.
3. Assumptions of this test:
•Population distribution is normal, and
One-Sample T-test: t = (x̄ − μ) / (s / √n)
where,
x̄ is the sample mean
s is the sample standard deviation
n is the sample size
μ is the population mean
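A minimal sketch of the one-sample statistic, using the standard form t = (x̄ − μ) / (s / √n) with the symbols defined above. The function name `one_sample_t` and the sample scores are hypothetical; the result would be compared with the table value of Student's t for n − 1 degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """t statistic for the null hypothesis that the population mean equals mu."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (x_bar - mu) / (s / math.sqrt(n))


scores = [52, 48, 55, 51, 49, 53, 50, 54]  # made-up small sample, n = 8
t = one_sample_t(scores, mu=50)

print(round(t, 3), "with", len(scores) - 1, "degrees of freedom")
```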
Two-Sample T-test: To compare the means of two different samples.
where,
x̄ 1 is the sample mean of the first group
x̄ 2 is the sample mean of the second group
S1 is the sample-1 standard deviation
S2 is the sample-2 standard deviation
n is the sample size
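For two samples, one common form of the statistic is the unequal-variance (Welch) version, t = (x̄1 − x̄2) / √(S1²/n1 + S2²/n2). Treating this as the intended formula is an assumption of the sketch below, since other forms (for example with a pooled variance) also exist; the function name and the two groups are hypothetical.

```python
import math
import statistics


def two_sample_t(sample1, sample2):
    """Unequal-variance (Welch) t statistic for the hypothesis of equal population means."""
    n1, n2 = len(sample1), len(sample2)
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    std_error = math.sqrt(
        statistics.variance(sample1) / n1 + statistics.variance(sample2) / n2
    )
    return mean_diff / std_error


group_a = [23, 25, 28, 30, 24]  # made-up observations
group_b = [20, 22, 21, 25, 22]
t = two_sample_t(group_a, group_b)

print(round(t, 3))
```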
Conclusion:
•If the value of the test statistic is greater than the table value -> Reject the null
hypothesis.
•If the value of the test statistic is less than the table value -> Do not reject the
null hypothesis.
2. Z-Test
1. It is a parametric test of hypothesis testing.
2. It is used to determine whether the means are different when the population
variance is known and the sample size is large (i.e., greater than 30).
3. Assumptions of this test:
•Population distribution is normal
distribution.
2. It is a test for the null hypothesis that two normal populations have the same
variance.
5. It is calculated as:
F = s1² / s2²
6. By changing the variance in the ratio, F-test has become a very flexible test.
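The variance ratio can be sketched in a few lines. The function name and the two samples are hypothetical; the computed F would be compared with the table value for (n1 − 1, n2 − 1) degrees of freedom.

```python
import statistics


def variance_ratio(sample1, sample2):
    """Return F = s1^2 / s2^2 together with its two degrees of freedom."""
    s1_sq = statistics.variance(sample1)  # sample variances use the n - 1 divisor
    s2_sq = statistics.variance(sample2)
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1


group_a = [23, 25, 28, 30, 24]  # made-up observations
group_b = [20, 22, 21, 25, 22]
f_value, df1, df2 = variance_ratio(group_a, group_b)

print(round(f_value, 3), df1, df2)
```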
F = MST / MSE
where:
F=ANOVA coefficient
MST=Mean sum of squares due to treatment
MSE=Mean sum of squares due to error
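A small sketch of a one-way ANOVA, taking F = MST / MSE (the standard one-way form) with MST computed on k − 1 and MSE on N − k degrees of freedom. The function name and the three groups are made up for illustration.

```python
import statistics


def one_way_anova_f(groups):
    """Return F = MST / MSE for a list of groups (one-way ANOVA)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Sum of squares due to treatment (variation between groups).
    sst = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares due to error (variation within groups).
    sse = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    mst = sst / (k - 1)
    mse = sse / (n_total - k)
    return mst / mse


groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]  # made-up yields under three treatments
f_value = one_way_anova_f(groups)

print(f_value)
```

A large F means that the variation between group means is large relative to the variation inside the groups.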
SIGNIFICANCE OF ANOVA
ANOVA tells you whether the group means are significantly different from each
other.
ANOVA works by partitioning the total variation in a data set into two segments:
variation between groups and variation within groups.
If the between-groups variation is significantly larger than expected by chance,
then there are actual differences in the group means.
ANOVA is useful for assessing the effects of other treatments, studying the
impact of factors, and reading contrasts between groups. It feeds insights that
simple comparisons of means cannot.
The results of an ANOVA tell you whether there are any statistically important
contrasts between groups but not exactly where or why the differences exist. Post
hoc tests are needed to define precisely which group means differ.
ADVANTAGES OF ANOVA
•It allows comparisons of more than two group means simultaneously. Ordinary
t-tests can only compare two group means at a time. ANOVA can compare 3,4,5,
and more group means together.
1. Chi-Square Test
1. It is a non-parametric test of hypothesis testing.
2. As a non-parametric test, chi-square can be used:
•as a test of goodness of fit.
•as a test of independence of two variables.
3. It helps in assessing the goodness of fit between a set of observed frequencies and those
expected theoretically.
4. It makes a comparison between the expected frequencies and the observed
frequencies.
5. Greater the difference, the greater is the value of chi-square.
6. If there is no difference between the expected and observed frequencies,
then the value of chi-square is equal to zero.
7. It is also known as the “Goodness of fit test” which determines whether a
particular distribution fits the observed data or not.
9. Chi-square is also used to test the independence of two variables.
•No one of the groups should contain very few items, say less than 10.
•The reasonably large overall number of items. Normally, it should be at least 50, however small the number of
11. Chi-square as a parametric test is used as a test for population variance based on sample variance.
12. If we take each one of a collection of sample variances, divide them by the known population variance and
multiply these quotients by (n-1), where n means the number of items in the sample, we get the values of chi-
square.
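The comparison of observed and expected frequencies can be sketched with the usual statistic, the sum of (O − E)² / E over all categories. The formula is the standard one and is assumed here; the counts are invented, reusing the North/South/East/West categories of the multiple-choice example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [18, 22, 30, 30]  # made-up counts for North, South, East, West
expected = [25, 25, 25, 25]  # equal split expected under the null hypothesis

chi2 = chi_square_statistic(observed, expected)
no_difference = chi_square_statistic(expected, expected)  # identical frequencies give zero

print(chi2, no_difference)
```

The second call illustrates the point made in the list: when observed and expected frequencies coincide, chi-square equals zero.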
Measures of Relationship
We have dealt with those statistical measures that we use in context of
univariate population i.e., the population consisting of measurement of
only one variable.
For example: Whether the number of hours students devote for studies is
somehow related to their family income, to age, to gender or to similar
other factor.
There are several methods of determining the relationship between
variables, but no method can tell us for certain that a correlation is
indicative of causal relationship.
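One widely used measure of relationship is Karl Pearson's coefficient of correlation. The sketch below computes it from first principles for the study-hours example; the figures are invented, and a high value would still say nothing about causation.

```python
import math


def pearson_r(xs, ys):
    """Pearson's coefficient of correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


study_hours = [2, 3, 5, 6, 8]        # made-up hours of study per day
family_income = [20, 25, 30, 45, 40]  # made-up family income, in thousands

r = pearson_r(study_hours, family_income)
print(round(r, 3))
```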
Descriptive Statistics
Collect Data
E.g., Survey
Present Data
E.g., Tables and graphs
Characterize Data
E.g., Sample Mean = ∑Xᵢ / n
Inferential Statistics
Analysis, particularly in case of survey or experimental data,
involves estimating the values of unknown parameters of the
population and testing of hypotheses for drawing inferences.
Type I and Type II errors are defined with respect to the null hypothesis.
In a type I error, the null hypothesis is rejected although it is
true, whereas in a type II error the null hypothesis is not rejected even
though the alternative hypothesis is true. A type I error is also known as a
“false positive” and a type II error as a “false negative”. A lot of statistical
theory revolves around the reduction of one or both of these errors; still, the
total elimination of both is a statistical impossibility.
Type I Error
A type I error occurs when the null hypothesis (H0) of an experiment is true,
but still, it is rejected. It asserts something that is not present: a false hit.
A type I error is often called a false positive (an event that shows that a given
condition is present when it is absent). In folk-tale terms, a person
may see a bear when there is none (raising a false alarm), where the null
hypothesis (H0) is the statement: “There is no bear”.
The type I error rate, or significance level, is the probability of rejecting the
null hypothesis given that it is true. It is represented by the Greek letter α (alpha)
and is also known as the alpha level. Usually, the significance level is
set to 0.05 (5%), on the assumption that a 5% probability of incorrectly
rejecting the null hypothesis is acceptable.
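The meaning of α can be illustrated by simulation: if samples are drawn repeatedly from a population for which H0 is true, a two-sided z-test at α = 0.05 should reject in roughly 5% of the runs. All numbers below (mean, standard deviation, sample size, number of trials, seed) are assumptions chosen for the sketch.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable
ALPHA = 0.05
MU0, SIGMA, N, TRIALS = 3.0, 1.0, 25, 2000  # assumed values for illustration

z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

false_rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]  # H0 is true by construction
    z = (mean(sample) - MU0) / (SIGMA / N ** 0.5)
    if abs(z) > z_crit:
        false_rejections += 1  # rejecting a true H0 is a type I error

type_i_rate = false_rejections / TRIALS
print(type_i_rate)
```

The printed proportion fluctuates from run to run around α; it is an estimate, not an exact value.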
Type II Error
A type II error occurs when the null hypothesis is false but is mistakenly not
rejected. It fails to assert what is present: a miss. A type II error is also
known as a false negative (where a real hit was rejected by the test and is
observed as a miss), in an experiment checking for a condition with a final
outcome of true or false.
A type II error is committed when a true alternative hypothesis is not
acknowledged. In other words, an examiner may miss discovering the bear
when in fact a bear is present (and hence fails to raise the alarm). Again, H0, the
null hypothesis, is the statement “There is no bear”; if a bear is indeed present
and goes undetected, the investigator has committed a type II error. Here, the
bear either exists or does not exist in the given circumstances; the question
is whether it is correctly identified, the two possible mistakes being to miss it when
it is present or to report it when it is not present.
The type II error rate is represented by the Greek letter β (beta) and is
linked to the power of a test (which equals 1 − β).
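For a z-test with known σ, β and the power can be computed directly once a specific true mean is assumed. The sketch below does this for a lower-tail test; the function name and all numerical values (μ0 = 3, true mean 2.7, σ = 1, n = 100, α = 0.05) are assumptions for illustration.

```python
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """β for a lower-tail z-test of H0: mean = mu0 when the true mean is mu1 < mu0."""
    se = sigma / n ** 0.5
    # H0 is rejected when the sample mean falls below this critical value.
    x_crit = mu0 + NormalDist().inv_cdf(alpha) * se
    # β is the chance that the sample mean stays above it although the true mean is mu1.
    return 1 - NormalDist(mu1, se).cdf(x_crit)


beta = type_ii_error(mu0=3.0, mu1=2.7, sigma=1.0, n=100)
power = 1 - beta
beta_larger_sample = type_ii_error(mu0=3.0, mu1=2.7, sigma=1.0, n=400)

print(round(beta, 4), round(power, 4), beta_larger_sample < beta)
```

Increasing the sample size lowers β, which is one way of reducing the type II error without raising α.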
Type I and Type II Errors Example
Check out some real-life examples to understand the type-i
and type-ii error in the null hypothesis.
General Steps in Hypothesis Testing
6. Set up critical value(s): Z = −1.645 (reject H0 if the test statistic falls below this value)
7. Collect data: 100 households surveyed
8. Compute test statistic and p-value: computed test statistic = −2, p-value = .0228
9. Make statistical decision: reject the null hypothesis
10. Express conclusion: the true mean number of TV sets is less than 3
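The critical value and p-value quoted in these steps can be reproduced with the standard normal distribution. The sketch assumes a lower-tail test at α = 0.05, which is what a critical value of −1.645 implies; the test statistic of −2 is taken from the example as given.

```python
from statistics import NormalDist

ALPHA = 0.05  # assumed level of significance (implied by the critical value -1.645)

z_critical = NormalDist().inv_cdf(ALPHA)  # lower-tail critical value
z_computed = -2.0                         # test statistic reported in the example
p_value = NormalDist().cdf(z_computed)    # lower-tail p-value

decision = "Reject H0" if z_computed < z_critical else "Do not reject H0"
print(round(z_critical, 3), round(p_value, 4), decision)  # -1.645 0.0228 Reject H0
```

The two decision rules agree: the statistic lies below the critical value, and equivalently the p-value is smaller than α.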
Level of Significance, α
Defines Unlikely Values of Sample Statistic if Null
Hypothesis is True
Called rejection region of the sampling distribution
Designated by α (level of significance)
Typical values are .01, .05, .10
Selected by the Researcher at the Beginning
Controls the Probability of Committing a Type I Error
Provides the Critical Value(s) of the Test
Error in Making Decisions Contd…
Type II Error
Fail to reject a false null hypothesis
Probability of Type II Error is β
The power of the test is 1 − β
Probability of Not Making Type I Error
1 − α
Called the Confidence Coefficient
Result Probabilities
Jury Trial (H0: Innocent)
Verdict      The Truth: Innocent   The Truth: Guilty
Innocent     Correct               Error
Guilty       Error                 Correct
Hypothesis Test
Decision            H0 True             H0 False
Do Not Reject H0    1 − α               Type II Error (β)
Reject H0           Type I Error (α)    Power (1 − β)
Level of Significance and the Rejection Region
H0: 3.5 Critical
Factors Affecting Type II Error
n
How to Choose between Type I and Type II
Errors
Choice Depends on the Cost of the Errors
Choose Smaller Type I Error When the Cost of
Rejecting the Maintained Hypothesis is High
A criminal trial: convicting an innocent person
The Exxon Valdez: causing an oil tanker to sink
Choose Larger Type I Error When You Have an
Interest in Changing the Status Quo
A decision in a startup company about a new piece of software
A decision about unequal pay for a covered group
Less Variability
The standard error (standard deviation) of the
sampling distribution of the sample mean X̄ is less than the
standard error of other unbiased estimators
[Figure: sampling distribution of the mean compared with the wider sampling distribution of the median]
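This property can be checked by simulation: draw many samples, compute the mean and the median of each, and compare the spread of the two sets of estimates. The sketch assumes a normal population; the sample size, number of repetitions and seed are arbitrary choices.

```python
import random
import statistics

random.seed(7)       # fixed seed so the run is repeatable
N, REPS = 25, 2000   # assumed sample size and number of repeated samples

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(0, 1) for _ in range(N)]  # normal population assumed
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

se_mean = statistics.stdev(means)      # estimated standard error of the sample mean
se_median = statistics.stdev(medians)  # estimated standard error of the sample median

print(se_mean < se_median)
```

For normal data the sample medians scatter more widely than the sample means, so the mean is the more efficient of the two estimators.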