New Generation University College: AUGUST 2020
Department of MIS
Module
Business statistics
Chapter one
1. Introduction (4 lecture hours)
1.1 Definition and classification of Statistics
1.2 Stages in statistical investigation
1.3 Definition of some basic terms
1.4 Applications, uses and limitations of Statistics
1.5 Types of variables and measurement scales
If we calculate the average number of malaria patients from 1986 to 1990 as
Average = $\frac{1}{5}(3645 + 4568 + 5432 + 6751 + 7369) = 5553$, then our work belongs to the
domain of descriptive statistics.
If we say that there was an increase of 3724 patients from 1986 to 1990 (7369 − 3645), then again this
belongs to the domain of descriptive statistics.
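The two descriptive summaries above can be checked with a short Python sketch. It assumes the five quoted counts belong to the years 1986 to 1990 in the order listed; the variable names are illustrative only.

```python
# Yearly malaria patient counts quoted in the text (assumed to be in year order).
patients = {1986: 3645, 1987: 4568, 1988: 5432, 1989: 6751, 1990: 7369}

# Descriptive statistic 1: the arithmetic average over the five years.
average = sum(patients.values()) / len(patients)

# Descriptive statistic 2: the change between the first and the last year.
increase = patients[1990] - patients[1986]

print(average)   # 5553.0
print(increase)  # 3724
```

Both numbers only describe the data at hand; nothing is inferred about other years.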
2. Inferential Statistics: consists of generalizing from samples to populations,
performing estimations and hypothesis tests, determining relationships among
variables, and making predictions. Statistical techniques based on probability
theory are required. Example 1: In the above example if we predict the number of
malaria patients in the year 1995 to be 9917, then our work belongs to the domain
of inferential statistics. Example 2: Suppose we want to have an idea about the
percentage of illiterates in our country. We take a sample from the population and
find the proportion of illiterates in the sample. This sample proportion with the
help of probability enables us to make some inferences about the population
proportion. This study belongs to inferential statistics.
Formulate the objective in statistical terms
II. Proper collection of data: in order to draw valid conclusions, it is important to have 'good'
data. Data are gathered with the aim of meeting predetermined objectives. In other words,
the data must provide answers to problems. The data form the foundation of
statistical analyses, and hence they must be carefully and accurately collected. In
section 1.6 we will see the methods of data collection.
III. Organization and classification of data: in this stage the collected data are organized in
a systematic manner. That means the data must be placed in relation to each other.
The classification or sorting out of data is, by itself, a kind of organization of data.
IV. Presentation of data: the purpose of putting the organized data in graphs, charts
and tables is two-fold. First, it is a visual way to look at the data and see what
happened and make interpretations. Second, it is usually the best way to show the
data to others. Reading lots of numbers in the text puts people to sleep and does
little to convey information.
V. Analysis of data: the process of looking at and summarizing data with the intent
to extract useful information and develop conclusions. Data analysis is closely
related to data mining, but data mining tends to focus on larger data sets, with less
emphasis on making inference, and often uses data that was originally collected for
a different purpose. In this stage different types of inferential statistical methods are
applied, for instance hypothesis tests such as the $\chi^2$ test of association.
VI. Interpretation of data: interpretation means drawing valid conclusions from data
which form the basis of decision making. Correct interpretation requires a high
degree of skill and experience.
Note that analysis and interpretation of data are two sides of the same coin.
Sample: A portion of a population that is selected using some sampling technique.
A sample must be representative of the population, so it should be selected
using one of the established sampling techniques.
Sampling: Is the process of selecting units (e.g., people, households, organizations)
from a population of interest so that by studying the sample we may fairly generalize
our results back to the population from which they were chosen. There are two types of
sampling techniques namely random sampling technique and non-random sampling
technique.
Random sampling technique or probability sampling technique gives a non-zero
chance for all elements to be included in the sample. In other words, there is no
personal bias regarding the selection. The five common random sampling techniques
are:
Simple Random sampling
Systematic Random sampling
Stratified Random sampling
Cluster Random sampling
Multi-stage sampling
A non-random sampling technique, also known as a non-probability sampling
technique, is one in which not all elements of the population have a known chance of
inclusion, or in which some elements have a zero chance of being selected. The
most familiar examples of non-random sampling techniques are:
Quota sampling
Convenience sampling
Volunteer sampling
Purposive sampling
Haphazard sampling
Snowball sampling, etc.
Sample size: The number of elements or observations to be included in the sample.
Parameter: Any measure computed from the data of a population.
Example: population mean $\mu$ and population standard deviation $\sigma$
Statistic: Any measure computed from the sample.
Example: sample mean $\bar{x}$, sample standard deviation $S$
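The distinction between a parameter and a statistic can be illustrated with a small Python sketch. The population below is entirely made up (1,000 hypothetical ages), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: ages of 1,000 people (made-up values, not real data).
population = [random.randint(18, 65) for _ in range(1000)]

# Parameters: measures computed from every member of the population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics: measures computed from a simple random sample of the population.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample standard deviation

print("parameters:", mu, sigma)
print("statistics:", x_bar, s)
```

The sample values will generally be close to, but not equal to, the population values; that gap is what inferential statistics tries to quantify.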
Survey: A collection of quantitative information about members of a population when
no special control is exercised over any of the factors influencing the variable of interest.
Sample survey: A survey that includes only a portion of the population.
Census: A collection of information about every member of a population.
A sample survey has the following advantages over a census:
It saves time and cost.
It can achieve greater accuracy.
It avoids wastage of material.
Variable: A variable is a characteristic or attribute that can assume different values.
Variables whose values are determined by chance are called random variables.
Variables are often specified according to their type and intended use; hence
variables can be classified into two types, namely qualitative and quantitative variables.
A quantitative variable is naturally measured as a number for which meaningful
arithmetic operations make sense. Examples: Height, age, crop yield, GPA, salary,
temperature, area, air pollution index (measured in parts per million), etc.
Qualitative variable: Any variable that is not quantitative is qualitative. Qualitative
variables take a value that is one of several possible categories. As naturally
measured, qualitative variables have no numerical meaning. Examples: Hair color,
gender, field of study, marital status, political affiliation, status of disease infection.
Quantitative variables can be classified as discrete or continuous. Discrete
variables can assume only certain numerical values; that is, there are gaps between the
possible values, such as 0, 1, 2, ... The set of possible values may be finite or countably infinite. For
example, the number of students in a classroom or the number of children in a family.
Continuous variables can take any value within a specified interval, given a sufficiently fine
measuring device. There are no gaps between possible values. They are obtained by measuring.
For example, height is a continuous variable: no matter how close the heights of two people are, we can find
another person whose height falls somewhere between the two.
II. Uses of Statistics
Statistics presents facts in the form of numerical data.
It condenses and summarizes a mass of data into a few presentable and precise
figures.
It facilitates comparison of data.
It helps in formulating and testing hypotheses.
It helps in predicting future trends.
It helps in formulating policies.
III. Limitations of Statistics
Statistics, with all its wide application in every sphere of human activity, has its own
limitations. Some of them are given below.
Statistics is not suitable for the study of qualitative phenomena: Since statistics is
basically a science that deals with sets of numerical data, it is applicable only to
those subjects of enquiry which can be expressed in terms of quantitative
measurements. Qualitative phenomena like honesty, poverty,
beauty, intelligence, etc. cannot be expressed numerically, and statistical analysis
cannot be directly applied to them. Nevertheless,
statistical techniques may be applied indirectly by first reducing the qualitative
expressions to quantitative terms. For example, the intelligence of a group
of students can be studied on the basis of their marks in a particular examination.
Statistics does not study individuals: Statistics does not give any specific
importance to the individual items; in fact it deals with an aggregate of objects.
Individual items, when they are taken individually do not constitute any statistical
data and do not serve any purpose for any statistical enquiry.
Statistical laws are not exact: It is well known that mathematical and physical
sciences are exact. But statistical laws are not exact and statistical laws are only
approximations. Statistical conclusions are not universally true. They are true only
on an average.
Statistics may be misused: Statistics must be used only by experts; otherwise,
statistical methods are the most dangerous tools in the hands of the inexpert. The
use of statistical tools by inexperienced and untrained persons might lead to
wrong conclusions. Statistics can easily be misused by quoting wrong figures.
As King aptly says, ‘statistics are like clay of which one can make a God or
Devil as one pleases.’
Statistics is only one of the methods of studying a problem: Statistical methods
do not provide a complete solution to a problem, because problems have to be
studied taking the background of the country's culture, philosophy or religion into
consideration. Thus a statistical study should be supplemented by other
evidence.
Examples:
Letter grades (A, B, C, D, F).
Rating scales (Excellent, Very good, Good, Fair, poor).
Military status.
3. Interval Scales
Interval scales are measurement systems that possess the following properties:
Level of measurement which classifies data that can be ranked and differences
are meaningful. However, there is no meaningful zero, so ratios are meaningless.
All arithmetic operations except division are applicable.
Relational operations are also possible.
Examples:
IQ, temperature in °F.
4. Ratio Scales
Ratio scales are measurement systems that possess the following properties: Level of measurement
which classifies data that can be ranked, differences are meaningful, and there is a true
zero. True ratios exist between the different units of measure.
All arithmetic and relational operations are applicable.
Examples:
Weight
Height
Number of students
Age
Use of level of measurements
Helps you decide how to interpret the data from the variable.
Helps you decide what statistical analysis is appropriate on the values that were
assigned. For example, if a measurement is nominal, then you know that you should
never average the data values.
2. Methods of Data Collection and Presentation (6 lecture hours)
1.6 Methods of data collection
1.6.1 Sources of data
1.6.2 Types of data
1.6.3 Methods of collection
1.7 Methods of Data Presentation
1.7.1 Motivating examples
1.7.2 Frequency distributions: qualitative, quantitative: absolute, relative, percentage, cumulative
1.7.3 Tabular presentation of data
1.7.4 Diagrammatic display of data: Bar charts, Pie-chart, Cartograms
1.7.5 Graphical presentation of data: Histogram, Frequency Polygon, Ogive Curves
refer to those that are collected by conducting survey to meet
the specific problem needs at hand.
Example: Population census reports are primary data because
these are collected, compiled and published by the population
census organization.
(2). Secondary data -
Secondary data are second-hand information that has
already been collected by someone (an organization) for some purpose
and is available for the present study, that is, data taken from an already
available published or unpublished source. Secondary data are
not pure in character and have undergone some treatment at
least once.
Exercise-2: Write the merits and demerits of secondary data.
2.1.3. Methods of collection
There are three major methods of data collection
1. Self-administered questionnaire
2. Direct investigation: measurement (observation) of the
subject and interviewing (face-to-face, telephone, etc.)
3. The use of documentary sources
1. Self-administered questionnaire.
Questionnaire is the main data collection instrument in formal
sample survey. Before examining the steps in designing a
questionnaire we need to review the types of questions used in
questionnaires. Depending on the amount of freedom given to
respondent in offering responses, there are two basic types of
questions that can be used in questionnaires: open-ended
questions and closed ended questions.
The type of questions for use will be determined by the form of
responses wanted, the nature of the respondents and their
ability to answer the questions.
Open-ended questions: allow the respondent to answer
freely in his or her own words.
Example: What do you think are the reasons for a high drop-out
rate of village health committee members?
Closed-ended questions:
A predetermined list of alternative responses is presented to the
respondent, who checks the appropriate one(s). This means that
the respondent’s answers are restricted to a limited
range of alternatives.
Advantages
It is the cheapest method and can be conducted by a single
researcher.
Questionnaires can be sent to a wide geographical area.
There is no interviewer variability.
Disadvantages
Low response rate.
No assurance that the questionnaire was answered by the right
person.
A mail questionnaire is not suitable for an illiterate community.
2. Direct investigation
I. Measurement and/or observation
Data can be obtained through direct observation or
measurement.
It provides accurate information, but it is expensive and
inconvenient.
E.g.: land area measurement, animal weight gain, physical
examination, direct observation of work.
II. Interview
a) Face-to-Face interview
Advantage:-
Interviewers can observe the surroundings and can use
nonverbal communication and visual aids.
The interviewer can help the respondent if he/she has
difficulty in understanding the questions.
Respondent is likely to answer all the questions alone
Disadvantage:-
Cost is high
Interviewer bias is also high
Untrained interviewer may distort the meaning of the
questions
b) Telephone Interview
Advantage:-
It is less expensive in time and money compared to face to
face interviews
Relatively high response rate
It reaches people who would not open their doors to an
interviewer but might be willing to talk on the telephone
Disadvantage:-
Unrepresentative of the groups which do not have
telephones
Unlisted telephone numbers are excluded from the study.
The respondent may be substituted by another person
3. The use of documentary sources
Extracting information from existing resources.
It is much less expensive than the other two methods.
It is difficult to get the information needed when records are
compiled in an unstandardized manner.
Example: hospital records, professional institutes, official
statistics, etc.
Editing of Data:
After collecting the data, either from a primary or a secondary
source, the next step is editing. Editing means the examination
of collected data to discover any errors and mistakes before
presenting them. It has to be decided beforehand what degree of
accuracy is wanted and what extent of errors can be tolerated in
the inquiry. The editing of secondary data is simpler than that of
primary data.
2.2. Methods of Data Presentation
This topic introduces tabular and graphical methods commonly
used to summarize both qualitative and quantitative data.
Tabular and graphical summaries of data can be obtained in
annual reports, newspaper articles and research studies.
Everyone is exposed to these types of presentations, so it is
important to understand how they are prepared and how they
will be interpreted.
Modern statistical software packages provide extensive
capabilities for summarizing data and preparing graphical
presentations. MINITAB, SPSS and STATA are three packages
that are widely available.
2.2.1. Classification of Data
The process of arranging data into homogeneous groups or classes
according to some common characteristics present in the data is
called classification.
For Example: The process of sorting letters in a post office, the
letters are classified according to the regions and further arranged
according to zones, cities, etc.
Bases of Classification:
There are four important bases of classification:
(1) Qualitative Base (2) Quantitative Base (3) Geographical
Base (4) Chronological or Temporal Base
(1) Qualitative Base:
When the data are classified according to some quality or
attributes such as sex, religion, literacy, intelligence etc…
(2) Quantitative Base:
When the data are classified by quantitative characteristics like
heights, weights, ages, income etc…
(3) Geographical Base:
When the data are classified by geographical regions or location,
like states, provinces, cities, countries etc…
(4) Chronological or Temporal Base:
When the data are classified or arranged by their time of
occurrence, such as years, months, weeks, days etc… For
Example: Time series data.
Tabulation of Data
The process of placing classified data into tabular form is known
as tabulation. A table is a systematic arrangement of statistical
data in rows and columns. Rows are horizontal arrangements
whereas columns are vertical arrangements.
2.2.2. Frequency distribution
A frequency distribution is the organization of raw data in table
form, using classes and frequencies. There are three basic types of
frequency distributions, and there are specific procedures for
constructing each type. The three types are categorical,
ungrouped and grouped frequency distributions.
The reasons for constructing a frequency distribution are as
follows.
To organize the data in a meaningful, intelligible way.
To enable the reader to determine the nature or shape of the
distribution
To facilitate computational procedures for measures of
average and spread
To enable the researcher to draw charts and graphs for the
presentation of data
To enable the reader to make comparisons between different
data sets.
Some of the basic terms that are most frequently used when dealing
with frequency distributions are the following:
Lower class limits are the smallest numbers that can belong
to the different classes.
Upper class limits are the largest numbers that can belong to
the different classes.
Class boundaries are the numbers used to separate classes,
but without the gaps created by class limits.
Class midpoints are the midpoints of the classes. Each class
midpoint can be found by adding the lower class limit to the
upper class limit and dividing the sum by 2.
Class width is the difference between two consecutive lower
class limits or two consecutive lower class boundaries.
2.2.2.1. Categorical Frequency Distribution
The categorical frequency distribution is used for data which can be
placed in specific categories, such as nominal or ordinal level data. For
example, data such as political affiliation, religious
affiliation, or major field of study would use a categorical frequency
distribution.
The major components of categorical frequency distribution are class,
tally and frequency. Moreover, even if percentage is not normally a part
of a frequency distribution, it will be added since it is used in certain
types of graphical presentations, such as pie graph.
Steps of constructing categorical frequency distribution
1. Identify whether the data are on a nominal or ordinal scale of
measurement.
2. Make a table as shown below.
A B C D
Class Tally Frequency Percent
C. How many workers had between 3 and 5 days of sick leave?
Solution:
A. Since this data set contains only a relatively small number of
distinct or different values, it is convenient to represent it in a
frequency table which presents each distinct value along with its
frequency of occurrence.
When we choose the number of classes, we have to think about the
following criteria
The classes must be mutually exclusive. Mutually exclusive
classes have non-overlapping class limits so that values cannot be
placed into two classes.
The classes must be continuous. Even if there are no values in a
class, the class must be included in the frequency distribution.
There should be no gaps in a frequency distribution. The only
exception occurs when the class with a zero frequency is the first
or last. A class with a zero frequency at either end can be
omitted without affecting the distribution.
The classes must be equal in width. The reason for having classes
with equal width is so that there is not a distorted view of the
data. One exception occurs when a distribution is open-ended,
i.e., it has no specific beginning or end values.
4. Find the class width by dividing the range by the number of classes:
$W = \frac{R}{K} = \frac{\text{Range}}{\text{Number of classes}}$
Note that: Round the answer up to the nearest whole number if
there is a remainder. For instance, 4.7 ≈ 5 and 4.12 ≈ 5.
5. Select the starting point as the lowest class limit. This is usually
the lowest score (observation). Add the width to that score to get
the lower class limit of the next class. Keep adding until you
achieve the desired number of classes ($K$) calculated in step 3.
6. Find the upper class limits. Subtract the unit of measurement (denoted $U$ here) from
the lower class limit of the second class in order to get the upper
limit of the first class. Then add the width to each upper class limit
to get all upper class limits.
Unit of measurement: the difference between a value and the next possible value. For
instance, for the data 28, 23, 52 the unit of measurement is one:
take one datum arbitrarily, say 23; the next
possible value is 24. Therefore, $U = 24 - 23 = 1$. If the data
are 24.12, 30, 21.2, then give priority to the datum with the most
decimal places. Take 24.12 and find the next possible value. It is
24.13. Therefore, $U = 24.13 - 24.12 = 0.01$.
Note that: $U = 1$ is the maximum value of the unit of measurement
and is the value used when we do not have a clue about the data.
7. Find the class boundaries: $LCB = LCL - \frac{1}{2}U$ and $UCB = UCL + \frac{1}{2}U$.
In short, $LCB = LCL - 0.5U$ and $UCB = UCL + 0.5U$.
8. Tally the data and write the numerical values for tallies in the
frequency column
9. Find the cumulative frequencies. There are two types of cumulative
frequency, namely less than cumulative frequency and more
than cumulative frequency. Less than cumulative frequency
is obtained by successively adding the frequencies of all the
previous classes, including the class against which it is
written. The accumulation starts from the lowest class and proceeds to the
highest. More than cumulative frequency is obtained by
finding the cumulative total of frequencies starting from the
highest class and proceeding to the lowest.
For example, the following frequency distribution table
gives the marks obtained by 40 students:
The above table shows how to find less than cumulative frequency and the table shown
below shows how to find more than cumulative frequency.
Example 2.3: Consider the following set of data and construct the frequency distribution.
11 29 6 33 14 21 18 17 22 38
31 22 27 19 22 23 26 39 34 27
Steps
1. Highest value = 39, Lowest value = 6
2. R = 39 − 6 = 33
3. K = 1 + 3.32 log 20 = 5.32 ≈ 6
4. $W = \frac{R}{K} = \frac{33}{6} = 5.5 \approx 6$
5. Select the starting point. Take the minimum, which is 6, then add the width 6 to it to get the next
class LCL.
6 12 18 24 30 36
6. Upper class limit. Since the unit of measurement is one, 12 − 1 = 11. So 11 is the UCL of the
first class. Therefore, 6-11 is the first class.
Class limit 6-11 12-17 18-23 24-29 30-35 36-41
7. Find the class boundaries. Take the formula in step 7: LCB = LCL − 0.5 and UCB =
UCL + 0.5
Class Boundaries 5.5-11.5 11.5-17.5 17.5-23.5 23.5-29.5 29.5-35.5 35.5-41.5
8, 9 and 10. Tally the data and record the frequency and cumulative frequencies of each class.
Frequency 2 2 7 4 3 2
Less than cumulative frequency 2 4 11 15 18 20
More than cumulative frequency 20 18 16 9 5 2
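The construction in Example 2.3 can be reproduced with a minimal Python sketch. It assumes that "log" in the rule for K means the base-10 logarithm (which matches the value 5.32) and that the unit of measurement is 1; the variable names are illustrative.

```python
import math

data = [11, 29, 6, 33, 14, 21, 18, 17, 22, 38,
        31, 22, 27, 19, 22, 23, 26, 39, 34, 27]

n = len(data)
data_range = max(data) - min(data)          # R = 39 - 6 = 33
k = math.ceil(1 + 3.32 * math.log10(n))     # K = 5.32, rounded up to 6
width = math.ceil(data_range / k)           # W = 5.5, rounded up to 6
unit = 1                                    # unit of measurement for whole numbers

rows = []
less_than_cf = 0
for j in range(k):
    lcl = min(data) + j * width             # lower class limit
    ucl = lcl + width - unit                # upper class limit
    freq = sum(lcl <= x <= ucl for x in data)
    less_than_cf += freq                    # running total from the lowest class
    more_than_cf = n - (less_than_cf - freq)  # total from this class upwards
    rows.append((lcl, ucl, lcl - unit / 2, ucl + unit / 2,
                 freq, less_than_cf, more_than_cf))

# Columns: LCL, UCL, LCB, UCB, frequency, less than CF, more than CF
for row in rows:
    print(row)
```

The first printed row is `(6, 11, 5.5, 11.5, 2, 2, 20)`, which agrees with the class limits and boundaries worked out by hand.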
[Figure: pie chart. Food 40%, House Rent 27%, Miscellaneous 20%, Fuel and Light 7%, Clothing 6%.]
B. Multiple bar charts are used when two or more sets of inter-related data are represented (a multiple
bar diagram facilitates comparison between more than one phenomenon). The technique of the
simple bar chart is used to draw this diagram, but the difference is that we use different
shades, colors, or dots to distinguish between different phenomena.
Example 2.6: Draw a multiple bar chart to represent the import and export of Canada (values in
$) for the years 1991 to 1995.
C. Stratified (Stacked or component) Bar Chart is used to represent data in which the total
magnitude is divided into different parts or components. In this diagram, first we make simple bars for
each class, taking the total magnitude in that class, and then divide these simple bars into parts
in the ratio of the various components. This type of diagram shows the variation in different
components within each class as well as between different classes. The sub-divided bar diagram
is also known as a component bar chart or stacked chart.
Example 2.7: The table below shows the quantity in hundred kgs of Wheat, Barley and Oats
produced on a certain farm during the years 1991 to 1994. Draw a stratified bar chart.
Solution: To make the component bar chart, first of all we have to take year wise total
production.
[Figures: histogram and frequency polygon. Vertical axis: Frequency; horizontal axis: Midpoints (2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5).]
[Figure: less than ogive.]
CHAPTER 3
3. Measures of Central Tendency (12 lecture hours) Lecture Note
a. Motivating examples
b. Objectives of measures of central tendency
c. Important characteristics of a good average
d. Summation notation
e. Mean
f. Median
g. Mode
The symbol $\sum_{i=1}^{N} X_i$ is mathematical shorthand for $X_1 + X_2 + X_3 + \dots + X_N$:
$$\sum_{i=1}^{N} X_i = X_1 + X_2 + \dots + X_N$$
The expression is read, "the sum of X sub i from i equals 1 to N." It means "add up all the
numbers."
Example: Suppose the following were scores made on the first homework assignment for
five students in the class: 5, 7, 7, 6, and 8. In this example set of five numbers, where N=5,
the summation could be written:
$$\sum_{i=1}^{5} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 5 + 7 + 7 + 6 + 8 = 33$$
The "i=1" in the bottom of the summation notation tells where to begin the sequence of
summation. If the expression were written with "i=2", the summation would start with the
second number in the set.
For example: $\sum_{i=2}^{5} X_i = X_2 + X_3 + X_4 + X_5 = 7 + 7 + 6 + 8 = 28$
The "N" in the upper part of the summation notation tells where to end the sequence of
summation. If there were only three scores then the summation and example would be:
$$\sum_{i=1}^{3} X_i = X_1 + X_2 + X_3 = 5 + 7 + 7 = 19$$
Sometimes if the summation notation is used in an expression and the expression must be
written a number of times, as in a proof, then a shorthand notation for the shorthand
notation is employed. When the summation sign " ∑ " is used without additional
notation, then "i=1" and "N" are assumed
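The role of the lower and upper indices can be sketched in Python using the five homework scores from the example. The helper `summation` is a hypothetical name introduced only for this illustration; it uses the 1-based indices of the notation.

```python
scores = [5, 7, 7, 6, 8]  # X_1 ... X_5 from the homework example

def summation(x, start=1, end=None):
    """Return x_start + ... + x_end, with 1-based indices as in the notation."""
    end = len(x) if end is None else end
    return sum(x[start - 1:end])

print(summation(scores))           # 33  (i = 1 to N)
print(summation(scores, start=2))  # 28  (i = 2 to 5)
print(summation(scores, end=3))    # 19  (i = 1 to 3)
```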
PROPERTIES OF SUMMATION
1. $\sum_{i=1}^{n} K = nK$, where $K$ is any constant
2. $\sum_{i=1}^{n} K X_i = K \sum_{i=1}^{n} X_i$, where $K$ is any constant
3. $\sum_{i=1}^{n} (a + bX_i) = na + b\sum_{i=1}^{n} X_i$, where $a$ and $b$ are any constants
4. $\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i$
5. $\sum_{i=1}^{N} (X_i Y_i) = X_1 Y_1 + X_2 Y_2 + \dots + X_N Y_N$
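As a short illustration of why these properties hold, the third one can be derived by writing the sum out term by term and regrouping. This is a sketch of the standard argument, not a derivation taken from the module itself.

```latex
\begin{align*}
\sum_{i=1}^{n} (a + bX_i)
  &= (a + bX_1) + (a + bX_2) + \dots + (a + bX_n) \\
  &= \underbrace{(a + a + \dots + a)}_{n \text{ terms}} + b\,(X_1 + X_2 + \dots + X_n) \\
  &= na + b\sum_{i=1}^{n} X_i
\end{align*}
```

For instance, with the scores 5, 7, 7, 6, 8 and the arbitrarily chosen constants a = 2 and b = 3, the left side is 17 + 23 + 23 + 20 + 26 = 109 and the right side is 5(2) + 3(33) = 109.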
Example 3.1: Considering the following data, determine
X Y
5 6
7 7
7 8
6 7
8 8
a) $\sum_{i=1}^{5} X_i$ b) $\sum_{i=1}^{5} Y_i$ c) $\sum_{i=1}^{5} 11$ d) $\sum_{i=1}^{5} (X_i + Y_i)$ e) $\sum_{i=1}^{5} (X_i - Y_i)$
f) $\sum_{i=1}^{5} X_i^2$ g) $\sum_{i=1}^{5} X_i Y_i$ h) $\sum_{i=1}^{5} X_i + \sum_{i=1}^{5} Y_i$ i) $\sum_{i=1}^{5} X_i \sum_{i=1}^{5} Y_i$
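Several sums that can be formed from the X and Y columns of Example 3.1 are computed in the sketch below. The dictionary keys are descriptive labels chosen for this illustration, not the lettering of the exercise.

```python
X = [5, 7, 7, 6, 8]
Y = [6, 7, 8, 7, 8]

results = {
    "sum of X": sum(X),
    "sum of Y": sum(Y),
    "sum of (X + Y)": sum(x + y for x, y in zip(X, Y)),
    "sum of (X - Y)": sum(x - y for x, y in zip(X, Y)),
    "sum of X squared": sum(x ** 2 for x in X),
    "sum of products XY": sum(x * y for x, y in zip(X, Y)),
    "product of sums": sum(X) * sum(Y),
}

for label, value in results.items():
    print(label, "=", value)
```

Note that the sum of products (241) differs from the product of the sums (1188); the two expressions are not interchangeable.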
The choice of these averages depends upon which best fits the property under discussion.
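The mean of $X_1, X_2, X_3, \dots, X_n$ is denoted by A.M. or $\bar{X}$ and is given by:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$
When the data are arranged or given in the form of a frequency distribution, i.e. there
are $k$ variate values such that a value $X_i$ has a frequency $f_i$ ($i = 1, 2, \dots, k$), then the
arithmetic mean will be
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
where $k$ is the number of classes and $\sum_{i=1}^{k} f_i = n$.
For a grouped frequency distribution, the class marks take the place of the individual values:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i Y_i}{\sum_{i=1}^{k} f_i}$$
where $Y_i$ = the class mark of the ith class and $f_i$ = the frequency of the ith class.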
Example 3.2:
2) The distribution of age at first marriage of 130 males was as given below
Age in years(X): 18 19 20 21 22 23 24 25 26 27 28 29
No. of males (f): 2 1 4 8 10 12 17 19 18 14 13 12
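The frequency-weighted mean for this table can be computed with a minimal Python sketch; the variable names are illustrative.

```python
ages = list(range(18, 30))                                # X: 18, 19, ..., 29
males = [2, 1, 4, 8, 10, 12, 17, 19, 18, 14, 13, 12]      # f

n = sum(males)                                            # total frequency
weighted_total = sum(f * x for f, x in zip(males, ages))  # sum of f_i * X_i
mean_age = weighted_total / n

print(n)                   # 130
print(weighted_total)      # 3240
print(round(mean_age, 2))  # 24.92
```

The total frequency of 130 matches the number of males stated in the example, which is a useful check before dividing.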
Class frequency
6- 10 35
11- 15 23
16- 20 15
21- 25 12
26- 30 9
31- 35 6
1) Find the mean of the marks obtained by 51 students with A=48.5 and w=10 of
fi 4 12 15 13 7
Merits:
• It is rigidly defined.
• It is based on all observations.
• It is suitable for further mathematical treatment.
• It is a stable average, i.e. it is not greatly affected by fluctuations of sampling.
• It is easy to calculate and simple to understand.
Demerits:
• It is affected by extreme observations.
• It cannot be used in the case of open-end classes.
• It cannot be determined by the method of inspection.
• It cannot be used when dealing with qualitative characteristics, such as intelligence,
honesty, beauty.
• It can be a number which does not exist in the series.
• Sometimes it leads to wrong conclusions if the details of the data from which it is
obtained are not available.
• It gives high weight to high extreme values and less weight to low extreme values.
3.6. The Mode
The mode is the value which occurs most frequently in a set of values.
The mode may not exist, and even if it does exist, it may not be unique.
In the case of a discrete distribution, the value having the maximum frequency is the
modal value.
If, in a set of observed values, all values occur once or an equal number of times, there
is no mode.
Examples:
1. Find the mode of 5, 3, 5, 8, and 9.
Mode = 5
2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
The data are bimodal: the modes are 8 and 9.
3. Find the mode of 4, 12, 3, 6, and 7.
There is no mode for this data set.
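The three cases above (one mode, two modes, no mode) can be sketched in Python. The function `modes` is a hypothetical helper written for this illustration; it follows the convention stated in the text that a data set in which all values occur equally often has no mode, and it assumes a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return the list of modes, or an empty list when there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # All distinct values equally frequent: no mode, by the text's convention.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([5, 3, 5, 8, 9]))        # [5]
print(modes([8, 9, 9, 7, 8, 2, 5]))  # [8, 9]
print(modes([4, 12, 3, 6, 7]))       # []
```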
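- The mode of a set of numbers $X_1, X_2, \dots, X_n$ is usually denoted by $\hat{X}$.
If data are given in the form of a continuous frequency distribution, the mode is defined as:
$$\hat{X} = L_{mod} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) W$$
where $L_{mod}$ is the lower class boundary of the modal class, $\Delta_1$ is the difference between the frequency of the modal class and that of the preceding class, $\Delta_2$ is the difference between the frequency of the modal class and that of the following class, and $W$ is the class width.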
Example 3.7: The following is the distribution of the size of certain farms selected at
random from a district. Calculate the mode of the distribution.
Size of farms    No. of farms
5-15             8
15-25            12
25-35            17
35-45            29
45-55            31
55-65            5
65-75            3
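Example 3.7 can be worked through with a minimal Python sketch of the grouped-mode formula. It assumes the usual meaning of the symbols: Δ1 is the modal frequency minus the frequency of the preceding class, and Δ2 is the modal frequency minus the frequency of the following class. It also assumes the modal class is neither the first nor the last class, which holds here.

```python
# (lower boundary, upper boundary, frequency) for each class in Example 3.7
classes = [(5, 15, 8), (15, 25, 12), (25, 35, 17), (35, 45, 29),
           (45, 55, 31), (55, 65, 5), (65, 75, 3)]

freqs = [f for _, _, f in classes]
m = freqs.index(max(freqs))        # position of the modal class
lower, upper, f_mod = classes[m]   # modal class is 45-55 with frequency 31

delta1 = f_mod - freqs[m - 1]      # 31 - 29 = 2
delta2 = f_mod - freqs[m + 1]      # 31 - 5 = 26
width = upper - lower              # class width = 10

mode = lower + delta1 / (delta1 + delta2) * width
print(round(mode, 2))  # 45.71
```

Because the class before the modal class has almost the same frequency while the class after it has far fewer farms, the mode lies close to the lower boundary of the modal class.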
Merits:
It is not affected by extreme observations.
Easy to calculate and simple to understand.
It can be calculated for distribution with open end class.
Can be used for qualitative data as well.
Demerits:
It is not rigidly defined.
It is not based on all observations
It is not suitable for further mathematical treatment.
It is not a stable average, i.e. it is affected by fluctuations of sampling to some extent.
Often its value is not unique.
Note: The median class is the class with the smallest cumulative frequency (less than type)
greater than or equal to n/2.
Example 3.9: Find the median of the following distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
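A possible way to work Example 3.9 in Python is sketched below. The grouped-median formula itself does not appear in the surviving text, so the sketch assumes the usual interpolation formula, median = LCB + ((n/2 − c) / f) × w, where LCB is the lower boundary of the median class, c the cumulative frequency before it, f its frequency and w the class width. The unit of measurement is assumed to be 1.

```python
# (lower class limit, upper class limit, frequency) from Example 3.9
classes = [(40, 44, 7), (45, 49, 10), (50, 54, 22), (55, 59, 15),
           (60, 64, 12), (65, 69, 6), (70, 74, 3)]
unit = 1  # assumed unit of measurement

n = sum(f for _, _, f in classes)
half = n / 2

cum_before = 0  # cumulative frequency of the classes before the median class
for lcl, ucl, f in classes:
    # The median class is the first class whose less than CF reaches n/2.
    if cum_before + f >= half:
        lcb = lcl - unit / 2
        width = (ucl + unit / 2) - lcb
        median = lcb + (half - cum_before) / f * width
        break
    cum_before += f

print(n, (lcl, ucl), cum_before)  # 75 (50, 54) 17
print(round(median, 2))           # 54.16
```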
Merits and Demerits of Median
Merits:
• Median is a positional average and hence not influenced by extreme observations.
• Can be calculated in the case of open end intervals.
• Median can be located even if the data are incomplete.
Demerits:
• It is not a good representative of data if the number of items is small.
• It is not amenable to further algebraic treatment.
• It is susceptible to sampling fluctuations.
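Empirical relationship between the mean $\bar{X}$, the mode $\hat{X}$, and the median $\tilde{X}$:
$\bar{X} = \hat{X} = \tilde{X}$, for a symmetrical distribution
$\bar{X} - \hat{X} = 3(\bar{X} - \tilde{X})$, for a unimodal skewed or asymmetrical frequency distribution.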
4. Measures of Variation (Dispersion), Skewness and Kurtosis (4 lecture hours) Lecture note
a. Motivating examples
b. Objectives of measures of variation
c. Measures of Dispersion (Variation)
i. Range and Relative Range
ii. Variance, Standard Deviation and Coefficient of Variation
iii. Standard Scores
4.1 Introduction
Consider the following two sets of scores:
Both these sets have the same mean (50), but the second set is a lot more widely dispersed
("scattered") than the first.
[Table: scores in Set 1 and Set 2]
A measure of central tendency alone does not adequately describe a set of observations unless all
the observations are the same. We therefore need some additional information, such as:
1) The extent to which the items in a particular distribution are scattered around the
central value, i.e. a measure of dispersion.
2) The direction of the scatter, i.e. whether more items are concentrated towards the higher or
the lower values: a measure of skewness.
3) The extent to which the distribution is more peaked or more flat-topped than the
normal distribution, i.e. a measure of kurtosis.
Definition:
The scatter or spread of items of a distribution is known as dispersion or variation.
In other words the degree to which numerical data tend to spread about an average
value is called dispersion or variation of the data.
Measures of dispersion are statistical measures which provide ways of measuring
the extent to which data are dispersed or spread out.
Measures of dispersion may be either absolute or relative.
Furthermore, even when the same unit of measurement is used, the two measures of central
tendency (means) may be quite different. If we compare the AMD of the weights of first grade
children with the AMD of the weights of high school freshmen, we may find that the latter AMD is
numerically larger than the former because the weights themselves are larger, not
because the variability is greater.
What is needed in situations like these is a measure of relative variation rather than
absolute variation. A relative measure is the ratio of an absolute measure of dispersion to an
appropriate average, such as the coefficient of standard deviation or the coefficient of mean deviation.
Various measures of dispersion are in use. The most commonly used measures of dispersion
are:
Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45
Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45
For this reason, among others, the range is not the most important measure of variability.
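The point can be checked numerically. The sketch below computes the range of the two distributions listed above and then recomputes it after dropping the single largest value; the helper name value_range and the "drop the largest value" comparison are illustrative assumptions, not part of the original notes.

```python
def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)

d1 = [32, 35, 36, 36, 37, 38, 40, 42, 42, 43, 43, 45]
d2 = [32, 32, 33, 33, 33, 34, 34, 34, 34, 34, 35, 45]

range_d1 = value_range(d1)
range_d2 = value_range(d2)
print(range_d1, range_d2)  # 13 13

# Dropping only the largest value shows how much the range depends on extremes.
trimmed_d1 = value_range(sorted(d1)[:-1])
trimmed_d2 = value_range(sorted(d2)[:-1])
print(trimmed_d1, trimmed_d2)  # 11 3
```

Both distributions have the same range even though most values of Distribution 2 lie between 32 and 35, because the range uses only the two extreme observations.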
For ungrouped data: $RR = \dfrac{X_{max} - X_{min}}{X_{max} + X_{min}}$
Example 2:
Height (in)       Number of students
Less than 59.5    0
Less than 62.5    5
Less than 65.5    23
Less than 68.5    65
Less than 71.5    92
Less than 74.5    100
$R = 74.5 - 56.5 = 18$
$\text{Coefficient of range} = \dfrac{x_{max} - x_{min}}{x_{max} + x_{min}} = \dfrac{74.5 - 56.5}{74.5 + 56.5} = \dfrac{18}{131} \approx 0.137$
Example 4.1:
1. Find R and RR, and then identify which data set is more dispersed.
a) For the monthly income of 10 workers, Xi: 347, 420, 500, 600, 696, 710, 835, 850, and
900.
b) For the following age distribution.
Class     Frequency
6-10      35
11-15     23
16-20     15
21-25     12
26-30     9
31-35     6
2. If the range and relative range of a series are 4 and 0.25 respectively, then what is the
value of:
a) the smallest observation
b) the largest observation
4.3.2 The Variance
Population Variance
If we divide the variation by the number of values in the population, we get something
called the population variance. This variance is the "average squared deviation from the
mean".
Population variance: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, $i = 1, 2, 3, \ldots, N$
Sample Variance
One would expect the sample variance to simply be the population variance with the
population mean replaced by the sample mean. However, one of the major uses of
statistics is to estimate the corresponding population parameter, and a formula with divisor n
has the problem that, on average, it underestimates the population variance. To counteract this,
the sum of the squares of the deviations is divided by one less than the sample size.
Sample variance: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
I.e. the sample variance, denoted by $s^2$, of a set of n observed values having a mean $\bar{x}$ is the
sum of the squared deviations divided by $n - 1$.
For a frequency distribution with k classes: $s^2 = \dfrac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n - 1}$
We usually use the following shortcut formulas.
$s^2 = \dfrac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$, for raw data
$s^2 = \dfrac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1}$, for a frequency distribution, where $\sum f_i = n$
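To see that the definitional form and the shortcut form of the sample variance agree, the following sketch evaluates both for raw data. It uses the sample 5, 17, 12, 10, 8 that appears in the examples of this section; the function names are illustrative only.

```python
def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Shortcut form: (sum of x^2 minus n times mean^2), divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)

data = [5, 17, 12, 10, 8]
v_definition = sample_variance(data)
v_shortcut = sample_variance_shortcut(data)
print(round(v_definition, 2), round(v_shortcut, 2))  # 20.3 20.3
```

The shortcut form avoids computing each deviation separately, which is convenient for hand calculation, but both forms give the same value up to rounding.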
Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means
that the units were also squared. To get the units back the same as the original data
values, the square root must be taken.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$
Examples: Find the variance and standard deviation of the following sample data
1. 5, 17, 12, 10, 8
2. The data are given in the form of a frequency distribution.
Class     Frequency
40-44     7
45-49     10
50-54     22
55-59     15
60-64     12
65-69     6
70-74     3
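One way to work these two examples is sketched below. For the raw data the standard library functions statistics.variance and statistics.stdev are used (both divide by n - 1). For the frequency distribution the class midpoints 42, 47, ..., 72 are assumed to represent each class, and the shortcut formula for a frequency distribution is applied; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

# Example 1: raw sample data.
raw = [5, 17, 12, 10, 8]
raw_var = variance(raw)
raw_sd = stdev(raw)
print(round(raw_var, 2), round(raw_sd, 2))  # 20.3 4.51

# Example 2: frequency distribution, using class midpoints as the x values.
midpoints = [42, 47, 52, 57, 62, 67, 72]
freqs = [7, 10, 22, 15, 12, 6, 3]
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
grouped_var = (sum_fx2 - n * mean ** 2) / (n - 1)
grouped_sd = sqrt(grouped_var)
print(mean, round(grouped_var, 2), round(grouped_sd, 2))  # 55.0 59.46 7.71
```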
Coefficient of Variation (C.V)
• Is defined as the ratio of the standard deviation to the mean, usually expressed as a
percentage.
$C.V = \dfrac{S}{\bar{X}} \times 100\%$
• The distribution having less C.V is said to be less variable or more consistent.
Examples:
1. An analysis of the monthly wages paid (in Birr) to workers in two firms A and B belonging
to the same industry gives the following results
Value Firm A Firm B
Mean wage 52.5 47.5
Median wage 50.5 45.5
Variance 100 121
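In which firm, A or B, is there greater variability in individual wages?
Solutions:
Calculate the coefficient of variation for both firms.
$C.V_A = \dfrac{S_A}{\bar{X}_A} \times 100\% = \dfrac{10}{52.5} \times 100\% = 19.05\%$
$C.V_B = \dfrac{S_B}{\bar{X}_B} \times 100\% = \dfrac{11}{47.5} \times 100\% = 23.16\%$
Since $C.V_A < C.V_B$, in firm B there is greater variability in individual wages.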
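The comparison of the two firms can be reproduced with a few lines of code. This is only a sketch; the helper coefficient_of_variation is a hypothetical name, and the standard deviations are obtained as the square roots of the variances given in the table.

```python
from math import sqrt

def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

cv_a = coefficient_of_variation(sqrt(100), 52.5)   # Firm A
cv_b = coefficient_of_variation(sqrt(121), 47.5)   # Firm B
print(round(cv_a, 2), round(cv_b, 2))  # 19.05 23.16

more_variable = "A" if cv_a > cv_b else "B"
print(more_variable)  # B
```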
Standard Scores (Z-scores)
$Z_i = \dfrac{X_i - \mu}{\sigma}$, for the population
$Z_i = \dfrac{X_i - \bar{x}}{s}$, for the sample
Z gives the deviation from the mean in units of standard deviation.
Z gives the number of standard deviations by which a particular observation lies above or
below the mean.
It is used to compare two observations coming from different groups.
Examples:
1. Two sections were given introduction to statistics examinations. The following
information was given.
Value Section 1 Section 2
Mean 78 90
Stand. deviation 6 5
Student A from section 1 scored 90 and student B from section 2 scored 95. Relatively
speaking who performed better?
Solutions:
Calculate the standard score of both students.
$Z_1 = \dfrac{X_1 - \bar{X}_1}{S_1} = \dfrac{90 - 78}{6} = 2$
$Z_2 = \dfrac{X_2 - \bar{X}_2}{S_2} = \dfrac{95 - 90}{5} = 1$
Student A performed better relative to his section, because the score of student A is
two standard deviations above the mean score of his section, while the score of
student B is only one standard deviation above the mean score of his section.
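The same comparison can be written as a small function. The sketch below assumes the sample form of the standard score, Z = (X - mean) / s; the function name z_score is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_a = z_score(90, 78, 6)   # student A, section 1
z_b = z_score(95, 90, 5)   # student B, section 2
print(z_a, z_b)  # 2.0 1.0
```

Although student B has the higher raw score, student A has the higher standard score, which is why A is judged to have performed better relative to the section.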
2. Two groups of people were trained to perform a certain task and tested to find out
which group is faster to learn the task. For the two groups the following information
was given:
Value Group one Group two
Mean 10.4 min 11.9 min
Stand.dev. 1.2 min 1.3 min
Relatively speaking:
a) Which group is more consistent in its performance?
b) Suppose person A from group one takes 9.2 minutes while person B from
group two takes 9.3 minutes. Who was faster in performing the task? Why?
Solutions:
a) Use the coefficient of variation.
$C.V_1 = \dfrac{S_1}{\bar{X}_1} \times 100\% = \dfrac{1.2}{10.4} \times 100\% = 11.54\%$
$C.V_2 = \dfrac{S_2}{\bar{X}_2} \times 100\% = \dfrac{1.3}{11.9} \times 100\% = 10.92\%$
Since $C.V_2 < C.V_1$, group 2 is more consistent.
b) Calculate the standard score of both persons.
$Z_A = \dfrac{X_A - \bar{X}_1}{S_1} = \dfrac{9.2 - 10.4}{1.2} = -1$
$Z_B = \dfrac{X_B - \bar{X}_2}{S_2} = \dfrac{9.3 - 11.9}{1.3} = -2$
Relatively speaking, person B is faster because the time taken by person B is two standard
deviations shorter than the average time taken by group 2, while the time taken by person A is
only one standard deviation shorter than the average time taken by group 1.
Chapter 5
5. Elementary probability (4 lecture hours) Lecture note
5.1.Introduction
• Probability theory is the foundation upon which the logic of inference is built.
• It helps us to cope with uncertainty.
• In general, probability is the chance of an outcome of an experiment. It is the measure of
how likely an outcome is to occur.
5.2. Definitions of some probability terms
1. Experiment: Any process of observation or measurement, or any process which
generates well-defined outcomes.
2. Probability Experiment (Random Experiment): An experiment that can be
repeated any number of times under similar conditions and for which it is possible to
enumerate all the possible outcomes without being able to predict an individual outcome.
Example: If a fair coin is tossed three times, it is possible to enumerate all
eight possible sequences of heads (H) and tails (T), but it is not possible to predict which
sequence will occur on any occasion.
3. Outcome: The result of a single trial of a random experiment.
4. Sample Space (S): The set of all possible outcomes of a probability experiment.
Example 1: The sample space of an experiment consisting of three tosses of a coin is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Example 2: Recording the gender of children of two-child families.
S = {bb, bg, gb, gg}. An event B may be: B = "children of both genders." Then B = {bg,
gb}.
Sample space can be
Countable (finite or infinite)
Uncountable
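For a small finite sample space, the outcomes can be listed mechanically. The sketch below uses itertools.product from the Python standard library to rebuild the two sample spaces given in Examples 1 and 2; the variable names are illustrative.

```python
from itertools import product

# Example 1: three tosses of a coin.
sample_space = ["".join(t) for t in product("HT", repeat=3)]
print(sample_space)
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']

# Example 2: gender of the children in two-child families.
children = ["".join(t) for t in product("bg", repeat=2)]
print(children)  # ['bb', 'bg', 'gb', 'gg']

# Event B: children of both genders.
event_b = [outcome for outcome in children if set(outcome) == {"b", "g"}]
print(event_b)  # ['bg', 'gb']
```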
5. Event: A subset of the sample space. It is a statement about one or
more outcomes of a random experiment, and it is denoted by a capital letter A, B, C, ...
For example, the event that there are exactly two heads in three tosses of a coin
consists of the three points HTH, HHT and THH.
Remark: If the sample space S has n members, then
there are exactly $2^n$ subsets, or events.
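The remark can be verified by listing every subset of a small sample space. The sketch below does this for the two-child sample space S = {bb, bg, gb, gg}, and also builds the "exactly two heads" event for three tosses of a coin; the variable names are illustrative.

```python
from itertools import combinations, product

# Every subset of S, from the empty set up to S itself, is an event.
S = ["bb", "bg", "gb", "gg"]
events = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]
print(len(events), 2 ** len(S))  # 16 16

# The event "exactly two heads in three tosses of a coin".
tosses = ["".join(t) for t in product("HT", repeat=3)]
two_heads = {outcome for outcome in tosses if outcome.count("H") == 2}
print(sorted(two_heads))  # ['HHT', 'HTH', 'THH']
```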
6. Equally Likely Events: Events which have the same chance of occurring.
7. Complement of an Event: The complement of an event A means the non-occurrence of A.
It is denoted by A', $A^c$ or $\bar{A}$, and contains those points of the sample space which
do not belong to A.
8. Elementary (simple) Event: an event having only a single element or sample point.
9. Mutually Exclusive (Disjoint) Events: Two events which cannot happen at the same
time.
10. Independent Events: Two events are said to be independent if the occurrence of one
does not affect the probability of the other occurring.
11. Dependent Events: Two events are dependent if the occurrence of the first event changes
the probability of occurrence of the second event.
5.3 Counting Techniques
The number of outcomes of a random experiment, or the number of cases favorable to an event, can be
determined by mathematical methods (the multiplication rule, the addition rule, permutations and
combinations) without direct enumeration.
Addition rule
If there are k procedures and the ith procedure may be performed in $n_i$ ways, $i = 1, 2, \ldots, k$, then the
number of ways in which we may perform procedure 1 or procedure 2 or … or procedure k is given by
$n_1 + n_2 + \ldots + n_k$, assuming that no two procedures may be performed together.
Example:
1. Suppose that we are planning a trip and are deciding between bus or train transportation. If
there are 3 bus routes and 2 train routes, how many different routes are available for the trip?
Solution: There are 3 bus routes and 2 train routes. Thus 3 + 2 = 5 routes are available for the trip.
Multiplication rule
In a sequence of n events in which the first one has $k_1$ possibilities, the second one has $k_2$, the third one
has $k_3$, and so on, the total number of possibilities of the sequence is $k_1 \times k_2 \times \ldots \times k_n$.
Example:
1. An instructor gives a six-question multiple choice examination. There are four possible
responses for each question. How many answer keys can be made?
2. A product is assembled in three stages. At the first stage there are 5 assembly lines, at the
second stage there are 6 assembly lines and at the third stage there are 10 assembly
lines. In how many different ways may the product be routed through the assembly process?
Solution:
1. $n = 6$ and $k_1 = k_2 = k_3 = k_4 = k_5 = k_6 = 4$.
In total there are $4 \times 4 \times 4 \times 4 \times 4 \times 4 = 4096$ possible answer keys.
2. $k_1 = 5$, $k_2 = 6$, $k_3 = 10$, so the product can be routed through the assembly process in
$5 \times 6 \times 10 = 300$ ways.
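Both counting rules are easy to check with a short script. The sketch below uses math.prod (Python 3.8 or later) for the multiplication rule and confirms the second answer by direct enumeration with itertools.product; the variable names are illustrative.

```python
from itertools import product
from math import prod

# Addition rule: 3 bus routes or 2 train routes.
trip_routes = 3 + 2
print(trip_routes)  # 5

# Multiplication rule: six questions with four possible responses each.
answer_keys = prod([4] * 6)
print(answer_keys)  # 4096

# Multiplication rule: 5, 6 and 10 assembly lines at the three stages.
assembly_routes = prod([5, 6, 10])
print(assembly_routes)  # 300

# Direct enumeration of every (stage 1, stage 2, stage 3) choice.
enumerated = len(list(product(range(5), range(6), range(10))))
print(enumerated)  # 300
```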
Permutations
Example:
1. Suppose that a photographer wants to arrange 4 people in a row for a photograph. In how
many different ways can the arrangement be done?
2. How many different 5-letter permutations can be formed from the letters in the word
DISCOVER?
Solution:
1. The number of arrangements of 4 people in a row is given by $4! = 4 \times 3 \times 2 \times 1 = 24$.
2. Here $n = 8$ and $r = 5$, so
${}_8P_5 = \dfrac{8!}{(8 - 5)!} = 6720$
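These two answers can be confirmed with the Python standard library. The sketch below uses math.factorial and math.perm (Python 3.8 or later), and also counts the 5-letter arrangements of DISCOVER directly, which works because its 8 letters are all different.

```python
from itertools import permutations
from math import factorial, perm

# 1. Arrangements of 4 people in a row.
row_arrangements = factorial(4)
print(row_arrangements)  # 24

# 2. 5-letter permutations from the 8 distinct letters of DISCOVER.
by_formula = perm(8, 5)   # 8! / (8 - 5)!
by_counting = sum(1 for _ in permutations("DISCOVER", 5))
print(by_formula, by_counting)  # 6720 6720
```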
Combination
A selection of distinct objects without regard to order is called a combination. The difference between a
permutation and a combination is that in a combination, the order or arrangement of the objects is not
important; by contrast, order is important in a permutation.
Example: 1 how many different committees of 3 people can be chosen to work on a special project
In a club there are 7 women and 5 men. A committee of 3 women and 2 men is to be chosen. How many
different possibilities are there?
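Solution: The 3 women can be chosen in ${}_7C_3 = 35$ ways and the 2 men in ${}_5C_2 = 10$ ways, so by the
multiplication rule there are $35 \times 10 = 350$ different possible committees.
Exercise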
A committee of 5 people must be selected from 5 men and 8 women. In how many ways can the
selection be done if there are at least 3 women in the committee?
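Committee counts of this kind combine combinations with the multiplication rule. As an illustration, the sketch below checks the club example (3 of 7 women and 2 of 5 men) using math.comb (Python 3.8 or later), and shows the link between combinations and permutations, nCr = nPr / r!. The variable names are illustrative.

```python
from math import comb, factorial, perm

# Club example: choose 3 of the 7 women and 2 of the 5 men.
women_choices = comb(7, 3)
men_choices = comb(5, 2)
committees = women_choices * men_choices
print(women_choices, men_choices, committees)  # 35 10 350

# A combination ignores order, so each selection of r objects
# corresponds to r! different permutations.
same_count = perm(7, 3) // factorial(3)
print(same_count)  # 35
```

For an "at least" condition such as the one in the exercise, the same idea is applied to each allowed case separately and the results are added, since the cases cannot occur together.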
Axiomatic Approach:
Let E be a random experiment and S be a sample space associated with E. With each event A we
associate a real number P(A), called the probability of A, which satisfies the following properties,
called the axioms (or postulates) of probability.
a) $0 \le P(A) \le 1$
b) $P(S) = 1$
c) If A and B are mutually exclusive events, the probability that one or the other occurs
equals the sum of the two probabilities, i.e. $P(A \cup B) = P(A) + P(B)$
d) For any event A, $P(A) \ge 0$
e) $P(\emptyset) = 0$
f) For any events A and B, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
g) $P(A') = 1 - P(A)$
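These properties can be checked on a concrete example. The sketch below takes the sample space of three coin tosses and assumes, for illustration only, that all eight outcomes are equally likely, so that P(A) is the number of outcomes in A divided by 8. The events A and B are arbitrary choices made for this example.

```python
from fractions import Fraction
from itertools import product

S = {"".join(t) for t in product("HT", repeat=3)}

def P(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {o for o in S if o.count("H") == 2}   # exactly two heads
B = {o for o in S if o[0] == "H"}         # first toss is a head

print(P(S), P(set()))                      # 1 0
print(P(A), P(B), P(A & B), P(A | B))      # 3/8 1/2 1/4 5/8
print(P(A | B) == P(A) + P(B) - P(A & B))  # True
print(P(S - A) == 1 - P(A))                # True
```

Using Fraction keeps the probabilities exact, so the equalities hold without rounding error.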
Conditional Events: If the occurrence of one event has an effect on the occurrence of the other
event, then the two events are conditional or dependent events.
Conditional Probability
Let A and B be two events such that $P(A) > 0$. Denote by $P(B|A)$ the probability of B given that A has
occurred. Since A is known to have occurred, it becomes the new sample space, replacing the original S.
From this we are led to the definition
$P(B|A) = \dfrac{P(A \cap B)}{P(A)}$, $P(A) > 0$, or $P(A \cap B) = P(B|A) \cdot P(A)$
Or $P(A|B) = \dfrac{P(A \cap B)}{P(B)}$, $P(B) > 0$, or $P(A \cap B) = P(A|B) \cdot P(B)$
The above definition implies that the probability that both A and B occur is equal to the probability that
A occurs times the probability that B occurs given that A has occurred. We call $P(B|A)$ the conditional
probability of B given A, i.e., the probability that B will occur given that A has occurred. It is easy to show
that conditional probability satisfies the axioms of probability.
Remark:
1) $0 \le P(A|B) \le 1$
2) $P(S|B) = 1$ and $P(A|S) = P(A)$
3) $P(A'|B) = 1 - P(A|B)$
4) $P(B'|A) = 1 - P(B|A)$
5) $P(A \cup B | C) = P(A|C) + P(B|C)$ if A and B are mutually exclusive
6) For three events A, B and C:
$P(A \cap B \cap C) = P(A) \, P(B|A) \, P(C|A \cap B)$
Examples
1. The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school days in
a week, the probability that it is Friday is 0.2. What is the probability that a student is absent given
that today is Friday?
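Solution:
$P(\text{Absent} | \text{Friday}) = \dfrac{P(\text{Friday and Absent})}{P(\text{Friday})} = \dfrac{0.03}{0.2} = 0.15$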
2. A jar contains black and white marbles. Two marbles are chosen without replacement. The
probability of selecting a black marble and then a white marble is 0.34, and the probability of
selecting a black marble on the first draw is 0.47. What is the probability of selecting white
marble on the second draw, given that the first marble drawn was black?
Solution:
$P(\text{White} | \text{Black}) = \dfrac{P(\text{Black and White})}{P(\text{Black})} = \dfrac{0.34}{0.47} \approx 0.72$
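Both examples apply the same ratio, the probability that both events occur divided by the probability of the given event. A minimal sketch follows; the function name conditional_probability is illustrative.

```python
def conditional_probability(p_joint, p_given):
    """P(B | A) = P(A and B) / P(A), defined only when P(A) > 0."""
    if p_given <= 0:
        raise ValueError("the given event must have positive probability")
    return p_joint / p_given

# Example 1: absent given that it is Friday.
p_absent_given_friday = conditional_probability(0.03, 0.2)
print(round(p_absent_given_friday, 2))  # 0.15

# Example 2: white on the second draw given black on the first.
p_white_given_black = conditional_probability(0.34, 0.47)
print(round(p_white_given_black, 2))  # 0.72
```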
Assignment:
4. Suppose a study conducted at Wollega University reveals that students who attended class
95 % to 100% of the time usually scored an A in the class. Students who attended class
80% to 90% of the time usually scored B or C in the class. Students who attended class less
than 80% of the time usually received D or F or eventually withdrew from the class.
i) Are descriptive, inferential, or both types of statistics used? Why?
ii) What is the population under study?
iii) What are the data in the study?
iv) What are the variables under study?
v) Identify the types of variables?
vi) Which type of scale of measurement is used for those variables?
City 1    25  24  23  26  17
City 2    22  21  24  22  20
City 3    32  27  35  24  28
Which city has the most consistent temperature, based on these data?
(Exercise)
6.