1 Descriptive Part

Descriptive probability

Uploaded by

hafizyt2014

Chapter one

Introduction

Definition and classification of statistics

Definition of statistics
The word statistics means different things to different people, depending on how they use it, but all the meanings
can be grouped under two definitions.
Statistics in Plural sense
✓ It refers to any information about any activity expressed in numbers.
Statistics in singular sense
✓ When statistics is used in its singular sense, it has its modern meaning: a branch of
mathematics or applied research concerned with the development and application of methods
and techniques for collecting, organizing, presenting, analyzing, and interpreting quantitative data
in such a way that the reliability of conclusions based on the data may be evaluated objectively in terms
of probability statements. This meaning refers to the study of statistics as a science.
✓ It is the science that deals with methods of collecting, organizing, analyzing and interpreting data.
Classification of Statistics
Statistics can be divided into two broad categories:
Descriptive statistics:
✓ It is a branch of statistics that deals with methods or procedures used to organize and summarize
masses of numerical data into a meaningful form by using statistical techniques such as tables,
charts, graphs and averages.
✓ This part of statistics covers the first four steps of a statistical investigation: collection, organization,
presentation and analysis of numerical data.
Inferential statistics:
✓ It is a branch of statistics concerned with interpreting data and drawing conclusions.
✓ It corresponds to the last step of a statistical investigation and is concerned with drawing conclusions about the
population from which a sample was taken.
✓ It can be defined as the science of using probability to make decisions.

THE NATURE OF THIS DISCIPLINE

Descriptive Statistics Probability Inferential Statistics

Application, uses and limitation of statistics

Uses of statistics
Statistics is used in almost all fields of human activities and used by government bodies, private business firms
and research agencies as a major tool. Some of the uses are:
✓ It helps in formulating and testing hypotheses and in developing new theories.
✓ It condenses and summarizes complex data.
✓ It helps to predict future trends.
Application area of statistics
✓ In research work: statistics is indispensable in research work
✓ In engineering areas and physical science
✓ In economics and biological science
✓ In social science and politics etc
Limitation of statistics
Although statistical methods are very useful, there are also many potential errors and limitations in carrying
out and interpreting statistical studies.
✓ Complete accuracy in statistics is often impossible.
✓ It cannot deal with a single value; it deals with a set of data.
✓ It cannot deal with qualitative data directly. It only deals with data that can be quantified. Ex: it does not deal
with marital status (married, single) itself, but it deals with the number of married and the number of single people.
✓ Statistical values are true on average. The conclusions drawn from the analysis of a sample may
differ from the conclusions that would be drawn from the entire population. For this reason
statistics is not an exact science.

Some Basic Terminologies in Statistics

Population:
✓ It is a totality of things, objects, people, etc with which the researcher is concerned.
✓ It can be qualitative or quantitative, finite or infinite
Sample:
✓ It is a portion or part of population of interest.
Parameter:
✓ It is a numerical characteristic of an entire population (Greek letters)
Statistic:
✓ It is a numerical characteristic of a sample (Latin letters)
Variable:
✓ It is a characteristic that differs from object to object.
Examples: Weight, stock prices, height, price of gasoline
Types of variables
1. Quantitative variables:
✓ They are variables that can be expressed numerically.
✓ They are variables that assume values of the measurable quantity.
✓ It can be classified as:
a. Discrete variables:
o They are variables whose values are obtained by counting.
o The possible values for such variables are 0, 1, 2, …. Ex: number of children in a family,
number of trees in a forest.
b. Continuous variables:
o They are variables that can take any value between two numbers.
o Their values are obtained by measuring. Ex: weight, height, rainfall records.
2. Qualitative variables:
✓ They are variables that cannot be expressed numerically.
✓ They are also known as categorical variables.
Note:
✓ For a quantitative variable, operations such as addition or averaging make sense; for a qualitative
variable they do not.
✓ A categorical variable is also known as an attribute, whereas a quantitative variable is often referred
to simply as a variable.
✓ If the variable can assume only one value, it is called a constant.
✓ In general, measurements give rise to continuous data, while enumerations, or countings, give rise to
discrete data.
Data:
✓ It is the set of values collected for the variable from each of the elements of a population or sample.
✓ Data are a numerical representation of a phenomenon.
✓ It is information that is expressed in quantitative form.

Types of Data

➢ Depending on the level (scale) of measurement

1. Nominal data - Categorical data where the categories are not ordered (e.g., ethnic group). The data are
classified into categories that cannot be arranged in any particular order.
2. Ordinal data - Categorical data that can be ordered, but where the increment between specific values is
arbitrary. The data are arranged in some order, but the differences between data values cannot be determined or
are meaningless.
3. Cardinal data - Data on a scale where addition is meaningful (e.g., a change of 3 inches in height).
There are two types of cardinal data:
a) Ratio-scale data - Cardinal data on a scale where ratios between values are meaningful (e.g.,
serum-cholesterol levels).
b) Interval-scale data - Cardinal data where the zero point is arbitrary. For such data, ratios are
not meaningful (e.g., Julian dates: we can calculate the number of days between two dates, but we
can't say that one date is twice as large as another date).

Note:
✓ For ratio data, the origin (i.e., the value zero) is meaningful, but the origin has no meaning for interval data.
Consequently, we can add and subtract interval data, but we cannot divide or multiply them. With ratio data we can
use all operations (i.e., addition, subtraction, division and multiplication).
✓ Nominal & ordinal scales belong to qualitative variables, whereas interval & ratio scales are
quantitative.

➢ Depending on time reference

1. Cross-sectional data:
✓ The data that are collected at one point in time.
✓ This is data collected at the same or one particular point in time on different elements. They are
snapshots that show how things are at one particular time. E.g., sales made at the same point in time
but at different market places.
2. Time series (longitudinal) data:
✓ The data that are collected over a period of time.
✓ This is data collected at several points in time from the same study objects or units.
3. Panel data: a combination of these two.
➢ Depending on the source of data
1. Primary data: the data that are collected for the first time for the problem under consideration.
2. Secondary data:
• The data collected previously by others for their own purposes.
• The data which are obtained from archives of organization, bulletins, journals or website.

Basic Steps in Statistical Study

For any statistical study, there are some basic steps to be followed once we draw a sample.
Step 1. Gather first-hand information from the sample; this is called the raw data.
Step 2. Tabular representation of the raw data, i.e. represent the raw data in a table.
Step 3. Pictorial representation of the data, i.e. draw a diagram from the data organized in the table.
Step 4. Numerically summarize the data, i.e. describe the entire data set with some key numbers.
Step 5. Analyze the data using mathematical formulae.
Step 6. Draw the final inference or conclusion about the population under study.

Chapter Two
Method of Data Collection and presentation

Data can be collected in a variety of ways. One of the most common methods is through the use of surveys.
Question: what is a survey?
Survey:
✓ It is a way of acquiring data from individuals directly or indirectly.
✓ It can be conducted through the mail, telephone, personal interview, etc.
✓ There are two kinds of survey:
1. Census survey (complete enumeration survey):
• It is a survey that includes every element in the population.
2. Sample survey:
• It is a survey that includes only subset of the population.
Note:
✓ If your data represents only a portion of the population you have a sample.
✓ If your data represents the entire population you have a census.
✓ A sample survey is often better than a census because it reduces cost, reduces effort, and can accommodate more detailed
information.
✓ A census is better than a sample survey when the population is small or when the population is
heterogeneous.

Organizing a Raw Data Set

Once a sample is drawn, we observe the value of the variable (categorical or quantitative) for the sampled objects
or individuals. Each value thus obtained is called an entry, a data point or simply an observation, and the collection
of all the entries or observations is called a data set, often abbreviated as data. The most convenient method
of organizing raw data is to construct a frequency distribution (frequency table).

Definition: frequency distribution (f.d)
✓ It is organizing data in table form, using classes & frequencies.
✓ It shows how many observations fall in various categories.
✓ It can be classified as:
A. Categorical (qualitative) f.d
B. Numerical (quantitative) f.d

A. Categorical (qualitative) f.d:

✓ It is used for the data that can be placed in specific categories, such as nominal or ordinal.
✓ Data are classified according to non-numerical categories.

Ex: 25 army inductees were given a blood test to determine their blood type. The data set is

A B AB B O
O O AB B B
B B A O O
A O O O AB
AB A B O A

Construct a frequency distribution for this data.

Solution:
Step 1. Determine the classes: the classes are A, B, O & AB.
Step 2. Determine the frequency for each class.
Therefore the f.d is as follows:
Class          A   B   O   AB
Frequency (f)  5   7   9   4
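
The two steps above can be sketched in Python. This is only an illustration of the tally, not part of the original notes; the variable names (`blood_types`, `classes`, `frequency`) are assumptions chosen for readability.

```python
from collections import Counter

# The 25 blood-type observations from the example, read row by row.
blood_types = (
    "A B AB B O "
    "O O AB B B "
    "B B A O O "
    "A O O O AB "
    "AB A B O A"
).split()

# Step 1: the classes are the distinct categories.
classes = ["A", "B", "O", "AB"]

# Step 2: tally how many observations fall in each class.
counts = Counter(blood_types)
frequency = {c: counts[c] for c in classes}

for c in classes:
    print(f"{c:>2}: {frequency[c]}")
```

The frequencies should add up to the number of observations (25), which is a quick check that no entry was missed.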

Note:
We can transform the frequency distribution into a relative frequency distribution, a percentage frequency
distribution & a cumulative frequency distribution.
✓ In order to transform a f.d to a relative f.d we can use the following formula:
Relative frequency = $\frac{f}{n}$, where f = actual frequency & n = total frequency
✓ In order to transform a f.d to a percentage distribution we multiply the relative frequency by 100%,
i.e. percentage frequency = $\frac{f}{n} \times 100\%$
✓ In order to transform a f.d to a cumulative f.d we first have to define the cumulative f.d.

Definition: The cumulative frequency of a class is the sum of all frequencies preceding or succeeding
that class, including the frequency of that class. There are two types of cumulative frequency distributions, namely
the “less than” and “more than” cumulative frequency distributions.
I. The “less than” cumulative frequency (LCF) of a class is obtained by adding the frequencies
of the preceding classes, including the frequency of that class.
II. The “more than” cumulative frequency (MCF) of a class is obtained by adding the
frequencies of the succeeding classes, including the frequency of that class.
• From the above example let us construct all forms of f.d:
Class   Frequency   Relative frequency   Percentage frequency   LCF   MCF
A       5           0.20                 20%                    5     25
B       7           0.28                 28%                    12    20
O       9           0.36                 36%                    21    13
AB      4           0.16                 16%                    25    4

Note: from the above table we can construct
✓ the relative f.d as follows
Class   Relative frequency
A       0.20
B       0.28
O       0.36
AB      0.16
✓ the percentage f.d as follows
Class   Percentage frequency
A       20%
B       28%
O       36%
AB      16%
✓ the cumulative f.d as follows
Class   LCF   MCF
A       5     25
B       12    20
O       21    13
AB      25    4
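
As a hedged sketch, the relative, percentage and cumulative columns can be derived from the frequencies of the blood-type example as follows. The list names are illustrative assumptions; `itertools.accumulate` is simply one convenient way to form running totals.

```python
from itertools import accumulate

classes = ["A", "B", "O", "AB"]
freq = [5, 7, 9, 4]          # frequencies from the blood-type example
n = sum(freq)                # total frequency

relative = [f / n for f in freq]          # f / n
percentage = [100 * f / n for f in freq]  # (f / n) * 100%

# "Less than" cumulative frequency: running total from the first class down.
lcf = list(accumulate(freq))
# "More than" cumulative frequency: running total from the last class up.
mcf = list(accumulate(reversed(freq)))[::-1]

for row in zip(classes, freq, relative, percentage, lcf, mcf):
    print("{:<3} {:>2} {:>5.2f} {:>5.0f}% {:>3} {:>3}".format(*row))
```

The last LCF and the first MCF both equal the total frequency n, which is a useful consistency check.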

B. Numerical (quantitative) f.d:

✓ It is used to display numerical data.
✓ Data are classified according to numerical size. This is used to summarize data collected at the interval
and ratio levels of measurement.
✓ It can be classified as
a) Ungrouped f.d: it is a f.d where we count the number of times each value of the variable occurs. It is used
when the range of the data is small.
Ex: The data shown here represent the number of miles per gallon that 30 selected four-wheel-drive sports utility
vehicles obtained in city driving. Construct a frequency distribution.

12 17 12 14 16 18 16 18 12 16
17 15 15 16 12 15 16 16 12 14
15 12 15 15 19 13 16 18 16 14

Solution:
Step 1. Determine the classes (i.e. the classes are 12, 13, 14, 15, 16, 17, 18, and 19).
Step 2. Determine the frequency for each class.
Therefore the f.d is as follows:
Class      12  13  14  15  16  17  18  19
Frequency  6   1   3   6   8   2   3   1

b) Grouped f.d: when the range of the data is large, the data must be grouped into classes that are more than one unit
in width, in what is called a grouped (continuous) f.d.
Ex: A machine produces the following number of rejects in each successive period of five minutes. Construct a f.d.

16 21 26 24 11 17 26 25 13 27
24 26 3 27 23 24 15 22 22 12
22 29 18 22 28 25 7 17 22 28
19 23 23 22 3 19 13 31 23 28
24 9 20 33 30 23 20 8 21 24

Solution:
Step 1. Determine the classes
For a grouped f.d a class can be described in two ways, by class limits (CL) & class boundaries (CB). In
order to form the classes we use the following procedure:

I. Determine the number of classes (k). It can be calculated as $k = 1 + 3.322 \log_{10} n$, where k = number of classes
required (if the value is a decimal, round up to the next whole number) and n = number of observations in the sample.
Or we can find k by using $k = 2.5\, n^{1/4}$.

II. Determine the class width (interval, size) (W). It can be calculated as W = Range/k (if the value is a
decimal, round up to the next whole number), where Range = max − min.
III. Select the starting point, i.e. the lowest class limit (LCL). This can be the smallest data value. Add the width
to the starting point to get the lower limit of the next class. Keep adding W until
the number of classes becomes k.
IV. Subtract one unit from the lower limit of the 2nd class to get the upper limit of the 1st class. Then add the
class width to each upper limit to get all the upper limits.
V. Find the class boundaries by subtracting 0.5 from each lower class limit (LCL) & adding 0.5 to each upper class
limit (UCL).
Step2. Determine the frequency for each class
The completed grouped f.d is as follows:
Class limit   Class boundary   Frequency
3-7           2.5-7.5          3
8-12          7.5-12.5         4
13-17         12.5-17.5        6
18-22         17.5-22.5        13
23-27         22.5-27.5        17
28-32         27.5-32.5        6
33-37         32.5-37.5        1
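
The procedure for building the grouped f.d can be sketched in Python for the rejects data. This is a minimal illustration under stated assumptions: it uses the rule $k = 1 + 3.322 \log_{10} n$, rounds k and W up with `math.ceil`, starts at the smallest observation, and assumes whole-number data so that one unit is 1 and the boundary offset is 0.5. All variable names are hypothetical.

```python
import math

# Number of rejects in 50 successive five-minute periods (from the example).
rejects = [
    16, 21, 26, 24, 11, 17, 26, 25, 13, 27,
    24, 26, 3, 27, 23, 24, 15, 22, 22, 12,
    22, 29, 18, 22, 28, 25, 7, 17, 22, 28,
    19, 23, 23, 22, 3, 19, 13, 31, 23, 28,
    24, 9, 20, 33, 30, 23, 20, 8, 21, 24,
]
n = len(rejects)

# I. Number of classes, rounded up to the next whole number.
k = math.ceil(1 + 3.322 * math.log10(n))

# II. Class width = range / k, rounded up to the next whole number.
data_range = max(rejects) - min(rejects)
width = math.ceil(data_range / k)

# III-IV. Lower limits start at the smallest value; each upper limit is one
# unit below the next lower limit.
lower_limits = [min(rejects) + i * width for i in range(k)]
class_limits = [(lo, lo + width - 1) for lo in lower_limits]

# V. Class boundaries: 0.5 below each lower limit, 0.5 above each upper limit.
class_boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in class_limits]

# Step 2. Frequency of each class.
frequency = [sum(lo <= x <= hi for x in rejects) for lo, hi in class_limits]

for (lo, hi), (lb, ub), f in zip(class_limits, class_boundaries, frequency):
    print(f"{lo}-{hi}\t{lb}-{ub}\t{f}")
```

One caveat of this sketch: `math.ceil` leaves an exact whole number unchanged, so if Range/k happens to be a whole number the last class may stop just short of the maximum and the width would need to be increased by hand.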

Note: consider the following table

✓ The use of CB is to separate the classes so that there are no gaps in the f.d. Ex: with class limits there is a gap between 7 & 8 and between 12 & 13.
✓ Usually 5 to 15 classes are used. If you use fewer than 5 classes, you risk losing too much information. If you use more
than 15 classes the data may not be sufficiently summarized.
✓ In the CB column, the upper boundary of one class and the lower boundary of the next class (the values on the diagonal of the table) are the same.
✓ When all the classes have the same (uniform) class width (W), the W of the distribution is the difference between
the LCLs (or the UCLs) of two consecutive classes.
✓ We can find the mid point (class mark) ($X_{mi}$) of each class from the frequency distribution. It can be computed as
$$X_{mi} = \frac{LCL_i + UCL_i}{2} \quad \text{or} \quad X_{mi} = \frac{LCB_i + UCB_i}{2}$$
Class limit   Class boundary   Frequency   Xmi   LCF   MCF
3-7           2.5-7.5          3           5     3     50
8-12          7.5-12.5         4           10    7     47
13-17         12.5-17.5        6           15    13    43
18-22         17.5-22.5        13          20    26    37
23-27         22.5-27.5        17          25    43    24
28-32         27.5-32.5        6           30    49    7
33-37         32.5-37.5        1           35    50    1
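
A short Python check, offered as a sketch, of two of the notes above: the class mark taken as half the sum of the two limits (or of the two boundaries) reproduces the Xmi column, and the uniform width equals the difference between consecutive lower limits. The list names are illustrative assumptions.

```python
class_limits = [(3, 7), (8, 12), (13, 17), (18, 22), (23, 27), (28, 32), (33, 37)]
class_boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in class_limits]

# Class mark from the limits and from the boundaries; both give the same value.
mid_from_limits = [(lo + hi) / 2 for lo, hi in class_limits]
mid_from_boundaries = [(lb + ub) / 2 for lb, ub in class_boundaries]

# Width as the difference between the lower limits of consecutive classes.
widths = [b[0] - a[0] for a, b in zip(class_limits, class_limits[1:])]

print(mid_from_limits)
print(widths)
```

Using the boundaries instead of the limits changes neither the class marks nor the width, because both ends of each class shift by the same 0.5.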

Pictorial representation of the data

Practically everyone encounters graphs at one time or another.

Definition of graph:
✓ The word graph comes from the Greek word meaning ‘’to draw or write.’’
✓ We define a graph as a pictorial representation of a set of data.
✓ Many types of graphs are employed in statistics, depending on the nature of the data involved and the
purpose for which the graph is intended.

The step of pictorial representation comes after the raw data set has been pruned & organized.
The most common & simple forms of pictorial representation of data are
✓ Bar chart
✓ Pie chart
✓ Histogram

Bar chart/bar diagram/bar graph
✓ It is used to display distributions of categorical variables.
✓ One bar per category – height is determined by frequency or relative frequency
✓ Order of categories is arbitrary.
✓ Does NOT let you talk about the shape of a distribution.
Features of a bar chart
✓ Bars can be horizontal or vertical
✓ Bars are of uniform width & uniformly spaced [leave space between the bars (categories) to indicate that they are distinct]
✓ The length of the bar represents values of the variable being displayed, the frequency of occurrence, or
the percentage of occurrence. The same measurement scale is used for the length of each bar.
✓ The graph is well annotated with title, labels for each bar, & vertical scale or actual value for the length
of each bar.
✓ It can be classified as:
• Simple bar chart
• Component bar chart
• Multiple bar chart

Simple bar chart: is used to represent only one variable.

Example: construct a bar chart to show the religious affiliation of the American population
Religion         Number of population (million)
Protestant       79
Roman Catholic   31
Jewish           4
Others           2

Figure of Simple Bar Diagrams (vertical bars for Protestant, Roman Catholic, Jewish and Others; vertical axis: number of population in millions, 0 to 100)

Note:
✓ The above graph shows that each bar has an equal width but unequal length.
✓ The length indicates the number of population.
✓ It has a limitation because a simple bar diagram can display only one classification or one category of data.
✓ It may be noted that the simple bars shown in the above figure are drawn vertically. They are, therefore,
known as vertical bars. But the same bars can be drawn horizontally as shown in the figure below.

Figure Horizontal Simple Bar Diagram (horizontal bars for Others, Jewish, Roman Catholic and Protestant; horizontal axis: number of population in millions, 0 to 100)

Component Bar Diagram: As the name of this diagram implies, it shows subdivisions of components in a
single bar. When it is desired to show how a total is divided into its components, we use a component bar chart.
In this type of bars different colors are used for identification.

Example: display the following yields of farmers in SNNPR using a suitable chart.
CROP/YEAR   1990   1991   1992   1993
PEAS        14     15     26     19
WHEAT       10     15     14     25
MAIZE       2      6      10     3
TOTAL       26     36     50     47

Fig of Component bar Diagram (one bar per year, 1990 to 1993, subdivided into peas, wheat and maize; vertical axis 0 to 60)

Multiple Bars: When two or more interrelated series of data are depicted by a bar diagram, such a diagram
is known as a multiple-bar diagram. Suppose we have the birth rates and death rates of five different countries. We
can display them by two bars close to each other for each country, one representing the birth rate and the other the death rate.
The figure shows such a diagram based on hypothetical data.

Example: the following table gives the birth rates and death rates of different countries during 1998
Country   Birth Rate   Death Rate
A         33           24
B         16           11
C         20           14
D         40           18

Figure Multiple Bar charts (bars for countries A, B, C and D; vertical axis 0 to 60)

Pie chart/pie diagram/circle graph

It is a circle divided into sectors, used to display the percentage of the total number of measurements falling into each of the categories.
Since the total angle at the center of a circle is 360 degrees (°), we convert the relative frequencies into the
corresponding degrees using the formula: degrees for a category or class = relative frequency × 360°.

Example: Draw a pie diagram for the following data on Five Year Plan public sector outlays
Sector                              Percentage outlay
Agriculture and rural Development   12.9%
Irrigation etc                      12.5%
Energy                              27.2%
Industry and minerals               15.4%
Transport communication             15.9%
Social services and others          16.1%

Solution: the angle at the center is given by $\frac{\text{percentage outlay}}{100} \times 360^\circ = \text{percentage outlay} \times 3.6^\circ$
Sector                              Percentage outlay   Angle at the center
Agriculture and rural Development   12.9%               12.9 × 3.6 = 46.44°
Irrigation etc                      12.5%               12.5 × 3.6 = 45.00°
Energy                              27.2%               27.2 × 3.6 = 97.92°
Industry and minerals               15.4%               15.4 × 3.6 = 55.44°
Transport communication             15.9%               15.9 × 3.6 = 57.24°
Social services and others          16.1%               16.1 × 3.6 = 57.96°
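
The conversion from percentage outlay to central angle can be sketched in Python as below. The dictionary name `outlay` and the sector labels as keys are illustrative assumptions; the factor 3.6 is simply 360°/100.

```python
# Percentage outlay per sector, from the example.
outlay = {
    "Agriculture and rural development": 12.9,
    "Irrigation etc": 12.5,
    "Energy": 27.2,
    "Industry and minerals": 15.4,
    "Transport communication": 15.9,
    "Social services and others": 16.1,
}

# Angle at the center = percentage * 360 / 100 = percentage * 3.6 degrees.
angles = {sector: pct * 3.6 for sector, pct in outlay.items()}

for sector, angle in angles.items():
    print(f"{sector:<35} {angle:6.2f} degrees")

print(f"Total: {sum(angles.values()):.2f} degrees")
```

Because the percentages add up to 100%, the unrounded angles add up to 360°. Rounding each angle to a whole degree before adding can leave the total a degree short or over.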

Figure: Pie diagram of the percentage outlays by sector

HISTOGRAMS AND FREQUENCY POLYGONS

Histograms and frequency polygons are two graphic representations of frequency distributions.
1. A histogram, or frequency histogram, consists of a set of rectangles having
a. bases on a horizontal axis (the X axis), with centers at the class marks and lengths
equal to the class interval sizes, and
b. areas proportional to the class frequencies.
2. A frequency polygon is a line graph of the class frequencies plotted against the class marks. It can be obtained by
connecting the midpoints of the tops of the rectangles in the histogram.
• The first end point is joined to the x-axis at a point showing zero frequency just
before the first class interval, and the last end point is joined to the x-axis at the point just after the last class
interval.

Ogive curves
So far we have discussed graphic devices that show the frequencies as they are given to us, i.e. non-cumulative
frequencies. We now take up another type of graph, which is based on cumulative frequencies: a
graph that represents the cumulative frequencies for the classes in a f.d.

The cumulative frequency curve (ogive)

The cumulative frequency curve (or ogive) is the graphic representation of a cumulative frequency distribution.
There are two types of ogives:
I) Less than ogive: the less than cumulative frequencies are plotted against the upper boundaries of their respective class intervals.
II) Greater than ogive: the greater than cumulative frequencies are plotted against the lower boundaries of their respective class
intervals.
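
As an illustration, the points that would be plotted for the two ogives can be computed from the grouped f.d of the rejects example in Chapter Two. This sketch only builds the (x, y) coordinates; drawing them is left to whatever plotting tool is at hand. The names `less_than_ogive` and `greater_than_ogive` are assumptions.

```python
from itertools import accumulate

# Class boundaries and frequencies of the grouped f.d (rejects example).
class_boundaries = [(2.5, 7.5), (7.5, 12.5), (12.5, 17.5), (17.5, 22.5),
                    (22.5, 27.5), (27.5, 32.5), (32.5, 37.5)]
frequency = [3, 4, 6, 13, 17, 6, 1]

lcf = list(accumulate(frequency))                   # less than cumulative
mcf = list(accumulate(reversed(frequency)))[::-1]   # more than cumulative

# Less than ogive: LCF plotted against the upper class boundaries.
less_than_ogive = [(ub, c) for (_, ub), c in zip(class_boundaries, lcf)]
# Greater than ogive: MCF plotted against the lower class boundaries.
greater_than_ogive = [(lb, c) for (lb, _), c in zip(class_boundaries, mcf)]

print(less_than_ogive)
print(greater_than_ogive)
```

The less than ogive rises from 3 to 50 while the greater than ogive falls from 50 to 1.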

Chapter Three
Numerical representation of a data set

There are two basic ways to summarize numerical data. These are
1. Measure of Central Tendency (MCT)
2. Measure of Variation (Dispersion)
Measure of Central Tendency (MCT):
✓ Quantitative variables contained in raw data or in frequency tables can be summarized by means of a few
numerical values. A key element of this summary is called the MCT. It is also called measure of average

Types of measures of central tendency

There are several different measures of central tendency; each has its advantage and disadvantage.

Three measures of the center of a distribution are commonly used: mean, median, and mode. Any of them can
be used with normally distributed data; however, with ordinal data, the mean of the raw scores is usually not
appropriate. Especially if one is computing certain statistics, the mean of the ranked scores of ordinal data
provides useful information. With nominal data, the mode is the only appropriate measure

Mean
✓ It is a measure of location or central value for a continuous variable.
✓ Most useful when the data have a symmetric distribution and do not contain outliers.
✓ It is the most popular & best understood MCT for a quantitative data set. Thus, it is usually the statistic
of choice, assuming that the data are normally distributed.

Properties of the summation notation
1. $\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + n$
2. $\sum_{i=1}^{n} 1 = n$
3. $\sum_{i=1}^{n} c = n \cdot c$, where c is a constant number
4. $\sum_{i=1}^{n} (x_i + c) = \sum_{i=1}^{n} x_i + n \cdot c$
5. $\sum_{i=1}^{n} c\,x_i = c \sum_{i=1}^{n} x_i$, where c is a constant number
6. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$
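
The properties of the summation notation can be checked numerically. The short Python sketch below uses made-up values for x, y and c (they are hypothetical and do not come from the notes) and records whether each identity holds for them.

```python
# Hypothetical values, chosen only to illustrate the identities.
x = [2, 5, 7, 10]
y = [1, 4, 6, 3]
c = 3
n = len(x)

checks = {
    "1. sum of i":        sum(range(1, n + 1)) == 1 + 2 + 3 + 4,
    "2. sum of 1":        sum(1 for _ in x) == n,
    "3. sum of c":        sum(c for _ in x) == n * c,
    "4. sum of (x + c)":  sum(xi + c for xi in x) == sum(x) + n * c,
    "5. sum of c*x":      sum(c * xi for xi in x) == c * sum(x),
    "6a. sum of (x + y)": sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y),
    "6b. sum of (x - y)": sum(xi - yi for xi, yi in zip(x, y)) == sum(x) - sum(y),
}

for name, ok in checks.items():
    print(name, "holds" if ok else "fails")
```

A numerical check on one data set is not a proof, but it is a quick way to catch a misremembered rule.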

The following table indicates the formula for the mean

For individual or raw data:
  For population data: $AM = E(x) = \mu = \frac{\sum_{i=1}^{N} x_i}{N}$
  For sample data: $AM = M(x) = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
For a frequency distribution, ungrouped data:
  For population data: $AM = E(x) = \mu = \frac{\sum_{i=1}^{k} f_i x_i}{N}$
  For sample data: $AM = M(x) = \bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$
For a frequency distribution, grouped data:
  For population data: $AM = E(x) = \mu = \frac{\sum_{i=1}^{k} f_i m_i}{N}$
  For sample data: $AM = M(x) = \bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{n}$
Where:
✓ $x_i$ = the i-th observation (for a f.d, the i-th distinct value)
✓ $m_i$ = the mid point of the i-th class
✓ $f_i$ = the frequency of the i-th value or class, and k = the number of distinct values or classes
✓ $n = \sum_{i=1}^{k} f_i$ = the total number of observations in the sample data
✓ $N = \sum_{i=1}^{k} f_i$ = the total number of observations in the population data
✓ $M(x) = \bar{x}$ = the notation for sample data
✓ $E(x) = \mu$ = the notation for population data
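
The three versions of the sample mean can be sketched in Python using the two data sets from Chapter Two: the miles-per-gallon data for the raw and ungrouped cases, and the rejects f.d for the grouped case. Variable names are illustrative assumptions.

```python
# Raw data: miles per gallon for 30 vehicles (ungrouped f.d example).
mpg = [
    12, 17, 12, 14, 16, 18, 16, 18, 12, 16,
    17, 15, 15, 16, 12, 15, 16, 16, 12, 14,
    15, 12, 15, 15, 19, 13, 16, 18, 16, 14,
]
mean_raw = sum(mpg) / len(mpg)

# Ungrouped f.d of the same data: distinct values and their frequencies.
values = [12, 13, 14, 15, 16, 17, 18, 19]
freq = [6, 1, 3, 6, 8, 2, 3, 1]
mean_ungrouped = sum(f * x for f, x in zip(freq, values)) / sum(freq)

# Grouped f.d of the rejects data: class marks and frequencies.
midpoints = [5, 10, 15, 20, 25, 30, 35]
freq_grouped = [3, 4, 6, 13, 17, 6, 1]
mean_grouped = sum(f * m for f, m in zip(freq_grouped, midpoints)) / sum(freq_grouped)

print(round(mean_raw, 2), round(mean_ungrouped, 2), round(mean_grouped, 2))
```

The raw and ungrouped means agree exactly because the ungrouped f.d keeps every value. The grouped mean is an approximation, since each observation is replaced by the mid point of its class.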

Alternative to the Arithmetic Mean-Median

Median:
✓ It is the middle number when the measurements are arranged in ascending (descending) order.
✓ It is the appropriate measure of central tendency for ordinal level raw data.
✓ It is a better measure of central tendency than the mean when the frequency distribution is skewed.
Note:
✓ In symmetric distributions the mean and median are the same.
✓ In skewed distributions, the median is more appropriate.
✓ The median provides a measure of location of a sample that is suitable for asymmetric distributions and is also
relatively insensitive to the presence of outliers.
Mode:
✓ It is the value of the item which appears most frequently.
✓ It is the most common category. The mode can be used with any kind of data but generally provides the
least precise information about central tendency.
✓ It is the only measure of central tendency that can be used with nominal data.

Note:
✓ There can be more than one mode, or there may be no mode when all observations in the data set have
equal frequency.
✓ When all the values occur the same number of times, we usually say that there is no unique mode.

The following table indicates the formula for median & mode (for sample data)

Median for ungrouped data (observations arranged in order):
  If n is odd: Median = the $\left(\frac{n+1}{2}\right)^{th}$ value
  If n is even: Median = $\frac{\left(\frac{n}{2}\right)^{th}\ \text{value} + \left(\frac{n}{2}+1\right)^{th}\ \text{value}}{2}$
Median for grouped data:
  $$\text{Median} = LCB + \left(\frac{\frac{n}{2} - Cf}{f_{mi}}\right) w$$
  Where:
  ✓ LCB = the lower class boundary of the median class,
  ✓ Cf = the LCF of the class preceding (above) the median class,
  ✓ $f_{mi}$ = frequency of the median class &
  ✓ w = the width of the median class.
Mode for ungrouped data:
  Mode = the value that has the highest frequency
Mode for grouped data:
  $$\text{Mode} = l_o + \left(\frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)}\right) w$$
  Where:
  ✓ $l_o$ = the lower class boundary of the modal class,
  ✓ $f_1$ = the frequency of the modal class,
  ✓ $f_0$ = the frequency of the class preceding the modal class,
  ✓ $f_2$ = the frequency of the class succeeding the modal class,
  ✓ w = the class width.

Note: (in case of grouped data)

✓ The median class is the first class whose LCF is greater than or equal to $\frac{n}{2}$.
✓ The modal class is the class with the largest frequency.
✓ If we know the value of the median or mode, we can identify the median class or modal class respectively, because
the median or mode lies in that class. Most of the time this is used for finding a missing frequency.
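
The grouped-data formulas for the median and the mode can be sketched in Python using the rejects f.d from Chapter Two. This sketch takes the median class to be the first class whose LCF reaches n/2 and the modal class to be the class with the largest frequency; a missing neighbouring class is treated as having frequency 0, which is an assumption of the sketch. Names such as `cf_before` are illustrative.

```python
from itertools import accumulate

# Grouped f.d of the rejects example: lower class boundaries and frequencies.
lower_boundaries = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5]
frequency = [3, 4, 6, 13, 17, 6, 1]
w = 5                                   # uniform class width
n = sum(frequency)
lcf = list(accumulate(frequency))

# Median class: first class whose "less than" cumulative frequency >= n/2.
m = next(i for i, c in enumerate(lcf) if c >= n / 2)
cf_before = lcf[m - 1] if m > 0 else 0  # LCF of the class preceding it
median = lower_boundaries[m] + (n / 2 - cf_before) / frequency[m] * w

# Modal class: class with the largest frequency.
j = frequency.index(max(frequency))
f1 = frequency[j]
f0 = frequency[j - 1] if j > 0 else 0
f2 = frequency[j + 1] if j < len(frequency) - 1 else 0
mode = lower_boundaries[j] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * w

print(f"median class index {m}, median = {median:.2f}")  # median = 22.12
print(f"modal class index {j}, mode = {mode:.2f}")       # mode = 23.83
```

For this data the median class is 18-22 (boundaries 17.5-22.5) and the modal class is 23-27 (boundaries 22.5-27.5).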

Note:

The value of central tendency, however, does not completely describe the data. Therefore, some additional
characteristics of the data must be used to provide for a more complete summary and description of the data and
to distinguish between dissimilar data sets. The next section deals with this additional characteristic, the
variability of the data.
Example: Consider the following two sets of data.
i. 6, 18, 30 and
ii. 17, 18, 19
$$\bar{x}_{i} = \frac{6 + 18 + 30}{3} = \frac{54}{3} = 18 \quad \text{and} \quad \bar{x}_{ii} = \frac{17 + 18 + 19}{3} = \frac{54}{3} = 18$$
Observation: Even though the two sets of data have the same arithmetic mean, the values in i are more scattered
or dispersed than those in ii.

Measure of Variation (dispersion):

When comparing sets of data, it is useful to have a way of measuring the scatter or spread of the data.

✓ Variation or dispersion is the degree to which numerical data is scattered or spread about some measure
of central tendency (usually the mean).

Variance (V): Variance indicates a relationship between the mean of a distribution and the data points; it is
determined by averaging the squared deviations from the mean. Squaring the differences instead of taking the absolute
values allows for greater flexibility in further algebraic manipulation of the data. Another measure of
variation is the standard deviation, which is the square root of the variance.

The following table indicates the formula for variance & standard deviation

Variance for individual or raw data:
  For population data: $\sigma^2 = E(x - \mu)^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  For sample data: $S^2 = M(x - \bar{x})^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Variance for a frequency distribution, ungrouped data:
  For population data: $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{N}$
  For sample data: $S^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n - 1}$
Variance for a frequency distribution, grouped data:
  For population data: $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \mu)^2}{N}$
  For sample data: $S^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{n - 1}$
Standard deviation (in all three cases):
  For population data: $\sigma = \sqrt{\sigma^2}$
  For sample data: $S = \sqrt{S^2}$
Here k is the number of distinct values or classes in the frequency distribution.
Where:

✓ xmax is the maximum observation

✓ xmin is the minimum observation
✓ mi the mid pt of the class
✓ A is
• either population mean or median in case of population data or
• either sample mean or sample median in case of sample data

Note:
✓ The denominator in the sample variance formula is n − 1. This is because the sample variance underestimates the
population variance when the denominator in the sample formula is n.
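
As a closing illustration, the variance and standard deviation of the two small data sets used earlier (6, 18, 30 and 17, 18, 19) can be computed with the raw-data formulas. This is a sketch only; the function name `variance` and its `sample` flag are assumptions, with `sample=True` dividing by n − 1 and `sample=False` dividing by n.

```python
import math

def variance(data, sample=True):
    """Variance of raw data: divide by n - 1 for a sample, by n for a population."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1) if sample else squared_deviations / n

set_i = [6, 18, 30]
set_ii = [17, 18, 19]

for name, data in (("i", set_i), ("ii", set_ii)):
    s2 = variance(data)            # sample variance
    s = math.sqrt(s2)              # sample standard deviation
    print(f"set {name}: mean = {sum(data) / len(data)}, S^2 = {s2}, S = {s}")
```

Both sets have mean 18, but set i has sample variance 144 (standard deviation 12) while set ii has sample variance 1 (standard deviation 1), which puts a number on the difference in scatter noted in the example. Dividing by n instead gives 96 for set i, smaller than the n − 1 result, in line with the note above.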

Applied Statistics
6 pages
Chapter 2 Stat (MMW)
No ratings yet
Chapter 2 Stat (MMW)
13 pages
Lecture 2-Introduction To Satistics
No ratings yet
Lecture 2-Introduction To Satistics
43 pages
Eco2061 Week 2
No ratings yet
Eco2061 Week 2
68 pages
Qmt181 - Chapter 1
No ratings yet
Qmt181 - Chapter 1
34 pages
Stat 1&2
No ratings yet
Stat 1&2
24 pages
1 Introduction To Statistics
No ratings yet
1 Introduction To Statistics
89 pages
Business Statistics I Essentials
From Everand
Business Statistics I Essentials
Louise Clark
5/5 (5)
Class-10 Ch-13 & 14 Statistics & Probability-1
No ratings yet
Class-10 Ch-13 & 14 Statistics & Probability-1
7 pages
Six Sigma - Dumps.icgb.v2015!08!06.by - Exampass.144q
No ratings yet
Six Sigma - Dumps.icgb.v2015!08!06.by - Exampass.144q
61 pages
Airtel MS1 MS3
No ratings yet
Airtel MS1 MS3
1,806 pages
Mark Scheme (Results) January 2009: GCE Mathematics (6683/01)
No ratings yet
Mark Scheme (Results) January 2009: GCE Mathematics (6683/01)
7 pages
Research 8 Grade 8 Melc 1 q4 Week1
No ratings yet
Research 8 Grade 8 Melc 1 q4 Week1
26 pages
Statistics Case Study
No ratings yet
Statistics Case Study
22 pages
Written Report - Martin Junior - Sta404g1
No ratings yet
Written Report - Martin Junior - Sta404g1
58 pages
IITM B.Sc. Qualifier Exam Revision
No ratings yet
IITM B.Sc. Qualifier Exam Revision
3 pages
Elementary Practical Statistics
67% (3)
Elementary Practical Statistics
440 pages
Anderson Et Al. 2020_Chap 3 Descriptive Statistics
No ratings yet
Anderson Et Al. 2020_Chap 3 Descriptive Statistics
70 pages
MATHS 3 The Big Debate
No ratings yet
MATHS 3 The Big Debate
48 pages
Excel Function
100% (1)
Excel Function
27 pages
Data Neraca Perdagangan - Edit
No ratings yet
Data Neraca Perdagangan - Edit
8 pages
Quantitative Reasoning - LLB-II (1)
No ratings yet
Quantitative Reasoning - LLB-II (1)
48 pages
Comparing Dot Plots HW KEY
No ratings yet
Comparing Dot Plots HW KEY
1 page
Final Presentation
No ratings yet
Final Presentation
274 pages
Statistics For Business I
No ratings yet
Statistics For Business I
63 pages
PSYD 719 Sample Review Questions For Quiz 1 Part A: Multiple Choice
No ratings yet
PSYD 719 Sample Review Questions For Quiz 1 Part A: Multiple Choice
12 pages
Rosdiana 3
No ratings yet
Rosdiana 3
11 pages
Water 12 01906 v2
No ratings yet
Water 12 01906 v2
19 pages
Family Succession and Firm Performance: Evidence From Italian Family Firms
No ratings yet
Family Succession and Firm Performance: Evidence From Italian Family Firms
15 pages
Guide To Freightos Baltic Global Container Index™ (FBX) : - December 2019
No ratings yet
Guide To Freightos Baltic Global Container Index™ (FBX) : - December 2019
20 pages
Maths Integration
No ratings yet
Maths Integration
7 pages
(B) Mode 3 Median - 2 Mean
No ratings yet
(B) Mode 3 Median - 2 Mean
17 pages
Test - B AP Statistics
67% (6)
Test - B AP Statistics
24 pages
11 Statistics
No ratings yet
11 Statistics
14 pages
Statests
No ratings yet
Statests
20 pages
Kruskal-Wallis H Test Using SPSS Statistics: One-Way ANOVA Mann-Whitney U Test
100% (1)
Kruskal-Wallis H Test Using SPSS Statistics: One-Way ANOVA Mann-Whitney U Test
12 pages
Intro Statistics Dtu
No ratings yet
Intro Statistics Dtu
426 pages

1 Descriptive Part

Uploaded by

1 Descriptive Part

Uploaded by

Chapter one

Definition and classification of statistics

THE NATURE OF THIS DISCIPLINE

Descriptive Statistics, Probability, Inferential Statistics

Applications, uses and limitations of statistics

➢ Depending on the level (scale) of measurement

➢ Depending on time reference

Basic Steps in Statistical Study

Organizing a Raw Data Set

A. Categorical (qualitative) f.d:

Construct a frequency distribution for this data.

B. Numerical (quantitative) f.d:

Note: consider the f.f table

Pictorial representation of the data

Simple bar chart: used to represent only one variable.

Figure: Component bar diagram

Figure: Multiple bar chart

Pie chart/pie diagram/circle graph

[Pie chart: sectors include Agriculture and rural, Irrigation etc., and Industry and minerals; legible shares: 13%, 15%, 27%]

HISTOGRAMS AND FREQUENCY POLYGONS

The cumulative frequency curve (ogive)

Types of measures of central tendency

The following table indicates the formula for mean

✓ xi = observation of the class

✓ M(x) = x̄ = the notation for sample data

Alternative to the Arithmetic Mean-Median

Note: (in case of grouped data)

Measure of Variation (dispersion):

For individual or raw data For frequency distribution

✓ xmax is the maximum observation
