
Statistics Combine

1) The data ranges from 1.235 to 14.093. Using 7 classes with an interval of 2 would result in the following distribution: Class Limits: 1.236-3.235, 3.236-5.235, 5.236-7.235, 7.236-9.235, 9.236-11.235, 11.236-13.235, 13.236-15.235 2) Another option is to use 6 classes with an interval of 3: Class Limits: 1-3, 4-6, 7-9, 10-12, 13-15

Uploaded by

Mustansar saeed

Ungrouped Data:

Data in raw form, from which no useful information can readily be extracted.
For Example:
The data of daily study hours spent by 20 students of a class are given below;
2 4 5 2 6 4 3 7 2 3
1 5 3 4 5 2 6 4 5 1

Grouped Data:
Data which are categorized in different groups or classes in a table.
For Example:
Groups or
Classes
1
2
3
4
5
6
7
OR;
Groups or
Classes
1---3
4---6
7---9
Frequency Distribution:
The organization of a set of data in a table showing the distribution of the data into
classes together with the number of observations in each class is called a frequency
distribution.
Ungrouped Frequency Distribution:
Where each data class does not contain a range.
Ungrouped frequency distribution can be used for both “Qualitative” and
“Quantitative” data sets.
• Ungrouped Frequency Distribution for Qualitative Data:
For Example:
The data of 20 students living at four different types of residences i.e. Own, Rent,
Hostel, Parents are given below;
Hostel Own Hostel Rent Parents Rent Parents Own Parents Own
Own Rent Parents Own Parents Hostel Hostel Rent Own Parents

Type of Residence Frequency


Own 6
Rent 4
Hostel 4
Parents 6
Total = 20
• Ungrouped Frequency Distribution for Quantitative Data:
For Example:
The data of daily study hours spent by 20 students of a class are given below;
2 4 5 2 6 4 3 7 2 3
1 5 3 4 5 2 6 4 5 1

Study Hours Frequency


1 2
2 4
3 3
4 4
5 4
6 2
7 1
Total = 20
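The tallying behind the table above can be sketched in a few lines of Python. This is only an illustration; the variable names `study_hours` and `frequency` are chosen here and do not come from the notes.

```python
from collections import Counter

# Daily study hours of the 20 students listed above.
study_hours = [2, 4, 5, 2, 6, 4, 3, 7, 2, 3,
               1, 5, 3, 4, 5, 2, 6, 4, 5, 1]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(study_hours)

print("Study Hours  Frequency")
for value in sorted(frequency):
    print(f"{value:>11}  {frequency[value]:>9}")
print(f"Total = {sum(frequency.values())}")
```

The same approach works for the qualitative residence data, because `Counter` accepts strings as well as numbers.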
NOTE:
Quantitative data are of two further types, i.e. Discrete and Continuous.
• The ungrouped frequency distribution is used for discrete data having small
range.
• The grouped frequency distribution is used for both discrete and continuous
data having any range.
Grouped Frequency Distribution:
Where each data class contains a specific range.
For Example:
The data of quiz-scores obtained by 30 students are given below;
10 8 7 1 6 8 4 6 9 10
1 2 10 5 3 7 9 4 9 1
2 1 7 2 5 8 3 3 5 2

Classes Frequency
0---2 8
3---5 8
6---8 8
9---11 6
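As a rough sketch, the grouped table above can be reproduced by counting how many scores fall inside each inclusive class. The helper name `grouped_frequency` and its parameters are hypothetical, and the code assumes integer data with equal class widths.

```python
# Quiz scores of the 30 students listed above.
quiz_scores = [10, 8, 7, 1, 6, 8, 4, 6, 9, 10,
               1, 2, 10, 5, 3, 7, 9, 4, 9, 1,
               2, 1, 7, 2, 5, 8, 3, 3, 5, 2]

def grouped_frequency(data, first_lower, width, n_classes):
    """Count the observations in each inclusive class [lower, upper]."""
    table = []
    for i in range(n_classes):
        lower = first_lower + i * width
        upper = lower + width - 1  # inclusive upper limit (integer data assumed)
        count = sum(1 for x in data if lower <= x <= upper)
        table.append((lower, upper, count))
    return table

quiz_table = grouped_frequency(quiz_scores, first_lower=0, width=3, n_classes=4)
for lower, upper, count in quiz_table:
    print(f"{lower}---{upper}  {count}")
```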

Class Limits:
The class limits are defined as the numbers or the values of the variables which
describe the classes; the smaller number is the lower class limit and the larger
number is the upper class limit. The class limits should be INCLUSIVE.
Class limits
10---14
15---19
20---24
Class Boundaries:
Class boundary is located midway between the upper limit of a class and the lower
limit of the next class. The upper class boundary of a class coincides with the lower
boundary of the next class. The class boundaries should be EXCLUSIVE.
Class
Boundaries
10---15
15---20
20---25

Class Mark/Midpoint:
It can be obtained by dividing either the sum of the lower and upper limits of a class,
or the sum of the lower and upper boundaries of a class by 2.
Class limits Mid Points Class Boundaries Mid Points
10---14 12 9.5---14.5 12
15---19 17 14.5---19.5 17
20---24 22 19.5---24.5 22

Class width or Interval:


It can be obtained by finding the difference between two successive lower class
boundaries. It may also be obtained by finding the difference either between two
successive lower class limits, or between two successive class marks. An equal class
interval is usually denoted by “h” or “c”.
Class limits Class Boundaries Mid Points Class Interval
10---14 9.5---14.5 12 5
15---19 14.5---19.5 17 5
20---24 19.5---24.5 22 5
Constructing a Grouped Frequency Distribution:
Following points should be kept in mind while constructing a grouped frequency
distribution;
Number of Classes:
• No hard and fast rule
• Statistical experience suggests that the number of classes should be between 5
and 20
• Sturges’ Rule: k = 1 + 3.3 log (N)
Where “k” is number of classes and “N” is number of observations. For
example, if there are 100 observations, then by applying Sturges’ rule, we
should have;
k = 1 + 3.3 log (100)
k = 7.6 i.e. 8 classes
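Sturges' rule is easy to check numerically. The sketch below assumes a base-10 logarithm (which is what gives 7.6 for N = 100) and assumes the result is rounded up to the next whole class; the function name `sturges_classes` is made up for this illustration.

```python
import math

def sturges_classes(n_observations):
    """Return (raw k, k rounded up) using k = 1 + 3.3 * log10(N)."""
    k = 1 + 3.3 * math.log10(n_observations)
    return k, math.ceil(k)

k_raw, k_classes = sturges_classes(100)
print(round(k_raw, 1), k_classes)
```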
Range of Variation in Data:
It is the difference between the largest and the smallest values in the data.
Deciding on the Equal Class Interval:
Divide the range of variation by the number of classes to determine the appropriate
size of the equal class interval.
Creation of Classes:
The lower limit of the class could be equal to or less than the smallest value in the
data. The upper limit should be decided according to appropriate class interval.
Distribute the Data into Appropriate Classes:
This is best done by “Tally Column”. The number of tallies should be written in the
frequency column. The tally column is usually omitted in the final presentation of
the frequency distribution.
Finally, total the frequency column to see that all the data have been entered.
How many classes would you create?
if;
Smallest value = 16
Largest value = 242
if;
Smallest value = 33
Largest value = 56
Example 1:
Make a grouped frequency distribution from the following data;

Solution:
By scanning the data;
Largest value is 204.
Smallest value is 68.
So,
The range is 204 – 68 = 136.
Suppose we decide to take 7 classes of equal size. Then the equal class interval
would be 136/7 = 19.43, but we take 20 for convenience.
Class Limits Class Boundaries Entries or Tallies Frequency
65---84 64.5---84.5
85---104 84.5---104.5
105---124
125---144
145---164
165---184
185---204

NOTE:
• If data is in decimal form (e.g. up to one decimal place) then you
will create class limits with one decimal place.
• If data is in decimal form (e.g. up to two decimal places) then you
will create class limits with two decimal places.
• If data is in decimal form (e.g. up to three decimal places) then you
will create class limits with three decimal places.
Example 2:
Make a grouped frequency distribution from the following data;
6.4 3.7 13.4 10.1 12.6 14.1 1.3 9.9 8.8 2.1
7.7 13.0 7.7 9.1 8.3 11.2 2.4 9.2 8.4 6.0
13.0 8.8 3.8 15.2 7.0 4.5 1.8 4.5 8.1 9.9

Class Limits Class Boundaries Frequency


1.3---3.2 1.25---3.25
3.3---5.2 3.25---5.25
5.3---7.2
7.3---9.2
9.3---11.2
11.3---13.2
13.3---15.2
Example 3:
Make a grouped frequency distribution from the following data;
1.19 8.54 8.92 7.19 1.64 11.67 6.82 13.63 10.63 16.95
3.17 4.66 9.99 11.62 1.49 5.83 6.05 12.88 7.42 16.12
12.20 17.21 2.78 5.87 12.68 12.17 5.22 15.91 11.10 4.30

Class Limits Class Boundaries Frequency


1.19---4.18 1.185---4.185
4.19---7.18 4.185---7.185
7.19---10.18
10.19---13.18
13.19---16.18
16.19---19.18

Example 4:
Make a grouped frequency distribution from the following data;
3.456 13.347 5.720 8.241 2.476 13.111 6.852 6.450 12.402 11.670
11.077 8.069 3.817 13.341 12.357 10.913 7.836 3.763 7.845 3.879
7.472 9.803 14.093 1.235 4.465 1.323 4.612 6.218 8.765 6.279

Class Limits Class Boundaries Frequency


1.235---3.234
3.235---5.234
Class Task:
Make the grouped frequency distribution from the following questions;
Question 1:
115 111 141 42 31 32 156 120 71 23
91 42 115 125 132 23 77 60 86 154

Question 2:
1.9 4.2 7.4 7.6 12.3 15.3 8.1 4.4 11.9 14.1
24.6 13.1 16.8 1.3 23.5 21.2 22.7 21.6 17.8 6.7

Question 3:
14.48 9.99 6.46 3.34 14.10 15.43 6.58 1.93 19.22 10.95
18.97 11.89 2.39 7.64 8.12 6.96 6.04 6.24 7.06 1.78

Question 4:
9.364 11.726 1.116 8.461 15.113 1.515 3.976 9.163 13.627 9.856
14.457 2.011 8.157 2.192 3.514 9.787 1.553 9.559 16.365 8.805
Cumulative Frequency Distribution:
A frequency distribution involving a column of cumulative frequency.
“Less Than” Type Cumulative Frequency Distribution:
Class Limits   Frequency   Class Boundaries   Upper Class Boundaries   Cumulative Frequency
Less than 64.5 0
65---84 9 64.5---84.5 Less than 84.5 9
85---104 10 84.5---104.5 Less than 104.5 19
105---124 17 104.5---124.5 Less than 124.5 36
125---144 10 124.5---144.5 Less than 144.5 46
145---164 5 144.5---164.5 Less than 164.5 51
165---184 4 164.5---184.5 Less than 184.5 55
185---204 5 184.5---204.5 Less than 204.5 60
Total = 60

“More Than” Type Cumulative Frequency Distribution:


Class Limits   Frequency   Class Boundaries   Lower Class Boundaries   Cumulative Frequency
65---84 9 64.5---84.5 More than 64.5 60
85---104 10 84.5---104.5 More than 84.5 51
105---124 17 104.5---124.5 More than 104.5 41
125---144 10 124.5---144.5 More than 124.5 24
145---164 5 144.5---164.5 More than 144.5 14
165---184 4 164.5---184.5 More than 164.5 9
185---204 5 184.5---204.5 More than 184.5 5
Total = 60 More than 204.5 0
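Both cumulative columns above follow mechanically from the frequency column. A minimal sketch, with list names chosen only for this example:

```python
from itertools import accumulate

# Frequencies of the seven classes 65---84 ... 185---204 in the tables above.
frequencies = [9, 10, 17, 10, 5, 4, 5]
total = sum(frequencies)

# "Less than" type: running total up to each upper class boundary.
less_than_cf = list(accumulate(frequencies))

# "More than" type: observations at or above each lower class boundary,
# i.e. the total minus everything in the earlier classes.
more_than_cf = [total - cf + f for cf, f in zip(less_than_cf, frequencies)]

print(less_than_cf)
print(more_than_cf)
```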
Frequency Curve:
A smooth curve of a “frequency polygon”.
Symmetrical or Normal Frequency Distribution:
When both halves of a frequency curve are identical.

Asymmetrical Frequency Distribution:


When both halves of a frequency curve are not identical.
• Positively Skewed Frequency Distribution
• Negatively Skewed Frequency Distribution
• Extremely Negatively Skewed or J-shaped Frequency Distribution:
• Extremely Positively Skewed or Reverse J-shaped Frequency Distribution:
• The U-shaped Frequency Distribution:
Measures of Central Tendency:
When two data sets are to be compared, we summarize each data set into a single value. Such a
value, usually somewhere in the center of a data set, is a value at which the data tend to concentrate.
Central Tendency:
The tendency of the observations to cluster in the central part of the data set is called central
tendency.
Measure of Central Tendency:
A measure of central tendency indicates the general position of the data in the range of
observations. Measures of central tendency are generally known as “Averages”.
Types of Averages:
1. Arithmetic Mean
2. Geometric Mean
3. Harmonic Mean
4. Median
5. Mode
ARITHMETIC MEAN
Population
Arithmetic mean = µ
Number of observations = N
Sample
Arithmetic mean = x̅
Number of observations = n
Note: We normally find the arithmetic mean of a sample and generalize it over the population,
because most of the time collecting data for population is not possible.
Arithmetic Mean of Ungrouped Data:
When all the observations are of equal importance then we find the arithmetic mean using the
following formula;
$\bar{x} = \dfrac{\sum x}{n}$

For Example:
Find the arithmetic mean of the following data;
23 12 43 25 32 37

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{172}{6} = 28.67$

Weighted Arithmetic Mean of Ungrouped Data:


When all the observations are not of equal importance then we find the weighted arithmetic mean
using the following formula;
$\bar{x}_w = \dfrac{\sum wx}{\sum w}$

For Example:
Items Expense (x) Weights (w) wx
Food 24 7 168
Clothing 45 2.5 112.5
Rent 34 4 136
Education 28 3 84
∑ = 16.5 ∑ = 500.5
$\bar{x}_w = \dfrac{\sum wx}{\sum w} = \dfrac{500.5}{16.5} = 30.33$
Arithmetic Mean of Grouped Data with “Ungrouped Frequency Distribution”:
$\bar{x} = \dfrac{\sum fx}{\sum f}$

Classes (x) Frequency (f) fx


1 4 4
2 6 12
3 9 27
4 7 28
5 5 25
∑ = 31 ∑ = 96

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{96}{31} = 3.10$

Arithmetic Mean of Grouped Data with “Grouped Frequency Distribution”:


Note: The following method is used for both equal and unequal class intervals;
$\bar{x} = \dfrac{\sum fx}{\sum f}$

Class Limits Frequency (f) Mid Points (x) fx


10---14 4 12 48
15---19 6 17 102
20---24 9 22 198
25---29 7 27 189
30---34 5 32 160
∑ = 31 ∑ = 697

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{697}{31} = 22.48$

Note: Arithmetic mean of grouped data with “grouped frequency distribution” can also be
calculated by a short method, which is not compulsory to remember.
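The arithmetic-mean formulas above differ only in what plays the role of the weight: nothing, the weights w, or the frequencies f. The sketch below reuses one hypothetical helper, `weighted_mean`, for both the weighted and the grouped cases (with mid points as x).

```python
def arithmetic_mean(values):
    """Plain mean: sum of x divided by n."""
    return sum(values) / len(values)

def weighted_mean(values, weights):
    """Sum of w*x divided by sum of w; frequencies can serve as weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

simple = arithmetic_mean([23, 12, 43, 25, 32, 37])
weighted = weighted_mean([24, 45, 34, 28], [7, 2.5, 4, 3])      # expenses, weights
grouped = weighted_mean([12, 17, 22, 27, 32], [4, 6, 9, 7, 5])  # mid points, frequencies

print(round(simple, 2), round(weighted, 2), round(grouped, 2))
```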
GEOMETRIC MEAN (G)
Note: Geometric mean can only be calculated when each value in a data set is greater than 0.
Geometric Mean of Ungrouped Data:
When all the observations are of equal importance then we find the geometric mean using the
following formula;
$G = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = (\text{Product of } x)^{1/n}$
But when the number of observations (n) is large then we can use the following formula;
$G = \operatorname{Antilog}\left(\dfrac{\sum \log x}{n}\right)$

For Example:
Find the geometric mean of the following data;
23 12 43 25 32 37

$G = (\text{Product of } x)^{1/n}$
$G = (23 \times 12 \times 43 \times 25 \times 32 \times 37)^{1/6}$
$G = 26.56$
OR;
x Log(x)
23 1.361728
12 1.079181
43 1.633468
25 1.39794
32 1.50515
37 1.568202
∑ = 8.545669
$G = \operatorname{Antilog}\left(\dfrac{\sum \log x}{n}\right)$
$G = \operatorname{Antilog}\left(\dfrac{8.545669}{6}\right)$
$G = \operatorname{Antilog}(1.42428)$
$G = 26.56$
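Both routes to the geometric mean can be compared in code. This is a sketch only: the function names are invented here, and `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean_product(values):
    """n-th root of the product of the values (all values must be > 0)."""
    return math.prod(values) ** (1 / len(values))

def geometric_mean_logs(values):
    """Antilog of the mean of the base-10 logarithms."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log

data = [23, 12, 43, 25, 32, 37]
g_product = geometric_mean_product(data)
g_logs = geometric_mean_logs(data)
print(round(g_product, 2), round(g_logs, 2))
```

The log form avoids very large intermediate products, which is why it is preferred when n is large.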
Weighted Geometric Mean of Ungrouped Data:
When all the observations are not of equal importance then we find the weighted geometric mean
using the following formula;
$G_w = \operatorname{Antilog}\left(\dfrac{\sum w \log x}{\sum w}\right)$

For Example:
Items Expense (x) Weights (w) Log(x) w.Log(x)
Food 24 7 1.380211 9.661479
Clothing 45 2.5 1.653213 4.133031
Rent 34 4 1.531479 6.125916
Education 28 3 1.447158 4.341474
∑ = 16.5 ∑ = 24.2619

$G_w = \operatorname{Antilog}\left(\dfrac{\sum w \log x}{\sum w}\right)$
$G_w = \operatorname{Antilog}\left(\dfrac{24.2619}{16.5}\right)$
$G_w = \operatorname{Antilog}(1.4704)$
$G_w = 29.54$
Geometric Mean of Grouped Data with “Ungrouped Frequency Distribution”:
$G = \operatorname{Antilog}\left(\dfrac{\sum f \log x}{\sum f}\right)$

Classes (x) Frequency (f) Log(x) f.Log(x)


2 4 0.30103 1.20412
3 6 0.477121 2.862728
4 9 0.60206 5.41854
5 7 0.69897 4.89279
6 5 0.778151 3.890756
∑ = 31 ∑ = 18.26893

$G = \operatorname{Antilog}\left(\dfrac{\sum f \log x}{\sum f}\right)$
$G = \operatorname{Antilog}\left(\dfrac{18.26893}{31}\right)$
$G = \operatorname{Antilog}(0.5893)$
$G = 3.88$
Geometric Mean of Grouped Data with “Grouped Frequency Distribution”:
Note: The following method is used for both equal and unequal class intervals;
$G = \operatorname{Antilog}\left(\dfrac{\sum f \log x}{\sum f}\right)$

Class Limits Frequency (f) Mid Points (x) Log(x) f.Log(x)


10---14 4 12 1.0792 4.31672
15---19 6 17 1.2304 7.38269
20---24 9 22 1.3424 12.0818
25---29 7 27 1.4314 10.0195
30---34 5 32 1.5051 7.52575
∑ = 31 ∑ = 41.3265

$G = \operatorname{Antilog}\left(\dfrac{\sum f \log x}{\sum f}\right)$
$G = \operatorname{Antilog}\left(\dfrac{41.3265}{31}\right)$
$G = \operatorname{Antilog}(1.3331)$
$G = 21.53$
HARMONIC MEAN (H)
Note: Harmonic mean can only be calculated when no value in the data set is 0.
Harmonic Mean of Ungrouped Data:
When all the observations are of equal importance then we find the harmonic mean using the
following formula;
$H = \dfrac{n}{\sum \frac{1}{x}}$

For Example:
Find the harmonic mean of the following data;
23 12 43 25 32 37

x 1/x
23 0.043478
12 0.083333
43 0.023256
25 0.04
32 0.03125
37 0.027027
∑ = 0.248344
$H = \dfrac{n}{\sum \frac{1}{x}}$
$H = \dfrac{6}{0.248344}$
$H = 24.16$
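The reciprocal column and the final division can be sketched as follows. The function name `harmonic_mean` is chosen for this example, and the explicit zero check reflects the note above that no value may be 0.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals; undefined if any value is 0."""
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when a value is 0")
    return len(values) / sum(1 / x for x in values)

h = harmonic_mean([23, 12, 43, 25, 32, 37])
print(round(h, 2))
```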
Weighted Harmonic Mean of Ungrouped Data:
When all the observations are not of equal importance then we find the weighted harmonic mean
using the following formula;
$H_w = \dfrac{\sum w}{\sum w\left(\frac{1}{x}\right)}$

For Example:
Items Expense (x) Weights (w) 1/x w(1/x)
Food 24 7 0.04167 0.29167
Clothing 45 2.5 0.02222 0.05556
Rent 34 4 0.02941 0.11765
Education 28 3 0.03571 0.10714
∑ = 16.5 ∑ = 0.57201
$H_w = \dfrac{\sum w}{\sum w\left(\frac{1}{x}\right)}$
$H_w = \dfrac{16.5}{0.57201}$
$H_w = 28.85$
Harmonic Mean of Grouped Data with “Ungrouped Frequency Distribution”:
$H = \dfrac{\sum f}{\sum f\left(\frac{1}{x}\right)}$

Classes (x) Frequency (f) 1/x f(1/x)


2 4 0.5 2
3 6 0.33333 2
4 9 0.25 2.25
5 7 0.2 1.4
6 5 0.16667 0.83333
∑ = 31 ∑ = 8.48333

$H = \dfrac{\sum f}{\sum f\left(\frac{1}{x}\right)}$
$H = \dfrac{31}{8.48333}$
$H = 3.65$
Harmonic Mean of Grouped Data with “Grouped Frequency Distribution”:
Note: The following method is used for both equal and unequal class intervals;
$H = \dfrac{\sum f}{\sum f\left(\frac{1}{x}\right)}$

Class Limits Frequency (f) Mid Points (x) 1/x f(1/x)


10---14 4 12 0.08333 0.33333
15---19 6 17 0.05882 0.35294
20---24 9 22 0.04545 0.40909
25---29 7 27 0.03704 0.25926
30---34 5 32 0.03125 0.15625
∑ = 31 ∑ = 1.51087

$H = \dfrac{\sum f}{\sum f\left(\frac{1}{x}\right)}$
$H = \dfrac{31}{1.51087}$
$H = 20.52$
Limitations of Mean:
• If a data set contains all positive values, then arithmetic, geometric and harmonic mean
can be found.
• If a data set contains positive and negative values, then arithmetic and harmonic mean
can be found.
• If a data set contains positive and negative values including 0, then only arithmetic mean
can be found.
Usage:
Harmonic and geometric mean are used when data is in the form of ratios or rates (e.g. percentage
rate, growth rate, rate of change).
Class Task:
Find the arithmetic, geometric and harmonic mean of each of the following questions;
Question 1: (Hint: Ungrouped data)

115 111 141 42 31 32 156 120 71 23


91 42 115 125 132 23 77 60 86 154

Question 2: (Hint: Grouped data with “ungrouped frequency distribution”)


Classes Frequency
25 3
26 6
27 2
28 7
29 5

Question 3: (Hint: Grouped data with “grouped frequency distribution”)


Class Limits Frequency
5---9 4
10---14 6
15---19 3
20---24 8
25---29 5
MEDIAN
The median is a value which divides an ORDERED DATA set into two equal parts.
Or the median is a value at or below which 50% of the ORDERED DATA lie.
Median of Ungrouped Data:
If “n” is odd;
Median = $\left(\dfrac{n+1}{2}\right)$th observation

For Example:
13 17 5 3 20 14 9 6 4
By ordering the data, we get;
3 4 5 6 9 13 14 17 20
Median = $\left(\dfrac{n+1}{2}\right)$th observation
Median = $\left(\dfrac{9+1}{2}\right)$th observation
Median = 5th observation
Median = 9
If “n” is even;
Median is the arithmetic mean of the $\left(\dfrac{n}{2}\right)$th observation and the $\left(\dfrac{n}{2}+1\right)$th observation.

For Example:
13 17 5 3 20 14 9 6 4 2
By ordering the data, we get;
2 3 4 5 6 9 13 14 17 20
$\left(\dfrac{n}{2}\right)$th observation = $\dfrac{10}{2}$ = 5th observation = 6
$\left(\dfrac{n}{2}+1\right)$th observation = $\left(\dfrac{10}{2}+1\right)$ = 6th observation = 9
Median = Arithmetic mean of 5th & 6th obs. = $\dfrac{6+9}{2}$ = 7.5
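The odd and even cases above can be sketched in one small function. The name `median_ungrouped` is hypothetical; note that Python lists are 0-based, so the 5th observation sits at index 4.

```python
def median_ungrouped(values):
    """Median of raw data: order the data, then pick or average the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the ((n+1)/2)th observation, counting from 1
    # Mean of the (n/2)th and (n/2 + 1)th observations, counting from 1.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_ungrouped([13, 17, 5, 3, 20, 14, 9, 6, 4])
even_case = median_ungrouped([13, 17, 5, 3, 20, 14, 9, 6, 4, 2])
print(odd_case, even_case)
```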
Median of Grouped Data with “Ungrouped Frequency Distribution”:
If “n” is odd;
Classes Frequency Less than type CF
0 3 3
1 4 7
2 6 13
3 7 20
4 10 30
5 6 36
6 5 41
7 5 46
8 2 48
9 1 49
Total = n = 49
Median = $\left(\dfrac{n+1}{2}\right)$th observation
Median = $\left(\dfrac{49+1}{2}\right)$th observation
Median = 25th observation
Median = 4
If “n” is even;
Classes Frequency Less than type CF
0 3 3
1 4 7
2 6 13
3 7 20
4 10 30
5 6 36
6 5 41
7 5 46
8 3 49
9 1 50
Total = n = 50
$\left(\dfrac{n}{2}\right)$th observation = $\dfrac{50}{2}$ = 25th observation = 4
$\left(\dfrac{n}{2}+1\right)$th observation = $\left(\dfrac{50}{2}+1\right)$ = 26th observation = 4
Median = Arithmetic mean of 25th & 26th obs. = $\dfrac{4+4}{2}$ = 4
Median of Grouped Data with “Grouped Frequency Distribution”:
Note: The following method is called “Linear Interpolation” and used for both equal & unequal
class intervals and for both odd & even “n”;
$\text{Median} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right)$

Where;
l = Lower class boundary of median group
h = Class interval of median group
f = Frequency of median group
n = Total number of observations in data
C = Cumulative frequency of class previous from median group
For Example:
Class Limits Frequency Class Boundaries Less than type CF
30---39 8 29.5---39.5 8
40---49 87 39.5---49.5 95
50---59 190 49.5---59.5 285
60---69 304 59.5---69.5 589
70---79 211 69.5---79.5 800
80---89 85 79.5---89.5 885
90---99 20 89.5---99.5 905
Total = n = 905
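• $\dfrac{n}{2} = \dfrac{905}{2} = 452.5$, which is first reached in the class 60---69 (cumulative frequency 589), so this is the median group
• l = 59.5
• h = 10
• f = 304
• C = 285
$\text{Median} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right)$
$\text{Median} = 59.5 + \dfrac{10}{304}\left(\dfrac{905}{2} - 285\right)$
$\text{Median} = 59.5 + \dfrac{10}{304}(167.5)$
$\text{Median} = 59.5 + 5.51$
$\text{Median} = 65.01 \approx 65$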
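The interpolation above can be sketched as a short function that first locates the median group from the running cumulative frequency and then applies the formula. The name `grouped_median` and the representation of classes as (lower boundary, upper boundary) pairs are assumptions of this example.

```python
def grouped_median(boundaries, frequencies):
    """Linear interpolation: l + (h / f) * (n/2 - C)."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= n / 2:
            return lower + (upper - lower) / f * (n / 2 - cumulative)
        cumulative += f
    raise ValueError("median group not found")

class_boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
                    (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
class_frequencies = [8, 87, 190, 304, 211, 85, 20]

median_value = grouped_median(class_boundaries, class_frequencies)
print(round(median_value, 2))
```

Because the class width is taken from each pair of boundaries, the same sketch also covers unequal class intervals.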
Quantiles:
Following are the quantiles which are widely used;
• Quartiles
• Deciles
• Percentiles
Quartiles:
Three values which divide a data set into 4 equal parts; they are denoted by Q1, Q2, Q3.
Deciles:
Nine values which divide a data set into 10 equal parts; they are denoted by D1, D2, … D9.
Percentiles:
Ninety-nine values which divide a data set into 100 equal parts; they are denoted by P1, P2, … P99.
Note: Median is a value which divides a data set into 2 equal parts.
So,
Median = Q2 = D5 = P50

General Methods to Find Quantiles


“Ungrouped Data” and “Grouped Data with Ungrouped Frequency
Distribution”:
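• If the quantile observation (for the median, n/2) is not an integer;
Quantile = the observation at position (Quantile obs. + 1), rounded down to a whole number
• If the quantile observation is an integer;
Quantile = Arithmetic mean of the (Quantile obs.)th observation and the immediately following observation.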

Grouped Data with Grouped Frequency Distribution:


Note: The following method is called “Linear Interpolation” and used for both equal & unequal
class intervals;
$\text{Quantile} = l + \dfrac{h}{f}\left(\text{Quantile obs.} - C\right)$

Note: Quantile obs. will locate the quantile group.


Where;
l = Lower class boundary of quantile group
h = Class interval of quantile group
f = Frequency of quantile group
n = Total number of observations in data
C = Cumulative frequency of class previous from quantile group
MODE
A value which occurs most frequently in a data set.
Unimodal Distribution: When data have one mode.
Bimodal Distribution: When data have two modes.
Multimodal Distribution: When data have more than two modes.
Distribution with No Mode: When all values of data have same frequencies.
Methods to Find Mode
Ungrouped Data:
The mode of ungrouped data is the value (or values) that occurs more often than any other
value.
Grouped Data with Ungrouped Frequency Distribution:
The value(s) with highest frequency is mode.
Grouped Data with Grouped Frequency Distribution:
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$

Note: Class with highest frequency is called Modal Class.


Where;
l = Lower class boundary of the modal class
fm = Frequency of the modal class
f1 = Frequency of the class preceding the modal class
f2 = Frequency of the class following the modal class
h = Class interval of the modal class
With Equal Class Intervals:
For Example:
Class Limits Frequency (f) Class Boundaries
30---39 8 29.5---39.5
40---49 87 39.5---49.5
50---59 190 49.5---59.5
60---69 304 59.5---69.5
70---79 211 69.5---79.5
80---89 85 79.5---89.5
90---99 20 89.5---99.5

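$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
$\text{Mode} = 59.5 + \dfrac{304 - 190}{(304 - 190) + (304 - 211)} \times 10$
$\text{Mode} = 59.5 + 5.51 = 65.01 \approx 65$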
With Unequal Class Intervals:
For Example:
Class Limits Frequency (f) Class Boundaries Class Interval (h) f/h
30---43 8 29.5---43.5 14 0.57
44---54 87 43.5---54.5 11 7.91
55---59 190 54.5---59.5 5 38
60---79 304 59.5---79.5 20 15.2
80---85 211 79.5---85.5 6 35.17
86---91 85 85.5---91.5 6 14.17
92---99 20 91.5---99.5 8 2.5

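Note: With unequal class intervals, the modal class is the class with the highest f/h, and the
f/h values take the place of the frequencies in the formula.
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
$\text{Mode} = 54.5 + \dfrac{38 - 7.91}{(38 - 7.91) + (38 - 15.2)} \times 5$
$\text{Mode} = 57.34$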
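Both mode examples can be handled by one sketch if the calculation always works with f/h. With equal intervals, dividing every frequency by the same h leaves the ratio in the formula unchanged. The function name `grouped_mode` is hypothetical, and the sketch assumes a single modal class.

```python
def grouped_mode(boundaries, frequencies):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h, using f/h values."""
    widths = [upper - lower for lower, upper in boundaries]
    densities = [f / h for f, h in zip(frequencies, widths)]
    m = max(range(len(densities)), key=densities.__getitem__)  # modal class index
    fm = densities[m]
    f1 = densities[m - 1] if m > 0 else 0.0                    # preceding class
    f2 = densities[m + 1] if m + 1 < len(densities) else 0.0   # following class
    return boundaries[m][0] + (fm - f1) / ((fm - f1) + (fm - f2)) * widths[m]

freqs = [8, 87, 190, 304, 211, 85, 20]
equal = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
         (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
unequal = [(29.5, 43.5), (43.5, 54.5), (54.5, 59.5), (59.5, 79.5),
           (79.5, 85.5), (85.5, 91.5), (91.5, 99.5)]

mode_equal = grouped_mode(equal, freqs)
mode_unequal = grouped_mode(unequal, freqs)
print(round(mode_equal, 2), round(mode_unequal, 2))
```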
Empirical Relation Between Mean, Median and Mode:
• In a symmetrical or normal distribution (with no skewness), the following relation is valid;
Mean = Median = Mode
• In a unimodal distribution having moderate skewness, the following relation is valid;
Mode = 3Median – 2Mean
MEASURES OF DISPERSION
It is quite possible that two sets of data have the same average (mean, median, or mode), yet
the individual observations of one data set are all close to its average while the individual
observations of the second data set differ considerably from its average. Thus, a value of central
tendency alone does not adequately describe the data. We therefore need some additional
information about how the data are dispersed about the average. This is done by measuring the
dispersion, by which we mean the extent to which the observations in a data set vary about their
average. A quantity that measures this characteristic is called a Measure of Dispersion.
Absolute Measure of Dispersion:
• This is a value expressed in the same units as the data set.
• It cannot be used to compare two or more data sets.
Relative Measure of Dispersion:
• This is a coefficient or percentage.
• It can be used to compare two or more data sets.
Main Measures of Dispersion:
1) The Range
2) The Semi-interquartile Range or the Quartile Deviation
3) The Mean Deviation or the Average Deviation
4) The Variance and the Standard Deviation
THE RANGE
Absolute Measure:
$R = x_m - x_0$
Relative Measure:
$\text{Coefficient of Range} = \dfrac{x_m - x_0}{x_m + x_0}$
Where,
$x_m$ = Largest value in data set
$x_0$ = Smallest value in data set
Note: Higher coefficient of range indicates higher dispersion (or less consistency) in data.
Note: For grouped data in a grouped frequency distribution, the largest value is the upper-class
boundary of the last class and the smallest value is the lower-class boundary of the first class.
Question: Which data set is more dispersed?
Data Set-1
7 3 6 2 11 8 14

Data Set-2
41 32 36 40 34 44 37
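One way to approach this question is to compute the coefficient of range for each data set and compare the two values. The sketch below does that; the function name `coefficient_of_range` is chosen for this illustration.

```python
def coefficient_of_range(values):
    """(largest - smallest) / (largest + smallest)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

data_set_1 = [7, 3, 6, 2, 11, 8, 14]
data_set_2 = [41, 32, 36, 40, 34, 44, 37]

coef_1 = coefficient_of_range(data_set_1)
coef_2 = coefficient_of_range(data_set_2)
print(round(coef_1, 4), round(coef_2, 4))
```

Both data sets have the same absolute range (12), so only the relative measure separates them: the larger coefficient indicates the more dispersed set.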

THE SEMI-INTERQUARTILE RANGE OR THE QUARTILE DEVIATION


Absolute Measure:
Q.D. = (Q3 – Q1) / 2
Relative Measure:
Coefficient of Q.D. = (Q3 – Q1) / (Q3 + Q1)
Where,
Q.D. = Quartile deviation
Q3 = The third quartile
Q1 = The first quartile
Note: Higher coefficient of Q.D. indicates higher dispersion (or less consistency) in central 50%
data.
THE MEAN DEVIATION (M.D.)
Note: Ignore the negative signs of deviations.
Absolute Measure:
• For Ungrouped Data:
$\text{M.D.} = \dfrac{\sum |x - \bar{x}|}{n}$
• For Grouped Data in an “Ungrouped or Grouped Frequency Distribution”:
$\text{M.D.} = \dfrac{\sum f\,|x - \bar{x}|}{n}$
Relative Measure:
$\text{Coefficient of M.D.} = \dfrac{\text{M.D.}}{\text{Arithmetic Mean}}$

Note: Higher coefficient of M.D. indicates higher dispersion of observations from their arithmetic
mean (or less consistency in observations).
THE VARIANCE AND STANDARD DEVIATION
Variance of Population = $\sigma^2$
Standard Deviation of Population = $\sigma$
Variance of Sample = $S^2$
Standard Deviation of Sample = $S$
Absolute Measure:
• For Ungrouped Data:
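$S^2 = \dfrac{\sum (x - \bar{x})^2}{n}$
$S = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
• For Grouped Data in an “Ungrouped or Grouped Frequency Distribution”:
$S^2 = \dfrac{\sum f(x - \bar{x})^2}{n}$
$S = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n}}$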

Relative Measure:
$\text{Coefficient of } S = \dfrac{S}{\text{Arithmetic Mean}}$

Note: Higher coefficient of S indicates higher dispersion of observations from their arithmetic
mean (or less consistency in observations).
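The mean deviation, variance and standard deviation formulas for ungrouped data can be sketched together. The data reused here are the six values from the arithmetic mean example; the function name and dictionary keys are hypothetical. The divisor is n, as in the formulas above, not n − 1.

```python
import math

def dispersion_summary(values):
    """Mean deviation, variance, standard deviation and their coefficients."""
    n = len(values)
    mean = sum(values) / n
    mean_dev = sum(abs(x - mean) for x in values) / n    # signs ignored
    variance = sum((x - mean) ** 2 for x in values) / n  # divisor n
    std_dev = math.sqrt(variance)
    return {
        "mean": mean,
        "mean_deviation": mean_dev,
        "variance": variance,
        "std_dev": std_dev,
        "coef_md": mean_dev / mean,
        "coef_s": std_dev / mean,
    }

summary = dispersion_summary([23, 12, 43, 25, 32, 37])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```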
Class Task:
Find all the relative measures of dispersion of the following grouped frequency distribution.
Class Boundaries Frequency (f)
30---60 28
60---90 292
90---120 389
120---150 212
150---180 59
180---210 18
210---240 2
Total = n = 1000

Also create another grouped frequency distribution by yourself and find all relative measures of
dispersion of that frequency distribution. Finally compare each relative measure of dispersion of
both distributions and comment about dispersion.
Probability
• A quantitative measure of uncertainty.
• A measure of degree of belief in a particular statement or problem.
• Uncertainty is also an inherent part of statistical inference because
Inferences are based on a sample.
• Examples (toss a coin, draw a card or throw dice etc.)
• Uncertainty in all these cases is measured in terms of probability.
To solve the gambling problems, the foundations of probability were laid
by two French Mathematicians;
• Blaise Pascal (1623-1662)
• Pierre De Fermat (1601-1665)
After that;
• Jakob Bernoulli (1654-1705)
• Abraham De Moivre (1667-1754)
• Pierre Simon Laplace (1749-1827)
Modern rules were developed in 19th century.
Probability helps to make intelligent decisions in Economics,
Management, Operations Research, Sociology, Psychology, Astronomy,
Physics, Engineering and Genetics where risk and uncertainty are
involved.
The probability theory is best understood through the application of the
modern set theory.
Sets
A set is any well-defined collection or list of distinct objects, e.g. a group
of students, the books in a library, the integers between 1 and 100, all
human beings on the Earth, etc.
• Well-defined refers to belonging of objects to the set.
• Distinct means that each object must appear only once.
• Objects are called elements or members of a set.
• Sets are denoted by Capital Letters e.g. A, B, X, Y etc.
• Elements are denoted by small letters e.g. a, b, x, y etc.
• Elements are enclosed by braces to represent a set, e.g.
A = {a, b, x, y} or B = {1,2,3,8}
• Number of elements of a set A, is written as n(A).
• Empty or null set “φ”.
• {0} is not an empty set; it contains one element, 0.
• Unit set or a singleton set.
• Elements of a set may be sets themselves.
A set may be specified in two ways;
• Roster Method;
A = {1,2,3,4,5,6} or B = {a book, a city, a clock, a teacher}
• Rule method:
A = {x | x is an odd number and x < 12}.
Finite set
A = {1,2,3, ….,99,100}
B = {x | x is a month of the year}
C = {x | x is a printing mistake in a book}
D = {x | x is a living citizen of Pakistan}
Infinite set
A = {x | x is an even integer}
B = {x | x is a point on a line}
C = {x | x is a sentence in the English language}
Subsets
A set that consists of some elements of another set, is called a subset of
that set.
A = {1,2,3,4,5,10} and B = {1,3,5}
In this example B is a subset of A.
• Every set is a subset of itself.
• An empty set is a subset of every set.
Identical sets refer to the sets having exactly same elements. And they
are subsets of each other.
A = {1,2,3,4} and B = {1,2,3,4}
Universal Set or Space (S)
The large or the original set of which all the sets we talk about, are
subsets.
• Universal set is also a subset of itself.
• A universal set with “n” elements has $2^n$ subsets, including S and φ.
For example, S = {a, b} has $2^2 = 4$ subsets: φ, {a}, {b} and {a, b}.
Venn Diagram

Operations on Sets explained with Venn Diagrams


• The union of two sets;

• The intersection of two sets;

• The difference of two sets;


• The complement of a set;

Example:
S = {1,2,3,4,5,6,7,8,9}
A = {1,2,3}
B = {3,4,5}
C = {2,3,7,8}
Different Operations on Sets:
A∪B = {1,2,3,4,5}
B∪A = {1,2,3,4,5}
A∪C = {1,2,3,7,8}
C∪A = {1,2,3,7,8}
B∪C = {2,3,4,5,7,8}
C∪B = {2,3,4,5,7,8}
A∩B = {3}
B∩A = {3}
A∩C = {2,3}
C∩A = {2,3}
B∩C = {3}
C∩B = {3}
A-B = {1,2}
B-A = {4,5}
A-C = {1}
C-A = {7,8}
B-C = {4,5}
C-B = {2,7,8}
A^c = S-A = {4,5,6,7,8,9}
B^c = S-B = {1,2,6,7,8,9}
C^c = S-C = {1,4,5,6,9}
S^c = S-S = φ
A∩B^c = A-B = {1,2}
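Python's built-in `set` type supports these operations directly, so the results above can be checked with a short sketch. Here `|` is union, `&` is intersection and `-` is difference; the complement is written as a difference from S because Python has no universal set.

```python
S = set(range(1, 10))  # {1, 2, ..., 9}
A = {1, 2, 3}
B = {3, 4, 5}
C = {2, 3, 7, 8}

print("A union B        =", sorted(A | B))
print("A intersection C =", sorted(A & C))
print("C - B            =", sorted(C - B))
print("complement of A  =", sorted(S - A))

# A intersected with the complement of B equals A - B.
print((A & (S - B)) == (A - B))
```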

The Algebra of Sets:


Let A, B and C be any subsets of the universal set S.
Commutative Laws:
• A∪B = {1,2,3,4,5}
B∪A = {1,2,3,4,5}
And;
• A∩B = {3}
B∩A = {3}
Associative Laws:
• (A∪B)∪C = {1,2,3,4,5,7,8}
A∪(B∪C) = {1,2,3,4,5,7,8}
And;
• (A∩B)∩C = {3}
A∩(B∩C) = {3}
Distributive Laws:
• A∩(B∪C) = {2,3}
(A∩B)∪(A∩C) = {2,3}
And;
• A∪(B∩C) = {1,2,3}
(A∪B)∩(A∪C) = {1,2,3}
Idempotent Laws:
• A∪A = A = {1,2,3}
• A∩A = A = {1,2,3}
Identity Laws:
• A∪S = S = {1,2,3,4,5,6,7,8,9}
• A∩S = A = {1,2,3}
• A∪φ = A = {1,2,3}
• A∩φ = φ
Complementation Laws:
• A∪A^c = S = {1,2,3,4,5,6,7,8,9}
• A∩A^c = φ
• (A^c)^c = A = {1,2,3}
• S^c = S-S = φ
• φ^c = S-φ = S = {1,2,3,4,5,6,7,8,9}
De Morgan’s Laws:
• (A∪B)^c = S-(A∪B) = {6,7,8,9}
A^c∩B^c = {6,7,8,9}
And;
• (A∩B)^c = {1,2,4,5,6,7,8,9}
A^c∪B^c = {1,2,4,5,6,7,8,9}
Partition of Sets
A partition of a set S is a sub-division of the set into non-empty subsets
that are disjoint and exhaustive, i.e. their union is the set S itself.
• A_i ∪ A_k = S
• A_i ∩ A_k = φ
For Example;
• S = {a, b, c, d, e}
• A_i = {a, b}
• A_k = {c, d, e}
Class of Sets
• A set of sets is called a class.
• The class of all subsets of a set A is called the Power set of A.
For Example;
• A = {H, T}
• Power set of A = {φ, {H}, {T}, A} or {φ, {H}, {T}, {H, T}}
Cartesian Product Sets
The Cartesian product of sets A and B, denoted by A x B, is a set that
contains all ordered pairs (x, y), where x belongs to set A and y belongs
to set B.
For Example;
• A = {x | x is a side of a coin}
• B = {y | y is a side of a die}
• Then,
• A = {H, T}
• B = {1,2,3,4,5,6}
• Suppose one coin and one die (plural: dice) are tossed together.
• A x B = {(H, 1); (H, 2); (H, 3); (H, 4); (H, 5); (H, 6); (T, 1); (T, 2); (T, 3);
(T, 4); (T, 5); (T, 6)}
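The Cartesian product above can be generated with `itertools.product` from the Python standard library. A minimal sketch, with list names chosen for this example:

```python
from itertools import product

coin = ["H", "T"]          # set A: sides of a coin
die = [1, 2, 3, 4, 5, 6]   # set B: sides of a die

# Every ordered pair (x, y) with x from the coin and y from the die.
coin_and_die = list(product(coin, die))

print(len(coin_and_die))
print(coin_and_die)
```

The number of pairs is n(A) × n(B) = 2 × 6 = 12.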

Experiment:
The term experiment means a planned activity or process
whose results yield a set of data.
Trial:
A single performance of an experiment.
Outcome:
The result obtained from an experiment or a trial.
Random Experiment:
An experiment which produces different results even
though it is repeated many times under essentially similar
conditions.
For example;
Tossing a fair coin, throwing of a balanced die and drawing
of a card from a well shuffled deck of 52 playing cards.
Properties of Random Experiment:
➢ The experiment can be repeated any number of
times.
➢ The experiment always has two or more possible
outcomes.
➢ Every possible outcome is already known.
➢ The outcome of each trial is unpredictable.
Deck of playing cards contains;
➢ 52 cards.
➢ Arranged in 4 suits (Clubs, Spades, Hearts and
Diamonds) of 13 each.
➢ Clubs and Spades are black.
➢ Hearts and Diamonds are red.
➢ Honor cards are Ace, 10, Jack, Queen, and King.
➢ Face cards are Jack, Queen, and King.
Sample Space:
A set consisting of all possible outcomes that can result
from a random experiment. It is denoted by “S”.
Sample Point:
Each possible outcome is called sample point in a sample
space.
For example;
Sample space of tossing a coin is;
S = {H, T}
Sample space of tossing two coins at a time is;
S = {HH, HT, TH, TT}
Remember the Cartesian product A x A.
Sample space of throwing two six-sided dice will be;
S = A x A, which contains 6 x 6 = 36 sample points:
1,1 1,2 1,3 1,4 1,5 1,6
2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6
4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6
6,1 6,2 6,3 6,4 6,5 6,6
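These sample spaces can also be listed programmatically. The sketch below, which is an addition to the notes, builds the two-coin and two-dice sample spaces as Cartesian products of a set with itself.

```python
from itertools import product

coin = ["H", "T"]
die = range(1, 7)

# repeat=2 takes the Cartesian product of a set with itself.
two_coins = ["".join(p) for p in product(coin, repeat=2)]
two_dice = list(product(die, repeat=2))

print(two_coins)
print(len(two_dice))
```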
Finite Sample Space:
If the number of sample points is finite.
Discrete Sample Space:
If the sample points are countable (finite or countably infinite).
Continuous Sample Space:
If the sample points are uncountably infinite, such as all the
points in an interval.
Event:
An individual outcome or any number of outcomes of a
random experiment.
Simple Event:
An event that contains exactly one sample point.
For example;
The occurrence of 6 when a die is thrown.
Compound Event:
An event that contains more than one sample point.
For example;
The occurrence of a sum of 10 with a pair of dice. It can be
decomposed into three simple events;
(4, 6), (5, 5) and (6, 4).
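An event is simply a subset of the sample space, so it can be picked out by filtering. A short sketch for the pair-of-dice experiment follows; the "double six" event is an assumed extra example of a simple event and does not appear in the notes.

```python
from itertools import product

S = list(product(range(1, 7), repeat=2))       # 36 sample points

sum_is_10 = [pt for pt in S if sum(pt) == 10]  # compound event
double_six = [pt for pt in S if pt == (6, 6)]  # simple event (assumed example)

print(sum_is_10)
print(len(double_six))
```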
Explanation:
A sample space consisting of “n” sample points can
produce 2^n different simple and compound events.
For example;
A set containing three elements
S = {a, b, c}
Then 2^n = 2^3 = 8 subsets are;
φ, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}
Similarly;
A sample space containing three sample points
S = {a, b, c}
Then 2^n = 2^3 = 8 possible events are;
φ, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}
Where;
• {a, b, c} is the sample space itself. As an event it
always occurs, so it is called the “sure event”.
• φ is also an event. It never occurs, so it is called the
“impossible event”.
The class of the above 8 events or subsets can be called a
field. These events have the following features;
• The union of any number of events will result in a set
that belongs to the field.
{a, b} ∪ {a, c} = {a, b, c}
• The intersection of any number of events will result in
a set that belongs to the field.
{a, b} ∩ {a, c} = {a}
• The difference of any two events belongs to the field.
{a, b} - {a, c} = {b}
• The complement of any event belongs to the field.
{a}^c = {b, c}
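The closure properties listed above can be verified by brute force for S = {a, b, c}. The following is an illustrative sketch only; frozenset is used so that events can themselves be stored in a set.

```python
from itertools import combinations

S = frozenset({"a", "b", "c"})

# Every subset of S is an event; there are 2 ** len(S) of them.
events = {frozenset(c) for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)}

# Union, intersection and difference of any two events stay in the class.
closed_pairwise = all(
    (A | B) in events and (A & B) in events and (A - B) in events
    for A in events for B in events
)
# The complement of any event stays in the class.
closed_complement = all((S - A) in events for A in events)

print(len(events), closed_pairwise, closed_complement)
```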
Mutually Exclusive Events:
Events that cannot occur at the same time.
For example;
• A tossed coin shows either a head or a tail.
• A thrown die shows exactly one of 1, 2, 3, 4, 5 and 6.
• A student either qualifies or fails.
If two events can occur at the same time, they are not
mutually exclusive.
For example;
• If we draw a card from a deck of 52 playing cards, it
can be both a king and a diamond. So, kings and
diamonds are not mutually exclusive.
• Inflation and recession are not mutually exclusive
events.
Exhaustive Events:
Events are called exhaustive when their union is the entire
sample space “S”.
For example;
In coin-tossing experiment, head and tail are exhaustive
events because the union of them is the entire sample
space (S = {H, T}).
Equally Likely Events:
Two events A and B are equally likely, when one event is
as likely to occur as the other.
For example;
When a coin is tossed, the head is as likely to appear as
the tail (fifty-fifty chances of occurrence).

Events and Symbolic Representations:


Verbal Statement                          Set Notation
Event A is impossible                     A = φ
Event A is sure                           A = S
Event A does not occur                    A^c
Event A or event A^c                      A ∪ A^c = S
Event A or event B                        A ∪ B
Event A and event B                       A ∩ B
Event A occurs but B does not occur       A ∩ B^c
Events A and B are mutually exclusive     A ∩ B = φ
Events A and B are exhaustive             A ∪ B = S
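These notations map directly onto Python's set operators. The sketch below uses a single throw of a die; the events A (an even number) and B (a number below 4) are assumptions chosen for illustration and are not taken from the notes.

```python
S = {1, 2, 3, 4, 5, 6}   # sample space of one die
A = {2, 4, 6}            # assumed event: an even number
B = {1, 2, 3}            # assumed event: a number below 4

A_c = S - A              # A does not occur
B_c = S - B              # B does not occur

print(A_c)
print(A | B)             # A or B
print(A & B)             # A and B
print(A & B_c)           # A occurs but B does not
print((A | A_c) == S)    # A or A^c is the sure event
print((A & B) == set())  # are A and B mutually exclusive?
print((A | B) == S)      # are A and B exhaustive?
```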

Compound Experiment:
Tossing a coin and throwing a die together is called
compound experiment because it consists of two different
experiments.
Counting Sample Points or Possible Outcomes:
When the number of sample points (possible outcomes)
of a compound experiment is very large then we need to
use the following methods to count them.
Rule of Multiplication:
Number of outcomes = mn
Where;
m = Outcomes of first experiment
n = Outcomes of second experiment
For example;
Tossing a coin and throwing a die together;
S1 = {H, T}
S2 = {1, 2, 3, 4, 5, 6}
m=2
n=6
Number of outcomes = mn = 2 x 6 = 12
Note: This rule can be extended to compound
experiments consisting of any number of experiments.
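A quick check of the rule, including the extension mentioned in the note, might look like this. It is a sketch only; the three-part experiment (two coins and one die) is an assumed example.

```python
from itertools import product
from math import prod

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# One coin and one die: m x n outcomes.
outcomes = list(product(coin, die))
print(len(outcomes), len(coin) * len(die))

# Assumed extension: two coins and one die, so the sizes multiply again.
sizes = [len(coin), len(coin), len(die)]
extended = list(product(coin, coin, die))
print(len(extended), prod(sizes))
```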
Rule of Permutation:
Permutation is any ordered subset selected from a set of
“n” distinct objects.
nPr = n! / (n - r)!
where;
r = number of objects chosen
n = total number of objects
Example:
A club consists of four members. How many sample points
are in the sample space when three officers--president,
secretary, and treasurer are to be chosen?
n = 4
r = 3
nPr = n! / (n - r)!
4P3 = 4! / (4 - 3)!
4P3 = (4 x 3 x 2 x 1) / 1!
4P3 = 24 / 1
4P3 = 24
Explanation:
Four members = A, B, C, D
Three are to be chosen and order is important due to
designation. So;
ABC ABD ACB ACD ADB ADC BAC BAD BCA BCD
BDA BDC CAB CAD CBA CBD CDA CDB DAB DAC
DBA DBC DCA DCB
Note: ⁿPᵣ and nPr are two ways of writing the same quantity.
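The club example can be reproduced in Python as a check. This sketch compares the factorial formula, the standard-library function math.perm (Python 3.8 or later), and an explicit listing with itertools.permutations.

```python
from itertools import permutations
from math import factorial, perm

n, r = 4, 3
by_formula = factorial(n) // factorial(n - r)   # n! / (n - r)!

members = ["A", "B", "C", "D"]
arrangements = ["".join(p) for p in permutations(members, r)]

print(by_formula, perm(n, r), len(arrangements))
print(arrangements[:6])
```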

Practice Questions:
• How many 6-digit telephone numbers can be formed
if each number starts with 35 and no digit appears
more than once and the order is important?

Digits available after excluding 3 and 5 = 0, 1, 2, 4, 6, 7, 8, 9

First two digits are 3 and 5 so the remaining four digits will
be selected from eight digits {0, 1, 2, 4, 6, 7, 8, 9} to form
a unique 6-digit telephone number.
Therefore,
Total number of ways = 8P4 = 1680
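The count can be confirmed by generating every such number. This is a sketch; the numbers are kept as strings so that a digit 0 is preserved.

```python
from itertools import permutations
from math import perm

remaining = "01246789"   # the eight digits other than 3 and 5

# Fix the prefix "35" and arrange 4 of the remaining digits in order.
numbers = ["35" + "".join(p) for p in permutations(remaining, 4)]

print(len(numbers), perm(8, 4))
print(numbers[:3])
```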

• There are 4 Czech and 3 Slovak books on the bookshelf.
Czech books should be placed on the left side of the
bookshelf and Slovak books on the right side. In how many
ways can the books be arranged when order is important?

Number of ways Czech books can be arranged = 4P4 = 24


Number of ways Slovak books can be arranged = 3P3 = 6
So,
Total number of ways = 24 x 6 = 144
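As a check, every shelf arrangement can be listed by pairing each ordering of the Czech books with each ordering of the Slovak books. The book labels C1 to C4 and S1 to S3 are hypothetical.

```python
from itertools import permutations

czech = ["C1", "C2", "C3", "C4"]   # hypothetical labels
slovak = ["S1", "S2", "S3"]        # hypothetical labels

# Left block is an ordering of Czech books, right block of Slovak books.
shelves = [left + right
           for left in permutations(czech)
           for right in permutations(slovak)]

print(len(shelves))
print(shelves[0])
```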

Rule of Combination:
Combination is any subset selected, without the concern
of order, from a set of “n” distinct objects.
nCr = n! / [r! (n - r)!]
where;
r = number of objects chosen
n = total number of objects
Example:
A three-person committee is to be formed from a list of
four persons. How many sample points are associated
with this experiment?
n = 4
r = 3
nCr = n! / [r! (n - r)!]
4C3 = 4! / [3! (4 - 3)!]
4C3 = (4 x 3 x 2 x 1) / [(3 x 2 x 1) (1!)]
4C3 = 24 / [6 (1)]
4C3 = 24 / 6
4C3 = 4
Explanation:
Four members = A, B, C, D
Three are to be chosen and order is unimportant. So;
ABC ABD ACD BCD
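The committee example can be checked the same way. The sketch below also shows how the two rules are related: each combination of r objects can be ordered in r! ways, so nPr = nCr x r!.

```python
from itertools import combinations
from math import comb, factorial, perm

n, r = 4, 3
by_formula = factorial(n) // (factorial(r) * factorial(n - r))

committees = ["".join(c) for c in combinations("ABCD", r)]

print(by_formula, comb(n, r), committees)
print(perm(n, r) == comb(n, r) * factorial(r))
```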

Practice Questions:
• How many sample points are in the sample space
when a person draws a hand of 5 cards from a well-shuffled
ordinary deck of 52 cards?
n = 52
r = 5
nCr = n! / [r! (n - r)!]
52C5 = 52! / [5! (52 - 5)!]
52C5 = 52! / (5! 47!)
52C5 = 2,598,960
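The intermediate arithmetic is easier to follow once 47! is cancelled, leaving five factors in the numerator. A short sketch of that step, checked against math.comb:

```python
from math import comb, factorial

# 52! / (5! x 47!) reduces to (52 x 51 x 50 x 49 x 48) / 5!
numerator = 52 * 51 * 50 * 49 * 48
hands = numerator // factorial(5)

print(hands, comb(52, 5))
```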

Note: ⁿCᵣ and nCr are two ways of writing the same quantity.

• In a group of 6 boys and 4 girls, four children are to
be selected. In how many different ways can they be
selected such that at least one boy is included?
Option-1
We can select 4 boys = 6C4 = 15
Option-2
We can select 3 boys and 1 girl = 6C3 x 4C1 = 80
Option-3
We can select 2 boys and 2 girls = 6C2 x 4C2 = 90
Option-4
We can select 1 boy and 3 girls = 6C1 x 4C3 = 24
So,
Total number of ways = 15 + 80 + 90 + 24 = 209
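The four options can be summed in one expression. The sketch also cross-checks the total with the complement approach, which is an alternative not used in the notes: count every selection of four children and subtract the selections with no boy.

```python
from math import comb

boys, girls, chosen = 6, 4, 4

# Sum over the number of boys selected (1, 2, 3 or 4).
by_cases = sum(comb(boys, b) * comb(girls, chosen - b)
               for b in range(1, chosen + 1))

# Complement: all selections minus the all-girl selections.
by_complement = comb(boys + girls, chosen) - comb(girls, chosen)

print(by_cases, by_complement)
```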

• From a group of 7 men and 6 women, five persons
are to be selected to form a committee so that at
least 3 men are on the committee. In how many
ways can it be done?
Option-1
3 men and 2 women = 7C3 x 6C2 = 35 x 15 = 525
Option-2
4 men and 1 woman = 7C4 x 6C1 = 35 x 6 = 210
Option-3
5 men = 7C5 = 21
So,
Total number of ways = 525 + 210 + 21 = 756
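For a problem of this size the answer can also be confirmed by brute force, listing all five-person committees and keeping those with at least 3 men. This is a sketch; the ("M", i) and ("W", i) labels are hypothetical stand-ins for the individual people.

```python
from itertools import combinations

# Hypothetical labels: 7 men and 6 women.
people = [("M", i) for i in range(7)] + [("W", i) for i in range(6)]

committees = [c for c in combinations(people, 5)
              if sum(1 for sex, _ in c if sex == "M") >= 3]

print(len(committees))
```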

• A bag contains 2 white balls, 3 black balls and 4 red
balls. In how many ways can 3 balls be drawn from
the bag, if at least one black ball is to be included in
the draw?
Option-1
1 black and 2 others = 3C1 x 6C2 = 3 x 15 = 45
Option-2
2 black and 1 other = 3C2 x 6C1 = 3 x 6 = 18
Option-3
3 black = 3C3 = 1
So,
Total number of ways = 45 + 18 + 1 = 64

• A box contains 4 red, 3 white and 2 blue balls. Three
balls are drawn at random. Find the number of
ways of selecting three balls of different colors.
1 red, 1 white and 1 blue = 4C1 x 3C1 x 2C1 = 4 x 3 x 2 = 24
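A brute-force check treats the nine balls as distinct objects and keeps the draws that contain all three colors. This is a sketch; balls are identified by their position in the list.

```python
from itertools import combinations

balls = ["R"] * 4 + ["W"] * 3 + ["B"] * 2   # 4 red, 3 white, 2 blue

# Draw 3 of the 9 balls, identifying each ball by its index.
draws = list(combinations(range(len(balls)), 3))
all_different = [d for d in draws if len({balls[i] for i in d}) == 3]

print(len(draws), len(all_different))
```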
