
Statistics


Lecture 01-02 (Basic definition)


Definition: Statistics is concerned with scientific methods for collecting, organizing, summarizing, presenting,
and analyzing data, as well as with drawing valid conclusions and making reasonable decisions on the basis of such
analysis.
Data: The facts and statistics collected together for reference or analysis. Equivalently, data is a collection of
facts, such as numbers, words, measurements, observations, or just descriptions of things.
Types of Data:

Fig. 1 Types of data.

Example: What do we know about the dog in Fig. 2?

Qualitative:
• He is brown and black
• He has long hair
• He has lots of energy
Quantitative:
• Discrete:
o He has 4 legs
o He has 2 eyes
• Continuous:
o He weighs 25.5 kg
o He is 565 mm tall

Fig. 2 A dog is jumping.
Collecting: Data can be collected in many ways, for example:
1. Direct observation (the simplest way).
2. Surveys and questionnaires (for example, your TER form).

Example: Counting Cars


You want to find how many cars pass by a certain
point on a road in a 10-minute interval.
So: stand near that road, and count the cars that pass
by in 10 minutes.

Fatema Khatun, Lecturer (Mathematics), DCSE, NUB



You might want to count many 10-minute intervals at different times during the day, and on different
days too!

Fig. 3 Cars passing a certain point on a road.

Population: A population is the aggregate of all elements possessing certain characteristics of interest in a
particular investigation or enquiry.
Sample: A sample is a representative part of the population.
Variable: The measurements of a characteristic may vary from element to element of a population, either in
magnitude or in quantity. Such a measurable characteristic is called a variable. If a variable can take only one
value, it is called a constant.
Raw data: Raw data are collected data that have not been organized numerically. An example is the set of heights
of 100 male students obtained from an alphabetical listing of university records.
Array: An array is an arrangement of raw numerical data in ascending or descending order of magnitude. The
difference between the largest and smallest numbers is called the range of the data. For example, if the largest
height of 100 male students is 74 inches (in) and the smallest height is 60 in, the range is 74 − 60 = 14 in.
Frequency distribution: When summarizing large masses of raw data, it is often useful to distribute the data into
classes, or categories, and to determine the number of individuals belonging to each class, called the class
frequency. A tabular arrangement of data by classes together with the corresponding class frequencies is called a
frequency distribution, or frequency table. Table 1 is a frequency distribution of heights (recorded to the nearest
inch) of 100 students at Northern University Bangladesh.

Table 1 Heights of 100 male students at Northern University Bangladesh.

Height (in)    Number of students
60-62            5
63-65           18
66-68           42
69-71           27
72-74            8
Total          100
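
The tallying step behind a table such as Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` and the short list of heights are assumptions made for the example (they are not the data behind Table 1), and the values are assumed to be recorded as whole inches.

```python
from collections import Counter

def frequency_table(values, lower, width, n_classes):
    """Tally whole-number values into classes of equal width starting at `lower`."""
    counts = Counter()
    for v in values:
        k = (v - lower) // width            # index of the class that contains v
        if 0 <= k < n_classes:
            counts[k] += 1
    table = []
    for k in range(n_classes):
        lo = lower + k * width              # lower class limit
        hi = lo + width - 1                 # upper class limit
        table.append((f"{lo}-{hi}", counts[k]))
    return table

# Hypothetical raw heights in inches (illustrative only)
heights = [61, 64, 67, 66, 70, 73, 68, 65, 67, 69, 62, 66]
table = frequency_table(heights, lower=60, width=3, n_classes=5)
for label, f in table:
    print(f"{label:>7}  {f}")
print("  Total ", sum(f for _, f in table))
```

The same function can be reused for other raw data by changing the lower limit of the first class, the class width, and the number of classes.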

Problem 2 Construct a frequency distribution table for the following raw data (height of 20 students): 50, 54, 56,
60, 55, 45, 48, 51, 53, 57, 49, 50, 52, 58, 57, 59, 57, 58, 55, and 54.


Graph: A graph is a pictorial presentation of the relationship between variables. Many types of graphs are
employed in statistics, depending on the nature of the data involved and the purpose for which the graph is
intended. Among these are bar graphs, pie graphs, pictographs, etc.

Graphical representation of frequency distribution:


a. Dot frequency diagram
b. Histogram
c. Frequency polygon
d. Cumulative frequency polygon
e. Cumulative frequency curve
Dot frequency diagram: The frequency distribution of the heights of 100 students is given in Table 1. To plot the
dot frequency diagram, first find the midpoint of each class (the "middle class" $x_i$ in the table below). Then
plot each class frequency against the midpoint of its class.

Height (in)    Middle class ($x_i$)    Number of students ($f_i$)
60-62          61                        5
63-65          64                       18
66-68          67                       42
69-71          70                       27
72-74          73                        8
Total                                  100

Figure 1 Dot frequency diagram for the given data (horizontal axis: height in inches; vertical axis: number of students).


Bar diagram: The frequency distribution of the heights of 100 students is given in Table 1. To plot the bar
diagram, first find the midpoint of each class. Then draw, at each midpoint, a bar whose height is the frequency of
the corresponding class.
Height (in)    Middle class ($x_i$)    Number of students ($f_i$)
60-62          61                        5
63-65          64                       18
66-68          67                       42
69-71          70                       27
72-74          73                        8
Total                                  100

Figure 2 Bar diagram for the given data (horizontal axis: height in inches; vertical axis: number of students).

Frequency polygon: The frequency distribution of the heights of 100 students is given in Table 1. To draw the
frequency polygon, first find the midpoint of each class. Then plot each class frequency against its midpoint and
join consecutive points with straight line segments.

Height (in)    Middle class ($x_i$)    Number of students ($f_i$)
60-62          61                        5
63-65          64                       18
66-68          67                       42
69-71          70                       27
72-74          73                        8
Total                                  100


Figure 3 Frequency polygon for the given data (horizontal axis: height in inches; vertical axis: number of students).

Cumulative frequency polygon: (Do it yourself)


Cumulative frequency curve: (Do it yourself)
Class intervals and class limits:
A symbol defining a class, such as 60–62 in Table 1, is called a class interval. The end numbers, 60 and 62, are
called class limits; the smaller number (60) is the lower class limit, and the larger number (62) is the upper class
limit. The terms class and class interval are often used interchangeably, although the class interval is actually a
symbol for the class. A class interval that, at least theoretically, has either no upper class limit or no lower class
limit indicated is called an open class interval. For example, referring to age groups of individuals, the class
interval “65 years and over” is an open class interval.

The size, or width, of a class interval:


The size, or width, of a class interval is the difference between its upper and lower class boundaries and is also
referred to as the class width, class size, or class length. If all class intervals of a frequency distribution have equal
widths, this common width is denoted by c. In such a case c is equal to the difference between two successive lower
class limits or two successive upper class limits. For the data of Table 1, for example, the class width is
c = 62.5 − 59.5 = 65.5 − 62.5 = 3.
Assignment:
1. Write down the uses of statistics in different sectors in your own words.
2. Why do we study statistics in the CSE sector?
3. Produce a frequency distribution table of the GPA in last semester's final examination for the students of this
section.
Homework:
1. Write down some examples of population, sample, and variable in your own words.


Lecture 03 (Measures of Location)

Average or measures of central tendency: An average is a value that is typical, or representative, of a set of
data. Since such typical values tend to lie centrally within a set of data arranged according to magnitude, averages
are also called measures of central tendency.

Several types of averages can be defined, the most common being the arithmetic mean, the median, the
mode, the geometric mean, and the harmonic mean. Each has advantages and disadvantages, depending on the
data and the intended purpose.

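Arithmetic mean: The arithmetic mean of a set of $N$ numbers $x_1, x_2, x_3, \dots, x_N$ is denoted by $\bar{X}$ and
defined by

$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{1}{N}\sum x.$$

For grouped data (a frequency distribution), the arithmetic mean can be defined as $AM = \frac{1}{N}\sum f_i x_i$,
where $x_i$ is the midpoint of the $i$-th class, $f_i$ is its frequency, and $N = \sum f_i$.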

Properties of arithmetic mean:

a. The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero.

Geometric mean: The geometric mean of a set of $N$ positive numbers $x_1, x_2, x_3, \dots, x_N$ is denoted by $G$ and
defined by $G = \sqrt[N]{x_1 x_2 x_3 \cdots x_N}$.

For grouped data, $G = 10^{\frac{1}{N}\sum f_i \log(x_i)}$, where $\log$ is the logarithm to base 10.

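Harmonic mean: The harmonic mean of a set of $N$ nonzero numbers $x_1, x_2, x_3, \dots, x_N$ is denoted by $H$ and
defined by

$$H = \frac{1}{\frac{1}{N}\left(\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \dots + \frac{1}{x_N}\right)} = \frac{1}{\frac{1}{N}\sum \frac{1}{x_i}}.$$

For grouped data, $H = \frac{1}{\frac{1}{N}\sum f_i \frac{1}{x_i}}$.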

Quadratic mean (Root mean square): The root mean square (RMS), or quadratic mean, of a set of numbers
$x_1, x_2, x_3, \dots, x_N$ is sometimes denoted by $\sqrt{\overline{X^2}}$ and is defined by

$$RMS = \sqrt{\overline{X^2}} = \sqrt{\frac{1}{N}\sum x_i^2}.$$

For grouped data,

$$RMS = \sqrt{\overline{X^2}} = \sqrt{\frac{1}{N}\sum f_i x_i^2}.$$


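Problem: Find the arithmetic mean, geometric mean and harmonic mean of the data 8, 5, 6, 3, 10, 12, 13, 9, 7, 6, 5,
3, 8.

Solution: $AM = \frac{8+5+6+3+10+12+13+9+7+6+5+3+8}{13} = \frac{95}{13} = 7.308$

$GM = \sqrt[13]{8 \times 5 \times 6 \times 3 \times 10 \times 12 \times 13 \times 9 \times 7 \times 6 \times 5 \times 3 \times 8} = 6.6623$ (verify this yourself)

$HM = \frac{1}{\frac{1}{13}\left(\frac{1}{8}+\frac{1}{5}+\frac{1}{6}+\frac{1}{3}+\frac{1}{10}+\frac{1}{12}+\frac{1}{13}+\frac{1}{9}+\frac{1}{7}+\frac{1}{6}+\frac{1}{5}+\frac{1}{3}+\frac{1}{8}\right)} = 6.0068$ (verify this yourself)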
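
The three means for this data set can be checked with a short Python sketch. The function names below are chosen for this example only; the geometric mean is computed through logarithms, which is an implementation choice rather than something the notes prescribe.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Average the logarithms instead of multiplying, to avoid a very large product.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # N divided by the sum of reciprocals
    return len(xs) / sum(1 / x for x in xs)

data = [8, 5, 6, 3, 10, 12, 13, 9, 7, 6, 5, 3, 8]
am = arithmetic_mean(data)
gm = geometric_mean(data)
hm = harmonic_mean(data)
print(f"AM = {am:.3f}, GM = {gm:.4f}, HM = {hm:.4f}")
```

Note that the results satisfy AM ≥ GM ≥ HM, as expected for positive data.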

Problem: The daily wages of 35 laborers in a certain factory are given below. Find the arithmetic mean by both
the direct and the short-cut method.

Class                 11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers        3      4      5     10      6      4      3

Solution:

Class interval   Middle class ($x_i$)   Number of laborers ($f_i$)   $f_i x_i$   $u_i = \frac{x_i - A}{h}$   $f_i u_i$
11-13            12                      3                            36         -3                          -9
13-15            14                      4                            56         -2                          -8
15-17            16                      5                            80         -1                          -5
17-19            18                     10                           180          0                           0
19-21            20                      6                           120          1                           6
21-23            22                      4                            88          2                           8
23-25            24                      3                            72          3                           9
Total                                   35                           632                                      1

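For the direct method,

$$AM = \frac{1}{N}\sum f_i x_i = \frac{632}{35} = 18.06.$$

For the short-cut method, with assumed mean $A = 18$ and class width $h = 2$,

$$AM = A + \left(\frac{1}{N}\sum f_i u_i\right) \times h = 18 + \frac{1}{35} \times 2 = 18 + \frac{2}{35} = 18.06.$$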
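
As a cross-check, both methods can be written out in Python. This is a minimal sketch using the midpoints and frequencies of the wage table; the values A = 18 and h = 2 are the assumed mean and class width implied by the $u_i$ column, and the variable names are chosen for this example.

```python
# Class midpoints and frequencies from the wage table
midpoints   = [12, 14, 16, 18, 20, 22, 24]
frequencies = [3, 4, 5, 10, 6, 4, 3]
N = sum(frequencies)

# Direct method: AM = (1/N) * sum(f_i * x_i)
am_direct = sum(f * x for f, x in zip(frequencies, midpoints)) / N

# Short-cut method: code each midpoint as u_i = (x_i - A) / h
A, h = 18, 2
u = [(x - A) / h for x in midpoints]
am_shortcut = A + h * sum(f * ui for f, ui in zip(frequencies, u)) / N

print(round(am_direct, 2), round(am_shortcut, 2))
```

Both methods give the same value because the coding $u_i = (x_i - A)/h$ is only a shift and rescaling of the midpoints.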


Problem: The daily wages of 35 laborers in a certain factory are given below. Find the geometric mean and the
harmonic mean.

Class                 11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers        3      4      5     10      6      4      3
Solution:

Geometric mean (GM):

Class interval   Middle class ($x_i$)   Number of laborers ($f_i$)   $\log(x_i)$   $f_i \log(x_i)$
11-13            12                      3                           1.0792         3.2376
13-15            14                      4                           1.1461         4.5844
15-17            16                      5                           1.2041         6.0205
17-19            18                     10                           1.2553        12.5530
19-21            20                      6                           1.3010         7.8060
21-23            22                      4                           1.3424         5.3696
23-25            24                      3                           1.3802         4.1406
Total                                   $N = 35$                                   $\sum f_i \log(x_i) = 43.7117$

We know that $GM = 10^{\frac{1}{N}\sum f_i \log(x_i)} = 10^{\frac{1}{35}(43.7117)} = 10^{1.2489}$

or, $GM = 10^{1.2489} = 17.74$ (approximately)

Harmonic mean (HM):

Class interval   Middle class ($x_i$)   Number of laborers ($f_i$)   $\frac{1}{x_i}$   $f_i \frac{1}{x_i}$
11-13            12                      3                           0.0833            0.2499
13-15            14                      4                           0.0714            0.2856
15-17            16                      5                           0.0625            0.3125
17-19            18                     10                           0.0556            0.5560
19-21            20                      6                           0.0500            0.3000
21-23            22                      4                           0.0455            0.1820
23-25            24                      3                           0.0417            0.1251
Total                                   $N = 35$                                       $\sum f_i \frac{1}{x_i} = 2.0111$

We know that the harmonic mean is $HM = \frac{1}{\frac{1}{N}\sum f_i \frac{1}{x_i}} = \frac{1}{\frac{1}{35}(2.0111)} = \frac{35}{2.0111} = 17.40$ (approximately)
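
The grouped geometric and harmonic means can also be computed directly from the midpoints and frequencies. The following Python sketch assumes base-10 logarithms, as in the table; because it keeps full precision, its last digits can differ slightly from a hand calculation that rounds each logarithm or reciprocal to four decimal places.

```python
import math

midpoints   = [12, 14, 16, 18, 20, 22, 24]
frequencies = [3, 4, 5, 10, 6, 4, 3]
N = sum(frequencies)

# GM = 10 ** ((1/N) * sum(f_i * log10(x_i)))
log_sum = sum(f * math.log10(x) for f, x in zip(frequencies, midpoints))
gm = 10 ** (log_sum / N)

# HM = N / sum(f_i / x_i)
reciprocal_sum = sum(f / x for f, x in zip(frequencies, midpoints))
hm = N / reciprocal_sum

print(f"sum f*log10(x) = {log_sum:.4f}, GM = {gm:.2f}")
print(f"sum f/x        = {reciprocal_sum:.4f}, HM = {hm:.2f}")
```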

Problem: Find the relation between the arithmetic mean, geometric mean and harmonic mean.

Problem: For two positive numbers, show that $AM \times HM = GM^2$. (HW)

Problem: The daily wages of 35 laborers in a certain factory are given below. Find the quadratic mean, or root
mean square (RMS).
Class                 11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers        3      4      5     10      6      4      3


Lecture 04 (Median, Mode, Quartile, Percentile)

Median: The median of a set of numbers arranged in order of magnitude (i.e., in an array) is either the middle
value or the arithmetic mean of the two middle values.

Examples:

For ungrouped data:

If the total number of data values ($N$) is odd, then the $\left(\frac{N+1}{2}\right)$th value is the median.

If the total number of data values ($N$) is even, then the average of the $\left(\frac{N}{2}\right)$th and
$\left(\frac{N}{2} + 1\right)$th values is the median.

For grouped data:

$$\text{median} = L + \frac{\frac{N}{2} - F}{f} \times h,$$

where $L$ is the lower limit of the median class, $F$ is the cumulative frequency of the class before the median
class, $f$ is the frequency of the median class, $h$ is the width of the median class, and $N$ is the total number
of observations.

Mode: The mode of a set of numbers is that value which occurs with the greatest frequency; that is, it is the most
common value. The mode may not exist, and even if it does exist it may not be unique. A distribution having only
one mode is called unimodal.

Examples:

a. The set 2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, and 18 has mode 9.


b. The set 3, 5, 8, 10, 12, 15, and 16 has no mode.
c. The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, and 9 has two modes, 4 and 7, and is called bimodal.

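For grouped data:

$$\text{mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) h,$$

where $L$ is the lower limit of the modal class, $\Delta_1$ is the difference between the frequencies of the modal
class and the class before it, $\Delta_2$ is the difference between the frequencies of the modal class and the class
after it, and $h$ is the width of the modal class.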


Problem: The daily wages of 35 laborers in a certain factory are given below. Find the median and mode from
the given frequency distribution.

Class                 11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers        3      4      5     10      6      4      3

Solution:

Class interval   Number of laborers ($f_i$)   Cumulative frequency ($F$)
11-13             3                            3
13-15             4                            7
15-17             5                           12
17-19            10                           22
19-21             6                           28
21-23             4                           32
23-25             3                           35
Total            35

We know that $\text{median} = L + \frac{\frac{N}{2} - F}{f} \times h$. (1)

First we have to identify the median class. Half of the total frequency is $\frac{35}{2} = 17.5$. The first
cumulative frequency that is at least 17.5 is 22, which belongs to the class 17-19. Hence the class 17-19 is the
median class. Thus $L = 17$, $N = 35$, $F = 12$, $f = 10$ and $h = 19 - 17 = 2$.

Now, from Eq. (1), we have

$$\text{median} = L + \frac{\frac{N}{2} - F}{f} \times h = 17 + \frac{\frac{35}{2} - 12}{10} \times 2 = 17 + \frac{17.5 - 12}{10} \times 2 = 17 + 0.55 \times 2 = 18.1$$

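Again, $\text{mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) h$. (2)

Here, class 17-19 is the modal class.

Thus, $L = 17$, $\Delta_1 = 10 - 5 = 5$, $\Delta_2 = 10 - 6 = 4$, and $h = 19 - 17 = 2$.

Now, from Eq. (2), we have

$\text{mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) h = 17 + \left(\frac{5}{5 + 4}\right) \times 2 =$ (do it yourself)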
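
The two grouped-data formulas can be sketched as small Python functions. The names `grouped_median` and `grouped_mode` are illustrative, and the sketch assumes equal class widths and a modal class whose frequency is strictly larger than that of at least one neighbouring class (otherwise the mode formula divides by zero).

```python
def grouped_median(lowers, freqs, h):
    """Median of a grouped distribution with equal class width h."""
    N = sum(freqs)
    F = 0                                   # cumulative frequency before the current class
    for L, f in zip(lowers, freqs):
        if F + f >= N / 2:                  # first class whose cumulative frequency reaches N/2
            return L + (N / 2 - F) / f * h
        F += f

def grouped_mode(lowers, freqs, h):
    """Mode from the modal class (the first class with the highest frequency)."""
    k = freqs.index(max(freqs))
    before = freqs[k - 1] if k > 0 else 0
    after = freqs[k + 1] if k + 1 < len(freqs) else 0
    d1, d2 = freqs[k] - before, freqs[k] - after
    return lowers[k] + d1 / (d1 + d2) * h

lowers = [11, 13, 15, 17, 19, 21, 23]       # lower class limits of the wage table
freqs  = [3, 4, 5, 10, 6, 4, 3]
median = grouped_median(lowers, freqs, h=2)
mode = grouped_mode(lowers, freqs, h=2)
print(f"median = {median:.2f}, mode = {mode:.2f}")
```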

Quartiles, deciles and percentiles: If a set of data is arranged in order of magnitude, the middle value (or
arithmetic mean of the two middle values) that divides the set into two equal parts is the median. By extending
this idea, we can think of those values which divide the set into four equal parts. These values, denoted by Q1,
Q2, and Q3, are called the first, second, and third quartiles, respectively, the value Q2 being equal to the median.

Similarly, the values that divide the data into 10 equal parts are called deciles and are denoted by D1, D2,
. . . , D9, while the values dividing the data into 100 equal parts are called percentiles and are denoted by
P1, P2, . . . , P99.

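For a grouped frequency distribution, the quartiles are given by

$$Q_i = L_i + \frac{\frac{N}{4} i - F_i}{f_i} \times h; \quad i = 1, 2, 3.$$

For a grouped frequency distribution, the deciles are given by

$$D_i = L_i + \frac{\frac{N}{10} i - F_i}{f_i} \times h; \quad i = 1, 2, 3, \dots, 9.$$

For a grouped frequency distribution, the percentiles are given by

$$P_i = L_i + \frac{\frac{N}{100} i - F_i}{f_i} \times h; \quad i = 1, 2, 3, \dots, 99.$$

In each formula, as for the median, $L_i$ is the lower limit of the class containing the required value, $F_i$ is the
cumulative frequency of the class before it, $f_i$ is the frequency of that class, and $h$ is the class width.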

Problem 03: The frequency distribution of the weekly wages of 65 employees at the P&R Company is given below.

Wages ($)    Number of employees
250-260       8
260-270      10
270-280      16
280-290      14
290-300      10
300-310       5
310-320       2
Total        65

Find $Q_1, Q_2, Q_3, D_1, \dots, D_9, P_{10}, P_{53}, P_{76}$, and $P_{95}$.

Solution: Please see the following table.

Class      $f_i$   $F_i$
250-260     8       8
260-270    10      18
270-280    16      34
280-290    14      48
290-300    10      58
300-310     5      63
310-320     2      65
Total      65


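We know that the quartiles are given by

$$Q_i = L_i + \frac{\frac{N}{4} i - F_i}{f_i} \times h; \quad i = 1, 2, 3.$$

For the first quartile ($i = 1$), $\frac{N}{4} i = \frac{65}{4} \times 1 = 16.25$, which lies in the class 260-270.
Thus, $L_1 = 260$, $F_1 = 8$, $f_1 = 10$, and $h = 10$. Therefore,

$$Q_1 = L_1 + \frac{\frac{N}{4} \times 1 - F_1}{f_1} \times h = 260 + \frac{16.25 - 8}{10} \times 10 = 260 + 8.25 = 268.25.$$

For the second quartile ($i = 2$), $\frac{N}{4} i = \frac{65}{4} \times 2 = 32.50$, which lies in the class 270-280.
Thus, $L_2 = 270$, $F_2 = 18$, $f_2 = 16$, and $h = 10$. Therefore,

$$Q_2 = L_2 + \frac{\frac{N}{4} \times 2 - F_2}{f_2} \times h = 270 + \frac{32.50 - 18}{16} \times 10 = 270 + \frac{14.50 \times 10}{16} = \,?$$

For the third quartile ($i = 3$), $\frac{N}{4} i = \frac{65}{4} \times 3 = 48.75$, which lies in the class 290-300.
Thus, $L_3 = 290$, $F_3 = 48$, $f_3 = 10$, and $h = 10$. Therefore,

$$Q_3 = L_3 + \frac{\frac{N}{4} \times 3 - F_3}{f_3} \times h = 290 + \frac{48.75 - 48}{10} \times 10 = \,?$$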
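
Since quartiles, deciles and percentiles all use the same formula with a different divisor, one helper function can cover all three. The Python sketch below is an illustration under the assumption of equal class widths; the name `grouped_quantile` and its `parts` argument (4, 10 or 100) are hypothetical names chosen for this example.

```python
def grouped_quantile(lowers, freqs, h, i, parts):
    """i-th quantile when the data are split into `parts` equal parts
    (parts=4: quartiles, parts=10: deciles, parts=100: percentiles)."""
    N = sum(freqs)
    target = i * N / parts                  # position (N/parts) * i
    F = 0                                   # cumulative frequency before the current class
    for L, f in zip(lowers, freqs):
        if F + f >= target:                 # class that contains the target position
            return L + (target - F) / f * h
        F += f

lowers = [250, 260, 270, 280, 290, 300, 310]   # lower limits of the wage classes
freqs  = [8, 10, 16, 14, 10, 5, 2]

q1 = grouped_quantile(lowers, freqs, 10, 1, 4)
q3 = grouped_quantile(lowers, freqs, 10, 3, 4)
d5 = grouped_quantile(lowers, freqs, 10, 5, 10)
p50 = grouped_quantile(lowers, freqs, 10, 50, 100)
print(q1, q3, d5, p50)
```

The fifth decile and the 50th percentile coincide with the second quartile (the median), which gives a quick consistency check on any hand calculation.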

Problem for practice:

1. Write down the merits and demerits of the arithmetic mean, geometric mean and harmonic mean.
(Assignment)
2. Write down the merits and demerits of the median and mode. (Assignment)
3. For any two positive numbers, show that $(GM)^2 = AM \times HM$.
4. Prove that the quadratic mean (QM) of any two positive unequal numbers a and b is greater than their
geometric mean.
5. For n positive observations, show that arithmetic mean ≥ geometric mean ≥ harmonic mean.
6. Form a frequency distribution table from the following data and hence draw the different frequency
diagrams (scattered dot plot, histogram, polygon).
3, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 9, 10, 10, 10, 10, 10, 10, 12, 13, 15, 15, 16, 17, 17, 18, 19, 21, 23,
23, 25, 26, 27, 28, 29, 31, 33, 35, 34, 34, 35, 36, 37, 37, 38, 39, 42, 45.
7. Find the arithmetic mean, median and mode for the following frequency distribution:
Class 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
frequency 3 4 6 7 12 9 6 3


8. Find the arithmetic mean (by both the direct and the short-cut method), geometric mean, harmonic mean, median
and mode for the following frequency distribution.
Class 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40
frequency 2 5 7 13 21 16 8 3
9. Suppose you have some data including positive values, negative values and zero. Which mean would you choose
for measuring the location of the data, and why?


Lecture 05 (Measures of Dispersion)

Dispersion, or variation: The degree to which numerical data tend to spread about an average value is called
the dispersion, or variation, of the data. Various measures of this dispersion (or variation) are available, the
most common being the range, mean deviation, semi-interquartile range, 10–90 percentile range, and standard
deviation.

Range: The range of a set of numbers is the difference between the largest and smallest numbers in the set.

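The mean deviation: The mean deviation, or average deviation, of a set of $N$ numbers $x_1, x_2, x_3, \dots, x_N$ is
abbreviated MD and is defined by

$$MD = \frac{1}{N}\sum |x_i - \bar{x}|,$$

where $\bar{x}$ is the arithmetic mean of the numbers.

For grouped data,

$$MD = \frac{1}{N}\sum f_i |x_i - \bar{x}|.$$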

The semi-interquartile range: The semi-interquartile range, or quartile deviation, of a set of data is denoted by
$Q$ and is defined by

$$Q = \frac{Q_3 - Q_1}{2},$$

where $Q_1$ and $Q_3$ are the first and third quartiles for the data. The interquartile range $(Q_3 - Q_1)$ is
sometimes used, but the semi-interquartile range is more common as a measure of dispersion.

The 10–90 percentile range: The 10–90 percentile range of a set of data is defined by

10–90 percentile range $= P_{90} - P_{10}$,

where $P_{10}$ and $P_{90}$ are the 10th and 90th percentiles for the data. The semi-10–90 percentile range,
$\frac{1}{2}(P_{90} - P_{10})$, can also be used but is not commonly employed.


Problem 02: Find the semi-interquartile range, or quartile deviation, and the 10–90 percentile range from the
following frequency distribution.

Wages ($)    Number of employees
250-260       8
260-270      10
270-280      16
280-290      14
290-300      10
300-310       5
310-320       2
Total        65

Solution: Please see the following table.

Class      $f_i$   $F_i$
250-260     8       8
260-270    10      18
270-280    16      34
280-290    14      48
290-300    10      58
300-310     5      63
310-320     2      65
Total      65
We know that the quartiles are given by

$$Q_i = L_i + \frac{\frac{N}{4} i - F_i}{f_i} \times h; \quad i = 1, 2, 3.$$

For the first quartile ($i = 1$), $\frac{N}{4} i = \frac{65}{4} \times 1 = 16.25$, which lies in the class 260-270.
Thus, $L_1 = 260$, $F_1 = 8$, $f_1 = 10$, and $h = 10$. Therefore,

$$Q_1 = L_1 + \frac{\frac{N}{4} \times 1 - F_1}{f_1} \times h = 260 + \frac{16.25 - 8}{10} \times 10 = 260 + 8.25 = 268.25.$$

For the third quartile ($i = 3$), $\frac{N}{4} i = \frac{65}{4} \times 3 = 48.75$, which lies in the class 290-300.
Thus, $L_3 = 290$, $F_3 = 48$, $f_3 = 10$, and $h = 10$. Therefore,

$$Q_3 = L_3 + \frac{\frac{N}{4} \times 3 - F_3}{f_3} \times h = 290 + \frac{48.75 - 48}{10} \times 10 = 290.75$$

Now, the semi-interquartile range is $Q = \frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(290.75 - 268.25) = \frac{1}{2} \times 22.50 = 11.25$

10–90 percentile range $= P_{90} - P_{10} = \,?$


The semi-10–90 percentile range $= \frac{1}{2}(P_{90} - P_{10}) = \,?$

The standard deviation: The standard deviation of a set of $N$ numbers $x_1, x_2, x_3, \dots, x_N$ is denoted by
$\sigma$ and is defined by

$$\sigma = \sqrt{\frac{1}{N}\sum (x_i - \bar{x})^2}.$$

Thus $\sigma$ is the root mean square (RMS) of the deviations from the mean, or, as it is sometimes called, the
root-mean-square deviation.

For grouped data,

$$\sigma = \sqrt{\frac{1}{N}\sum f_i (x_i - \bar{x})^2}.$$

Short method for computing the standard deviation:

$$\sigma = \sqrt{\frac{1}{N}\sum f_i x_i^2 - \left(\frac{1}{N}\sum f_i x_i\right)^2}.$$

If sample data are given, then the standard deviation is denoted by $s$ and defined by

$$s = \sqrt{\frac{1}{N-1}\sum (x_i - \bar{x})^2}.$$


Problem: Find the standard deviation (SD) and variance from the following frequency distribution.
Class                 11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers        3      4      5     10      6      4      3

Solution:

Class    $x_i$   $f_i$   $f_i x_i$   $x_i^2$   $f_i x_i^2$
11-13    12       3       36         144         432
13-15    14       4       56         196         784
15-17    16       5       80         256        1280
17-19    18      10      180         324        3240
19-21    20       6      120         400        2400
21-23    22       4       88         484        1936
23-25    24       3       72         576        1728
Total            35      632                   11800
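We know,

Standard deviation, $SD = \sqrt{\frac{1}{N}\sum f_i x_i^2 - \left(\frac{1}{N}\sum f_i x_i\right)^2} = \sqrt{\frac{1}{35} \times 11800 - \left(\frac{1}{35} \times 632\right)^2}$

$= \sqrt{337.143 - 326.06} = \sqrt{11.083} = 3.329$ (approximately)

Variance, $\sigma^2 = \frac{1}{N}\sum f_i x_i^2 - \left(\frac{1}{N}\sum f_i x_i\right)^2 = 11.083$ (approximately)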
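
The short method and the defining formula should give the same variance, which can be confirmed numerically. The Python sketch below uses the midpoints and frequencies of the table above; the variable names are chosen for this example, and full-precision results can differ in the last digit from the rounded hand calculation.

```python
import math

midpoints   = [12, 14, 16, 18, 20, 22, 24]
frequencies = [3, 4, 5, 10, 6, 4, 3]
N = sum(frequencies)

mean    = sum(f * x for f, x in zip(frequencies, midpoints)) / N       # (1/N) sum f x
mean_sq = sum(f * x * x for f, x in zip(frequencies, midpoints)) / N   # (1/N) sum f x^2

# Short method: mean of squares minus square of the mean
variance_short = mean_sq - mean ** 2

# Definition: mean of squared deviations from the mean
variance_def = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / N

sd = math.sqrt(variance_short)
print(f"variance = {variance_short:.3f}, SD = {sd:.3f}")
```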

The variance: The variance of a set of data is defined as the square of the standard deviation and is thus given
by $\sigma^2$. When it is necessary to distinguish the standard deviation of a population from the standard
deviation of a sample drawn from this population, we often use the symbol $s$ for the latter and $\sigma$
(lowercase Greek sigma) for the former. Thus $s^2$ and $\sigma^2$ would represent the sample variance and
population variance, respectively.

Coefficient of variation (CV): If the absolute dispersion is the standard deviation $s$ and if the average is the
mean $\bar{x}$, then the relative dispersion is called the coefficient of variation, or coefficient of dispersion; it is
denoted by CV and is given by

$$CV(\%) = \frac{SD}{\bar{x}} \times 100\%,$$

and is generally expressed as a percentage. Note that the coefficient of variation is independent of the units
used. For this reason, it is useful in comparing distributions where the units may be different. A disadvantage of
the coefficient of variation is that it fails to be useful when $\bar{x}$ is close to zero.


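Problem: Show that for two observations, the standard deviation is half of the range.
Solution: Let $a$ and $b$ be the two observations.
The mean of the data is $\bar{x} = \frac{a+b}{2}$.
We know that the standard deviation is $SD = \sqrt{\frac{1}{N}\sum (x_i - \bar{x})^2}$.

or, $SD = \sqrt{\frac{1}{2}\left\{\left(a - \frac{a+b}{2}\right)^2 + \left(b - \frac{a+b}{2}\right)^2\right\}}$

or, $SD = \sqrt{\frac{1}{2}\left\{\left(\frac{2a-a-b}{2}\right)^2 + \left(\frac{2b-a-b}{2}\right)^2\right\}}$

or, $SD = \sqrt{\frac{1}{2}\left\{\left(\frac{a-b}{2}\right)^2 + \left(\frac{-a+b}{2}\right)^2\right\}}$

or, $SD = \sqrt{\frac{1}{2}\left\{\left(\frac{a-b}{2}\right)^2 + \left(\frac{a-b}{2}\right)^2\right\}}$

or, $SD = \sqrt{\frac{1}{2} \cdot 2 \cdot \left(\frac{a-b}{2}\right)^2}$

or, $SD = \sqrt{\left(\frac{a-b}{2}\right)^2}$

or, $SD = \left|\frac{a-b}{2}\right|$

or, $SD = \frac{1}{2}|a - b|$

or, $SD = \frac{1}{2} \times \text{range}$. (proved)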


Lecture 07
Problem: Show that the standard deviation of the first $n$ natural numbers is $\sqrt{\frac{n^2 - 1}{12}}$.
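
One possible route to this result, sketched here as a worked derivation rather than as the intended solution, applies the short method for the standard deviation (with every frequency equal to 1) together with the standard sums $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ and $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$:

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{k=1}^{n} k = \frac{n+1}{2}, \\
\frac{1}{n}\sum_{k=1}^{n} k^2 &= \frac{(n+1)(2n+1)}{6}, \\
\sigma^2 &= \frac{1}{n}\sum_{k=1}^{n} k^2 - \bar{x}^2
          = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} \\
         &= \frac{(n+1)\{2(2n+1) - 3(n+1)\}}{12}
          = \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}, \\
\sigma &= \sqrt{\frac{n^2 - 1}{12}}.
\end{align*}
```

For example, for $n = 2$ the formula gives $\sigma = \sqrt{3/12} = \frac{1}{2}$, which agrees with half of the range of the two numbers 1 and 2.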


Laws for combined mean and standard deviation:

Suppose $\bar{X}_1$, $s_1$, $N_1$ are the mean, standard deviation and number of observations, respectively, of the
first group and $\bar{X}_2$, $s_2$, $N_2$ are those of the second group.

The mean of the combined group: $\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$

The standard deviation of the combined group: $s_{12} = \sqrt{\frac{N_1 (s_1^2 + d_1^2) + N_2 (s_2^2 + d_2^2)}{N_1 + N_2}}$,

where $d_1 = \bar{X}_{12} - \bar{X}_1$ and $d_2 = \bar{X}_{12} - \bar{X}_2$.

Problem: A group of 40 students was selected and their heights were measured, giving a mean of 4.5 ft and a
standard deviation of 2.1 ft. Another group of 50 students was also selected, giving a mean of 4.3 ft and a
standard deviation of 1.9 ft. Find the mean and standard deviation of the combined group.

Problem: Calculate the combined mean and standard deviation (SD) from the following table of two groups.

Group   Mean   SD   No. of observations
G2      10     1    40
G1      15     2    60
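
The combination laws translate directly into a small function. The Python sketch below is an illustration only: the function name `combine_groups` is an assumption of this example, and it is applied to the two groups of the table above so that a hand calculation can be checked against it.

```python
import math

def combine_groups(n1, mean1, s1, n2, mean2, s2):
    """Combined mean and standard deviation of two groups."""
    n = n1 + n2
    mean12 = (n1 * mean1 + n2 * mean2) / n
    d1 = mean12 - mean1                     # shift of group 1 mean from combined mean
    d2 = mean12 - mean2                     # shift of group 2 mean from combined mean
    s12 = math.sqrt((n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / n)
    return mean12, s12

# G1: mean 15, SD 2, 60 observations; G2: mean 10, SD 1, 40 observations
mean12, s12 = combine_groups(60, 15, 2, 40, 10, 1)
print(f"combined mean = {mean12}, combined SD = {s12:.4f}")
```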

Problem: A student, while calculating the mean and standard deviation of 25 observations, obtained mean = 56 and
standard deviation = 2. At the time of checking, it was found that he had copied one observation as 64 instead
of 46. What are the correct values of the mean and standard deviation?


Problem: The mean and standard deviation of 20 observations are found to be 10 and 2, respectively. On
rechecking, it was found that an observation 8 was incorrect. Calculate the correct mean and standard deviation
in each of the following cases:
(i) If the wrong item is omitted
(ii) If it is replaced by 12
Problems for practice:

1. Find the mean deviation, standard deviation, quartile deviation and percentile deviation from the data 3,
4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 9, 10, 10, 10, 10, 10, 10, 12.
2. Find the mean deviation, standard deviation, quartile deviation and percentile deviation from the
frequency distribution
Class 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
frequency 3 4 6 7 12 9 6 3


3. Find the coefficient of variation from the frequency distribution


Class 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
frequency 3 4 6 7 12 9 6 3

4. Find the mean deviation, standard deviation and variance of the following table.
Class 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
frequency 3 4 6 7 12 9 6 3

5. Show that for two observations, the standard deviation is half of the range.

6. Show that the standard deviation of the first $n$ natural numbers is $\sqrt{\frac{n^2 - 1}{12}}$.

7. A group of 60 students was selected and their heights were measured, giving a mean of 5.5 ft and a standard
deviation of 2.1 ft. Another group of 40 students was also selected, giving a mean of 5.3 ft and a
standard deviation of 1.9 ft. Find the mean and standard deviation of the combined group.
8. A student, while calculating the mean and standard deviation of 50 observations, obtained mean = 56 and
standard deviation = 2.50. At the time of checking, it was found that he had copied one observation as 64
instead of 46. What are the correct values of the mean and standard deviation?


Regression curve fitting

Curve fitting: The general problem of finding equations of approximating curves that fit given sets of data is
called curve fitting.

Least square method:

The least-squares line fitting:

Regression: Regression describes the relationship between a dependent variable and an independent variable.

Regression curve: On the basis of sample data, we wish to estimate the value of a variable Y corresponding to a
given value of a variable X. This can be accomplished by estimating the value of Y from a least-squares curve
that fits the sample data. The resulting curve is called a regression curve of Y on X, since Y is estimated from X.

Problems for Practice:

1. Fit a regression line of the form $y = a + bx$ to the following tabular data.

X:   0   1   2   3    4    5    6
Y:   2   5   7   10   12   18   25

Solution:
We know that the equation of a straight line is $y = a + bx$. (1)
For straight-line fitting by the least-squares method, the normal equations are
$\sum Y = na + b\sum X$ and $\sum XY = a\sum X + b\sum X^2$,
where $n$ is the number of data points.
Now,
Now,
$X$            $Y$            $X^2$            $XY$
0              2              0                0
1              5              1                5
2              7              4                14
3              10             9                30
4              12             16               48
5              18             25               90
6              25             36               150
$\sum X = 21$  $\sum Y = 79$  $\sum X^2 = 91$  $\sum XY = 337$


Thus,
79 = 7𝑎 + 21𝑏 (2)
337 = 21𝑎 + 91𝑏 (3)
Solving Eqs. (2) & (3), we have
𝑎 = 0.571 and 𝑏 = 3.571
Therefore, equation of the straight line is 𝑦 = 0.571 + 3.571 𝑥
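
The same fit can be reproduced in Python by solving the two normal equations in closed form. This is a minimal sketch; the function name `fit_line` is chosen for this example.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x obtained from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Eliminating a from the normal equations gives b, then a follows from the first one.
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b

X = [0, 1, 2, 3, 4, 5, 6]
Y = [2, 5, 7, 10, 12, 18, 25]
a, b = fit_line(X, Y)
print(f"y = {a:.3f} + {b:.3f} x")
```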

2. Wheat is grown on 9 equal-sized plots. The amount of fertilizer put on each plot is given in the
following table along with the yield of wheat. Fit a second-degree polynomial curve of the form
$y = a + bx + cx^2$ and also fit a linear regression line.
wheat X:        2.3  3.4  4.4  5.1  5.5  5.2  4.9  4.4  3.9
fertilizer Y:   1.2  2.3  3.3  4.1  4.8  5.0  5.5  6.1  6.9

The normal equations for the second-degree curve, with $n$ the number of data points, are
$\sum Y = na + b\sum X + c\sum X^2$;
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$;
and $\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$.

Solution:

$X$    $Y$    $XY$     $X^2$    $X^2 Y$    $X^3$      $X^4$
2.3    1.2    2.76     5.29     6.348      12.167     27.9841
3.4    2.3    7.82     11.56    26.588     39.304     133.6336
4.4    3.3    14.52    19.36    63.888     85.184     374.8096
5.1    4.1    20.91    26.01    106.641    132.651    676.5201
5.5    4.8    26.40    30.25    145.200    166.375    915.0625
5.2    5.0    26.00    27.04    135.200    140.608    731.1616
4.9    5.5    26.95    24.01    132.055    117.649    576.4801
4.4    6.1    26.84    19.36    118.096    85.184     374.8096
3.9    6.9    26.91    15.21    104.949    59.319     231.3441

$\sum X = 39.10$, $\sum Y = 39.2$, $\sum XY = 179.11$, $\sum X^2 = 178.09$, $\sum X^2 Y = 838.965$,
$\sum X^3 = 838.441$, $\sum X^4 = 4041.8053$

39.2 = 9a + 39.10 b + 178.09 c (1)

179.11 = 39.10 a + 178.09 b + 838.441 c (2)

838.965 = 178.09 a + 838.441 b + 4041.8053 c (3)

(1)×39.10 − (2)×9 => 1532.72 − 1611.99 = (1528.81 − 1602.81) b + (6963.319 − 7545.969) c

or, 79.27 = 74.00 b + 582.65 c
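
Solving three normal equations by hand is error-prone, so it can help to check the elimination numerically. The Python sketch below builds the normal equations for $y = a + bx + cx^2$ from the data and solves them by Gaussian elimination with partial pivoting; the helper names `solve_linear_system` and `fit_quadratic` are assumptions of this example, and the X and Y values are taken from the problem exactly as labelled there.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_quadratic(xs, ys):
    """Coefficients (a, b, c) of the least-squares curve y = a + b*x + c*x^2."""
    n = len(xs)
    S = lambda p: sum(x ** p for x in xs)                      # sum of X^p
    T = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))    # sum of X^p * Y
    A = [[n,    S(1), S(2)],
         [S(1), S(2), S(3)],
         [S(2), S(3), S(4)]]
    return solve_linear_system(A, [T(0), T(1), T(2)])

X = [2.3, 3.4, 4.4, 5.1, 5.5, 5.2, 4.9, 4.4, 3.9]
Y = [1.2, 2.3, 3.3, 4.1, 4.8, 5.0, 5.5, 6.1, 6.9]
a, b, c = fit_quadratic(X, Y)
print(f"y = {a:.4f} + {b:.4f} x + {c:.4f} x^2")
```

A quick way to validate the result is to substitute a, b and c back into the three normal equations and confirm that both sides agree.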


Moments, Skewness and Kurtosis

Moments:

Skewness: Skewness is the degree of asymmetry, or departure from symmetry of the distribution.

Kurtosis:

Correlation

Hypothesis Test


Probability (Lecture 01)

Probability: The probability of an event is a measure of the chance that the event will occur as a result of an
experiment.

Mathematically, when all outcomes of the experiment are equally likely, the probability of an event $A$ is denoted
by $P(A)$ and defined by

$$P(A) = \frac{\text{Number of outcomes favourable to } A}{\text{Total number of outcomes of the experiment}} = \frac{n(A)}{n(S)}.$$

Experiment: An experiment is a process leading to one or more possible observations.

Sample Space: A set S that consists of all possible outcomes of a random experiment is called a sample space,
and each outcome is called a sample point.

Examples: If we toss two coins at a time, the sample space is given by

S= {HH, HT, TH, TT}.

Sample space in tree diagram: (Home work)

Event: An event is a subset A of the sample space S, i.e., it is a set of possible outcomes. If the outcome of an
experiment is an element of A, we say that the event A has occurred. An event consisting of a single point of S is
often called a simple or elementary event.

Example:

Referring to the experiment of tossing a coin twice, let A be the event “at least one head occurs” and B the event
“the second toss results in a tail.” Then A = {HT, TH, HH}, B = {HT, TT}, and so we have

Sample space S= {HH, HT, TH, TT}

𝐴 ∪ 𝐵 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}

𝐴 ∩ 𝐵 = {𝐻𝑇}

𝐴 − 𝐵 = {𝐻𝐻, 𝑇𝐻}

𝐴′ = 𝑆 − 𝐴 = {𝑇𝑇}
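
Because events are simply subsets of the sample space, the example above maps directly onto Python's built-in set type. The sketch below is illustrative; the variable names are chosen for this example.

```python
from itertools import product

# Sample space for tossing a coin twice: HH, HT, TH, TT
S = {"".join(t) for t in product("HT", repeat=2)}

A = {s for s in S if "H" in s}        # at least one head occurs
B = {s for s in S if s[1] == "T"}     # the second toss results in a tail

union = A | B                         # A ∪ B
intersection = A & B                  # A ∩ B
difference = A - B                    # A − B
complement = S - A                    # A′

print(sorted(union), sorted(intersection), sorted(difference), sorted(complement))
```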

Mutually exclusive events: Two events are said to be mutually exclusive if the occurrence of one of them
excludes the occurrence of the other.

If $A$ and $B$ are two mutually exclusive events, then $A \cap B = \Phi$ and hence $P(A \cap B) = 0$.


Axioms of probability:

i) For every event $A$, $P(A) \ge 0$.
ii) For the sure or certain event $S$, $P(S) = 1$.
iii) For any number of mutually exclusive events $A_1, A_2, A_3, \dots$,
$P(A_1 \cup A_2 \cup A_3 \cup \dots) = P(A_1) + P(A_2) + P(A_3) + \dots$
In particular, for two mutually exclusive events, $P(A_1 \cup A_2) = P(A_1) + P(A_2)$.

Probability (Lecture 02)

Some important theorems on probability:

i) If $A_1 \subseteq A_2$, then $P(A_1) \le P(A_2)$ and $P(A_2 - A_1) = P(A_2) - P(A_1)$.
ii) For every event $A$, $0 \le P(A) \le 1$, i.e., probability lies between 0 and 1.
iii) $P(\Phi) = 0$, i.e., the probability of an impossible event is 0 (zero).
iv) If $A'$ is the complement of $A$, then $P(A') = 1 - P(A)$.
v) If $A = A_1 \cup A_2 \cup A_3 \cup \dots \cup A_n$, where $A_1, A_2, A_3, \dots, A_n$ are mutually exclusive events,
then $P(A) = P(A_1) + P(A_2) + P(A_3) + \dots + P(A_n)$.
vi) For two events $A$ and $B$, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
vii) If an event $A$ must result in the occurrence of one of the mutually exclusive events
$A_1, A_2, A_3, \dots, A_n$, then $P(A) = P(A \cap A_1) + P(A \cap A_2) + \dots + P(A \cap A_n)$.

Problem 1: If a coin is tossed, what is the probability of getting a head?

Solution: If a coin is tossed, the sample space is 𝑆 = {𝐻, 𝑇}. Suppose 𝐴 is the event of getting a head.
Thus 𝐴 = {𝐻}.

Now 𝑃(𝐴) = 𝑛(𝐴)/𝑛(𝑆) = 1/2.

Problem 2: If 10 of 500 randomly selected cars manufactured at a certain factory are found to be defective,
what is the probability that a randomly chosen car will be defective?

Solution: Here the total number of cars is 𝑛(𝑆) = 500 and the number of defective cars is 𝑛(𝐴) = 10.

Thus 𝑃(𝐴) = 𝑛(𝐴)/𝑛(𝑆) = 10/500 = 1/50.

Problem 3: A single die is tossed once. Find the probability of a 2 or 5 turning up.


Solution: If a die is tossed, the sample space is 𝑆 = {1, 2, 3, 4, 5, 6}. Suppose A = ‘turning up 2’ and
B = ‘turning up 5’.

Since A and B are mutually exclusive, the addition law of probability gives

𝑃(𝐴 or 𝐵) = 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) = 1/6 + 1/6 = 2/6 = 1/3.

Probability (Lecture 03)

Problem 04: Two coins are tossed. “A” is the event of “getting two heads” and “B” is the event that “the
second coin shows head”. Evaluate 𝑃(𝐴 ∪ 𝐵).

Solution: If two coins are tossed once, the sample space can be written as 𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}.
Let A = the event of “getting two heads” = {𝐻𝐻} and
B = the event that “the second coin shows head” = {𝐻𝐻, 𝑇𝐻}.
𝐴 ∩ 𝐵 = {𝐻𝐻} ∩ {𝐻𝐻, 𝑇𝐻} = {𝐻𝐻} ≠ {}, which implies that A and B are not mutually exclusive.
For events that are not mutually exclusive,
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
or, 𝑃(𝐴 ∪ 𝐵) = 𝑛(𝐴)/𝑛(𝑆) + 𝑛(𝐵)/𝑛(𝑆) − 𝑛(𝐴 ∩ 𝐵)/𝑛(𝑆)
or, 𝑃(𝐴 ∪ 𝐵) = 1/4 + 2/4 − 1/4 = 2/4 = 1/2.
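The addition rule used in this solution can be verified by counting outcomes on both sides. This is a small Python sketch under the classical (equally likely outcomes) assumption; the helper name prob is hypothetical.

```python
from fractions import Fraction

S = {"HH", "HT", "TH", "TT"}   # two coins tossed once
A = {"HH"}                     # getting two heads
B = {"HH", "TH"}               # the second coin shows head

def prob(event, space):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(space))

# Left side counts the union directly; right side applies the addition rule.
lhs = prob(A | B, S)
rhs = prob(A, S) + prob(B, S) - prob(A & B, S)
print(lhs, rhs)
```

Exact fractions are used instead of floats so that the two sides can be compared without rounding error.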

Problem 05: Two coins are tossed. “A” is the event of “getting two tails” and “B” is the event that “the
second coin shows head”. Evaluate 𝑃(𝐴 ∪ 𝐵). (Do it now, Answer: 3/4)
Problem 06: A die is tossed once. Assume “A” is the event of getting “an odd number” and “B” is the event of
getting “a number divisible by 3”. Calculate 𝑃(𝐴 ∪ 𝐵). (HW, Answer: 2/3)

Conditional Probability: Let 𝐴 and 𝐵 be two events such that 𝑃(𝐴) > 0. The probability of event 𝐵 given that
event 𝐴 has occurred is known as the conditional probability of 𝐵 given 𝐴. It is denoted by 𝑃(𝐵\𝐴). Since 𝐴 is
known to have occurred, it becomes the new sample space replacing the original 𝑆.

From this we are led to the definition

𝑃(𝐵\𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴)    (1)

or, 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐵\𝐴)𝑃(𝐴)    (2)
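As an illustration of definition (1), the sketch below takes a fair die, lets A be "an odd number turns up" and B be "a number less than 4 turns up", and compares the unconditional probability of B with the conditional one. The choice of events and the variable names are assumptions made for this example.

```python
from fractions import Fraction

S = set(range(1, 7))               # faces of a fair die
A = {x for x in S if x % 2 == 1}   # given: the toss resulted in an odd number
B = {x for x in S if x < 4}        # event of interest: a number less than 4

p_A = Fraction(len(A), len(S))
p_B = Fraction(len(B), len(S))
p_A_and_B = Fraction(len(A & B), len(S))

# Definition (1): conditional probability of B given A.
p_B_given_A = p_A_and_B / p_A
print(p_B, p_B_given_A)
```

Because A becomes the new sample space, the same value is obtained by counting the outcomes of A ∩ B among the outcomes of A.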


Problems Related to Conditional probability:

Problem 1: Compute the probability that a single toss of a die will result in a number less than 4 if (a) no other
information is given and (b) it is given that the toss resulted in an odd number.

Solution:

Problem 2: Compute the probability that a single toss of a die will result in a number less than 5 if (a) no other
information is given and (b) it is given that the toss resulted in an even number. (HW)


Probability (Lecture 04)

Problem 03: A study showed that 65% of the managers had some business education background and 50% of
the managers had an engineering education background. Furthermore, 20% of the managers had some business
education but no engineering education. Calculate the probability that a manager has some business education,
given that the manager has some engineering education.

Solution: Let A be the event that a manager has a business education background, and let B be the event that a
manager has an engineering education background. We have to find 𝑃(𝐴\𝐵).

We know, 𝑃(𝐴\𝐵) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐵)    (1)

𝑃(𝐴) = 65% = 65/100 = 13/20
𝑃(𝐵) = 50% = 50/100 = 1/2
𝑃(𝐴 ∩ 𝐵) = 65% − 20% = 45% = 45/100 = 9/20.

Now, from Eq. (1), we have

𝑃(𝐴\𝐵) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐵) = (9/20)/(1/2) = 9/20 × 2/1 = 9/10.

Problem 04: A box contains 10 green and 8 red balls. Two balls are drawn successively at random from the box.
Find the probability that both balls will be red, considering both cases with and without replacement of the balls
during drawing.

Solution: The box contains 10 green and 8 red balls, so the total number of balls is 10 + 8 = 18. If two balls are
drawn successively, two cases arise.

Case I (with replacement): The probability that the first ball is red is 8/18. If the ball is returned to the box,
the number of red balls and the total number of balls remain the same. Thus, the probability that the second ball
is red is also 8/18. Therefore, the probability that both balls are red is 8/18 × 8/18 = 64/324 = 16/81.

Case II (without replacement): The probability that the first ball is red is 8/18. If the ball is not returned to the
box, the number of red balls becomes 8 − 1 = 7 and the total number of balls becomes 18 − 1 = 17. Thus, the
probability that the second ball is red is 7/17. Therefore, the probability that both balls are red is
8/18 × 7/17 = 56/306 = 28/153.
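The two cases differ only in whether the counts change after the first draw. A short Python sketch of that arithmetic follows; the variable names are illustrative.

```python
from fractions import Fraction

green, red = 10, 8
total = green + red

# Case I: the first ball is returned, so the second draw sees the same box.
with_replacement = Fraction(red, total) * Fraction(red, total)

# Case II: the first red ball stays out, so one red ball and one ball overall are gone.
without_replacement = Fraction(red, total) * Fraction(red - 1, total - 1)

print(with_replacement)     # 16/81
print(without_replacement)  # 28/153
```

Fraction reduces automatically, which is why 56/306 is printed in its lowest terms.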


Problem 05: A box contains 2 red and 3 blue marbles. Find the probability that if two marbles are drawn at
random successively (without replacement), (a) both are blue, (b) both are red, (c) the first marble is red and the
second is blue. (HW)

Problem 06: Find the probability of drawing 3 aces at random, successively, from a deck of 52 ordinary cards if
the cards are (a) replaced, (b) not replaced. (HW)

Problem 07: Two cards are drawn from a well-shuffled ordinary deck of 52 cards, successively. Find the
probability that they are both aces if the first card is (a) replaced, (b) not replaced. (HW)

Probability (Lecture 05)

Problem 08: Find the probability of getting a total of 7 or 11 when a pair of fair dice is tossed once.

Solution: If two dice are tossed at random, the sample space is

𝑆 = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

Let A be the event of getting a total 7 and B be the event of getting a total 11.


𝐴 = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)} and 𝐵 = {(5,6), (6,5)}.

We have to find 𝑃(𝐴 ∪ 𝐵). As the events A and B are mutually exclusive (i.e., 𝐴 ∩ 𝐵 = {}), the addition law
of probability gives 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)

or, 𝑃(𝐴 ∪ 𝐵) = 𝑛(𝐴)/𝑛(𝑆) + 𝑛(𝐵)/𝑛(𝑆)

or, 𝑃(𝐴 ∪ 𝐵) = 6/36 + 2/36 = 8/36 = 2/9.
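The counts n(A) = 6, n(B) = 2 and n(S) = 36 can be reproduced by enumerating all ordered pairs of faces. A minimal Python sketch, with illustrative names:

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes for a pair of dice.
S = list(product(range(1, 7), repeat=2))

A = [roll for roll in S if sum(roll) == 7]    # total of 7
B = [roll for roll in S if sum(roll) == 11]   # total of 11

# A and B cannot happen together, so their probabilities simply add.
p_A_or_B = Fraction(len(A), len(S)) + Fraction(len(B), len(S))
print(len(A), len(B), p_A_or_B)
```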

Problem 09: Find the probability of getting a 5 on a face or a total greater than 10 on either of two tosses of a
pair of fair dice. (HW)

Independent event: If 𝑃(𝐵\𝐴) = 𝑃(𝐵), i.e., the probability of 𝐵 occurring is not affected by the occurrence or
non-occurrence of 𝐴, then we say that 𝐴 and 𝐵 are independent events. This is equivalent to 𝑃(𝐴 ∩ 𝐵) =
𝑃(𝐴)𝑃(𝐵).
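The product condition gives a direct test for independence. The sketch below applies it to two tossed coins; the events and the helper names prob and independent are chosen here only for illustration.

```python
from fractions import Fraction

S = {"HH", "HT", "TH", "TT"}   # two coins tossed once

def prob(event):
    """Classical probability on the two-coin sample space."""
    return Fraction(len(event), len(S))

def independent(A, B):
    """A and B are independent when P(A and B) equals P(A) times P(B)."""
    return prob(A & B) == prob(A) * prob(B)

first_head = {"HH", "HT"}
second_head = {"HH", "TH"}
two_heads = {"HH"}

print(independent(first_head, second_head))  # the coins do not influence each other
print(independent(two_heads, second_head))   # two heads forces the second head
```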

Bayes’ theorem: Suppose that 𝐴1, 𝐴2, ..., 𝐴𝑛 are mutually exclusive events with 𝑃(𝐴𝑖) ≠ 0 whose union is the
sample space 𝑆. Then for an event 𝐴 that occurs when the experiment is performed, such that 𝑃(𝐴) > 0, we have

𝑃(𝐴𝑖/𝐴) = 𝑃(𝐴𝑖)𝑃(𝐴/𝐴𝑖) / ∑ 𝑃(𝐴𝑗)𝑃(𝐴/𝐴𝑗), where the sum runs over 𝑗 = 1, 2, ..., 𝑛.

Problems related to Bayes’ theorem:

Problem 1: There are three identical boxes containing 4 white and 3 red balls, 3 white and 7 red balls, and 5 white
and 4 red balls, respectively. A box is chosen at random and a ball is drawn from it. If the ball is white, find the
probability that it came from the first box.
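One way to work this problem is to apply Bayes' theorem numerically. The sketch below assumes that "chosen at random" means each box has prior probability 1/3; the variable names are illustrative and this is a check of the arithmetic rather than the notes' own solution.

```python
from fractions import Fraction

# (white, red) counts for the three boxes, in the order given.
boxes = [(4, 3), (3, 7), (5, 4)]

# Assumption: each box is equally likely to be chosen.
prior = Fraction(1, len(boxes))

# Probability of drawing a white ball from each box.
likelihoods = [Fraction(w, w + r) for w, r in boxes]

# Denominator of Bayes' theorem: total probability of drawing a white ball.
evidence = sum(prior * lk for lk in likelihoods)

# Posterior probability of each box given that the ball is white.
posterior = [prior * lk / evidence for lk in likelihoods]
print(posterior[0])  # 360/899
```

The three posterior values add up to 1, which is a quick sanity check on the computation.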

Probability Distribution (Lecture 06)

Random Variable: Suppose we have a sample space 𝑆 and we assign a number to each sample point. This
defines a function on the sample space. This function is called a random variable or, more precisely, a random
function (stochastic function). It is usually denoted by a capital letter such as 𝑋 or 𝑌.

Example:


Probability distribution:

Probability density function:

Mathematical Expectation

Definition: The mathematical expectation (also called the expected value or expectation) of a discrete
random variable 𝑋 with possible values 𝑥1, 𝑥2, 𝑥3, ..., 𝑥𝑛 is denoted by 𝐸(𝑋) and defined as

𝐸(𝑋) = ∑ 𝑥𝑖 𝑃(𝑋 = 𝑥𝑖), summed over 𝑖 = 1, 2, ..., 𝑛, where 𝑃(𝑋 = 𝑥𝑖) is the probability distribution function.

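If the random variable is continuous with probability density function 𝑓(𝑥), then 𝐸(𝑋) = ∫ 𝑥 𝑓(𝑥) 𝑑𝑥, where the
integral is taken over −∞ < 𝑥 < ∞.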


Some theorems:

1. If 𝑋 is a random variable and 𝑐 is a constant, then 𝐸(𝑐𝑋) = 𝑐𝐸(𝑋).
2. If 𝑋 and 𝑌 are two random variables, then 𝐸(𝑋 + 𝑌) = 𝐸(𝑋) + 𝐸(𝑌).
3. If 𝑋 and 𝑌 are two independent random variables, then 𝐸(𝑋𝑌) = 𝐸(𝑋)𝐸(𝑌).
4. If 𝑎 and 𝑏 are two constants and 𝑋 is a random variable, then 𝐸(𝑎𝑋 + 𝑏) = 𝑎𝐸(𝑋) + 𝑏.

Problems:

1. Let 𝑋 be a random variable with the given probability distribution


𝑋 0 1 2 3
𝑃(𝑋 = 𝑥) 0.2 0.4 0.25 0.15
Find the expected value or mathematical expectation or expectation.

Solution: For random variable X it is given that

𝑋 0 1 2 3
𝑃(𝑋 = 𝑥) 0.2 0.4 0.25 0.15
We know that the expected value (mathematical expectation) is
𝐸(𝑋) = ∑ 𝑥𝑖 𝑃(𝑋 = 𝑥𝑖), summed over 𝑖 = 1, 2, ..., 𝑛
or, 𝐸(𝑋) = 𝑥1𝑃(𝑥1) + 𝑥2𝑃(𝑥2) + 𝑥3𝑃(𝑥3) + 𝑥4𝑃(𝑥4)
or, 𝐸(𝑋) = 0(0.2) + 1(0.4) + 2(0.25) + 3(0.15)
or, 𝐸(𝑋) = 0 + 0.4 + 0.50 + 0.45
or, 𝐸(𝑋) = 1.35
2. Let 𝑋 be a random variable with probability distribution function as follows:

𝑋 0 1 2 3
𝑓(𝑋 = 𝑥) 1/3 1/2 0 1/6
Find the values of 𝐸(𝑋), 𝐸(𝑎𝑋 + 𝑏) and 𝐸((𝑋 − 1)²).
Solution: For the random variable 𝑋

𝑋 0 1 2 3
𝑓(𝑋 = 𝑥) 1/3 1/2 0 1/6

We know, 𝐸(𝑋) = 𝑥1𝑓(𝑥1) + 𝑥2𝑓(𝑥2) + 𝑥3𝑓(𝑥3) + 𝑥4𝑓(𝑥4)
or, 𝐸(𝑋) = 0 × 1/3 + 1 × 1/2 + 2 × 0 + 3 × 1/6
or, 𝐸(𝑋) = 0 + 1/2 + 1/2 = 1

Again 𝐸(𝑎𝑋 + 𝑏) = (𝑎𝑥1 + 𝑏)𝑓(𝑥1) + (𝑎𝑥2 + 𝑏)𝑓(𝑥2) + (𝑎𝑥3 + 𝑏)𝑓(𝑥3) + (𝑎𝑥4 + 𝑏)𝑓(𝑥4)
or, 𝐸(𝑎𝑋 + 𝑏) = (𝑎 × 0 + 𝑏) × 1/3 + (𝑎 × 1 + 𝑏) × 1/2 + (𝑎 × 2 + 𝑏) × 0 + (𝑎 × 3 + 𝑏) × 1/6
or, 𝐸(𝑎𝑋 + 𝑏) = (1/3)𝑏 + (1/2)𝑎 + (1/2)𝑏 + (1/2)𝑎 + (1/6)𝑏
or, 𝐸(𝑎𝑋 + 𝑏) = 𝑎 + 𝑏

Further, 𝐸((𝑋 − 1)²) = (𝑥1 − 1)²𝑓(𝑥1) + (𝑥2 − 1)²𝑓(𝑥2) + (𝑥3 − 1)²𝑓(𝑥3) + (𝑥4 − 1)²𝑓(𝑥4)
or, 𝐸((𝑋 − 1)²) = (0 − 1)² × 1/3 + (1 − 1)² × 1/2 + (2 − 1)² × 0 + (3 − 1)² × 1/6
or, 𝐸((𝑋 − 1)²) = 1 × 1/3 + 0 × 1/2 + 1 × 0 + 2² × 1/6
or, 𝐸((𝑋 − 1)²) = 1/3 + 4/6

or, 𝐸((𝑋 − 1)²) = 1
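The three expectations in this problem all have the form "sum of g(x) times f(x)". A short Python sketch of that pattern follows; the helper name expect and the sample constants a = 3, b = 1 are hypothetical choices used only to check the result 𝐸(𝑎𝑋 + 𝑏) = 𝑎 + 𝑏.

```python
from fractions import Fraction

# Probability function of X from the table: value -> probability.
f = {0: Fraction(1, 3), 1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 6)}

def expect(g, dist):
    """Expectation of g(X): sum of g(x) * f(x) over all values x."""
    return sum(g(x) * p for x, p in dist.items())

e_x = expect(lambda x: x, f)               # E(X)
e_sq = expect(lambda x: (x - 1) ** 2, f)   # E((X - 1)^2)

a, b = 3, 1                                # hypothetical constants
e_lin = expect(lambda x: a * x + b, f)     # E(aX + b)

print(e_x, e_sq, e_lin)
```

Since 𝐸(𝑋) = 1 here, the linear case agrees with the theorem 𝐸(𝑎𝑋 + 𝑏) = 𝑎𝐸(𝑋) + 𝑏 for any choice of the constants.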

3. A random variable X has a probability function as given below (HW)


Value of 𝑋: 𝑥 -3 -2 -1 0 1 2 3
𝑃(𝑋 = 𝑥) = 𝑓(𝑥) 0.1 0.15 0.25 0.125 0.35 0.25 0.2
Find (𝑖) 𝐸(𝑋), (𝑖𝑖) 𝐸(𝑋²) and (𝑖𝑖𝑖) 𝐸(3𝑋 + 1).

Lecture 07 (Mathematical Expectation)

Lecture 08-09 (Binomial Distribution)

Definition:


Problem 03: The mean of a binomial distribution is 20 and the standard deviation is 4. Compute n, p, q.
Solution: It is given that the mean and standard deviation of a binomial distribution are 20 and 4, respectively.
For a binomial distribution, the mean is 𝜇 = 𝑛𝑝 and the standard deviation is 𝜎 = √(𝑛𝑝𝑞).
Thus 𝑛𝑝 = 20.
And √(𝑛𝑝𝑞) = 4
or, 𝑛𝑝𝑞 = 16
or, 20𝑞 = 16
or, 𝑞 = 16/20
or, 𝑞 = 4/5.

For a binomial distribution, 𝑝 + 𝑞 = 1, or, 𝑝 = 1 − 𝑞 = 1 − 4/5 = 1/5.
Again 𝑛 = 20/𝑝 = 20/(1/5) = 20 × 5 = 100.
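The same three steps (variance over mean gives q, then p = 1 − q, then n = mean/p) can be written as a small function. This is a sketch with a hypothetical function name; it assumes an integer or exact standard deviation is supplied.

```python
from fractions import Fraction

def binomial_parameters(mean, std_dev):
    """Recover n, p, q from mean = n*p and std_dev = sqrt(n*p*q)."""
    variance = Fraction(std_dev) ** 2
    mean = Fraction(mean)
    # For a binomial distribution the variance n*p*q is smaller than the mean n*p.
    if not 0 < variance < mean:
        raise ValueError("need 0 < variance < mean for a binomial distribution")
    q = variance / mean   # n*p*q divided by n*p
    p = 1 - q
    n = mean / p
    return n, p, q

n, p, q = binomial_parameters(20, 4)
print(n, p, q)  # 100 1/5 4/5
```

If the returned n is not a whole number, the given mean and standard deviation do not correspond to an exact binomial distribution.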

Problem 04: The mean of a binomial distribution is 40 and the standard deviation is 5. Compute n, p, q. (HW)

Problem 05: The mean of a binomial distribution is 40 and the variance is 20. Compute n, p, q. (HW)


Normal Distribution

Definition:


Poisson Distribution

Definition: The Poisson distribution is given by 𝑓(𝑥) = 𝑃(𝑋 = 𝑥) = 𝑒^(−𝑚) 𝑚^𝑥 / 𝑥!; 𝑥 = 0, 1, 2, ..., where 𝑚 is called
the parameter of the distribution and is the average number of occurrences of the random event.

Mean: 𝜇 = 𝑚; variance: 𝜎² = 𝑚
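The statement that the mean and the variance both equal 𝑚 can be checked numerically. The sketch below uses a hypothetical parameter value 𝑚 = 2 and truncates the infinite sum at 50 terms, which is an approximation whose neglected tail is negligible for this value.

```python
import math

def poisson_pmf(x, m):
    """Poisson probability P(X = x) = e^(-m) * m^x / x!."""
    return math.exp(-m) * m ** x / math.factorial(x)

m = 2.0  # hypothetical parameter chosen for illustration

# Truncate the support at 50 terms; the remaining tail is negligible for m = 2.
probs = [poisson_pmf(x, m) for x in range(50)]

total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))
variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))

print(round(total, 6), round(mean, 6), round(variance, 6))
```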

Test of hypothesis

𝑋 0 1 2 3 4
𝑃(𝑋 = 𝑥) 0.1 0.35 0.25 0.15 0.15

If 𝐴 and 𝐵 are two mutually exclusive events then 𝑃(𝐴 ∩ 𝐵) is __.
