
Statistics and Probability For CDW


WATER INSTITUTE

Department of General Studies


Lecture Notes
Statistics and Probability

A) Statistics
i) Introduction
a) What is statistics?
Statistics is a field of study concerned with collecting, summarising/organising,
analysing, presenting, interpreting data and making decisions based on data. There are
two basic forms: descriptive statistics and inferential statistics.
 Descriptive Statistics is primarily about summarizing a given data set through
numerical summaries and graphs and can be used for exploratory analysis to
visualize the information contained in the data and suggest hypotheses etc.
 Inferential Statistics is concerned with methods for making conclusions about a
population using information from a sample and assessing the reliability of, and
uncertainty in, these conclusions. This allows us to make judgements in the
presence of uncertainty and variability, which is extremely important in
underpinning evidence-based decision-making in science, government, business
etc.

b) What is Population?
A population is the collection of all individuals or items under consideration in the study.
It is the entire set of possible observations in which we are interested. For example, consider
the following populations together with corresponding variables of interest:
 All adults in Tanzania who are eligible to vote; the variable of interest is the
political party supported.
 Car batteries of a particular type manufactured by a particular company; the
variable of interest is the lifetime of the battery before failure.
 All adult males working full-time at Water Institute; the variable of interest is the
person’s gross income.

 All possible outcomes of a planned laboratory experiment; the variable
of interest is the value of a particular measurement.

Gathering all data is not always possible due to barriers such as time, accessibility, or
cost. Instead, we often gather information from a smaller subset of the population,
known as a sample.

c) What is Sample?
The sample is a subset of the population from which information is actually collected.
We then use the characteristics of the sample to estimate the characteristics of the
population. In order for this procedure to give a good estimate, the sample must be
representative of the population. Otherwise, if an unrepresentative or ‘biased’ sample is
used the conclusions will be systematically incorrect.

 Sampling Techniques
There are two techniques of sampling: Probability and Nonprobability Sampling
Techniques.
Probability sampling is a technique in which every unit in the population has a chance
(greater than zero) of being selected in the sample, and this probability can be
accurately determined. The combination of these traits makes it possible to produce
unbiased estimates of population totals, by weighting sampled units according to their
probability of selection.

Nonprobability sampling is any sampling method where some elements of the
population have no chance of selection (these are sometimes referred to as 'out of
coverage'/'under covered'), or where the probability of selection can't be accurately
determined. It involves the selection of elements based on assumptions regarding the
population of interest, which forms the criteria for selection. Hence, because the
selection of elements is nonrandom, nonprobability sampling does not allow the
estimation of sampling errors. These conditions give rise to exclusion bias, placing
limits on how much information a sample can provide about the population.
Information about the relationship between the sample and the population is limited,
making it difficult to extrapolate from the sample to the population.
d) Variables
Variables are properties or characteristics of some event, object, or person that can take
on different values or amounts.
Variables may be:
 Independent or Dependent: The experimenter manipulates the independent
variable and its effects on the dependent variable are measured.
 Discrete or Continuous:
Discrete variables can take only certain values. For example, a household could
have three children or six children, but not 4.53 children.
Continuous variables can take any value within the range of the scale. For
example, “time to respond to a question” is a continuous variable since the
scale is continuous and not made up of discrete steps; the response time
could be, say, 1.64 seconds.
 Qualitative or Quantitative:
Qualitative variables are those that express a qualitative attribute such as hair
colour, eye colour, religion, favourite movie, gender, and so on.
The values of a qualitative variable do not imply a numerical ordering.
Quantitative variables are those variables that are measured in terms of
numbers. Some examples of quantitative variables are height, weight, and shoe
size.

ii) Statistical Graphs


(a) Ungrouped Data
 Bar Chart
A bar chart (bar graph, column chart) plots numeric values for levels of a categorical feature
as bars. Levels are plotted on one chart axis, and values are plotted on the other axis. Each
categorical value claims one bar, and the length of each bar corresponds to the bar’s value.
Bars are plotted on a common baseline to allow for easy comparison of values.

For example, the following plot counts page views over a period of six months. You can see
from this visualization that there was a small peak in June and July before returning to the
previous baseline.

 Line Chart
The line chart is a simple, two-dimensional chart with an X-axis and a Y-axis, in which each point
represents a single value. The data points are joined by a line to depict a trend, usually over time.
The horizontal axis depicts a continuous progression often that of time, while the vertical
axis reports values for a metric of interest across that progression.
In the experimental sciences, data collected from experiments are often visualized by a
graph. For example, if one collects data on the speed of an object at certain points in time,
one can visualize the data and represent them in a line chart as follows.

Multiple lines can also be plotted in a single-line chart to compare the trend between series.
A common use case for this is to observe the breakdown of the data across different
subgroups. The ability to plot multiple lines also provides the line chart a special use case
where it might not usually be selected. Normally, we would use a histogram to depict the
frequency distribution of a single numeric variable. However, since it’s tricky to plot two
histograms on the same set of axes, the line chart serves as a good mode of comparison as a
substitute. Line charts used to depict frequency distributions are often called frequency
polygons.

 Pie Chart
The “pie chart”, also known as a “circle chart”, is a circular statistical graphic divided into
sectors or sections. Each sector denotes a proportionate part of the whole. A pie chart works
best when the aim is to show the composition of something. In such cases, it can take the
place of other graphs such as bar graphs, line plots, and histograms.

Formula
The pie chart is an important type of data representation. It is divided into sectors, and
each sector represents a specific portion (percentage) of the total. The central angles of all
the sectors add up to 360°, and the total value of the pie is always 100%.
To work out the central angles for a pie chart, follow the steps given below:
 Categorize the data
 Calculate the total
 Divide the value of each category by the total
 Finally, multiply each result by 360° to obtain the degrees
 You may also convert the results into percentages for easy interpretation

Therefore, the pie chart formula is given as:

$$\text{Central angle} = \frac{\text{Given data}}{\text{Total value of data}} \times 360^\circ$$

Example

The percentages of various crops cultivated in a village of a particular district are given in the
following table.

Items                 Wheat   Pulses   Jowar   Groundnuts   Vegetables   Total
Percentage of crops   125/3   125/6    25/2    50/3         25/3         100

Represent this information using a pie chart.

Solution:

$$\text{Central angle} = \frac{\text{Given data}}{\text{Total value of data}} \times 360^\circ$$

The central angle for each category is calculated as follows:

Items        Percentage of crops   Central angle
Wheat        125/3                 [(125/3)/100] × 360° = 150°
Pulses       125/6                 [(125/6)/100] × 360° = 75°
Jowar        25/2                  [(25/2)/100] × 360° = 45°
Groundnuts   50/3                  [(50/3)/100] × 360° = 60°
Vegetables   25/3                  [(25/3)/100] × 360° = 30°
Total        100                   360°
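
The central angles in the table can be checked with a short script. This is only a sketch: the dictionary `crop_percentages` and the function `central_angles` are names introduced here for illustration, and exact fractions are used so that no rounding error appears.

```python
from fractions import Fraction

# Crop percentages copied from the table (kept as exact fractions).
crop_percentages = {
    "Wheat": Fraction(125, 3),
    "Pulses": Fraction(125, 6),
    "Jowar": Fraction(25, 2),
    "Groundnuts": Fraction(50, 3),
    "Vegetables": Fraction(25, 3),
}


def central_angles(values):
    """Apply (given data / total value of data) * 360 to every category."""
    total = sum(values.values())
    return {name: value / total * 360 for name, value in values.items()}


angles = central_angles(crop_percentages)
for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

The angles should add up to 360°, which is a quick way to confirm that no category has been left out.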


Now, the pie-chart can be constructed by using the given data.

Steps to construct:

Step 1: Draw the circle of an appropriate radius.

Step 2: Draw a vertical radius anywhere inside the circle.

Step 3: Choose the largest central angle. Construct a sector of a central angle, whose one
radius coincides with the radius drawn in step 2, and the other radius is in the clockwise
direction to the vertical radius.

Step 4: Construct other sectors representing other values in the clockwise direction in
descending order of magnitudes of their central angles.

Step 5: Shade the sectors obtained by different colours and label them as shown in the figure
below.

When should you use a pie chart?


Pie charts have a fairly narrow use case that is encapsulated particularly well by its definition.
In order to use a pie chart, you must have some kind of whole amount that is divided into a
number of distinct parts. Your primary objective in a pie chart should be to compare each
group’s contribution to the whole, as opposed to comparing groups to each other. If the above
points are not satisfied, the pie chart is not appropriate, and a different plot type should be
used instead.
The values that comprise a whole and the categories that divide the whole generally come in
two major varieties. First is when the ‘whole’ represents a total count. Examples of this
include votes in an election divided by candidate, or the number of transactions divided by
user type (e.g. guest, new user and existing user).
A second type of ‘whole’ is when the total is a sum over an actual data variable. For example,
we might be interested not in the number of transactions, but in the monetary total from all
transactions. Dividing this total by an attribute like user type, age bracket, or location might
provide insights as to where the business is most successful.

(b) Grouped Data

Class Size
In statistics, class size refers to the difference between a class’s upper and lower boundaries
in a frequency distribution.

Class size = Actual upper class boundary – Actual lower class boundary

For example, the class size of the overlapping interval 10 – 20

= Actual upper boundary – Actual lower boundary

= 20 – 10

= 10.

Class Interval
Class Interval: A class interval is one of the groups, each of a specific width, into which
the numerical data of a large frequency distribution are divided. For example, if the raw data
vary widely, we organize them into groups of intervals such as 0-10, 10-20, 20-30, etc.
These are known as class intervals.
Upper boundary/Limit: It is the highest value of the class interval. There could be no item
greater than the upper boundary in that particular class. For example, the upper boundary of
30-40 is 40. It is known as the upper-class boundary.
Lower boundary/Limit: It is the lowest value of the class interval. No item could be less
than the lower boundary/limit in that class. For example, the lower boundary/limit of 30-40 is
30. It is known as the lower class boundary.

Class Boundaries
In statistics, class boundaries are endpoints used to separate the data into classes or groups.
The boundary with the lower value is called the lower boundary while the one with a higher
value is called the upper boundary. Class boundaries are typically applied to continuous
datasets.

Class Limit

Corresponding to a class interval, the class limits may be defined as the minimum value and
the maximum value the class interval may contain. The minimum value is known as the
lower-class limit (LCL) and the maximum value is known as the upper-class limit (UCL).

To understand class limits and class boundaries in statistics, let us consider the data recorded
in Table 1 and Table 2 below.
Table 1: Grouped Data Presented in Class Limit and Class Boundary

Table 2: Grouped Data Presented in Class Limit, Frequency and Class Boundary

Class Limit Frequency Class Boundaries

Class Marks

The class mark in a frequency distribution is the midpoint or the middle value of a given
class. For example, the class mark of 10-20 is 15, as 15 is the mid-value that lies between 10
and 20. In statistics, the class mark is used at various places, for example, while calculating
mean, drawing line graphs, finding the average of each class in a frequency distribution, etc.
It is very easy to calculate class marks by using a formula that you will learn in the section
below.

The formula to calculate the class mark in a frequency distribution is (upper limit +
lower limit)/2 or, equivalently, (upper boundary + lower boundary)/2. By using this formula,
you can easily find the midpoint of any given class interval. Let us use the data presented in
Table 2 to determine the class marks.

Table 3: Class Interval with Class Marks


Class Limit   Class Mark from Limit    Frequency   Class Boundaries   Class Mark from Boundaries
56 – 60.9     (60.9 + 56)/2 = 58.45    13          55.95 – 60.95      (55.95 + 60.95)/2 = 58.45
61 – 65.9     63.45                    17          60.95 – 65.95      63.45
66 – 70.9     68.45                    19          65.95 – 70.95      68.45
71 – 75.9     73.45                    14          70.95 – 75.95      73.45
76 – 80.9     78.45                    8           75.95 – 80.95      78.45
Hence, either of the two can be used to compute the class mark.
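
As a rough check of Table 3, the class marks can be computed from both the limits and the boundaries. The list `classes` and the function `class_mark` are illustrative names, not part of the notes; the numbers are the class limits and boundaries from the table.

```python
# Each entry: (lower limit, upper limit, lower boundary, upper boundary)
classes = [
    (56, 60.9, 55.95, 60.95),
    (61, 65.9, 60.95, 65.95),
    (66, 70.9, 65.95, 70.95),
    (71, 75.9, 70.95, 75.95),
    (76, 80.9, 75.95, 80.95),
]


def class_mark(lower, upper):
    """Midpoint of a class: (lower + upper) / 2."""
    return (lower + upper) / 2


for lower_limit, upper_limit, lower_bound, upper_bound in classes:
    from_limits = class_mark(lower_limit, upper_limit)
    from_boundaries = class_mark(lower_bound, upper_bound)
    print(f"{lower_limit} - {upper_limit}: "
          f"{from_limits:.2f} (limits), {from_boundaries:.2f} (boundaries)")
```

Both columns agree for every class, because each boundary is shifted from its limit by the same amount (0.05) in opposite directions.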

Histogram
A histogram is a graph used to represent the frequency distribution of the values of one
variable. Histograms classify data into various “bins” or “range groups” and count how
many data points belong to each of those bins. A histogram can also be defined as a set of
adjacent rectangles whose bases lie along the intervals between class boundaries. When the
classes have equal widths, the heights of the rectangles are proportional to the corresponding
class frequencies; when the widths differ, the heights are proportional to the frequency densities.

It is the graphical representation of data where data is grouped into continuous number ranges
and each range corresponds to a vertical bar.

 The horizontal axis displays the number range.
 The vertical axis (frequency) represents the amount of data that is present in each
range.

The number ranges depend upon the data that is being used.

For example

Michael owns a garden with 30 mango trees. Each tree is of a different height. The height of
the trees (in feet): 61, 63, 64, 66, 68, 69, 71, 71.5, 72, 72.5, 73, 73.5, 74, 74.5, 76, 76.2, 76.5,
77, 77.5, 78, 78.5, 79, 79.2, 80, 81, 82, 83, 84, 85, 87. We can group the data as follows in
a frequency distribution table by setting a range:

Height Range (ft)   Number of Trees (Frequency)
60 - 65             3
66 - 70             3
71 - 75             8
76 - 80             10
81 - 85             5
86 - 90             1
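
The frequency table can be reproduced by counting how many heights fall in each range. The sketch below assumes that both ends of each range are inclusive, as the table suggests; `frequency_table` and `ranges` are names chosen here for illustration.

```python
# Heights of the 30 mango trees (in feet), as listed in the example.
heights = [61, 63, 64, 66, 68, 69, 71, 71.5, 72, 72.5, 73, 73.5, 74, 74.5,
           76, 76.2, 76.5, 77, 77.5, 78, 78.5, 79, 79.2, 80, 81, 82, 83, 84,
           85, 87]

# Ranges taken from the table; both ends are treated as inclusive (assumption).
ranges = [(60, 65), (66, 70), (71, 75), (76, 80), (81, 85), (86, 90)]


def frequency_table(data, bins):
    """Count the data points that fall inside each (low, high) range."""
    return {f"{low} - {high}": sum(low <= x <= high for x in data)
            for low, high in bins}


table = frequency_table(heights, ranges)
for label, count in table.items():
    # A row of '#' characters gives a crude text version of the histogram.
    print(f"{label:>8} ft: {count:2d} {'#' * count}")
```

Note that ranges written this way leave gaps (for example between 65 and 66), so a height such as 65.5 would not be counted; using class boundaries avoids that problem.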

This data can be now shown using a histogram. We need to make sure that while plotting a
histogram, there shouldn’t be any gaps between the bars.

Difference Between a Bar Chart and a Histogram

The fundamental difference between histograms and bar graphs from a visual aspect is that
bars in a bar graph are not adjacent to each other. A bar graph is the graphical representation
of categorical data using rectangular bars where the length of each bar is proportional to the
value they represent. A histogram is the graphical representation of data where data is
grouped into continuous number ranges and each range corresponds to a vertical bar.

B) Probability
i) Introduction

Every day, decisions are made that involve uncertainty about the
outcome. The ability to estimate and understand probability helps us
make good decisions. Examples of probability used in everyday life
include the probability that it will rain today and the probability of
winning the lottery. Many events cannot be predicted with total
certainty. Using probability, we can predict only the chance that an
event will occur, i.e., how likely it is to happen. Probability can range
from 0 to 1, where 0 means the event is impossible and 1
indicates a certain event. Probability is an important topic for
students, and these notes explain its basic concepts.

Probability has been introduced in Mathematics to predict how likely
events are to happen. The meaning of probability is basically the
extent to which something is likely to happen. This is the basic
probability theory, which is also used in probability distributions, where
you will learn the possible outcomes of a random experiment.

a) Probability Terms

Probability is a measure of the likelihood of an event occurring. It is
a numerical measure that is associated with how certain we are of the
outcomes of a particular experiment or activity.

To find the probability of a single event occurring, we should first
know the total number of possible outcomes. For example, when we
toss a coin, we get either a Head or a Tail; only two outcomes
are possible, {H, T}. But when two coins are tossed, there will be
four possible outcomes, i.e. {(H, H), (H, T), (T, H), (T, T)}.

To calculate the probability of an event A when all outcomes in the
sample space are equally likely, count the number of outcomes for
event A and divide by the total number of outcomes in the sample
space.

$$P(A) = \frac{\text{number of outcomes in event } A}{\text{total number of outcomes in the sample space}}$$

Example
Suppose a coin is flipped two times.

(a)What is the probability of getting “exactly one head?”


(b)What is the probability of getting “at least one tail?”

Solution:
Previously, we found the sample space for this
experiment: S={HH , HT , TH , TT }

(a) The outcomes in the event “exactly one head” are HT and TH.
We see that there are 2 outcomes in the event out of the 4
possible outcomes in the sample space. So
$$P(\text{exactly one head}) = \frac{2}{4} = 0.5$$

(b) The outcomes in the event “at least one tail” are HT, TH,
and TT. We see that there are 3 outcomes in the event out of
the 4 possible outcomes in the sample space. So
$$P(\text{at least one tail}) = \frac{3}{4} = 0.75$$
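
The counting argument in this example can be sketched in a few lines by listing the sample space and counting the outcomes in each event. The helper `probability` is a name introduced here, and the sketch assumes that all four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips of a coin: HH, HT, TH, TT.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]


def probability(event):
    """Number of outcomes in the event divided by the size of the sample space."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


p_exactly_one_head = probability(lambda outcome: outcome.count("H") == 1)
p_at_least_one_tail = probability(lambda outcome: "T" in outcome)

print(sample_space)
print(p_exactly_one_head, p_at_least_one_tail)
```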

An experiment is a planned operation carried out under controlled
conditions. If the result is not predetermined, then the experiment is
said to be a chance experiment. Flipping one fair coin twice is an
example of an experiment.

An important characteristic of probability experiments is described by
the law of large numbers, which states that as the number of
repetitions of an experiment is increased, the relative frequency
obtained in the experiment tends to become closer and closer to the
theoretical probability. Even though the outcomes do not happen
according to any set pattern or order, overall, the long-term observed
relative frequency will approach the theoretical probability. (The word
empirical is often used instead of the word observed.)
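
The law of large numbers can be illustrated by simulating coin flips. The following is a minimal sketch, assuming a fair coin simulated with Python's pseudo-random generator; the function name `relative_frequency_of_heads` and the seed value are choices made here, not part of the notes.

```python
import random


def relative_frequency_of_heads(n_flips, seed=0):
    """Simulate n_flips tosses of a fair coin and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For small numbers of flips the relative frequency may be far from 0.5, but for large numbers it is expected to settle close to the theoretical probability.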

Term: Sample Space
Definition: The set of all the possible outcomes that can occur in any trial.
Example: 1. Tossing a coin, Sample Space (S) = {H, T}. 2. Rolling a die, Sample Space (S) = {1, 2, 3, 4, 5, 6}.

Term: Sample Point
Definition: It is one of the possible results.
Example: In a deck of cards, the 4 of hearts is a sample point, and the queen of clubs is a sample point.

Term: Experiment or Trial
Definition: A series of actions where the outcomes are always uncertain.
Example: The tossing of a coin, selecting a card from a deck of cards, throwing a die.

Term: Event
Definition: It is a single outcome, or a set of outcomes, of an experiment.
Example: Getting a Heads while tossing a coin is an event.

Term: Outcome
Definition: Possible result of a trial/experiment.
Example: T (tail) is a possible outcome when a coin is tossed.

Term: Complementary event
Definition: The non-happening of an event. The complement of an event A is the event not A (or A′).
Example: In a standard 52-card deck, if A = Draw a heart, then A′ = Don’t draw a heart.

Term: Impossible Event
Definition: The event cannot happen.
Example: In tossing a coin, it is impossible to get both head and tail at the same time.

"∪" Event: The Union: An outcome is in the event A ∪ B


if the outcome is in A or is in B or is in both A and B. For example, let
A = {1, 2, 3, 4, 5} and B = {4, 5, 6, 7, 8}. Then

 A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8}
 Notice that 4 and 5 are NOT listed twice.

"∩" Event: The Intersection: An outcome is in the event A ∩ B


if the outcome is in both A and B at the same time. For example, let A
and B be {1, 2, 3, 4, 5} and {4, 5, 6, 7, 8}, respectively. Then A ∩ B = {4, 5}.

The complement of event A is denoted A′ (read "A prime").
A′ consists of all outcomes that are NOT in A.
Notice that P(A) + P(A′) = 1, so that P(A′) = 1 − P(A).
For example, let S = {1, 2, 3, 4, 5, 6} and let A = {1, 2, 3, 4}.
Then A′ = {5, 6}.
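
Union, intersection, and complement correspond directly to set operations, so the three examples above can be sketched with Python sets. The variable names are illustrative, and the probability check at the end assumes that the six outcomes in S are equally likely.

```python
from fractions import Fraction

# Union and intersection example.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
union = A | B          # outcomes in A or in B or in both
intersection = A & B   # outcomes in both A and B

# Complement example: everything in the sample space S that is not in the event.
S = {1, 2, 3, 4, 5, 6}
event = {1, 2, 3, 4}
complement = S - event

# With equally likely outcomes, P(event) + P(complement) should equal 1.
p_event = Fraction(len(event), len(S))
p_complement = Fraction(len(complement), len(S))

print(sorted(union), sorted(intersection), sorted(complement))
print(p_event + p_complement)
```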

The conditional probability of A given B is written P(A|B).
P(A|B) is the probability that event A will occur given that event B
has already occurred. A conditional reduces the sample space. We
calculate the probability of A from the reduced sample space B.

The formula to calculate P(A|B) is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

where P(B) is greater than zero.

For example, suppose we toss one fair, six-sided die. The sample
space is S = {1, 2, 3, 4, 5, 6}. Let A = face is 2 or 3 and B = face is even (2, 4, 6).

To calculate P(A|B), we count the number of outcomes 2 or 3 in the
reduced sample space B = {2, 4, 6}. Then we divide that by the number of
outcomes in B (rather than S). We get the same result by
using the formula. Remember that S has six outcomes.

$$P(A \mid B) = \frac{\dfrac{\text{number of outcomes that are 2 or 3 and even}}{\text{number of outcomes in } S}}{\dfrac{\text{number of outcomes that are even}}{\text{number of outcomes in } S}} = \frac{1/6}{3/6} = \frac{1}{3}$$

Similarly:

$$P(A \mid B) = \frac{\text{number of outcomes that are 2 or 3 and even}}{\text{number of outcomes in } B} = \frac{1}{3}$$
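
Both routes to P(A|B) in the die example can be compared in a short sketch. The helper `prob` is a hypothetical name; it assumes equally likely outcomes and counts only those outcomes of the event that lie in the chosen sample space.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space of one fair die
A = {2, 3}               # face is 2 or 3
B = {2, 4, 6}            # face is even


def prob(event, space=S):
    """Probability of an event within a given (possibly reduced) sample space."""
    return Fraction(len(event & space), len(space))


# Route 1: the formula P(A|B) = P(A and B) / P(B), both taken over S.
p_by_formula = prob(A & B) / prob(B)

# Route 2: count directly in the reduced sample space B.
p_by_reduced_space = prob(A, space=B)

print(p_by_formula, p_by_reduced_space)
```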

Odds
The odds of an event present the probability as a ratio of success to
failure. This is common in various gambling formats. Mathematically,
the odds of an event can be defined as:
$$\frac{P(A)}{1 - P(A)}$$
where P(A) is the probability of success and, of course, 1 − P(A) is the
probability of failure.
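
As a small illustration of the odds formula, the sketch below converts a probability into odds. The function name `odds` is chosen here, and the check that P(A) is below 1 is an added safeguard against division by zero rather than something stated in the notes.

```python
from fractions import Fraction


def odds(p):
    """Return P(A) / (1 - P(A)) for a probability p with 0 <= p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Probability of "at least one tail" in two coin flips is 3/4.
print(odds(Fraction(3, 4)))
# A fair coin landing heads has probability 1/2.
print(odds(Fraction(1, 2)))
```

A probability of 3/4 corresponds to odds of 3, usually read as "3 to 1", while a probability of 1/2 gives even odds of 1.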

b) Probability Axioms

 Axiom 1: For any event A, P(A) ≥ 0.
 Axiom 2: The probability of the sample space S is P(S) = 1.
 Axiom 3: If $A_1, A_2, A_3, \dots$ are disjoint events, then

$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$$

 Axiom 4: The complement of a union or an intersection satisfies

$$P(A' \cap B') = P\big((A \cup B)'\big)$$

$$P(A' \cup B') = P\big((A \cap B)'\big)$$

Example
In a presidential election, there are four candidates. Call them A, B, C,
and D. Based on our polling analysis, we estimate that A has
a 20 percent chance of winning the election, while B has a 40 percent
chance of winning. What is the probability that A or B win the election?

Solution

Notice that the events that {A wins}, {B wins}, {C wins}, and {D wins} are
disjoint since more than one of them cannot occur at the same time. For
example, if A wins, then B cannot win. From the third axiom of
probability, the probability of the union of two disjoint events is the
summation of individual probabilities. Therefore,

$$\begin{aligned}
P(A \text{ wins or } B \text{ wins}) &= P(\{A \text{ wins}\} \cup \{B \text{ wins}\}) \\
&= P(\{A \text{ wins}\}) + P(\{B \text{ wins}\}) \\
&= 0.2 + 0.4 \\
&= 0.6
\end{aligned}$$

c) Probability Theorems

i) Theorem 1: Addition Theorem on Probability

If A and B are any two events, which need not be mutually exclusive, then

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

The above theorem can be extended to any 3 events.

$$P(A \cup B \cup C) = [P(A) + P(B) + P(C)] - [P(A \cap B) + P(B \cap C) + P(C \cap A)] + P(A \cap B \cap C)$$

Example 1

Given that P(A) = 0.52, P(B) = 0.43, and P(A ∩ B) = 0.24,

find (a) P(A ∪ B) (b) P(A′ ∪ B′)

Solution

a) $$\begin{aligned}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
&= 0.52 + 0.43 - 0.24 \\
&= 0.71
\end{aligned}$$

b) $$\begin{aligned}
P(A' \cup B') &= P\big((A \cap B)'\big) = 1 - P(A \cap B) \\
&= 1 - 0.24 \\
&= 0.76
\end{aligned}$$
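
The two results of Example 1 can be re-derived with exact fractions. This is a sketch only: `union_probability` is a name introduced here for the addition theorem, and the inputs are the values given in the example.

```python
from fractions import Fraction


def union_probability(p_a, p_b, p_a_and_b):
    """Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


p_a = Fraction(52, 100)
p_b = Fraction(43, 100)
p_a_and_b = Fraction(24, 100)

p_a_or_b = union_probability(p_a, p_b, p_a_and_b)
# P(A' or B') is the complement of P(A and B).
p_not_a_or_not_b = 1 - p_a_and_b

print(float(p_a_or_b), float(p_not_a_or_not_b))
```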

ii) Theorem 2: Mutually Independent Event

Let A and B be two events associated with a sample space S. If
the probability of occurrence of one of them is not affected by
the occurrence of the other, then we say that the two events are
independent. Thus, two events A and B will be independent if

a) P(A|B) = P(A), provided P(B) ≠ 0

b) P(B|A) = P(B), provided P(A) ≠ 0

c) P(A ∩ B) = P(A) P(B).

The above theorem can be extended to any 3 mutually independent events.

$$P(A \cap B \cap C) = P(A)\,P(B)\,P(C)$$

Example 2

Let A and B be two independent events such that P(A) = 0.2 and
P(B) = 0.8. Find P(A ∩ B), P(A ∪ B), P(B ∩ A′), and P(A′ ∩ B′).

Solutions:

Given P(A) = 0.2 and P(B) = 0.8, and events A and B are
independent of each other.

$$P(A \cap B) = P(A)P(B) = 0.2 \times 0.8 = 0.16$$

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.2 + 0.8 - 0.16 = 0.84$$

$$P(B \cap A') = P(B)P(A') = P(B)[1 - P(A)] = P(B) - P(A)P(B) = P(B) - P(A \cap B) = 0.8 - 0.16 = 0.64$$

$$P(A' \cap B') = P\big((A \cup B)'\big) = 1 - P(A \cup B) = 1 - 0.84 = 0.16$$
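
The four answers of Example 2 can be cross-checked with exact fractions. The variable names are illustrative. The last line uses the fact that the complements of independent events are also independent, which gives a second way to reach P(A′ ∩ B′).

```python
from fractions import Fraction

p_a = Fraction(2, 10)   # P(A) = 0.2
p_b = Fraction(8, 10)   # P(B) = 0.8

p_a_and_b = p_a * p_b                 # independence: P(A and B) = P(A) P(B)
p_a_or_b = p_a + p_b - p_a_and_b      # addition theorem
p_b_and_not_a = p_b - p_a_and_b       # part of B that lies outside A
p_neither = 1 - p_a_or_b              # complement of the union

# Cross-check: A' and B' are independent as well.
p_neither_by_product = (1 - p_a) * (1 - p_b)

print(float(p_a_and_b), float(p_a_or_b), float(p_b_and_not_a), float(p_neither))
```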

Example 3

The probability that a girl preparing for a competitive examination
will get a State Government service is 0.12, the probability that
she will get a Central Government job is 0.25, and the
probability that she will get both is 0.07. Find the probability that
(i) she will get at least one of the two jobs (ii) she will get only
one of the two jobs.

Solution

Let A be the event of getting a State Government service and
B be the event of getting a Central Government job.

Given that P(A) = 0.12, P(B) = 0.25, and P(A ∩ B) = 0.07

(i) $$\begin{aligned}
P(\text{at least one of the two jobs}) &= P(A \text{ or } B) \\
&= P(A \cup B) \\
&= P(A) + P(B) - P(A \cap B) \\
&= 0.12 + 0.25 - 0.07 \\
&= 0.30
\end{aligned}$$

(ii) $$\begin{aligned}
P(\text{only one of the two jobs}) &= P(\text{only } A \text{ or only } B) \\
&= P(A \cap B') + P(A' \cap B) \\
&= \{P(A) - P(A \cap B)\} + \{P(B) - P(A \cap B)\} \\
&= \{0.12 - 0.07\} + \{0.25 - 0.07\} \\
&= 0.23
\end{aligned}$$

Example 4

A bag contains 3 red balls, 5 blue balls, and 8 green balls. A
ball is selected at random. Find the probability of

a) Getting a red ball.

b) Getting a green ball.

c) Not getting a blue ball.

Solution

a) Total number of the balls n ( S )=3+5+8=16.

Let R be the event of getting a red ball.

The number of favourable outcomes n(R)=3.

The required probability is P(R)=3 ⁄ 16

b) Let G be the event of getting a green ball.

The number of favourable outcomes n(G)=8.

The required probability is P(G)=8 ⁄ 16=½

c) Let B be the event of getting a blue ball.

The number of favourable outcomes n(B)=5.

The required probability is P(B)=5 ⁄ 16.

The probability of not getting a blue ball is

$$P(B') = 1 - P(B) = 1 - \frac{5}{16} = \frac{11}{16}$$

Also, the event of not getting a blue ball is the same as getting a
red or green ball.

$$P(B') = P(R) + P(G) = \frac{3}{16} + \frac{8}{16} = \frac{11}{16}$$
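
Example 4 can be verified by counting balls. The dictionary `bag` and the function `p` are names introduced here; the sketch assumes every ball is equally likely to be drawn.

```python
from fractions import Fraction

bag = {"red": 3, "blue": 5, "green": 8}
total = sum(bag.values())  # n(S) = 16


def p(colour):
    """Probability of drawing a ball of the given colour."""
    return Fraction(bag[colour], total)


p_red = p("red")
p_green = p("green")
p_not_blue = 1 - p("blue")  # complement rule

print(p_red, p_green, p_not_blue)
```

The complement rule and the direct sum P(R) + P(G) give the same value, as the solution shows.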

d) Probability Distributions

I. Random Variables

A variable is something which can change its value. It may vary
with different outcomes of an experiment. If the value of a
variable depends upon the outcome of a random experiment, it
is a random variable. A random variable can take any real
value.

Mathematically, a random variable is a real-valued function
whose domain is the sample space S of a random experiment. A
random variable is usually denoted by a capital letter such as X, Y, M,
etc. The lowercase letters such as x, y, z, m, etc. represent the values
of the random variable.

Consider the random experiment of tossing a coin 20 times. You
will earn TZs. 50,000 each time you get a head and will lose TZs. 50,000
each time it is a tail. You and your friend are all set to see who will win
the game by earning more money.

Here, we see that the number of heads in 20 tosses of the coin
can be anything from zero to twenty. If we denote the
number of heads by X, then X takes values in {0, 1, 2, …, 20}. The probability of
getting a head on each toss is always ½.

a) Properties of a Random Variable

 It takes only real values.

 If X is a random variable and C is a constant, then CX is
also a random variable.

 If $X_1$ and $X_2$ are two random variables, then $X_1 + X_2$ and $X_1 X_2$
are also random variables.

 For any constants $C_1$ and $C_2$, $C_1 X_1 + C_2 X_2$ is also a random variable.

 $|X|$ is a random variable.

b) Types of Random Variables

A random variable can be categorized into two types.

(i) Discrete Random Variable

As the name suggests, this variable is not connected or
continuous. A discrete random variable is a variable which can
assume only a countable number of real values, i.e., its values
are discrete in nature. The value of the random variable
depends on chance. In other words, a real-valued function
defined on a discrete sample space is a discrete random
variable.

The number of calls a person gets in a day, the number of items
sold by a company, the number of items manufactured, the
number of accidents, the number of gifts received on birthdays,
etc. are some examples of discrete random variables.

(ii) Continuous Random variable

A variable which can assume an uncountably infinite number of values is
a continuous random variable. It can take all possible values
between certain limits, integral as well as
fractional. The height, weight, and age of a person, the
distance between two cities, etc. are some examples of continuous
random variables.

II. Probability Distribution of Random Variable

For any event of a random experiment, we can find its
corresponding probability. For different values of the random
variable, we can find the respective probabilities. The values of
a random variable along with the corresponding probabilities form
the probability distribution of the random variable.

Assume X is a random variable.

A function P(X) is the probability distribution of X.

Any function F defined for all real x by F(x) = P(X ≤ x) is called the
distribution function of the random variable X.

a) Properties of Probability Distribution of Random Variable

 The probability distribution of a random variable X is
$P(X = x) = p_i$ for $x = x_i$ and $P(X = x) = 0$ for $x \neq x_i$.

 The range of the probability distribution for all possible values
of a random variable is from 0 to 1, i.e., $0 \le p(x) \le 1$.

b) Probability Distribution of a Discrete Random Variable

If X is a discrete random variable with discrete values $x_1, x_2, \dots, x_n, \dots$
then the probability function is $P(x) = p_X(x) = p_i$ if $x = x_i$, and is 0
for other values of x. Here, $i = 1, 2, \dots, n, \dots$

The distribution function is

$$F_X(x) = P(X \le x) = \sum_{x_i \le x} p(x_i)$$

Consider an example of the tossing of two fair coins. The possible
outcomes for this random experiment are S = {HH, HT, TH, TT}. If
X is a random variable for the occurrence of the tail, the possible
values for X are 0, 1, and 2. The probability function and the distribution
function F(x) = P(X ≤ x) of X are:

Value of X                     0     1     2
P(X = x) = p(x)                1/4   2/4   1/4
F(x) = P(X ≤ x) = Σ p(x_i)     1/4   3/4   4/4 = 1
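
The table for the two-coin example can be rebuilt by listing the outcomes, counting tails, and accumulating the probabilities. The names `pmf` and `cdf` are illustrative labels for the probability function p(x) and the distribution function F(x); the sketch assumes fair coins.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# Sample space for two fair coins.
outcomes = ["".join(o) for o in product("HT", repeat=2)]

# X = number of tails in each outcome.
counts = Counter(outcome.count("T") for outcome in outcomes)

# Probability function p(x) = P(X = x).
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

# Distribution function F(x) = P(X <= x): running total of p(x).
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(x, pmf[x], cdf[x])
```

The last value of F(x) is 1, since the probabilities of all possible values add up to 1.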

c) Probability Distribution of a Continuous Random Variable

If X is a continuous random variable with probability density
function $p_X(x)$, then the distribution function is

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} p_X(t)\,dt$$
Example 1:

Three fair coins are tossed. Let X = the number of heads, Y = the
number of head runs. (A ‘head run’ is a consecutive occurrence of
at least two heads.) Find the probability function of X and Y.

Solution:

The possible outcomes of the experiment are S = {HHH, HHT, HTH,
HTT, THH, THT, TTH, TTT}. X is the number of heads. It takes
the values 0, 1, 2, and 3.

Event   Random Variable
        x     y
HHH     3     1
HHT     2     1
HTH     2     0
HTT     1     0
THH     2     1
THT     1     0
TTH     1     0
TTT     0     0

 $P(\text{no head}) = p(0) = \frac{1}{8}$
 $P(\text{one head}) = p(1) = \frac{3}{8}$
 $P(\text{two heads}) = p(2) = \frac{3}{8}$
 $P(\text{three heads}) = p(3) = \frac{1}{8}$

Value of X, x   0     1     2     3
p(x)            1/8   3/8   3/8   1/8

Y is the number of head runs. It takes up the values 0 and 1.

$$P(Y = 0) = p(0) = \frac{5}{8}, \quad \text{and} \quad P(Y = 1) = p(1) = \frac{3}{8}.$$

Value of Y, y   0     1
p(y)            5/8   3/8
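
Both probability functions in this example can be obtained by enumerating the eight outcomes. In the sketch below, `head_runs` counts maximal blocks of at least two consecutive heads, following the definition in the example; `distribution` and the other names are introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product

# Sample space for three fair coins.
outcomes = ["".join(o) for o in product("HT", repeat=3)]


def head_runs(outcome):
    """Count blocks of at least two consecutive heads in an outcome."""
    return sum(1 for face, run in groupby(outcome)
               if face == "H" and len(list(run)) >= 2)


def distribution(variable):
    """Probability function of a random variable defined on the outcomes."""
    counts = Counter(variable(o) for o in outcomes)
    return {value: Fraction(count, len(outcomes))
            for value, count in sorted(counts.items())}


p_x = distribution(lambda o: o.count("H"))  # X = number of heads
p_y = distribution(head_runs)               # Y = number of head runs

print(p_x)
print(p_y)
```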

d) Mean and Variance of Random Variables

According to some report, the average time spent by a person on
their mobile phone is four hours. What does it mean? What is an
average? Does it mean that every day a person spends four hours
of his day on a mobile phone? Or does it mean that every person spends
four hours daily on a mobile phone? How does the time spent by
different persons vary from each other? These questions give rise to two
concepts in probability and statistics: the average is the mean, and the
variability is the variance.
(i) Mean of Random Variables

In probability and statistics, we can find out the average of a
random variable. The term average is the mean or the expected
value or the expectation in probability and statistics. Once we have
calculated the probability distribution for a random variable, we can
calculate its expected value. The mean of a random variable shows the
location or the central tendency of the random variable.

The expectation or the mean of a discrete random variable is a
weighted average of all possible values of the random variable.
The weights are the probabilities associated with the
corresponding values. It is calculated as

$$E(X) = \mu = \sum_{i=1}^{n} x_i p_i$$

$$E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots + x_n p_n.$$
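
As an illustration of the formula, the sketch below computes the mean of X, the number of heads in three tosses of a fair coin, using the probability distribution found in Example 1. The function name `expected_value` is a choice made here.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def expected_value(distribution):
    """Weighted average: sum of x_i * p_i over all possible values."""
    return sum(x * p for x, p in distribution.items())


mean = expected_value(pmf)
print(mean)  # 3/2
```

The result, 3/2, is not itself a possible value of X; the mean describes the long-run average number of heads rather than any single outcome.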

Properties of Mean of Random Variables

 If X and Y are random variables, then E(X + Y) = E(X) + E(Y).

 If X1, X2, … , Xn are random variables, then
E(X1 + X2 + … + Xn) = E(X1) + E(X2) + … + E(Xn) = ΣE(Xi).

 For independent random variables X and Y, E(XY) = E(X) E(Y).

 If a is any constant and X is a random variable,
E[aX] = aE[X] and E[X + a] = E[X] + a.

 For any random variable X > 0, E(X) > 0.

 E(Y) ≥ E(X) if the random variables X and Y are such that Y ≥ X.

(ii) Variance of Random Variables

Suppose you calculated the mean or the average mark in five
tests of mathematics. You can easily see the difference between the mark in
each test and this average. These differences
show the variability of the possible values of the random variable,
the random variable being the mark scored in a test.

The variance of a random variable shows the variability or the
scatter of its values. It measures the average squared distance of the
values of the random variable from its mean. It is calculated as

$$\sigma_X^2 = \mathrm{Var}(X) = \sum_i (x_i - \mu)^2 \, p(x_i) = E\big[(X - \mu)^2\big],$$

or, equivalently,

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2,$$

where

$$E(X^2) = \sum_i x_i^2 \, p(x_i), \quad \text{and} \quad [E(X)]^2 = \Big[\sum_i x_i \, p(x_i)\Big]^2 = \mu^2.$$

If the value of the variance is small, then the values of the random
variable are close to the mean.
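
The two variance formulas can be compared on the same distribution, X = the number of heads in three fair coin tosses. The function names below are illustrative; both functions should return the same value.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def mean(distribution):
    """E(X) = sum of x_i * p(x_i)."""
    return sum(x * p for x, p in distribution.items())


def variance_by_definition(distribution):
    """Var(X) = sum of (x_i - mu)^2 * p(x_i)."""
    mu = mean(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())


def variance_by_shortcut(distribution):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x_squared = sum(x ** 2 * p for x, p in distribution.items())
    return e_x_squared - mean(distribution) ** 2


print(variance_by_definition(pmf), variance_by_shortcut(pmf))
```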

Properties of Variance of Random Variables

 The variance of any constant is zero, i.e., V(a) = 0, where a is any
constant.
 If X is a random variable, and a and b are any constants, then
$V(aX + b) = a^2 V(X)$.
 For any pairwise independent random variables $X_1, X_2, \dots, X_n$ and
for any constants $a_1, a_2, \dots, a_n$,

$$V(a_1 X_1 + a_2 X_2 + \dots + a_n X_n) = a_1^2 V(X_1) + a_2^2 V(X_2) + \dots + a_n^2 V(X_n)$$
