Statistics and
Probability
Statistics is the science concerned with
developing and studying methods for
collecting, analyzing, interpreting and
presenting empirical data.
It is the discipline that concerns the collection,
organization, analysis, interpretation, and
presentation of data. In applying statistics to a
scientific, industrial, or social problem, it is
conventional to begin with a statistical population
or a statistical model to be studied.
Statistics is a mathematical body of science that
pertains to the collection, analysis, interpretation
or explanation, and presentation of data. Some
consider statistics to be a distinct mathematical
science rather than a branch of mathematics.
While many scientific investigations make use of
data, statistics is concerned with the use of data
in the context of uncertainty and decision making
in the face of uncertainty.
Statistical data are quantitative or other numerical data, such as
figures on sales, ages, tax returns, population, births, and deaths.
Data gathering includes collecting information through interviews,
questionnaires, objective observations, experiments, and
psychological tests.
There are two main divisions in the science of statistics:
1. Descriptive statistics refers to the collection, organization,
presentation, computation, and interpretation of data in order to
describe the samples under investigation.
2. Inferential statistics is a statistical tool that seeks
to give information, inferences, or implications
pertaining to a population by studying its
representative samples.
Descriptive statistics is the branch of statistics that
includes methods for organizing and summarizing data.
Inferential statistics is the branch of statistics
that involves generalizing from a sample to the population
from which it was selected and assessing the reliability of
such generalizations.
Definitions of Random Variable
A random variable is the result of a chance
event that you can measure or count.
A random variable is a numerical quantity
that is assigned to the outcome of an
experiment. It is a variable that assumes
numerical values associated with the events of
an experiment.
A random variable is a quantitative variable
whose values depend on chance.
NOTE: We use capital letters to represent a
random variable.
An experiment is a procedure carried out to support
or refute a hypothesis, or determine the efficacy or
likelihood of something previously untried.
A random sequence of events, symbols or steps
often has no order and does not follow an intelligible
pattern or combination.
Variable refers to observable characteristics or phenomena
of a person or object whereby the members of the group or
set vary or differ from one another. It has the capacity of
taking different values representing a certain category.
Examples are weight, height, sex, year level, age, IQ,
achievement test scores, and others. They are considered
raw data or materials gathered by a researcher or
investigator for statistical analysis usually expressed by the
symbols X, Y, and Z.
A sample space is the collection or set of all
possible outcomes of a random experiment.
The sample space is represented using the
symbol "S". A subset of the possible outcomes of
an experiment is called an event. The number of
outcomes in a sample space depends
on the experiment.
Example 1
Suppose two coins are tossed and we want to determine the number of
tails that will come out. Let T represent the number of tails that will
come out. Determine the values of the random variable T.
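One way to check the answer is to list the sample space and count the tails in each outcome. The Python sketch below does this; the names `sample_space`, `tails_per_outcome`, and `values_of_T` are illustrative and not part of the lesson.

```python
from itertools import product

# Sample space for tossing two coins: every ordered pair of H (head) and T (tail).
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

# The random variable T assigns to each outcome its number of tails.
tails_per_outcome = {outcome: outcome.count("T") for outcome in sample_space}

# The possible values of T are the distinct tail counts.
values_of_T = sorted(set(tails_per_outcome.values()))

print(sample_space)   # ['HH', 'HT', 'TH', 'TT']
print(values_of_T)    # [0, 1, 2]
```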
Example 2
Two balls are drawn in succession without replacement from an urn
containing 5 orange balls and 6 violet balls. Let V be the random
variable representing the number of violet balls. Find the values of the
random variable V.
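The same listing approach works here. The sketch below treats each ball as a distinct object so that drawing without replacement is modeled as choosing two different balls in order; the variable names are illustrative only.

```python
from itertools import permutations

# The urn from the example: 5 orange ("O") and 6 violet ("V") balls.
urn = ["O"] * 5 + ["V"] * 6

# Two draws in succession without replacement: ordered pairs of distinct
# ball positions, so the same ball is never drawn twice.
draws = [(urn[i], urn[j]) for i, j in permutations(range(len(urn)), 2)]

# V counts the violet balls in each draw of two.
values_of_V = sorted({draw.count("V") for draw in draws})

print(len(draws))     # 110 ordered draws (11 * 10)
print(values_of_V)    # [0, 1, 2]
```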
A coin is tossed thrice. Let the variable X
represent the number of heads that
result from this experiment.
A pair of dice is rolled. Let X be the
random variable representing the sum
of the number of dots on the top faces.
Find the values of the random variable
X.
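For the pair of dice, the values of X can be found by listing all 36 ordered rolls and collecting the distinct sums. This is a minimal sketch with illustrative names, assuming two ordinary six-sided dice.

```python
from itertools import product

# Each die shows 1 to 6 dots; a roll of the pair is an ordered pair of faces.
faces = range(1, 7)
rolls = list(product(faces, repeat=2))

# X is the sum of the dots on the two top faces.
values_of_X = sorted({first + second for first, second in rolls})

print(len(rolls))     # 36
print(values_of_X)    # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```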
A random variable may be classified as:
1. Discrete random variable: has a countable
number of possible values.
2. Continuous random variable: can assume an
infinite number of values in one or more
intervals.
Determine whether the random variable X or Y is discrete or continuous.
1. X = number of points scored in the last season by a randomly selected
basketball player in the PBA.
2. Y = the height of a randomly selected student inside the library in
centimeters
3. X= number of birds in a nest
4. Y = the weights in kg of randomly selected dancers after taking up
aerobics
5. X = number of students randomly selected to be interviewed by a
researcher
6. Y= number of left-handed teachers randomly selected in a faculty room
7. X = the weights of randomly selected students in kilograms
8. Y = the hourly temperatures last Sunday.
Classify the following random variables as discrete or continuous.
1. Let Y = number of plants in the garden.
2. Let X = heights of basketball players in a team.
3. Let M = number of female teachers in a University
4. Let P = time spent in studying your lessons.
5. Let Q = number of Grade 11 students in the school.
Answers (items 1 to 8 on the random variables X and Y):
1. Discrete random variable
2. Continuous random variable
3. Discrete random variable
4. Continuous random variable
5. Discrete random variable
6. Discrete random variable
7. Continuous random variable
8. Continuous random variable
Discrete Probability Distribution:
Properties:
a. The probability of each value of a discrete random
variable is between 0 and 1 inclusive.
0 ≤ P(x) ≤ 1
b. The sum of all the probabilities is 1.
Consider the following values in the table.
x      0    1    2    3
P(x)   0.2  0.3  0.3  0.2
∑ P(x) = 0.2 + 0.3 + 0.3 + 0.2 = 1
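Both properties can be checked mechanically. The sketch below is one possible check, not a standard library routine; the function name `is_valid_distribution` and the tolerance value are assumptions made for illustration, and the tolerance exists only to absorb floating-point rounding.

```python
import math

def is_valid_distribution(probabilities, tolerance=1e-9):
    """Check the two properties of a discrete probability distribution."""
    # Property a: every probability lies between 0 and 1 inclusive.
    in_range = all(0 <= p <= 1 for p in probabilities.values())
    # Property b: the probabilities sum to 1 (up to rounding error).
    sums_to_one = math.isclose(sum(probabilities.values()), 1.0, abs_tol=tolerance)
    return in_range and sums_to_one

# The table from the text: x -> P(x).
table = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}

print(is_valid_distribution(table))              # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))   # False, the sum is 1.1
```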
Toss a fair coin twice and let X be
equal to the number of heads (H)
observed. Construct the discrete
probability distribution of X.
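A possible way to construct this distribution is to count how many of the equally likely outcomes give each value of X and divide by the total number of outcomes. The sketch below uses exact fractions; the names `outcomes`, `head_counts`, and `distribution` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing a fair coin twice.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes produce each number of heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(head_counts.items())
}

for x, probability in distribution.items():
    print(x, probability)
# 0 1/4
# 1 1/2
# 2 1/4
```

The probabilities lie between 0 and 1 and add up to 1, so the result satisfies both properties of a discrete probability distribution.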
Probability Mass Function of a Discrete
Random Variable
The probability distribution of a discrete
random variable is called its probability mass
function (pmf).
Properties:
a. f(x) = P(X = x) if x ∈ support S
• The values of the discrete random variable X where
f(x)> 0 are called its mass points.
• The support S of a random variable is the set of
values that the random variable can take.
• The sum of all the probabilities for all possible x
values in the Support S must be equal to 1.
• The support S of a discrete random variable is countable: its
elements can be put into one-to-one correspondence with the
natural numbers or with a finite subset of them.
Example 1:
Suppose a random variable X can only take four values (0, 1, 2, and 3). If
each value has equal probability, then its probability mass function is:
f(x) = 1/4, if x = 0
       1/4, if x = 1
       1/4, if x = 2
       1/4, if x = 3
       0, otherwise
In simplified form the above pmf can be represented as:
f(x) = 1/4, if x = 0, 1, 2, or 3
       0, otherwise
Hence, the support, denoted by S, is
S = {0, 1, 2, 3}
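This pmf can be written as a small function. The sketch below assumes four equally likely values, as stated in the example; the names `SUPPORT` and `f` are chosen only to mirror the notation S and f(x).

```python
from fractions import Fraction

# Support of X: the values that X can take.
SUPPORT = {0, 1, 2, 3}

def f(x):
    """pmf of X: equal probability on the support, 0 otherwise."""
    if x in SUPPORT:
        return Fraction(1, len(SUPPORT))
    return Fraction(0)

print(f(2))                          # 1/4
print(f(5))                          # 0
print(sum(f(x) for x in SUPPORT))    # 1
```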
How do we construct a probability distribution?
Suppose two coins are tossed and we want to determine the
number of tails that will come out. Let T represent the number of tails
that will come out. Determine the values of the random variable T.
The values of the random variable T (number of tails) in this experiment
are 0, 1, and 2.
The four equally likely outcomes are HH, HT, TH, and TT, so
P(T = 0) = 1/4, P(T = 1) = 1/2, and P(T = 2) = 1/4.
The Histogram for the Probability
Distribution of the Discrete Random
Variable T
[Figure: histogram of P(T) against T = 0, 1, 2; the vertical axis is marked at 0, 1/4, and 1/2.]
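A rough text version of this histogram can be produced with a few lines of Python. The sketch assumes the distribution P(T = 0) = 1/4, P(T = 1) = 1/2, P(T = 2) = 1/4, which follows from the four equally likely outcomes HH, HT, TH, and TT; the function name `text_histogram` and the `scale` parameter are hypothetical.

```python
from fractions import Fraction

# Assumed distribution of T (number of tails in two tosses of a fair coin).
distribution_of_T = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def text_histogram(distribution, scale=20):
    """Return one text bar per value; bar length is proportional to probability."""
    lines = []
    for value, probability in sorted(distribution.items()):
        bar = "#" * round(probability * scale)
        lines.append(f"T = {value} | {bar} {probability}")
    return lines

for line in text_histogram(distribution_of_T):
    print(line)
# T = 0 | ##### 1/4
# T = 1 | ########## 1/2
# T = 2 | ##### 1/4
```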