P6 - HTZ-2.1 Statistics and Probability
1: Random variable: A random variable is a quantity that may take any of a given range of
values that cannot be predicted exactly but can be described in terms of their probability.
Random variables are named by capital letters, like $X$. The same letter but lowercase, like $x$,
denotes a data value (a number).
Continuous and discrete random variables: A random variable is continuous if it potentially can take on
any value on some line segment or interval (that is, there are no “breaks” between possible
values). A random variable is discrete if the values it can potentially assume constitute a sequence of
isolated or separated points on the real number axis. Continuous random variables usually
measure the amount of something, whereas discrete random variables usually count something.
Examples of continuous and discrete random variables: Examples of continuous random variables are
a person’s height, the length of time to run a marathon, and the mass in kilograms of a celestial
object (planet, star, meteor, piece of space dust, etc.). In this last example, we would consider
the mass to be theoretically any value greater than 0, showing that sometimes the interval of
possible values of a random variable is considered to be infinite in length. Examples of discrete
random variables are the number of children in a family, the number of times a person catches a
cold in a given year, and the number of tosses of a coin before a tail appears. This last example
shows that sometimes the sequence of potential values of a discrete random variable can be
infinite, because the sequence in this example would be $0, 1, 2, 3, \ldots$.
We are interested in probabilities associated with various values of a random variable. A formula
or table that enables us to find such probabilities is called a probability distribution for the
random variable.
Discrete Probability Distributions:
In a study of families with one child, a researcher coded families as follows:
$$X = \begin{cases} 0 & \text{if the child is a boy,} \\ 1 & \text{if the child is a girl.} \end{cases}$$
Imagine the experiment of randomly selecting a family with one child and recording whether the
child is a boy ($b$) or a girl ($g$). The sample space is $S = \{b, g\}$, and $X$ is a random variable on
this sample space. We can view $X$ as the number of girls in a randomly selected family with one
child ($0$ or $1$). We assume that a boy and a girl are equally likely. Hence, the probability that
$X = 0$ is $1/2$ and the probability that $X = 1$ is $1/2$. We sometimes write
$$P(X = 0) = \tfrac{1}{2}, \qquad P(X = 1) = \tfrac{1}{2}.$$
The specification of the probabilities associated with the distinct values of this random variable
is called its probability distribution.
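The one-child example can be sketched in a few lines of Python. This is only an illustration: the outcome labels `"b"` and `"g"` and the names `sample_space`, `X`, and `distribution` are assumptions made for the sketch, and exact fractions are used so that the probabilities print as 1/2 rather than 0.5.

```python
from fractions import Fraction

# Sample space of the experiment; the labels "b" and "g" are illustrative.
sample_space = ["b", "g"]


def X(outcome):
    """Number of girls in a one-child family: 0 for a boy, 1 for a girl."""
    return 1 if outcome == "g" else 0


# Boy and girl are assumed equally likely, so each outcome has probability 1/2.
distribution = {}
for outcome in sample_space:
    value = X(outcome)
    distribution[value] = distribution.get(value, Fraction(0)) + Fraction(1, len(sample_space))

for value, prob in sorted(distribution.items()):
    print(f"P(X = {value}) = {prob}")
```

The dictionary `distribution` plays the role of the table of probabilities: each key is a value of the random variable and each entry is the probability of that value.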
For a random variable $X$ of the discrete type, the probability $P(X = x)$ is frequently denoted by
$f(x)$, and this function $f(x)$ is called the probability mass function. Note that some authors
refer to $f(x)$ as the probability function, the frequency function, or the probability density
function. In the discrete case, we shall use “probability mass function,” and it is hereafter
abbreviated pmf.
Properties of pmf: The pmf $f(x)$ of a discrete random variable $X$ is a function that satisfies the
following properties:
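(a) $f(x) > 0$ for $x \in S$, where $S$ is the set of possible values of $X$
(b) $\sum_{x \in S} f(x) = 1$
(c) $P(X \in A) = \sum_{x \in A} f(x)$, where $A \subset S$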
(d)
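The usual requirements on a pmf (positive probabilities that add up to 1, with the probability of an event found by summing the pmf over the values in that event) can be checked mechanically. The following is a minimal Python sketch, not a standard library routine; the helper names `is_valid_pmf` and `prob_of_event` are hypothetical, and the pmf is stored as a dictionary from values to exact fractions.

```python
from fractions import Fraction


def is_valid_pmf(pmf):
    """Return True if every probability is positive and the total is exactly 1."""
    return all(p > 0 for p in pmf.values()) and sum(pmf.values()) == 1


def prob_of_event(pmf, event):
    """P(X in A): add up f(x) over the values x that belong to the event A."""
    return sum((pmf.get(x, Fraction(0)) for x in event), Fraction(0))


# pmf of the number of girls in a randomly selected one-child family
f = {0: Fraction(1, 2), 1: Fraction(1, 2)}

print(is_valid_pmf(f))        # checks positivity and total probability
print(prob_of_event(f, {1}))  # probability that the child is a girl
```

Values outside the dictionary are treated as having probability 0, which is why `prob_of_event` uses `pmf.get(x, Fraction(0))`.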
Example: A college statistics class has 20 students. The ages of these students are as follows:
One student is years old, four are , nine are , three are , two are , and one is .
Let $X$ = the age of any student (randomly selected). Find the probability distribution and
cumulative distribution for $X$.
Solution: Since each student has an equal likelihood of being selected, the probability of
selecting a particular student is $1/20$. The probability of selecting a student of the age shared
by nine students is $9/20$, since nine of the 20 students are that age. The probability
distribution and cumulative distribution are summarized in the following table:
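The probabilities and cumulative probabilities in this example depend only on the group sizes (one, four, nine, three, two, and one student), so they can be recomputed without the ages themselves. The sketch below is a hedged illustration in Python: it assumes the age groups are listed in increasing order of age, which is what the cumulative column requires, and the variable names are chosen only for this sketch.

```python
from fractions import Fraction
from itertools import accumulate

# Number of students in each age group, in the order the example lists them
# (assumed here to be increasing age).
counts = [1, 4, 9, 3, 2, 1]
total = sum(counts)  # class size

# Each student is equally likely, so a group's probability is its share of the class.
probabilities = [Fraction(c, total) for c in counts]

# Cumulative distribution: running total of the probabilities.
cumulative = list(accumulate(probabilities))

for c, p, F in zip(counts, probabilities, cumulative):
    print(f"{c} student(s): probability {p}, cumulative {F}")
```

`Fraction` reduces automatically, so a probability such as 4/20 prints as 1/5; the final cumulative value is 1, as it must be.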
Example 2.1-3: Roll a fair four-sided die twice, and let $X$ be the maximum of the two outcomes.
Find the pmf of $X$.
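One way to approach this example is to enumerate the 16 equally likely ordered pairs of rolls and count how often each maximum occurs. The Python sketch below does exactly that; it is an illustration rather than the text's own solution, and the names `faces`, `outcomes`, and `pmf` are assumptions of the sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 5)                        # a four-sided die shows 1, 2, 3, or 4
outcomes = list(product(faces, repeat=2))  # all ordered pairs of two rolls

# Count how many of the equally likely pairs give each value of the maximum.
counts = Counter(max(pair) for pair in outcomes)

# Each pair has probability 1/16, so f(x) = (number of pairs with maximum x) / 16.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"f({x}) = {p}")
```

The counts come out as 1, 3, 5, and 7 pairs for maxima 1 through 4, which agrees with the closed form $f(x) = (2x - 1)/16$ for $x = 1, 2, 3, 4$.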