Possible Outcomes With Uncertainty As To Which Will Occur
The subjects of Statistics and Probability concern the mathematical tools that are designed to
deal with uncertainty. To be more precise, these subjects are used in the following contexts:
To understand the limitations that arise from measurement inaccuracies.
o An extremely large meteor crashed into the earth at the time of the disappearance
of the dinosaurs. The most popular theory posits that the dinosaurs were killed by
the ensuing environmental catastrophe. Does the fossil record confirm that the
disappearance of the dinosaurs was suitably instantaneous?
To find trends and patterns in noisy data.
o We read in the papers that fat in the diet is “bad” for you. Do dietary studies of
large populations support this assertion?
To test hypotheses and models with data.
o Do studies of gene frequencies support the assertion that all extant people are
of 100% African descent?
To estimate confidence levels for future predictions from data.
o The human genome project claims to have determined the DNA sequences along
the human chromosomes. How accurate are the published sequences? How much
variation should be expected between any two individuals?
There are at least two uses for statistics and probability in the life sciences:
One is to tease information from noisy data, and
the other is to develop predictive models in situations where chance plays a pivotal role.
Probability
Probability theory is the mathematics of chance and luck. Its goal is to make sense of the
following question:
What is the probability of a given outcome from some set of possible outcomes?
Sample space (S): A sample space is the set of all possible outcomes of the particular
“experiment” of interest.
Example: If you are considering the possible birthdates of a person drawn at random, the sample
space consists of the days of the year, thus the integers from 1 to 366.
If you are considering the possible birthdates of two people selected at random, the sample space
consists of all pairs of the form (j, k) where j and k are integers from 1 to 366.
If you are considering the possible birthdates of three people selected at random, the sample
space consists of all triples of the form (j, k, m) where j, k and m are integers from 1 to 366.
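The birthdate sample spaces above can be built directly, which makes their sizes concrete. The following is a minimal sketch; the variable names are illustrative assumptions, and days are numbered 1 to 366 as in the text.

```python
from itertools import product

DAYS = range(1, 367)  # days of the year numbered 1..366, as in the text

# Sample space for one person: every possible day of the year.
one_person = list(DAYS)

# Sample space for two people: every ordered pair (j, k).
two_people = list(product(DAYS, repeat=2))

# For three people the space has 366**3 triples; count them without
# building the whole list in memory.
three_people_size = len(DAYS) ** 3

print(len(one_person))     # 366
print(len(two_people))     # 133956
print(three_people_size)   # 49027896
```

Note that the pairs are ordered: (1, 366) and (366, 1) are different outcomes, because they assign the two birthdates to different people.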
Events: An event is a subset of the sample space, thus a subset of possible outcomes for your
experiment. No matter what the original sample space, the event with no elements is called the
empty set, and is denoted by ∅.
Example: set of birth
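Since an event is just a subset of the sample space, it can be sketched with ordinary sets. The event chosen here (a birthday among the first 31 days of the year) is a hypothetical illustration and is not taken from the original notes.

```python
# Sample space: days of the year numbered 1..366, as in the text.
sample_space = set(range(1, 367))

# Hypothetical event: the birthday falls on one of the first 31 days.
first_31_days = set(range(1, 32))

# The empty event contains no outcomes at all.
empty_event = set()

# Both are events because both are subsets of the sample space.
print(first_31_days <= sample_space)  # True
print(empty_event <= sample_space)    # True
print(len(first_31_days))             # 31
```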
Mutually exclusive: The random experiment results in the occurrence of only one
of the n outcomes. E.g. if a coin is tossed, the result is a head or a tail, but not both. That is,
the outcomes are defined so as to be mutually exclusive.
Equally likely: Each outcome of the random experiment has an equal chance of
occurring.
Several concepts of probability have evolved over the centuries. We discuss three different
approaches:
Classical probability
Frequency probability
Axiomatic probability
Historical development: Classical → Frequency → Axiomatic
Classical probability: If a random experiment (process with an uncertain outcome) can result in
n mutually exclusive and equally likely outcomes, and if n_A of these outcomes have an attribute A,
then the probability of A is the fraction n_A/n.
Example: Drawing (with replacement) four balls from an urn with an equal number of red,
white, and blue balls: There are 81 possible outcomes (3 x 3 x 3 x 3 = 3^4 = 81). The order of the
draws matters: for example, (red, white, white, blue) is a different outcome from
(white, white, red, blue).
The probability associated with each outcome is 1/81.
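The classical definition can be checked by enumerating the urn example and counting. This is a minimal sketch: the helper `classical_probability` and the attribute "exactly two of the four balls are white" are assumptions introduced for illustration, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

COLOURS = ("red", "white", "blue")

# All ordered outcomes of four draws with replacement: 3**4 = 81 of them.
outcomes = list(product(COLOURS, repeat=4))

def classical_probability(event):
    """Return n_A / n, where `event` is a predicate on a single outcome."""
    n_A = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(n_A, len(outcomes))

# Hypothetical attribute A: exactly two of the four balls are white.
def exactly_two_white(outcome):
    return outcome.count("white") == 2

print(len(outcomes))                             # 81
print(classical_probability(exactly_two_white))  # 8/27
```

Here n_A = 24 of the n = 81 outcomes have the attribute, so the probability is 24/81 = 8/27. The count is only meaningful because the 81 outcomes are equally likely.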
A basic assumption in the definition of classical probability is that n is a finite number; that is,
there is only a finite number of possible outcomes. If there is an infinite number of possible
outcomes, the probability of an outcome is not defined in the classical sense. Also, if the outcomes
are not equally likely, the classical approach is not applicable.
Relative Frequency Probability: We might take a random sample from the population of
interest and identify the proportion of the sample with attribute A. That is, calculate
Relative frequency of A in the sample =
(number of observations in the sample that possess attribute A) / (number of observations in the sample)
Then assume that the relative frequency of A in the sample is an estimate of Pr(A).
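The relative frequency estimate can be sketched with a simulated sample. The population used here (an urn with equal numbers of red, white, and blue balls), the sample size, and the function name `relative_frequency` are all assumptions made for illustration.

```python
import random

def relative_frequency(sample, has_attribute):
    """Proportion of observations in `sample` that possess the attribute."""
    count_A = sum(1 for obs in sample if has_attribute(obs))
    return count_A / len(sample)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: an urn with equal numbers of red, white, blue balls.
sample = [rng.choice(["red", "white", "blue"]) for _ in range(10_000)]

# Attribute A: the drawn ball is red.
estimate = relative_frequency(sample, lambda ball: ball == "red")
print(estimate)  # close to 1/3; the exact value depends on the seed
```

Unlike the classical approach, this estimate does not require the outcomes to be equally likely, but it varies from sample to sample and is only an approximation of Pr(A).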