Concept of Probability

The document discusses the concept of probability, emphasizing its importance in statistical reasoning and decision-making. It highlights common misconceptions about probability, such as overconfidence and the tendency to see patterns in random data. Additionally, it contrasts frequentist and Bayesian statistics, explaining how probabilities can be derived from models and subjective beliefs.

Uploaded by

Jumping Kills

CONCEPT OF

PROBABILITY
When it is not in our power to
determine what is true, we ought to
follow what is most probable.

Rene Descartes
“Statistics Means Never Having To
Say You’re Certain!”

MYLES HOLLANDER
(STATISTICIAN)
We expect statistical calculations to yield definite conclusions.
But in fact, every statistical conclusion is stated in terms of
probability. Statistics can be very difficult to learn if you keep looking
for definitive conclusions.
Why Study
PROBABILITY?
Why Study Probability
1. We tend to jump to conclusions
“Girls don’t drive trucks, only boys do.” “Boys don’t wear pink.”

◦ The ability to generalize from a sample to a population is hardwired into our brains.
◦ To avoid our natural inclination to make overly strong conclusions from limited data, scientists need to use statistics.
Why Study Probability

2. We tend to be overconfident

Test devised by
Russo and Schoemaker
Answer each of these questions with a range
Pick a range that you think has a 90% chance of containing the correct answer
Don’t use Google.
The goal is not to provide precise answers, but rather is to correctly quantify your uncertainty and
come up with ranges of values that you think are 90% likely to include the true answer.
If you have no idea, answer with a super wide interval.
For example, if you truly have no idea at all about the answer to the first question, answer with
the range zero to 120 years old, which you can be 100% sure includes the true answer.
But try to narrow your responses to each of these questions to a range that you are 90% sure
contains the right answer:
Why Study Probability And Statistics
1. Martin Luther King Jr.’s age at death
2. Length of the Nile River, in miles or kilometers
3. Number of countries in OPEC
4. Number of books in the Old Testament
5. Diameter of the moon, in miles or kilometers
6. Weight of an empty Boeing 747, in pounds or kilograms
7. Year Mozart was born
8. Gestation period of an Asian elephant, in days
9. Distance from London to Tokyo, in miles or kilometers
10. Deepest known point in the ocean, in miles or kilometers
Why Study Probability And Statistics
1. Martin Luther King Jr.’s age at death: 39
2. Length of the Nile River: 4,187 miles or 6,738 kilometers
3. Number of countries in OPEC: 13
4. Number of books in the Old Testament: 39
5. Diameter of the moon: 2,160 miles or 3,476 kilometers
6. Weight of an empty Boeing 747: 390,000 pounds or 176,901 kilograms
7. Year Mozart was born: 1756
8. Gestation period of an Asian elephant: 645 days
9. Distance from London to Tokyo: 5,989 miles or 9,638 kilometers
10. Deepest known point in the ocean: 6.9 miles or 11.0 kilometers
Why Study Probability

2. We tend to be overconfident

Test devised by Russo and Schoemaker

Russo and Schoemaker (1989) tested more than 1,000 people and reported that 99% of them
were overconfident.

Most people created narrow ranges that included only 30% to 60% of the correct answers.

Since we tend to be too sure of ourselves, scientists must use statistical methods to quantify confidence properly.
Why Study Probability And Statistics
3. We see patterns in random data

Simulated data from 10 basketball players (1 per row) shooting 30 baskets each. An “X”
represents a successful shot and a “–” represents a miss.
Why Study Probability
3. We see patterns in random data
Most people see streaks of successful shots and conclude this is not random

Each spot had a 50% chance of being “X” (a successful shot) and a 50% chance of being
“–” (an unsuccessful shot), without taking into consideration previous shots.

We see clusters perhaps because our brains have evolved to find patterns and do so very
well.
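The simulation described above can be reproduced with a short sketch. The function names, the seed and the "longest run" summary are illustrative choices, not part of the original slides; the only inputs taken from the slide are 10 players, 30 shots each and a 50% chance per shot, independent of earlier shots.

```python
import random


def simulate_shots(n_players=10, n_shots=30, p_success=0.5, seed=0):
    """Return one string per player: 'X' for a hit, '-' for a miss."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return [
        "".join("X" if rng.random() < p_success else "-" for _ in range(n_shots))
        for _ in range(n_players)
    ]


def longest_streak(row, symbol="X"):
    """Length of the longest unbroken run of `symbol` in a row of shots."""
    best = current = 0
    for shot in row:
        current = current + 1 if shot == symbol else 0
        best = max(best, current)
    return best


rows = simulate_shots()
for row in rows:
    print(row, "longest run of hits:", longest_streak(row))
```

Runs of several hits in a row typically appear even though every shot is an independent coin flip, which is the point of the slide.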
Why Study Probability And Statistics
4. We don’t realise coincidences are common
While it is highly unlikely that any particular coincidence will occur, it
is almost certain that some seemingly astonishing set of unspecified
events will happen often, since we notice so many things each day.

Remarkable coincidences are always noted in hindsight and never predicted with foresight.
Why Study Probability
5. We don’t expect variability to depend on size
◦ Gelman (1998) looked at the relationship between the populations of counties and the age-adjusted, per-capita incidence of kidney cancer (about 15 cases per 100,000 adults in the United States).
◦ First, he focused on the counties with the lowest per-capita incidence of kidney cancer. Most of these counties had small populations. Why? One might imagine that something about the environment in these rural counties leads to lower rates of kidney cancer.
◦ Then he focused on the counties with the highest rates. Those were also small counties. Why? One might imagine that something about medical care in these tiny counties leads to higher rates of kidney cancer.
◦ But it seems strange that both the highest and the lowest incidences of kidney cancer are found in counties with small populations.
◦ The explanation is that random variation has a bigger effect on averages within small groups than within large groups. This simple principle is logical, yet it is not intuitive to many people.
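A small simulation makes the principle concrete. This is a sketch under stated assumptions: every county is given exactly the same underlying risk (the 15 per 100,000 figure quoted above), and the county sizes (1,000 and 50,000), the number of counties and the seeds are hypothetical values chosen for illustration.

```python
import random

TRUE_RATE = 15 / 100_000  # identical underlying risk in every simulated county


def simulated_rates(population, n_counties=100, seed=1):
    """Observed incidence per 100,000 in each of n_counties of equal size."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_counties):
        # Each resident independently becomes a case with probability TRUE_RATE.
        cases = sum(rng.random() < TRUE_RATE for _ in range(population))
        rates.append(cases / population * 100_000)
    return rates


small = simulated_rates(population=1_000)
large = simulated_rates(population=50_000, seed=2)

print("small counties: lowest", min(small), "highest", max(small))
print("large counties: lowest", min(large), "highest", max(large))
```

Although the true rate is the same everywhere, the small counties supply both the lowest and the highest observed rates, simply because a single case more or less changes their rate a great deal.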
Why Study Probability And Statistics

6. We don’t do Bayesian calculations intuitively
Screening blood donors for HIV.
0.1% of blood donors are HIV positive.
The antibody test is highly accurate but not quite perfect. It correctly identifies 99% of
infected blood samples but also incorrectly concludes that 1% of noninfected samples have
HIV.

When this test identifies a blood sample as having HIV present, what is
the chance that the donor does, in fact, have HIV, and what is the chance
the test result is an error (a false positive)?
Why Study Probability And Statistics

6. We don’t do Bayesian calculations intuitively

Scenario 1 – Blood donors: 0.1% have HIV
Suppose 100,000 donors are tested. Of these, 100 (0.1%) have HIV, and 99 (99%) of these will have a
positive result.
The other 99,900 will not have HIV, but the test will still label 1% of them (999) as positive. These are false positives.
So in total there will be 99 + 999 = 1,098 positive results, of which only 99/1,098 = 9% will
be true positives and the other 91% false positives.
Thus, there is only a 9% chance that HIV is actually present in a sample that tests positive.

Most people, including most physicians, intuitively think that a positive test almost
certainly means that HIV is present.
Our brains are not adept at combining what we already know (the prevalence of HIV)
with new knowledge (the test is positive).
Why Study Probability And Statistics

6. We don’t do Bayesian calculations intuitively

Scenario 2 – IV drug users: 10% have HIV
◦ Suppose 100,000 people are tested. Of these, 10,000 (10%) have HIV, and the test will
be positive for 9,900 (99%) of them.
◦ The other 90,000 people will not have HIV, but the test will incorrectly
return a positive result in 1% of cases.
◦ So there will be 900 false positive tests (1% of 90,000).
◦ Altogether, there will be 9,900 + 900 = 10,800 positive tests, of which
9,900/10,800 = 92% will be true positives.
◦ The other 8% of the positive tests will be false positives. So if a test is
positive, there is a 92% chance that there is HIV in that sample.
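The counting argument in the two scenarios is Bayes' rule applied to a screening test. The sketch below uses only the figures given on the slides (99% of infected samples detected, 1% of noninfected samples flagged, prevalence of 0.1% or 10%); the function and parameter names are illustrative.

```python
def positive_predictive_value(prevalence, sensitivity=0.99, false_positive_rate=0.01):
    """P(disease | positive test) for a given prevalence."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


for label, prevalence in [("blood donors", 0.001), ("IV drug users", 0.10)]:
    ppv = positive_predictive_value(prevalence)
    print(f"{label}: P(HIV | positive test) = {ppv:.1%}")
```

The same test gives very different answers in the two groups because the only input that changes is the prevalence.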
Why Study Probability And Statistics

6. We don’t do Bayesian calculations intuitively

The interpretation of the test result depends greatly on what fraction of
the population has the disease.
Why Study Probability And Statistics

7. We are fooled by regression toward the mean

Regression toward the mean (RTM) is a statistical phenomenon that occurs when repeated
measurements are made on the same subject or unit of observation.
It happens because values are observed with random error.
Random error - a non-systematic variation in the observed values
around a true mean (e.g. random measurement error, or random
fluctuations in a subject).
Systematic error, where the observed values are consistently biased, is
not the cause of RTM.
It is rare to observe data without random error, which makes RTM a
common phenomenon.
Why Study Probability And Statistics

7. We are fooled by regression toward the mean

All data in (A) were drawn from random distributions (Gaussian; mean = 120, SD = 15)
without regard to the designations “before” and “after” and without regard to any pairing.
(A) shows 48 random values, divided arbitrarily into 24 before–after pairs (which overlap
enough that you can’t count them all).
Why Study Probability And Statistics

7. We are fooled by regression toward the mean

This is an example of regression to the mean: the more
extreme a variable is upon its first measurement, the more
likely it is to be closer to the average the second time it is
measured
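Regression toward the mean can be demonstrated with simulated repeated measurements. This sketch rests on assumptions that are not in the slides: each subject has a stable true value (mean 120, SD 10) and each measurement adds independent random error (SD 10). The sample size, the seed and the "top 10%" cutoff are likewise illustrative.

```python
import random
import statistics

rng = random.Random(42)
n_subjects = 10_000

# Assumed model: observed value = stable true value + random measurement error.
true_values = [rng.gauss(120, 10) for _ in range(n_subjects)]
first = [t + rng.gauss(0, 10) for t in true_values]
second = [t + rng.gauss(0, 10) for t in true_values]

# Pick the subjects whose FIRST measurement was in the top 10%.
cutoff = sorted(first)[int(0.9 * n_subjects)]
extreme = [i for i in range(n_subjects) if first[i] >= cutoff]

mean_first = statistics.mean(first[i] for i in extreme)
mean_second = statistics.mean(second[i] for i in extreme)

print(f"extreme group, first measurement:  {mean_first:.1f}")
print(f"extreme group, second measurement: {mean_second:.1f}")
```

Nothing was done to the subjects between the two measurements, yet the group selected for extreme first values sits closer to the overall mean of 120 the second time.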
PROBABILITY
BASICS
Probability Basics
◦Probabilities range from 0.0 to 1.0 (or 100%) and are used to quantify
a prediction about future events or the certainty of a belief.
◦A probability of 0.0 means either that an event can’t happen or that
someone is absolutely sure that a statement is wrong.
◦A probability of 1.0 (or 100%) means that an event is certain to
happen or that someone is absolutely certain a statement is correct.
◦A probability of 0.50 (or 50%) means that an event is equally likely to
happen or not happen, or that someone believes that a statement is
equally likely to be true or false
Probability Basics
◦Probability that is “out there,” or outside your head. This is probability as
long term frequency. The probability that a certain event will happen has a
definite value, but we rarely have enough information to know that value
with certainty.

◦ Probability that is inside your head. This is probability as strength of
subjective beliefs, so it may vary among people and even among different
assessments by the same person.
PROBABILITY AS LONG-TERM FREQUENCY
A woman plans to get pregnant and wants to know the chance that her baby will be a
boy.
Probabilities can be viewed as predictions of future events that are derived by using a model.
A model is a simplified description of a mechanism.
For this example, we can create the following simple model:
◦ Each ovum has an X chromosome, and none have a Y chromosome.
◦ Half the sperm have an X chromosome (but no Y) and half have a Y chromosome (and no X).
◦ Only one sperm will fertilize the ovum.
◦ Each sperm has an equal chance of fertilizing the ovum.
◦ If the winning sperm has a Y chromosome, the fetus will have both an X and a Y chromosome
and so will be male. If the winning sperm has an X chromosome, the fetus will have two X
chromosomes and so will be female.
◦ Any miscarriage or abortion is equally likely to happen to male and female fetuses.
PROBABILITY AS LONG-TERM
FREQUENCY
◦ Thus, our model predicts that the chance that the fetus will be male is 0.50, or 50%.
◦ You can make predictions about the occurrence of future events from any model, even if
the model doesn’t reflect reality. This model is not perfect, but it is close to reality.

Probabilities based on data

◦ Of all the babies born in the world in 2011, 51.7% were boys (Central Intelligence Agency
[CIA], 2012).
◦ Based on these data, we can answer the question, what is the chance that my baby will
be a boy?
◦ The answer is 51.7%.
PROBABILITY AS STRENGTH OF BELIEF

Subjective probabilities:
◦ Someone badly wants a boy.
◦ They search the Internet and read about an interesting book:
◦ How to Choose the Sex of Your Baby explains the simple, at-home, noninvasive Shettles
method and presents detailed steps to take to conceive a child of a specific gender. The
properly applied Shettles method gives couples a 75 percent or better chance of having a
child of the desired sex. (Shettles, 1996)
◦ What is the chance that you’ll have a boy?
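◦ If someone has complete faith that the method is correct, then they believe that the
probability, as stated on the book jacket, is 75%. This is a subjective probability.
◦ Someone less certain can weight the two possibilities by their strength of belief. For a person who is 85% sure the method works as claimed, and who falls back on the 51.7% baseline otherwise:

(0.850 × 0.750) + (0.150 × 0.517) = 0.715 = 71.5% (theirs)

◦ For a skeptic who gives the method only a 1% chance of working:

(0.010 × 0.750) + (0.990 × 0.517) = 0.519 = 51.9% (yours)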
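The two lines of arithmetic above read as weighted averages, and can be checked with a short sketch. The interpretation is an assumption made here: 0.850 and 0.010 are taken as two people's strength of belief that the method works, 0.750 as the chance of a boy if it does, and 0.517 as the baseline chance if it does not. The function name is illustrative.

```python
def chance_of_boy(belief_method_works, p_if_works=0.75, p_baseline=0.517):
    """Blend the claimed and baseline chances by strength of belief in the method."""
    return belief_method_works * p_if_works + (1 - belief_method_works) * p_baseline


print(f"believer (85% sure): {chance_of_boy(0.85):.3f}")
print(f"skeptic  (1% sure):  {chance_of_boy(0.01):.3f}")
```

With complete faith (belief of 1.0) the function returns the book's 75%, and with no faith at all (belief of 0.0) it returns the 51.7% baseline.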
Probability vs. odds
Odds and probability are two alternative ways of expressing precisely the same concept.
◦ Worldwide, the sex ratio at birth in many countries is about 1.07. Another way to say this is that the
odds of having a boy versus a girl are 1.07 to 1.00, or 107 to 100.
◦ If there are 107 boys born for every 100 girls born, the chance that any particular baby will be male is
107 / (107 + 100) = 0.517, or 51.7%.

The odds are defined as the probability that the event will occur divided by the
probability that the event will not occur.

◦ If the probability of having a boy is 51.7%, then you expect 517 boys to be born for every 1,000 births.
◦ Of these 1,000 births, you expect 517 boys and 483 girls (which is 1,000 − 517) to be born.
◦ So the odds of having a boy versus a girl are 517/483 = 1.07 to 1.00.
Probability vs. odds
◦Odds can be any positive value or zero, but they cannot be negative.
◦ A probability must be between zero and 1 if expressed as a fraction, or be between
zero and 100 if expressed as a percentage
◦ A probability of 0.5 is the same as odds of 1.0. The probability that a flipped coin lands
heads is 50%. The odds are 50:50, which equals 1.0.
◦ As the probability goes from 0.5 to 1.0, the odds increase from 1.0 to approach infinity.
◦ For example, if the probability is 0.75, then the odds are 75:25, 3 to 1, or 3.0.
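The conversion between the two scales follows directly from the definition of odds as the probability that the event occurs divided by the probability that it does not. A minimal sketch, with illustrative function names:

```python
def probability_to_odds(p):
    """Odds = p / (1 - p); undefined at p = 1, where the odds are infinite."""
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1)")
    return p / (1 - p)


def odds_to_probability(odds):
    """Probability = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


print(probability_to_odds(0.75))             # the 75:25 example
print(round(odds_to_probability(1.07), 3))   # the sex ratio at birth
```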
Probability vs. statistics
◦ Probability calculations go from general to specific, from population to sample, and from model to data.
◦ Statistical calculations work in the opposite direction. You start with one set of data (the sample) and make inferences about the overall population or model. The logic goes from specific to general, from sample to population, and from data to model.
PROBABILITY IN STATISTICS
Probability can be “out there” or “in your head.”
Confidence intervals (CIs) and P values are computed from probabilities that are “out there.”
This style of analyzing data is called FREQUENTIST STATISTICS.
Only the data from the current set are used as inputs when
calculating P values or CIs. Scientists often account for
prior data and theory when interpreting these results, but
those considerations do not enter the calculations themselves.
Bayesian statistics
◦Many statisticians prefer an alternative approach called Bayesian statistics, in which
prior beliefs are quantified and used as part of the calculations.
◦These prior probabilities can be subjective (based on informed opinion), objective (based
on solid data or well-established theory), or uninformative (based on the belief that all
possibilities are equally likely).
◦Bayesian calculations combine these prior probabilities with the current data to compute
probabilities and Bayesian CIs called credible intervals
Sample space and Events

◦ The sex of all newborns in a hospital: two outcomes, male and female
◦ A set containing all the possible outcomes is the SAMPLE SPACE
◦ Outcomes can be discrete or continuous
◦ E.g. birth weight in grams: the sample space is the interval (1000, 4500) g
Mutually and Non mutually exclusive events

When the occurrence of one event excludes the possibility of another event, the events are MUTUALLY EXCLUSIVE, e.g. the blood groups A, B, AB and O, or the categories of gender.

NON-MUTUALLY EXCLUSIVE events can occur together, e.g. symptoms such as headache, nausea and fever, or a family history of diabetes, hypertension and ischaemic heart disease.
Independent and Dependent Events
When the occurrence of one event does not influence the occurrence of another, the events are
INDEPENDENT EVENTS
◦ E.g. gender does not influence blood group, or vice versa
◦ Gender of the first child does not influence the gender of future children
If the occurrence of one event affects the occurrence of others, the events are
DEPENDENT EVENTS
◦ E.g. presence of diabetes and retinopathy; presence of diabetes and family history of
diabetes
Additive rule of Probability
The additive rule is used when computing the probability that any one of several
mutually exclusive events of interest occurs,
e.g. the probability that a person selected from a sample has either
blood type “A” OR “B”:

P(A or B) = P(A) + P(B)
= 0.22 + 0.33 = 0.55
Multiplication Rule
The multiplication rule gives the probability that two (or more) events
happen together.
For two independent events, the probability that both occur is found by
multiplying the probabilities of the two events.
P(Male and blood type O) = P(Male) × P(blood type O)
= 0.50 × 0.42 = 0.21
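The multiplication rule for independent events can be checked against a simulation. The probabilities 0.50 and 0.42 come from the slide; the simulated population size, the seed and the function name are illustrative, and independence of sex and blood type is built into the simulation as an assumption.

```python
import random

P_MALE = 0.50
P_TYPE_O = 0.42


def simulate_joint_frequency(n_people=100_000, seed=3):
    """Fraction of simulated people who are both male and blood type O."""
    rng = random.Random(seed)
    both = 0
    for _ in range(n_people):
        # Two separate draws: neither attribute influences the other.
        is_male = rng.random() < P_MALE
        is_type_o = rng.random() < P_TYPE_O
        both += is_male and is_type_o
    return both / n_people


print("multiplication rule:", P_MALE * P_TYPE_O)
print("simulated frequency:", simulate_joint_frequency())
```

If the two attributes were not independent, the simulated frequency would drift away from the simple product, and a conditional probability would be needed instead.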
Non independent Events & the Modified Multiplication Rule
Let A stand for the event “known meningitis” and B for the event “recent epidemic” (in which known meningitis is having either meningitis alone or meningitis with sepsis).

We want to know the probability of event A given event B, written P(A | B), where the vertical line, |, is read as “given.”

In other words, we want to know the probability of event A, assuming that event B has happened.
Nonindependent Events & the Modified Multiplication Rule

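CONDITIONAL PROBABILITY is the probability of one event given that another event has occurred.

Put another way, the probability of a patient having known meningitis is conditional on the period of the epidemic.

For non-independent events, the modified multiplication rule uses this conditional probability: P(A and B) = P(B) × P(A | B).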
Nonmutually Exclusive Events & the Modified Addition Rule

◦ From a community hospital setting, it is known that patients with a particular
clinical condition present with three symptoms: fever,
headache and nausea (F, H, N). TOTAL 100 patients

P(F or H) = P(F) + P(H) − P(F and H)
= (62/100) + (61/100) − (35/100)
= (62 + 61 − 35)/100 = 88/100 = 0.88
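The subtraction in the modified addition rule corrects for patients who would otherwise be counted twice. The sketch below rebuilds the slide's counts (62 with fever, 61 with headache, 35 with both, out of 100) using sets; the patient ID numbers are hypothetical and exist only to make the overlap explicit.

```python
TOTAL_PATIENTS = 100

# Hypothetical patient IDs arranged to match the counts on the slide.
both = set(range(0, 35))             # 35 patients with fever AND headache
fever = both | set(range(35, 62))    # 62 patients with fever in total
headache = both | set(range(62, 88)) # 61 patients with headache in total

p_fever = len(fever) / TOTAL_PATIENTS
p_headache = len(headache) / TOTAL_PATIENTS
p_both = len(fever & headache) / TOTAL_PATIENTS

by_rule = p_fever + p_headache - p_both
by_counting = len(fever | headache) / TOTAL_PATIENTS

print("modified addition rule:", round(by_rule, 2))
print("direct count of union: ", by_counting)
```

Counting the union directly gives the same 0.88, whereas the unmodified sum 0.62 + 0.61 = 1.23 would exceed 1, which no probability can do.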

COMMON MISTAKES:
PROBABILITY
COMMON MISTAKES: PROBABILITY
Mistake: Ignoring the assumptions- “What is the chance that a fetus will be male?”
◦ We are asking about human babies. Sex ratios may be different in another species.
◦ There is only one fetus. “Will the baby be a boy or girl?” is ambiguous, or needs elaboration,
if you allow for the possibility of twins or triplets.
◦ There is only a tiny probability (which we ignore) that the baby is neither completely female
nor completely male.
◦ The sex ratio is the same for all countries and all ethnic groups.
◦ The sex ratio does not change from year to year, or between seasons.
◦ There will be (and have been) no sex-selective abortions or miscarriages, so the sex ratio at
conception is the same as the sex ratio at birth.

Probabilities are always contingent on a set of assumptions. To think clearly about
probability in any situation, you must know what those assumptions are.
COMMON MISTAKES: PROBABILITY
Mistake: Trying to understand a probability without clearly defining both the numerator
and the denominator
Numerator: baby boys. Denominator: all babies.
E.g. in vitro fertilization – ambiguous

Mistake: Reversing probability statements

◦ The probability that a baby is a boy is obviously very different from the probability that a boy is a baby.
◦ The probability that a heroin addict first used marijuana is not the same as the probability that a marijuana user will
later become addicted to heroin.
◦ The probability that someone with abdominal pain has appendicitis is not the same as the probability that someone
diagnosed with appendicitis will have had abdominal pain
◦ The probability that a statistics book will be boring is not the same as the probability that a boring book is about
statistics.
COMMON MISTAKES: PROBABILITY
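Mistake: Believing that probability has a memory
If a couple has four children, all boys, what is the chance that the next child will be a boy?
If each birth is independent of the earlier ones, the chance is the same as for any other birth. The four earlier boys do not change it.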
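The question can be explored by simulation. This sketch assumes that births are independent and uses the 51.7% figure quoted earlier in the slides as the chance of a boy at every birth; the number of simulated families and the seed are illustrative.

```python
import random


def next_child_after_four_boys(n_families=200_000, p_boy=0.517, seed=7):
    """Among simulated families whose first four children are boys,
    return the fraction whose fifth child is also a boy."""
    rng = random.Random(seed)
    four_boys = 0
    fifth_is_boy = 0
    for _ in range(n_families):
        children = [rng.random() < p_boy for _ in range(5)]
        if all(children[:4]):
            four_boys += 1
            fifth_is_boy += children[4]
    return fifth_is_boy / four_boys


fraction = next_child_after_four_boys()
print(f"P(fifth child is a boy | first four are boys) is about {fraction:.3f}")
```

Under the independence assumption the fraction stays close to 0.517: the earlier births carry no memory. Whether real births are exactly independent is a separate empirical question that this sketch does not address.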
Frequency Distribution
The set of all possible outcomes of a variable, along with the frequency (the number of occurrences) of each outcome or group of outcomes, is called the “frequency distribution.”
Probability Distribution
◦ The collection of probabilities of all possible outcomes, as computed from the observed or sampled data, is known as the observed or empirical probability distribution.
Probability Distributions

They are characterized on the basis of the type of variable.

1. Binomial and Poisson distributions: discrete variables

2. Exponential and Normal distributions: continuous variables
NORMAL / Gaussian Distribution
◦ The Gaussian bell-shaped distribution, also called the normal distribution, is the basis for much of
statistics.
◦ It arises when many random factors create variability
◦ Most values are near the mean, some values are farther from the mean, and very few are quite far from
the mean.
◦ When you plot the data on a frequency distribution, the result is a symmetrical, bell-shaped distribution,
idealized as the Gaussian distribution.
◦ For example, in a laboratory experiment, variation between experiments might be caused by several
factors: imprecise weighing of reagents, imprecise pipetting, the random nature of radioactive decay,
nonhomogeneous suspensions of cells or membranes, and so on.
◦ Variation in a clinical value might be caused by many genetic and environmental factors.
◦ When scatter is the result of many independent additive causes, the distribution will tend to follow a bell-
shaped Gaussian distribution
The Normal Distribution:

The Normal curve is a mathematical abstraction which conveniently describes ("models") many
frequency distributions of scores in real life.

[Figure: Height of 14-year-old children, town versus country; horizontal axis: height (inches), 51 to 70; vertical axis: frequency (%). Source: Francis Galton (1876) 'On the height and weight of boys aged 14, in town and country public schools.' Journal of the Anthropological Institute, 5, 174-180.]
Properties of the Normal Distribution:
1. It is bell-shaped and asymptotic at the extremes.
2. It's symmetrical around the mean.
3. The mean, median and mode all have same value.
4. It can be specified completely, once mean and SD
are known.
5. The area under the curve is directly proportional
to the relative frequency of observations.
e.g. here, 50% of scores fall below the mean, as
does 50% of the area under the curve.
e.g. here, 85% of scores fall below score X,
corresponding to 85% of the area under the curve.
Relationship between the normal curve and the
standard deviation:
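All normal curves share this property: the SD cuts off a
constant proportion of the distribution of scores.

[Figure: normal curve; vertical axis: frequency; horizontal axis: number of standard deviations either side of the mean (-3 to +3). Brackets mark 68%, 95% and 99.7% of the area within 1, 2 and 3 SDs of the mean.]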
About 68% of scores fall in the range of the mean plus and minus 1 SD;
95% in the range of the mean +/- 2 SDs;
99.7% in the range of the mean +/- 3 SDs.

e.g. IQ is normally distributed (mean = 100, SD = 15).

68% of people have IQs between 85 and 115 (100 +/- 15).
95% have IQs between 70 and 130 (100 +/- (2*15)).
99.7% have IQs between 55 and 145 (100 +/- (3*15)).

[Figure: normal curve with 68% of the area between 85 (mean - 1 SD) and 115 (mean + 1 SD).]

We can tell a lot about a population just from knowing
the mean, SD, and that scores are normally distributed.
If we encounter someone with a particular score, we can
assess how they stand in relation to the rest of their
group.
e.g. someone with an IQ of 145 is quite unusual (3 SDs
above the mean).
IQs of 3 SDs or above occur in only 0.15% of the
population [ (100-99.7) / 2 ].
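The 68%, 95% and 99.7% figures, and the 0.15% tail, can be computed rather than read from a table. This sketch uses `statistics.NormalDist` from the Python standard library with the IQ parameters given above (mean 100, SD 15); the helper name is illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)


def share_within(k):
    """Proportion of the distribution within k SDs either side of the mean."""
    return iq.cdf(iq.mean + k * iq.stdev) - iq.cdf(iq.mean - k * iq.stdev)


for k in (1, 2, 3):
    low, high = iq.mean - k * iq.stdev, iq.mean + k * iq.stdev
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share_within(k):.2%}")

share_above_145 = 1 - iq.cdf(145)
print(f"IQ of 145 or above: {share_above_145:.3%}")
```

The exact values are slightly different from the rounded rule of thumb (for example, about 95.4% rather than 95% within 2 SDs), which is why the slides say "about".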
z-scores:
z-scores are "standard scores".
A z-score states the position of a raw score in relation to
the mean of the distribution, using the standard
deviation as the unit of measurement.

$$z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}}$$

1. Find the difference between a score and the mean of the set of scores.
2. Divide this difference by the SD (in order to assess how big it really is).

for a population:
$$z = \frac{X - \mu}{\sigma}$$

for a sample:
$$z = \frac{X - \bar{X}}{s}$$
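Raw score distributions:
A score, X, is expressed in the original units of measurement.

[Figure: two raw-score distributions, one with $\bar{X} = 50$, $s = 10$ and a marked score X = 65, the other with $\bar{X} = 200$, $s = 24$ and a marked score X = 236. Both scores correspond to z = 1.5 on the z-score distribution, which has $\bar{X} = 0$, $s = 1$.]

z-score distribution:
X is expressed in terms of its deviation from the mean (in SDs).
z-scores transform our original scores into scores with a
mean of 0 and an SD of 1.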
Raw IQ scores (mean = 100, SD = 15)
z for 100 = (100-100) / 15 = 0, z for 115 = (115-100) / 15 = 1,
z for 70 = (70-100) / 15 = -2, etc.

raw:      55   70   85   100   115   130   145
z-score:  -3   -2   -1     0    +1    +2    +3
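The raw-to-z conversion above is a one-line function. A minimal sketch, using the IQ parameters from the slide (mean 100, SD 15); the function name is illustrative.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, measured in standard deviations."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


raw_iqs = [55, 70, 85, 100, 115, 130, 145]
z_values = [z_score(x, mean=100, sd=15) for x in raw_iqs]

for raw, z in zip(raw_iqs, z_values):
    print(f"raw {raw:>3} -> z = {z:+.1f}")
```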
Why use z-scores?
1. z-scores make it easier to compare scores from
distributions using different scales.

e.g. two tests:

Test A: Fred scores 78. Mean score = 70, SD = 8.
Test B: Fred scores 78. Mean score = 66, SD = 6.

Did Fred do better or worse on the second test?

Test A: as a z-score, z = (78-70) / 8 = 1.00
Test B: as a z-score , z = (78 - 66) / 6 = 2.00

Conclusion: Fred did much better on Test B.

2. z-scores enable us to determine the relationship
between one score and the rest of the scores, using just
one table for all normal distributions.
e.g. If we have 480 scores, normally distributed with a
mean of 60 and an SD of 8, how many would be 76 or
above?
(a) Graph the problem:
(b) Work out the z-score for 76:
z = (X - X̄) / s = (76 - 60) / 8 = 16 / 8 = 2.00

(c) We need to know the size of the area beyond z
(remember - the area under the Normal curve corresponds
directly to the proportion of scores).
Many statistics books (and my website!) have z-score
tables, giving us this information:

z       (a) Area between    (b) Area
        mean and z          beyond z
0.00    0.0000              0.5000
0.01    0.0040              0.4960
0.02    0.0080              0.4920
:       :                   :
1.00    0.3413 *            0.1587
:       :                   :
2.00    0.4772 +            0.0228
:       :                   :
3.00    0.4987 #            0.0013

* x 2 = 68% of scores
+ x 2 = 95% of scores
# x 2 = 99.7% of scores (roughly!)

(d) So: as a proportion of 1, 0.0228 of scores are likely to be 76 or more.

As a percentage, = 2.28%

As a number, 0.0228 * 480 = 10.94 scores.

How many scores would be 54 or less?
Graph the problem:

z = (X - X̄) / s = (54 - 60) / 8 = - 6 / 8 = - 0.75

Use table by ignoring the sign of z : “area beyond z” for
0.75 = 0.2266. Thus 22.7% of scores (109 scores) are 54
or less.
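The two table lookups above can be cross-checked by computing the tail areas directly. This sketch uses `statistics.NormalDist` from the Python standard library in place of a printed z-table, with the parameters from the worked example (480 scores, mean 60, SD 8); the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=60, sigma=8)
n_scores = 480

# Area beyond z = +2.00: scores of 76 or more.
share_76_or_more = 1 - scores.cdf(76)
# Area beyond z = -0.75: scores of 54 or less.
share_54_or_less = scores.cdf(54)

print(f"76 or more: {share_76_or_more:.4f} of scores, "
      f"about {share_76_or_more * n_scores:.1f} scores")
print(f"54 or less: {share_54_or_less:.4f} of scores, "
      f"about {share_54_or_less * n_scores:.1f} scores")
```

Because the Normal curve is symmetrical, the lower tail below z = -0.75 has the same area as the upper tail beyond z = +0.75, which is why the table can be used "by ignoring the sign of z".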
Word comprehension test scores:
Normal no. correct: mean = 92, SD = 6 out of 100
Brain-damaged person's no. correct: 89 out of 100.
Is this person's comprehension significantly impaired?

Step 1: graph the problem.

Step 2: convert 89 into a z-score:

z = (89 - 92) / 6 = - 3 / 6 = - 0.5

Step 3: use the table to find the "area beyond z" for our z of - 0.5:

Area beyond z = 0.3085

z-score value:   Area between the mean and z:   Area beyond z:
0.44             0.1700                         0.3300
0.45             0.1736                         0.3264
0.46             0.1772                         0.3228
0.47             0.1808                         0.3192
0.48             0.1844                         0.3156
0.49             0.1879                         0.3121
0.50             0.1915                         0.3085
0.51             0.1950                         0.3050
0.52             0.1985                         0.3015
0.53             0.2019                         0.2981
0.54             0.2054                         0.2946
0.55             0.2088                         0.2912
0.56             0.2123                         0.2877
0.57             0.2157                         0.2843
0.58             0.2190                         0.2810
0.59             0.2224                         0.2776
0.60             0.2257                         0.2743
0.61             0.2291                         0.2709

Conclusion: .31 (31%) of normal people are likely to have a comprehension score this low or lower.
Conclusions:
Many psychological/biological properties are normally distributed.

This is very important for statistical inference (extrapolating from samples to populations - more on this in later lectures...).

z-scores provide a way of
(a) comparing scores on different raw-score scales;
(b) showing how a given score stands in relation to the overall set of scores.
Conclusions:

The logic of z-scores underlies many statistical tests.

1. Scores are normally distributed around their mean.

2. Sample means are normally distributed around the population mean.

3. Differences between sample means are normally distributed around zero ("no difference").

We can exploit these phenomena in devising tests to help us decide whether or not an observed difference
between sample means is due to chance.
THANKS

100% (13)
Statistics For The Behavioural Sciences An Introduction To Frequentist and Bayesian Approaches, 2nd Edition Exclusive Download
17 pages
Lecture Notes - Inferential Statistics
No ratings yet
Lecture Notes - Inferential Statistics
9 pages
Some Key Comparisons Between Statistics and Mathematics and Why Teachers Should Care
No ratings yet
Some Key Comparisons Between Statistics and Mathematics and Why Teachers Should Care
14 pages
Science Human Knowledge Data Statistical Theory Mathematics Probability Theory
No ratings yet
Science Human Knowledge Data Statistical Theory Mathematics Probability Theory
6 pages
Grade 11 Third Quarter Statistics and Probability Reviewer - Docx 1
No ratings yet
Grade 11 Third Quarter Statistics and Probability Reviewer - Docx 1
5 pages
Bayes Theorem
No ratings yet
Bayes Theorem
4 pages
14 358 149822091462 64 PDF
No ratings yet
14 358 149822091462 64 PDF
3 pages
SML Book Draft Latest (001 046)
No ratings yet
SML Book Draft Latest (001 046)
46 pages
Quiz - 112 - Topic Probability Bayes Theorem 138 Suppose That..
No ratings yet
Quiz - 112 - Topic Probability Bayes Theorem 138 Suppose That..
1 page
Chapter 3 - Bayesian Inference
No ratings yet
Chapter 3 - Bayesian Inference
114 pages
MSO 201a: Probability and Statistics 2015-2016-II Semester Assignment-X
No ratings yet
MSO 201a: Probability and Statistics 2015-2016-II Semester Assignment-X
23 pages
CS3491 - Aiml - Unit Iii Supervised Learning
No ratings yet
CS3491 - Aiml - Unit Iii Supervised Learning
162 pages
Naive Bayes
No ratings yet
Naive Bayes
24 pages
All DL
No ratings yet
All DL
72 pages
MA40189 20 Mock - Sols
No ratings yet
MA40189 20 Mock - Sols
17 pages
Bayesian Models of Cognition
No ratings yet
Bayesian Models of Cognition
13 pages
Calendar - Asset-V1 - MITx+6.431x+1T2022+type@asset+block@resources - 1T2022 - Calendar - 1t2022-U10
No ratings yet
Calendar - Asset-V1 - MITx+6.431x+1T2022+type@asset+block@resources - 1T2022 - Calendar - 1t2022-U10
2 pages
Balanced Crystalloids Versus Saline For Critically Ill Patients (BEST-Living)
No ratings yet
Balanced Crystalloids Versus Saline For Critically Ill Patients (BEST-Living)
10 pages
Some Notes For Machine Learning
No ratings yet
Some Notes For Machine Learning
40 pages
Poisson Distribution
No ratings yet
Poisson Distribution
4 pages
Submission
No ratings yet
Submission
8 pages
Research Topics For Bayesian Computation (MS and PHD)
No ratings yet
Research Topics For Bayesian Computation (MS and PHD)
23 pages
Comparisons of Faulting-Based Pavement Performance Prediction Models
No ratings yet
Comparisons of Faulting-Based Pavement Performance Prediction Models
11 pages
ML 456
No ratings yet
ML 456
6 pages
Bayesian Astrostatistics: A Backward Look To The Future: Loredo@astro - Cornell.edu
No ratings yet
Bayesian Astrostatistics: A Backward Look To The Future: Loredo@astro - Cornell.edu
27 pages
Lectorial Module 2.1
No ratings yet
Lectorial Module 2.1
9 pages
Tabela Binomial
No ratings yet
Tabela Binomial
5 pages
Mit18 05 s22 Class11 Pset Sol
No ratings yet
Mit18 05 s22 Class11 Pset Sol
9 pages
STAT8150 7150assignment1
No ratings yet
STAT8150 7150assignment1
3 pages
Biostatistics I Handout
No ratings yet
Biostatistics I Handout
5 pages
Hydro Matlab
No ratings yet
Hydro Matlab
6 pages
3023-Article Text-6617-1-10-20210106
No ratings yet
3023-Article Text-6617-1-10-20210106
10 pages
A Joosr Guide to... Superforecasting by Philip Tetlock and Dan Gardner: The Art and Science of Prediction
From Everand
A Joosr Guide to... Superforecasting by Philip Tetlock and Dan Gardner: The Art and Science of Prediction
Joosr
No ratings yet
