05 Descriptive Statistics - Distribution

The document discusses probability distributions and introduces key concepts from probability theory that are important for understanding distributions. It focuses on defining random variables, trials, outcomes, sample spaces, and probability. It also defines discrete and continuous probability distributions, and provides the uniform distribution as an example of a continuous distribution. The uniform distribution has a constant probability density function, and the probabilities of ranges of random variables can be calculated by finding the area under the curve.

Uploaded by

Ace Choice

Descriptive Statistics: Distribution & A Small Part Of Probability.

In statistics, the distribution of data usually refers to how the data is spread when graphed. Previously,
measures of dispersion were discussed, in which a numerical value was used to describe how the data is
spread. Several patterns of distribution have been determined, and these patterns are used frequently
in inferential statistics.

In inferential statistics, a sample of data is used to determine something about the population. Hence,
when the data is assumed to fit a particular type of distribution, more information can be gathered.
There are many types of distributions, but this section will focus on the uniform distribution and the
normal distribution. The uniform distribution will mainly be used to introduce terminology and concepts
used in distribution analysis.

Before beginning on distributions, some terminology and concepts from probability theory will be
formalized. You should be familiar with the idea of probability, so this will not be discussed here.
Probability theory is the mathematical formalization of probability, but this is beyond the scope of this
class. If you decide to look into it, you will see that our usual statistics shares some terminology with
probability theory, but the way it is presented is a bit different.

Probability Theory
To reiterate, only the terminology and concepts from probability theory that help in understanding
distributions are discussed here. This is not a proper introduction to probability theory.

In probability theory, a trial, or experiment, is any procedure that can be repeated infinitely many times
and has a well-defined set of possible results from running the experiment. In probability theory, these
possible results are called outcomes and the set of all possible outcomes is called the sample space.
“Well-defined” is a mathematical concept whose definition varies slightly depending on what is being
discussed. In this case, the set is well-defined, so what you have been taught about sets being well-
defined applies (if you forgot, the basic idea is that a set is well-defined if it’s clear whether something
belongs in the set or not).

Performing a trial results in only one outcome. To be clear, a trial may result in any of the outcomes in
the sample space, but once it is executed, the result must be only one. For example, when you roll a six-
sided die, the sample space is that any one of the six sides is face-up. But after you throw the die, it is
not possible for more than one side to be face up. For the more “technical” amongst us, yes, depending
on where and how you throw it, it may be possible that the die ends up on a corner or an edge. If you
really want to account for these outcomes, you should include them in your sample space. Just note that
this tends to complicate analysis, so unless necessary for your needs, only the simple cases of one face
up are considered.
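
The die example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `sample_space` and `roll_die` are made up for this sketch, the six faces are labelled 1 to 6, and edge or corner landings are left out of the sample space.

```python
import random

# Sample space for one roll of a six-sided die: the face that lands up.
sample_space = [1, 2, 3, 4, 5, 6]

def roll_die(rng):
    """Perform one trial and return its single outcome."""
    return rng.choice(sample_space)

rng = random.Random(0)  # fixed seed so that reruns give the same rolls
outcomes = [roll_die(rng) for _ in range(5)]
print(outcomes)  # five trials, each producing exactly one outcome
```

Each call to `roll_die` is one trial. Any of the six outcomes may come up, but a single call returns only one of them.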

A trial is said to be deterministic if it has only one possible outcome. That is, there is only one element in
the sample space. Most of the experiments in your science classes are deterministic. If there are at least
2 possible outcomes, then the trial is said to be random.

In probability theory, a number between zero and one (including zero and one) is associated with each
outcome. This number is called the probability of the outcome, and it describes the likelihood that
the outcome will occur. A value of zero means the outcome will never occur, while a value of one means
the outcome will always occur. Note that when you add the probabilities of each outcome in the sample
space, you should get exactly one. This means that you are guaranteed that one of the outcomes will
occur. You should be familiar with this idea from basic probability.

A random variable is a variable used to represent outcomes. If you recall your basic math, variables
usually represent some number that is to be determined. In statistics, a random variable represents an
outcome. A common symbol for a random variable is one you already know from ordinary algebra:
the letter “x”. Note, though, that to take advantage of numerical
processes, outcomes are usually represented by numbers, so, you will frequently see random variables
representing numbers. For these cases, make sure to keep in mind that these numbers actually
represent an outcome.

For example, when throwing a coin, we may let H represent the outcome that a heads lands face up,
while T represents the outcome that a tails lands face up. Assuming that there is a 50% chance for each
to occur, we may write
P( x = H ) = 0.5
which can be read as “the probability that the random variable x is the outcome represented by H is 0.5”.
Or, less formally, “the probability that the outcome is H is 0.5”.

Note that the use of the letters T and H is arbitrary. Any letter, number, or symbol may be used.
Sometimes, though, numerical labelling is convenient. For example, if we are labelling the outcomes for
rolling a die, we can use the number 1 to represent the outcome when a one lands face up, the number
2 to represent the outcome when a two lands face up, etc.

A probability distribution describes the probabilities for each value of the random variable.
Distributions are usually expressed as a formula, graph, or table.
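
As an illustration of a distribution expressed as a table, here is a minimal Python sketch for the die example. It assumes a fair die, so each face gets probability 1/6; that assumption and the name `die_distribution` are introduced only for this sketch.

```python
from fractions import Fraction

# Fair-die assumption: each of the six outcomes gets probability 1/6.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Print the distribution as a small table.
for face, prob in die_distribution.items():
    print(f"P( x = {face} ) = {prob}")

# The probabilities of all outcomes in the sample space add up to one.
total = sum(die_distribution.values())
print("Sum of probabilities:", total)
```

Using exact fractions instead of decimals avoids rounding, so the sum comes out as exactly one.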

A discrete probability distribution is a distribution whose random variable is discrete. There are many
discrete distributions, but two of the more common ones are the binomial probability distribution and
the Poisson probability distribution. The binomial distribution is a distribution for the number of
successes in a fixed number of independent trials, where each trial has only two possible outcomes
(usually called success and failure). The Poisson distribution is a distribution for the number of
occurrences of an event over an interval (which may be time, length, etc.).
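
The formulas for these two distributions are not given in this section, but the standard textbook forms can be sketched in Python. The function names and the parameters `n`, `p`, and `rate` are choices made for this sketch: `n` is the number of trials, `p` is the probability of success in each trial, and `rate` is the average number of occurrences per interval.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, rate):
    """Probability of exactly k occurrences in an interval when the
    average number of occurrences per interval is `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)

# Example: exactly 5 heads in 10 tosses of a fair coin.
print(binomial_pmf(5, 10, 0.5))

# Example: no occurrences in an interval that averages 2 occurrences.
print(poisson_pmf(0, 2.0))
```

Both functions return the probability of a single value of the random variable, which is typical of formulas for discrete distributions.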

A continuous probability distribution is a distribution whose random variable is continuous. There are
many types of continuous distributions; the two that will be discussed are the continuous uniform
probability distribution and the normal probability distribution. Recall from calculus that the area
under a curve over an interval is given by the integral of the function over that interval. Thus, if the
formula for a continuous distribution is integrable, you may use calculus to determine probabilities.
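
In symbols, the link between areas and probabilities can be written as follows. The names f (the formula of the distribution) and a, b (the endpoints of the interval) are introduced here only for illustration.

```latex
P(a < x < b) = \int_{a}^{b} f(x)\,dx,
\qquad
\int_{-\infty}^{\infty} f(x)\,dx = 1
```

The first equation says that the probability of landing between a and b is the area under the curve over that interval. The second says that the total area under the curve is one.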

There is an important difference between discrete and continuous distributions. For discrete
distributions, the formula usually gives the probability of each value of the random variable. In
continuous distributions, because of the nature of continuity, the probability of any single value of the
random variable is always zero, but the probability of a range of values may not be zero. Thus, the
formula doesn’t give the probability of a single value. Instead, the area under a portion of the graph
represents the probability that the random variable falls in the corresponding range of values. There are
omitted details regarding how such ranges are formed, but just be aware that there are technical
considerations involved.
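
One way to see why a single value has probability zero is to shrink a range until its width is zero. The sketch below assumes, purely for illustration, a distribution whose graph has a constant height of 0.5. The function name `interval_probability` is hypothetical.

```python
def interval_probability(width, height=0.5):
    """Area of a rectangle of the given width under a graph of constant height."""
    return width * height

# As the range narrows toward a single point, its probability shrinks to zero.
for width in [1.0, 0.1, 0.01, 0.001, 0.0]:
    print(f"width = {width}: probability = {interval_probability(width)}")
```

A range of width zero is a single value, and its area, and hence its probability, is zero.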
The following link describes some other common distributions:
https://www.analyticsvidhya.com/blog/2017/09/6-probability-distributions-data-science/

A continuous uniform distribution, or uniform distribution for short, is a distribution whose formula is
a constant over the range of possible values. Its graph is a horizontal segment. As stated above, the area
under the graph represents the sum of all the probabilities, so it should be equal to one. The area under
any portion of the graph is a rectangle (or a square). This makes life easier, as computing the area
reduces to computing the area of a rectangle.

The following graph is an example of what the graph of a uniform distribution could look like.

You will notice that the possible values of the random variable are the real numbers between 1 and 3.
Also, the y-value is always 0.5. The total area under the graph is one, as required, since the area is
length times height. The length of the line is 2 (coming from 3 – 1 = 2), and the height is 0.5 (the
distance from the x-axis to the line). So, the area is 2 × 0.5 = 1.
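
The graph described above can also be written as a small Python function. This is a sketch, not part of the original notes: the name `uniform_pdf` and the default endpoints 1 and 3 are chosen to match the example, and the height is computed as one divided by the length so that the total area is one.

```python
def uniform_pdf(x, low=1.0, high=3.0):
    """Height of the uniform graph at x: constant between low and high, zero elsewhere."""
    if low <= x <= high:
        return 1.0 / (high - low)
    return 0.0

height = uniform_pdf(2.0)     # any x between 1 and 3 gives the same height
length = 3.0 - 1.0
total_area = length * height  # area of the rectangle under the graph

print("height:", height)
print("total area:", total_area)
```

Choosing the height as 1 / (high - low) is what forces the total area to equal one. A longer segment must therefore sit lower.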

In continuous distributions, outcomes are usually represented by numbers, so when talking
about probabilities, there are additional notations formed by using inequalities. For example,

P( x < 7 ) = 0.3
which can be read as “the probability that the random variable x is any outcome represented by any
number less than 7 is 0.3”. Or, less formally, “the probability that the outcome is less than 7 is 0.3”.

Another example,

P( – 3.5 < x < 9.12 ) = 0.8

which can be read as “the probability that the random variable x is any outcome represented by any
number greater than –3.5 but less than 9.12 is 0.8”. Or, less formally, “the probability that the outcome is
between –3.5 and 9.12 is 0.8”.

With these notations in mind, we can determine some probabilities for the uniform distribution graphed
above. (The two examples just given only illustrated the notation. Their values do not come from that graph.)

P( x < 7 ) = 1 (all possible values of x are between 1 and 3, which are all less than 7, so it’s guaranteed
that x is less than 7)

P( x < 0.5 ) = 0 (all possible values of x are between 1 and 3, none of which are less than 0.5, so it’s
guaranteed that x is never less than 0.5)

P( x < 2 ) = 0.5 (from the length being 2 – 1 = 1, and the height being 0.5, so the area is length times
height, which would be 1 × 0.5 = 0.5)

P( x < 2.5 ) = 0.75 (from the length being 2.5 – 1 = 1.5, and the height being 0.5, so the area is length
times height, which would be 1.5 × 0.5 = 0.75)

P( x > 0.2 ) = 1 (all possible values of x are between 1 and 3, which are all greater than 0.2, so it’s
guaranteed that x is greater than 0.2)

P( x > 5 ) = 0 (all possible values of x are between 1 and 3, none of which are greater than 5, so it’s
guaranteed that x is never greater than 5)

P( x > 2 ) = 0.5 (from the length being 3 – 2 = 1, and the height being 0.5, so the area is length times
height, which would be 1 × 0.5 = 0.5)

P( x > 2.2 ) = 0.4 (from the length being 3 – 2.2 = 0.8, and the height being 0.5, so the area is length
times height, which would be 0.8 × 0.5 = 0.4)

P( 1.5 < x < 2.1 ) = 0.3 (from the length being 2.1 – 1.5 = 0.6, and the height being 0.5, so the area is
length times height, which would be 0.6 × 0.5 = 0.3)

P( 1.1 < x < 2.9 ) = 0.9 (from the length being 2.9 – 1.1 = 1.8, and the height being 0.5, so the area is
length times height, which would be 1.8 × 0.5 = 0.9)

P( x < 1.5 or x > 2.1 ) = 0.7 (from two areas being formed. One area is formed with the length being 1.5
– 1 = 0.5, and the height being 0.5, so the area is length times height, which would be 0.5 × 0.5 = 0.25.
The other area is formed with the length being 3 – 2.1 = 0.9, and the height being 0.5, so the area is
length times height, which would be 0.9 × 0.5 = 0.45. So the total area is 0.25 + 0.45 = 0.7).
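
The worked examples above all follow one pattern: clip the requested range to the interval from 1 to 3, then multiply the remaining length by the height. A minimal Python sketch of that pattern follows. The names `LOW`, `HIGH`, `HEIGHT`, and `uniform_prob` are made up for this illustration.

```python
import math

LOW, HIGH = 1.0, 3.0         # possible values of x in the example graph
HEIGHT = 1.0 / (HIGH - LOW)  # 0.5, so that the total area is one

def uniform_prob(a=-math.inf, b=math.inf):
    """P( a < x < b ) for the uniform distribution between LOW and HIGH."""
    left = max(a, LOW)                # clip the range to where the graph is
    right = min(b, HIGH)
    length = max(0.0, right - left)   # no overlap means zero length
    return length * HEIGHT            # area = length times height

print(round(uniform_prob(b=7), 4))           # P( x < 7 )
print(round(uniform_prob(b=0.5), 4))         # P( x < 0.5 )
print(round(uniform_prob(b=2.5), 4))         # P( x < 2.5 )
print(round(uniform_prob(a=2.2), 4))         # P( x > 2.2 )
print(round(uniform_prob(a=1.5, b=2.1), 4))  # P( 1.5 < x < 2.1 )

# For "or" with ranges that do not overlap, add the two areas.
print(round(uniform_prob(b=1.5) + uniform_prob(a=2.1), 4))  # P( x < 1.5 or x > 2.1 )
```

The results are rounded because decimal arithmetic on a computer can be off by a tiny amount. The values should agree with the hand computations above.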

Remember, the distribution of the data is under descriptive statistics because it describes how the data
is spread. There are formulas/procedures to determine how close to a particular distribution a given
data set is. Just note that there are many types of distributions. The continuous uniform distribution is
very simple. Its graph is just a line segment, so variations between different uniform distributions are
mainly in the length of the segment and where it is placed. Although there are some real-world cases of
uniform distributions, in this class, it has mainly been used to introduce notations and concepts.

Another type of distribution that you will frequently encounter is the normal distribution. It is described
by the following formula:
$$
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}
$$

The only variable in the above equation is x. You should be familiar with the constant π being
approximately 3.14; the constant e is approximately 2.718. You’ve encountered the Greek letters μ and σ
previously, and these just represent constants in the formula. Not coincidentally, they also represent the
mean and standard deviation, respectively, of the distribution. The graphs of normal distributions are
usually called “bell-shaped”.
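
The normal formula, which is 1 / (σ √(2π)) times e raised to the power –½ ((x – μ) / σ)², translates directly into Python. The function name `normal_pdf` and the sample values of the mean and standard deviation are assumptions made for this sketch.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)

# The curve is highest at the mean and symmetric around it ("bell-shaped").
mu, sigma = 10.0, 2.0
for x in [6.0, 8.0, 10.0, 12.0, 14.0]:
    print(x, round(normal_pdf(x, mu, sigma), 4))
```

Printing heights at equal distances on both sides of the mean shows the symmetry of the bell shape.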

As you can see from the formula, there are many variations of the normal distribution, depending on the
values of the mean and standard deviation. Also, if you recall your calculus, areas under curves can be
computed using integration. Unfortunately, the above function can’t be integrated “nicely”; that is, its
antiderivative doesn’t have a closed form in terms of elementary functions. In other words, it’s not easy
to compute the areas exactly by hand, so it’s not easy to compute the probabilities.
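
Although the areas cannot be found with a simple antiderivative, they can be approximated numerically. The sketch below uses the trapezoid rule, which is one common choice and not a method from these notes. The function names and the number of steps are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coefficient * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def trapezoid_area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with thin trapezoids."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))   # the two end points count half
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

# Probability of landing within one standard deviation of the mean.
area = trapezoid_area(normal_pdf, -1.0, 1.0)
print(round(area, 4))
```

With the default mean 0 and standard deviation 1, the printed area is about 0.68, the familiar probability of landing within one standard deviation of the mean.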

The good news is that instead of studying each distinct normal distribution, we can study a particular
normal distribution, and we can use information from this to get information about the other normal
distributions. The standard normal distribution is the normal distribution with μ = 0 and σ = 1.
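
The usual way to carry information from the standard normal distribution over to any other normal distribution is standardization. The step is not spelled out in these notes, so the following is a sketch of the standard textbook relation, with z and a as names introduced here.

```latex
z = \frac{x - \mu}{\sigma},
\qquad
P(x < a) = P\!\left( z < \frac{a - \mu}{\sigma} \right)
```

Here z follows the standard normal distribution, so a probability for any normal distribution becomes a probability for the standard one.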

The standard normal distribution has been studied, and tables for areas have been made. The most used
table is the z-table.

If you have forgotten how to read z-tables, you may find a guide here:
https://towardsdatascience.com/how-to-use-and-create-a-z-table-standard-normal-table-240e21f36e53
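
Entries of a z-table can also be computed instead of looked up. The sketch below assumes a table that lists the area to the left of z. Some tables list a different area, so check the table you use. It relies on the error function `math.erf` from Python's standard library; the function names are made up for this illustration.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P( X < x ) for a normal distribution, by converting x to a z-value first."""
    return standard_normal_cdf((x - mu) / sigma)

# A few entries as they would appear in a left-tail z-table.
for z in [0.0, 1.0, 1.96]:
    print(z, round(standard_normal_cdf(z), 4))

# The same table answers questions about other normal distributions.
print(round(normal_cdf(12.0, mu=10.0, sigma=2.0), 4))  # same as z = 1.0
```

The last line shows the idea behind the standard normal distribution: a value of 12 with mean 10 and standard deviation 2 corresponds to z = 1, so it gives the same area as the z = 1 entry.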
