Week2 Modified

The document discusses various data visualization techniques including frequency tables, histograms, boxplots, and scatterplots. It also covers probability distributions like the normal distribution and how to find probabilities using these distributions.

Uploaded by

turbonstre

BDM 2053

Big Data Algorithms and Statistics

Weekly Course Objectives
● Why do we visualize data?
● Frequency tables and histograms.
● Box plots, scatterplots, barplots.
● Discuss the normal distribution.
● Explore the application of the normal distribution in data
science.
● Go over confidence intervals.
● How do we check if something is “Normally” distributed?
● Do some examples in Python!
Data Visualization
● A picture is worth a thousand words… or numbers!
● The process of taking data, and obtaining insights using charts
and graphs is data visualization.
● We use data visualization to tell stories.
○ We can use it to quickly get a feel for our data.
○ Sometimes central tendencies and measures of dispersion
are hard to picture, but with charts you can visually see
what is going on.
● Many times, you can identify patterns and trends just through
charts and visuals!
● Leads to actionable insights (insights that you can take action
on to guide a business problem).
Data Visualization cont.
● You may not see the benefits of data visualization
because we give very basic examples in lectures.
○ Ex, Heights…
● In reality, you won’t work with just 1 variable and 10
observations. You will work with tens to hundreds of variables
with over 50k observations (oftentimes millions of
observations)!
○ For example: telephony data, customer banking
transactions, click-stream data, etc.
Frequency Tables
● Frequency shows the number of times a particular event
occurs.
● Therefore, a frequency table is a table that shows the number
of times particular events occur in ascending or descending
order of events.
● They are a great way to understand the distribution of data
tabularly.
○ Whether you have clusters of points.
○ Outliers.
○ In general, tell a story with your data!
● The procedure is to:
1) Select the number of bins (grouped intervals).
2) Count the number of observations within each bin.
3) Record the counts sequentially in a table.
Frequency Tables Example
● Once again, let’s visit our favourite data of Heights.
● Heights = {150, 156, 183, 230, 143, 138, 145, 165, 167, 158}
● Say we wanted to make 4 bins here. Since the max is 230, and
the min is 138, we can make the bins have (230-138)/4 sized
intervals.
○ (230-138)/4 = 23
● If the bins are of size 23, that means the ranges are :
○ 138 - 161, 162 - 185, 186 - 209 and 210 - 233
Bins        Frequency   Relative Frequency
138 - 161   6           6/10 = 0.6
162 - 185   3           3/10 = 0.3
186 - 209   0           0/10 = 0
210 - 233   1           1/10 = 0.1
Total       10          1
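The binning procedure can be sketched in Python. This is a minimal sketch, assuming integer-valued data and the same convention as the table (inclusive integer ranges, with each bin starting one unit after the previous bin ends). The function name `frequency_table` is made up for illustration.

```python
heights = [150, 156, 183, 230, 143, 138, 145, 165, 167, 158]


def frequency_table(values, num_bins):
    """Return (start, end, frequency, relative frequency) for each bin."""
    lo, hi = min(values), max(values)
    size = (hi - lo) // num_bins  # (230 - 138) // 4 = 23 for Heights
    rows = []
    for i in range(num_bins):
        start = lo + i * (size + 1)  # next bin starts one past the previous end
        end = start + size
        count = sum(start <= v <= end for v in values)
        rows.append((start, end, count, count / len(values)))
    return rows


for start, end, count, rel in frequency_table(heights, 4):
    print(f"{start} - {end}: frequency = {count}, relative frequency = {rel}")
```

The relative frequencies always sum to 1, which is a quick sanity check that every observation landed in exactly one bin.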
Histograms & Barplots
● Histograms extend frequency tables 1 step further by showing
them visually.
● They convey the same story but in different ways.
● In some ways, they are better because if we had many bins, it
would be very hard to comprehend and see in a table format.
● Visually, we can spot trends and patterns a lot better than in
tables with numbers and various sections.
● Barplots are the same as histograms, except we do not need to
bin numbers together and simply represent each category as its
own bin.
Histograms: Tables vs. Graphs

[Figure: a frequency table versus a histogram of the same data.]
Histograms Example
● Converting our frequency table for Heights into a chart, we get
the following:

One optimal bin size can be found here!

Boxplots
● Sometimes our goal is to not just understand the distribution
of data, but the overall symmetry and spread of our data.
● A boxplot is a way of plotting the five-number summary
(minimum, Q1, median, Q3, and maximum) of our data.
● Much like the histogram, it can help us see the distribution of
our data (is it symmetric, left skewed, right skewed?), and
identify outliers.
● Values that fall within the following ranges are considered
outliers:
○ Less than Q1 - 1.5 * IQR
○ Greater than Q3 + 1.5 * IQR
● You might be wondering where the minimum and maximum are
used here. They are checked against the criteria above, along with
every other point, to see whether they are outliers.
Boxplot Example
● Looking at the… you guessed it, Heights, we obtained the
following statistics from the last lecture:
○ Q1 = 147.5
○ Q2 (Median) = 157
○ Q3 = 166
○ Min = 138
○ Max = 230
○ IQR = Q3 - Q1 = 18.5
● From these values, we simply need to find the last 2 statistics
which would be Q1 - 1.5 * IQR and Q3 + 1.5 * IQR
○ Q1 - 1.5 * IQR = 147.5 - 1.5 * 18.5 = 119.75
○ Q3 + 1.5 * IQR = 193.75
● Values below and above the points above, respectively, are
outliers by this definition of outliers!
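The outlier check above can be sketched in a few lines of Python. This sketch takes Q1 and Q3 directly from the slide rather than recomputing them, because different quartile conventions give slightly different values. The variable names are illustrative.

```python
heights = [150, 156, 183, 230, 143, 138, 145, 165, 167, 158]

# Quartiles as quoted on the slide (assumed, not recomputed here).
q1, q3 = 147.5, 166.0

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any value outside the fences is an outlier by this definition.
outliers = [h for h in heights if h < lower_fence or h > upper_fence]

print(lower_fence, upper_fence, outliers)
```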
Boxplot Example

● We can see that the lower end and upper end of the box plot show values that are “typical”.
● There is 1 outlier identified here which is the height of 230 cm.
Scatterplots
● Often, looking at just one variable at a time isn’t
meaningful, or isn’t what we want insights about.
● Scatterplots are a way of comparing pairs of values across your
entire data set simultaneously. This way you can draw the
relationship for two (or sometimes more) variables.
● When looking at 2 variables, the x-axis would represent one
variable and the y-axis would represent another.
○ Each point would represent one observation’s pair of
values.
Scatterplots Example
● FINALLY! Let’s look at another variable besides Height…
Weight!
● Heights = {150, 156, 183, 230, 143, 138, 145, 165, 167, 158}
Weights ={115, 110, 182, 210, 104, 100, 109, 121, 124, 131}
● In terms of data, the data for this might look something like
the following:

Name             Height   Weight   …
Sailor Moon      150      115      …
Sailor Venus     156      110      …
Sailor Jupiter   183      182      …
…
Scatterplots Example cont.

[Figure: scatterplot of Height and Weight, with the points for Sailor Moon, Sailor Venus and Sailor Jupiter labelled.]
Scatterplots Example cont.

Say we had another variable that was categorical, like the strength of each person being
strong or weak
Scatterplots Example cont.

[Figure: the Height and Weight scatterplot with each point marked as Weak or Strong.]

We can add that into our scatterplot to enhance our insights. If they give more information,
it is worth showing!
Scatterplots Example cont.

Not as useful, but sometimes could be under the right settings.

More graphs!
● There exist so many more types of graphs!
More graphs!
● For more plots and how to make them, check out:
https://www.python-graph-gallery.com/
Break!!!
Probability Distributions
● In the last lecture and even in this one, we have quantified and
visualized how data is distributed.
● Much of this data follows assumptions and patterns that
resemble some classical distributions, called probability
distributions.
● More formally, probability distributions are functions that
tell you the probability of obtaining each possible value of some
random event. For example, the probability that:
○ You will arrive in the first 5 minutes of class.
○ Someone’s height is between 150cm and 160cm.
○ The average weight is greater than 200lbs.
● For continuous values, these probabilities can be represented
by a probability density function (pdf, not the file format!).
● We can draw probabilities from such events because we have
data that often follows certain assumptions!
● The total area under the curve must add up to 1.
Probability Distributions Example

Normal Distribution: µ = 162cm, σ = 17cm

Continuous

Poisson Distribution: µ = 3, σ² = 3 (for a Poisson distribution, the variance equals the mean)

Discrete
Normal Distributions
● The normal distribution (Gaussian distribution) is one of the
most commonly used probability distributions.
● The following are key facts about the normal distribution:
○ The mean, median and mode are all equal.
○ The distribution is symmetric around the mean, µ.
○ Exactly 50% of the data lies on each side of the mean.
○ About 68% of the data lies within one standard deviation of
the mean, and about 95% lies within two standard deviations.
● A big misconception is that most data in the world behaves
“normally” (is normally distributed). Actually, it is statistics
computed from the data, such as sample means, that tend to
follow a normal distribution.
○ Much real-world data follows a long-tail distribution (data
that is skewed, typically to the right).
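The one- and two-standard-deviation figures can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library and, as an assumed example, the Heights distribution from the earlier slide (µ = 162 cm, σ = 17 cm). Any other mean and standard deviation would give the same two proportions.

```python
from statistics import NormalDist

mu, sigma = 162, 17  # example parameters, taken from the Heights slide
heights_dist = NormalDist(mu=mu, sigma=sigma)

# Probability of landing within one and two standard deviations of the mean.
within_one = heights_dist.cdf(mu + sigma) - heights_dist.cdf(mu - sigma)
within_two = heights_dist.cdf(mu + 2 * sigma) - heights_dist.cdf(mu - 2 * sigma)

print(f"within 1 sd: {within_one:.4f}")
print(f"within 2 sd: {within_two:.4f}")
```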
Normal Distributions cont.

f(x) = 1/(σ√(2π)) · e^(-(x - µ)² / (2σ²)), for -∞ < x < ∞
Normal Distributions: Standard Normal
● The standard normal distribution is when the mean is equal
to 0, and the standard deviation is equal to 1.
● You can standardize any set of values with the following
equation:
Z = (x - µ)/σ
● Standardizing your data is not only useful for finding
probabilities for normal distributions, but also for scaling your
data (more on this later).
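As a small illustration of Z = (x - µ)/σ, the sketch below standardizes the Heights data. It assumes µ and σ are estimated from the data itself using the population standard deviation (`statistics.pstdev`). The slides do not say which estimate to use, so treat that choice as an assumption.

```python
from statistics import mean, pstdev

heights = [150, 156, 183, 230, 143, 138, 145, 165, 167, 158]

mu = mean(heights)       # 163.5
sigma = pstdev(heights)  # population standard deviation (assumed choice)

# Z = (x - mu) / sigma for every observation.
z_scores = [(x - mu) / sigma for x in heights]

for x, z in zip(heights, z_scores):
    print(f"{x} cm -> z = {z:.2f}")
```

After standardizing, the values have mean 0 and standard deviation 1, which is what makes them comparable across variables measured in different units.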
Normal Distributions: Finding probabilities
● The probability that your random variable, X, will be
less than some specific value, x, is written as follows:
P(X < x)
● With continuous distributions, this is not as simple as adding
probabilities as in the discrete case. For example:
○ If X is a random variable representing the number of rooms
in a rental unit, find the probability of a rental unit having
less than 3 rooms given the following:

P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
= 0 + 0.009 + 0.017
= 0.026
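In the discrete case the calculation is just a sum. The sketch below stores only the three probabilities quoted in this example in a dictionary (the rest of the rooms table is not reproduced here) and adds up the ones for fewer than 3 rooms. The name `pmf` is illustrative.

```python
# Probabilities quoted in the example: P(X = 0), P(X = 1), P(X = 2).
pmf = {0: 0.0, 1: 0.009, 2: 0.017}

# P(X < 3) is the sum of the probabilities of every outcome below 3.
p_less_than_3 = sum(p for rooms, p in pmf.items() if rooms < 3)

print(round(p_less_than_3, 3))
```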
Normal Distributions: Finding probabilities cont.
● The probability of one specific value in the continuous case is
0. I.e., if I asked you to find the probability that someone weighs
exactly 180 lbs, you would spend a very long time because
weights are continuous. Someone might be 179.9 lbs, or
180.0002 lbs, but the probability that someone is exactly 180 lbs
(180.0000000000000 lbs) is 0.
● However, the probability that someone weighs less than 180
lbs is greater than 0. Similarly, we could find the probability of
someone being between two weights (180 lbs and 190
lbs) or greater than 180 lbs.
● Since probabilities are areas under the pdf (and the total area
must equal 1), we simply need to do some math via
integration… but integrating the pdf of the normal is not easy!
The integral of the pdf up to a value x, P(X < x), is called the
cdf (cumulative distribution function).
● Using software makes this extremely fast and easy!
Normal Distributions: Example
● For his brand new Banana phone, Atinder knows that the
length of time it takes the battery to recharge fully is normally
distributed with a mean of 3 hours and a standard deviation of
30 minutes. Atinder owns one of these phones and wants to
know the probability that the recharge time will be between 2
and 2.5 hours.
● µ = 3, σ = 0.5

● P( 2 < X < 2.5 ) = ?

Normal Distributions: Example
● P( 2 < X < 2.5 ) = P( X < 2.5 ) - P( X < 2 )
= P( Z < (2.5 - 3)/0.5 ) - P( Z < (2 - 3)/0.5 )
= P( Z < -1 ) - P( Z < -2 )
= 0.1587 - 0.0228
= 0.1359
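This calculation can be reproduced with software. A minimal sketch using `statistics.NormalDist` from Python's standard library, with the recharge time in hours (the variable names are illustrative):

```python
from statistics import NormalDist

# Recharge time in hours: mean 3, standard deviation 0.5 (30 minutes).
recharge = NormalDist(mu=3, sigma=0.5)

# P(2 < X < 2.5) = P(X < 2.5) - P(X < 2), using the cdf.
p_between = recharge.cdf(2.5) - recharge.cdf(2)

print(round(p_between, 4))  # 0.1359
```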
Normal Distributions: Example 2
● The lifetime of Atinder’s phone has a normal distribution with
a mean of 24 months and standard deviation of 4 months. Find
the probability that his phone will last more than 30 months
● µ = 24, σ = 4
● P( 30 < X ) = ?

● P( 30 < X ) = 1 - P( X < 30 ) = 1 - P( Z < (30 - 24)/4 ) = 1 - P( Z < 1.5 ) = 1 - 0.9332 = 0.0668
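An upper-tail probability like this one is the complement of the cdf. A short sketch with `statistics.NormalDist` (names are illustrative):

```python
from statistics import NormalDist

# Phone lifetime in months: mean 24, standard deviation 4.
lifetime = NormalDist(mu=24, sigma=4)

# P(X > 30) = 1 - P(X < 30).
p_more_than_30 = 1 - lifetime.cdf(30)

print(round(p_more_than_30, 4))  # 0.0668
```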

Normal Distributions: Example 3
● Entry to a certain University is determined by a national test.
The scores on this test are normally distributed with a mean of
500 and a standard deviation of 100. Tom wants to be admitted
to this university and he knows that he must score better than
at least 70% of the students who took the test. Tom takes the
test and scores 585. Will he be admitted to this university?
● µ = 500, σ = 100

● P( X < x ) = 0.7
Normal Distributions: Example 3 cont.
● Here the tricky thing is we know what the probability is, we
just don’t know what value satisfies it within the parameters
provided.
● We can use software to find the inverse of this very easily!
● For P( X < x ) = 0.7, x must be 552.44, which we round up to 553.
● Since Tom scored greater than 553, he will be admitted! Yay!
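Finding the value that matches a given probability is the inverse of the cdf. In Python's standard library this is `NormalDist.inv_cdf`. The sketch below finds the 70% cutoff score and compares it with Tom's score (the variable names are illustrative):

```python
from statistics import NormalDist

# Test scores: mean 500, standard deviation 100.
scores = NormalDist(mu=500, sigma=100)

# Score x such that P(X < x) = 0.7.
cutoff = scores.inv_cdf(0.70)

tom_score = 585
admitted = tom_score > cutoff

print(f"cutoff = {cutoff:.2f}, admitted = {admitted}")
```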
Confidence Intervals
● Placing our trust in 1 value for an analysis is very risky.
● Say we wanted to forecast budgets and stated that we expect
the average savings to be $10,000 next month. When the next
month financial results occur, we find out we actually only
saved $8,000 - but the business trusted us so much that they
allocated the $2,000 not saved to some other product. Now we
are in trouble!
● To avoid saying “I don’t know” or “maybe we expect
somewhere around $10,000”, we instead give a range of what we
can expect!
● So, a confidence interval is a range of values that, at a stated
confidence level, is likely to contain the true value we are
estimating.

Point estimate: mean. Confidence interval: mean ± margin of error.

Confidence Intervals cont.

● We essentially want to find an interval that is likely to capture
the true population mean, because a single estimate could be off:
○ We could have gotten a bad sample.
○ Getting more observations for our sample can be
expensive.
● The wider the interval, the more confident we are that it
captures the true mean!
● We need to use a bit of math to derive the values needed to
calculate the confidence intervals.
Confidence Intervals Example
● Atinder has made his own chocolate called Atindies! Like
Smarties, but with an A on them. He would like to know the
average weight a box of chocolates can have. Say he took a
sample of 1000 chocolates and found the mean and standard
deviation to be 45 grams and 3.8 grams, respectively.
● What is the 90% confidence interval for the weight of the box?

● The 90% confidence interval is captured
when you are within 1.645 standard deviations
(of the sample mean) of the mean!
○ We want the interval to be around the
mean s.t. it captures 90% of the probability.
○ How? Proof on next slide
Confidence Intervals Example: Proof
● We know that since the confidence interval is symmetric
around the mean, we must trim off a total of α from the tails of
the standard normal distribution.
○ 1-α = 90%
○ Therefore, here, α = 10%. This means 5% is trimmed off
each side of the normal distribution:

● Therefore, P(Z<z) = 0.95 -> z = 1.645.

Confidence Intervals Example: Proof
● Remember, to standardize a value we subtract the mean, and
then divide by the standard deviation. So we get the
following:
Z = (X-µ)/σ
● However, the sample mean X̄ of a sample of size n has its
own mean and variance:
○ E(X̄) = µ
○ Var(X̄) = σ²/n
■ S.D. = σ / √(n)
● So, standardizing the observed sample mean x̄, we get as the
confidence interval:
-z_{α/2} < Z < z_{α/2}
-z_{α/2} < (x̄ - µ) / (σ / √(n)) < z_{α/2}
x̄ - z_{α/2}(σ / √(n)) < µ < x̄ + z_{α/2}(σ / √(n)),
where z_{α/2} is the value that covers 1-α/2 probability, i.e. P(Z < z_{α/2}) = 1-α/2
For proof of the expected value and variance of the sample mean, visit here! , and the CI proof here!
Confidence Intervals: Back to the example
x̄ - z_{α/2}(σ / √(n)) < µ < x̄ + z_{α/2}(σ / √(n))
45 - 1.645*(3.8 / √(1000)) < µ < 45 + 1.645*(3.8 / √(1000))
45 - 0.19767 < µ < 45 + 0.19767
44.802 < µ < 45.198
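The same interval can be computed in Python. This is a minimal sketch, assuming the sample standard deviation (3.8 g) stands in for σ, as the slides do. The function name `z_confidence_interval` is made up for illustration, and the critical value comes from `NormalDist().inv_cdf` rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sd, n, confidence):
    """Return (low, high) for a z-based confidence interval around the mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645 for 90% confidence
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(sample_mean=45, sd=3.8, n=1000, confidence=0.90)
print(f"{low:.3f} < mu < {high:.3f}")
```

Raising `confidence` to 0.95 or 0.99 widens the interval, which matches the idea that a wider interval buys more confidence.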
Is our data really Normal?
● We make all these assumptions, but one of the most important
things to check is if our data is even normally distributed to
begin with!
● We turn to QQ plots, which stands for quantile-quantile plots.
● As the name suggests, we want to compare each quantile of our
data to a distribution we think it is, and if it truly is distributed
according to that distribution, they should fit perfectly 1-1.
○ This means that the quantiles should form a line!
● But how do we do this?
QQ Plots Continued
● Begin by ranking your data in ascending order (smallest to
largest).
● Calculate the percentiles by taking the rank, subtracting 0.5
and then dividing by the total number of observations.
○ Percentile = (Rank - 0.5) / (n)
● Find the value according to the normal distribution that
achieves that percentile.
○ Ex: If a value is in the 10th percentile, we need to find what
value of the normal distribution (the standard normal
preferably) covers that probability.
● Standardize your data point to convert it to a z value.
● Compare your quantiles!
● An example will be done in Python!
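The steps above can be sketched without any plotting library by computing the pairs of quantiles that a QQ plot would draw. This sketch makes two assumptions: it compares against the standard normal, and it standardizes the data with the sample mean and sample standard deviation (`statistics.stdev`). The function name `qq_points` is illustrative.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (theoretical quantile, observed z value) pairs for a normal QQ plot."""
    ordered = sorted(values)  # rank the data, smallest to largest
    n = len(ordered)
    mu, sigma = mean(ordered), stdev(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    points = []
    for rank, x in enumerate(ordered, start=1):
        percentile = (rank - 0.5) / n                    # Percentile = (Rank - 0.5) / n
        theoretical = std_normal.inv_cdf(percentile)     # normal value at that percentile
        observed = (x - mu) / sigma                      # standardized data point
        points.append((theoretical, observed))
    return points


heights = [150, 156, 183, 230, 143, 138, 145, 165, 167, 158]
for theoretical, observed in qq_points(heights):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the data were normally distributed, the two columns would be close to each other, so the points would fall near a straight line when plotted.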
Resources
● https://www.easycalculation.com/statistics/bell-curve-calculator.php
● http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html
Thank you
