StatisticsLecture1

The document discusses the importance of descriptive statistics in communicating information and reasoning about data, using the Challenger disaster as a case study. It outlines various methods for visualizing data, including dot plots, pie charts, bar graphs, histograms, boxplots, and scatterplots, each serving different purposes. Additionally, it explains numerical summary measures such as mean, median, percentiles, and standard deviation to summarize data effectively.

Uploaded by

thekonan726

Introduction to Statistics | Lecture 1

Why are descriptive statistics important?

In January 1986, the space shuttle Challenger broke apart shortly after liftoff. The
accident was caused by a part that was not designed to fly at the unusually cold
temperature of 29 °F at launch.
Here are the launch temperatures of the first 25 shuttle missions (in degrees F):

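[Figure: histogram of the launch temperatures, with bins [29, 42], (42, 55], (55, 68], (68, 81] (°F) on the horizontal axis and a vertical axis running from 0 to 16.]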

The two most important functions of descriptive statistics are to communicate information
and to support reasoning about data.
When exploring large data sets, it becomes essential to use summaries.
It is best to use a graphical summary to communicate information, because people
prefer to look at pictures rather than at numbers. There are many ways to visualize data. The
nature of the data and the goal of the visualization determine which method to choose.

Pie chart and Dot plot

The dot plot makes it easier to compare the frequencies of various categories, while the pie
chart makes it easier to eyeball what fraction of the total a category corresponds to.
Bar graph
When the data are quantitative (i.e. numbers), they should be put on a number
line. This is because the ordering and the distance between the numbers convey
important information. The bar graph is essentially a dot plot put on its side.

The Histogram
The histogram allows the use of blocks with different widths. The key point is that the areas of
the blocks are proportional to frequency.

So the percentage falling into a block can be figured out without a vertical scale, since the
total area equals 100%. Still, it is helpful to have a vertical scale (density scale). Its unit is '%
per unit', so in the example above the vertical unit is '% per year'.

The histogram gives two kinds of information about the data:

1. Density (crowding): The height of a block tells how many subjects there are per
unit on the horizontal scale. For example, the highest density is around age 19, where
the height is 0.04 = 4% per year, so about 4% of all subjects are age 19. In contrast,
only about 0.7% of subjects fall into each one-year range for ages 60–80.

2. Percentages (relative frequencies): These are given by

area = height x width.
For example, about 14% of all subjects fall into the age range 60–80, because the
corresponding area is (20 years) x (0.7% per year) = 14%. Alternatively, you can find
this answer by eyeballing that this area makes up roughly 1/7 of the total area of the
histogram, so roughly 1/7 ≈ 14% of all subjects fall in that range.
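
The relation area = height x width can be checked with a short Python sketch. The bin edges and counts below are made up for illustration and are not the lecture's data, and the helper name `density_heights` is hypothetical; only the last bin is chosen to reproduce the 60–80 example (0.7% per year over 20 years).

```python
def density_heights(counts, edges):
    """Return block heights in '% per unit' for bins of possibly unequal width."""
    total = sum(counts)
    heights = []
    for count, left, right in zip(counts, edges[:-1], edges[1:]):
        percent = 100.0 * count / total            # area of the block, in %
        heights.append(percent / (right - left))   # height = area / width
    return heights

# Hypothetical data: age bins (years) of unequal width and subjects per bin.
edges = [0, 10, 30, 60, 80]
counts = [10, 40, 36, 14]

heights = density_heights(counts, edges)

# Multiplying each height by its bin width recovers the percentage in that bin.
areas = [h * (right - left)
         for h, left, right in zip(heights, edges[:-1], edges[1:])]

for left, right, h, a in zip(edges[:-1], edges[1:], heights, areas):
    print(f"{left:>2}-{right:<2}  height {h:.2f} % per year  area {a:.1f} %")
```

Because the areas are percentages of the whole, they add up to 100% regardless of how the bins are chosen.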

The boxplot
The boxplot depicts five key numbers of the data. The boxplot conveys less information than
a histogram, but it takes up less space and so is well suited to comparing several datasets:

The Scatterplot
The scatterplot is used to depict data that come as pairs. The scatterplot visualizes the
relationship between the two variables.
Numerical summary measures
For summarizing data with one number, use the mean (average) or the median.
The median is the number that is larger than half the data and smaller than the other
half.

Differences between mean and median:

1. Symmetric data – data sets whose values are evenly spread around the centre.
2. Skewed data – data sets that aren't symmetric.

Mean and median are the same when the histogram is symmetric:

If the median sale price of 10 homes is $1 million, then we know that at least 5 homes sold for
$1 million or more. If we are told that the average sale price is $1 million, then we can't
draw such a conclusion.
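
A small Python sketch makes the home-sales point concrete. Both price lists are invented for illustration (in thousands of dollars), and `count_at_least` is a hypothetical helper: the first list has a median of $1 million, the second has a mean of $1 million but is heavily skewed by one expensive home.

```python
from statistics import mean, median

def count_at_least(prices, threshold):
    """Count how many prices are at or above the threshold."""
    return sum(1 for p in prices if p >= threshold)

# Hypothetical sale prices in thousands of dollars (not from the lecture).
prices_a = [500, 600, 700, 800, 1000, 1000, 1200, 1500, 2000, 3000]
prices_b = [500] * 9 + [5500]   # nine cheap homes and one very expensive one

print("A: median", median(prices_a), "homes >= 1000:", count_at_least(prices_a, 1000))
print("B: mean  ", mean(prices_b), "homes >= 1000:", count_at_least(prices_b, 1000))
print("B: median", median(prices_b))
```

In the second list the mean is $1 million even though only one home sold for that much or more, which is why the mean alone does not support the conclusion.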
Percentiles

The 90th percentile of incomes is $135,000: 90% of households report an income of
$135,000 or less, 10% report more.
The 75th percentile is called the 3rd quartile: $85,000.
The 50th percentile is the median: $50,000.
The 25th percentile is called the 1st quartile.

Recall that the boxplot gives a five-number summary of the data:

the smallest number, 1st quartile, median, 3rd quartile, largest number.

The interquartile range = 3rd quartile − 1st quartile. It measures how spread out the data are.
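
As a rough illustration, the five-number summary and the interquartile range can be computed with Python's standard library. The data values and the helper name `five_number_summary` are assumptions made for this sketch. Quartile conventions differ between textbooks and software; the `method="inclusive"` option of `statistics.quantiles` is just one of them, so other tools may give slightly different quartiles for the same data.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (smallest, 1st quartile, median, 3rd quartile, largest)."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, med, q3, max(data)

# Hypothetical data for illustration only.
data = [2, 4, 4, 5, 7, 8, 9, 11, 30]

smallest, q1, med, q3, largest = five_number_summary(data)
iqr = q3 - q1   # interquartile range = 3rd quartile - 1st quartile

print("five-number summary:", (smallest, q1, med, q3, largest))
print("interquartile range:", iqr)
```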

The standard deviation

A more commonly used measure of spread is the standard deviation.
$\bar{x}$ stands for the average of the numbers $x_1, x_2, \ldots, x_n$.
The standard deviation of these numbers is:

$$ s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2} \quad\text{or}\quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2} $$

The two numbers $\bar{x}$ and $s$ are often used to summarize data. Both are sensitive to a
few very large or very small values. If that is a concern, use the median and the interquartile range.
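
The following Python sketch computes the standard deviation with both divisors (n and n − 1) and shows how a single extreme value affects the mean and standard deviation much more than the median. The data and the function name `std_dev` are made up for illustration; `pstdev` and `stdev` from the standard library are used only as a cross-check.

```python
import math
from statistics import mean, median, pstdev, stdev

def std_dev(xs, sample=False):
    """Standard deviation with divisor n, or n - 1 when sample=True."""
    n = len(xs)
    x_bar = sum(xs) / n
    sum_sq = sum((x - x_bar) ** 2 for x in xs)   # sum of squared deviations
    return math.sqrt(sum_sq / (n - 1 if sample else n))

# Hypothetical data for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd_n = std_dev(data)                  # divisor n
sd_n1 = std_dev(data, sample=True)    # divisor n - 1

# Adding one extreme value moves the mean and SD far more than the median.
with_outlier = data + [100]

print("divisor n:    ", round(sd_n, 3), " library pstdev:", round(pstdev(data), 3))
print("divisor n - 1:", round(sd_n1, 3), " library stdev: ", round(stdev(data), 3))
print("mean:  ", mean(data), "->", round(mean(with_outlier), 2))
print("SD:    ", round(sd_n, 2), "->", round(std_dev(with_outlier), 2))
print("median:", median(data), "->", median(with_outlier))
```

The two divisors give noticeably different results only for small n; for large data sets the difference is negligible.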
