Introduction To Descriptive Statistics

The document discusses key measures and concepts in descriptive statistics such as measures of center, spread, skew, and kurtosis. It also covers the distinction between population and sample notation, how to calculate the mean, variance, standard deviation, and how to visualize univariate data through histograms, density plots, box plots, and other graphs.

Uploaded by

Sudhir Aggarwal

Introduction to Descriptive Statistics
17.871
Key measures
Describing data

            Moment                          Non-mean-based measure
Center      Mean                            Mode, median
Spread      Variance (standard deviation)   Range, interquartile range
Skew        Skewness                        --
Peaked      Kurtosis                        --
Key distinction
Population vs. Sample Notation

Population    Sample
Greeks        Romans
μ, σ, β       s, b
Mean

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Variance, Standard Deviation

$$\sigma^2 = \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{n}, \qquad \sigma = \sqrt{\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{n}}$$
Variance, S.D. of a Sample

$$s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}, \qquad s = \sqrt{\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}}$$

Degrees of freedom: n - 1
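The population and sample formulas differ only in the divisor. A minimal Python sketch of the arithmetic follows; the function names and the five scores are made up for illustration, and the deck itself uses Stata rather than Python.

```python
import math

def mean(xs):
    """Arithmetic mean: sum of the values divided by n."""
    return sum(xs) / len(xs)

def variance(xs, sample=True):
    """Sum of squared deviations from the mean, divided by
    n - 1 (sample) or n (population)."""
    n = len(xs)
    center = mean(xs)
    sum_sq = sum((x - center) ** 2 for x in xs)
    return sum_sq / (n - 1 if sample else n)

def std_dev(xs, sample=True):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(xs, sample))

# Hypothetical data, not taken from the High School and Beyond survey.
scores = [0.2, 0.4, 0.4, 0.6, 0.9]

print(mean(scores))                    # mean
print(variance(scores, sample=False))  # divides by n
print(variance(scores, sample=True))   # divides by n - 1
print(std_dev(scores, sample=True))
```

With only five observations the two variances differ noticeably; the gap shrinks as n grows.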
Binary data

$$\bar{X} = \text{prob}(X = 1) = \text{proportion of time } x = 1$$

$$s_x^2 = \bar{x}(1 - \bar{x}) \;\Rightarrow\; s_x = \sqrt{\bar{x}(1 - \bar{x})}$$
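The shortcut for binary data can be checked numerically. The sketch below uses a made-up 0/1 variable and compares the shortcut with the variance computed directly. One point worth noting: the shortcut matches the divide-by-n form of the variance exactly; the n - 1 form is larger by a factor of n/(n - 1).

```python
import math

# Hypothetical 0/1 variable (e.g. 1 = voted, 0 = did not vote).
votes = [1, 0, 1, 1, 0, 0, 1, 1]
n = len(votes)

# For binary data the mean is the proportion of ones.
p = sum(votes) / n

# Shortcut from the slide: variance = mean * (1 - mean).
var_shortcut = p * (1 - p)

# Direct computation, dividing by n.
var_direct_n = sum((v - p) ** 2 for v in votes) / n

# Direct computation, dividing by n - 1.
var_direct_n1 = sum((v - p) ** 2 for v in votes) / (n - 1)

print(p, var_shortcut, var_direct_n, var_direct_n1)
print(math.sqrt(var_shortcut))  # standard deviation from the shortcut
```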
Normal distribution example
• IQ
• SAT
• Height

• “No skew”
• “Zero skew”
• Symmetrical
• Mean = median = mode

[Figure: bell-shaped curve, frequency plotted against value]

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}$$
Skewness
Asymmetrical distribution
• Income
• Contribution to candidates
• Populations of countries
• “Residual vote” rates

• “Positive skew”
• “Right skew”

[Figure: right-skewed curve, frequency plotted against value]
Skewness
Asymmetrical distribution
• GPA of MIT students

• “Negative skew”
• “Left skew”

[Figure: left-skewed curve, frequency plotted against value]
Skewness
[Figure: frequency plotted against value]
Kurtosis
• k > 3: leptokurtic
• k = 3: mesokurtic
• k < 3: platykurtic

[Figure: three curves of differing peakedness, frequency plotted against value]
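Skewness and kurtosis are the third and fourth standardized moments. The Python sketch below assumes the simple moment-based definitions (central moments divided by n), under which a normal distribution has kurtosis 3, in line with the k = 3 benchmark above. Statistical packages may apply small-sample adjustments, so their output can differ slightly. The function names and data are illustrative only.

```python
def central_moment(xs, k):
    """k-th central moment: average of (x - mean)^k, dividing by n."""
    n = len(xs)
    center = sum(xs) / n
    return sum((x - center) ** k for x in xs) / n

def skewness(xs):
    """Third central moment scaled by variance^(3/2); 0 for symmetric data."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    """Fourth central moment scaled by variance^2 (not 'excess' kurtosis)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

symmetric = [1, 2, 3, 4, 5]       # hypothetical, evenly spread
right_skewed = [1, 1, 2, 2, 10]   # hypothetical, long right tail
left_skewed = [-10, -2, -2, -1, -1]

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_skewed))
print(skewness(left_skewed))
```

The evenly spread example has kurtosis below 3, so it would be called platykurtic; the long-tailed examples show the sign of the skew following the direction of the tail.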
Normal distribution
• Skewness = 0
• Kurtosis = 3

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}$$
More words about the normal curve
The z-score or the “standardized score”

$$z = \frac{x - \bar{x}}{\sigma_x}$$
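A z-score expresses each value as a number of standard deviations from the mean. The sketch below is a Python illustration with made-up scores; it assumes the sample standard deviation (n - 1 divisor) in the denominator, which is a choice made here, not something the slide specifies.

```python
import statistics

def z_scores(xs):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    center = statistics.mean(xs)
    spread = statistics.stdev(xs)  # sample s.d. (n - 1 divisor); an assumption
    return [(x - center) / spread for x in xs]

# Hypothetical scores, not from the survey data.
scores = [0.15, 0.30, 0.45, 0.60, 0.75]
z = z_scores(scores)

for raw, standardized in zip(scores, z):
    print(f"{raw:.2f} -> {standardized:+.3f}")
```

By construction the standardized values have mean 0 and standard deviation 1, which makes variables measured on different scales comparable.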
Commands in STATA for univariate statistics
• summarize varname
• summarize varname, detail
• histogram varname, bin() start() width() density/fraction/frequency normal
• graph box varnames
• tabulate [NB: compare to table]
Example of Sophomore Test Scores
• High School and Beyond, 1980: A Longitudinal Survey of Students in the United States (ICPSR Study 7896)
• totalscore = % of questions answered correctly minus penalty for guessing
• recodedtype = (1 = public school, 2 = religious private, 3 = non-sectarian private)
Explore totalscore some more

. table recodedtype,c(mean totalscore)

--------------------------
recodedty |
pe | mean(totals~e)
----------+---------------
1 | .3729735
2 | .4475548
3 | .589883
--------------------------
Graph totalscore
. hist totalscore

[Figure: histogram of totalscore; y-axis Density (0 to 2), x-axis totalscore (-.5 to 1)]
Divide into “bins” so that each bar represents 1% correct
. hist totalscore, width(.01)
(bin=124, start=-.24209334, width=.01)

[Figure: histogram of totalscore with bars of width .01; y-axis Density (0 to 2), x-axis totalscore (-.5 to 1)]
Add ticks at each 10% mark
. histogram totalscore, width(.01) xlabel(-.2 (.1) 1)
(bin=124, start=-.24209334, width=.01)

[Figure: histogram of totalscore; y-axis Density (0 to 2), x-axis totalscore labeled from -.2 to 1 in steps of .1]
Superimpose the normal curve
(with the same mean and s.d. as the empirical distribution)

. histogram totalscore, width(.01) xlabel(-.2 (.1) 1) normal
(bin=124, start=-.24209334, width=.01)

[Figure: histogram of totalscore with a normal curve overlaid; y-axis Density (0 to 2), x-axis totalscore labeled from -.2 to 1 in steps of .1]
Histograms by category
. histogram totalscore, width(.01) xlabel(-.2 (.1) 1) by(recodedtype)
(bin=124, start=-.24209334, width=.01)

[Figure: three density histograms of totalscore, one panel per school type: 1 Public, 2 Religious private, 3 Nonsectarian private. Graphs by recodedtype]
Main issues with histograms
• Proper level of aggregation
• Non-regular data categories
A note about histograms with unnatural categories
From the Current Population Survey (2000), Voter and Registration Survey

How long (have you/has name) lived at this address?

-9 No Response
-3 Refused
-2 Don't know
-1 Not in universe
1 Less than 1 month
2 1-6 months
3 7-11 months
4 1-2 years
5 3-4 years
6 5 years or longer
Solution, Step 1
Map artificial category onto “natural” midpoint
-9 No Response → missing
-3 Refused → missing
-2 Don't know → missing
-1 Not in universe → missing
1 Less than 1 month → 1/24 = 0.042
2 1-6 months → 3.5/12 = 0.29
3 7-11 months → 9/12 = 0.75
4 1-2 years → 1.5
5 3-4 years → 3.5
6 5 years or longer → 10 (arbitrary)
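The recoding step amounts to a lookup from survey code to a midpoint in years. A minimal Python sketch of that mapping follows; the names `MIDPOINT_YEARS` and `recode_longevity` are hypothetical, and `None` stands in for a missing value.

```python
# Survey code -> "natural" midpoint in years, following the mapping above.
# Negative codes (no response, refused, don't know, not in universe)
# are deliberately absent so they come back as missing.
MIDPOINT_YEARS = {
    1: 1 / 24,    # less than 1 month: midpoint of 0 to 1/12 year
    2: 3.5 / 12,  # 1-6 months
    3: 9 / 12,    # 7-11 months
    4: 1.5,       # 1-2 years
    5: 3.5,       # 3-4 years
    6: 10.0,      # 5 years or longer (arbitrary value)
}

def recode_longevity(code):
    """Return the midpoint in years, or None when the response is missing."""
    return MIDPOINT_YEARS.get(code)

# Hypothetical raw responses.
responses = [6, 2, -9, 4, 1, -2]
recoded = [recode_longevity(code) for code in responses]
print(recoded)
```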
Graph of recoded data
. histogram longevity, fraction

[Figure: histogram of longevity; y-axis Fraction (0 to .557134), x-axis longevity (0 to 10)]
Density plot of data
Total area of last bar = .557
Width of bar = 11 (arbitrary)
Solve for: a = w × h, or .557 = 11h => h = .051

[Figure: density plot of longevity; x-axis longevity (0 to 15)]
Density plot template

Category   Fraction   X-min   X-max   X-length   Height (density)
< 1 mo.    .0156      0       1/12    .082       .19*
1-6 mo.    .0909      1/12    1/2     .417       .22
7-11 mo.   .0430      1/2     1       .500       .09
1-2 yr.    .1529      1       2       1          .15
3-4 yr.    .1404      2       4       2          .07
5+ yr.     .5571      4       15      11         .05

* = .0156/.082
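Each height in the template is the category's fraction divided by the width of its interval, so that bar area (width × height) equals the fraction. The Python sketch below reproduces the column using the fractions and interval endpoints from the table; the variable names are illustrative.

```python
# (label, fraction of respondents, x-min in years, x-max in years)
categories = [
    ("< 1 mo.",  0.0156, 0.0,    1 / 12),
    ("1-6 mo.",  0.0909, 1 / 12, 0.5),
    ("7-11 mo.", 0.0430, 0.5,    1.0),
    ("1-2 yr.",  0.1529, 1.0,    2.0),
    ("3-4 yr.",  0.1404, 2.0,    4.0),
    ("5+ yr.",   0.5571, 4.0,    15.0),  # upper bound of 15 is arbitrary
]

heights = []
total_area = 0.0
for label, fraction, x_min, x_max in categories:
    width = x_max - x_min
    height = fraction / width      # density = area / width
    heights.append(round(height, 2))
    total_area += width * height   # each bar's area equals its fraction
    print(f"{label:<9} width={width:.3f} height={height:.2f}")

print(heights)
print(round(total_area, 4))
```

The bar areas sum to roughly 1 (the listed fractions add up to .9999), which is what distinguishes a density plot from a plot of raw fractions over unequal intervals.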
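Draw the previous graph with a box plot
. graph box totalscore

[Figure: box plot of totalscore (y-axis -.5 to 1), annotated with the upper quartile, median, and lower quartile; the box spans the inter-quartile range, and a bracket marks 1.5 x IQR]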
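The quantities annotated on the box plot can be computed directly. The Python sketch below uses made-up data and the default quartile method of `statistics.quantiles`; statistical packages use slightly different quartile rules, so values on small samples may not match Stata's exactly. The 1.5 x IQR fences are the usual cutoffs beyond which points are drawn individually as outliers.

```python
import statistics

def box_plot_summary(xs):
    """Quartiles, inter-quartile range, 1.5 x IQR fences, and outliers."""
    q1, median, q3 = statistics.quantiles(xs, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]
summary = box_plot_summary(data)
for name, value in summary.items():
    print(name, value)
```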
Draw the box plots for the different types of schools
. graph box totalscore, by(recodedtype)

[Figure: box plots of totalscore in three separate panels (1, 2, 3), each with y-axis from -.5 to 1. Graphs by recodedtype]
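Draw the box plots for the different types of schools using “over” option
. graph box totalscore, over(recodedtype)

[Figure: box plots of totalscore for school types 1, 2, and 3 side by side on one set of axes; y-axis from -.5 to 1]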
Three words about pie charts:
don’t use them
So, what’s wrong with them
• For non-time-series data, it is hard to get a comparison among groups; the eye is very bad at judging the relative size of circle slices
• For time-series data, it is hard to grasp cross-time comparisons
Some words about graphical presentation
• Aspects of graphical integrity (following Edward Tufte, Visual Display of Quantitative Information)
  • Main point should be readily apparent
  • Show as much data as possible
  • Write clear labels on the graph
  • Show data variation, not design variation
