
Lesson 1

This document outlines an online course on Descriptive Statistics offered by Roger L. Brown, Ph.D., aimed at assisting researchers with statistical analyses. It covers key topics such as measures of central tendency, measures of dispersion, and data presentation techniques, including frequency distributions and box-and-whisker plots. The course is free for all Medical Research Consulting clients and emphasizes the importance of understanding data characteristics and relationships.


Online Course #1

Descriptive Statistics
Roger L. Brown, Ph.D.
Medical Research Consulting
Middleton, WI
This online course is a FREE
service to all MRC clients
Purpose of this series

To assist researchers in the interpretation and application of
statistical analyses
Statistics?
The Science of collecting,
organizing, analyzing,
interpreting and presenting data
Topics we will review
• Descriptive Statistics

• Frequency Distributions and Histograms

Relative / Cumulative Frequency

• Measures of Central Tendency


Mean, Median, Mode, Midrange
Topics (continued)
• Measures of Dispersion (Variation)
Range, Standard Deviation,
Variance and Coefficient of variation
• Shape
Symmetric, Skewed, using Box-and-
Whisker Plots
• Quartile
• Statistical Relationships
Correlation, Covariance
Descriptive Statistics

A collection of quantitative measures and ways of describing data. This includes:
frequency distributions & histograms,
measures of central tendency,
and measures of dispersion
Descriptive Statistics
•Collect Data e.g. Survey

•Present Data e.g. Tables and Graphs

•Characterize Data e.g. Mean $\bar{x} = \frac{\sum x_i}{n}$
A Characteristic of a:
Population is a Parameter
Sample is a Statistic.
Collection of Data
•Survey/questionnaires/interviews
•Direct observation
•Secondary data source (e.g., medical charts)
Presenting Data
Graphics

The visual representation of data may be used not only to present
results/findings in the data, but may also be used to learn about the data.
Summary Measures in Descriptive Statistics
Summary Measures
Central Tendency: Mean, Median, Mode, Midrange
Quartile
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency
Central Tendency: Mean, Median, Mode, Midrange
The Mean (Arithmetic Average)
•It is the Arithmetic Average of data values:
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
•The Most Common Measure of Central Tendency
•Affected by Extreme Values (Outliers)

[Figure: two number-line plots. Left, axis 0 to 10: Mean = 5. Right, axis 0 to 14: Mean = 6.]
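
The effect of an extreme value on the mean can be sketched in a few lines of Python. The data values below are hypothetical: they were chosen only so that the two means come out as 5 and 6, and are not recovered from the original figure.

```python
def sample_mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

# Hypothetical data (an assumption, not taken from the slide).
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]   # same data, largest value pushed out to 14

print(sample_mean(without_outlier))  # 5.0
print(sample_mean(with_outlier))     # 6.0
```

Changing a single observation from 9 to 14 moves the mean by a full unit, which is the sensitivity to outliers noted above.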
The Median
•Important Measure of Central Tendency
•In an ordered array, the median is the
“middle” number.
•If n is odd, the median is the middle number.
•If n is even, the median is the average of the 2
middle numbers.
•Not Affected by Extreme Values
[Figure: two number-line plots. Left, axis 0 to 10: Median = 5. Right, axis 0 to 14: Median = 5.]
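
A minimal sketch of the odd/even rule for the median follows. The function name and the data are illustrative assumptions; the data are hypothetical values chosen so that both medians equal 5.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # n odd: the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2   # n even: average of the 2 middle numbers

# Hypothetical data (an assumption, not taken from the slide).
print(median([1, 3, 5, 7, 9]))    # 5
print(median([1, 3, 5, 7, 14]))   # 5, the extreme value does not move the median
print(median([1, 2, 3, 4]))       # 2.5
```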
The Mode
•A Measure of Central Tendency
•Value that Occurs Most Often
•Not Affected by Extreme Values
•There May Not be a Mode
•There May be Several Modes
•Used for Either Numerical or Categorical Data

[Figure: two number-line plots. Axis 0 to 14: Mode = 9. Axis 0 to 6: No Mode.]
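
The three cases above (one mode, several modes, no mode) can be sketched with a small helper. The function `modes` and all data values are hypothetical, and treating "every value occurs once" as "no mode" is the convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return all values that occur most often; empty list if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical data (assumptions, not taken from the slide).
print(modes([1, 3, 5, 9, 9, 9, 12, 14]))   # [9]
print(modes([0, 1, 2, 3, 4, 5, 6]))        # []
print(modes([1, 1, 2, 2, 3]))              # [1, 2]
print(modes(["A", "B", "B", "O"]))         # ['B'], categorical data also works
```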
Midrange
•A Measure of Central Tendency
•Average of Smallest and Largest Observation:
Midrange = $\frac{x_{largest} + x_{smallest}}{2}$
•Affected by Extreme Values
[Figure: two number-line plots on an axis from 0 to 10; Midrange = 5 in both.]
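
As a rough illustration of why the midrange is sensitive to extreme values, the sketch below uses hypothetical data; only the formula comes from the slide.

```python
def midrange(values):
    """Average of the smallest and largest observation."""
    return (max(values) + min(values)) / 2

# Hypothetical data (an assumption, not taken from the slide).
print(midrange([1, 3, 5, 7, 9]))    # 5.0
print(midrange([1, 3, 5, 7, 29]))   # 15.0, one extreme value shifts the midrange
```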
Quartiles
•Not a Measure of Central Tendency
•Split Ordered Data into 4 Quarters (25% each) at the points Q1, Q2 and Q3
•Position of i-th Quartile: position of $Q_i = \frac{i(n+1)}{4}$
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of Q1 = 1•(9 + 1)/4 = 2.50, so Q1 = (12 + 13)/2 = 12.5
Position of Q3 = 3•(9 + 1)/4 = 7.50, so Q3 = (18 + 21)/2 = 19.5
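
The position rule can be sketched as follows, using the ordered array from the slide. The function name is hypothetical, and the linear interpolation used for fractional positions is an assumption: the slide only shows the half-way case, where interpolation and averaging the two neighbours give the same answer.

```python
def quartile(ordered, i):
    """i-th quartile of already-sorted data, using position i(n+1)/4 (1-based)."""
    n = len(ordered)
    position = i * (n + 1) / 4
    lower = int(position)              # whole part of the position
    fraction = position - lower        # fractional part, e.g. 0.5
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(quartile(data, 1))   # 12.5
print(quartile(data, 2))   # 16.0
print(quartile(data, 3))   # 19.5
```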
Measures of Dispersion (Variation)

Variation
Range
Variance: Population Variance, Sample Variance
Standard Deviation: Population Standard Deviation, Sample Standard Deviation
Coefficient of Variation
Understanding Variation

• The more spread out or dispersed the data,
the larger the measures of variation
• The more concentrated or homogeneous the data,
the smaller the measures of variation
• If all observations are equal,
measures of variation = Zero
• All measures of variation are Nonnegative
The Range
• Measure of Variation
• Difference Between Largest & Smallest Observations:
Range = $x_{largest} - x_{smallest}$
• Ignores How Data Are Distributed:
[Figure: two dot plots on an axis from 7 to 12; Range = 12 - 7 = 5 in both.]
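
A short sketch of the point that the range ignores how the data are distributed. Both data sets are hypothetical, chosen only so that each runs from 7 to 12.

```python
def value_range(values):
    """Difference between the largest and smallest observation."""
    return max(values) - min(values)

# Hypothetical data (assumptions, not taken from the slide).
evenly_spread = [7, 8, 9, 10, 11, 12]
clustered_high = [7, 10, 11, 11, 12, 12]

print(value_range(evenly_spread))    # 5
print(value_range(clustered_high))   # 5, same range despite a different distribution
```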
Variance
•Important Measure of Variation
•Shows Variation About the Mean:
•For the Population: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
•For the Sample: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
For the Population: use N in the denominator.
For the Sample: use n - 1 in the denominator.
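
The two denominators can be compared directly in code. This is a minimal sketch; the function names and the data set are hypothetical and chosen only to give round numbers.

```python
def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, dividing by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data (an assumption): mean 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))   # 4.0  (32 / 8)
print(sample_variance(data))       # about 4.571  (32 / 7)
```

Python's standard `statistics` module offers `pvariance` and `variance` for the same two quantities.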
Standard Deviation
•Most Important Measure of Variation
•Shows Variation About the Mean:
•For the Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
•For the Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
For the Population: use N in the denominator.
For the Sample: use n - 1 in the denominator.
Sample Standard Deviation
$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
For the Sample: use n - 1 in the denominator.

Data: $X_i$: 10 12 14 15 17 18 18 24
n = 8 Mean = 16

$s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}}$
= 4.3095
Comparing Standard Deviations
Data: $X_i$: 10 12 14 15 17 18 18 24

N = 8 Mean = 16

$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}}$ = 4.3095
$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}}$ = 4.0311
Value for the Standard Deviation is larger for data considered as a Sample.
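
A short sketch that recomputes both standard deviations for the eight observations listed above, summing one squared deviation per observation (the value 18 occurs twice, so it contributes two terms). The variable names are illustrative.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)
mean = sum(data) / n                                  # 16.0

# One squared deviation per observation.
sum_sq_dev = sum((x - mean) ** 2 for x in data)       # 130.0

sample_sd = math.sqrt(sum_sq_dev / (n - 1))           # divide by n - 1
population_sd = math.sqrt(sum_sq_dev / n)             # divide by N

print(round(sample_sd, 4))       # 4.3095
print(round(population_sd, 4))   # 4.0311
```

Because n - 1 is smaller than N, the sample value is always the larger of the two for the same data.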
Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
[Figure: three dot plots, each on an axis from 11 to 21.]
Coefficient of Variation

Measure of Relative Variation
Always a Percentage (%)
Shows Variation Relative to Mean
Used to Compare 2 or More Groups
Formula (for Sample): $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
Comparing Coefficient of Variation
•Group A: Average Health Measure = 50, Standard Deviation = 5
•Group B: Average Health Measure = 100, Standard Deviation = 5
Coefficient of Variation: $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
Group A: CV = (5/50) • 100% = 10%
Group B: CV = (5/100) • 100% = 5%
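
The comparison of the two groups can be reproduced with a one-line function. The function name is an illustrative choice; the means and standard deviations are the ones given for Group A and Group B.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * std_dev / mean

print(coefficient_of_variation(5, 50))    # 10.0  (Group A)
print(coefficient_of_variation(5, 100))   # 5.0   (Group B)
```

Both groups have the same standard deviation, but relative to its mean Group A is twice as variable as Group B.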
Shape
•Describes How Data Are Distributed
•Measures of Shape: Symmetric or skewed
Left-Skewed (Negatively Skewed): skewness < -1; Mean < Median < Mode
Symmetric: skewness between -0.5 and 0.5; Mean = Median = Mode
Right-Skewed (Positively Skewed): skewness > 1; Mode < Median < Mean
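
One way to put a number on shape is a skewness coefficient. The slides do not say which coefficient is intended, so the sketch below assumes the common moment-based version (third central moment divided by the second central moment raised to the power 3/2). The function name and all data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        return 0.0                                  # all values equal: no skew
    return m3 / m2 ** 1.5

# Hypothetical data (assumptions, not taken from the slides).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]    # long tail to the right
left_skewed = [1, 8, 9, 9, 10, 10]    # mirror image: long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(skewness(data), 3))
```

The symmetric set gives a value of zero, the right-skewed set a positive value, and the left-skewed set a negative value of the same size.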


Box-and-Whisker Plot
•Graphical Display of Data Using 5-Number Summary:
$X_{smallest}$, Q1, Median, Q3, $X_{largest}$
[Figure: box-and-whisker plot on an axis from 4 to 12.]
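
The five numbers behind a box-and-whisker plot can be computed with Python's standard library. This sketch reuses the ordered array from the Quartiles slide, which is an assumption since the plotted data are not recoverable from the figure. `statistics.quantiles` with `method="exclusive"` bases its cut points on n + 1, which matches the i(n+1)/4 position rule for this data set.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(data))   # (11, 12.5, 16.0, 19.5, 22)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend to the smallest and largest values.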
Distribution Shape & Box-and-Whisker Plots
[Figure: three box-and-whisker plots showing Q1, Median and Q3 for Left-Skewed, Symmetric and Right-Skewed distributions.]
Summary
•Discussed Measures of Central Tendency: Mean, Median, Mode, Midrange
•Quartiles
•Addressed Measures of Variation: The Range, Interquartile Range, Variance,
Standard Deviation, Coefficient of Variation
•Determined Shape of Distributions: Symmetric, Skewed, Box-and-Whisker Plot
