Topic3 Descriptive Statistics

This document provides an overview of descriptive statistics used to summarize datasets. It discusses how to summarize categorical data through frequency distributions, tables, pie charts, and bar charts. For quantitative data, it describes measures of central tendency like the mean, median, and mode, as well as measures of variability such as standard deviation, range, and interquartile range. Percentiles and how to identify outliers are also covered.

Uploaded by

Alfred Wong

Descriptive Statistics

Linh Nghiem
MATH1905
Overview

• In this topic, we will look at some graphical and numerical measures to summarize variables in a dataset

• Reading: Chapters 2 and 3 of Ross

Summarizing Categorical Data
Frequency distribution

Favorite Sport

Football Others Others Football Baseball Basketball Others Basketball

Tennis Soccer Basketball Soccer Others Soccer Football Basketball

Others Football Basketball Football Others Baseball Football Soccer

Others Football Basketball Football Baseball Soccer Basketball Basketball

Tabular summary

Favorite Sport    # Students (Frequency)    Percent Frequency (%)
Football                    8                     25
Basketball                  8                     25
Baseball                    3                      9.375
Tennis                      1                      3.125
Soccer                      5                     15.625
Others                      7                     21.875
Total                      32                    100
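
The tabular summary can be reproduced by counting the raw responses. The following is a minimal sketch, assuming the 32 responses are typed in exactly as listed under Favorite Sport; the function name `frequency_table` is a hypothetical helper, not something from the slides.

```python
from collections import Counter

# Responses copied from the "Favorite Sport" listing (32 students).
responses = (
    "Football Others Others Football Baseball Basketball Others Basketball "
    "Tennis Soccer Basketball Soccer Others Soccer Football Basketball "
    "Others Football Basketball Football Others Baseball Football Soccer "
    "Others Football Basketball Football Baseball Soccer Basketball Basketball"
).split()


def frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {category: (k, 100 * k / total) for category, k in counts.items()}


table = frequency_table(responses)
for sport, (count, percent) in table.items():
    # One row per category: name, frequency, percent frequency.
    print(f"{sport:<12}{count:>3}{percent:>9.3f}")
```

The same counts feed the pie chart (percentages) and the bar chart (frequencies).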
Pie chart

[Figure: pie chart of Favorite Sport. Football 25%, Basketball 25%, Baseball 9.375%, Tennis 3.125%, Soccer 15.625%, Others 21.875%]
Bar chart

[Figure: bar chart of Favorite Sport. Horizontal axis: Football, Basketball, Baseball, Tennis, Soccer, Others. Vertical axis: number of students (frequency)]
Grouped bar chart

[Figure: grouped bar chart of Favourite Sport by Gender. Horizontal axis: Football, Basketball, Baseball, Tennis, Soccer, Others. Vertical axis: number of students. Groups: Women, Men]
Summarizing Quantitative Data
Main descriptions

• Location:
- Mean, median and mode
- Relative standing: quartiles, percentiles
• Variability:
- Standard deviation
- Range and interquartile range
• Shape:
- Symmetry and skewness
- Uni-modal and multi-modal
Mean

• Average of all the elements

• Sample mean:
\[ \bar{x} = \frac{1}{n}(x_1 + x_2 + \dots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i, \]
with n = sample size.

Example

Hours Worked Last Week

40 20 40 35 0
50 0 25 30 40
40 40 40 40 10
40 62 37 40 45
40 30 10 42 40

\[ \bar{x} = \frac{40 + 20 + 40 + \dots + 42 + 40}{25} = 34.24 \]
Median

• The middle observation

- Sort the observations in ascending order
- If the number of observations is odd, the median = middle observation
- If the number of observations is even, the median = average of the middle two observations
Median

Sorted data (by rows)

0 0 10 10 20
25 30 30 35 37
40 40 40 40 40
40 40 40 40 40
42 45 50 60 62

We have n = 25 observations, so the median is the 13th observation in the sorted data; i.e., median = 40.
Median

Student marks in a quiz

40 20 80 48 76 46
45 65 55 50 42 64

Sorted marks
20 40 42 45 46 48
50 55 64 65 76 80

We have n = 12 observations, so the two middle observations are the 6th and the 7th in the sorted data:
\[ \text{Median} = \frac{48 + 50}{2} = 49. \]
Mode

• The most frequent value(s) in the dataset, if one exists

- E.g., hours worked data: mode = 40
- E.g., student quiz data: no mode (every mark occurs exactly once)
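
The three location measures can be checked with Python's standard `statistics` module. This is a sketch under stated assumptions: the hours-worked values are taken from the sorted listing in the median example, and the helper `modes` is hypothetical, encoding the convention used here that a dataset whose values are all distinct has no mode.

```python
import statistics

# Sorted hours-worked values from the median example (n = 25).
hours = [0, 0, 10, 10, 20, 25, 30, 30, 35, 37,
         40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
         42, 45, 50, 60, 62]

# Quiz marks (n = 12).
marks = [40, 20, 80, 48, 76, 46, 45, 65, 55, 50, 42, 64]


def modes(values):
    """Most frequent value(s); an empty list when every value occurs once."""
    if len(set(values)) == len(values):
        return []  # assumption: all-distinct data is reported as "no mode"
    return statistics.multimode(values)


hours_mean = statistics.mean(hours)      # sum of the values divided by n
hours_median = statistics.median(hours)  # 13th of the 25 sorted values
marks_median = statistics.median(marks)  # average of the 6th and 7th sorted values

print(hours_mean, hours_median, modes(hours))
print(marks_median, modes(marks))
```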
Relative standing

• A measure of relative standing describes the location of a particular value relative to the rest of the distribution of the data.

• Given a set of data and a proportion p between 0 and 1, the (100p)th percentile is the value dividing the data so that (100p)% of data values are below the percentile.

[Figure: the 20th percentile splits the data so that 20% of the values lie below it and 80% lie above it]
Percentiles
• First, second, third quartiles: p = 0.25, 0.50, 0.75 respectively.
- Median = 50th percentile = second quartile.
- Denoted as Q1, Q2, and Q3 respectively.

[Figure: Q1 (first quartile, 25th percentile), Q2 (second quartile, 50th percentile, median), and Q3 (third quartile, 75th percentile) split the data into four parts of 25% each]
Calculating percentiles
• If we have n observations, the location of the pth percentile (p on a 0 to 100 scale) is given by
\[ L_p = (n + 1) \times \frac{p}{100} \]

• Use linear interpolation if L_p is not an integer.

• Example: What is the 31st percentile of the work hours data?
Sorted data (by rows)
0 0 10 10 20
25 30 30 35 37
40 40 40 40 40
40 40 40 40 40
42 45 50 60 62
• n = 25, p = 31, so
\[ L_{31} = (25 + 1) \times \frac{31}{100} = 8.06 \]
• Hence, the 31st percentile lies 0.06 of the distance between the 8th and the 9th observations in the sorted data, which are 30 and 35 respectively.
• Then the 31st percentile is 30 + 0.06 × (35 − 30) = 30.3.
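
The location rule and the interpolation step can be written as a short function. This is only a sketch of the (n + 1) method described above: the name `percentile` is hypothetical, and clamping the location to the range 1..n when it falls outside the data is an added assumption, since the slides do not say what to do in that case.

```python
def percentile(data, p):
    """p-th percentile (p on a 0-100 scale) using the location L_p = (n + 1) * p / 100."""
    xs = sorted(data)
    n = len(xs)
    loc = (n + 1) * p / 100    # 1-based location L_p in the sorted data
    loc = min(max(loc, 1), n)  # assumption: clamp locations outside 1..n
    k = int(loc)               # integer part: index of the lower neighbour
    frac = loc - k             # fractional part: how far towards the next value
    if k == n:                 # already at the largest observation
        return float(xs[-1])
    # Linear interpolation between the k-th and (k + 1)-th sorted observations.
    return xs[k - 1] + frac * (xs[k] - xs[k - 1])


hours = [0, 0, 10, 10, 20, 25, 30, 30, 35, 37,
         40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
         42, 45, 50, 60, 62]

p31 = percentile(hours, 31)
print(round(p31, 2))
```

Other software may return a slightly different value for the same percentile, because several interpolation conventions are in use.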
Calculating percentiles

• The above method is only an approximation. In practice, there are many other formulas for computing percentiles on the data.
- quantile() function in R has 9 options for type, each corresponding to one distinct way of calculating quantiles and leading to a (slightly) different result.

• The concept is more important than the actual computation.

Measure of variability

• Variance: measures the spread of observations around the mean; always non-negative.

Sample variance
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right) \]

• Standard deviation: square root of variance.

- Has the same unit as the observations.
Example

Student marks in a quiz

40 20 80 48 76 46
45 65 55 50 42 64

\[ \bar{x} = \frac{40 + 20 + 80 + \dots + 64}{12} = 52.58 \]
\[ s^2 = \frac{1}{12 - 1}\left\{(40 - 52.58)^2 + (20 - 52.58)^2 + \dots + (64 - 52.58)^2\right\} = 277.3561 \]
\[ s = \sqrt{277.3561} = 16.65 \]
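
Both forms of the sample variance formula can be checked numerically on the quiz marks. The sketch below uses hypothetical function names; `sample_variance` follows the definition and `sample_variance_shortcut` follows the sum-of-squares form.

```python
import math

marks = [40, 20, 80, 48, 76, 46, 45, 65, 55, 50, 42, 64]


def sample_variance(xs):
    """Definition: squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Equivalent form: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)


s2 = sample_variance(marks)
s = math.sqrt(s2)  # standard deviation, in the same unit as the marks

print(round(s2, 4), round(s, 2))
```

The two forms agree up to floating-point rounding. The definition is usually the safer one to code, because the shortcut subtracts two large numbers.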
Range and interquartile range
• Range: difference between maximum and minimum
• Interquartile range (IQR): difference between third and first quartile.

Sorted marks
20 40 42 45 46 48
50 55 64 65 76 80

Range = 80 − 20 = 60
\[ L_{75} = (12 + 1) \times \frac{75}{100} = 9.75, \quad Q_3 = 64 + (65 - 64) \times 0.75 = 64.75 \]
\[ L_{25} = (12 + 1) \times \frac{25}{100} = 3.25, \quad Q_1 = 42 + (45 - 42) \times 0.25 = 42.75 \]

IQR = Q3 − Q1 = 64.75 − 42.75 = 22

Outliers
• Broadly speaking, an outlier is an observation that is far from the
majority of other observations in the data

• A common rule (suggested by Tukey) flags as an outlier any point that is either:
- Smaller than Q1 − 1.5 × IQR
- Bigger than Q3 + 1.5 × IQR

• Outliers can contain information, so don’t automatically remove them

Student marks in a quiz
40 5 80 50 76 46
45 65 55 50 42 64

Q1 = 42.75, Q3 = 64.75, IQR = 22
Q1 − 1.5 × IQR = 9.75
Q3 + 1.5 × IQR = 97.75
The mark 5 is below 9.75, so the rule flags it as an outlier.
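
Tukey's rule can be sketched with the standard library. One assumption to note: `statistics.quantiles` with its default "exclusive" method places the quartiles at positions i(n + 1)/4 with linear interpolation, which coincides with the (n + 1) location rule used in these slides; the function name `tukey_fences` is hypothetical.

```python
import statistics

# Quiz marks as listed in the outlier example.
marks = [40, 5, 80, 50, 76, 46, 45, 65, 55, 50, 42, 64]


def tukey_fences(data, k=1.5):
    """Return (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default method="exclusive"
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = tukey_fences(marks)
outliers = [x for x in marks if x < lower or x > upper]

print(lower, upper, outliers)
```

The function only flags points. Whether a flagged point is an error or a genuine observation still has to be judged from context.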
Outliers

• The median is more robust to outliers than the mean.

• The interquartile range is more robust to outliers than the variance/standard deviation.

Student marks in a quiz
40 20 80 48 76 46
45 65 55 50 42 64
Mean = 52.58, Median = 49, SD = 16.65, IQR = 22

Student marks in a quiz (20 replaced by 5)
40 5 80 48 76 46
45 65 55 50 42 64
Mean = 51.33, Median = 49, SD = 19.62, IQR = 22
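
The robustness comparison can be reproduced by computing the four summaries before and after one mark is changed. This is a sketch, assuming the second dataset differs from the first only in that the mark 20 is replaced by 5; `summary` is a hypothetical helper, and quartiles again rely on the default "exclusive" method of `statistics.quantiles`.

```python
import statistics

original = [40, 20, 80, 48, 76, 46, 45, 65, 55, 50, 42, 64]
modified = [40, 5, 80, 48, 76, 46, 45, 65, 55, 50, 42, 64]  # 20 replaced by 5


def summary(data):
    """Mean, median, sample standard deviation, and interquartile range."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),  # uses the n - 1 divisor
        "iqr": q3 - q1,
    }


before = summary(original)
after = summary(modified)

for name in before:
    print(f"{name:<7}{before[name]:>8.2f}{after[name]:>8.2f}")
```

The mean and standard deviation move, while the median and IQR stay the same.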
Histogram

• A histogram plots the frequency of data falling into defined intervals.

• Constructing histograms:
- Determine the minimum and maximum values of the data
- Divide the range into non-overlapping, contiguous, and roughly equal intervals
- Count the frequency or relative frequency in each interval
- Plot the intervals on the horizontal axis and the (relative) frequency on the vertical axis.
• There is no consensus rule for defining the number of intervals.
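
The counting step of the construction can be sketched as follows. The choice of seven intervals of width 10 starting at 0 is an assumption made for illustration (the hours-worked data run from 0 to 62), and `histogram_counts` is a hypothetical helper.

```python
def histogram_counts(data, start, width, n_bins):
    """Count observations in n_bins contiguous intervals
    [start + i*width, start + (i + 1)*width), i = 0, ..., n_bins - 1."""
    counts = [0] * n_bins
    for x in data:
        i = int((x - start) // width)  # index of the interval containing x
        if not 0 <= i < n_bins:
            raise ValueError(f"{x} falls outside the chosen intervals")
        counts[i] += 1
    return counts


hours = [0, 0, 10, 10, 20, 25, 30, 30, 35, 37,
         40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
         42, 45, 50, 60, 62]

counts = histogram_counts(hours, start=0, width=10, n_bins=7)
relative = [100 * c / len(hours) for c in counts]  # relative frequency in %

# A text rendering: one row per interval, one '#' per observation.
for i, c in enumerate(counts):
    print(f"[{10 * i:>2}, {10 * (i + 1):>2})  {'#' * c}")
```

Changing `width` or `n_bins` changes the picture, which is why the number of intervals is a judgment call.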
Histogram: Examples
Example
Usual Travel Time to Work (Minutes)
Source: 2009 American Community Survey

Travel time       Number of workers (thousands)    Relative Frequency (%)
<10 Minutes                 18,565                        14.0
10-14 Minutes               19,328                        14.6
15-19 Minutes               20,775                        15.7
20-24 Minutes               19,559                        14.7
25-29 Minutes                8,040                         6.1
30-34 Minutes               17,874                        13.5
35-44 Minutes                8,321                         6.3
45-59 Minutes                9,834                         7.4
60+ Minutes                 10,378                         7.8
Total                      132,674                       100.0
Histogram

[Figure: histogram of usual travel time to work. Horizontal axis: Travel Time (Minutes). Vertical axis: Percent Frequency]
Symmetry and skewness
Symmetry and skewness

Shape           Mean vs. median    Example in the figure
Symmetric       Mean ≈ Median      Mean = Median ≈ 0
Right-skewed    Mean > Median      Mean = 0.95 > Median = 0.68
Left-skewed     Mean < Median      Mean = 12.8 < Median = 13.2

[Figure: three frequency histograms, one for each shape: symmetric, right-skewed, and left-skewed]
Unimodal, bimodal, and multimodal
Boxplot

A boxplot (or box-and-whisker plot) provides a summary of continuous data based on the five-number summary: minimum, Q1, median, Q3, and maximum.
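
The five numbers behind a boxplot can be computed directly. This is a sketch using the quiz marks from the earlier examples; the function name is hypothetical, and the quartiles come from `statistics.quantiles` with its default "exclusive" method, which matches the (n + 1) location rule used in these slides.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4)  # default method="exclusive"
    return min(data), q1, median, q3, max(data)


marks = [40, 20, 80, 48, 76, 46, 45, 65, 55, 50, 42, 64]
five = five_number_summary(marks)

print(five)
```

Plotting libraries may compute quartiles with a different convention, so the box edges they draw can differ slightly from these values.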
Boxplot
Side-by-side boxplots are useful to compare the distribution of one quantitative variable across the categories of a qualitative variable.

[Figure: side-by-side boxplots of fuel consumption on highways for different classes of car. Horizontal axis: class (2seater, compact, midsize, minivan, pickup, subcompact, suv). Vertical axis: hwy]
Summarizing Data for Two Variables
Cross-tabulation

Quality Rating and Prices for 300 LA Restaurants

Quality Rating    $10-19    $20-29    $30-39    $40-49    Total
Good                42        40         2         0        84
Very Good           34        64        46         6       150
Excellent            2        14        28        22        66
Total               78       118        76        28       300
Joint and marginal percentages

Quality Rating and Prices for 300 LA Restaurants (%)

Quality Rating    $10-19    $20-29    $30-39    $40-49    Total
Good               14.0      13.3       0.7       0.0      28.0
Very Good          11.3      21.3      15.3       2.0      50.0
Excellent           0.7       4.7       9.3       7.3      22.0
Total              26.0      39.3      25.3       9.3     100.0
Cross-tabulation: Row Percentages

Quality Rating and Prices for 300 LA Restaurants (each row sums to 100%)

Quality Rating    $10-19    $20-29    $30-39    $40-49    Total
Good               50.0      47.6       2.4       0.0      100
Very Good          22.7      42.7      30.7       4.0      100
Excellent           3.0      21.2      42.4      33.3      100
Cross-tabulation: Column Percentages

Quality Rating and Prices for 300 LA Restaurants (each column sums to 100%)

Quality Rating    $10-19    $20-29    $30-39    $40-49
Good               53.8      33.9       2.6       0.0
Very Good          43.6      54.2      60.5      21.4
Excellent           2.6      11.9      36.8      78.6
Total             100       100       100       100
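
The three kinds of percentages differ only in the denominator: the grand total (joint), the row total, or the column total. A minimal sketch from the restaurant counts, with hypothetical variable names:

```python
prices = ["$10-19", "$20-29", "$30-39", "$40-49"]
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

grand_total = sum(sum(row) for row in counts.values())
col_totals = [sum(row[j] for row in counts.values()) for j in range(len(prices))]

# Joint percentages: each cell divided by the grand total.
joint = {r: [100 * c / grand_total for c in row] for r, row in counts.items()}
# Row percentages: each cell divided by its row total.
row_pct = {r: [100 * c / sum(row) for c in row] for r, row in counts.items()}
# Column percentages: each cell divided by its column total.
col_pct = {r: [100 * c / col_totals[j] for j, c in enumerate(row)]
           for r, row in counts.items()}

for rating in counts:
    print(rating, [round(v, 1) for v in row_pct[rating]])
```

Row percentages answer "among restaurants with this rating, how are prices distributed?", while column percentages answer "among restaurants in this price band, how are ratings distributed?".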
Covariance and correlation
• Covariance describes how two quantitative variables change in relation to each other.

• E.g., for two stocks A and B, we want to see how their returns move with each other.
- A positive covariance implies that if the return on A increases (decreases), then the return on B tends to increase (decrease) as well
- A negative covariance implies that if the return on A increases (decreases), then the return on B tends to decrease (increase)
Covariance and correlation

• For two quantitative variables X and Y,
\[ \mathrm{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}\right) \]
with x̄ and ȳ sample means of X and Y respectively.

• Correlation is covariance standardised by the standard deviations.
\[ r_{XY} = \frac{\mathrm{Cov}(X, Y)}{s_x s_y} \]
Example: Rates of return (%) for two stocks X and Y
Scatterplots
Covariance and correlation

• −1 ≤ rXY ≤ 1 (this is a consequence of the Cauchy–Schwarz inequality)

• Correlation measures the strength of the linear relationship between X and Y
- A positive (negative) correlation implies positive (negative) association
- rXY = ±1 indicates a perfect positive (negative) linear relationship
- rXY = 0 implies no linear relationship
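
These properties can be illustrated on small made-up datasets. The x values and the three relationships below are invented for illustration only and do not come from the slides; `correlation` is a hypothetical helper that follows the definition of r given earlier.

```python
import math


def correlation(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)


x = [-2, -1, 0, 1, 2]                         # illustrative values
r_up = correlation(x, [2 * v + 1 for v in x])  # exact increasing line
r_down = correlation(x, [-3 * v for v in x])   # exact decreasing line
r_curve = correlation(x, [v * v for v in x])   # symmetric parabola

print(r_up, r_down, r_curve)
```

The parabola case shows that r = 0 rules out a linear relationship only: y is completely determined by x there, yet the correlation is zero.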
Correlation

• Correlation is unaffected by the scale/unit of measurement of either variable.

Height (cm)    Weight (kg)    Height (m)    Weight (lb)
151.76         47.82          1.5176        105.204
139.7          36.49          1.397          80.278
136.52         31.86          1.3652         70.092
156.85         53.04          1.5685        116.688
145.41         41.28          1.4541         90.816
163.83         62.99          1.6383        138.578
149.22         38.24          1.4922         84.128

Height (cm) vs. Weight (kg): r = 0.96
Height (m) vs. Weight (lb): r = 0.96
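
The unit-invariance claim can be checked on the height and weight table. In this sketch the function names are hypothetical; the variance of a variable is computed as its covariance with itself.

```python
import math

height_cm = [151.76, 139.7, 136.52, 156.85, 145.41, 163.83, 149.22]
weight_kg = [47.82, 36.49, 31.86, 53.04, 41.28, 62.99, 38.24]
height_m = [1.5176, 1.397, 1.3652, 1.5685, 1.4541, 1.6383, 1.4922]
weight_lb = [105.204, 80.278, 70.092, 116.688, 90.816, 138.578, 84.128]


def covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance standardised by the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


cov_cm_kg = covariance(height_cm, weight_kg)
cov_m_lb = covariance(height_m, weight_lb)
r_cm_kg = correlation(height_cm, weight_kg)
r_m_lb = correlation(height_m, weight_lb)

print(round(cov_cm_kg, 2), round(cov_m_lb, 2))  # covariance changes with the units
print(round(r_cm_kg, 2), round(r_m_lb, 2))      # correlation does not
```

Rescaling either variable multiplies the covariance and the corresponding standard deviation by the same factor, so the ratio is unchanged.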
Correlation does not imply causation
Summary

• We can summarise data using tabular, graphical, and numerical measures.
- Many variations of the same measures are possible.
- Many other descriptive statistics are possible, e.g., trimmed mean, coefficient of skewness, kurtosis, etc.
- It is important to know the pros and cons of each measure.

• These measures are statistics, because we compute them on samples (observed data).
- A central question is whether these statistics represent the corresponding quantities in the population well.
- Answering this question requires concepts from probability, which is the next topic of the course.
