Chapter 1

Chapter 1 provides an overview of populations, samples, and processes in statistics, defining key terms and methods of sampling. It discusses descriptive statistics, including measures of location and variability, as well as methods for visualizing data. The chapter emphasizes the importance of understanding data types and distributions to effectively analyze and interpret statistical information.

Uploaded by

mazen

Chapter 1 : Overview and Descriptive Statistics

1.1 – Populations, Samples and Processes

• Definitions:
– _______________: A complete collection of all objects to be studied.
– _______________: the collection of data from every element in the population.
– _______________: a subset of the population.

EX) Suppose we are interested in the average GPA of all Camosun College students, then:

The relevant population is _________________________________________________

A sample can be _________________________________________________________

If we have all the Camosun College students’ GPAs then we have __________________

• Definitions cont:
– ________________: A numerical aspect of a ________________.
EX) The mean age of all Canadians, µ = 25.4

– ________________: A numerical aspect of a ________________.

EX) The average number of people in 20 randomly selected housing units, 𝑥̅ = 3.75.

Note: A __________________ can be used to estimate a _________________.

– ________________: any characteristic whose value may change from one object to another in
the population.
EX) X = height of a randomly selected student: 135 cm, 162 cm, ...
Y = hair colour of a randomly selected student: black, brunette, blonde, ...
– Data and Observations:
– ________________ data consists of observations on a single variable.
EX) height of STAT 218 students: 135 cm, 162 cm, ...
– ________________ data consists of observations on each of two variables.
EX) height and weight of STAT 218 students: (135 cm, 50 kg), (162 cm, 63 kg), ...
– ________________ data consists of observations on each of more than two variables.
EX) height, weight and hair colour of STAT 218 students: (135 cm, 50 kg, blonde), (162 cm, 63 kg, red), ...
– Two Types of Processes:
– ______________________________: A process in which we observe and measure certain
characteristics, but we don’t attempt to manipulate or modify the subjects being studied. That is,
_________________________ are applied to the subjects studied.
Valuable for discovering trends and possible relationships but cannot be used to establish cause
and effect.
EX) One study observed that the higher teachers’ salaries were, the higher beer prices were.
Did the increase in teachers’ pay cause beer prices to go up?

No, the increase in teachers’ pay did not cause beer prices to go up. This observational
study did not control other variables, and it has a “lurking variable”, inflation, which
caused both the beer prices and the teachers’ salaries to go up.

– ______________________________: A process in which we apply some treatment and then
proceed to observe its effects on the subjects.
There is at least one control group (where subjects receive no treatment) in an experiment so that
comparisons can be made and any difference in the outcomes can be attributed to the “treatment”.
Experiments can establish cause and effect.
EX) The 1954 American Polio Vaccine experiment followed 200,000 randomly
selected children given the Salk Vaccine and another 200,000 randomly selected
children given a placebo. The difference in the number of polio cases between the
two groups was attributed to the effect of the Salk Vaccine.

– Methods of Sampling:
– Random: Individuals are randomly selected from the population. Selections are made so that
each has an equal chance of being selected.

– Systematic: Randomly select some starting point and then select every kth element in the
population.

– Stratified: Subdivide the population into subgroups that share the same characteristic (e.g.
gender or age group), then draw a random sample from each subgroup (stratum).

– Cluster: Divide the population into sections (or clusters); each cluster contains individuals with
different characteristics; randomly select some of those clusters; choose all members from
selected clusters.

– Convenience: A sample is obtained by selecting individuals or objects without randomization.
For example, asking a MATH 218 class what program of study they are in to obtain information
about the programs of all Camosun students.
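
The four randomized sampling methods above can be sketched in a few lines of Python. This is only an illustration: the student ID numbers, the two age strata, the five "classes" used as clusters and all function names are hypothetical and are not part of the notes.

```python
import random

def simple_random_sample(population, n, rng):
    """Random: every individual has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic: random starting point among the first k, then every k-th element."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: draw a random sample from every subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster: randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)                 # fixed seed so the run is repeatable
students = list(range(1, 101))         # hypothetical student ID numbers 1..100

srs = simple_random_sample(students, 10, rng)
sys_sample = systematic_sample(students, 10, rng)
strata = {"under 25": students[:60], "25 and over": students[60:]}
strat = stratified_sample(strata, 5, rng)
clusters = {f"class {c}": students[c * 20:(c + 1) * 20] for c in range(5)}
clus = cluster_sample(clusters, 2, rng)

print(srs, sys_sample, strat, clus, sep="\n")
```

A convenience sample has no counterpart here because it involves no randomization at all.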
• Branches of Statistics:
– ________________________ statistics: summary and description of collected data
(Sections 1.2 – 1.4)
– ________________________ statistics: generalizing from a sample to a population (Chapters 6 – 9)

• Relationship between Probability and Statistics:

– To solve a _________________ problem, certain characteristics of a population are assumed to
be known. We then answer questions concerning a sample from that population.

– In a ________________ problem, we assume very little about a population. We use the
information about a sample to answer questions concerning the population.

1.2 – Pictorial and Tabular Methods in Descriptive Statistics

• Three important features to report when describing the distribution of a quantitative variable:

– ______________
– ______________
– ______________

• Distribution Shapes:

[Figure: example distribution shapes labelled symmetric, unimodal, bimodal, right (positive) skew and left (negative) skew]

• The ____________________ of the distribution (Section 1.3)
EX) mean family size, median housing price

• The ____________________ of the distribution (Section 1.4)

EX) variance, standard deviation, range, fourth spread

DATA
– CATEGORICAL
– NUMERICAL: DISCRETE or CONTINUOUS

• Two types of Numerical Variables

– A variable is ________________ if its set of possible values constitutes a finite set or an infinite
sequence. It usually results from counting.
EX) class size

– A variable is _______________________ if its set of possible values consists of an entire interval
on a number line. It usually results from a measurement.
EX) height

EX) The following are ages of 34 Oscar-Winning Best Actresses:

21 24 26 26 26 27 28 30 30 31
31 33 33 34 34 34 35 35 35 37
37 38 41 41 41 42 44 49 50 60
61 61 74 80

We will describe this (numerical) data set using several graphing techniques.
1. Histogram

Frequency Table
Guidelines for constructing frequency tables and histograms.
(1) Number of intervals: $K \approx \sqrt{\text{number of observations}}$;
we choose $K = 6$ since $n = 34$
(2) Starting point ≤ smallest data value. Choose 20 < 21
(3) Class width: $W \ge \dfrac{\max - \text{starting point}}{\text{number of intervals}} = \dfrac{80 - 20}{6} = 10$

Intervals   Frequency   Relative    Cumulative   Cumulative Relative
                        Frequency   Frequency    Frequency
(20,30]         9         9/34          9            9/34
(30,40]        13        13/34         22           22/34
(40,50]         7         7/34         29           29/34
(50,60]         1         1/34         30           30/34
(60,70]         2         2/34         32           32/34
(70,80]         2         2/34         34           34/34
Total          34         1.00
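
The frequency table can be reproduced with a short script that follows guidelines (1) to (3). This is a sketch under the same choices made above (6 intervals, starting point 20, intervals closed on the right); the variable names are illustrative only.

```python
import math

ages = [21, 24, 26, 26, 26, 27, 28, 30, 30, 31,
        31, 33, 33, 34, 34, 34, 35, 35, 35, 37,
        37, 38, 41, 41, 41, 42, 44, 49, 50, 60,
        61, 61, 74, 80]

n = len(ages)
k = round(math.sqrt(n))                      # sqrt(34) is about 5.83, so 6 intervals
start = 20                                   # chosen below the smallest value, 21
width = math.ceil((max(ages) - start) / k)   # (80 - 20) / 6 = 10

rows = []
cumulative = 0
for j in range(k):
    lo, hi = start + j * width, start + (j + 1) * width
    freq = sum(1 for a in ages if lo < a <= hi)   # interval (lo, hi]
    cumulative += freq
    rows.append((lo, hi, freq, freq / n, cumulative, cumulative / n))

for lo, hi, freq, rel, cum, cum_rel in rows:
    print(f"({lo},{hi}]  {freq:2d}  {rel:.3f}  {cum:2d}  {cum_rel:.3f}")
```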
2. Stem-and-leaf Plot
The decimal point is 1 digit(s) to the right of the |

2 | 1466678
3 | 001133444555778
4 | 111249
5 | 0
6 | 011
7 | 4
8 | 0

Features of the Stem-and-leaf plot:

– Displays the shape of the distribution: right-skewed.

– Sorts data

– Shows all data points
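
As a rough sketch, the stem-and-leaf display can be built by splitting each age into a tens digit (stem) and a units digit (leaf). The splitting by integer division is an assumption that suits two-digit data like these ages; it is not a general-purpose routine.

```python
from collections import defaultdict

ages = [21, 24, 26, 26, 26, 27, 28, 30, 30, 31,
        31, 33, 33, 34, 34, 34, 35, 35, 35, 37,
        37, 38, 41, 41, 41, 42, 44, 49, 50, 60,
        61, 61, 74, 80]

stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)   # stem = tens digit, leaf = units digit

lines = [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in sorted(stems.items())]
print("\n".join(lines))
```

Because the leaves are appended in sorted order, the display sorts the data and still shows every data point.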

• Histograms of discrete data

37 people were surveyed on the number of credit cards they own. Here are the data:
Number of cards Frequency
0 3
1 11
2 15
3 4
4 2
5 1
6 1
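
For discrete data each distinct value gets its own bar. The sketch below prints a simple text histogram of the credit card data with relative frequencies; the "#" bars are just one illustrative way to draw it.

```python
# Number of cards -> frequency, copied from the table above
cards = {0: 3, 1: 11, 2: 15, 3: 4, 4: 2, 5: 1, 6: 1}
n = sum(cards.values())            # total number of people surveyed

for value, freq in cards.items():
    # one '#' per person, followed by the frequency and relative frequency
    print(f"{value} | {'#' * freq:<15} {freq:2d}  {freq / n:.3f}")
```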
• Graphing Categorical Data:

1.3 – Measures of Location

• Center of a Dataset

a) Mean – is the balance point

Given a sample: x1, x2, …, xn where n = sample size

Sample Mean: $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$

Given a population: x1, x2, …, xN where N = population size

Population Mean: $\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N} = \dfrac{\sum_{i=1}^{N} x_i}{N}$

EX) Suppose we have data from two samples. Find the mean of each.

Sample x: 5, 7, 9

Sample y: 50, 70, 90

This example leads us to the following property:

If y=cx, where c is a constant, then 𝑦̅ = 𝑐𝑥̅ and/or 𝜇𝑦 = 𝑐𝜇𝑥

EX) Suppose we have a population: {1, 2, 3, 4, 5, 6}, find the mean:

If we take a sample from this population and get sample = {1, 4, 5}, find the mean:

Note: Usually the sample mean is NOT equal to the population mean but 𝑥̅ can be used to estimate µ.
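
The two mean examples above can be checked numerically. This is a small sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean

# Scaling property: y = c*x with c = 10
x = [5, 7, 9]
y = [10 * value for value in x]        # gives 50, 70, 90
x_bar, y_bar = mean(x), mean(y)

# Population versus one particular sample drawn from it
population = [1, 2, 3, 4, 5, 6]
sample = [1, 4, 5]
mu = mean(population)
sample_mean = mean(sample)

print(x_bar, y_bar)                    # y_bar is 10 times x_bar
print(mu, round(sample_mean, 3))       # the sample mean differs from mu
```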

b) Median – the middle point of ordered data

For ordered data values x1, x2, …, xn

* if n is odd, the median is the middle value, that is, the $\left(\frac{n+1}{2}\right)^{\text{th}}$ value

* if n is even, the median is the average of the two middle values: that is, the average of the
$\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ values

Sample median: is always denoted by 𝑥̃

Population median: is always denoted by 𝜇̃

EX) Suppose we have data on two samples. Find the median of each.

Sample x: 2, 2, 3, 7, 8

Sample y: 20, 20, 30, 70, 80

This example leads us to another property:

If y=cx, where c is a constant, then 𝑦̃ = 𝑐𝑥̃ and/or 𝜇̃𝑦 = 𝑐𝜇̃𝑥

Note: Usually the sample median is NOT equal to the population median but 𝑥̃ can be used to estimate µ̃.

Mean Vs Median

Q) Find the mean and median of each sample:

x: 4, 7, 11, 17, 22

y: 4, 7, 11, 17, 220

* Median is a better measure of center if there are extreme values.

* Mean is sensitive to extreme values while the median is not
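
A quick numerical check of the question above, as a sketch using the standard `statistics` module: changing the largest value from 22 to 220 moves the mean a lot but leaves the median where it was.

```python
from statistics import mean, median

x = [4, 7, 11, 17, 22]
y = [4, 7, 11, 17, 220]   # same as x except for one extreme value

summary = {}
for name, data in (("x", x), ("y", y)):
    summary[name] = (mean(data), median(data))
    print(name, "mean =", summary[name][0], "median =", summary[name][1])
```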

c) Trimmed Mean

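The trimmed mean is a compromise between the mean and the median: a p% trimmed mean deletes the
smallest p% and the largest p% of the ordered values and averages what is left.

EX) Given the following sample, x: 20, 60, 67, 70, 99. Find the

a) 20% trimmed mean, $\bar{x}_{tr(20)}$

b) 25% trimmed mean, $\bar{x}_{tr(25)}$

Q) Find the 30% trimmed mean, $\bar{x}_{tr(30)}$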
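
One way to compute these trimmed means is sketched below. When the trimming percentage does not correspond to a whole number of observations (25% or 30% of n = 5), the sketch interpolates linearly between the two nearest trimmed means that do (here 20% and 40%). That interpolation rule is an assumption about the intended convention, and the function names are hypothetical.

```python
def trimmed_mean_whole(data, count):
    """Mean after deleting the `count` smallest and `count` largest values."""
    ordered = sorted(data)
    kept = ordered[count:len(ordered) - count]
    return sum(kept) / len(kept)

def trimmed_mean(data, percent):
    """Trimmed mean with `percent` trimmed from EACH end of the ordered data.

    Assumed convention: if percent% of n is not a whole number, interpolate
    linearly between the neighbouring whole-observation trimmed means.
    Requires that at least one value remains after the heavier trimming.
    """
    n = len(data)
    exact = percent * n / 100          # observations to trim per end
    lower = int(exact)                 # whole part
    if exact == lower:
        return trimmed_mean_whole(data, lower)
    weight = exact - lower
    low_mean = trimmed_mean_whole(data, lower)
    high_mean = trimmed_mean_whole(data, lower + 1)
    return low_mean + weight * (high_mean - low_mean)

x = [20, 60, 67, 70, 99]
for p in (20, 25, 30):
    print(p, round(trimmed_mean(x, p), 3))
```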

d) Mode

The mode is the value that occurs most often

EX) Suppose we have the following data sets. Find the mode of each

a) 2, 2, 3, 5

b) 2, 2, 3, 5, 5

c) 2, 2, 3, 3, 5, 7, 7
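
The three data sets above show that a data set can have one mode or several. A minimal sketch using `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest count:

```python
from statistics import multimode

datasets = {
    "a": [2, 2, 3, 5],
    "b": [2, 2, 3, 5, 5],
    "c": [2, 2, 3, 3, 5, 7, 7],
}

# multimode lists all most-frequent values in order of first appearance
modes = {label: multimode(values) for label, values in datasets.items()}
print(modes)
```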
• Location of a Data Point in Ordered Datasets
EX) Your test score is 85, where do you stand in the class?

a) Quartiles (Q1, Q2, Q3) or fourths

They divide the data set into approximately four equal parts

EX) Given the data: 2, 5, 7, 10, 17, 14

Find the 5-Number summary: min Q1 Q2 Q3 max
EX) Given the data: 2, 5, 7, 10, 14
Find the 5-Number summary

*Note: Include the middle value into both the lower and upper groups to find Q1 and Q3 if n is odd
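
The sketch below computes the five-number summary using the rule in the note: Q1 and Q3 are the medians of the lower and upper halves, and the middle value goes into both halves when n is odd. Statistical software often uses other quartile rules, so its answers may differ slightly; the function name is hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using medians of the two halves."""
    ordered = sorted(data)             # quartiles need ordered data
    n = len(ordered)
    half = (n + 1) // 2                # for odd n this includes the middle value
    lower = ordered[:half]
    upper = ordered[n - half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

print(five_number_summary([2, 5, 7, 10, 17, 14]))   # n even
print(five_number_summary([2, 5, 7, 10, 14]))       # n odd
```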

b) Percentiles (p1, p2, …, p100)

The percentiles are numbers that break the ordered data into approximately 100 equal pieces
• Categorical Data and Sample Proportions

EX) Consider a survey of n people on the question of “Do you agree with marijuana legalization?”
There are 3 categories for response:   Yes   No   No Opinion
Number in each category:               x1    x2   x3

The proportion of people who support legalization:

The proportion of people who are against legalization:

Let p1 = the true proportion of all Canadians who support legalization then 𝑝̂1 can be used to estimate p1

EX) Flip a coin 8 times: H H T H T T H H

If we think of H=1 and T=0 then we could find the proportion of heads by adding up all the 1’s and dividing by the number of flips.

We can think of proportion as a special type of mean. In fact, they share many properties.
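
The coin example can be written out directly. In this sketch the sample proportion of heads is computed as the mean of the 0/1 indicator values; the variable names are illustrative.

```python
flips = "H H T H T T H H".split()

# Code heads as 1 and tails as 0
indicators = [1 if flip == "H" else 0 for flip in flips]

# The proportion of heads is the mean of the indicators
p_hat = sum(indicators) / len(indicators)
print(indicators, p_hat)
```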
1.4 – Measures of Variability

Let’s look at two different distributions:

[Figure: two bell-shaped distributions with the same center and different spreads]

Both distributions are bell-shaped and have the same center but very different spreads: variability
matters!

EX) Consider two samples. Find the mean and median of each.

A: 6.5 6.6 6.7 6.8 7.1

B: 4.4 5.1 6.7 7.3 10.2

EX) Suppose we have the following information for waiting times (in minutes) at a bus stop.

6.5 6.8 7.0 7.7 the mean = 7.0

Data      Deviation                         Absolute Deviation                    Squared Deviation
$x_i$     $x_i - \bar{x}$ (or $x_i - \mu$)   $|x_i - \bar{x}|$ (or $|x_i - \mu|$)   $(x_i - \bar{x})^2$ (or $(x_i - \mu)^2$)

• Range

The range, R, is the difference between the largest and the smallest data values: R = max – min

EX) For the waiting times example above, find R

• MAD (Mean Absolute Deviation)

MAD = $\dfrac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$

EX) Find MAD for the waiting times example
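
A short sketch that fills in the deviation columns for the waiting times and then computes the range and MAD from them. The variable names are illustrative; small floating-point rounding is handled by `round` when printing.

```python
waits = [6.5, 6.8, 7.0, 7.7]          # waiting times in minutes
n = len(waits)
x_bar = sum(waits) / n                # the mean, 7.0

deviations = [x - x_bar for x in waits]
abs_deviations = [abs(d) for d in deviations]
sq_deviations = [d ** 2 for d in deviations]

data_range = max(waits) - min(waits)  # R = max - min
mad = sum(abs_deviations) / n         # mean absolute deviation

for x, d, a, s in zip(waits, deviations, abs_deviations, sq_deviations):
    print(x, round(d, 2), round(a, 2), round(s, 2))
print("R =", round(data_range, 2), "MAD =", round(mad, 2))
```

Note that the plain deviations always add up to zero, which is why they are made positive (absolute value or square) before averaging.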

• Variance

For population: x1, x2, …, xN The population variance: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

For sample: x1, x2, …, xn The sample variance: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Why is the sample variance divided by n – 1 instead of n?

- because the xi’s tend to be closer to their own average $\bar{x}$ than to the population average µ, so
n – 1 is used instead of n to compensate.

- if we divided by n, the sample variance would tend to underestimate $\sigma^2$

EX) Waiting times:

If we are thinking of the data as a population then find $\sigma^2$

If we are thinking of the data as a sample then find $s^2$

• Standard Deviation (SD)

Population standard deviation = $\sigma = \sqrt{\sigma^2}$

EX) Waiting times: find $\sigma$

Sample standard deviation = $s = \sqrt{s^2}$

EX) Waiting times: find 𝑠
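
The waiting-times variance and standard deviation exercises can be worked as follows. This sketch applies the two formulas directly rather than calling a library routine, so the only difference between the population and sample versions is the divisor.

```python
import math

waits = [6.5, 6.8, 7.0, 7.7]          # waiting times in minutes
n = len(waits)
x_bar = sum(waits) / n
sum_sq = sum((x - x_bar) ** 2 for x in waits)   # sum of squared deviations

pop_var = sum_sq / n                  # data treated as a whole population
samp_var = sum_sq / (n - 1)           # data treated as a sample
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print("population:", round(pop_var, 4), round(pop_sd, 4))
print("sample:    ", round(samp_var, 4), round(samp_sd, 4))
```

The standard deviations come out in minutes, the same units as the data, while the variances are in squared minutes.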

Why do you think we might prefer $s$ to $s^2$?

Q) Find the mean and standard deviation for the samples:

a) x: 2, 5, 4, 7, 9

b) y: 20, 50, 40, 70, 90

c) w: 25, 55, 45, 75, 95

d) u: -25, -55, -45, -75, -95

FACT: If y = a + bx, where a, b are constants, then

$\bar{y} = a + b\bar{x}$

$s_y^2 = b^2 s_x^2$

$s_y = |b| s_x$
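
The FACT can be checked against samples a) to d). In this sketch the constants (a, b) for each sample are read off the listed values (y = 10x, w = 5 + 10x, u = –5 – 10x); that reading is an inference from the data, and the variable names are illustrative.

```python
from statistics import mean, stdev

x = [2, 5, 4, 7, 9]
# (a, b) pairs inferred from the listed samples
transforms = {"y": (0, 10), "w": (5, 10), "u": (-5, -10)}

x_bar, s_x = mean(x), stdev(x)        # stdev uses the n - 1 divisor
results = {}
for name, (a, b) in transforms.items():
    data = [a + b * value for value in x]
    results[name] = (mean(data), stdev(data))
    predicted = (a + b * x_bar, abs(b) * s_x)
    print(name, data, results[name], "predicted:", predicted)
```

Adding the constant a shifts the mean but leaves the standard deviation unchanged, and a negative b flips the sign of the mean while the standard deviation stays positive.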

• Application of Standard Deviation

Empirical Rule: If X has a bell-shaped distribution with mean = µ and standard deviation = σ then,

- approximately 68% of values of X fall within 1 standard deviation of the mean

- approximately 95% of values of X fall within 2 standard deviations of the mean

- approximately 99.7% of values of X fall within 3 standard deviations of the mean

Therefore, for bell-shaped distributions: $6\sigma \approx \text{range}$, so $\sigma \approx \dfrac{\text{range}}{6}$

EX) Suppose that adult IQ scores follow a bell-shaped distribution with mean µ=100 and standard
deviation σ=15. What values do 95% of all such scores fall between?
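
Applying the Empirical Rule to the IQ example is a matter of computing µ ± kσ for k = 1, 2, 3. A minimal sketch:

```python
mu, sigma = 100, 15                   # IQ mean and standard deviation

coverage = {1: "68%", 2: "95%", 3: "99.7%"}
intervals = {k: (mu - k * sigma, mu + k * sigma) for k in coverage}

for k, (low, high) in intervals.items():
    print(f"about {coverage[k]} of scores fall between {low} and {high}")
```

The k = 2 line answers the question: about 95% of scores lie within 2 standard deviations of 100.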
• Fourth Spread, fs

fs = upper fourth – lower fourth

= Q3 – Q1

= IQR (Interquartile Range)

= range of the middle 50% of the data

The fourth spread is NOT sensitive to outliers. In fact, we use it to identify outliers!

• Outliers

An observation, x, is a mild outlier if:

x < Q1 – 1.5fs or x > Q3 + 1.5fs

An observation, x, is an extreme outlier if:

x < Q1 – 3fs or x > Q3 + 3fs

• Boxplot

Making Boxplots

* First find the five-number summary and fs

* Find the outlier cut points: Q1 – 3fs, Q1 – 1.5fs, Q3 + 1.5fs, Q3 + 3fs

EX) Given the sample, x: 4, 17, 20, 22, 26

Draw the boxplot

Boxplot and Shape

* A distribution is symmetric if both the box and the whiskers are symmetric about Q2

* Longer whiskers imply a longer tail in the distribution

Q) A sample of 20 glass bottles of a particular type was selected and the internal pressure strength of each
bottle was determined. Consider the following partial sample information:

median = 202.2 lower fourth = 196.0 upper fourth = 216.8

Three smallest observations: 125.8 188.1 193.7

Three largest observations: 221.3 230.5 250.2

a) Are there any outliers in the sample? Any extreme outliers?

b) Construct a boxplot that shows outliers, and comment on any interesting features.
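
Part a) can be checked with the outlier cut points defined earlier. In this sketch an observation beyond the 3fs cut points is labelled extreme, and one beyond the 1.5fs cut points (but not the 3fs ones) is labelled mild; only the six listed observations need checking, since the rest lie between them. The function name is hypothetical.

```python
lower_fourth, upper_fourth = 196.0, 216.8
fs = upper_fourth - lower_fourth      # fourth spread

def classify(x):
    """Label an observation using the 1.5*fs and 3*fs cut points."""
    if x < lower_fourth - 3 * fs or x > upper_fourth + 3 * fs:
        return "extreme outlier"
    if x < lower_fourth - 1.5 * fs or x > upper_fourth + 1.5 * fs:
        return "mild outlier"
    return "not an outlier"

listed = [125.8, 188.1, 193.7, 221.3, 230.5, 250.2]
labels = {x: classify(x) for x in listed}

print("fs =", round(fs, 1))
for x, label in labels.items():
    print(x, label)
```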
