Statistical Concepts

Miguel Rodriguez, a Ph.D. student in Mathematics Education, is researching critical statistical literacy (CSL) in preservice teacher education to assess its benefits and implications on data modeling. The document outlines the use of CODAP for teaching statistics and provides an introduction to key statistical concepts, including descriptive and inferential statistics, measures of central tendency, and the construction of box plots. It emphasizes the importance of collaboration and creating a supportive learning environment while exploring statistical relationships and regression analysis.

Uploaded by

miguel rodríguez


Statistics

INTRODUCING CRITICAL STATISTICAL LITERACY

A little bit about me…
Hi, I am Miguel Rodriguez,
a second-year student in the Ph.D. program in Mathematics Education PRIME.
I am also a student in the MSc in Applied Statistics.
I am currently developing my practicum research project, a necessary step in my Ph.D. journey.
I am very interested in statistics education at the preservice teacher and undergraduate levels.
I have been a mathematics teacher for almost 10 years, with most of my experience in secondary education.
I am also passionate about the performing arts, especially theatre and dance.
Title: Introducing critical statistical literacy in preservice teacher preparation,
imagining alternatives
Goals:
To identify possible benefits, impacts, limitations, etc., of CSL in preservice
teacher education.
To inform about participants' attitudes and appreciation of CSL in their future
practice.
To provide a panorama of the implications of CSL on participants' data modeling
process as a manifestation of statistical practice.
RQs:
How does preservice teachers' interest in implementing critical statistical literacy
(CSL) in their future practice vary after accessing CSL?
What uses of statistics and statistical arguments emerge in the data modeling
process of preservice teachers when they get access to CSL?
Some recommendations to take into account

Be willing to work with others.

Be open to sharing your contributions with the group; all opinions, questions,
and comments matter.
Be respectful, kind, and supportive.
Don’t be ashamed to ask when you need clarification. I’m more than happy
to provide support.
Please try not to miss the lessons. Your participation will enrich the
classroom.
Contribute to a safe space for everyone.
See this space as an opportunity to learn something new, and to learn in
community.
Intro: Getting started with CODAP
Common Online Data Analysis Platform
CODAP (Common Online Data Analysis Platform) is free, web-based, open-source
software designed for teaching students about dynamic data explorations.
It will be the primary technology tool used in these materials.

To get started with CODAP, please do the following:

1) In a web browser (Google Chrome is preferred), go to CODAP’s website at

http://codap.concord.org

2) Click the Try CODAP button in the top right corner to open CODAP in a new
tab.

3) In the ‘What Would you Like to Do?’ box, select ‘Open Document or Browse
Examples’, select ‘Getting started with CODAP’, then click Open.
4) Complete the five basic CODAP tasks listed on the screen. If you need
assistance with a task, click ‘Show me’. You’re finished with these tasks when all
task checkboxes have been checked.

5) Next you will add a ‘Body Mass Index’ attribute to the case table to learn how to
add an attribute based on a formula.

● Resize the table by dragging the right edge or a lower corner until you can see
all nine attributes.
● Add a new attribute to this table by clicking the grey plus button in the top right
corner of the table. Make sure the table is selected to see this option.
Type ‘BMI’, then press Enter to name the new attribute.

● Click on the BMI attribute heading, then select ‘Edit Formula’. Enter the
formula Mass/Height^2. You can find attribute names, like ‘Mass’ and
‘Height’, under the ---Insert Value--- button. Then click Apply.
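For readers who also work in R, the same derived-attribute idea can be sketched outside CODAP. This is only an illustration: the data frame `animals`, its rows, and the units are hypothetical and are not taken from CODAP's sample document; only the formula Mass/Height^2 comes from the step above.

```r
# Hypothetical mini data set (illustrative values only, not CODAP's data)
animals <- data.frame(
  Name   = c("A", "B", "C"),
  Mass   = c(70, 4.5, 520),    # assumed to be in kilograms
  Height = c(1.75, 0.25, 1.6)  # assumed to be in metres
)

# Same formula as typed in the CODAP formula editor: Mass/Height^2
animals$BMI <- animals$Mass / animals$Height^2

print(animals)
```

As in CODAP, the new column is computed row by row from the existing attributes, so every case gets its own BMI value.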

6) Graph a dotplot of the BMI attribute. Notice there are two upper outlier
mammals with BMIs of 400 or greater (you can hover your mouse over a point in
the graph to see its value).

7) Select those points in the graph by dragging a rectangle around them and look
at the table to see what mammals they are. CODAP’s representations are linked
dynamically, so if you select items in one representation they are automatically
selected in all other representations.
8) Hide Selected Cases by clicking the Eye icon in the Graph Menu (make sure
you have the graph selected for the menu to appear). This will remove those two
selected outliers from the graph.
9) Rescale the graph by moving the mouse to the x-axis where it changes to a
hand icon, then dragging to the right.

10) There are several methods for saving a CODAP file.

● Save a File to Google Drive by clicking on the menu in the top left corner
of the header bar, selecting ‘Save…’, selecting the Google Drive tab in the
prompt box (second option), then following the Google Drive dialogue.
● Save a File to a shared URL by clicking on the menu in the top left corner
of the header bar, selecting ‘Share…’, selecting ‘Get link to shared view’,
enabling sharing, then copying the displayed URL. To save additional work
done after initially saving a file, select ‘Share… > Update shared view’.
Initial Terminology
What is statistics?

“Statistics is the science of learning from data and of measuring, controlling, and
communicating uncertainty.” American Statistical Association (ASA)

“Statistics has three primary components: How best can we collect data? How
should it be analyzed? And what can we infer from the analysis?” (Diez et al.,
2015)

What questions from current events or from your own life can you think of
that could be answered by collecting and analyzing data?
The Statistical Investigation Cycle

[Figure: the statistical investigation cycle. Source: GAISEIIPreK-12_Full.pdf]
Population and Sample

“If a factory produces thousands of electronic components, instead of testing
each item, quality control teams might randomly sample a certain number of items
(e.g., 100 components) and check how many are defective. If they find that 4
out of 100 components are defective, they can estimate the proportion of
defective items in the entire production batch as 4%”.

[Diagram labels: Population (Parameter), Sample (Statistic), Estimators]
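The arithmetic in the factory example can be sketched in R as follows. The counts (4 defective out of 100) come from the quote above; the batch size of 5000 is a hypothetical number chosen only to show how the estimate would be used.

```r
# Numbers taken from the quality-control example
n_sampled   <- 100  # components inspected
n_defective <- 4    # defective components found in the sample

# Sample statistic: the sample proportion, used as an estimate of the
# unknown population parameter (the true proportion of defective items)
p_hat <- n_defective / n_sampled
p_hat  # 0.04, i.e. 4%

# Hypothetical batch size (an assumption, not given in the example)
batch_size <- 5000
expected_defective <- p_hat * batch_size
expected_defective  # 200
```

The value 0.04 is a statistic because it is computed from the sample; the true defective rate of the whole batch is the parameter it estimates.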
What Statistics?

Descriptive Statistics
“Consists of methods for organizing, displaying, and describing data by using
tables, graphs, and summary measures”.
Examples: mean, median, mode, SD, etc.

Inferential Statistics
“Consists of methods that use sample results to help make decisions or
predictions about a population”.
Examples: hypothesis testing, Type I and Type II errors, p-value, etc.
VARIABLE
“It is a characteristic or measurement that can be determined for each member of a population”.

Categorical
  Ordinal (e.g., level of satisfaction)
  Nominal (e.g., party affiliation)
Numerical
  Discrete (e.g., # of siblings)
  Continuous (e.g., height)

What kind of variable do you think “phone number” is?
Measures of Central Tendency
Example
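Data about students’ height (in cm) from a classroom
Data set: (195, 170, 165, 165, 160)    Sample size (n) = 5

Sample mean: (195 + 170 + 165 + 165 + 160) ÷ 5 = 171
Mode: 165
Median: 165 (the middle value of the sorted data: 160 165 165 170 195)

Position of the median in the sorted data:
n is odd: the value at position (n + 1) ÷ 2
n is even: the average of the values at positions n ÷ 2 and (n ÷ 2) + 1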
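As a quick check of the three measures, here is a minimal R sketch using the classroom heights from the example. The helper logic for the mode is my own illustration (base R has no built-in statistical mode), and the four-value data set at the end is hypothetical, added only to show the even-n rule.

```r
heights <- c(195, 170, 165, 165, 160)  # students' heights in cm

sample_mean   <- mean(heights)
sample_median <- median(heights)

# Base R's mode() returns the storage type, not the statistical mode,
# so count how often each value occurs and keep the most frequent one(s).
counts <- table(heights)
sample_mode <- as.numeric(names(counts)[counts == max(counts)])

sample_mean    # 171
sample_median  # 165
sample_mode    # 165

# Hypothetical even-sized data set: the median is the average of the
# values at positions n/2 and n/2 + 1 of the sorted data
median(c(160, 165, 170, 195))  # 167.5
```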
DATA MATRIX
[Figure: a data matrix with its parts labeled “Variable” and “Case”.]
Bar Graph
Fruit:   Apple  Orange  Banana  Kiwifruit  Blueberry  Grapes
People:  35     30      10      25         40         5

- In a bar graph, the length of the bar for each category represents the number
  of observations in that category (frequency).
- Bars may be vertical or horizontal.
- We use bar graphs when we want to compare categories or show changes over time.
- Frequencies are shown on the Y-axis and the variable being compared is shown
  on the X-axis.
- A bar graph can also show the percentage of observations in each category, as
  is typical in pie charts.
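The fruit table can be drawn as a bar graph in R with the sketch below. The pairing of fruit names with counts is read off the flattened table and should be treated as an assumption, as should the axis labels and title, which are my own.

```r
# Counts per category (pairing of names and counts assumed from the table)
people <- c(Apple = 35, Orange = 30, Banana = 10,
            Kiwifruit = 25, Blueberry = 40, Grapes = 5)

# Frequencies on the y-axis, categories on the x-axis
barplot(people, xlab = "Fruit", ylab = "People (frequency)",
        main = "Number of people per fruit")

# The same information expressed as percentages of all observations
percentages <- round(100 * people / sum(people), 1)
percentages
```

Passing `horiz = TRUE` to `barplot()` draws the same bars horizontally.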
Box plot

- It uses boxes and lines to depict the distributions of one or more groups of
  numeric data.
- It is a type of chart that depicts a group of numerical data through their
  quartiles.
- It displays key summary statistics: median, quartiles, and potential outliers.
Activities proposed for today

- Activity 1

- Activity 2

- Activity 3
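Dispersion
Some measures of dispersion
(SD = standard deviation, MAD = mean absolute deviation; both are computed here dividing by n.)

For the data set (195, 170, 165, 165, 160): Mean ± SD = 171 ± 12.41

Ex. 1
Dataset: 3, 5, 6, 8, 11, 14, 17, 200
Mean = 33   SD = 63.27   MAD = 41.75

Ex. 2
Dataset: 3, 5, 6, 8, 11, 14, 17, 24
Mean = 11   SD = 6.595   MAD = 5.5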
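The dispersion values can be recomputed with a short R sketch. The helper names `pop_sd`, `mean_abs_dev`, and `summarise_spread` are my own. Two assumptions are worth stating: the SD values shown on the slides correspond to dividing by n (R's `sd()` divides by n - 1), and MAD is taken to be the mean absolute deviation from the mean (R's built-in `mad()` is the median absolute deviation, a different statistic).

```r
# SD that divides by n; R's sd() divides by n - 1 instead
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

# Mean absolute deviation from the mean
mean_abs_dev <- function(x) mean(abs(x - mean(x)))

heights  <- c(195, 170, 165, 165, 160)
with_24  <- c(3, 5, 6, 8, 11, 14, 17, 24)
with_200 <- c(3, 5, 6, 8, 11, 14, 17, 200)

summarise_spread <- function(x) {
  c(mean = mean(x), SD = pop_sd(x), MAD = mean_abs_dev(x), sample_SD = sd(x))
}

round(summarise_spread(heights), 3)
round(summarise_spread(with_24), 3)
round(summarise_spread(with_200), 3)
```

Changing a single value from 24 to 200 moves the mean from 11 to 33 and inflates both spread measures, which is a useful prompt for discussing sensitivity to extreme observations.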
Box plots
A box plot summarizes a data set using five summary statistics while
also plotting unusual observations, called outliers.

Five-number summary: the minimum, the maximum, and the three quartiles (Q1, Q2,
Q3) of the data set being studied.

Q2 represents the second quartile, which is equivalent to the 50th percentile
(i.e. the median).

Q1 represents the first quartile, which is the 25th percentile, and is the
median of the smaller half of the data set.

Q3 represents the third quartile, or 75th percentile, and is the median of the
larger half of the data set.

We measure the variability in the data using the range of the middle 50% of the
data: Q3 - Q1, the interquartile range (IQR, for short).
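The quartile definitions above can be turned into a small R function. This is a sketch, not a standard routine: `five_number_summary` is a hypothetical helper, and it assumes that for an odd number of values the overall median is left out of both halves (the convention that reproduces the worked examples in these slides). R's own `quantile()` and `fivenum()` use slightly different rules and can give different quartiles.

```r
# Quartiles by the "median of each half" rule.
# Assumption: when n is odd, the median itself belongs to neither half.
five_number_summary <- function(x) {
  x <- sort(x)
  n <- length(x)
  half <- floor(n / 2)
  lower <- x[seq_len(half)]     # smaller half of the data
  upper <- x[(n - half + 1):n]  # larger half of the data
  c(min = x[1], Q1 = median(lower), Q2 = median(x),
    Q3 = median(upper), max = x[n])
}

heights <- c(195, 170, 165, 165, 160)  # classroom heights in cm
s <- five_number_summary(heights)
s

# Interquartile range: spread of the middle 50% of the data
iqr <- unname(s["Q3"] - s["Q1"])
iqr
```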
Box plots

What do you notice?

Box plots
How to Build a Box Plot

Draw an axis (vertical or horizontal) and draw a scale.

Draw a dark line denoting Q2, the median.

Draw a line at Q1 and at Q3. Connect the Q1 and Q3 lines to form a rectangle.

The width of the rectangle corresponds to the IQR, and the middle 50% of the
data is in this interval.

The whiskers attempt to capture all of the data remaining outside of the box,
except outliers.

Is it possible to identify skew from the box plot?

Example 1
Consider the following data set:

5, 5, 9, 10, 15, 16, 20, 30, 80

Find the 5-number summary and identify how small or large a value would need to
be to be considered an outlier. Are there any outliers in this data set?

min = 5   Q1 = 7   Q2 = 15   Q3 = 25   max = 80
IQR = Q3 - Q1 = 25 - 7 = 18
Q1 - 1.5*IQR = -20
Q3 + 1.5*IQR = 52
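The outlier limits for Example 1 can be checked with a few lines of R. The quartiles are entered as worked out in the example rather than recomputed, and the variable names are my own.

```r
x  <- c(5, 5, 9, 10, 15, 16, 20, 30, 80)
q1 <- 7   # first quartile, as worked out in the example
q3 <- 25  # third quartile, as worked out in the example

iqr <- q3 - q1
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

c(IQR = iqr, lower = lower_fence, upper = upper_fence)  # 18, -20, 52

# Values outside the fences are flagged as potential outliers
outliers <- x[x < lower_fence | x > upper_fence]
outliers  # 80
```

Since 80 lies above 52, it is flagged by the 1.5*IQR rule; no value is below -20.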
Example 2
Consider the following data set:

5, 8, 1, 19, 3, 1, 11, 18, 20, 5

Find the 5-number summary and identify how small or large a value would need to
be to be considered an outlier. Are there any outliers in this data set?

Sorted data: 1, 1, 3, 5, 5, 8, 11, 18, 19, 20

min = 1   Q1 = 3   Q2 = 6.5   Q3 = 18   max = 20
IQR = Q3 - Q1 = 18 - 3 = 15
Q1 - 1.5*IQR = -19.5
Q3 + 1.5*IQR = 40.5
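For Example 2, one possible cross-check is R's built-in `boxplot.stats()`, which returns the five numbers a box plot would draw together with any points beyond 1.5*IQR. Note that it uses Tukey's hinges rather than the median-of-halves rule; for this data set (an even number of values) the two agree, but that is not guaranteed in general.

```r
x <- c(5, 8, 1, 19, 3, 1, 11, 18, 20, 5)

# coef = 1.5 is the usual multiplier of the IQR for the whisker limits
bs <- boxplot.stats(x, coef = 1.5)

bs$stats  # 1.0  3.0  6.5 18.0 20.0
bs$out    # numeric(0): no value falls outside the fences
```

Calling `boxplot(x, horizontal = TRUE)` draws the corresponding plot.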
Rules of thumb for identifying outliers

There are two rules of thumb for identifying outliers:

• More than 1.5*IQR below Q1 or above Q3

• More than 2 standard deviations above or below the mean.

Which is more affected by extreme observations, the mean or median?

Is the standard deviation or IQR more affected by extreme observations?

The median and IQR are called robust estimates because extreme observations
have little effect on their values. The mean and standard deviation are much
more affected by changes in extreme observations.
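The two questions above can be explored numerically. The sketch below reuses the two eight-value data sets from the dispersion examples, which differ only in their largest value (24 versus 200). It relies on R's defaults, so `sd()` divides by n - 1 and `IQR()` uses R's default quantile rule; the exact numbers may therefore differ slightly from hand calculations, but the comparison is what matters.

```r
base_data    <- c(3, 5, 6, 8, 11, 14, 17, 24)
with_extreme <- c(3, 5, 6, 8, 11, 14, 17, 200)  # largest value replaced

compare <- rbind(
  base_data    = c(mean = mean(base_data), median = median(base_data),
                   SD = sd(base_data), IQR = IQR(base_data)),
  with_extreme = c(mean = mean(with_extreme), median = median(with_extreme),
                   SD = sd(with_extreme), IQR = IQR(with_extreme))
)
round(compare, 2)
```

The median and IQR columns are identical in both rows, while the mean and SD change substantially, which is what "robust" means in this context.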
Relations Between Variables (bivariate analysis)

A pair of variables are either related in some way (associated) or not
(independent). No pair of variables is both associated and independent.

Some examples of associated variables?

Some examples of independent variables?

Relations Between Variables

Educational attainment of householder    Estimated median income
No high school diploma                   36,230
High school, no college                  53,510
Some college                             71,420
Bachelor's degree or higher              123,000

Are these variables associated? How would you describe the association? Who is affecting whom?
Explanatory and response variable

Explanatory variable(s) (Independent variable)  --might affect-->  Response variable (Dependent variable)

• Association doesn’t imply causation.

• Association is claimed in observational studies (no interference in how data arise).

• Causation is claimed in experimental studies (randomization, control group vs
experimental group).
Plotting independent and dependent variables

Some trends can be found when plotting data.

https://isaim2018.cs.ou.edu/papers/ISAIM2018_Deebani_Kachouie.pdf
Simple Linear Regression Model

Lea (1965) discussed the relationship between mean annual temperature and a mortality index for a type
of breast cancer in women. The data taken from certain regions of Great Britain, Norway, and Sweden,
consist of the mean annual temperature (in degrees Fahrenheit), and a mortality index for neoplasms of
the female breast.

What should be the first step in analyzing any possible relationship between mean annual temperature and
mortality index?
Let’s make a scatter plot
What is this plot revealing?

This linear relationship can be expressed in the model:

y = β0 + β1x + ε

β0 and β1 are the parameters (regression coefficients).
β0 is the intercept.
β1 is the slope: the change in y for a one-unit change in x.
x is the predictor.
y is the response.
ε is the random error term.
Least squares regression (LSE)

Residuals: the difference between the observed response yi and the fitted
value ŷi.

The residuals are expressed:

ei = yi − ŷi, i = 1, . . . , N

The best-fitted linear regression line minimizes the sum of squared residuals:

Σ ei² = Σ (yi − ŷi)², with the sum taken over i = 1, . . . , N

Least squares regression (LSE)

The fitted model is

ŷ = β̂0 + β̂1x

What is the fitted model then?
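One way to answer this question is to compute the least squares estimates directly. The sketch below uses the standard closed-form solution for a single predictor (slope = sum of cross-deviations divided by the sum of squared deviations of the predictor, intercept = mean of the response minus slope times mean of the predictor). The data are the 16 mortality and temperature values listed in the correlation exercise of these slides; the variable names are my own.

```r
# Data as listed in the correlation exercise of these slides
mortality <- c(102.5, 104.5, 100.4, 95.9, 87, 95, 88.6, 89.2,
               78.9, 84.6, 81.7, 72.2, 65.1, 68.1, 67.3, 52.5)
temp <- c(51.3, 49.9, 50, 49.2, 48.5, 47.8, 47.3, 45.1,
          46.3, 42.1, 44.2, 43.5, 42.3, 40.2, 31.8, 34)

# Closed-form least squares estimates for one predictor
b1 <- sum((temp - mean(temp)) * (mortality - mean(mortality))) /
      sum((temp - mean(temp))^2)
b0 <- mean(mortality) - b1 * mean(temp)

fitted_vals <- b0 + b1 * temp
resids      <- mortality - fitted_vals

round(c(intercept = b0, slope = b1), 2)  # -21.79  2.36
sum(resids^2)                            # the minimized sum of squares

# Cross-check against R's built-in linear model fit
coef(lm(mortality ~ temp))
```

Any other straight line through these points would give a larger sum of squared residuals than the value printed here.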

LSE Model for breast cancer mortality

The fitted regression line for the breast cancer mortality data is:

M = −21.79 + 2.36 T

β0 = −21.79, β1 = 2.36. How do you interpret β0 and β1?

What is the average mortality index due to breast cancer at a location that has
a mean annual temperature of 49°F?

M = −21.79 + 2.36 (49) = 93.85
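The prediction can be reproduced in R with the rounded coefficients of the fitted line. The function name `predict_mortality` is my own and is only a sketch of the arithmetic.

```r
# Rounded coefficients of the fitted line
b0 <- -21.79  # intercept
b1 <- 2.36    # slope

# Predicted mortality index for a mean annual temperature in degrees F
predict_mortality <- function(temp_f) b0 + b1 * temp_f

predict_mortality(49)                          # 93.85
predict_mortality(50) - predict_mortality(49)  # 2.36, the slope
```

The second line illustrates the slope: each additional degree Fahrenheit is associated with an increase of about 2.36 in the predicted mortality index. The intercept corresponds to a temperature of 0°F, far below the temperatures in the data listed in these slides (31.8 to 51.3), so it should not be read as a realistic prediction.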
Pearson (r) Correlation

Besides plotting, the Pearson correlation (r) is an effect size measure that can be used to assess the
strength and direction of the linear relationship between two variables.

r = 0 means there is no linear correlation

r = 1 means there is a perfect positive correlation
r = -1 means there is a perfect negative correlation

r = (1/(n − 1)) Σ zxi zyi, with the sum taken over i = 1, . . . , n

zxi = (xi − x̄)/SDx

zyi = (yi − ȳ)/SDy

What is the correlation between M and T for the breast cancer mortality data?

x = c(102.5, 104.5, 100.4, 95.9, 87, 95, 88.6, 89.2, 78.9, 84.6, 81.7, 72.2, 65.1, 68.1, 67.3, 52.5)  # mortality index (M)
y = c(51.3, 49.9, 50, 49.2, 48.5, 47.8, 47.3, 45.1, 46.3, 42.1, 44.2, 43.5, 42.3, 40.2, 31.8, 34)    # mean annual temperature (T)

Using functions set by default in R Studio:

mean(x)    # 83.34375
mean(y)    # 44.59375
sd(x)      # 15.04757
sd(y)      # 5.583603
cor(x, y)  # 0.8748544

Creating your own function in R Studio:

correlation <- function(x, y) {
  n <- length(x)                   # number of paired observations
  z <- (x - mean(x)) / sd(x)       # z-scores of x
  w <- (y - mean(y)) / sd(y)       # z-scores of y
  r <- (1 / (n - 1)) * sum(z * w)  # average product of z-scores
  return(r)
}
correlation(x, y)  # 0.8748544
