3.2 - Relationships Between Categorical Variables

This document discusses analyzing relationships between categorical variables in data analysis. It uses education level as the outcome variable and age group/gender as predictor variables in sample data. Separate bar charts are created for education level by gender, showing females are slightly more likely to have college degrees. Side-by-side bar charts make the differences clearer. Separate plots also show education level differs significantly between age groups, with the youngest and oldest having far fewer college graduates. Side-by-side bars further highlight differences between age groups' educational distributions.

Uploaded by

franco668


FutureLearn 1

DATA TO INSIGHT: AN INTRODUCTION TO DATA ANALYSIS
THE UNIVERSITY OF AUCKLAND

WEEK 3
RELATIONSHIPS BETWEEN CATEGORICAL VARIABLES

Hi again. In this video, you'll learn how to plot data on two categorical variables so
that you can look for relationships between them.

In the week three introductory video, we talked about relationships in terms of a
variable of primary interest-- called an outcome variable-- and variables that might
help us predict the outcome. These we call predictor variables.

Using the enhanced 2009-2012 data, we'll investigate how age group and gender
predict educational achievement levels. Education is our outcome of interest. Age
and gender are our predictor variables.

First, using gender as a predictor, we want to see how the distribution of
educational attainment differs between females and males. Here we have two
separate plots for education -- one for females and one for males. If there was no
difference between the female and male distributions, we'd say there was no
relationship between education and gender.

The two graphs here are very similar but slightly different. For example, about 30%
of females are college graduates, compared with about 28% of males. The two
right-hand bars for females look higher than those for males, so females are
slightly more likely to have college education. The left-hand bars for females are
slightly shorter, so proportionately more males are in the lower levels of
educational attainment.
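As a rough sketch of what each separate plot displays, the percentages can be computed by dividing every count by the total for its own predictor group. The counts and the four category labels below are invented for illustration (they are not the course data); they were chosen only so that the college graduate figures come out at about 30% and 28%.

```python
# Hypothetical counts, invented for illustration only (not the course data).
counts = {
    "female": {"BelowHighSchool": 180, "HighSchool": 200,
               "SomeCollege": 320, "CollegeGrad": 300},
    "male":   {"BelowHighSchool": 200, "HighSchool": 220,
               "SomeCollege": 300, "CollegeGrad": 280},
}


def outcome_percentages(group_counts):
    """Convert one predictor group's counts into percentages of that group."""
    total = sum(group_counts.values())
    return {level: 100.0 * n / total for level, n in group_counts.items()}


# One education distribution per gender: this is what each separate plot shows.
distributions = {group: outcome_percentages(c) for group, c in counts.items()}

for group, dist in distributions.items():
    row = ", ".join(f"{level}: {pct:.1f}%" for level, pct in dist.items())
    print(f"{group:>6} -> {row}")
```

Each group's percentages are taken out of that group's own total, which is why the female and male distributions can be compared even when the groups differ in size.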

We'll now rearrange the sets of bars so that corresponding bars for females and
males are placed beside one another. This is called a side-by-side bar chart. This
makes the small differences between the educational attainment outcomes much
more obvious. The colour coding tells us what predictor group we are looking at--
green for females, red for males.
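The rearrangement can be sketched in a few lines: the same percentages are regrouped so that each education level holds one value per predictor group. The numbers and category labels are hypothetical, and `side_by_side` is a name made up for this example.

```python
# Hypothetical percentages, invented for illustration (not the course data).
distributions = {
    "female": {"BelowHighSchool": 18, "HighSchool": 20,
               "SomeCollege": 32, "CollegeGrad": 30},
    "male":   {"BelowHighSchool": 20, "HighSchool": 22,
               "SomeCollege": 30, "CollegeGrad": 28},
}


def side_by_side(distributions):
    """Regroup {group: {level: pct}} into {level: {group: pct}}."""
    levels = list(next(iter(distributions.values())))
    return {
        level: {group: dist[level] for group, dist in distributions.items()}
        for level in levels
    }


# Each cluster now holds the corresponding bars for every group.
clusters = side_by_side(distributions)

for level, bars in clusters.items():
    difference = bars["female"] - bars["male"]
    print(f"{level:>15}: {bars}  (female minus male: {difference:+d} points)")
```

No value changes in the regrouping; only the order in which the bars are laid out does, which is what makes the small differences easier to read off.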

Which plot should we use? Both! Both have their strengths, and we should use
both.

The first law of using graphics for discovery is that you should look at many types
of graphs. Often you'll spot something in one that you missed in another. A
separate set of plots of the outcome variable, one for each predictor group, is good
for revealing gross differences and overall shape. The side-by-side plot is good for
looking at detailed differences.

Age in decades is a predictor variable with more categories. Here we have a
separate plot for the outcome variable education for each category of age decade.
The separateness of the groups has been emphasised by using colour. If there
was no relationship between the outcome education and the predictor age decade,
all of these plots would be the same. But in fact, there are quite large differences.
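One simple way to put a number on "how different are the plots" is to take, for each education level, the gap between the largest and smallest percentage across the groups. This is only an illustrative sketch: the percentages are invented, only three age groups are shown, and `largest_gaps` is a hypothetical helper rather than anything from the course software.

```python
# Hypothetical percentages, invented for illustration (not the course data).
education_by_age = {
    "20-29": {"BelowHighSchool": 12, "HighSchool": 26,
              "SomeCollege": 40, "CollegeGrad": 22},
    "40-49": {"BelowHighSchool": 15, "HighSchool": 20,
              "SomeCollege": 30, "CollegeGrad": 35},
    "70+":   {"BelowHighSchool": 30, "HighSchool": 30,
              "SomeCollege": 26, "CollegeGrad": 14},
}


def largest_gaps(distributions):
    """For each outcome level, return max minus min percentage across groups."""
    levels = list(next(iter(distributions.values())))
    gaps = {}
    for level in levels:
        values = [dist[level] for dist in distributions.values()]
        gaps[level] = max(values) - min(values)
    return gaps


gaps = largest_gaps(education_by_age)
print(gaps)

# If every group had the same distribution, every gap would be zero.
identical = {"a": {"x": 40, "y": 60}, "b": {"x": 40, "y": 60}}
no_relationship_gaps = largest_gaps(identical)
print(no_relationship_gaps)
```

In a real sample the gaps would rarely be exactly zero even with no underlying relationship, so small gaps on their own are not evidence of one.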

These freehand curves emphasise how the plots differ in shape. The shapes of
the education distributions for the youngest and the oldest group are very different
from the shapes for any of the other age groups. The main difference is the
substantially lower percentages of college graduates. We'll think about possible
explanations later.

Now we'll use side-by-side bars to highlight the differences. Higher bars
correspond to larger percentages and lower bars to lower percentages.

On the right-hand side, we can see the reduction in the percentages of college
graduates with each decade from age 50, and the low percentage of college
graduates in the light blue 20-29 age group. The red 70+ group generally has less
education, shown by lower than usual bars for college graduates and by higher
than usual bars in all of the lower three categories of education. The light blue
20-29 group has unusually large percentages in the high school and some college
categories. We should expect this because many of the younger ones in particular
would not yet have finished their formal education.

You may have noticed in this plot that some bars are narrower than others. The
widths of the bars have been made proportional to the number of people in the
group. There are fewer than half as many people in the 70+ group as there are in
each of the first three age groups.

For all their good points, a big disadvantage of the side-by-side arrangement of
bars is that people often get confused by these graphs, particularly by getting
"percentages of what, for whom" the wrong way around.

They may look at the cluster of CollegeGrad bars and think they're being told
about the percentages of CollegeGrads who fall into each age group. But they are
not. These percentages do not add to 100%. We have to ask: what is the outcome
variable? Its percentages are the ones that add to 100%. And for what groups are
we comparing those outcomes? The percentages that add to 100% are those four
bars with the same colour. They tell us about the outcome variable results for
people in that colour group.
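A quick check of which sets of bars add to 100% can be sketched as follows. The percentages are invented for illustration and only three age groups are included; the point is just that the totals within a colour group are 100, while the totals within a cluster are not.

```python
# Hypothetical percentages, invented for illustration (not the course data).
# Outer keys are the colour groups (age); inner keys are the outcome (education).
percent = {
    "20-29": {"BelowHighSchool": 12, "HighSchool": 26,
              "SomeCollege": 40, "CollegeGrad": 22},
    "40-49": {"BelowHighSchool": 15, "HighSchool": 20,
              "SomeCollege": 30, "CollegeGrad": 35},
    "70+":   {"BelowHighSchool": 30, "HighSchool": 30,
              "SomeCollege": 26, "CollegeGrad": 14},
}

# Bars of one colour: all education levels within a single age group.
within_group = {group: sum(row.values()) for group, row in percent.items()}

# Bars of one cluster: a single education level across all age groups.
levels = list(next(iter(percent.values())))
within_cluster = {
    level: sum(row[level] for row in percent.values()) for level in levels
}

print("Totals within each colour group:", within_group)
print("Totals within each cluster:     ", within_cluster)
```

If the totals within each cluster were 100 instead, the graph would be describing the age make-up of each education level, which is a different question.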

With iNZight graphs, the outcome variable is in the default graph title. If we see
distribution of education, we know the graph is telling us about educational
outcomes. The colour groups are the age groups. We're looking at the educational
outcomes of the various age groups. This is very different from the age group
outcomes for the different educational groups.
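The difference between the two readings shows up clearly when both sets of percentages are computed from the same counts. The sketch below uses invented counts, simplified to two age groups and two education levels; `percentages_within` is a hypothetical helper name.

```python
# Hypothetical counts, invented and simplified for illustration.
counts = {
    "20-29": {"HighSchool": 300, "CollegeGrad": 200},
    "70+":   {"HighSchool": 150, "CollegeGrad": 50},
}


def percentages_within(table):
    """Turn {row: {col: count}} into percentages of each row's total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100.0 * n / total for col, n in cells.items()}
    return result


# Educational outcomes of each age group (what the plots in the video show).
education_given_age = percentages_within(counts)

# Swap the roles: counts indexed by education level first, then age group.
swapped = {}
for age, cells in counts.items():
    for level, n in cells.items():
        swapped.setdefault(level, {})[age] = n

# Age make-up of each education level (the mistaken reading).
age_given_education = percentages_within(swapped)

print("Education given age:", education_given_age)
print("Age given education:", age_given_education)
```

With these made-up counts, 25% of the 70+ group are college graduates, but the 70+ group makes up 20% of all college graduates. The second figure depends on how many people are in each age group, which is why the two readings cannot be interchanged.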

To keep your bearings, it is best to look at both separate and side-by-side plots.
And bear in mind that the side-by-side plot is just a rearrangement of the separate
plots' bars.

In summary, our main tools for investigating the relationship between two
categorical variables are separate bar charts of the outcome variable for each
predictor group and side-by-side bar charts.

Separate bar charts are best for revealing gross differences and overall shape,
while side-by-side bar charts are better for highlighting detailed differences
between corresponding categories.

Finally, I'll leave you with these questions to remind you of the ideas we've just
covered.
