
How Do You Use This Module?: Module 4 - Chapter 4: Test For Contingency Tables

This document provides instructions on how to use a module that teaches statistical concepts through a series of lessons and activities. It outlines the following: 1) Each chapter contains lessons with learning outcomes, discussions, writing assignments, and performance tasks to be completed independently with instructor assistance as needed. 2) Students should read to understand learning outcomes, work through all materials and activities, complete written works, and apply knowledge in performance tasks. 3) All outputs should be kept in a portfolio and submitted periodically. Students must complete this module before beginning the next.

Uploaded by

Xebi Cassim
Copyright
© All Rights Reserved

How Do You Use This Module?

This module is written in a very user-friendly manner. Definitions,
processes, and samples are included as input knowledge or as a guide.
Instructions are clear and straight to the point. You will need your
resourcefulness and creativity to answer the questions and do the tasks well.
Just follow the directions and you will be guided as you move from page to page.

In this module, you are required to go through a series of activities in
order to complete each learning outcome. Each chapter has lessons with Specific
Learning Outcomes, Discussions, Written Works, and Performance Tasks. Follow
and perform the activities on your own. If you have questions, do not hesitate to
ask for assistance from your instructor.

Remember to:

 Read and understand the Specific Learning Outcome(s). These tell you
what you should know and be able to do at the end of this module.

 Work through all the information and complete the activities in each
section.

 Read the discussions carefully. Suggested references are included to
supplement the materials provided in this module.

 After reading every discussion, test yourself on how much you learned by
means of the Written Works. Use the White Book to write your answers.

 Demonstrate what you learned by doing the Performance Tasks. You
must be able to apply what you have learned in another activity or in a
real-life situation.

 Keep all the outputs in your portfolio as a record of your
accomplishments and submit them during the designated period.

Note: You need to complete this module before you can perform the next
module.

Module 4 | Chapter 4: Test for Contingency Tables 1


Course Outline

Module 1 | Time Frame: February 1 – February 28, 2021 | Assessment Period: Midterm Exam
Chapter 1: Introduction to Data
 Lesson 1: Introduction to Data
 Lesson 2: Categorical Data
 Lesson 3: Numerical Data

Module 2 | Time Frame: March 1 – March 31, 2021 | Assessment Period: Midterm Exam
Chapter 2: Basics of Sampling
 Lesson 1: Correlation Analysis
 Lesson 2: Basics of Sampling

Module 3 | Time Frame: April 1 – April 30, 2021 | Assessment Period: Final Exam
Chapter 3: Sampling Distribution
 Lesson 1: Sampling Distribution
 Lesson 2: Test for Means
 Lesson 3: Test for Proportions

Module 4 | Time Frame: May 1 – May 30, 2018 | Assessment Period: Final Exam
Chapter 4: Tests for Contingency Tables
 Lesson 1: Tests for Contingency Tables
 Lesson 2: Inferences for Correlation and Simple Linear Regression

Grading System

The grading system is as follows:

Assessment Period SA Exam TOTAL


Midterm Grades 60% 40% 100%
Final Grades 60% 40% 100%

(Midterm Grades + Final Grades) / 2 = Final Rating



Table of Contents

How Do You Use This Module?

Module 4: Chapter 4: Inferences for Correlation and Simple Linear Regression
Lesson 1: Inferences for Correlation and Simple Linear Regression

References

Chapter 4: Inferences for Correlation and Simple Linear Regression


Objective/s: After completing this chapter the students will be able to:
1. Identify the Inferences for Correlation and Simple Linear
Regression;
2. Identify the x and y variables; and
3. Interpret the data gathered.



Lesson 1: Inferences for Correlation and Simple Linear Regression

In many studies, we measure more than one variable for each individual.
For example, we measure precipitation and plant growth, or number of young
with nesting habitat, or soil erosion and volume of water. We collect pairs of data
and instead of examining each variable separately (univariate data), we want to
find ways to describe bivariate data, in which two variables are measured on each
subject in our sample. Given such data, we begin by determining if there is a
relationship between these two variables. As the values of one variable change,
do we see corresponding changes in the other variable?

We can describe the relationship between these two variables graphically
and numerically. We begin by considering the concept of correlation.

Correlation is defined as the statistical association between two variables.

A correlation exists between two variables when one of them is related to
the other in some way. A scatterplot is the best place to start. A
scatterplot (or scatter diagram) is a graph of the paired (x, y) sample data
with a horizontal x-axis and a vertical y-axis. Each individual (x, y) pair is
plotted as a single point.

In this example, we plot bear chest girth (y) against bear length (x). When
examining a scatterplot, we should study the overall pattern of the plotted points.
In this example, we see that the value for chest girth does tend to increase as the
value of length increases. We can see an upward slope and a straight-line pattern
in the plotted data points.

A scatterplot can identify several different types of relationships between
two variables.

 A relationship has no correlation when the points on a scatterplot do
not show any pattern.
 A relationship is non-linear when the points on a scatterplot follow a
pattern but not a straight line.
 A relationship is linear when the points on a scatterplot follow a
somewhat straight line pattern. This is the relationship that we will
examine.

Linear relationships can be either positive or negative. Positive relationships have
points that incline upwards to the right. As x values increase, y values increase. As
x values decrease, y values decrease. For example, when studying plants, height
typically increases as diameter increases.



Negative relationships have points that decline downward to the right. As x
values increase, y values decrease. As x values decrease, y values increase. For
example, as wind speed increases, wind chill temperature decreases.

When two variables have no relationship, there is no straight-line relationship or
non-linear relationship. When one variable changes, it does not influence the
other variable.

Non-linear relationships have an apparent pattern, just not linear. For example, as
age increases height increases up to a point then levels off after reaching a
maximum height.



Linear Correlation Coefficient

Because visual examinations are largely subjective, we need a more precise and
objective measure to define the correlation between the two variables. To
quantify the strength and direction of the relationship between two variables, we
use the linear correlation coefficient:

$$r = \frac{\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1}$$

where x̄ and s_x are the sample mean and sample standard deviation of the x's,
and ȳ and s_y are the mean and standard deviation of the y's. The sample size is n.

An alternate computation of the correlation coefficient is:

$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$$

where

$$S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}, \qquad S_{yy} = \sum y^2 - \frac{\left(\sum y\right)^2}{n}$$

The linear correlation coefficient is also referred to as Pearson's product moment
correlation coefficient in honor of Karl Pearson, who originally developed it. This
statistic numerically describes how strong the straight-line or linear relationship is
between the two variables and the direction, positive or negative.

The properties of "r":

 It is always between -1 and +1.
 It is a unitless measure, so "r" would be the same value whether you
measured the two variables in pounds and inches or in grams and
centimeters.
 Positive values of "r" are associated with positive relationships.
 Negative values of "r" are associated with negative relationships.
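The computation of "r" can be sketched in a few lines of Python. This is a minimal illustration, not part of the module: the function names and the length and girth values are made up for the example. The first function follows the definition of r as a sum of standardized products divided by n - 1; the second uses the algebraically equivalent sums-of-squares shortcut, r = Sxy / sqrt(Sxx · Syy).

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as the sum of standardized products divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


def pearson_r_shortcut(xs, ys):
    """Equivalent shortcut: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical measurements in inches (illustration only, not real bear data).
lengths_in = [40, 45, 50, 55, 60, 65]
girths_in = [20, 24, 27, 33, 34, 40]

r_def = pearson_r(lengths_in, girths_in)
r_alt = pearson_r_shortcut(lengths_in, girths_in)
print(r_def, r_alt)
```

Both functions should agree up to rounding error. Converting both variables to centimeters before calling either function should also leave the result unchanged, which illustrates that "r" is unitless.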



Examine these next two scatterplots. Both of these data sets have an r =
0.01, but they are very different. Plot 1 shows little linear relationship between x
and y variables. Plot 2 shows a strong non-linear relationship. Pearson’s linear
correlation coefficient only measures the strength and direction of a linear
relationship. Ignoring the scatterplot could result in a serious mistake when
describing the relationship between two variables.
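The same point can be checked numerically. The sketch below uses made-up data in which y is completely determined by x through y = x², yet the linear correlation coefficient is essentially zero because the pattern is a curve rather than a straight line. The function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient, r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: a perfect but non-linear (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # essentially zero, even though y depends entirely on x
```

A value of "r" near zero therefore rules out a linear relationship only. It does not rule out a relationship of another shape, which is why the scatterplot comes first.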

When you investigate the relationship between two variables, always begin
with a scatterplot. This graph allows you to look for patterns (both linear and non-
linear). The next step is to quantitatively describe the strength and direction of the
linear relationship using “r”. Once you have established that a linear relationship
exists, you can take the next step in model building.

Simple Linear Regression


Once we have identified two variables that are correlated, we would like to
model this relationship. We want to use one variable as a predictor or explanatory
variable to explain the other variable, the response or dependent variable. In
order to do this, we need a good relationship between our two variables. The
model can then be used to predict changes in our response variable. A strong
relationship between the predictor variable and the response variable leads to a
good model.



A simple linear regression model is a mathematical equation that allows us to
predict a response for a given predictor value.

Our model will take the form of ŷ = b0 + b1x, where b0 is the y-intercept,
b1 is the slope, x is the predictor variable, and ŷ is an estimate of the mean value of
the response variable for any value of the predictor variable.

The y-intercept is the predicted value for the response (y) when x = 0. The
slope describes the change in y for each one unit change in x. Let’s look at this
example to clarify the interpretation of the slope and intercept.

This simple model is the line of best fit for our sample data. The regression line
does not go through every point; instead it balances the difference between all
data points and the straight-line model. The difference between the observed
data value and the predicted value (the value on the straight line) is the error or
residual. The criterion to determine the line that best describes the relation
between two variables is based on the residuals.

Residual = Observed - Predicted
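A minimal sketch of fitting such a line and computing its residuals is shown below. It assumes the ordinary least squares criterion (choose b0 and b1 to minimize the sum of squared residuals) and uses the standard closed-form solution b1 = Sxy / Sxx and b0 = ȳ - b1·x̄, which is not derived in this module. The function names and the weight and girth values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (observed - predicted)^2 for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data (illustration only): weight in lb., chest girth in in.
weights = [100, 120, 150, 180, 200]
girths = [55.0, 62.1, 80.0, 91.5, 98.0]

b0, b1 = fit_line(weights, girths)
predicted = [b0 + b1 * x for x in weights]
# Residual = Observed - Predicted
residuals = [y - y_hat for y, y_hat in zip(girths, predicted)]

print(b0, b1)
print(residuals)
```

The residuals of a least-squares line with an intercept sum to zero up to rounding error, so the positive and negative residuals balance. Any other choice of intercept or slope gives a larger sum of squared residuals on the same data.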



For example, if you wanted to predict the chest girth of a black bear given its
weight, you could use the following model.

Chest girth = 13.2 + 0.43 weight

The predicted chest girth of a bear that weighed 120 lb. is 64.8 in.

Chest girth = 13.2 + 0.43(120) = 64.8 in.

But a measured bear chest girth (observed value) for a bear that weighed 120 lb.
was actually 62.1 in.

The residual would be 62.1 – 64.8 = -2.7 in.

A negative residual indicates that the model is over-predicting. A positive residual
indicates that the model is under-predicting. In this instance, the model
over-predicted the chest girth of a bear that actually weighed 120 lb.

This random error (residual) takes into account all unpredictable and unknown
factors that are not included in the model. An ordinary least squares regression
line minimizes the sum of the squared errors between the observed and predicted
values to create a best fitting line. The differences between the observed and
predicted values are squared to deal with the positive and negative differences.

Coefficient of Determination

After we fit our regression line (compute b0 and b1), we usually wish to know how
well the model fits our data. To determine this, we need to think back to the idea
of analysis of variance. In ANOVA, we partitioned the variation using sums of
squares so we could identify a treatment effect as opposed to random variation that
occurred in our data. The idea is the same for regression. We want to partition the
total variability into two parts: the variation due to the regression and the
variation due to random error. And we are again going to compute sums of
squares to help us do this.

Suppose the total variability in the sample measurements about the sample mean
is denoted by $\sum (y_i - \bar{y})^2$, called the sums of squares of total variability about the
mean (SST). The squared difference between the predicted value $\hat{y}_i$ and
the sample mean is denoted by $\sum (\hat{y}_i - \bar{y})^2$, called the sums of squares due to
regression (SSR). The SSR represents the variability explained by the regression
line. Finally, the variability which cannot be explained by the regression line is
called the sums of squares due to error (SSE) and is denoted by $\sum (y_i - \hat{y}_i)^2$. SSE is
the sum of the squared residuals.
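The three sums of squares can be computed directly from a fitted line. The sketch below is illustrative only: the data and names are hypothetical, and the line is fitted with the standard least-squares formulas b1 = Sxy / Sxx and b0 = ȳ - b1·x̄. It checks that SST splits into SSR plus SSE and that the explained share SSR / SST equals the square of the linear correlation coefficient, which holds for simple linear regression.

```python
import math

# Hypothetical paired data (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in xs]

# Sums of squares as defined in the text.
sst = sum((y - y_bar) ** 2 for y in ys)                # total variability
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by the line
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained (residual)

r = s_xy / math.sqrt(s_xx * s_yy)   # linear correlation coefficient
r_squared = ssr / sst               # share of variation explained

print(sst, ssr + sse)
print(r_squared, r ** 2)
```

The two printed pairs should match up to rounding error. If the line were fitted by some method other than least squares, the partition SST = SSR + SSE would not hold in general.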



The sums of squares and mean sums of squares (just like ANOVA) are typically
presented in the regression analysis of variance table. The ratio of the mean sums
of squares for the regression (MSR) and mean sums of squares for error (MSE)
forms an F-test statistic used to test the regression model.

The relationship between these sums of squares is defined as

Total Variation = Explained Variation + Unexplained Variation

The larger the explained variation, the better the model is at prediction. The
larger the unexplained variation, the worse the model is at prediction. A
quantitative measure of the explanatory power of a model is R², the Coefficient of
Determination:

$$R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}}$$

The Coefficient of Determination measures the percent variation in the response
variable (y) that is explained by the model.

 Values range from 0 to 1.
 An R² close to zero indicates a model with very little explanatory power.
 An R² close to one indicates a model with more explanatory power.

The Coefficient of Determination and the linear correlation coefficient are related
mathematically.

$$R^2 = r^2$$

However, they have two very different meanings: r is a measure of the strength
and direction of a linear relationship between two variables; R² describes the
percent variation in "y" that is explained by the model.

Self-Assessment 10 (10 points)

1. Teen Birth Rate and Poverty Level Data. This dataset has size n = 51, covering the
50 states and the District of Columbia in the United States. The variables
are y = year 2002 birth rate per 1000 females 15 to 17 years old and
x = poverty rate, which is the percent of the state's population living in
households with incomes below the federally defined poverty level.

Activity 5 (10 points)

Lung Function in 6 to 10 year old children. The data are from n = 345
children between 6 and 10 years old. The variables are y = forced exhalation
volume (FEV), a measure of how much air somebody can forcibly exhale from their
lungs, and x = age in years.

Reference/s

https://milnepublishing.geneseo.edu/natural-resources-biometrics/chapter/chapter-7-correlation-and-simple-linear-regression/
