Module 2 Examining Relationship Quiz Assignment

Uploaded by

Jeffer Sol E. Cortado

Jeffer Sol E. Cortado

BSA - 3rd Year

STT041

APPLY YOUR KNOWLEDGE

2.2 Price versus size. You visit a local Starbucks to buy a Mocha Frappuccino©. The barista
explains that this blended coffee beverage comes in three sizes and asks if you want a Tall, a
Grande, or a Venti. The prices are $3.50, $4.00, and $4.50, respectively.
(a) What are the variables and cases?
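The cases are the three drink sizes: Tall, Grande, and Venti. The variables are the size of
the drink (categorical) and its price (quantitative).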

(b) Which variable is the explanatory variable? Which is the response variable?
Explain your answers.
The explanatory variable is the size and the response variable is the price ($3.50, $4.00,
and $4.50), because the size of the beverage explains the price charged, not the other way around.

(c) The Tall contains 12 ounces of beverage, the Grande contains 16 ounces, and the
Venti contains 20 ounces. Answer parts (a) and (b) with ounces in place of the
names for the sizes.

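Coffee Beverage    Price    Ounces

Tall               $3.50    12 oz.
Grande             $4.00    16 oz.
Venti              $4.50    20 oz.

With ounces in place of the size names, the cases are still the three drinks. The variables are
size in ounces and price, both quantitative. Size is the explanatory variable and price is the
response variable.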

2.3 Make a data set.

(a) Create a spreadsheet that contains the spam botnet data.

BOTNET      BOTS (thousands)    SPAMS PER DAY (billions)

Srizbi      315                 60
Bobax       185                 9
Rustock     150                 30
Cutwail     125                 16
Storm       85                  3
Grum        50                  2
Ozdok       35                  10
Nucrypt     20                  5
Wopla       20                  1
Spamthru    12                  0

(b) How many cases are in your data set?

There are ten cases in the data set, one for each botnet: Srizbi, Grum, Bobax, Ozdok,
Rustock, Nucrypt, Cutwail, Wopla, Storm, and Spamthru.

(c) Describe the labels, variables, and values that you used.
The botnet name is the label that identifies each case. The variables are (a) bots, in thousands,
and (b) spams per day, in billions. A botnet is a remotely and silently controlled collection of
networked computers; the botnets here are used to send unwanted commercial email (spam). A bot is
a software application programmed to perform certain tasks automatically. Spam is unsolicited junk
email sent out in bulk over the internet. The values are the quantities recorded for each botnet.

(d) Which columns give quantitative variables?

The columns BOTS and SPAMS PER DAY give quantitative variables.

2.5 Make a scatterplot.

(a) Make a scatterplot similar to Figure 2.1 for the spam botnet data.

(b) Mark the location of the botnet Bobax on your plot.

The green arrow points to Bobax.

2.6 Change the units.
(a) Create a spreadsheet with the spam botnet data using the actual values. In other
words, for Srizbi use 315,000 for the number of bots and 60,000,000,000 for the
number of spam messages per day.

BOTNET      BOTS       SPAMS PER DAY

Srizbi      315,000    60,000,000,000
Bobax       185,000    9,000,000,000
Rustock     150,000    30,000,000,000
Cutwail     125,000    16,000,000,000
Storm       85,000     3,000,000,000
Grum        50,000     2,000,000,000
Ozdok       35,000     10,000,000,000
Nucrypt     20,000     5,000,000,000
Wopla       20,000     600,000,000
Spamthru    12,000     350,000,000

(b) Make a scatterplot for the data coded in this way.

(c) Describe how this scatterplot differs from Figure 2.1.

The pattern of points is the same as in Figure 2.1; only the scales on the axes differ. The
first figure is easier to read because its values span a narrow range of numbers, while the second
figure uses values in the thousands and billions, making it harder to estimate and plot the data.

2.13 Is the cost too high? Because it is so costly, many individuals and families cannot afford
to purchase health insurance. The Current Population Survey collected data on the
characteristics of the uninsured. Below are the numbers of uninsured and the total number of
people classified by age. The units are thousands of people.

(a) Plot the number of uninsured versus age group.

(b) Find the total number of uninsured persons, and use this total to compute the percent of the
uninsured who are in each age group.

Age Group             Percentage of Uninsured

Under 18 years        11.69%
18 to 24 years        29.30%
25 to 34 years        26.87%
35 to 44 years        18.75%
45 to 64 years        14.19%
65 years and older    1.50%
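
If these percents were each age group's share of all uninsured persons, they should add up to about 100%. The short check below is only a consistency sketch that uses the six values from the table; the dictionary name is illustrative. The underlying counts are not reproduced in this document, so the check cannot say what the correct shares are.

```python
# Percents copied from the table above (age group -> percent).
percent_by_age = {
    "Under 18 years": 11.69,
    "18 to 24 years": 29.30,
    "25 to 34 years": 26.87,
    "35 to 44 years": 18.75,
    "45 to 64 years": 14.19,
    "65 years and older": 1.50,
}

# Shares of one total should sum to roughly 100.
total_percent = sum(percent_by_age.values())
print(round(total_percent, 2))  # 102.3
```

A total of 102.30% suggests the values may be the percent uninsured within each age group (uninsured divided by that group's total) rather than shares of the overall uninsured total.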

(c) Plot the percentages versus age group.

(d) Explain how the plot you produced in part (c) differs from
the plot that you made in part (a).
The plot in part (a) plots the numbers of uninsured and total number of population for
each age group, while in part (c), only the percentages of uninsured are being plotted.

(e) Summarize what you can conclude from these plots.

Based on my observation, the plot in part (a) is useful for readers who want to compare the
number of uninsured with the total number of people in each age group. The plot in part (c), on
the other hand, is useful for readers who only want to see the percentage of uninsured for each
age group.

2.15 Compare the two percents. In the previous two exercises, you computed percents in two
different ways and generated plots versus age group. Describe the difference between the two
ways with an emphasis on what kinds of conclusions can be drawn from each.
In plot (a), the data are more informative because the plot shows two variables: the total
population and the number of uninsured. Plot (c) shows only the percentages, so it gives less
information than the first plot.

2.25 Spam botnets. In Exercise 2.3 you made a data set for the botnet data. Use that data set
to compute the correlation between the number of bots and the number of spam messages per
day.

X Values
∑ = 997
Mean = 99.7
∑(X - Mx)² = SSx = 84068.1

Y Values
∑ = 136
Mean = 13.6
∑(Y - My)² = SSy = 3126.4

X and Y Combined
N = 10
∑(X - Mx)(Y - My) = 14330.8

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 14330.8 / √((84068.1)(3126.4)) = 0.884
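
The hand calculation can be cross-checked with a short script. This is a minimal sketch, assuming the spam values as they appear in the Exercise 2.3 table (with 1 and 0 for Wopla and Spamthru); the function name `pearson_r` is illustrative and not part of the exercise.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed as Sxy / sqrt(SSx * SSy)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ssx = sum((x - mx) ** 2 for x in xs)                     # sum of squares for X
    ssy = sum((y - my) ** 2 for y in ys)                     # sum of squares for Y
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # cross-product sum
    return sxy / math.sqrt(ssx * ssy)

bots = [315, 185, 150, 125, 85, 50, 35, 20, 20, 12]   # thousands
spam = [60, 9, 30, 16, 3, 2, 10, 5, 1, 0]             # billions per day

r_botnet = pearson_r(bots, spam)
print(round(r_botnet, 3))  # 0.884
```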

2.26 Change the units. In the previous exercise bots were given in thousands and spam messages
per day were recorded in billions. In Exercise 2.6 you created a data set using the actual values. For
example, Srizbi has 315,000 bots and generates 60,000,000,000 spam messages per day.

(a) Find the correlation between bots and spam messages using this data set.

X Values
∑ = 997000
Mean = 99700
∑(X - Mx)² = SSx = 84068100000

Y Values
∑ = 135950000000
Mean = 13595000000
∑(Y - My)² = SSy = 3.12724225E+21

X and Y Combined
N = 10
∑(X - Mx)(Y - My) = 1.4331985E+16

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 1.4331985E+16 / √((84068100000)(3.12724225E+21)) = 0.8839

(b) Compare this correlation with the one that you computed in the previous exercise.
The two correlations are essentially the same (0.884 and 0.8839). The only difference is the
length of the computation, since the actual values are much larger numbers.

(c) What can you say in general about the effect of changing units in this way on the
size of the correlation?
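Changing the units in this way has no effect on the size of the correlation: r stays the same
when every value of a variable is multiplied by the same positive constant. I'd still rather choose
the first way of computing since it requires less time.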
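
Correlation is built from standardized deviations, so multiplying a variable by a positive constant leaves r unchanged. The sketch below illustrates this with the actual-value botnet data from Exercise 2.6, first expressed in thousands and billions and then rescaled; the helper `pearson_r` and the variable names are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed as Sxy / sqrt(SSx * SSy)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(ssx * ssy)

bots_thousands = [315, 185, 150, 125, 85, 50, 35, 20, 20, 12]
spam_billions = [60, 9, 30, 16, 3, 2, 10, 5, 0.6, 0.35]

# Rescale to actual counts: thousands -> units, billions -> units.
bots_actual = [x * 1_000 for x in bots_thousands]
spam_actual = [y * 1_000_000_000 for y in spam_billions]

r_scaled = pearson_r(bots_thousands, spam_billions)
r_actual = pearson_r(bots_actual, spam_actual)
print(round(r_scaled, 4), round(r_actual, 4))  # 0.8839 0.8839
```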

2.27 Correlation for debt. Figure 2.6 (page 84) is a scatterplot of 2007 debt versus 2006 debt for 24
countries. Is the correlation r for these data near −1, clearly negative but not near −1, near 0, clearly positive
but not near 1, or near 1? Explain your answer.

X Values
∑ = 10400
Mean = 2080
∑(X - Mx)² = SSx = 9920096

Y Values
∑ = 11316
Mean = 2263.2
∑(Y - My)² = SSy = 10253300.8

X and Y Combined
N = 5
∑(X - Mx)(Y - My) = 10074160

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))
r = 10074160 / √((9920096)(10253300.8)) = 0.9989

Thus, the correlation r is near 1: high values of one year's debt go with high values of the
other, and the points fall close to a straight line with positive slope.

2.29 Strong association but no correlation. Here is a data set that illustrates an important point
about correlation:

x 20 30 40 50 60
y 10 30 50 30 10
(a) Make a scatterplot of y versus x.

(b) Describe the relationship between y and x. Is it weak or strong? Is it linear?

The relationship is strong but not linear. y rises steadily as x goes from 20 to 40 and then
falls steadily as x goes from 40 to 60, so the points follow a clear curved pattern rather than a
straight line.

(c) Find the correlation between y and x.

X Values
∑ = 200
Mean = 40
∑(X - Mx)² = SSx = 1000

Y Values
∑ = 130
Mean = 26
∑(Y - My)² = SSy = 1120

X and Y Combined
N = 5
∑(X - Mx)(Y - My) = 0

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 0 / √((1000)(1120)) = 0

(d) What important point about correlation does this exercise illustrate?

The important point in this exercise is that correlation measures only the strength of a
linear relationship. Two quantitative variables can have a strong curved relationship and still
have a correlation of 0.
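
A short script makes the point concrete. This sketch uses the five (x, y) pairs from the exercise; splitting the data into a rising half and a falling half is an illustrative choice, not part of the exercise, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed as Sxy / sqrt(SSx * SSy)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(ssx * ssy)

x = [20, 30, 40, 50, 60]
y = [10, 30, 50, 30, 10]

r_all = pearson_r(x, y)              # whole data set
r_rising = pearson_r(x[:3], y[:3])   # x from 20 to 40, y increasing
r_falling = pearson_r(x[2:], y[2:])  # x from 40 to 60, y decreasing
print(r_all, r_rising, r_falling)
```

Each half is perfectly linear (r of 1 and −1), yet the two halves cancel and the overall correlation is 0.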

2.32 First test and final exam. How strong is the relationship between the score on the first test
and the score on the final exam in an elementary statistics course? Here are data for eight students
from such a course:
First-test score 153 144 162 149 127 158 158 153
Final-exam score 145 140 145 170 145 175 170 160

(a) Do you think that one of these variables should be an explanatory variable and the other a
response variable? Give reasons for your answer.
Yes, that's possible. One explanation for the final scores being higher or lower than the first
score is that the students carried their knowledge of the subjects covered on the initial test over to
the final exam. Thus, the final exam results are influenced by how well they understood the subjects
covered on the first test. In this case, the explanatory variable could be the first-test score and the
response variable is the final-exam score.

(b) Make a scatterplot and describe the relationship.

Although technically a positive correlation, the relationship between the two variables is weak
because the nearer the value of r is to zero, the weaker the relationship.

(c) Find the correlation.

X Values
∑ = 1204
Mean = 150.5
∑(X - Mx)² = SSx = 854

Y Values
∑ = 1250
Mean = 156.25
∑(Y - My)² = SSy = 1387.5

X and Y Combined
N = 8
∑(X - Mx)(Y - My) = 445

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 445 / √((854)(1387.5)) = 0.4088

(d) Give some possible reasons why this relationship is so weak.

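The value r = 0.4088 is nearer to zero than to 1, so the relationship is weak. Possible reasons
are that only eight students are included and that the first test covers only the early part of the
course, so a student's performance can change considerably before the final exam.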

2.33 Second test and final exam. Refer to the previous exercise. Here are the data for the second
test and the final exam for the same students.

Second-test score 158 162 144 162 136 158 175 153
Final-exam score 145 140 145 170 145 175 170 160

(a) Explain why you should use the second-test score as the explanatory variable.
The second-test score is the explanatory variable because the second test is taken before the
final exam, so it can help explain or predict the final-exam score.

(b) Make a scatterplot and describe the relationship.

The scatterplot shows a weak to moderate positive relationship between the variables.

(c) Find the correlation.

X Values
∑ = 1248
Mean = 156
∑(X - Mx)² = SSx = 994

Y Values
∑ = 1250
Mean = 156.25
∑(Y - My)² = SSy = 1387.5

X and Y Combined
N = 8
∑(X - Mx)(Y - My) = 610

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 610 / √((994)(1387.5)) = 0.5194

(d) Why do you think the relationship between the second-test score and the final-exam score is
stronger than the relationship between the first-test score and the final-exam score?

The relationship between the second-test score and the final-exam score is stronger because
the second test is taken closer in time to the final exam than the first test is.
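
The two correlations can be compared side by side with a short script. This is a minimal sketch using the eight students' scores from Exercises 2.32 and 2.33; the helper `pearson_r` and the list names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed as Sxy / sqrt(SSx * SSy)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(ssx * ssy)

first_test = [153, 144, 162, 149, 127, 158, 158, 153]
second_test = [158, 162, 144, 162, 136, 158, 175, 153]
final_exam = [145, 140, 145, 170, 145, 175, 170, 160]

r_first = pearson_r(first_test, final_exam)
r_second = pearson_r(second_test, final_exam)
print(round(r_first, 4), round(r_second, 4))  # 0.4088 0.5194
```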

2.34 Add an outlier. Refer to the previous exercise. Add a ninth student whose scores on the
second test and final exam would lead you to classify the additional data point as an outlier.

(a) Highlight the outlier on your scatterplot.

(b) Find the correlation and describe the effect of the outlier on the correlation.
X Values
∑ = 1438
Mean = 159.778
∑(X - Mx)² = SSx = 2021.556

Y Values
∑ = 1445
Mean = 160.556
∑(Y - My)² = SSy = 2722.222

X and Y Combined
N = 9
∑(X - Mx)(Y - My) = 1781.111

R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))

r = 1781.111 / √((2021.556)(2722.222)) = 0.7593

This is a strong positive correlation, which means that high X variable scores go with high Y variable
scores (and vice versa).

(c) Describe the performance of the student on the second exam and final exam and why that leads
to the conclusion that the result is an outlier. Give a possible reason for the performance of this
student.
The ninth student got very high scores on both exams, well above those of the other eight
students, which is why the point is classified as an outlier. Because the point lies far from the
rest of the data, it greatly affects the correlation of the two variables: r increases from
0.5194 to 0.7593.
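
The effect of the added point can be reproduced in a few lines. The ninth student's scores are not stated in the text; the values 190 (second test) and 195 (final exam) used below are inferred from the reported sums (1438 − 1248 and 1445 − 1250) and should be read as an assumption. The helper `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed as Sxy / sqrt(SSx * SSy)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(ssx * ssy)

second_test = [158, 162, 144, 162, 136, 158, 175, 153]
final_exam = [145, 140, 145, 170, 145, 175, 170, 160]

# Assumed outlier, inferred from the reported sums rather than given in the text.
outlier = (190, 195)

r_before = pearson_r(second_test, final_exam)
r_after = pearson_r(second_test + [outlier[0]], final_exam + [outlier[1]])
print(round(r_before, 4), round(r_after, 4))
```

A single point far from the rest, lying in the direction of the overall trend, is enough to move r from about 0.52 to about 0.76.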

2.57 A regression line. A regression equation is y = 10 + 20x.

(a) What is the slope of the regression line? 20

(b) What is the intercept of the regression line? 10

(c) Find the predicted values of y for x = 10, for x = 20, and for x = 30.
x = 10
y = 10 + 20 (x)
y = 10 + 20 (10)
y = 10 + 200
y = 210

x = 20
y = 10 + 20 (x)
y = 10 + 20 (20)
y = 10 + 400
y = 410

x = 30
y = 10 + 20 (x)
y = 10 + 20 (30)
y = 10 + 600
y = 610
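
The same predictions can be generated by evaluating the regression equation as a function. This is a small sketch; the function name `predict` and its default arguments are illustrative, taken from the equation y = 10 + 20x.

```python
def predict(x, intercept=10, slope=20):
    """Predicted y on the regression line y = intercept + slope * x."""
    return intercept + slope * x

predictions = {x: predict(x) for x in (10, 20, 30)}
print(predictions)  # {10: 210, 20: 410, 30: 610}
```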

(d) Plot the regression line for values of x between 0 and 50.

2.59 The “January effect.” Some people think that the behavior of the stock market in January
predicts its behavior for the rest of the year. Take the explanatory variable x to be the percent
change in a stock market index in January and the response variable y to be the change in the index
for the entire year. We expect a positive correlation between x and y because the change during
January contributes to the full year’s change. Calculation based on 38 years of data gives

x̄ = 1.75%    sx = 5.36%    r = 0.596

ȳ = 9.07%    sy = 15.35%
(a) What percent of the observed variation in yearly changes in the index is explained by a straight-
line relationship with the change during January?
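r² = 0.596² = 0.3552
0.3552 × 100 = 35.52%
About 35.52% of the observed variation in yearly changes is explained by the straight-line
relationship with the change during January.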

(b) What is the equation of the least-squares line for predicting full-year change from January
change?
b = r × sy/sx = 0.596 × 15.35/5.36 = 1.707
a = ȳ − b x̄ = 9.07% − (1.707)(1.75%) = 6.08%
ŷ = 6.08% + 1.707x

(c) The mean change in January is x = 1.75%. Use your regression line to predict change in the
index in a year in which the index rises 1.75% in January. Why could you have given this result (up
to round off error) without doing the calculation?

ŷ = 6.08% + 1.707(1.75%) = 9.07%

The least-squares regression line always passes through the point (x̄, ȳ). Thus, we would
predict ŷ = ȳ = 9.07% when x = x̄ = 1.75%.
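
The three parts of this exercise follow directly from the summary statistics, as the sketch below shows. It assumes the values given in the exercise (means 1.75% and 9.07%, standard deviations 5.36% and 15.35%, r = 0.596); the variable names are illustrative.

```python
# Summary statistics from the exercise (all in percent).
x_bar, s_x = 1.75, 5.36    # January change: mean and standard deviation
y_bar, s_y = 9.07, 15.35   # full-year change: mean and standard deviation
r = 0.596

r_squared = r ** 2                   # fraction of variation explained
slope = r * s_y / s_x                # b = r * sy / sx
intercept = y_bar - slope * x_bar    # a = y_bar - b * x_bar

# Predicting at x = x_bar returns y_bar, because the line passes through (x_bar, y_bar).
prediction_at_mean = intercept + slope * x_bar

print(round(r_squared * 100, 2))     # 35.52
print(round(slope, 3), round(intercept, 2))
print(round(prediction_at_mean, 2))  # 9.07
```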
