
ECON20003 – QUANTITATIVE METHODS 2

TUTORIAL 7

Download the t7e1, t7e2, t7e4, t7e5 and t7e8 Excel data files from the subject website and
save them to your computer or USB flash drive. Read this handout and try to complete the
tutorial exercises before your tutorial class, so that you can ask your tutor for help during
the Zoom session if necessary.

After you have completed the tutorial exercises attempt the “Exercises for assessment”. You
must submit your answers to these exercises in the Tutorial 7 Homework Canvas
Assignment Quiz by the next tutorial in order to get the tutorial mark. For each assessment
exercise type your answer in the relevant box available in the Quiz or type your answer
separately in Microsoft Word and upload it in PDF format as an attachment. In either case,
if the exercise requires you to use R, save the relevant R/RStudio script and printout in a
Word document and upload it together with your written answer in PDF format.

Chi-Square Tests for the Analysis of Frequencies

These tests are based on multinomial experiments involving qualitative or quantitative
variables measured on a nominal or ordinal scale. A multinomial experiment is a
generalization of the binomial experiment and is characterised by the following three
properties:

(i) There is a sequence of n identical but independent trials.
(ii) Each trial has the same k mutually exclusive but exhaustive outcomes.
(iii) The probability of each outcome is constant.

If $A_i$ is the ith possible outcome (i = 1, 2, …, k) in each trial, $p_i$ is its unknown probability, $e_i$
is its expected frequency, and $o_i$ is its observed frequency, then

$$e_i = np_i \qquad\text{and}\qquad \sum_{i=1}^{k} e_i = n\sum_{i=1}^{k} p_i = n$$

$$\hat{p}_i = \frac{o_i}{n} \qquad\text{and}\qquad \sum_{i=1}^{k} \hat{p}_i = \frac{1}{n}\sum_{i=1}^{k} o_i = 1$$

and for large samples, i.e. when each $e_i \ge 5$,

$$\chi^2 = n\sum_{i=1}^{k} \frac{(\hat{p}_i - p_i)^2}{p_i} \sim \chi^2_{df}, \qquad df = k - m$$

where m denotes the number of constraints imposed on the data plus the number of
coefficients to be estimated to calculate the expected frequencies. Its value depends on the
actual application of the chi-square test, but it is always at least one because the sum of the
expected frequencies must equal the number of trials, i.e. the sample size.

L. Kónya, 2020, Semester 2 ECON20003 - Tutorial 7

In practice, this definitional formula can be replaced with the following computational
formulas:

$$\chi^2 = n\sum_{i=1}^{k}\frac{(\hat{p}_i - p_i)^2}{p_i} = \sum_{i=1}^{k}\frac{(n\hat{p}_i - np_i)^2}{np_i} = \sum_{i=1}^{k}\frac{(o_i - e_i)^2}{e_i} = \sum_{i=1}^{k}\frac{o_i^2}{e_i} - n$$

The expected frequencies must be based on the null hypothesis. Relatively large
discrepancies between the observed and expected frequencies, and hence a large test
statistic value, indicate that the null hypothesis is likely false.

Chi-square tests for the analysis of frequencies are typically applied to answer three types
of research questions:

(a) How well does a certain distribution fit the sample data?
(b) Are two classifications of some nominal or ordinal data statistically independent?
(c) Are several sampled populations similar to each other with respect to some
characteristic?

Accordingly, there are three types of chi-square tests: (a) test of goodness of fit, (b) test of
independence and (c) test of homogeneity. They are based on the following assumptions:

i. The data is a random sample of independent observations generated by some
multinomial experiment.
ii. The variable(s) of interest is (are) qualitative or quantitative.
iii. The measurement scale is nominal.
iv. Each expected frequency is at least 5.

Chi-square test of goodness of fit

The null hypothesis is that the sample has been drawn from some ‘known’ distribution. This
distribution can be some standard theoretical probability distribution with known parameters
(e.g. a normal distribution with expected value 20 and standard deviation 5) or a
nonstandard distribution.

The test statistic can be calculated in three steps:

i. Specify categories or class intervals for the variable of interest;
ii. Find the observed frequencies for all categories or class intervals;
iii. Determine the expected frequencies assuming that H0 is correct.

Exercise 1 (Selvanathan, p. 678, ex. 16.9)

To determine whether a single die is balanced, or fair, the die was rolled 600 times. The
observed frequencies with which each of the six sides of the die turned up are recorded in
the following table.

Face  Observed frequency
1     114
2     92
3     84
4     101
5     107
6     102

Is there sufficient evidence to conclude, at the 5% level of significance, that the die is not
fair?

The null hypothesis is that the die is fair, that is, the six possible outcomes are equally likely,
while the alternative hypothesis is that the die is not fair and at least one outcome is more
likely or less likely than the others.

In symbols,

$$H_0: p_1 = p_2 = \dots = p_6 = \tfrac{1}{6} \qquad\text{vs}\qquad H_A: \text{‘not } H_0\text{’}$$

The data has been generated by a multinomial experiment since

(i) The same die is rolled 600 times and the outcome of each roll (trial) is independent
of the outcomes of other rolls.
(ii) At each roll the outcome is either 1 or 2 or 3 or 4 or 5 or 6, i.e. there are 6 mutually
exclusive but exhaustive outcomes.
(iii) Since the same die is rolled each time, the probability of each outcome is
constant.

Consequently, H0 can be tested by a chi-square test of goodness of fit. Granted that each
expected frequency is at least five, the test statistic is

$$\chi^2 = \sum_{i=1}^{k}\frac{(o_i - e_i)^2}{e_i} \sim \chi^2_{df}$$

There are k = 6 possible outcomes and the expected frequencies must add up to n, so m =
1. Hence, the degrees of freedom of the chi-square test statistic is df = k – m = 6 – 1 = 5.
From Table 5, Appendix B of the Selvanathan book, the critical value is $\chi^2_{\alpha, df} = \chi^2_{0.05, 5} =$
11.1, and H0 is to be rejected if the calculated test statistic value is larger than this critical
value.

The number of trials is 600, so according to H0 each expected frequency is

$$e_i = np_i = 600 \times \tfrac{1}{6} = 100, \qquad i = 1, \dots, 6$$

The calculations are summarised in the following table:

i    p_i0     o_i    e_i    (o_i − e_i)²/e_i
1    0.1667   114    100    1.96
2    0.1667    92    100    0.64
3    0.1667    84    100    2.56
4    0.1667   101    100    0.01
5    0.1667   107    100    0.49
6    0.1667   102    100    0.04
Sum  1.0000   600    600    5.70

Each ei is large, so we can rely on the chi-square approximation.
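The table's arithmetic can be reproduced with a few lines of R; this is just a sketch that hard-codes the observed and expected frequencies from this exercise:

```r
# Observed frequencies from the table and expected frequencies under H0
o <- c(114, 92, 84, 101, 107, 102)
e <- rep(100, length(o))      # e_i = n * p_i = 600 * 1/6 = 100

chi2 <- sum((o - e)^2 / e)    # chi-square statistic
chi2                          # 5.7
```

The value matches the Sum row of the table above.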

Since $\chi^2_{obs}$ = 5.70 < 11.1, H0 cannot be rejected at the 5% level. The die is probably fair.

Launch RStudio, create a new project and script, and name them t7e1. Import the observed
frequencies (o) and the hypothesized probabilities (p_H0) from the t7e1 Excel file.

The relevant R function is

chisq.test(x, p = p0)

where x is a numeric vector of the observed frequencies and p0 is a numeric vector of the
probabilities under the null hypothesis.

In this case x is o and p0 is p_H0, so execute

chisq.test(o, p = p_H0)

to obtain

Chi-squared test for given probabilities


data: o
X-squared = 5.7, df = 5, p-value = 0.3365

The observed test statistic value is 5.7 and the p-value is 0.3365, so H0 cannot be rejected
at any reasonable significance level.
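The critical value and the p-value reported above can also be obtained from base R's chi-square distribution functions (a sketch, assuming the test statistic 5.7 and df = 5 from this exercise):

```r
# Upper-tail critical value at the 5% significance level, df = 5
crit <- qchisq(0.95, df = 5)                     # about 11.07
# p-value of the observed test statistic
pval <- pchisq(5.7, df = 5, lower.tail = FALSE)  # about 0.3365
c(crit, pval)
```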

In this example we imported the hypothesized probabilities from an Excel file in decimal
format. This might be problematic because if the decimals do not add up to one, you might
get an error message.1 To avoid this happening, we could add the rescale.p = TRUE
argument to the chisq.test function or enter the probabilities as fractions straight in the R
function, i.e. execute

chisq.test(o, p = c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6))

Chi-square test of independence

The null hypothesis is that two variables (criteria of classifications defined for the same
population) are independent, so the distribution of one of them in no way depends on the
distribution of the other.

Because there are two variables this time, not just one, the formula for the test statistic is
slightly different:

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(o_{ij} - e_{ij})^2}{e_{ij}} \sim \chi^2_{df}$$

where r and c are the numbers of levels of the two variables, i.e. the numbers of rows and
columns in the contingency table, and df = (r − 1)(c − 1).
The test statistic can be calculated in three steps:

i. Specify categories or class intervals for both variables;
ii. Summarize the observed joint frequencies in an r × c contingency table;
iii. Determine the expected joint frequencies assuming that H0 is correct.

In the third step, based on the multiplication rule for independent events, the expected joint
frequencies are calculated as

$$e_{lk} = \frac{1}{n}\sum_{j=1}^{c} o_{lj} \sum_{i=1}^{r} o_{ik}$$

The computational formula for the test statistic is

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(o_{ij} - e_{ij})^2}{e_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{o_{ij}^2}{e_{ij}} - n$$

1 You did not get an error message this time because I entered the probabilities as 1/6 in the Excel spreadsheet
and they have been imported to R with 7 decimals.
Exercise 2

In a 2008 research study across Melbourne 448 grocery shoppers exiting four of Australia’s
biggest supermarket retailers (Store: Safeway, Coles, IGA, Aldi) were interviewed regarding
their views on green issues that contribute to carbon footprints and on their grocery shopping
behavior.2 These attitudes and dispositions were used as likely antecedents to predict the
inclination of consumers to pay premium prices for carbon footprint compliant grocery items.
Two of the questions were “Do you actively look for Australian made grocery products to
purchase? (Aussie, yes = 1 / no = 2)” and “How much does the name of the brand affect
your product choice? (Brand, coded as 1: least impact on choice, …, 5: most impact on
choice)”. The responses are saved in the t7e2 Excel file.

Using this data, is it possible to infer at the 5% significance level that the type of Store
attended and whether actively looking for Aussie products are related to each other?
Perform a chi-square test of independence.

As in Exercise 1, first we need to think about the underlying data generating process.

(i) The 448 shoppers in the study can be considered as a random sample and it is
reasonable to assume that these shoppers were unrelated and hence their answers
independent.
(ii) The first question had two and the second question had five mutually exclusive but
exhaustive outcomes.
(iii) Based on (i), we can assume that the (unconditional) probability of choosing a
particular answer to any of those two questions does not change from shopper to
shopper.

Consequently, the data can be considered as having been generated by a multinomial
experiment.

Open the Long sheet of the t7e2 file in Excel and have a look at its structure. The data on
this sheet is in long format and contains 448 triads of observations on Aussie, Brand and
Store. Since this is a relatively large data set, unlike in other exercises, first we do the
calculations in R and then verify some details manually.

We can use the same R function as before, but with different arguments:

chisq.test(x, y)

where x and y are numeric or character vectors.

Launch RStudio, create a new project and script (t7e2) and import and load the data from
the Long sheet of the t7e2 Excel file. Execute

chisq.test(Aussie, Store)

to get

2 Miranda, M. and Kónya, L. (2009): Harnessing the Attraction of Smaller Carbon Footprints to Increase
Product Profitability. Presented at the 2009 AMS/ACRA Retailing Conference, New Orleans, September 30 –
October 3, 2009.
Pearson's Chi-squared test
data: Aussie and Store
X-squared = 5.1373, df = 3, p-value = 0.162

The observed test statistic value is 5.1373 and the p-value is 0.162, well above 0.05. Hence,
at the 5% significance level we maintain H0 meaning that there is not sufficient evidence to
conclude that the type of Store attended and whether actively looking for Aussie products
are related to each other.

Although the outputs of R functions are typically very succinct, many details are stored in
the background and can be recalled for verification purposes. For example, the
result of the chisq.test() function is a list that contains, among others, the following:

statistic: the value of the chi-squared test statistic;
parameter: the degrees of freedom of the approximate chi-squared distribution;
p.value: the p-value of the test;
observed: the observed frequencies;
expected: the expected frequencies;
residuals: the Pearson residuals, (oij − eij) / √eij.

For the sake of illustration, perform the test again, but this time save the results by executing

result = chisq.test(Aussie, Store)

Retrieve the observed frequencies

result$observed

to obtain

Store
Aussie Aldi Coles IGA Safeway
1 11 52 10 78
2 23 104 39 131

This is the contingency table of observed frequencies. From it you can calculate the
marginal sums

oij
Store
Aussie Aldi Coles IGA Safeway Total
1 11 52 10 78 151
2 23 104 39 131 297
Total 34 156 49 209 448

For each combination of Store and Aussie, the expected joint frequency is the product of
the corresponding column and row sums divided by the total number of observations. For
example, for Aussie = 2 and Store = IGA it is

$$e_{23} = \frac{1}{n}\sum_{j=1}^{c} o_{2j} \sum_{i=1}^{r} o_{i3} = \frac{297 \times 49}{448} \approx 32.48$$

By calculating all other expected joint frequencies this way, you should get

eij
Store
Aussie Aldi Coles IGA Safeway Total
1 11.46 52.58 16.52 70.44 151.00
2 22.54 103.42 32.48 138.56 297.00
Total 34.00 156.00 49.00 209.00 448.00

Retrieve the expected frequencies in R to verify the manual calculations.

result$expected

returns

Store
Aussie Aldi Coles IGA Safeway
1 11.45982 52.58036 16.51562 70.4442
2 22.54018 103.41964 32.48438 138.5558

Next, for each cell, from the observed and expected joint frequencies calculate

$$\frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$

You should get

(oij - eij)2/eij
Store
Aussie Aldi Coles IGA Safeway Total
1 0.018 0.006 2.570 0.810 3.406
2 0.009 0.003 1.307 0.412 1.732
Total 0.028 0.010 3.877 1.222 5.137

The test statistic in the lower-right cell is 5.137.

To recover these details in R, execute


(result$residuals)^2
result$statistic

They return

Store
Aussie Aldi Coles IGA Safeway
1 0.018450178 0.006405708 2.570497280 0.810431100
2 0.009380394 0.003256774 1.306885823 0.412037361

and

X-squared
5.137345

The test statistic is asymptotically distributed as a $\chi^2$ random variable with df = (r − 1)(c − 1)
= (2 − 1)(4 − 1) = 3. This approximation is good enough if all expected frequencies are at least
five, as is the case this time.3

Suppose now that we do not have the raw data set, only the contingency table of the
observed joint frequencies, like on the Wide sheet of the t7e2 Excel file:

Store
Aussie Aldi Coles IGA Safeway
1 11 52 10 78
2 23 104 39 131

You can import this contingency table from the Excel file the way you did in Exercise 1 of
Tutorial 6. To make your job in R easier, it is recommended to ignore the first row and the
first column, i.e. to import only

Aldi Coles IGA Safeway
11 52 10 78
23 104 39 131

In order to do so, click the Import Dataset button on the Environment tab, choose the From
Excel… menu option, and in the Import Excel Data dialogue window specify the File location
and the Import Options as shown on the next page.

Then, execute

chisq.test(t7e2_wide)

to get the same results as earlier:

3 Check the expected joint frequencies on the previous page to see that this requirement is indeed satisfied.
When this requirement is violated, R still performs the test, but displays a warning message because it can be
misleading.
Pearson's Chi-squared test
data: t7e2_wide
X-squared = 5.1373, df = 3, p-value = 0.162

Alternatively, since this contingency table is indeed small, we could simply enter the data
from the keyboard:

Aldi = c(11, 23)
Coles = c(52, 104)
IGA = c(10, 39)
Safeway = c(78, 131)

Then, we can combine the 4 vectors in a matrix

data = cbind(Aldi, Coles, IGA, Safeway)

and run the chi-square test on this matrix:

chisq.test(data)

Chi-square test of homogeneity

This test is a generalization of the Z / t test for the difference between two population
proportions. It can be used to test the null hypothesis that several populations are
homogeneous in the sense that categories of a single qualitative variable, or class intervals
of a single quantitative variable, have the same distribution in these populations.

Similarly to the chi-square test of independence, the chi-square test of homogeneity is also
based on a contingency table, and the two tests have the same test statistic that can be
calculated the same way. However, the two tests differ in terms of the underlying rationale.
In the test of independence, the expected joint frequencies are based on the multiplication
rule for independent events, while in the test of homogeneity they are based on the pooled
sample.

The test statistic can be calculated in three steps:

i. Identify the (sub-) populations of interest and draw an independent random sample
from each;
ii. Summarize the observed joint frequencies in an r  c contingency table;
iii. Assuming that H0 is correct, compute the expected joint frequencies from the pooled
data set.

Exercise 3 (Berenson et al., pp.572-574)

Airlines are constantly trying to find new ways to cut costs and improve efficiency. In recent
years, as access to the internet has become much more widespread, travelers are being
given the option to check in for international flights prior to their arrival at the airport. This
avoids long airport check-in queues and also reduces the number of staff members
required. Hikari Airlines encourages online check-in by sending reminder emails to its
passengers 24 hours prior to their departure time. For operational planning purposes Hikari
Airlines is examining whether there is a difference between the uses of online check-in at
three of its ports, Sydney, Singapore and Jakarta.

The table below presents the check-in patterns of three samples of passengers departing
from the three cities. Is it possible to infer from this data at the 5% significance level that the
proportion of on-line check-in is the same in all three destinations?

City
Check-in Sydney Singapore Jakarta Total
On-line 258 375 210 843
Airport 162 155 190 507
Total 420 530 400 1350

Granted that the passengers had been selected randomly, we can assume again that the
data had been generated by a multinomial experiment.

The null hypothesis is that the distribution of passengers by the way of check-in is the same
irrespective of their destination. Under the null the test statistic follows a chi-square
distribution with df = (2 − 1)(3 − 1) = 2. From the chi-square table the critical value is $\chi^2_{\alpha, df} =$
$\chi^2_{0.05, 2} = 5.99$ and H0 is to be rejected if the calculated test statistic value is larger than this
critical value.

In the pooled sample the proportion of all passengers who used on-line check-in is 843/1350
= 0.624, and the proportion of those passengers who checked in at the airport is 507/1350
= 0.376.4 If the null hypothesis is correct, these proportions should be the same in the three
(sub-) populations (Sydney, Singapore, Jakarta) and hence the expected joint frequencies
in the three samples can be obtained by multiplying these proportions with the number of
passengers to the three destinations (i.e. 420, 530 and 400).

For example, under H0 the expected number of passengers who use on-line check-in and
fly to Sydney is

$420 \times 0.624 = 262.08$

while the expected number of passengers who use airport check-in and fly to Sydney is

$420 \times 0.376 = 157.92$

Notice that the same expected frequencies can also be calculated by multiplying the
corresponding row and column marginal frequencies and dividing the product by the total
number of observations, just like in a chi-square test of independence. As an illustration,
the previous two expected joint frequencies are5

4 When you calculate proportions from a pooled sample, like these ones, always make sure that the decimals
add up to one.
5 There are some discrepancies between the results of the two types of calculations, but they are due to
rounding errors and are negligible.
$$\frac{843 \times 420}{1350} \approx 262.27 \qquad\text{and}\qquad \frac{507 \times 420}{1350} \approx 157.73$$

Following the calculations in this way, you should get:

oij
City
Check-in Sydney Singapore Jakarta Total
On-line 258 375 210 843
Airport 162 155 190 507
Total 420 530 400 1350

eij
City
Check-in Sydney Singapore Jakarta Total
On-line 262.27 330.96 249.78 843.00
Airport 157.73 199.04 150.22 507.00
Total 420.00 530.00 400.00 1350.00

oij2/eij
City
Check-in Sydney Singapore Jakarta Total
On-line 253.80 424.91 176.56 855.27
Airport 166.38 120.70 240.31 527.39
Total 420.18 545.61 416.87 1382.66

Each expected joint frequency is fairly large, so we can use the chi-square approximation.

From the third table, the observed test statistic is

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{o_{ij}^2}{e_{ij}} - n = 1382.66 - 1350 = 32.66$$

It is well above the critical value (5.99), so at the 5% significance level we reject H0 and
conclude that the three cities, Sydney, Singapore and Jakarta, are different with respect to
the proportions of passengers who check in for their flights online or at the airport.
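The manual calculations above can be verified with a short R sketch that builds the expected frequencies from the pooled sample and applies the computational formula:

```r
# Observed frequencies: rows are check-in types, columns are cities
obs <- rbind(Online  = c(258, 375, 210),
             Airport = c(162, 155, 190))
n <- sum(obs)                                  # 1350 passengers in total
# Expected frequencies under homogeneity (pooled proportions)
exp_freq <- outer(rowSums(obs), colSums(obs)) / n
chi2 <- sum(obs^2 / exp_freq) - n              # computational formula
chi2                                           # about 32.66
```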

We do not have the raw data set this time, just the contingency table of observed
frequencies. Hence, to perform the chi-square test in R, you need to proceed as
recommended in the last part of Exercise 2. Namely, enter the data for the three cities from
the keyboard,

Sydney = c(258, 162)
Singapore = c(375, 155)
Jakarta = c(210, 190)

combine the 3 vectors in a matrix

Passengers = cbind(Sydney, Singapore, Jakarta)

and run the chi-square test on this matrix:6

chisq.test(Passengers)

to obtain

Pearson's Chi-squared test

data: Passengers
X-squared = 32.66, df = 2, p-value = 8.09e-08

The test statistic is 32.66 and the p-value is practically zero, so H0 can be rejected at any
reasonable significance level.

Measures of Association

The population covariance and the Pearson population correlation coefficient show the
direction (nature) and measure the strength of the linear relationship between two
quantitative variables, say X and Y. These population parameters are defined as

$$\sigma_{xy} = E\big[(X - \mu_x)(Y - \mu_y)\big] \qquad\text{and}\qquad \rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = E\left[\left(\frac{X - \mu_x}{\sigma_x}\right)\left(\frac{Y - \mu_y}{\sigma_y}\right)\right]$$

The population covariance, $\sigma_{xy}$, is the expected value of the product of the deviation of X
from its expected value and the deviation of Y from its expected value. In the second formula
$\sigma_x$ and $\sigma_y$ denote the population standard deviations of X and Y, respectively, so the Pearson
population correlation coefficient ($\rho_{xy}$) is the expected value of the product of the
standardized X and Y variables, each standardized by the corresponding mean and
standard deviation.

These population parameters can be estimated from a sample of n pairs of observations for
X and Y by the sample covariance (sxy) and the sample correlation coefficient (rxy) given by
the following formulas:

$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}\right) = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)\right]$$

$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$

where $s_x$ and $s_y$ are the sample standard deviations of X and Y,

$$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)} = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]}$$

$$s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2} = \sqrt{\frac{1}{n-1}\left(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right)} = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2\right]}$$

6 It is possible to combine the second and third commands as chisq.test(cbind(Sydney, Singapore, Jakarta)).
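The equivalence of the definitional and computational formulas is easy to confirm numerically; this sketch uses made-up data and R's built-in cov(), cor() and sd():

```r
set.seed(1)
x <- rnorm(30)
y <- rnorm(30)
n <- length(x)

# Definitional and computational forms of the sample covariance
s_def  <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
s_comp <- (sum(x * y) - sum(x) * sum(y) / n) / (n - 1)

# Dividing by the sample standard deviations gives the correlation
r_manual <- s_def / (sd(x) * sd(y))
```

Both manual covariances agree with cov(x, y), and r_manual agrees with cor(x, y).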

The sample covariance and sample correlation coefficient have the following properties.7

Covariance:

(i) sxy = 0 when there is no linear relationship between X and Y.

(ii) sxy > 0 (sxy < 0) when X and Y tend to deviate in the same (opposite) direction from
their respective means.

(iii) The larger | sxy |, the stronger the linear association between X and Y.

(iv) When X = Y, sxy = sxx = sx², i.e. the variance is a special covariance, the covariance
of a variable with itself.

(v) sxy does not have natural lower and upper limits, so whatever non-zero value it takes
on, it cannot be claimed that the covariance between X and Y is particularly small
or large.

(vi) sxy depends on the units of measurement of X and Y, so it cannot be compared to
the covariance between two other variables, unless both pairs of variables are
measured on the same scales.

Correlation:

(i) rxy does not depend on the units of measurement.

(ii) −1 ≤ rxy ≤ 1, so rxy does have natural lower and upper limits that provide benchmarks
to compare with.

(iii) rxy = ±1 indicates that there is a perfect positive/negative linear relationship
between X and Y, i.e. all observations are on a single straight line sloping
upward/downward.

(iv) rxy = 0 when there is no linear relationship between X and Y.

(v) The sign of rxy shows the nature (negative/positive) of the linear relationship and
the closer rxy is to ±1, the stronger the linear relationship between the variables.
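These properties can be illustrated with a tiny R sketch on artificial data:

```r
x <- 1:10
y <- 3 * x + 2            # perfect positive linear relationship

r_pos  <- cor(x, y)       # +1: all points on an upward-sloping line
r_neg  <- cor(x, -y)      # -1: all points on a downward-sloping line
# Changing the units of measurement leaves the correlation unchanged
r_unit <- cor(1000 * x, y / 2)
```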

7 For the sake of simplicity, the properties are stated in terms of the sample statistics, but they equally apply
for the corresponding population parameters as well.
(vi) Granted that X and Y are both normally distributed, the null hypothesis of no
correlation between X and Y (H0: $\rho_{xy}$ = 0) can be tested with a t-test based on the
following statistic:

$$t_r = \frac{r_{xy}\sqrt{n-2}}{\sqrt{1-r_{xy}^2}} \sim t_{df}; \qquad df = n - 2$$
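This statistic can be computed directly from a sample correlation coefficient; the sketch below plugs in r = −0.8082646 and n = 100, the values that arise in Exercise 4:

```r
r <- -0.8082646
n <- 100
t_r <- r * sqrt(n - 2) / sqrt(1 - r^2)   # about -13.589
p_left <- pt(t_r, df = n - 2)            # left-tail p-value, essentially zero
```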

Exercise 4 (Selvanathan et al., p. 748, ex. 17.9)

A critical factor for used-car buyers when determining the value of a car is how far the car
has been driven. There is, however, not much information about this available in the public
domain. To examine this issue, a used-car dealer randomly selected 100 five-year-old Ford
Lasers that had been sold at auction during the past month. Each car was in top condition
and equipped with automatic transmission, CD player and air conditioning. The dealer
recorded the Price ($’000) and the number of kilometres on the Odometer (‘000 km). The
data are saved in the t7e4 Excel file.

a) Irrespective of the actual data we have, do you expect Price and Odometer to be
related to each other? If yes, do you expect the relationship between them to be positive
or negative?

It seems logical that, everything else held constant, cars with higher odometer readings
sell for less. Therefore, we can expect Price and Odometer to be related to each other
and that this relationship is negative.

b) If Price and Odometer are related to each other, which variable is likely determining the
other?

Odometer reading clearly does not depend on the price of the car, but Odometer,
among others, determines Price. Hence, in this context, Odometer is the independent
variable and Price is the dependent variable.

c) Import the data to R and develop a scatterplot from the sample data. Make sure that
the ‘logical’ independent variable is plotted on the horizontal axis and the dependent
variable on the vertical axis. What does your scatterplot show?

Create a new RStudio project and script and import the data. The

plot(Odometer, Price,
main = "Scatterplot of Price versus Odometer",
col = "blue", pch = 19)

command returns the first scatterplot below.

You can make this scatterplot even more informative by superimposing the ‘best-fitting’
straight line on it. To do so, right after the previous plot() command execute

16
L. Kónya, 2020, Semester 2 ECON20003 - Tutorial 7
abline(lm(Price ~ Odometer), col = "red")

[Figure: Scatterplot of Price versus Odometer — Price ($’000, vertical axis, roughly 14.5–17.5)
against Odometer (’000 km, horizontal axis, roughly 20–50)]

You should now get:

[Figure: the same scatterplot of Price versus Odometer with the red ‘best-fitting’ straight line
superimposed]

These scatterplots illustrate that the two variables tend to move in opposite
directions, so in this sample there is indeed a negative linear relationship between Price
and Odometer. Moreover, on the second scatterplot the sample observations appear to
be spread relatively close to the ‘best-fitting’ straight line, suggesting that in this sample
there is a reasonably strong negative linear relationship between Price and Odometer.

d) Given the following statistics, calculate the sample covariance and correlation
coefficient between Odometer (X) and Price (Y) ‘manually’ and interpret them.

$$\sum_{i=1}^{n} x_i = 3601.10, \quad \sum_{i=1}^{n} y_i = 1623.70, \quad s_x = 6.596, \quad s_y = 0.765, \quad \sum_{i=1}^{n} x_i y_i = 58067.44$$

The sample covariance can be computed as

$$s_{xy} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)\right] = \frac{1}{99}\left(58067.44 - \frac{3601.1 \times 1623.7}{100}\right) \approx -4.077$$

and the sample correlation coefficient is

$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-4.077}{6.596 \times 0.765} \approx -0.808$$

Both of these statistics are negative (sxy and rxy always have the same sign) indicating
that in this sample there is a negative linear relationship between Odometer and Price.
In addition, rxy also suggests that this relationship is fairly strong.
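The manual calculation in part (d) can be checked in R from the summary statistics alone (a sketch; the small discrepancy from the exact cov() value −4.076977 is due to rounding in the supplied sums):

```r
n      <- 100
sum_x  <- 3601.10
sum_y  <- 1623.70
sum_xy <- 58067.44

# Computational formula for the sample covariance
s_xy <- (sum_xy - sum_x * sum_y / n) / (n - 1)  # about -4.078
# Sample correlation using the given standard deviations
r_xy <- s_xy / (6.596 * 0.765)                  # about -0.808
```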

e) Obtain the sample covariance and the sample correlation coefficient between price and
odometer in R.

The relevant R functions are

cov(x, y, method = " ")

cor(x, y, method = " ")

where x and y are two quantitative variables and method is a character string indicating
which correlation coefficient (or covariance) is to be computed: "pearson" (default),
"kendall", or "spearman".

Since Price and Odometer are quantitative variables measured on ratio scales, we use
the Pearson coefficient, which is the default option.

cov(Price, Odometer)

and

cor(Price, Odometer)
return -4.076977 and -0.8082646, respectively.

f) Perform a test at the 5% significance level to determine whether it is possible to infer
that Price and Odometer are linearly related to each other. Specify the hypotheses
based on your answer in part (a), do the calculations first manually and then in R, make
a statistical decision and state your conclusion.

In part (a) we stated that the logical relationship between Price and Odometer is
negative. Accordingly, we perform a left-tail test with

H0: xy = 0 and HA: xy < 0,

where X and Y are Odometer and Price, respectively.

The test statistic is

rxy n  2
tr  ~ tdf ; df  n  2
1  rxy2

The critical value is −t0.05, 98 ≈ −t0.05, 100 = −1.660, and H0 is to be rejected if the calculated
test statistic value is below this critical value.

The observed test statistic value is

rxy n  2 0.808 100  2


tr ,obs    13.576
1 r 2
xy 1  (0.808)2

It is much smaller than the critical value, so at the 5% significance level we reject H0
and conclude that there is a significantly negative linear relationship between the
odometer reading and price of second-hand cars.
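
The observed test statistic can be checked numerically with a short sketch (in Python for illustration; r = −0.808 is the rounded coefficient from part (d)).

```python
import math

# Observed t statistic for H0: rho = 0 vs HA: rho < 0,
# with r = -0.808 and n = 100, so df = n - 2 = 98.
r, n = -0.808, 100
t_obs = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(round(t_obs, 3))  # -13.576
```

Since −13.576 is far below the critical value −1.660, the manual decision to reject H0 is confirmed.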

This test can be performed with the

cor.test(x, y, method = " ", alternative = " ", conf.level = )

R function, where the first three arguments are the same as those of cov() and cor() and
the last two work as in the other tests we have used before.

Since we need to perform a left-tail test at the 5% significance level, the appropriate R
command is

cor.test(Price, Odometer, alternative = "less")

It returns the following printout:

Pearson's product-moment correlation
data: Price and Odometer
t = -13.589, df = 98, p-value < 2.2e-16
alternative hypothesis: true correlation is less than 0
95 percent confidence interval:
-1.0000000 -0.7420354
sample estimates:
cor
-0.8082646

As you can see, the sample correlation coefficient is at the bottom of this printout, so it
is unnecessary to use the cor() function if you also intend to use the cor.test() function.

The t-test statistic value is -13.589 and the p-value is practically zero, so H0 can be
rejected in favour of a left-sided HA at any reasonable significance level. This implies
that there is a significantly negative correlation between Odometer and Price.

g) Recall that the t-test in part (f) requires normality. Check whether this requirement is
met by QQ plots and Shapiro-Wilk tests on both variables.

The relevant R commands for Price are

qqnorm(Price, main = "Normal Q-Q Plot for Price",
       xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
       col = "green")
qqline(Price, col = "orange")
shapiro.test(Price)

They return,

Normal Q-Q Plot for Price
[Q-Q plot: sample quantiles of Price (14.5–17.5) against theoretical normal quantiles (−2 to 2)]

and
Shapiro-Wilk normality test
data: Price
W = 0.96277, p-value = 0.006363

Based on these checks, it is unlikely that the population of Price is normally distributed.

The corresponding commands for Odometer

qqnorm(Odometer, main = "Normal Q-Q Plot for Odometer",
       xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
       col = "blue")
qqline(Odometer, col = "orange")
shapiro.test(Odometer)

produce

Normal Q-Q Plot for Odometer
[Q-Q plot: sample quantiles of Odometer (20–50) against theoretical normal quantiles (−2 to 2)]

and

Shapiro-Wilk normality test


data: Odometer
W = 0.98444, p-value = 0.2892

Hence, Odometer is likely normally distributed.

Still, since Price is probably non-normal, the conclusion we drew from the t-test on the
Pearson correlation coefficient may well be invalid.

A non-parametric alternative to the Pearson correlation coefficient is the Spearman rank
correlation coefficient (also known as Spearman’s rho), denoted ρs. It can be used to
measure the strength of a monotonic, but not necessarily linear, relationship between
the ranks of two variables measured on ordinal scales, or when one would like to perform a
t-test on the correlation coefficient but the variables are not normally distributed.

The Spearman rank correlation coefficient is calculated just like the Pearson correlation
coefficient, but from the ranks of the observations rather than from the original observations.
When the sample size is relatively small (n ≤ 30) its significance can be tested by comparing
it to the critical values in Table 10 of the Selvanathan et al. book (p. 1090), while for n > 30
we can use the same t-test as for the significance of the Pearson correlation coefficient.8

Exercise 5 (Selvanathan et al., p. 751, ex. 17.10)

The production manager of a firm wants to examine the relationship between aptitude test
scores given prior to the hiring of production-line workers and performance ratings received
by the employees three months after starting work. The results of the study would allow the
firm to decide how much weight to give to the aptitude tests relative to other work-history
information obtained, including references. The aptitude test results range from 0 to 100.
The performance ratings are as follows: 1 = Employee has performed well below average;
2 = Employee has performed somewhat below average; 3 = Employee has performed at
the average level; 4 = Employee has performed somewhat above average; 5 = Employee
has performed well above average. A random sample of 20 production workers yielded the
observations on Aptitude and Performance saved in the t7e5 Excel file. Can the firm’s
manager infer at the 5% significance level that aptitude test scores are correlated with
performance rating? Perform the test both manually and with R.

The first variable, Aptitude, is quantitative, but the second variable, Performance, is
qualitative, measured on an ordinal scale. For this reason, the strength of the relationship
between them must be measured with the Spearman rank correlation coefficient. To
calculate the Spearman rank correlation coefficient manually, you can use a table like the
one on the next page.

In this table X denotes the ranks of Aptitude and Y denotes the ranks of Performance. The
ranks are assigned to the Aptitude and Performance values separately, from the smallest to
the largest. The Aptitude observations are all different and ranking is relatively simple. In the
case of Performance, however, there are several ties.

8 Selvanathan et al. advocate an alternative test procedure for the Spearman rank correlation coefficient, but
similarly to R, we use the same test for both correlation coefficients.
Aptitude X Performance Y X2 Y2 XY
59 9 3 10.5 81.00 110.25 94.50
47 3 2 3.5 9.00 12.25 10.50
58 8 4 17 64.00 289.00 136.00
66 14 3 10.5 196.00 110.25 147.00
77 20 2 3.5 400.00 12.25 70.00
57 7 4 17 49.00 289.00 119.00
62 12 3 10.5 144.00 110.25 126.00
68 16 3 10.5 256.00 110.25 168.00
69 17 5 19.5 289.00 380.25 331.50
36 1 1 1 1.00 1.00 1.00
48 4 3 10.5 16.00 110.25 42.00
65 13 3 10.5 169.00 110.25 136.50
51 5 2 3.5 25.00 12.25 17.50
61 11 3 10.5 121.00 110.25 115.50
40 2 3 10.5 4.00 110.25 21.00
67 15 4 17 225.00 289.00 255.00
60 10 2 3.5 100.00 12.25 35.00
56 6 3 10.5 36.00 110.25 63.00
76 19 3 10.5 361.00 110.25 199.50
71 18 5 19.5 324.00 380.25 351.00
Sum 210 210 2870.00 2780.00 2439.50

From the column sums,

1  210  
2
1  n 2 2
sx   i
n  1  i 1
x  nx 

 
19 
2870  20      5.916
 20  

1  210  
2
1  n 2 2
sy   yi  ny   19  2780  20   20    5.501
n  1  i 1  

210  210
2439.5 
1 n 1  n  n  20
sxy   xi yi    xi   yi    12.342
n  1  i 1 n  i 1  i 1  19

and the Spearman rank correlation coefficient is

rs = sxy / (sx·sy) = 12.342 / (5.916 × 5.501) = 0.379
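
The tied ranking and the resulting coefficient can also be reproduced programmatically. The sketch below is in Python for illustration; the data values are those listed in the table above. It assigns average ranks to ties and applies the Pearson formula to the ranks; at full precision the coefficient is about 0.3792, in line with the rho R reports for this exercise.

```python
from statistics import mean

def average_ranks(values):
    # 1-based ranks; tied observations share the mean of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

aptitude = [59, 47, 58, 66, 77, 57, 62, 68, 69, 36,
            48, 65, 51, 61, 40, 67, 60, 56, 76, 71]
performance = [3, 2, 4, 3, 2, 4, 3, 3, 5, 1,
               3, 3, 2, 3, 3, 4, 2, 3, 3, 5]

x, y = average_ranks(aptitude), average_ranks(performance)
n = len(x)
# Pearson correlation applied to the ranks = Spearman's rho
sxy = sum(a * b for a, b in zip(x, y)) - n * mean(x) * mean(y)
sxx = sum(a * a for a in x) - n * mean(x) ** 2
syy = sum(b * b for b in y) - n * mean(y) ** 2
rs = sxy / (sxx * syy) ** 0.5
print(round(rs, 4))  # 0.3792
```

Both rank columns sum to 210, as in the table.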

The question is whether aptitude test scores are correlated with performance rating.
Therefore, H0: ρs = 0 against HA: ρs ≠ 0, and we need to perform a two-tail test.

The sample size is 20, so we rely on the small sample critical values in Table 10 of
Selvanathan et al. This table provides one-tail (right-tail) critical values for four different
significance levels (0.05, 0.025, 0.01 and 0.005). For our two-tail test we need to divide the
significance level by two and select the column in the table accordingly. Hence, in the
intersection of row n = 20 and column α = 0.025, the upper critical value is 0.450 and the
corresponding lower critical value is -0.450. H0 is to be rejected if the calculated test statistic
value is smaller than -0.450 or larger than 0.450.

The value of our Spearman rank correlation coefficient is 0.379, i.e. it is between the lower
and upper critical values. Consequently, at the 5% significance level we maintain H0 and
conclude that the aptitude test scores and the performance ratings might not be related to
each other.

When the sample size is above 30, the t-test approximation is fairly accurate. In this case n
is only 20, but for the sake of illustration let’s use this approximation. The critical values are
±t0.025, 18 = ±2.101, and H0 is to be rejected if the calculated test statistic value is smaller
than −2.101 or larger than 2.101.

The formula of the t-test statistic is the same as in Exercise 4, so the observed test statistic
is

trs,obs = rs √(n − 2) / √(1 − rs²) = 0.379 × √18 / √(1 − 0.379²) = 1.738

Since it is between the lower and upper critical values, we maintain H0, just like before.
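
The t-approximation can likewise be checked numerically (a Python sketch; rs = 0.379 is the full-precision coefficient 0.379226 rounded to three decimals, and 2.101 is t0.025, 18 from the t table).

```python
import math

# t-approximation to the Spearman test: rs = 0.379, n = 20, df = 18.
rs, n = 0.379, 20
t_obs = rs * math.sqrt(n - 2) / math.sqrt(1 - rs ** 2)
crit = 2.101  # two-tail critical value t_{0.025, 18}
print(round(t_obs, 3), abs(t_obs) > crit)  # 1.738 False
```

Since |1.738| < 2.101, H0 is maintained, in line with the table-based test.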

To perform the calculations in R, start a new project and import the data from the t7e5 Excel
file. Then, execute the following command:

cor.test(Aptitude, Performance, method = "spearman")

to get

Spearman's rank correlation rho


data: Aptitude and Performance
S = 825.63, p-value = 0.09914
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.379226
Warning message:
In cor.test.default(Aptitude, Performance, method = "spearman") :
Cannot compute exact p-value with ties

R displays the above warning message because, by default, cor.test(method = "spearman")
provides the exact p-value, granted that there are no ties. However, as we saw when we
were ranking the observations, Performance has several ties.

We can hide this warning message by switching the exact logical argument to FALSE.9
Execute

cor.test(Aptitude, Performance, method = "spearman", exact = FALSE)


9 By default, this argument is set equal to TRUE in the cor.test function.
It returns

Spearman's rank correlation rho


data: Aptitude and Performance
S = 825.63, p-value = 0.09914
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.379226

If you compare this new printout to the previous one, you can see that the only difference
between them is that now we do not have the warning message. In particular, the test
statistic and the reported p-value are the same on these printouts. This illustrates that in the
presence of ties, cor.test uses the same t-approximation as we did on the previous page in
the manual calculations, no matter whether exact = TRUE or exact = FALSE.

On these printouts, the sample Spearman correlation coefficient (rho) is 0.379226,
confirming our calculations. The p-value is 0.09914, hence H0 cannot be rejected at the 5%
significance level, but it could be rejected at the 10% level.10

Exercises for Assessment

Exercise 6 (Selvanathan et al., p. 678, ex. 16.1)

Consider a multinomial experiment involving n = 300 trials and k = 5 cells. The observed
frequencies in cells 1 to 5 are 24, 64, 84, 72 and 56, and the hypotheses
to be tested are as follows:

H0: p1 = 0.1, p2 = 0.2, p3 = 0.3, p4 = 0.2, p5 = 0.2
HA: at least one pi (i = 1, 2, 3, 4, 5) is not equal to its value specified in H0.

Test the null hypothesis at the 1% significance level first manually and then with R.

Exercise 7

Return to the case study described in Exercise 2. Is it possible to infer at the 1% significance
level that the preference for Australian made grocery products (Aussie) and the impact of
brand name on product choice (Brand) are related to each other? Perform a chi-square test
of independence with R. Does increasing the significance level to 5% change your answer?

10 There is one more statistic on this printout, S = 825.63. You do not need it, but you might wish to know that
it is calculated as S = (1 − rs)(n³ − n)/6 = (1 − 0.379226)(20³ − 20)/6 ≈ 825.63. If there were no ties, it would
be equal to the sum of all squared rank differences.
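
This formula is easy to verify (a Python sketch, with rho taken from the R printout):

```python
# Check of the S statistic reported by cor.test for Spearman's rho:
# S = (1 - rs) * (n^3 - n) / 6, with rho = 0.379226 and n = 20.
rs, n = 0.379226, 20
S = (1 - rs) * (n ** 3 - n) / 6
print(round(S, 2))  # 825.63
```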

Exercise 8

A survey was conducted in five countries. The percentages of respondents whose
household members own more than one personal computer, laptop, notebook or iPad are
as follows:

Australia 53%
New Zealand 48%
China 38%
Japan 54%
South Korea 49%

Suppose that the survey was based on 500 respondents in each country.

(a) At the 0.05 level of significance, determine whether there is a significant difference
in the proportions of households in these countries that own more than one computer
(personal computer, laptop, notebook or iPad). Do the calculations first manually and
then in R.

(b) Find the approximate p-value of the test in (a) from the relevant statistical table.

Exercise 9

In Exercise 4 you performed a t-test on the Pearson correlation coefficient between Price
and Odometer and concluded at the 5% significance level that there is a significantly
negative linear relationship between them. Later, however, you realised that this test might
be misleading because Price is probably non-normal.

To double check your conclusion, calculate and test the Spearman correlation coefficient
with R.

