
Tutorial Worksheet wk6

DSA2101 tutorial wk6


Week 6 Tutorial Worksheet

AY24/25

Submission: End of tutorial day

Question 1. Housing prices in the US


The data for this question come from Federal Reserve Economic Data (FRED) and consist of
quarterly median housing prices in the US from 1963 to 2024. The data set is available as
MSPUS.xls on Canvas.
1. Read the data into R as qn1_1. Perform the following tasks and overwrite qn1_1 with the
resulting data frame.
• Rename the variables to date and price.
• Convert the date column to Date type.
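A minimal sketch of the two clean-up steps, using a small hand-made data frame in place of the spreadsheet. The original column names `observation_date` and `MSPUS` and the sample values are assumptions for illustration. With the real file you would first import it (for example with `readxl::read_excel()`, possibly skipping metadata rows) and check `names()` before renaming.

```r
# Toy stand-in for the imported spreadsheet. The column names and the
# values are assumptions for illustration, not taken from MSPUS.xls.
raw <- data.frame(
  observation_date = c("1963-01-01", "1963-04-01", "1963-07-01"),
  MSPUS = c(100, 110, 120)
)

qn1_1 <- raw
names(qn1_1) <- c("date", "price")   # rename both columns by position
qn1_1$date <- as.Date(qn1_1$date)    # ISO-formatted character -> Date

str(qn1_1)
```

If the imported date column is already a date-time rather than text, `as.Date()` still applies, but it is worth checking the result for time-zone shifts.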
2. Re-create, as much as you can, the following visualization using the qn1_1 data frame.

[Figure: "Quarterly median house prices in the United States". Vertical axis ticks from 100000 to 400000; horizontal axis ticks at 1980, 2000, 2020.]

3. Subset qn1_1 to include data from Q1 2019 to Q2 2024 (inclusive). Save it as qn1_3. In
qn1_3, create two new variables:
• year: The year information from the date column.
• quarter: A factor variable for the quarter in the format of Q1, Q2, Q3, Q4, based
on the date column.
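One possible way to do the subsetting and derive the two variables is sketched below in base R. The toy `qn1_1` and its prices are made up, and the sketch assumes each quarter is dated on its first day (so Q2 2024 appears as 2024-04-01). Check this against the real data before relying on the cut-off dates.

```r
# Toy quarterly data; the prices are made-up placeholders.
dates <- seq(as.Date("2018-07-01"), as.Date("2024-10-01"), by = "quarter")
qn1_1 <- data.frame(date = dates, price = 300000 + 1000 * seq_along(dates))

# Keep Q1 2019 through Q2 2024 (inclusive), assuming first-of-quarter dates.
keep <- qn1_1$date >= as.Date("2019-01-01") & qn1_1$date <= as.Date("2024-06-30")
qn1_3 <- qn1_1[keep, ]

# Year as an integer, quarter as a factor with levels Q1..Q4.
qn1_3$year <- as.integer(format(qn1_3$date, "%Y"))
qn1_3$quarter <- factor(quarters(qn1_3$date), levels = c("Q1", "Q2", "Q3", "Q4"))

head(qn1_3)
```

Setting the factor levels explicitly keeps all four quarters in a fixed order even when a subset happens to miss one of them.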

4. Re-create, as much as you can, the following visualization using the qn1_3 data frame.

[Figure: "Quarterly median Housing prices (dollars), 2019 − 2024". Vertical axis ticks from 320000 to 440000; horizontal axis ticks at 2020, 2022, 2024.]

Question 2. International visitor arrivals


In this question, we will revisit the tourist data from Week 4. The file tourist.xlsx contains
monthly international visitor arrivals in Singapore from July to December 2022, retrieved from
the Singapore Department of Statistics.
1. Read the data into R as qn2_1 and transform it into a tidy format, following the structure
shown below. The first column k is a unique identifier for each duration category,
k = 1, 2, ..., 13.
head(qn2_1, 4)

## # A tibble: 4 x 5
##       k duration              year month arrivals
##   <int> <chr>                <int> <ord>    <dbl>
## 1     1 Under 1 Day (Number)  2022 Jul     108581
## 2     2 1 Day (Number)        2022 Jul      88056
## 3     3 2 Days (Number)       2022 Jul      91996
## 4     4 3 Days (Number)       2022 Jul     109801
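The reshaping step might look roughly like the sketch below. The layout of tourist.xlsx is not described here, so the wide table `wide`, with one row per duration category and one column per month named like `2022 Jul`, is purely an assumption, and the August figure for the second row is made up. The real file will likely need extra cleaning before a pivot like this applies.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide layout: one row per duration, one column per month.
wide <- tibble(
  duration   = c("Under 1 Day (Number)", "1 Day (Number)"),
  `2022 Jul` = c(108581, 88056),
  `2022 Aug` = c(125981, 90000)   # second value is made up
)

qn2_1 <- wide %>%
  mutate(k = row_number(), .before = 1) %>%          # integer id per category
  pivot_longer(-c(k, duration),
               names_to = c("year", "month"), names_sep = " ",
               values_to = "arrivals") %>%
  mutate(year  = as.integer(year),
         month = factor(month, levels = month.abb, ordered = TRUE)) %>%
  arrange(year, month, k)

qn2_1
```

Making `month` an ordered factor with `month.abb` as levels gives calendar order rather than alphabetical order when sorting.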
2. Calculate the month-on-month rate of change in tourist arrivals for the 13 arrival
duration categories from August to December in 2022. For each duration category k, the
rate of change can be calculated as
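\[
\text{rate}_k = \frac{\text{Current Month}_k - \text{Previous Month}_k}{\text{Previous Month}_k}, \qquad k = 1, 2, \ldots, 13,
\]
where Current Month_k represents the number of arrivals in a given month for duration
category k and Previous Month_k the number of arrivals in the previous month for the
same category k. The sample output below reports the rate as a percentage, that is, this
ratio multiplied by 100.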

Store the result as a new column named rate in qn2_2. The columns of qn2_2 should
follow the structure of the following:
qn2_2 %>% arrange(year, k) %>% head(4)

## # A tibble: 4 x 6
##       k duration              year month arrivals  rate
##   <int> <chr>                <int> <ord>    <dbl> <dbl>
## 1     1 Under 1 Day (Number)  2022 Aug     125981 16.0
## 2     1 Under 1 Day (Number)  2022 Sep     134931  7.10
## 3     1 Under 1 Day (Number)  2022 Oct     136573  1.22
## 4     1 Under 1 Day (Number)  2022 Nov     146629  7.36
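A possible sketch of the month-on-month calculation with a grouped `lag()`. The toy `qn2_1` holds only three months and two categories; the first category's arrivals are copied from the sample outputs in this worksheet, while the second category and its values are made up. The factor of 100 is an assumption inferred from the sample output, where the change from 108581 to 125981 is shown as 16.0.

```r
library(dplyr)

qn2_1 <- tibble(
  k        = rep(1:2, each = 3),
  duration = rep(c("Under 1 Day (Number)", "Toy category"), each = 3),
  year     = 2022L,
  month    = factor(rep(c("Jul", "Aug", "Sep"), times = 2),
                    levels = month.abb, ordered = TRUE),
  arrivals = c(108581, 125981, 134931,   # from the worksheet's sample output
               100, 110, 121)            # made-up values
)

qn2_2 <- qn2_1 %>%
  group_by(k) %>%
  arrange(year, month, .by_group = TRUE) %>%   # chronological within category
  mutate(rate = (arrivals - lag(arrivals)) / lag(arrivals) * 100) %>%
  ungroup() %>%
  filter(!is.na(rate))                         # drop July, which has no previous month

qn2_2
```

Grouping by `k` before calling `lag()` prevents the first month of one category from being compared with the last month of the previous category.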

Requirements
• After answering all questions in the Rmd file, knit it into HTML.
• The code in your Rmd file should create the following data frames: qn1_1, qn1_3, qn2_1,
and qn2_2.
• The knitted HTML file should contain two graphs based on the housing price data.
• Submit your Rmd file to Canvas after your tutorial session.
This will be the last time we check your submissions before the midterm exam. Reach out to
your TA as soon as possible if you are still unsure about the submission requirements.
