Homework 1
1. Download and install R and RStudio on your personal computer. (No need to take a
screenshot.)
https://posit.co/download/rstudio-desktop/
3. Import the above dataset to R. Take a snapshot of what you did. What is your working
directory?
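A minimal sketch of the import step, assuming the dataset is stored as a CSV file. The file name uploads.csv and the column name uploads are placeholders, not the actual names used in the course; a small made-up file is written to a temporary folder so that the example runs on its own.

```r
# Show the current working directory (the folder where R looks for files by default).
# setwd("path/to/folder") would change it.
getwd()

# Hypothetical stand-in for the course dataset: write a tiny CSV to a temporary folder.
csv_path <- file.path(tempdir(), "uploads.csv")
write.csv(data.frame(uploads = c(3, 12, 7, 25, 18)), csv_path, row.names = FALSE)

# Import the file into a data frame and take a first look at it.
dat <- read.csv(csv_path)
head(dat)
str(dat)
```

With the real dataset, only the path passed to read.csv() would change. In RStudio the same import can also be done through the menus, which is a matter of preference.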
5. Produce a frequency table using R for the variable uploads. The number of classes is fixed
to be 4. Also, as a separate exercise, use code to find the maximum and the minimum number
of uploads. Please also attach your code.
Note: in class we discussed how to cut the range of values of a variable into intervals of equal
lengths. In fact, you can also choose to cut the range into unequal intervals. Here is how you do
it:
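upfrq <- table(cut(up, breaks = c(0, a, b, c, d), right = FALSE))
This code tells R to cut the range of values of the vector up into four intervals [0, a), [a, b),
[b, c) and [c, d), and to produce the corresponding frequency distribution. The argument
right = FALSE makes each interval closed on the left and open on the right; without it, cut()
uses the intervals (0, a], (a, b], (b, c] and (c, d]. You should choose your preferred values of
a, b, c, d.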
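The following sketch illustrates both ways of cutting on a made-up vector. The values in up and the break points 10, 20, 40, 60 are assumptions chosen for illustration only; they are not taken from the course dataset.

```r
# Made-up upload counts, used only for illustration.
up <- c(2, 5, 9, 14, 21, 28, 33, 40, 47, 55)

# Largest and smallest values.
max(up)
min(up)

# Four classes of equal length: cut() chooses the break points itself.
eqfrq <- table(cut(up, breaks = 4))
eqfrq

# Four classes of unequal length with hand-picked break points.
# right = FALSE gives the left-closed intervals [0, 10), [10, 20), [20, 40), [40, 60).
upfrq <- table(cut(up, breaks = c(0, 10, 20, 40, 60), right = FALSE))
upfrq
```

Any value that falls outside the chosen break points is turned into NA by cut() and silently dropped by table(), so it may be worth checking that the counts add up to the number of observations.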
8. Produce a pie chart and a bar chart based on your answer in 5. Either save the pictures or
take screenshots.
(You do not have to use R, but if you explore a bit and find the appropriate R command, feel
free to take a snapshot of what you did.)
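For those who choose R, one possible approach uses the base functions pie() and barplot(). This is a sketch under assumptions: the frequency table is rebuilt from made-up numbers, and the output file name uploads_charts.pdf is a placeholder.

```r
# Made-up frequency table standing in for the answer to question 5.
up <- c(2, 5, 9, 14, 21, 28, 33, 40, 47, 55)
upfrq <- table(cut(up, breaks = c(0, 10, 20, 40, 60), right = FALSE))

# Open a PDF file so that both charts are saved (one chart per page).
chart_path <- file.path(tempdir(), "uploads_charts.pdf")
pdf(chart_path)

# Pie chart: one slice per class.
pie(upfrq, main = "Uploads by class")

# Bar chart: one bar per class, with bar height equal to the class frequency.
# barplot() returns the horizontal midpoints of the bars.
bar_mid <- barplot(upfrq, main = "Uploads by class",
                   xlab = "Number of uploads", ylab = "Frequency")

# Close the file.
dev.off()
```

Leaving out the pdf() and dev.off() calls draws the charts in the RStudio Plots pane instead, from which they can be exported or captured in a screenshot.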
10. Classify the following as either cross-sectional data or time series data. (No justification
required.)
(1) Number of supermarkets in each of the ten provinces and three territories in Canada as of
January 1st, 2023.
(2) Your monthly electricity bills for the last couple of years.
(3) Average life expectancy by country.
(4) 200 responses to the same political poll, where the individuals are interviewed on either
Saturday or Sunday of the same week.
11. The following is part of a dataset originally consisting of 6704 elements. The dataset included
six variables: age, gender, experience, job title, education level and salary.
Classify the variables as either qualitative variables or quantitative variables. Are the
quantitative variables continuous or discrete?
Challenge question. (A correct solution contributes 0.1 percent of extra credit towards
your overall grade.)
We define a Fibonacci-type sequence to be a sequence (a_n)_{n≥0} of integers such that
a_0 = a
a_1 = b
a_n = a_{n-1} + a_{n-2} for n ≥ 2
Here, a, b are pre-chosen integers. We call them the initial values of the sequence.
Write a function in R that takes the initial values a, b as additional arguments and computes the
terms of a Fibonacci-type sequence. In particular, your function should have three arguments. This
function allows us to change the initial values without modifying the code each time.
Afterwards, use the smallest digit in your student ID as a and the largest digit as b, and compute
the first twelve terms of the corresponding sequence using your own function.
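One possible shape for such a function is sketched below. The name fib_type and the choice of the number of terms n as the third argument are assumptions, since the assignment does not name them. The initial values in the two calls are sample values, not digits from anyone's student ID.

```r
# fib_type() is a hypothetical name. a and b are the initial values a_0 and a_1;
# n (assumed to be the third argument) is the number of terms to return.
fib_type <- function(a, b, n) {
  terms <- numeric(n)
  # R vectors are indexed from 1, so terms[i] stores a_{i-1}.
  if (n >= 1) terms[1] <- a
  if (n >= 2) terms[2] <- b
  if (n >= 3) {
    for (i in 3:n) {
      # Each term is the sum of the two terms before it.
      terms[i] <- terms[i - 1] + terms[i - 2]
    }
  }
  terms
}

# Classical Fibonacci numbers: a = 0, b = 1.
fib_type(0, 1, 12)
# 0 1 1 2 3 5 8 13 21 34 55 89

# Different initial values, same code: a = 2, b = 1.
fib_type(2, 1, 12)
# 2 1 3 4 7 11 18 29 47 76 123 199
```

Guarding the loop with n >= 3 avoids a common pitfall in R: when n is 2, the expression 3:n counts downwards (3, 2) instead of producing an empty range.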