Sheet1 Sol
Sheet I - Solutions
1 Descriptive Statistics - Variables
Introduction to R, RStudio
Task 1: Calculate the following quantities:
# sum of 52.3, 74.8, 3.17
52.3+74.8+3.17
## [1] 130.27
## [1] 12
## [1] 1.627074
## [1] 1 4 22 42 44
## [1] 5 8 17 14 0 12 7 3 9 6
# or
round(runif(n = 10, min = 0, max = 20))
## [1] 11 18 7 17 11 11 18 16 19 14
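A side note on the `round(runif())` variant: rounding gives the endpoints 0 and 20 only half the probability of the integers in between. The sketch below contrasts it with base R's `sample()`, which draws every integer with equal probability. The seed and the names `rounded` and `sampled` are assumptions for illustration, not part of the original solution.

```r
# Assumption: a fixed seed, only so that the draws are reproducible
set.seed(1)

# Variant from the sheet: continuous uniform draws rounded to the nearest integer
rounded <- round(runif(n = 10, min = 0, max = 20))

# Alternative: draw directly from the integers 0..20 with equal probability
sampled <- sample(0:20, size = 10, replace = TRUE)

rounded
sampled
```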
## [1] 50
## [1] 5
max(myvec)
## [1] 50
mean(myvec)
## [1] 21.66667
Task 3: The numbers below are the first ten days of rainfall in a year
# values: 0.1 0.5 2.3 1.1 11.3 14.7 23.4 15.7 0 0.9
# Read them into a vector using the c() command.
rainfall <- c(0.1,0.5,2.3,1.1,11.3,14.7,23.4,15.7,0,0.9)
# Calculate the mean and the standard deviation.
mean(rainfall)
## [1] 7
sd(rainfall)
## [1] 8.533594
## [1] 0.1 0.6 2.9 4.0 15.3 30.0 53.4 69.1 69.1 70.0
## [1] 70
## [1] 7
## [1] 16.275
# alternative solution
which(rainfall %in% c(0, 1.1))
## [1] 4 9
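Several outputs in Task 3 appear without the command that produced them. The printed values are consistent with a cumulative sum, a total, and the mean of the days with more than 10 units of rain. The following is a sketch reconstructed from those values, so the exact calls and the threshold of 10 are assumptions rather than the original commands.

```r
rainfall <- c(0.1, 0.5, 2.3, 1.1, 11.3, 14.7, 23.4, 15.7, 0, 0.9)

# Running total of rainfall, day by day
cumsum(rainfall)

# Total rainfall over the ten days
sum(rainfall)

# Assumption: mean over the days with more than 10 units of rainfall
heavy_mean <- mean(rainfall[rainfall > 10])
heavy_mean

# Positions (day numbers) of the days with exactly 0 or 1.1 units of rainfall
which(rainfall %in% c(0, 1.1))
```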
Task 4: The lengths of five cylinders are 2.5, 3.4, 4.8, 3.1, 1.7 and their diameters are
0.7, 0.4, 0.5, 0.5, 0.9.
# Read these values into two vectors with appropriate names.
len <- c(2.5, 3.4, 4.8, 3.1, 1.7)
diam <- c(0.7, 0.4, 0.5, 0.5, 0.9)
# Calculate the volume of each cylinder and store the results
# in a new vector.
vol <- len * (diam / 2)^2 * pi
vol
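The calculation uses the cylinder volume formula V = pi * (d / 2)^2 * l, applied element-wise to the two vectors. The same step can be wrapped in a small helper; `cylinder_volume` is a hypothetical name introduced here for illustration only.

```r
len  <- c(2.5, 3.4, 4.8, 3.1, 1.7)
diam <- c(0.7, 0.4, 0.5, 0.5, 0.9)

# Hypothetical helper: volume of a cylinder from its length and diameter.
# Arithmetic in R is vectorised, so this works on whole vectors at once.
cylinder_volume <- function(len, diam) {
  radius <- diam / 2
  pi * radius^2 * len
}

vol <- cylinder_volume(len, diam)
vol
```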
Task 5: Inspect the R commands union(), setdiff() and intersect(), which implement set
operations.
# Make two vectors
x <- c(1,2,3,4,5)
y <- c(3,5,7,9)
# Find values that are contained in both x and y.
intersect(x,y)
## [1] 3 5
## [1] 1 2 4
# y without x
setdiff(y,x)
## [1] 7 9
## [1] 1 2 3 4 5 7 9
## [1] 1 2 3 4 5 3 5 7 9
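Three of the outputs in Task 5 are printed without their commands. They match `setdiff(x, y)`, `union(x, y)` and plain concatenation with `c(x, y)`. The sketch below is reconstructed from the printed values, so the choice of calls is an assumption.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9)

# x without y: values of x that do not occur in y
setdiff(x, y)

# All distinct values from x and y (duplicates removed)
union(x, y)

# Plain concatenation keeps the duplicates 3 and 5, unlike union()
c(x, y)
```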
Task 6: Construct a matrix with 8 rows and 10 columns. The first row should contain
the numbers 0, 2, 4, …, 18 and the other rows should contain random integers
between 0 and 100. Use runif() to create the random numbers and as.integer() to
convert them to integers.
mat1 <- matrix(c(seq(0, 18, by = 2),
                 as.integer(runif(70, 0, 100))),
               nrow = 8, ncol = 10, byrow = TRUE)
mat1
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0 2 4 6 8 10 12 14 16 18
## [2,] 69 26 5 41 6 45 4 79 2 60
## [3,] 89 60 15 15 27 90 18 81 27 13
## [4,] 82 49 25 8 95 30 28 6 40 98
## [5,] 37 41 6 65 96 76 2 9 60 91
## [6,] 25 41 57 2 95 91 24 30 84 61
## [7,] 55 74 0 62 53 78 49 22 6 75
## [8,] 19 34 70 73 14 83 35 16 11 16
sd(rm)
## [1] 13.63456
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 69 26 5 41 6 45 4 79 2 60
## [2,] 89 60 15 15 27 90 18 81 27 13
## [3,] 82 49 25 8 95 30 28 6 40 98
## [4,] 37 41 6 65 96 76 2 9 60 91
## [5,] 25 41 57 2 95 91 24 30 84 61
## [6,] 55 74 0 62 53 78 49 22 6 75
## [7,] 19 34 70 73 14 83 35 16 11 16
# creating a histogram of cm
hist(cm)
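The objects `rm` and `cm` are used above without a visible definition. One plausible reading is that they hold the row means and column means of `mat1`, and that the second printed matrix is `mat1` without its first row. The sketch below follows that reading. The seed, the names `row_means`, `col_means` and `mat_rest`, and the interpretation itself are assumptions. The longer names also avoid masking the base function `rm()`.

```r
# Assumption: a fixed seed, only so that the random rows are reproducible
set.seed(42)

mat1 <- matrix(c(seq(0, 18, by = 2),
                 as.integer(runif(70, 0, 100))),
               nrow = 8, ncol = 10, byrow = TRUE)

# Assumed meaning of rm and cm: means of each row and of each column
row_means <- rowMeans(mat1)
col_means <- colMeans(mat1)

# Standard deviation of the row means
sd(row_means)

# The matrix without its first (non-random) row
mat_rest <- mat1[-1, ]
dim(mat_rest)
```

A histogram of the column means would then be drawn with `hist(col_means)`.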
Task 7: Inspect the R dataset mpg. Determine the types and the scales of
measurement of all variables in the dataset mpg. Furthermore, determine whether the
variables are discrete or continuous.
# load packages
library(ggplot2)
library(tidyverse)
head(mpg)
## done
# alternative solution
str_mpg
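The storage type of each variable can be read off directly in R. The scale of measurement and the discrete/continuous distinction cannot: they have to be judged from what each variable means. A minimal sketch for the first part, assuming the ggplot2 package (which ships the `mpg` dataset) is installed; `var_types` is a name introduced here for illustration.

```r
library(ggplot2)  # provides the mpg dataset

# Compact overview: one line per variable with its type and first values
str(mpg)

# Storage class of every column as a named character vector
var_types <- sapply(mpg, class)
var_types

# Number of distinct values per variable, a hint for discrete vs. continuous
sapply(mpg, function(col) length(unique(col)))
```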