Machine Learning-Intro
Textbook
Title: An Introduction to Statistical Learning: with Applications in R, 2021
Authors: G. James, D. Witten, T. Hastie and R. Tibshirani
Reference Book
Title: The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Authors: T. Hastie, R. Tibshirani and J. Friedman
Grading:
⚫ Attendance 10%
⚫ Regular coursework 30%
⚫ Midterm Exam 30%
⚫ Final Report 30%
Office hours:
Tue. 10:00~11:00
Thu. 10:00~11:00
Introduction to R
Basic Commands: vector
x <- c(1, 3, 2, 5)
x
## [1] 1 3 2 5
x = c(1, 6, 2)
x
## [1] 1 6 2
y = c(1, 4, 3)
We can tell R to add two sets of numbers together. It will then add the first number
from x to the first number from y, and so on. However, x and y should be the same
length. We can check their length using the length() function.
length (x)
## [1] 3
length (y)
## [1] 3
x + y
## [1] 2 10 5
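The same-length requirement is worth checking explicitly, because R does not stop when the lengths differ: it silently recycles the shorter vector. A minimal sketch, using hypothetical vectors a and b that are not part of these notes:

```r
# Hypothetical vectors for illustration (not from the notes above).
a <- c(1, 6, 2, 4)
b <- c(10, 20)

# Lengths differ, so R recycles b as (10, 20, 10, 20) before adding.
recycled_sum <- a + b
recycled_sum

# A defensive check that can be tested before doing element-wise arithmetic.
same_length <- length(a) == length(b)
same_length
```

When the longer length is not a multiple of the shorter one, R still recycles but also issues a warning.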
The ls() function allows us to look at a list of all of the objects, such as data and
functions, that we have saved so far. The rm() function can be used to delete any
that we don’t want.
ls ()
rm(x, y)
ls ()
## character(0)
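It’s also possible to remove all objects at once:
rm(list = ls())
Basic Commands: matrix
The matrix() function creates a matrix of numbers. By default, it fills the matrix
column by column.
matrix (data = c(1, 2, 3, 4), nrow = 2, ncol = 2)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4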
Alternatively, the byrow = TRUE option can be used to populate the matrix in order
of the rows.
x <- matrix (c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
x
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
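The effect of the fill order is easier to see on a non-square matrix. A minimal sketch, using the values 1 to 6 as an illustrative assumption (the names by_col and by_row are hypothetical):

```r
# Fill the values 1..6 into a 2 x 3 matrix in both orders.
by_col <- matrix(1:6, nrow = 2)                # column by column (the default)
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # row by row

by_col
by_row

# Filling by row gives the transpose of filling a 3 x 2 matrix by column.
identical(by_row, t(matrix(1:6, nrow = 3)))
```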
The sqrt() function returns the square root of each element of a vector or matrix.
The command x^2 raises each element of x to the power 2; any powers are possible,
including fractional or negative powers.
sqrt (x)
##          [,1]     [,2]
## [1,] 1.000000 1.414214
## [2,] 1.732051 2.000000
x^2
##      [,1] [,2]
## [1,]    1    4
## [2,]    9   16
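Fractional and negative powers work element by element in the same way. A short sketch, assuming a matrix m chosen to match the outputs shown above (the name m is hypothetical):

```r
# Assumed example matrix; chosen to match the outputs shown above.
m <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)

half_power <- m^0.5   # fractional power: same as sqrt(m)
reciprocal <- m^-1    # negative power: element-wise 1/m, not the matrix inverse

all.equal(half_power, sqrt(m))
all.equal(reciprocal, 1 / m)
```

Note that m^-1 is not matrix inversion; in base R the inverse of a square matrix is obtained with solve(m).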
The rnorm() function generates a vector of random normal variables; its first
argument, n, is the sample size. Each time we call this function, we will get a different
answer. Here we create two correlated sets of numbers, x and y, and use the cor()
function to compute the correlation between them.
## [1] 0.9943504
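One way such a correlated pair could be generated is sketched below. It assumes y is x plus a small amount of independent normal noise; the noise mean of 50 and standard deviation of 0.1 are illustrative assumptions, not values stated in these notes.

```r
# Assumed construction: y is x shifted by about 50 plus a little noise.
x <- rnorm(50)
y <- x + rnorm(50, mean = 50, sd = 0.1)

# No seed is set, so this value differs slightly on every run.
cor(x, y)
```

Because the noise is small relative to the spread of x, the correlation should come out close to 1.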
By default, rnorm() creates standard normal random variables with a mean of 0 and
a standard deviation of 1. Sometimes we want our code to reproduce the exact same
set of random numbers; we can use the set.seed() function to do this.
set.seed (1303)
rnorm (50)
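A quick way to see what set.seed() does is to draw twice from the same seed and compare. A minimal sketch (the variable names are hypothetical):

```r
set.seed(1303)
first_draw <- rnorm(5)

set.seed(1303)            # reset the generator to the same seed
second_draw <- rnorm(5)

# Same seed, same numbers.
identical(first_draw, second_draw)

# Without resetting the seed, the generator has moved on.
third_draw <- rnorm(5)
identical(first_draw, third_draw)
```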
set.seed (3)
y <- rnorm (100)
mean (y)
## [1] 0.01103557
var (y)
## [1] 0.7328675
sqrt (var (y))
## [1] 0.8560768
sd(y)
## [1] 0.8560768
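The relationship between these functions can be checked by hand: var() computes the sample variance with an n - 1 denominator, and sd() is its square root. A minimal sketch (manual_var is a hypothetical name):

```r
set.seed(3)
y <- rnorm(100)
n <- length(y)

# Sample variance: squared deviations from the mean, divided by n - 1.
manual_var <- sum((y - mean(y))^2) / (n - 1)

all.equal(manual_var, var(y))
all.equal(sqrt(manual_var), sd(y))
```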
Graphics
x <- seq (1, 10)
x
## [1] 1 2 3 4 5 6 7 8 9 10
x <- 1:10
x
## [1] 1 2 3 4 5 6 7 8 9 10
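Sequences like this usually serve as grids for plotting. The colon operator is shorthand for a unit-step sequence, and seq() generalizes it. A brief sketch, with hypothetical names by_two and quarters:

```r
# 1:10 is shorthand for a sequence from 1 to 10 in steps of 1.
all.equal(seq(1, 10), 1:10)

# seq() also accepts a step size or a target length.
by_two <- seq(1, 10, by = 2)
quarters <- seq(0, 1, length.out = 5)
by_two
quarters
```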
Indexing Data
A <- matrix (1:16, 4, 4)
A[2, 3]
## [1] 10
A[c(1, 3) , c(2, 4)]
##      [,1] [,2]
## [1,]    5   13
## [2,]    7   15
A[1:3 , 2:4]
A[1:2 , ]
A[, 1:2]
##      [,1] [,2]
## [1,]    1    5
## [2,]    2    6
## [3,]    3    7
## [4,]    4    8
A[1, ]
## [1] 1 5 9 13
A[-c(1, 3) , ]
##      [,1] [,2] [,3] [,4]
## [1,]    2    6   10   14
## [2,]    4    8   12   16
A[-c(1, 3) , -c(1, 3, 4)]
## [1] 6 8
dim (A)
## [1] 4 4
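Two indexing behaviors deserve a closer look: negative indices exclude rows or columns, and a single-row selection collapses to a plain vector. A minimal sketch, assuming A is the 4 x 4 matrix filled column by column with 1 to 16, which is consistent with the outputs shown above:

```r
# Assumed definition of A, inferred from the outputs shown above.
A <- matrix(1:16, nrow = 4, ncol = 4)

# Negative indices drop rows/columns: this keeps rows 2 and 4 of column 2.
A[-c(1, 3), -c(1, 3, 4)]

# Selecting one row returns a plain vector, so dim() is NULL ...
dim(A[1, ])

# ... unless drop = FALSE is used to keep the 1 x 4 matrix shape.
dim(A[1, , drop = FALSE])
```

Keeping the matrix shape with drop = FALSE can matter when later code expects a two-dimensional object.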