Lab 1: Introduction To R: 1 Installing Software


Lab 1 : Introduction to R

Prof. Gustavo Sanchez


July, 2019

1 Installing Software
Anaconda is a package and environment manager, a Python/R data science
distribution, and a collection of more than 1,500 open source packages. It is
free to install from:

https://www.anaconda.com/distribution/

RStudio is a free and open-source integrated development environment (IDE) for
R, a programming language for statistical computing and graphics. It includes
a console, a syntax-highlighting editor that supports direct code execution, and
tools for plotting, history, debugging and workspace management. It
is available in open source and commercial editions and runs on the desktop
(Windows, Mac, and Linux) or in a browser connected to RStudio Server or
RStudio Server Pro (Debian/Ubuntu, RedHat/CentOS, and SUSE Linux).

https://www.rstudio.com/products/rstudio/download/

2 Basic commands
The most useful R command for quickly entering small data sets is the c
function, which combines (concatenates) its arguments into a single vector.
As an example, suppose we have the following raw data:

J F SO SE J J SE J F F SO SE J SO

To enter this into an R session:


> raw_data = c('J','F','SO','SE','J','J','SE','J','F','F','SO','SE','J','SO')
> raw_data
[1] "J" "F" "SO" "SE" "J" "J" "SE" "J" "F" "F" "SO" "SE" "J" "SO"
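
As a small sketch of how c behaves beyond this example, the lines below join two existing vectors and then mix numbers with strings. The vector names (first_half, second_half, combined, mixed) are made up for illustration and are not part of the lab data.

```r
# Hypothetical vectors, used only for illustration
first_half  <- c("J", "F", "SO")
second_half <- c("SE", "J")

# c() joins existing vectors end to end into one vector
combined <- c(first_half, second_half)
print(combined)
print(length(combined))

# Mixing numbers and strings coerces every element to character
mixed <- c(1, "J", 2)
print(class(mixed))
```

Because a vector holds a single type, it is usually safer to keep numeric and character data in separate vectors.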

Try this and write your comments

summary(raw_data)
View(raw_data)
data.entry(raw_data)
n = length(raw_data)
raw_data[0]
raw_data[1]
raw_data[2]
raw_data[-4]
raw_data[-c(1,5,9)] # negative indices drop elements 1, 5 and 9
raw_data[5:9]
tmp <- 1:4
raw_data[tmp]
raw_data == 'SE'
which(raw_data == 'SE')
raw_data[raw_data == 'SE']
seq(1,14,2)
raw_data[seq(1,14,2)]
sample(raw_data, size=5)
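
When writing your comments, it may help to compare against a smaller vector where the result is easy to predict. The sketch below uses a hypothetical five-letter vector x (not part of the lab data) to show the same kinds of indexing: positive, zero, negative, range, and logical. The seed value passed to set.seed is arbitrary and is only there so that sample returns the same draw on every run.

```r
# Hypothetical vector, used only for illustration
x <- c("a", "b", "c", "d", "e")

first    <- x[1]     # R indexes from 1, so this is the first element
nothing  <- x[0]     # index 0 gives a zero-length vector, not an error
dropped  <- x[-2]    # a negative index removes that position
middle   <- x[2:4]   # a range of positions

# A logical vector keeps the positions that are TRUE
is_vowel <- x %in% c("a", "e")
vowels   <- x[is_vowel]
where    <- which(is_vowel)  # positions of the TRUE entries

# sample() draws at random (without replacement by default);
# fixing the seed makes the draw reproducible
set.seed(1)
drawn <- sample(x, size = 3)

print(list(first, nothing, dropped, middle, vowels, where, drawn))
```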

3 Graphical presentation (Qualitative)
Try this and write your comments

table(raw_data)

T <- table(raw_data)
names(T)
library("lattice")
barchart(T, horizontal = FALSE, main = "Bar Chart",
xlab = "Data Characteristic", ylab = "Frequency", col = "darkgreen")
barplot(T/sum(T)*100, main = "Bar Chart (%)", xlab = "Data Characteristic",
ylab = "Percentage", col = "darkblue")
pie(T,main="Pie Chart of raw_data")
lbls <- c("F", "J", "SE", "SO")
pct <- round(T/sum(T)*100)
lbls <- paste(lbls, pct)
lbls <- paste(lbls,"%",sep="")
pie(T, labels = lbls, col = rainbow(length(lbls)),
main = "Pie Chart of raw_data")
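
The percentage labels can also be built directly from the frequency table, so they do not have to be typed by hand in the right order. This is a minimal sketch on a hypothetical grades vector (not part of the lab data); it uses prop.table, which divides each count by the total, in place of dividing by sum.

```r
# Hypothetical data, used only for illustration
grades <- c("A", "B", "A", "C", "A", "B")

counts <- table(grades)                    # frequency of each category
pct    <- round(100 * prop.table(counts), 1)  # percentages, one decimal place

# names(counts) returns the categories in the same order as the counts,
# so the labels always line up with the slices
labels <- paste0(names(counts), " ", pct, "%")
print(labels)
```

The resulting labels vector can be passed to pie through its labels argument, in the same way as lbls above.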

4 Graphical presentation (Quantitative)
Try this and write your comments

data("iris")

summary(iris)
names(iris)
str(iris)
View(iris)

plot(1:length(iris$Sepal.Length),iris$Sepal.Length)
stem(iris$Sepal.Length)
hist(iris$Sepal.Length)
H <- hist(iris$Sepal.Length, breaks = 20)
hist(iris$Sepal.Length, breaks = 4)
cumfreq0 = c(0, cumsum(H$counts))
plot(H$breaks, cumfreq0, main = "Cumulative frequencies chart",
xlab = "Sepal Lengths", ylab = "Cumulative frequencies")
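
To see why a leading 0 is added before cumsum, it can help to work through a case small enough to count by hand. The sketch below uses a hypothetical vector lengths_cm and explicit break points (both chosen for illustration, not taken from iris), and calls hist with plot = FALSE so that only the counts are computed.

```r
# Hypothetical measurements, used only for illustration
lengths_cm <- c(4.9, 5.1, 5.4, 5.8, 6.0, 6.3, 6.4, 7.1)

# Explicit class limits; by default each class is (lower, upper]
h <- hist(lengths_cm, breaks = c(4, 5, 6, 7, 8), plot = FALSE)
print(h$counts)

# There is one more break than there are classes, and no observation
# lies below the first break, so the cumulative curve starts at 0
cum_counts <- c(0, cumsum(h$counts))
print(cum_counts)

# One cumulative value per break, ending at the sample size
print(length(cum_counts) == length(h$breaks))
```

The last cumulative value equals the number of observations, which is a quick check that no data point fell outside the breaks.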
