Course 2: Data Input/Output and Lists
INTRODUCTION TO R
Introduction to Data Science
Index
• Introduction to R
• RStudio
• Getting Started - R Console
• Help
• R-workspace
• Packages
• Data types and Structures
• Vectors
• Missing and special values
• Matrices and Arrays
• Factors
• Lists
• Data frames
• Indexing
• Conditional indexing
• Data input and Output
• Examining Datasets
• Selecting subsets
• Merging datasets
• Numerical Summaries
• Useful functions
Máster Universitario Oficial en Ciencia de Datos e Ingeniería de Computadores
Lists
• Lists can be used to combine objects (of possibly different kinds/sizes) into a
larger composite object.
• The components of the list are named according to the arguments used.
• Components can be extracted with the double bracket operator [[ ]]
• Alternatively, named components can be accessed with the "$" separator.
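A minimal sketch of the points above. The object name student and its components are made up for illustration.

```r
# Hypothetical list combining objects of different kinds and sizes
student <- list(name = "Mary", age = 28, grades = c(7.5, 9, 8))

# Components are named after the arguments passed to list()
names(student)

# Extract a component with the double bracket operator
student[["grades"]]

# Named components can also be accessed with $
student$age

# Single brackets return a sub-list, double brackets the component itself
class(student["age"])    # "list"
class(student[["age"]])  # "numeric"
```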
Names
Names of an R object can be accessed and/or modified with
the names() function.
z <- list(a = 1, b = "c", c = 1:3)
> z
$a
[1] 1
$b
[1] "c"
$c
[1] 1 2 3
# change just the name of the third element.
names(z)[3] <- "c2"
z
$a
[1] 1
$b
[1] "c"
$c2
[1] 1 2 3
x <- 1:3
print(x^2)
[1] 1 4 9
cat(x^2)
1 4 9
cat(x^2, x, "hola")
1 4 9 1 2 3 hola
cat(x^2, x, "hola", sep="_")
1_4_9_1_2_3_hola
nombre edad
John 25
Mary 28
Jim 19
• Note that scan() would not work here, because our file
has a mixture of numeric and character data (and a
header).
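The call used to read this file is not included in these notes. As a sketch, assuming the data above is stored in a plain text file (the temporary path below is only for illustration), read.table() with header = TRUE handles the mixed columns:

```r
# Write the sample data to a temporary file (illustrative path, not from the slides)
path <- tempfile(fileext = ".txt")
writeLines(c("nombre edad", "John 25", "Mary 28", "Jim 19"), path)

# header = TRUE treats the first line as column names;
# stringsAsFactors = FALSE keeps the names as character strings
people <- read.table(path, header = TRUE, stringsAsFactors = FALSE)

str(people)        # one character column and one integer column
mean(people$edad)  # numeric columns are ready for arithmetic
```

Each column keeps its own type in the resulting data frame, which is what scan() cannot offer for a file like this.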
output.txt:
1 3 5
2 4 6
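The call that produced this file is not shown in these notes. One possible way to obtain the same contents, assuming the data is a 2 x 3 matrix, is write.table() with row and column names switched off. The object name m and the temporary directory are assumptions.

```r
# Hypothetical matrix; R fills matrices column by column
m <- matrix(1:6, nrow = 2)

# Write to a temporary directory so the working directory is left untouched
out <- file.path(tempdir(), "output.txt")
write.table(m, file = out, row.names = FALSE, col.names = FALSE)

# Inspect the file contents line by line
readLines(out)
```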
cat("abc\n",file="u.txt")
cat("de\n",file="u.txt",append=TRUE)
u.txt:
abc
de
x: R object
file: character string naming the file to print to. If "" (the
default), cat prints to the console unless redirected by sink.
sep: a character vector of strings to append after each element
fill: controls how the output is broken into successive lines.
append: logical. If TRUE output will be appended to file;
otherwise, it will overwrite the contents of file.
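A short sketch of how file, sep and append interact, using a temporary file whose name is an assumption made for this example:

```r
log_file <- tempfile(fileext = ".txt")

# Without append, the file is created (or overwritten)
cat("id,value\n", file = log_file)

# sep is written between the elements; append = TRUE keeps the existing line
cat(1, 2.5, sep = ",", file = log_file, append = TRUE)
cat("\n", file = log_file, append = TRUE)

contents <- readLines(log_file)
contents

# Calling cat() again without append replaces everything written so far
cat("reset\n", file = log_file)
after_reset <- readLines(log_file)
after_reset
```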
read.delim()
• It is intended for reading TAB-separated files.
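As an illustration, the sketch below writes a small tab-separated file to a temporary location (the file and its contents are assumptions) and reads it back. read.delim() uses header = TRUE and sep = "\t" by default, so no extra arguments are needed.

```r
# Create a small TAB-separated file for the example
tab_file <- tempfile(fileext = ".tsv")
writeLines(c("nombre\tedad", "John\t25", "Mary\t28"), tab_file)

# Defaults of read.delim(): header = TRUE, sep = "\t"
people_tab <- read.delim(tab_file)
people_tab

# The equivalent read.table() call has to spell these options out
people_tab2 <- read.table(tab_file, header = TRUE, sep = "\t")
```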
Exporting Data
There are numerous methods for exporting R objects into
other formats. For SPSS, SAS and Stata you will need to
load the foreign package. For Excel, you will need the
xlsReadWrite package.
• To an Excel Spreadsheet
library(xlsReadWrite)
write.xls(mydata, "c:mydata.xls")
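For the SPSS case, a minimal sketch with write.foreign() from the foreign package could look as follows. The data frame and the temporary file names are made up for this example; the function writes a plain data file plus a syntax file that SPSS can run to load it.

```r
library(foreign)  # recommended package, normally installed together with R

# Hypothetical data frame to export
mydata <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Illustrative output locations
data_file <- tempfile(fileext = ".txt")
code_file <- tempfile(fileext = ".sps")

# Writes the raw data and the SPSS syntax needed to read it
write.foreign(mydata, datafile = data_file, codefile = code_file,
              package = "SPSS")

file.exists(data_file, code_file)
```

Passing package = "SAS" or package = "Stata" to the same function targets the other two systems.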
The source( ) function runs a script in the current session. If the filename
does not include a path, the file is taken from the current working
directory.
# input a script
source("myfile")
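A self-contained sketch of the same idea: the script below is generated on the fly in a temporary directory (its name and contents are assumptions), then run with source() using a full path.

```r
# Create a tiny script file for the example
script <- file.path(tempdir(), "myfile.R")
writeLines(c("y <- 1:3", "total <- sum(y^2)"), script)

# Run the script; objects it creates become available in the session
source(script)
total

# A filename without a path would be looked up in this directory
getwd()
```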
Appendices
Subsetting syntax:
my_object[row]
my_object[row, col]
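A brief sketch of both forms. The vector v and the data frame df are hypothetical examples.

```r
# One index: vectors
v <- c(10, 20, 30, 40)
v[2]         # second element
v[c(1, 3)]   # first and third elements

# Two indices: data frames (and matrices)
df <- data.frame(nombre = c("John", "Mary", "Jim"),
                 edad = c(25, 28, 19))
df[2, ]        # second row, all columns
df[, "edad"]   # all rows, one column
df[1:2, 2]     # rows 1 and 2 of the second column

# A logical condition can be used in the row position
older <- as.character(df[df$edad > 20, "nombre"])
older
```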
Character Functions

substr(x, start=n1, stop=n2)
  Extract or replace substrings in a character vector.
  x <- "abcdef"
  substr(x, 2, 4) is "bcd"
  substr(x, 2, 4) <- "22222" changes x to "a222ef"

grep(pattern, x, ignore.case=FALSE, fixed=FALSE)
  Search for pattern in x. If fixed=FALSE then pattern is a regular expression.
  If fixed=TRUE then pattern is a text string. Returns the matching indices.
  grep("A", c("b","A","c"), fixed=TRUE) returns 2

sub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE)
  Find the first match of pattern in x and replace it with the replacement text.
  If fixed=FALSE then pattern is a regular expression.
  If fixed=TRUE then pattern is a text string.
  sub("\\s", ".", "Hello There") returns "Hello.There"

paste(..., sep=" ")
  Concatenate strings, using the sep string to separate them.
  paste("x", 1:3, sep="") returns c("x1", "x2", "x3")
  paste("x", 1:3, sep="M") returns c("xM1", "xM2", "xM3")

toupper(x)
  Convert to uppercase.

tolower(x)
  Convert to lowercase.
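The functions above can be combined. The sketch below tidies a made-up vector of names; all object names are assumptions for illustration.

```r
# Hypothetical, inconsistently formatted input
raw <- c("john smith", "MARY JONES", "jim brown")

# Normalise the case first
lower <- tolower(raw)

# Capitalise the first letter: uppercase character 1, keep the rest
capitalised <- paste(toupper(substr(lower, 1, 1)),
                     substr(lower, 2, nchar(lower)),
                     sep = "")

# Case-insensitive search returns the index of the matching element
hit <- grep("jones", raw, ignore.case = TRUE)

# sub() replaces only the first match in each string
with_underscore <- sub(" ", "_", lower, fixed = TRUE)

# Build identifiers by pasting a prefix and a counter
ids <- paste("id", seq_along(raw), sep = "")
```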
Thank you…