1 - 4 Subsetting

This document discusses various techniques for subsetting or extracting subsets of data from R objects like vectors, matrices, and lists. It explains that the [ operator returns an object of the same class, [[ extracts a single element, and $ extracts elements by name. It provides examples of subsetting vectors, matrices, and lists, and techniques like removing NAs, partial matching of names, and subsetting nested elements of lists. The key subsetting operators and their behaviors are concisely summarized.


Introduction to the R Language

Data Types and Basic Operations


Roger Peng, Associate Professor
Johns Hopkins Bloomberg School of Public Health
Subsetting
There are a number of operators that can be used to extract subsets of R objects.
[ always returns an object of the same class as the original and can be used to select more than one element (with one exception: by default, subsetting a matrix down to a single row, column, or element returns a vector)
[[ is used to extract elements of a list or a data frame; it can only extract a single element, and the class of the returned object will not necessarily be a list or data frame
$ is used to extract elements of a list or data frame by name; its semantics are similar to those of [[
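
As a minimal sketch of the difference between the three operators, the class of each result can be inspected with class(). The list x below is a made-up example used only for illustration.

```r
# Hypothetical list, used only to compare the three operators
x <- list(foo = 1:4, bar = 0.6)

class(x[1])     # "list": [ keeps the class of the original object
class(x[[1]])   # "integer": [[ returns the element itself
class(x$bar)    # "numeric": $ behaves like [[ with a literal name
```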


Subsetting
> x <- c("a", "b", "c", "c", "d", "a")
> x[1]
[1] "a"
> x[2]
[1] "b"
> x[1:4]
[1] "a" "b" "c" "c"
> x[x > "a"]
[1] "b" "c" "c" "d"
> u <- x > "a"
> u
[1] FALSE TRUE TRUE TRUE TRUE FALSE
> x[u]
[1] "b" "c" "c" "d"
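
The session above indexes with a logical vector. A related sketch, assuming the same x, converts the logical vector to integer positions with which() and checks that both forms select the same elements.

```r
x <- c("a", "b", "c", "c", "d", "a")

u <- x > "a"       # logical vector, one entry per element of x
idx <- which(u)    # integer positions where u is TRUE
idx                # 2 3 4 5

# Logical and integer indices pick out the same elements
identical(x[u], x[idx])   # TRUE
```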

Subsetting a Matrix
Matrices can be subset in the usual way with (i, j) type indices, where i is the row index and j is the column index.
Indices can also be missing.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[2, 1]
[1] 2
> x[1, ]
[1] 1 3 5
> x[, 2]
[1] 3 4
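
The same (i, j) notation also accepts index vectors and negative indices. The following is a small sketch on the same matrix; the particular selections are chosen only for illustration.

```r
x <- matrix(1:6, 2, 3)

cols <- x[, c(1, 3)]   # columns 1 and 3: still a 2 x 2 matrix
dim(cols)              # 2 2

row2 <- x[-1, ]        # drop row 1; the single remaining row comes back as a vector
row2                   # 2 4 6

big <- x[x > 3]        # logical indexing treats the matrix as a vector
big                    # 4 5 6
```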

Subsetting a Matrix
By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 × 1 matrix. This behavior can be turned off by setting drop = FALSE.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[1, 2, drop = FALSE]
     [,1]
[1,]    3

Subsetting a Matrix
Similarly, subsetting a single column or a single row will give you a vector, not a matrix (by default).
> x <- matrix(1:6, 2, 3)
> x[1, ]
[1] 1 3 5
> x[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5
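
One way to see what drop = FALSE changes is to look at dim() of each result. This is a sketch on the same matrix, not part of the original slides.

```r
x <- matrix(1:6, 2, 3)

dim(x[1, ])                        # NULL: the result is a plain vector
dim(x[1, , drop = FALSE])          # 1 3: still a matrix with one row
dim(x[, 2, drop = FALSE])          # 2 1: still a matrix with one column
is.matrix(x[1, 2, drop = FALSE])   # TRUE: a 1 x 1 matrix
```

Keeping the matrix structure can matter when later code relies on the result having rows and columns, for example when it calls nrow() or ncol().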

Subsetting Lists
> x <- list(foo = 1:4, bar = 0.6)
> x[1]
$foo
[1] 1 2 3 4
> x[[1]]
[1] 1 2 3 4
> x$bar
[1] 0.6
> x[["bar"]]
[1] 0.6
> x["bar"]
$bar
[1] 0.6

Subsetting Lists
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1, 3)]
$foo
[1] 1 2 3 4
$baz
[1] "hello"

Subsetting Lists
The [[ operator can be used with computed indices; $ can only be used with literal names.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> name <- "foo"
> x[[name]] ## computed index for foo
[1] 1 2 3 4
> x$name ## element 'name' doesn't exist!
NULL
> x$foo ## element 'foo' does exist
[1] 1 2 3 4
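
Computed indices are useful when the element names are only known at run time. The sketch below loops over names(x) and uses [[ to look up each element; the variable names n and lens are introduced here only for illustration.

```r
x <- list(foo = 1:4, bar = 0.6, baz = "hello")

# x[[n]] looks up the element whose name is stored in the variable n;
# x$n would instead look for an element literally called "n"
lens <- sapply(names(x), function(n) length(x[[n]]))
lens   # named integer vector: foo = 4, bar = 1, baz = 1
```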

Subsetting Nested Elements of a List
The [[ operator can take an integer sequence, which is applied recursively: each index selects an element inside the result of the previous one.
> x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
> x[[c(1, 3)]]
[1] 14
> x[[1]][[3]]
[1] 14
> x[[c(2, 1)]]
[1] 3.14
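
A short check, assuming the same x, that the recursive form and the chained form agree, and that names and positions can be combined by chaining.

```r
x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))

# x[[c(1, 3)]] is shorthand for x[[1]][[3]]
identical(x[[c(1, 3)]], x[[1]][[3]])   # TRUE

second_of_a <- x[["a"]][[2]]   # chain a name and a position
second_of_a                    # 12

second_of_b <- x[[c(2, 2)]]    # second element of the vector b
second_of_b                    # 2.81
```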

Partial Matching
Partial matching of names is allowed with [[ and $. The $ operator matches partially by default; [[ does so only when exact = FALSE is set.
> x <- list(aardvark = 1:5)
> x$a
[1] 1 2 3 4 5
> x[["a"]]
NULL
> x[["a", exact = FALSE]]
[1] 1 2 3 4 5
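
Partial matching only succeeds when the prefix identifies a single name. The sketch below adds a second, hypothetical element (apple) so that the prefix "a" becomes ambiguous.

```r
# 'apple' is a made-up element added to create an ambiguous prefix
x <- list(aardvark = 1:5, apple = "fruit")

x$a                        # NULL: "a" matches both names
x$aa                       # 1 2 3 4 5: "aa" matches only aardvark
x[["ap", exact = FALSE]]   # "fruit": unique partial match
x[["ap"]]                  # NULL: [[ requires an exact match by default
```

Because an ambiguous or failed match returns NULL without raising an error, spelling out full names is usually the safer choice in scripts.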

Removing NA Values
A common task is to remove missing values (NAs).
> x <- c(1, 2, NA, 4, NA, 5)
> bad <- is.na(x)
> x[!bad]
[1] 1 2 4 5
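
As a small illustration of why this matters, summary functions such as mean() return NA when any input is missing. The sketch below, using the same x, compares subsetting with the na.rm argument that mean() provides.

```r
x <- c(1, 2, NA, 4, NA, 5)

mean(x)                     # NA: missing values propagate
m1 <- mean(x[!is.na(x)])    # drop the NAs by subsetting first
m2 <- mean(x, na.rm = TRUE) # or ask mean() to ignore them
m1                          # 3
identical(m1, m2)           # TRUE
```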

Removing NA Values
What if there are multiple things and you want to take the subset with no missing values?
> x <- c(1, 2, NA, 4, NA, 5)
> y <- c("a", "b", NA, "d", NA, "f")
> good <- complete.cases(x, y)
> good
[1] TRUE TRUE FALSE TRUE FALSE TRUE
> x[good]
[1] 1 2 4 5
> y[good]
[1] "a" "b" "d" "f"
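
In the example above the NAs happen to sit at the same positions in both vectors. The sketch below uses made-up vectors where they do not, to show that complete.cases() keeps only the positions where every input is present.

```r
# Hypothetical data: the NAs are at different positions in x and y
x <- c(1, NA, 3, 4)
y <- c("a", "b", NA, "d")

good <- complete.cases(x, y)
good        # TRUE FALSE FALSE TRUE
x[good]     # 1 4
y[good]     # "a" "d"
```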

Removing NA Values
> airquality[1:6, ]
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
> good <- complete.cases(airquality)
> airquality[good, ][1:6, ]
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
7    23     299  8.6   65     5   7
8    19      99 13.8   59     5   8
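
A quick sanity check on the filtered data frame, sketched with the built-in airquality dataset. It compares the complete.cases() approach with na.omit(), a base R function that drops incomplete rows in one step; na.omit() is not used in the original slides.

```r
good <- complete.cases(airquality)
clean <- airquality[good, ]

nrow(airquality)    # 153 rows in total
nrow(clean)         # 111 rows with no missing values
anyNA(clean)        # FALSE

# na.omit() keeps the same rows
identical(rownames(clean), rownames(na.omit(airquality)))   # TRUE
```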