Module 8

The document is a course outline for DA112: Introduction to R, focusing on dataframes, which are essential data structures in R. It covers topics such as creating dataframes, their attributes, subsetting, and manipulating data within them. Additionally, it discusses combining dataframes using functions like rbind, cbind, and merge.

Bachelor of Science (Honours) in Data Science and Artificial Intelligence

DA112: Introduction to R
Course Instructor: Ayon Borthakur
TAs: Rahul Goswami, Tenzin Dawa, Kumar Sanu

Optional Note: Usage of this material for teaching and education is considered fair use. However, posting
images to a website or commercial usage could be considered copyright infringement. In such cases, you are required to
reach out to the author(s) or the Institute for permission before use.
Week 8
Dataframes
Learning Objectives

01 What are Dataframes?

02 How to create Dataframes

03 Dataframe Attributes

04 Replacing/Deleting/Coercing in Dataframe

05 Apply function over dataframe

Introduction

Introduction to Dataframes

● Dataframes are the most common data structure in R

● Dataframes are two-dimensional, table-like structures with rows and columns
● Dataframes can store different types of data in each column
● Internally dataframes are lists of vectors of equal length
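The last bullet can be checked directly in the console. The following is a minimal sketch; the data frame `people` and its values are made up for illustration and do not come from the slides.

```r
# Illustrative data frame; the name and values are assumptions for this sketch
people <- data.frame(
  name = c("Alice", "Bob"),
  age  = c(25, 30)
)

typeof(people)    # storage type of a data frame is "list"
is.list(people)   # TRUE: a data frame is a list with extra attributes
length(people)    # number of list elements, i.e. the number of columns
lengths(people)   # length of each column; all equal to the number of rows
```

Because each column is just a list element, a column can hold its own type as long as it has the same length as the others.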

Creating Dataframes

● Dataframes can be created using the data.frame() function

# Create a dataframe
df <- data.frame(
name = c("Alice", "Bob", "Charlie"),
age = c(25, 30, 35),
married = c(TRUE, FALSE, TRUE)
)

# Print the dataframe

print(df)
## name age married
## 1 Alice 25 TRUE
## 2 Bob 30 FALSE
## 3 Charlie 35 TRUE
note : An important change that happened in R 4.0.0 is that stringsAsFactors is now FALSE by default. This
means that character vectors are no longer automatically converted to factors when creating a data frame. If
you want to convert a character vector to a factor, you need to do it explicitly using the factor() function.
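A short sketch of the explicit conversion described in the note, assuming R 4.0.0 or later. The object names `df_chr` and `df_fac` are illustrative.

```r
# Character columns stay character by default (assumes R >= 4.0.0)
df_chr <- data.frame(name = c("Alice", "Bob", "Alice"))
class(df_chr$name)

# Explicit conversion of one column to a factor
df_chr$name <- factor(df_chr$name)
class(df_chr$name)
levels(df_chr$name)

# Alternatively, request factor conversion for all character columns at creation
df_fac <- data.frame(name = c("Alice", "Bob", "Alice"), stringsAsFactors = TRUE)
class(df_fac$name)
```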

Column names

The columns of a dataframe are always named. What happens if we try to create a dataframe without
column names?
# Create a dataframe without column names
df <- data.frame(
c("Alice", "Bob", "Charlie"),
c(25, 30, 35),
c(TRUE, FALSE, TRUE)
)

# Print the dataframe

print(df)
## c..Alice....Bob....Charlie.. c.25..30..35. c.TRUE..FALSE..TRUE.
## 1 Alice 25 TRUE
## 2 Bob 30 FALSE
## 3 Charlie 35 TRUE
note : If you create a dataframe without column names, R will automatically generate column names for you.
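The generated names are built from the expressions passed to data.frame() and adjusted to be syntactically valid, which is why they are hard to read. One way to recover readable names afterwards is to assign them with names(). This is a sketch; the object name `df_unnamed` is illustrative.

```r
# Same data frame as above, created without column names
df_unnamed <- data.frame(
  c("Alice", "Bob", "Charlie"),
  c(25, 30, 35),
  c(TRUE, FALSE, TRUE)
)

# Replace the automatically generated names with readable ones
names(df_unnamed) <- c("name", "age", "married")
names(df_unnamed)
```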

Dataframe over a matrix

● Dataframes are similar to matrices, but with some important differences

● Dataframes can store different types of data in each column
(df <- data.frame(
name = c("Petra", "Jochen", "Alexander"), # character
age = c(35L, 21L, 12L), # integer
height = c(1.72, 1.65, 1.39), # numeric
austrian = c(FALSE, TRUE, TRUE), # logical
stringsAsFactors = FALSE # default
))
## name age height austrian
## 1 Petra 35 1.72 FALSE
## 2 Jochen 21 1.65 TRUE
## 3 Alexander 12 1.39 TRUE

Let's take a look at the structure

str(df)
## 'data.frame': 3 obs. of 4 variables:
## $ name : chr "Petra" "Jochen" "Alexander"
## $ age : int 35 21 12
## $ height : num 1.72 1.65 1.39
## $ austrian: logi FALSE TRUE TRUE

Dataframe Attributes

There are several attributes and properties that can be queried (and in part set) for a dataframe.
The following are some of the important ones, listed by the function used to access them

● row.names() : The row names of the dataframe

● colnames() : The column names of the dataframe
● names() : The column names of the dataframe
● class() : The class of the dataframe
● dim() : The dimensions of the dataframe
● nrow() : The number of rows in the dataframe
● ncol() : The number of columns in the dataframe
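Of the items listed above, only the names, the class and the row names are stored as attributes on the object. The dimensions are computed on request. A minimal sketch, using an illustrative data frame `df_attr`:

```r
df_attr <- data.frame(name = c("Petra", "Jochen"), age = c(35L, 21L))

# The attributes actually stored on a data frame
names(attributes(df_attr))

# There is no stored "dim" attribute; dim() derives it from rows and columns
attr(df_attr, "dim")
dim(df_attr)
```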

row.names and colnames

# Setting row names

row.names(df) <- c("Row 1", "Row 2", "Row 3")

# Setting column names

colnames(df) <- c("Name", "Age", "Height", "Austrian")
df
## Name Age Height Austrian
## Row 1 Petra 35 1.72 FALSE
## Row 2 Jochen 21 1.65 TRUE
## Row 3 Alexander 12 1.39 TRUE

names and class

# Getting the column names

names(df)
## [1] "Name" "Age" "Height" "Austrian"
# Getting the class of the dataframe
class(df)
## [1] "data.frame"

dim, nrow and ncol

# Getting the dimensions of the dataframe

dim(df)
## [1] 3 4
# Getting the number of rows in the dataframe
nrow(df)
## [1] 3
# Getting the number of columns in the dataframe
ncol(df)
## [1] 4

Subsetting Dataframes

Subsetting is one of the vital operations in data manipulation, and since
dataframes are the most used data structure, we must understand the different
ways in which a dataframe can be subset. There are various ways to
subset a dataframe in R.

List style subsetting

# List style subsetting

df[1] # returns a subdataframe with the first column
## name
## 1 Petra
## 2 Jochen
## 3 Alexander

List style subsetting continued

# List style subsetting

df['name'] # returns a subdataframe with the column named 'name'
## name
## 1 Petra
## 2 Jochen
## 3 Alexander

List style subsetting continued

# List style subsetting

df[c('name', 'age')] # returns a subdataframe with the columns 'name' and 'age'
## name age
## 1 Petra 35
## 2 Jochen 21
## 3 Alexander 12

List style subsetting continued

# List style subsetting

df$name # returns a vector with the column named 'name'
## [1] "Petra" "Jochen" "Alexander"
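The difference between single brackets, double brackets and `$` can be summarised in a short sketch. The data frame `df_sub` is an illustrative copy of part of the demo data.

```r
df_sub <- data.frame(
  name = c("Petra", "Jochen", "Alexander"),
  age  = c(35L, 21L, 12L)
)

col_df  <- df_sub["name"]     # single brackets: a data frame with one column
col_vec <- df_sub[["name"]]   # double brackets: the column itself, as a vector

class(col_df)
class(col_vec)
identical(col_vec, df_sub$name)   # $ and [[ ]] extract the same vector

# [[ ]] also accepts a column name stored in a variable, which $ does not
wanted <- "age"
df_sub[[wanted]]
```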

List style subsetting continued

# List style subsetting by logical vector

# The logical vector is recycled to c(TRUE, FALSE, TRUE, TRUE) for the four columns,
# so this returns a subdataframe with the first, third and fourth columns
df[c(TRUE, FALSE, TRUE)]
## name height austrian
## 1 Petra 1.72 FALSE
## 2 Jochen 1.65 TRUE
## 3 Alexander 1.39 TRUE

Matrix style subsetting

# Matrix style subsetting

df[1, ] # returns a subdataframe with the first row
## name age height austrian
## 1 Petra 35 1.72 FALSE

Matrix style subsetting continued

# Matrix style subsetting

df[2:3, 1:3] # returns a subdataframe with the second and third row and the first three columns
## name age height
## 2 Jochen 21 1.65
## 3 Alexander 12 1.39

The subset function

The subset() function subsets a dataframe by a row condition and a column selection

Usage : subset(x, subset, select, drop = FALSE, ...)
● x : dataframe (or an object coercible to a dataframe)
● subset : logical expression indicating elements or rows to keep, missing
values are taken as FALSE
● select : expression, indicating columns to select from x
● drop : passed on to the ‘[’ indexing operator.

The subset function continued

A demo dataframe to demonstrate the subset function

(df <- data.frame(
name = c("Petra", "Jochen", "Alexander"), # character
age = c(35L, 21L, 12L), # integer
height = c(1.72, 1.65, 1.39), # numeric
austrian = c(FALSE, TRUE, TRUE), # logical
stringsAsFactors = FALSE # default
))
## name age height austrian
## 1 Petra 35 1.72 FALSE
## 2 Jochen 21 1.65 TRUE
## 3 Alexander 12 1.39 TRUE
Subset the rows with age greater than 18 and keep the columns name and height
subset(df, age > 18, select = c(name, height))
## name height
## 1 Petra 1.72
## 2 Jochen 1.65
Students (Try it out): It can also be done using the df[] notation
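One possible solution to the exercise, as a sketch. The object names `df_demo`, `via_subset` and `via_brackets` are illustrative. Note one difference: subset() treats missing values in the condition as FALSE, while bracket indexing with an NA in the condition produces a row of NAs.

```r
df_demo <- data.frame(
  name     = c("Petra", "Jochen", "Alexander"),
  age      = c(35L, 21L, 12L),
  height   = c(1.72, 1.65, 1.39),
  austrian = c(FALSE, TRUE, TRUE)
)

# subset(): columns can be referred to by bare name
via_subset <- subset(df_demo, age > 18, select = c(name, height))

# Bracket notation: logical row condition, character vector of column names
via_brackets <- df_demo[df_demo$age > 18, c("name", "height")]

all.equal(via_subset, via_brackets)
```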

The subset function continued

Subset only those who are not Austrian from the dataframe

subset(df, !austrian)
## name age height austrian
## 1 Petra 35 1.72 FALSE

The subset function continued

The return of subset() will be a data frame if the first argument x is of class
data frame, except if we select a single column and set drop = TRUE. In this case we
will only get a vector, in the example below a logical vector.

subset(df, age >= 18, austrian, drop = TRUE)

## [1] FALSE TRUE
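The same drop behaviour exists in bracket indexing, where it is the default for a single column when rows are also given. A minimal sketch with an illustrative copy of the demo data frame (`df_drop`):

```r
df_drop <- data.frame(
  name     = c("Petra", "Jochen", "Alexander"),
  age      = c(35L, 21L, 12L),
  austrian = c(FALSE, TRUE, TRUE)
)

# One column selected in matrix style: the result is simplified to a vector
as_vector <- df_drop[df_drop$age >= 18, "austrian"]

# drop = FALSE keeps the data frame structure
as_frame <- df_drop[df_drop$age >= 18, "austrian", drop = FALSE]

class(as_vector)
class(as_frame)
```

Setting drop = FALSE explicitly is a common defensive choice in code that must always receive a data frame back.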

Replacing/Deleting/Adding variables

To delete a variable from a dataframe, you can set it to NULL

# Delete the column 'name'
df$name <- NULL
print(df)
## age height austrian
## 1 35 1.72 FALSE
## 2 21 1.65 TRUE
## 3 12 1.39 TRUE

Replacing/Deleting/Adding variables continued

# Adds a completely new variable

df$nationality <- ifelse(df$austrian, "AT", NA)
# Replaces an existing column
df$height <- as.integer(df$height * 100)
# Replace one element
df$age[2] <- 102
# Print resulting data frame
df
## age height austrian nationality
## 1 35 172 FALSE <NA>
## 2 102 165 TRUE AT
## 3 12 139 TRUE AT

Coercion

To some extent, objects in R can be coerced to a data frame. For example, a matrix
can be coerced to a data frame.

mat <- matrix(1:6, nrow = 2, dimnames = list(c("Row 1", "Row 2"), LETTERS[1:3]))
df <- as.data.frame(mat)
print(df)
## A B C
## Row 1 1 3 5
## Row 2 2 4 6

Coercion of a heterogeneous dataframe to a matrix

df <- data.frame(
name = c("Alice", "Bob", "Charlie"),
age = c(25, 30, 35),
married = c(TRUE, FALSE, TRUE)
)

# Coerce to matrix and back to data frame

(df2 <- as.data.frame(as.matrix(df)))
## name age married
## 1 Alice 25 TRUE
## 2 Bob 30 FALSE
## 3 Charlie 35 TRUE
# Check if the two dataframes are identical
identical(df, df2)
## [1] FALSE

Coercion of a heterogeneous dataframe to a matrix continued

In the previous example df and df2 are not identical. Why?
Let's check the mean of age in df2
mean(df2$age)
## Warning in mean.default(df2$age): argument is not numeric or logical: returning NA
## [1] NA
We got NA. Why? Let's check the structure of df2
str(df2)
## 'data.frame': 3 obs. of 3 variables:
## $ name : chr "Alice" "Bob" "Charlie"
## $ age : chr "25" "30" "35"
## $ married: chr "TRUE" "FALSE" "TRUE"
The age column is coerced to character in the process of coercion since the matrix can only have
one type of data.
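If the round trip through a matrix has already happened, the original types can be restored column by column. The following is a sketch under the assumption that the columns still hold text that parses cleanly. The names `df_orig` and `df_back` are illustrative.

```r
df_orig <- data.frame(
  name    = c("Alice", "Bob", "Charlie"),
  age     = c(25, 30, 35),
  married = c(TRUE, FALSE, TRUE)
)

# Round trip through a matrix: every column becomes text
df_back <- as.data.frame(as.matrix(df_orig))

# Convert the columns back explicitly; as.character() and trimws() guard
# against factor columns and padded strings
df_back$age     <- as.numeric(trimws(as.character(df_back$age)))
df_back$married <- as.logical(trimws(as.character(df_back$married)))

str(df_back)
mean(df_back$age)
```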

list to dataframe

# y has only two elements and is recycled to match the four elements of x
df <- as.data.frame(list(x = c(1, 2, 3, 4), y = c("A", "B")))
print(df)
## x y
## 1 1 A
## 2 2 B
## 3 3 A
## 4 4 B
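Recycling only works when the longer length is a whole multiple of the shorter one. A minimal sketch of both cases, with illustrative object names (`ok`, `res`); the error is caught with tryCatch() so the script keeps running.

```r
# Length 2 is recycled to length 4: this works
ok <- as.data.frame(list(x = 1:4, y = c("A", "B")))
nrow(ok)

# Length 3 cannot be recycled to length 4: this raises an error,
# which is captured here as a message string
res <- tryCatch(
  as.data.frame(list(x = 1:4, y = c("A", "B", "C"))),
  error = function(e) conditionMessage(e)
)
res
```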

list to dataframe continued

as.list(df)
## $x
## [1] 1 2 3 4
##
## $y
## [1] "A" "B" "A" "B"

Combining dataframes

The usual cbind and rbind functions can be used to combine dataframes.
df1 <- data.frame(
name = c("Alice", "Bob", "Charlie"),
age = c(25, 30, 35),
married = c(TRUE, FALSE, TRUE)
)

df2 <- data.frame(

name = c("David", "Eve"),
age = c(40, 45),
married = c(FALSE, TRUE)
)

Combining dataframes continued

# Combine the two dataframes

df <- rbind(df1, df2)

print(df)
## name age married
## 1 Alice 25 TRUE
## 2 Bob 30 FALSE
## 3 Charlie 35 TRUE
## 4 David 40 FALSE
## 5 Eve 45 TRUE
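For data frames, rbind() matches columns by name rather than by position, and it stops with an error when the names differ. A small sketch with made-up one-row data frames (`a`, `b`, `c_df` are illustrative names):

```r
a <- data.frame(name = "Alice", age = 25)
b <- data.frame(age = 40, name = "David")   # same columns, different order

# Columns are matched by name, so the values end up in the right place
ab <- rbind(a, b)
ab

# Different column names: rbind() refuses to combine the data frames
c_df <- data.frame(name = "Eve", height = 1.70)
msg <- tryCatch(rbind(a, c_df), error = function(e) conditionMessage(e))
msg
```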

Combining dataframes continued

df1 <- data.frame(

name = c("Alice", "Bob", "Charlie"),
age = c(25, 30, 35),
married = c(TRUE, FALSE, TRUE)
)

df2 <- data.frame(

name = c("Alice", "Bob", "Charlie"),
height = c(1.72, 1.65, 1.39),
austrian = c(FALSE, TRUE, TRUE)
)

# Combine the two dataframes

df <- cbind(df1, df2)
print(df)
## name age married name height austrian
## 1 Alice 25 TRUE Alice 1.72 FALSE
## 2 Bob 30 FALSE Bob 1.65 TRUE
## 3 Charlie 35 TRUE Charlie 1.39 TRUE

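Note that the result above contains two columns called name. A sketch of one way to avoid the duplicate, using small illustrative data frames (`left`, `right`): drop the repeated column from the second data frame before binding. This assumes the rows are already in the same order in both data frames, since cbind() does not check that.

```r
left  <- data.frame(name = c("Alice", "Bob"), age = c(25, 30))
right <- data.frame(name = c("Alice", "Bob"), height = c(1.72, 1.65))

both <- cbind(left, right)
names(both)          # "name" appears twice

# Drop the first column of 'right' before binding
trimmed <- cbind(left, right[-1])
names(trimmed)
```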
Combining dataframes continued

We can also use the merge function to combine dataframes

df1 <- data.frame(
name = c("Alice", "Bob", "Charlie"),
age = c(25, 30, 35),
married = c(TRUE, FALSE, TRUE)
)

df2 <- data.frame(

name = c("Alice", "Bob", "Charlie"),
height = c(1.72, 1.65, 1.39),
austrian = c(FALSE, TRUE, TRUE)
)

# Combine the two dataframes

df <- merge(df1, df2, by = "name")
print(df)
## name age married height austrian
## 1 Alice 25 TRUE 1.72 FALSE
## 2 Bob 30 FALSE 1.65 TRUE
## 3 Charlie 35 TRUE 1.39 TRUE
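Unlike cbind(), merge() matches rows by the key values, so the row order and even the set of keys may differ between the two data frames. The sketch below uses made-up data (the entry "Dora" is hypothetical) and shows the default behaviour next to all = TRUE, which keeps unmatched rows and fills the gaps with NA.

```r
people  <- data.frame(name = c("Alice", "Bob", "Charlie"), age = c(25, 30, 35))
heights <- data.frame(name = c("Charlie", "Alice", "Dora"),
                      height = c(1.39, 1.72, 1.60))

# Default: keep only names present in both data frames
inner <- merge(people, heights, by = "name")
inner

# all = TRUE: keep every name, filling missing values with NA
outer <- merge(people, heights, by = "name", all = TRUE)
outer
```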

Graphical summary

A quick graphical summary of the different (correct and wrong) ways of combining data frames. We have three small data frames with two observations each. The first two (left) share the same variable names and contain the geographical location of some cities. The last (right) shares the same values in one column (name) with the data frame at the bottom left, but contains different information.

Graphical summary

Row binding: As the two data frames on the left have the same number of variables (columns) we can use rbind(df1, df2) to combine them. Warning: base R does not […] binds them together!

Graphical summary

Column binding: When having two objects with the same number of rows, we can call cbind(df2, df3). Again, cbind() does not care about what is in there, it just combines them.

Graphical summary

Merging: merge(df2, df3, by = "name") combines the information correctly. It compares the values in x$name and y$name and correctly combines the information.

Apply Functions

We have already seen the apply function in the previous weeks, let's see some more

Function    Return value
lapply      list
sapply      tries to simplify the result to a vector or matrix
vapply      similar to sapply, but allows you to specify the type of the return value

apply functions usage

● lapply(X, FUN, ...)
● sapply(X, FUN, ...)
● vapply(X, FUN, FUN.VALUE, ...)

Let's create a dataframe for the demo
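The code that builds the demo data frame is not shown on this slide. The outputs on the following slides (column names, classes and means) are consistent with the data frame used in the earlier sections, so a reasonable assumption is the following definition.

```r
# Assumed demo data frame: reconstructed from the outputs that follow,
# identical to the one used in the earlier sections
df <- data.frame(
  name     = c("Petra", "Jochen", "Alexander"),  # character
  age      = c(35L, 21L, 12L),                   # integer
  height   = c(1.72, 1.65, 1.39),                # numeric
  austrian = c(FALSE, TRUE, TRUE),               # logical
  stringsAsFactors = FALSE
)
```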

lapply

lapply(df, class)
## $name
## [1] "character"
##
## $age
## [1] "integer"
##
## $height
## [1] "numeric"
##
## $austrian
## [1] "logical"

sapply

sapply(df, class)
## name age height austrian
## "character" "integer" "numeric" "logical"

sapply continued

sapply(df, length)
## name age height austrian
## 3 3 3 3

sapply continued

sapply(df, mean)
## Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA
## name age height austrian
## NA 22.6666667 1.5866667 0.6666667

sapply continued

sapply(df, function(x) if(is.numeric(x)) mean(x) else x)

## $name
## [1] "Petra" "Jochen" "Alexander"
##
## $age
## [1] 22.66667
##
## $height
## [1] 1.586667
##
## $austrian
## [1] FALSE TRUE TRUE
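The call above returns a list because the results have different lengths and types, so sapply() cannot simplify them. A sketch of one alternative: select the numeric columns first, then compute the means. The name `df_num` is an illustrative copy of the demo data frame.

```r
df_num <- data.frame(
  name     = c("Petra", "Jochen", "Alexander"),
  age      = c(35L, 21L, 12L),
  height   = c(1.72, 1.65, 1.39),
  austrian = c(FALSE, TRUE, TRUE)
)

# Logical vector marking the numeric columns (integer and double)
num_cols <- sapply(df_num, is.numeric)
num_cols

# Every result now has length one, so sapply() simplifies to a named vector
col_means <- sapply(df_num[num_cols], mean)
col_means
```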

vapply

Return must be character

vapply(df, class, "")

## name age height austrian
## "character" "integer" "numeric" "logical"

vapply continued

Return must be integer

vapply(df, length, vector("integer", 1)) # Return must be integer

## name age height austrian
## 3 3 3 3
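The benefit of vapply() is that a result of the wrong type stops with an error instead of silently changing the shape of the output. A minimal sketch with an illustrative copy of the demo data frame (`df_v`); the error is caught with tryCatch() so the script keeps running.

```r
df_v <- data.frame(
  name   = c("Petra", "Jochen", "Alexander"),
  age    = c(35L, 21L, 12L),
  height = c(1.72, 1.65, 1.39)
)

# Each mean is a single double, matching the template numeric(1)
res_ok <- vapply(df_v[c("age", "height")], mean, numeric(1))
res_ok

# class() returns a character string, which does not match numeric(1),
# so vapply() raises an error (captured here as a message string)
res_err <- tryCatch(
  vapply(df_v, class, numeric(1)),
  error = function(e) conditionMessage(e)
)
res_err
```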

Thank you
