Data Frames: Nptel Noc18-Cs28

This document discusses dataframes in R. It explains how to create dataframes from vectors, access rows and columns, edit dataframe elements, add and delete rows and columns, and resolve issues with manipulating character columns that are converted to factors. Dataframes allow storing and manipulating tabular data in R and are a fundamental data structure.

Uploaded by

tkpatil

Data science for Engineers

Data frames

Dataframes-1 NPTEL NOC18-CS28 1



In this lecture
• Dataframe
• Create
• Access rows and columns
• Edit
• Add new rows and columns


Dataframes: Create dataframe


Data frames are generic data objects in R, used to store tabular data.

Code

# Introduction to data frames


vec1 = c(1,2,3)
vec2 = c("R","Scilab","Java")
vec3 = c("For prototyping",
"For prototyping","For Scaleup")
df = data.frame(vec1,vec2,vec3)
print(df)
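
As a hedged sketch of the same idea, columns can be given explicit names when the data frame is created, and a few base R functions report its shape and column types. The names df_named, id, tool, and purpose are illustrative assumptions, not part of the lecture.

```r
# Same values as on the slide, but with explicit column names
# (df_named, id, tool, and purpose are illustrative names).
df_named = data.frame(
  id      = c(1, 2, 3),
  tool    = c("R", "Scilab", "Java"),
  purpose = c("For prototyping", "For prototyping", "For Scaleup")
)

print(dim(df_named))    # number of rows and number of columns
print(names(df_named))  # column names
str(df_named)           # type of each column
```

str() is a quick way to check whether a text column is stored as character or as a factor.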


Create a dataframe using data from a file


• A dataframe can also be created by reading data from a file using the following command
   • newDF = read.table(file = "path of the file")

• In the path, use ‘/’ instead of ‘\’
   • Example: "C:/Users/hii/Documents/R/R-Workspace/"

• A separator can also be specified to distinguish between entries. The default, sep = "", splits on any white space
   • newDF = read.table(file = "path of the file", sep = ",")   (for example, for a comma-separated file)
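
The following is a minimal, self-contained sketch of reading a file with an explicit separator. It writes a small temporary file first so that it can run anywhere; the file contents and the column names id and tool are made up for illustration.

```r
# Write a small comma-separated file to a temporary location
# (contents and column names are illustrative assumptions).
tmp = tempfile(fileext = ".csv")
writeLines(c("id,tool", "1,R", "2,Scilab", "3,Java"), tmp)

# header = TRUE treats the first line as column names;
# sep = "," tells read.table that entries are separated by commas.
newDF = read.table(file = tmp, header = TRUE, sep = ",")
print(newDF)

unlink(tmp)  # remove the temporary file
```

Without header = TRUE, the first line would be read as a data row and the columns would get default names.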


Accessing rows and columns


• df[val1,val2] refers to row “val1”, column “val2”. Each can be a number or a string (a row or column name)
• “val1” or “val2” can also be a vector of values like “1:2” or “c(1,3)”
• df[val2] (no comma) refers to column “val2” only

Code


# accessing first & second row:
print(df[1:2,])
# accessing first & second column:
print(df[,1:2])
# accessing 1st & 2nd column –
# alternate:
print(df[1:2])
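
Columns can also be addressed by name rather than by position. The sketch below rebuilds the lecture's df and shows a few equivalent forms; it is an illustration of base R indexing, not code from the slides.

```r
vec1 = c(1, 2, 3)
vec2 = c("R", "Scilab", "Java")
vec3 = c("For prototyping", "For prototyping", "For Scaleup")
df = data.frame(vec1, vec2, vec3)

print(df[2, "vec2"])                   # single element: row 2, column vec2
print(df[c(1, 3), c("vec1", "vec3")])  # rows 1 and 3 of two named columns
print(df$vec1)                         # one column, returned as a vector
print(df[["vec1"]])                    # the same column via double brackets
print(df["vec1"])                      # no comma: a one-column data frame
```

Note the difference between the last two forms: df[["vec1"]] returns the column's values, while df["vec1"] returns a data frame.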


Subset
subset() extracts a subset of the data based on conditions.

Code


# Data frame example 2
pd = data.frame("Name"=c("Senthil","Senthil","Sam","Sam"),
"Month"=c("Jan","Feb","Jan","Feb"),
"BS" = c(141.2,139.3,135.2,160.1),
"BP" = c(90,78,80,81))
# keep rows where Name is "Senthil" or BS exceeds 150
pd2 = subset(pd, Name=="Senthil" | BS > 150)
print("new subset pd2")
print(pd2)
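
The same selection can be written with a logical vector inside the square brackets. This is a sketch of an equivalent base R form; the names keep and pd2_alt are illustrative.

```r
pd = data.frame("Name" = c("Senthil", "Senthil", "Sam", "Sam"),
                "Month" = c("Jan", "Feb", "Jan", "Feb"),
                "BS" = c(141.2, 139.3, 135.2, 160.1),
                "BP" = c(90, 78, 80, 81))

# One TRUE/FALSE per row: TRUE where the condition holds
keep = pd$Name == "Senthil" | pd$BS > 150

# Rows go before the comma; leaving the column slot empty keeps all columns
pd2_alt = pd[keep, ]
print(pd2_alt)
```

Inside subset() the column names can be used directly, whereas the bracket form needs the pd$ prefix.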


Editing dataframes
Dataframes can be edited by direct assignment.

Code
# Introduction to dataframes
vec1 = c(1,2,3)
vec2 = c("R","Scilab","Java")
vec3 = c("For prototyping",
"For prototyping","For Scaleup")
df = data.frame(vec1,vec2,vec3)
print(df)
# replace the 2nd entry of the 2nd column
df[[2]][2] = "R"
print(df)
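
Direct assignment also works with row and column indices, with $, and with a condition. The sketch below edits only the numeric column vec1, so it behaves the same whether or not the text columns are stored as factors; the replacement values are arbitrary.

```r
vec1 = c(1, 2, 3)
vec2 = c("R", "Scilab", "Java")
vec3 = c("For prototyping", "For prototyping", "For Scaleup")
df = data.frame(vec1, vec2, vec3)

df[2, "vec1"] = 20            # row 2 of column vec1
df$vec1[3] = 30               # row 3 of the same column, via $
df[df$vec1 > 10, "vec1"] = 0  # every row where vec1 exceeds 10
print(df)
```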


Editing dataframes
• A dataframe can also be edited using the edit() command
• Create an instance of a data frame and pass it to edit() to open a table editor, where changes can be made manually
Code
# Editing a data frame
myTable = data.frame()
myTable = edit(myTable)


Adding extra rows and columns


An extra row can be added with the “rbind” function and an extra column with “cbind”.

Code


# continuing from previous example
# adding extra row and column:
df = rbind(df,data.frame(vec1=4,
vec2="C", vec3="For Scaleup"))
print("adding extra row")
print(df)
df = cbind(df,vec4=c(10,20,30,40))
print("adding extra col")
print(df)
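
A column can also be added by assigning to a new name with $, which is a common alternative to cbind. The sketch below starts from the original three-row df; the column name vec5 and the computed values are illustrative.

```r
vec1 = c(1, 2, 3)
vec2 = c("R", "Scilab", "Java")
vec3 = c("For prototyping", "For prototyping", "For Scaleup")
df = data.frame(vec1, vec2, vec3)

df$vec4 = c(10, 20, 30)      # new column by assignment; length must match the rows
df$vec5 = df$vec1 * df$vec4  # new column computed from existing columns
print(df)
```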


Deleting rows and columns


There are several ways to delete a row or column; some cases are shown below.

• A ‘-’ sign before a value removes it: before the ‘,’ it removes rows, and after the ‘,’ it removes columns
• ‘!’ means “not”: the rows or columns which satisfy the condition are dropped

Code

# continuing from previous example
# Deleting rows and columns:
df2 = df[-3,-1]
print(df2)
# conditional deletion:
df3 = df[,!names(df) %in% c("vec3")]
print(df3)
df4 = df[!df$vec1==3,]
print(df4)
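
Two further base R forms are sketched below: assigning NULL to a column removes it, and != expresses the same row condition as the ‘!’ form above. The data frame is rebuilt here so the block runs on its own; df5 and df6 are illustrative names.

```r
# Four-row data frame matching the state after the rbind/cbind example
df = data.frame(vec1 = c(1, 2, 3, 4),
                vec2 = c("R", "Scilab", "Java", "C"),
                vec3 = c("For prototyping", "For prototyping",
                         "For Scaleup", "For Scaleup"),
                vec4 = c(10, 20, 30, 40))

df5 = df
df5$vec4 = NULL           # drop the column named vec4
print(df5)

df6 = df[df$vec1 != 3, ]  # keep rows where vec1 is not 3
print(df6)
```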


Manipulating rows – the factor issue


• In R versions before 4.0.0, character columns become factors by default when a data.frame is created; from R 4.0.0 onward they stay as character unless stringsAsFactors = TRUE is set
• A factor variable stores a character column as a fixed set of categories, called factor levels

Code

# Manipulating rows in data frame


# continued from previous page
df[3,1]= 3.1
df[3,3]= "Others"
print(df)

Notice that NA is displayed instead of the string “Others”, and that the accompanying warning mentions an invalid factor level.
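
The behaviour can be reproduced on any R version by asking for factors explicitly. The sketch below sets stringsAsFactors = TRUE, which is an assumption made here so that the example does not depend on the version's default.

```r
vec1 = c(1, 2, 3)
vec2 = c("R", "Scilab", "Java")
vec3 = c("For prototyping", "For prototyping", "For Scaleup")

# Force character columns to be stored as factors
df = data.frame(vec1, vec2, vec3, stringsAsFactors = TRUE)
print(levels(df$vec3))  # the levels fixed at creation time

# "Others" is not one of those levels, so R issues a warning
# and stores NA in that cell instead
df[3, 3] = "Others"
print(df)
```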

Resolving factor issue


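New entries need to be consistent with the factor levels, which are fixed
when the dataframe is first created. Creating the dataframe with
stringsAsFactors = FALSE keeps character columns as plain strings, so any
new value can be assigned.

Code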


vec1 = c(1,2,3)
vec2 = c("R","Scilab","Java")
vec3 = c("For prototyping",
"For prototyping","For Scaleup")
df = data.frame(vec1,vec2,vec3,
stringsAsFactors = FALSE)
# Now trying the same manipulation
df[3,3]= "Others"
print(df)
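
When a data frame already contains factor columns, two other approaches are possible; they are sketched here as alternatives and are not part of the lecture. One extends the factor levels before assigning, the other converts the column to character. The value "Python" is an arbitrary example.

```r
vec1 = c(1, 2, 3)
vec2 = c("R", "Scilab", "Java")
vec3 = c("For prototyping", "For prototyping", "For Scaleup")
df = data.frame(vec1, vec2, vec3, stringsAsFactors = TRUE)

# Option 1: add the new level first, then assign it
levels(df$vec3) = c(levels(df$vec3), "Others")
df[3, 3] = "Others"

# Option 2: convert the factor column to character, then assign freely
df$vec2 = as.character(df$vec2)
df[3, 2] = "Python"

print(df)
```

Option 1 keeps the column as a factor, which may matter if later code relies on the levels; option 2 gives up the levels in exchange for unrestricted values.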
