R Programming Cont..
Data Frames
• Data frames in R are generic data objects
used to store tabular data. A data frame can
be thought of as a matrix in which each column
may hold a different data type. An R data frame
is made up of three principal components: the
data, the rows, and the columns.
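As a small sketch of these three components, the snippet below builds a data frame whose columns hold different data types and then inspects its rows and columns. The name `students` and its values are made up purely for illustration.

```r
# Hypothetical example data: each column has its own type
students <- data.frame(
  name   = c("Asha", "Ravi", "Meena"),   # character
  age    = c(21, 23, 22),                # numeric
  passed = c(TRUE, FALSE, TRUE),         # logical
  stringsAsFactors = FALSE
)

nrow(students)            # number of rows
ncol(students)            # number of columns
names(students)           # column names
sapply(students, class)   # data type of each column
str(students)             # compact overview of the whole structure
```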
Create a data frame
• x <- data.frame(GENDER = c("M", "F", "M", "F"),
                  AGE = c(25, 18, 2, 8),
                  WEIGHT = c(2, 5, 2, 8),
                  HEIGHT = c(6, 3, 2, 4))
• print(x)
# output
#   GENDER AGE WEIGHT HEIGHT
# 1      M  25      2      6
# 2      F  18      5      3
# 3      M   2      2      2
# 4      F   8      8      4
# R program to create dataframe
dplyr distinct() function returns the unique rows of an R dataframe. When called without column
arguments, this function returns unique rows by checking values on all columns.
• # Create dataframe
df <- data.frame(id = c(11, 11, 33, 44, 44),
                 pages = c(32, 32, 33, 22, 22),
                 name = c("spark", "spark", "R", "java", "jsp"),
                 chapters = c(76, 76, 11, 15, 15),
                 price = c(144, 144, 321, 567, 567))
df
# Load library dplyr
library(dplyr)
# Distinct rows
df2 <- df %>% distinct()
df2
• # Distinct on selected columns
df2 <- df %>% distinct(id, pages)
df2
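When distinct() is given selected columns, the result keeps only those columns. A minimal sketch, assuming dplyr is installed and reusing the same values as the dataframe above, shows the `.keep_all = TRUE` argument, which keeps the remaining columns from the first row of each distinct combination. The name `df3` is introduced here only for illustration.

```r
library(dplyr)

df <- data.frame(id = c(11, 11, 33, 44, 44),
                 pages = c(32, 32, 33, 22, 22),
                 name = c("spark", "spark", "R", "java", "jsp"),
                 chapters = c(76, 76, 11, 15, 15),
                 price = c(144, 144, 321, 567, 567))

# Distinct on id and pages, but keep every column;
# the first row of each (id, pages) combination is retained
df3 <- df %>% distinct(id, pages, .keep_all = TRUE)
df3
```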
dplyr::arrange() Examples
The dplyr arrange() function sorts the rows of an R dataframe in ascending or descending order based on
column values.
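A short sketch of arrange(), assuming dplyr is installed. The `books` dataframe and its values are hypothetical and exist only to show ascending order (the default) and descending order via desc().

```r
library(dplyr)

# Hypothetical example data
books <- data.frame(
  name  = c("spark", "R", "java", "jsp"),
  price = c(144, 321, 567, 250)
)

# Ascending order by price (the default)
asc <- books %>% arrange(price)
asc

# Descending order by price, using desc()
dsc <- books %>% arrange(desc(price))
dsc
```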
• # R program to extract
• # data from the data frame
•
• # creating a data frame
• friend.data <- data.frame(
• friend_id = c(1:5),
• friend_name = c("Sachin", "Sourav",
• "Dravid", "Sehwag",
• "Dhoni"),
• stringsAsFactors = FALSE
• )
•
• # Extracting friend_name column
• result <- data.frame(friend.data$friend_name)
• print(result)
A data frame in R can be expanded by adding new
columns and rows to an existing data frame.
• # R program to expand
• # the data frame
•
• # creating a data frame
• friend.data <- data.frame(
• friend_id = c(1:5),
• friend_name = c("Sachin", "Sourav",
• "Dravid", "Sehwag",
• "Dhoni"),
• stringsAsFactors = FALSE
• )
•
• # Expanding data frame
• friend.data$location <- c("Kolkata", "Delhi",
• "Bangalore", "Hyderabad",
• "Chennai")
• resultant <- friend.data
• # print the modified data frame
• print(resultant)
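The program above adds a column. Adding a row can be sketched with base R's rbind(), assuming the new row is supplied as a one-row data frame with matching column names. The `new.friend` row and its values are hypothetical.

```r
# Same starting data frame as above
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# Hypothetical new row; column names must match the existing ones
new.friend <- data.frame(
  friend_id = 6L,
  friend_name = "Kumble",
  stringsAsFactors = FALSE
)

# Append the new row to the bottom of the data frame
friend.data <- rbind(friend.data, new.friend)
print(friend.data)
```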
Columns and rows can also be removed from
an existing R data frame.
• library(dplyr)
• # Create a data frame
• data <- data.frame(
• friend_id = c(1, 2, 3, 4, 5),
• friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
• location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
• )
•
• # Remove a row with friend_id = 3
• data <- subset(data, friend_id != 3)
•
• # Remove the 'location' column
• data <- select(data, -location)
• print(data)
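The same removals can also be sketched in base R, without loading dplyr: logical indexing drops the row, and assigning NULL drops the column. This is one possible alternative under the same example data, not the only way to do it.

```r
# Same example data as above
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

# Remove the row with friend_id = 3 using logical indexing
data <- data[data$friend_id != 3, ]

# Remove the 'location' column by assigning NULL to it
data$location <- NULL

print(data)
```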