
R Data Frame - Javatpoint

Uploaded by

Arjun Kumar
Copyright
© All Rights Reserved
R Data Frame

A data frame is a two-dimensional, array-like structure, or table, in which each column contains
the values of one variable and each row contains one set of values, one from each column. A data
frame is a special case of a list in which every component has the same length.

A data frame is used to store tabular data. Its columns are vectors held together in a list,
and all of these vectors are of equal length.

Put simply, it is a list of equal-length vectors. A matrix can contain only one type of data, but
a data frame can contain different data types such as numeric, character, factor, etc.
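
The difference between a data frame and a matrix can be checked directly. The following is a minimal sketch using a small hypothetical data frame named df (the name and values are illustrative and not part of the tutorial's running example):

```r
# Hypothetical example data; the names are illustrative only.
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

print(is.list(df))        # a data frame is a list of columns
print(sapply(df, class))  # each column keeps its own type
print(lengths(df))        # every column has the same length

# A matrix holds a single type, so mixed columns are coerced to character.
m <- as.matrix(df)
print(typeof(m))
```

Converting to a matrix is therefore only safe when all columns already share one type.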

A data frame has the following characteristics:

The column names should be non-empty.

The row names should be unique.

The data stored in a data frame can be of factor, numeric, or character type.

Each column contains the same number of data items.
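
Two of these characteristics can be observed by deliberately breaking them. The sketch below is illustrative only: the data and the variable names res, res2 and df are assumptions, and tryCatch() is used so that the errors are captured instead of stopping the script.

```r
# Columns whose lengths cannot be reconciled are rejected.
# (Shorter vectors are recycled only when the lengths divide evenly.)
res <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) conditionMessage(e)
)
print(res)

# Duplicate row names are rejected as well.
df <- data.frame(a = 1:3)
res2 <- tryCatch(
  {
    rownames(df) <- c("x", "x", "y")
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(res2)
```

In both cases the captured value is the error message, not a data frame.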

How to create Data Frame


In R, data frames are created with the data.frame() function. This function accepts
vectors of any type, such as numeric, character, or integer. In the example below,
we create a data frame that contains an employee id (integer vector), an employee
name (character vector), a salary (numeric vector), and a starting date (Date vector).

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 915.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Printing the data frame.
print(emp.data)

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 915.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27

Getting the structure of R Data Frame


In R, we can inspect the structure of a data frame. R provides a built-in function called str(),
which compactly displays the structure of an object: the number of observations and variables,
and the name, type, and first few values of each column. In the example below, we create a data
frame from vectors of different data types and display its structure.

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Printing the structure of data frame.
str(emp.data)

Output

'data.frame':	5 obs. of  4 variables:
 $ employee_id  : int  1 2 3 4 5
 $ employee_name: chr  "Shubham" "Arpita" "Nishka" "Gunjan" ...
 $ sal          : num  623 515 611 729 843
 $ starting_date: Date, format: "2012-01-01" "2013-09-23" ...
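
Individual pieces of the information reported by str() can also be queried one at a time with other base R functions. The following is a small sketch on a cut-down, two-column version of the employee data (the reduced data frame is an assumption made to keep the example short):

```r
# Cut-down version of the employee data, for illustration only.
emp.data <- data.frame(
  employee_id = 1:5,
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25)
)

print(dim(emp.data))            # number of rows and columns
print(nrow(emp.data))           # number of rows (observations)
print(ncol(emp.data))           # number of columns (variables)
print(names(emp.data))          # column names
print(sapply(emp.data, class))  # class of each column
```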

Extracting data from Data Frame


To manipulate the data in a data frame, we first need to extract it. We can extract the data
in three ways, which are as follows:

1. We can extract specific columns from a data frame using the column names.
2. We can extract specific rows from a data frame.
3. We can extract specific rows together with specific columns.

Let's see an example of each to understand how data is extracted from a data frame.

Extracting the specific columns from a data frame

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting specific columns from a data frame
final <- data.frame(emp.data$employee_id, emp.data$sal)
print(final)

Output

  emp.data.employee_id emp.data.sal
1                    1       623.30
2                    2       515.20
3                    3       611.00
4                    4       729.00
5                    5       843.25
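
Building a new data frame from emp.data$... expressions prefixes the column names with emp.data., as the output above shows. An alternative that keeps the original names is to index by column name with square brackets. The following is a minimal sketch, with the date column omitted for brevity; sal_vec is an illustrative variable name:

```r
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# Select columns by name; the result is a data frame with the original names.
final <- emp.data[, c("employee_id", "sal")]
print(names(final))

# Double brackets (or $) return a single column as a plain vector.
sal_vec <- emp.data[["sal"]]
print(sal_vec)
```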

Extracting the specific rows from a data frame


Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting the first row from a data frame
final <- emp.data[1, ]
print(final)

# Extracting the last two rows from a data frame
final <- emp.data[4:5, ]
print(final)

Output

  employee_id employee_name   sal starting_date
1           1       Shubham 623.3    2012-01-01
  employee_id employee_name    sal starting_date
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
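
Rows can also be selected by a condition instead of by position, since the row index accepts a logical vector. The sketch below is an illustration; the thresholds and the variable names high and early are arbitrary choices, and the date column is omitted for brevity:

```r
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# Keep only the rows where the salary is above 700.
high <- emp.data[emp.data$sal > 700, ]
print(high)

# Conditions can be combined with & (and) or | (or).
early <- emp.data[emp.data$sal > 600 & emp.data$employee_id <= 3, ]
print(early)
```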

Extracting specific rows corresponding to specific columns


Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting the 2nd and 3rd rows of the 1st and 4th columns
final <- emp.data[c(2, 3), c(1, 4)]
print(final)

Output

  employee_id starting_date
2           2    2013-09-23
3           3    2014-11-15
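
One detail to be aware of when combining row and column indices: selecting a single column returns a plain vector unless drop = FALSE is given. The sketch below illustrates this, and also shows base R's subset() function as an alternative way to pick rows and columns together. The variable names and the threshold are illustrative, and the date column is omitted for brevity:

```r
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# A single selected column is simplified to a vector by default.
ids_vec <- emp.data[2:3, 1]
print(class(ids_vec))

# drop = FALSE keeps the result as a one-column data frame.
ids_df <- emp.data[2:3, 1, drop = FALSE]
print(class(ids_df))

# subset() selects rows by a condition and columns by name in one call.
well_paid <- subset(emp.data, sal > 600, select = c(employee_name, sal))
print(well_paid)
```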

Modification in Data Frame


R allows us to modify a data frame. As with matrices, we modify a data frame through
re-assignment. We can not only add rows and columns, but also delete them. The data frame
expands when rows and columns are added.

We can

1. Add a column by passing a column vector and a new column name to the cbind() function.
2. Add rows by passing new rows, in the same structure as the existing data frame, to the
rbind() function.
3. Delete columns by assigning a NULL value to them.
4. Delete rows by re-assigning the data frame with those rows excluded.

Let's see an example to understand how the rbind() and cbind() functions work and how a
data frame is modified.

Example: Adding rows and columns

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# Adding a row to the data frame
x <- list(6, "Vaishali", 547, "2015-09-01")
rbind(emp.data, x)

# Adding a column to the data frame
y <- c("Moradabad", "Lucknow", "Etah", "Sambhal", "Khurja")
cbind(emp.data, Address = y)

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
6           6      Vaishali 547.00    2015-09-01
  employee_id employee_name    sal starting_date   Address
1           1       Shubham 623.30    2012-01-01 Moradabad
2           2        Arpita 515.20    2013-09-23   Lucknow
3           3        Nishka 611.00    2014-11-15      Etah
4           4        Gunjan 729.00    2014-05-11   Sambhal
5           5         Sumit 843.25    2015-03-27    Khurja
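
Note that rbind() and cbind() return a new data frame and leave emp.data unchanged, which is why the sixth row is missing from the last table above. To keep a change, the result has to be assigned back. The following is a minimal sketch on a shortened version of the data; new_row is an illustrative name, and the $ assignment is shown as a common alternative to cbind():

```r
emp.data <- data.frame(
  employee_id = 1:2,
  employee_name = c("Shubham", "Arpita"),
  stringsAsFactors = FALSE
)

# Build the new row as a one-row data frame with matching column names.
new_row <- data.frame(
  employee_id = 3L,
  employee_name = "Nishka",
  stringsAsFactors = FALSE
)

# Assign the result back so that the added row is kept.
emp.data <- rbind(emp.data, new_row)

# A new column can also be added by assigning to a new name with $.
emp.data$Address <- c("Moradabad", "Lucknow", "Etah")
print(emp.data)
```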

Example: Delete rows and columns

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# Deleting the first row from the data frame
emp.data <- emp.data[-1, ]
print(emp.data)

# Deleting a column from the data frame
emp.data$starting_date <- NULL
print(emp.data)

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal starting_date
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal
2           2        Arpita 515.20
3           3        Nishka 611.00
4           4        Gunjan 729.00
5           5         Sumit 843.25
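
The same re-assignment pattern extends to several rows or columns at once. The following sketch is an illustration on a shortened version of the data (the date column is omitted, and the choice of rows and columns to drop is arbitrary). It also resets the row names, which otherwise keep their original labels after rows are deleted:

```r
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# Drop rows 1 and 3 with a negative index vector.
emp.data <- emp.data[-c(1, 3), ]

# Drop columns by name; drop = FALSE keeps the result a data frame.
emp.data <- emp.data[, !(names(emp.data) %in% c("sal")), drop = FALSE]

# Reset the row names so that they run from 1 again.
rownames(emp.data) <- NULL
print(emp.data)
```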

Summary of data in Data Frames

In some cases, we need a statistical summary of the data in a data frame. R provides the
summary() function for this purpose. It takes the data frame as an argument and returns
summary statistics for each column: the minimum, quartiles, median, mean, and maximum for
numeric and Date columns, and the length, class, and mode for character columns. Let's see
an example to understand how this function is used in R:

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# Printing the summary
print(summary(emp.data))

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27

  employee_id employee_name           sal        starting_date
 Min.   :1    Length:5           Min.   :515.2   Min.   :2012-01-01
 1st Qu.:2    Class :character   1st Qu.:611.0   1st Qu.:2013-09-23
 Median :3    Mode  :character   Median :623.3   Median :2014-05-11
 Mean   :3                       Mean   :664.4   Mean   :2014-01-14
 3rd Qu.:4                       3rd Qu.:729.0   3rd Qu.:2014-11-15
 Max.   :5                       Max.   :843.2   Max.   :2015-03-27
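
The figures in the sal column of the summary can be reproduced for a single column with the corresponding base R functions. The following is a small sketch on the salary values from the example; avg and med are illustrative variable names. Note that summary() rounds its display to a few significant digits, so the unrounded results can differ slightly from the table above.

```r
emp.data <- data.frame(
  employee_id = 1:5,
  sal = c(623.3, 515.2, 611.0, 729.0, 843.25)
)

# summary() also works on a single column.
print(summary(emp.data$sal))

# The individual statistics, without the display rounding.
avg <- mean(emp.data$sal)
med <- median(emp.data$sal)
print(avg)
print(med)
print(range(emp.data$sal))  # minimum and maximum
```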
