R Programming-Chapiter 6

Chapter 6 discusses data manipulation in R, highlighting its statistical capabilities and built-in datasets. It covers how to use existing datasets, import and export files, and perform basic statistical analyses using various functions. The chapter also includes an example of analyzing the 'cars' dataset, detailing steps for exploration and visualization.

Uploaded by

memoiremath1

Chapter 6
DATA MANIPULATION IN R
Our goal here is to provide some landmarks that give an idea of the features R offers
for statistical and data analysis. R is an environment within which many classical and
modern statistical techniques have been implemented [4]. Some of these techniques are
built into the base environment, but many are supplied as packages. About 25 packages
are distributed with a base installation of R (the so-called “standard” and
“recommended” packages) [4]; many other packages are contributed, must be installed
by the user [2], and are available through the CRAN family of Internet sites
(via https://CRAN.R-project.org) and elsewhere.

6.1 DataSets in R
The R programming language has many built-in datasets that can usually be used as
sample data to illustrate the performance of R functions.

6.1.1 What is a DataSet

A dataset is a collection of data presented as a table, in which each row is typically
an observation and each column a variable.

6.1.2 Using Existing DataSet in R

For more details about the datasets package in R, consult
https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html. It lists the
datasets available in R, which can be used and explored using statistical functions.
Table 6.1 presents a few of the datasets that exist in R.

N°  DataSet Name in R  Description
1   cars               Speed and Stopping Distances of Cars.
2   BJsales.lead       Sales Data with Leading Indicator.
3   iris               Edgar Anderson’s Iris Data.
4   airquality         New York Air Quality Measurements.
5   Nile               Flow of the River Nile.

Table 6.1: Sample Existing DataSet in R.

Example: In this example, we explore the dataset "airquality". To display the
dataset, we simply write the name of the dataset inside print(), as shown in
Table 6.2.

N°  Function  Example                 Description                              Execution result
1   print     print(airquality)       Display the data of the dataset.         All data
2   dim       dim(airquality)         Get the dimensions of the dataset.       153 6
3   nrow      nrow(airquality)        Get the number of rows.                  153
4   ncol      ncol(airquality)        Get the number of columns.               6
5   names     names(airquality)       Get the names of the dataset variables.  All names
6   $         print(airquality$Temp)  Display all values of the Temp variable. All values
7   sort      sort(airquality$Temp)   Sort the values of the Temp variable.    Sorted values

Table 6.2: Functions to Get Information About the Dataset.
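
The calls in Table 6.2 can be tried together in one short script. This is only a sketch: the variable names (aq_dim, aq_names, temp_sorted) are chosen for illustration, and head() is used instead of printing all rows to keep the output short.

```r
# airquality ships with R, so no import step is needed.
print(head(airquality))            # first rows only; print(airquality) shows everything

aq_dim   <- dim(airquality)        # number of rows, then number of columns
aq_rows  <- nrow(airquality)
aq_cols  <- ncol(airquality)
aq_names <- names(airquality)      # names of the variables (columns)

temp_sorted <- sort(airquality$Temp)   # Temp values in increasing order

print(aq_dim)
print(aq_names)
print(head(temp_sorted))
```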

6.2 Directory functions in R

R has two important predefined functions for working with the working directory
(see Figure 6.1). These functions are the following:

• getwd(): this function is used to get the current working directory.

• setwd(): this function is used to change the current working directory.

Example:

Figure 6.1: Directory functions in R.
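
As a minimal sketch of these two functions, the script below remembers the current working directory, switches to another one, and switches back. The use of tempdir() as the target directory is an assumption made so the example runs on any machine; in practice the path would be a project folder.

```r
old_dir <- getwd()        # remember where we started
print(old_dir)

setwd(tempdir())          # assumption: the session's temporary directory as target
new_dir <- getwd()
print(new_dir)

setwd(old_dir)            # restore the original working directory
```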

6.3 Importing Files in R Software

For statistical analysis, it is important to know the R functions that work with
system directories and that import data from, and export data to, these directories.
R can read different types of files, such as CSV files, text files, Excel workbooks,
SPSS files, SAS files, etc. Table 6.3 shows some functions that can be used to read
such files.

File type   Function
Text file   read.table
CSV file    read.csv
Excel file  read.xlsx (provided by an add-on package such as openxlsx or xlsx, not by base R)

Table 6.3: The reading functions of files in R.

6.3.1 Importing Text Files in R

The function "read.table" reads a text file saved in the current working directory
and imports its data as a data frame, as shown in Figure 6.2:

Figure 6.2: Read Text File in R.
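
Since the figure itself is not reproduced here, the following sketch shows one possible read.table call. The file name example.txt and its contents are made up for illustration, and the file is written to tempdir() rather than the working directory so that nothing is left behind.

```r
# Create a small space-separated text file (hypothetical sample data).
txt_path <- file.path(tempdir(), "example.txt")
writeLines(c("name score", "Ali 12", "Sara 15"), txt_path)

# header = TRUE tells read.table that the first line holds the column names.
scores <- read.table(txt_path, header = TRUE)
print(scores)
```

If the file is in the current working directory, the file name alone, for example read.table("example.txt", header = TRUE), is enough.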

6.3.2 Importing CSV Files in R

The function "read.csv" reads a CSV file saved in the current working directory
and imports the data from that CSV file, as shown in Figure 6.3:

Figure 6.3: Read CSV File in R.
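
A comparable sketch for CSV input follows. The file students.csv and its rows are invented for the example, and tempdir() again stands in for the working directory.

```r
# Create a small comma-separated file (hypothetical sample data).
csv_path <- file.path(tempdir(), "students.csv")
writeLines(c("name,score", "Ali,12", "Sara,15"), csv_path)

# read.csv assumes a header line and comma separators by default.
students <- read.csv(csv_path)
print(students)
```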

6.3.3 Importing Excel Files in R

The function "read.xlsx" reads an Excel file saved in the current working directory
and imports the data from that file, as shown in Figure 6.4. This function is not part
of base R; it is provided by add-on packages such as openxlsx or xlsx, which must be
installed first:

Figure 6.4: Read Excel File in R.
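
The sketch below assumes the openxlsx package, which is one of several packages offering a read.xlsx function; the text does not say which package the figure used. The workbook is created on the fly with made-up data, and the code does nothing except print a message when openxlsx is not installed.

```r
sheet_data <- NULL   # stays NULL when openxlsx is unavailable

if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Write a tiny workbook to a temporary file (hypothetical sample data).
  xlsx_path <- tempfile(fileext = ".xlsx")
  sample_df <- data.frame(name = c("Ali", "Sara"), score = c(12, 15))
  openxlsx::write.xlsx(sample_df, xlsx_path)

  # Read the first sheet back into a data frame.
  sheet_data <- openxlsx::read.xlsx(xlsx_path, sheet = 1)
  print(sheet_data)
} else {
  message("Package 'openxlsx' is not installed; install.packages(\"openxlsx\") adds it.")
}
```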

6.4 Exporting files in R Software

To export data to a file, R software contains some functions that allow saving data
into files of different types.

6.4.1 Exporting Data to Text Files in R

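The function "sink" redirects R's console output to a text file in the current
working directory, so that everything printed afterwards is written to that file
until sink() is called again without arguments, as shown in Figure 6.5: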

Figure 6.5: Export Data to Text File in R.
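
A minimal sketch of exporting with sink() is given below. The output file name summary.txt is hypothetical, and the file is placed in tempdir() instead of the working directory so the example cleans up after itself.

```r
out_path <- file.path(tempdir(), "summary.txt")   # hypothetical output file

sink(out_path)                       # start diverting console output to the file
print(summary(airquality$Temp))      # this goes to the file, not the console
sink()                               # stop diverting; output returns to the console

exported <- readLines(out_path)      # read the file back to check its contents
print(exported)
```

Forgetting the closing sink() call leaves the console silent, because all later output keeps going to the file.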

6.4.2 Exporting to CSV Files in R

The function "write.csv" exports data to a CSV file in the current working
directory, as shown in Figure 6.6:

Figure 6.6: Export Data to CSV File in R.
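
As an illustration, the built-in cars dataset can be written to a CSV file and read back. The file name cars_export.csv is made up, and tempdir() is used in place of the working directory.

```r
csv_out <- file.path(tempdir(), "cars_export.csv")   # hypothetical output file

# row.names = FALSE avoids writing the row numbers as an extra first column.
write.csv(cars, csv_out, row.names = FALSE)

# Read the file back to confirm that the export kept the data intact.
cars_back <- read.csv(csv_out)
print(head(cars_back))
```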

6.5 Basic Statistics

R is a statistical computing language, and many of its built-in functions were
developed for statistical purposes. In this section, we examine some basic statistical
functions and use R to illustrate their application [6].

6.5.1 Statistical Functions in R

Table 6.4 summarizes the most important basic statistical functions found in R,
giving the name of each function, how to call it, and its role.

N°  Function                   Run in R                         Description
1   Mean                       mean(Vector)                     Calculate the average of a vector.
2   Trimmed Mean               mean(Vector, trim = 0.##)        Calculate the mean after discarding the given fraction of the smallest and largest values.
3   Variance                   var(Vector)                      Measure the spread of a vector (sample variance).
4   Standard Deviation         sd(Vector)                       Measure the spread of the data in the vector (square root of the variance).
5   Standard Error             sd(Vector)/sqrt(length(Vector))  Estimate the error associated with the sample mean.
6   Median Absolute Deviation  mad(Vector)                      Calculate the median of the absolute deviations from the median (scaled by a constant by default).
7   Median                     median(Vector)                   Estimate the center of the data in the vector.
8   Minimum                    min(Vector)                      Find the smallest value in the vector.
9   Maximum                    max(Vector)                      Find the largest value in the vector.
10  Range                      max(Vector) - min(Vector)        Calculate the maximum minus the minimum.
11  Quantile                   quantile(Vector, c(##))          Find the values below which the given proportions of the data fall.
12  Interquartile Range        IQR(Vector)                      Calculate the spread of the middle 50% of the data (third quartile minus first quartile).

Table 6.4: The Most Important Basic Statistical Functions in R.
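
To make the table concrete, the functions can be applied to the Temp variable of airquality. This is a sketch only: the 10% trimming fraction and the quartile probabilities are arbitrary choices for illustration, and the variable names are not from the text.

```r
temp <- airquality$Temp

temp_mean    <- mean(temp)
temp_trimmed <- mean(temp, trim = 0.10)        # drop 10% of values from each end
temp_var     <- var(temp)
temp_sd      <- sd(temp)
temp_se      <- temp_sd / sqrt(length(temp))   # standard error of the mean
temp_mad     <- mad(temp)
temp_median  <- median(temp)
temp_range   <- max(temp) - min(temp)
temp_quart   <- quantile(temp, c(0.25, 0.75))  # first and third quartiles
temp_iqr     <- IQR(temp)

print(c(mean = temp_mean, trimmed = temp_trimmed, var = temp_var,
        sd = temp_sd, se = temp_se, mad = temp_mad, median = temp_median,
        range = temp_range, iqr = temp_iqr))
print(temp_quart)
```

For a vector that contains missing values, most of these functions return NA unless na.rm = TRUE is passed.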

Example: We can use the summary() function to get statistical information about
a variable in the dataset, as shown in Figure 6.7. For a numeric variable, this function
returns six summary statistics: minimum, first quartile, median, mean, third quartile,
and maximum. The example shows the statistical information about the Temp variable.

Figure 6.7: Get Statistical information Using summary Functions in R.
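
The call behind this example can be sketched as follows. Storing the result in temp_summary is an illustrative choice that makes the individual statistics accessible by name.

```r
temp_summary <- summary(airquality$Temp)
print(temp_summary)

# The six statistics are stored under these labels.
summary_labels <- names(temp_summary)
print(summary_labels)

# A single statistic can be extracted by its label.
temp_median <- temp_summary[["Median"]]
print(temp_median)
```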

6.6 Data analysis of cars

In this section, we analyze the built-in R dataset called cars. A description of this
dataset can be found by typing "cars" in the Help pane of RStudio (see Figures 6.8
and 6.9).

Figure 6.8: Cars dataset

Figure 6.9: Information about Cars dataset

To analyze this data properly, we follow these steps:

1. Get to know the details of this dataset by using the functions "names()",
"colnames()", and "row.names()".

2. Display the cars data by using the function "View()", which invokes a
spreadsheet-style data viewer within RStudio.

3. Determine the type of the cars data.

4. Use the function "summary()" to summarize each variable of the data frame.

5. Separate the information into two sections by using "summary()[,1]" (the first
column of the summary) and "summary()[2,]" (the second row of the summary).

6. Plot the data, name the x-axis "speed" and the y-axis "stop distance", and give
the plot the title "cars data".

7. Select the variable "speed" and the variable "dist" (the stopping distance) and
plot a histogram of each.

8. Check the ANOVA analysis for the following variables: "cars.1, cars.2, cars.3, cars.4".
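
Steps 1 to 7 above can be sketched as the script below. Several choices are assumptions made for illustration: class() and str() are used for step 3, the plots are drawn on a null PDF device so the script also runs outside RStudio, and the histogram titles are invented. Step 8 is left out because the variables cars.1 to cars.4 are not defined in the text.

```r
# Step 1: names of the variables and of the rows.
var_names <- names(cars)          # for a data frame, same result as colnames(cars)
row_ids   <- row.names(cars)

# Step 2: spreadsheet-style viewer; only useful in an interactive session.
if (interactive()) View(cars)

# Step 3: type and structure of the data (assumed choice of functions).
cars_class <- class(cars)
str(cars)

# Steps 4 and 5: summary table, then its first column and its second row.
cars_summary    <- summary(cars)
first_column    <- cars_summary[, 1]
second_row      <- cars_summary[2, ]
print(cars_summary)

# Steps 6 and 7: scatter plot and histograms.
pdf(file = NULL)                  # assumption: null device; remove to see the plots
plot(cars$speed, cars$dist,
     xlab = "speed", ylab = "stop distance", main = "cars data")
speed_hist <- hist(cars$speed, xlab = "speed", main = "Histogram of speed")
dist_hist  <- hist(cars$dist, xlab = "stop distance", main = "Histogram of dist")
invisible(dev.off())              # close the null device
```

In RStudio, deleting the pdf(file = NULL) and dev.off() lines shows the plots in the Plots pane.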
