
Data Import, Export and Analysis using R

Module-4 of the CSE1006 course focuses on Data Analysis in R, covering data import, cleaning, and exploratory data analysis techniques using the dplyr package. It details methods for importing various file formats (CSV, TXT, Excel) and exporting data, as well as essential dplyr functions for data manipulation. The module emphasizes practical examples and syntax for effective data handling in R.


Course code : CSE1006

Course title : Foundations of Data Analytics

Module-4
Data Analysis

15-03-2025 Dr. V. Srilakshmi 1


Module-4
Data Analysis:
Data Import: Reading Data, Writing Data in R, data
cleaning and summarizing with dplyr package,
Exploratory Data Analysis: Box plot, Histogram, Pie
graph, Line chart, Barplot, Scatter Plot



Importing Data in R Script
• We can read external datasets and operate with them in our R
environment by importing data into an R script.
• R offers a number of functions for importing data from various file
formats.
• First, let’s consider a data set that we can use for the demonstration. For
this demonstration, we will use two examples of a single dataset, one in
.csv form and another .txt



Importing Data in R Script
• Reading a Comma-Separated Value (CSV) File:
Method 1: Using read.csv() Function
• Syntax: read.csv(file.choose(), header)
• This call passes two arguments to read.csv():
• file.choose(): opens a file-selection dialog so that the CSV file can be picked interactively.
• header: indicates whether the first row of the dataset holds the variable names.
Use TRUE if it does, otherwise FALSE (T and F are accepted abbreviations).

# import and store the dataset in data1
data1 <- read.csv(file.choose(), header = TRUE)
# display the data
data1
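
Because file.choose() needs an interactive session, a script that runs unattended can pass a file path instead. The following is a minimal sketch under that assumption; the data frame sample_df and the temporary file are made up for illustration and are not part of the slides.

```r
# Hypothetical sample data, written to a temporary CSV so the example is self-contained
sample_df <- data.frame(id = 1:3, score = c(90.5, 78, 88.25))
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_df, csv_path, row.names = FALSE)

# Read the file back by path instead of using file.choose()
data1 <- read.csv(csv_path, header = TRUE)

# Inspect the column names and types that were detected
str(data1)
```

Passing a path makes the import repeatable, which is usually preferable once the location of the file is known.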


Importing Data in R Script
• Reading a Comma-Separated Value(CSV) File:
Method 2: Using read.table() Function
• Syntax: read.table(file.choose(), header, sep = ",")
• The sep argument tells read.table() which character separates the fields; for
comma-separated data we pass sep = ",".
• Example: Assume that we have some comma-separated data in a .txt file.

[Figure: contents of Sample1.txt]


# import and store the dataset in data2
data2 <- read.table(file.choose(), header = TRUE, sep = ",")
# display the data
data2



Importing Data in R Script
• Reading a Comma-Separated Value (CSV) File:
Method 3: Using read.delim() Function
• Syntax: read.delim(file.choose(), header, sep = ",")
• This call passes three arguments to read.delim():
• file.choose(): opens a file-selection dialog so that the file can be picked interactively.
• header: indicates whether the first row of the dataset holds the variable
names. Use TRUE if it does, otherwise FALSE.
• sep: the field separator. read.delim() defaults to a tab ("\t"), so sep = "," must be given to read a CSV file.

# import and store the dataset in data3
data3 <- read.delim(file.choose(), header = TRUE, sep = ",")
# display the data
data3
Importing Data in R Script
• Reading a Tab-Delimited (.txt) File in R:
Using read.table() Function
• Syntax: read.table(file.choose(), header, sep = "\t")
• The sep argument specifies how the fields are separated; for a tab-delimited
file we pass sep = "\t".

# import and store the dataset in data4
data4 <- read.table(file.choose(), header = TRUE, sep = "\t")
# display the data
data4
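
As a reproducible variant of the call above, the sketch below writes a tiny tab-delimited file and reads it back in two ways. The file contents and the names tsv_path and data5 are assumptions made for illustration only.

```r
# Hypothetical tab-delimited file created on the fly
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("name\tage", "Asha\t21", "Ravi\t23"), tsv_path)

# read.table() needs the separator spelled out
data4 <- read.table(tsv_path, header = TRUE, sep = "\t")

# read.delim() already defaults to header = TRUE and sep = "\t"
data5 <- read.delim(tsv_path)

# Compare the two results
all.equal(data4, data5)
```

For plain tab-delimited files the two calls are interchangeable; read.delim() is simply read.table() with tab-friendly defaults.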



Importing Data in R Script
• Reading an Excel File in R:
Method 1: Using read_excel() from readxl
• read_excel() imports an Excel file; it is available only after the readxl
package has been installed and loaded.
• Syntax: read_excel(path) or read_excel(file.choose(), sheet)
• read_excel() extracts the data from the Excel file and returns it as a
tibble, which is a kind of data frame.
# install once, then load the package
install.packages("readxl")
library(readxl)
# read the first sheet (the default)
Data_gfg <- read_excel(file.choose())
# or name the sheet explicitly
Data_gfg <- read_excel(file.choose(), sheet = 1)
Data_gfg
Importing Data in R Script
• Reading an Excel File in R:
Method 2: Using read.xlsx() from xlsx
• read.xlsx() comes from the xlsx package and reads an Excel file into R.
• Syntax: read.xlsx(file, sheetIndex)
• read.xlsx() extracts the data from the chosen sheet and returns it as an R
data frame.

library(xlsx)
# sheetIndex selects the worksheet to read
Data_gfg <- read.xlsx("Data_gfg.xlsx", sheetIndex = 1)
Data_gfg



Importing Data Using RStudio
• Data can also be imported through the RStudio interface with the following
steps.
• Steps:
1. From the Environment tab, click the Import Dataset menu.
2. Select the file type from the options shown.
3. In the dialog that appears, either enter the file name or browse to the file.
4. The selected file is displayed in a new window along with its dimensions.
5. To see the data in the console, type the name of the imported dataset.

Exporting Data from R Scripts
• When a program terminates, the data held in memory is lost.
• Storing the data in a file preserves it after the program terminates.
• Entering a large amount of data by hand takes a lot of time. If a file already
contains the data, its contents can be loaded with a few commands in R.
• Exporting data to a text file:
• A text file is one of the most common formats for storing data. R provides
several functions for exporting data to a text file.
• write.table():
• The base R function write.table() exports a data frame or a matrix to
a text file.
• RStudio lists the saved file under the name given in the code, and opening
that file shows the exported data.
Exporting Data from R Scripts
• write.table():

• Syntax: write.table(x, file, sep = " ", dec = ".", row.names = TRUE, col.names = TRUE, quote = TRUE)

Parameters:
• x: a matrix or a data frame to be written.
• file: a character string giving the name of the result file.
• sep: the field separator string, e.g., sep = "\t" (for tab-separated values).
• dec: the string to be used as the decimal separator. Default is "."
• row.names: either a logical value indicating whether the row names of x are to be
written along with x, or a character vector of row names to be written.
• col.names: either a logical value indicating whether the column names of x are to be
written along with x, or a character vector of column names to be written.
• quote: a logical value (default TRUE) indicating whether character and factor
columns are surrounded by double quotes.
Exporting Data from R Scripts
• write.table():
• Example:
# R program to illustrate Exporting data from R
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
"Language" = c("R", "Python", "Java"),
"Age" = c(22, 25, 45))

# Export a data frame to a text file using write.table()


write.table(df, file = "myDataFrame.txt",
sep = "\t",
row.names = TRUE,
col.names = NA)
Exporting Data from R Scripts
• write.table():
• Example:
# R program to illustrate Exporting data from R
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
"Language" = c("R", "Python", "Java"),
"Age" = c(22, 25, 45))

# Export a data frame to a text file using write.table()


write.table(df, file = "myDataFrame.txt",
sep = "\t",
row.names = TRUE,
col.names = TRUE)
Exporting Data from R Scripts
• write.table():
• Example:
# R program to illustrate Exporting data from R
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
"Language" = c("R", "Python", "Java"),
"Age" = c(22, 25, 45))

# Export a data frame to a text file using write.table()


write.table(df, file = "myDataFrame.txt",
sep = "\t",
row.names = TRUE,
col.names = TRUE,quote=FALSE)
Exporting Data from R Scripts
• write_tsv():
• write_tsv() from the readr package also exports data as tab-separated ("\t") values.
• Syntax: write_tsv(x, file)
Parameters:
• x: a data frame to be written
• file: the path to the result file (older readr versions named this argument path)
Example:
# R program to illustrate Exporting data from R
# Importing readr library
library(readr)
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
                 "Language" = c("R", "Python", "Java"),
                 "Age" = c(22, 25, 45) )
# Export a data frame using write_tsv()
write_tsv(df, file = "MyDataFrame.txt")
Exporting Data from R Scripts
• write.csv():
• The base R function write.csv() is the recommended way to export data to a CSV file. It
uses "." for the decimal point and a comma (",") for the separator.
• Syntax: write.csv(x, file)
Parameters:
• x: a data frame to be written
• file: the path to the result file
Example:
# R program to illustrate Exporting data from R
# write.csv() is part of base R, so no package needs to be loaded
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
                 "Language" = c("R", "Python", "Java"),
                 "Age" = c(22, 25, 45) )
# Export a data frame using write.csv()
write.csv(df, file = "My_Data.csv")
Exporting Data from R Scripts
• write.csv2():
• This function is very similar to write.csv(), but it uses a comma (",") for the
decimal point and a semicolon (";") for the separator.
• Syntax: write.csv2(x, file)
Parameters:
• x: a data frame to be written
• file: the path to the result file
Example:
# R program to illustrate Exporting data from R
# write.csv2() is part of base R, so no package needs to be loaded
# Creating a dataframe
df = data.frame( "Name" = c("Amiya", "Raj", "Asish"),
                 "Language" = c("R", "Python", "Java"),
                 "Age" = c(22, 25, 45) )
# Export a data frame using write.csv2()
write.csv2(df, file = "My_Data.csv")
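
The difference between write.csv() and write.csv2() is easiest to see with decimal values. The sketch below is illustrative only: the prices data frame and the temporary file names f1 and f2 are assumptions, not part of the slides.

```r
# Hypothetical data with decimal values
prices <- data.frame(item = c("pen", "book"), price = c(1.5, 12.25))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

write.csv(prices, f1, row.names = FALSE)   # "," separator, "." decimal point
write.csv2(prices, f2, row.names = FALSE)  # ";" separator, "," decimal point

# Look at the raw text of each file
readLines(f1)
readLines(f2)

# A file written by write.csv2() is read back with read.csv2()
back <- read.csv2(f2)
back$price
```

write.csv2() exists for locales in which the comma is the decimal mark, so the semicolon is needed to keep fields unambiguous.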
Data cleaning and summarizing with dplyr package
• dplyr is a powerful R-package to transform and summarize tabular data with
rows and columns.
• The package contains a set of functions (or “verbs”) that perform common
data manipulation operations such as filtering for rows, selecting specific
columns, re-ordering rows, adding new columns and summarizing data.
• In addition, dplyr provides group_by() for another common task, the
“split-apply-combine” pattern.
• Install and load dplyr:
To install dplyr:
install.packages("dplyr")
To load dplyr:
library(dplyr)
Data cleaning and summarizing with dplyr
package
• The below are some of the most common dplyr functions:
• rename() : rename columns
• recode() : recode values in a column
• select() : subset columns
• filter() : subset rows on conditions
• mutate() : create new columns by using information from other
columns
• summarise() : create summary statistics on grouped data
• arrange() : sort results
• count() : count discrete values
• group_by() : allows for group operations in the “split-apply-combine”
concept
• %>% : the “pipe” operator connects multiple verb actions together into a pipeline
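
To show how the pipe chains several of these verbs, here is a minimal sketch. The scores data frame and the pass mark of 50 are made-up assumptions for illustration.

```r
library(dplyr)

# Hypothetical marks for four students
scores <- data.frame(student = c("A", "B", "C", "D"),
                     marks = c(72, 45, 88, 60))

# Each verb receives the result of the previous step as its first argument
passed <- scores %>%
  filter(marks >= 50) %>%      # keep rows with at least 50 marks
  arrange(desc(marks)) %>%     # highest marks first
  select(student, marks)       # keep only these two columns

passed
```

Reading the pipeline top to bottom gives the order of operations, which is the main reason the pipe is preferred over nested function calls.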
Data cleaning and summarizing with dplyr
package
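• rename(): It is often necessary to rename variables to make them more
meaningful. The syntax is rename(data, new_name = old_name); the result is a
new data frame, and the original is left unchanged.
• Example:
library(dplyr)
sample <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
sample
# rename Pulse to PULSE1 and Duration to DURATION1
dplyr::rename(sample, PULSE1 = Pulse, DURATION1 = Duration)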
Data cleaning and summarizing with dplyr package
• select(): The select() function picks specific variables (columns) of a
data frame or table.
• Columns can be chosen by name, by position, or with helper functions such as
contains(), matches(), starts_with() and ends_with().
• Syntax: select(data, col1, col2, …)
• This function returns an object of the same type as data.
• Example:
dplyr::select(sample, Training)
dplyr::select(sample, Pulse, Training)

Data cleaning and summarizing with dplyr package
• select(): Select column list either by name or index number
sample <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45))
dplyr::rename(sample,PULSE1=Pulse,DURATION1=Duration)
dplyr::select(sample,Training)
dplyr::select(sample,Pulse,Training)
dplyr::select(sample,1,3)
dplyr::select(sample,1:3)
Data cleaning and summarizing with dplyr package
• select(): Some additional options for selecting columns by specific criteria:
1. starts_with() = Select columns that start with a character string
2. ends_with() = Select columns that end with a character string
3. contains() = Select columns that contain a character string
4. matches() = Select columns that match a regular expression
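
The four helpers listed above can be sketched as follows. The readings data frame and its column names are hypothetical, chosen only so that each helper matches a different set of columns.

```r
library(dplyr)

# Hypothetical data frame; the column names are chosen to exercise each helper
readings <- data.frame(pulse_rest = c(60, 72),
                       pulse_peak = c(150, 165),
                       rest_minutes = c(5, 8),
                       id2 = c(1, 2))

starts <- select(readings, starts_with("pulse"))  # pulse_rest, pulse_peak
ends   <- select(readings, ends_with("rest"))     # pulse_rest
has    <- select(readings, contains("rest"))      # pulse_rest, rest_minutes
rx     <- select(readings, matches("[0-9]$"))     # id2 (name ends in a digit)

names(starts)
names(ends)
names(has)
names(rx)
```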

Data cleaning and summarizing with dplyr
package
filter():
• The filter() function is used to produce a subset of the data frame, retaining all
rows that satisfy the specified conditions.
• The filter() method in R programming language can be applied to both grouped
and ungrouped data. The expressions include comparison operators (==, >, >= )
, logical operators (&, |, !, xor()) , range operators (between(), near()) as well as
NA value check against the column values.
• Syntax: filter(df , condition)
• Parameters :
• df: The data frame object
• condition: filtering based upon this condition
Data cleaning and summarizing with dplyr
package
filter() Example:
#library(dplyr)
df=data.frame(x=c(12,31,4,66,78),
y=c(22.1,44.5,6.1,43.1,99),
z=c(TRUE,TRUE,FALSE,TRUE,TRUE))
df
# condition
dplyr::filter(df, x<50 & z==TRUE)
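
The range helper between() and the NA check mentioned for filter() can be sketched in the same way. The obs data frame below is an assumption made for illustration.

```r
library(dplyr)

# Hypothetical data with one missing value in x
obs <- data.frame(x = c(12, 31, NA, 66, 78),
                  y = c(22.1, 44.5, 6.1, 43.1, 99))

# between(x, 10, 70) is shorthand for x >= 10 & x <= 70;
# rows where the condition evaluates to NA are dropped by filter()
in_range <- filter(obs, between(x, 10, 70))

# Keep only the rows where x is not missing
no_missing <- filter(obs, !is.na(x))

in_range
no_missing
```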

Data cleaning and summarizing with dplyr package
• mutate(): The mutate() function adds new variables to a data frame, computed
from existing variables.
• Syntax: mutate(x, expr)
• The dplyr package provides five main functions in the mutate family, described
below.
• mutate() - adds new variables to a data frame while retaining the old ones.
• transmute() - adds new variables and drops the old ones.
• mutate_all() - changes every variable in a data frame at once.
• mutate_at() - changes specific variables selected by name.
• mutate_if() - changes all variables that satisfy a specific condition.
• Note: since dplyr 1.0.0, mutate_all(), mutate_at() and mutate_if() are
superseded by across() used inside mutate(); they still work.
Data cleaning and summarizing with dplyr package
mutate() Example:
library(dplyr)
# Create a data frame
d <- data.frame( name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
age = c(7, 5, 9, 16),
ht = c(46, NA, NA, 69),
school = c("yes", "yes", "no", "no") )
print(d)
# Calculate a new variable x3, the sum of ht and age (NA wherever ht is missing)
dplyr::mutate(d, x3 = ht + age)

Data cleaning and summarizing with dplyr package
transmute() Example:
# Use transmute() to create a new variable 'age_in_months' and drop the 'age' variable;
# only the variables listed in the call are kept
result <- transmute(d,
                    name = name,
                    age_in_months = age * 12,
                    ht, school)
print(result)

Data cleaning and summarizing with dplyr package
mutate_all() Example: The mutate_all() function changes every variable in a data
frame at once by applying a given function to each of them. Older examples wrap the
function in funs(), which is deprecated in current dplyr; a function name or a ~ formula
can be passed directly.



Data cleaning and summarizing with dplyr package
The use of mutate_all() to divide each column in a data frame by ten is
demonstrated in the code below.
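
The code from the original slide is not available here, so the following is a stand-in sketch under stated assumptions: the marks data frame is made up, and the across() form is shown as the current equivalent.

```r
library(dplyr)

# Hypothetical data frame with numeric columns only
marks <- data.frame(maths = c(50, 80), physics = c(70, 90))

# Divide every column by ten; "." stands for the current column
tenth <- mutate_all(marks, ~ . / 10)

# Equivalent form with across(), which supersedes mutate_all()
tenth_across <- mutate(marks, across(everything(), ~ .x / 10))

tenth
```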



Data cleaning and summarizing with dplyr package
mutate_at() Example: The mutate_at() function changes particular variables,
selected by name.
The use of mutate_at() to divide two particular variables by 10 is demonstrated in the code
below:
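
The slide's own code is not available here, so this is a stand-in sketch; the results data frame and its columns are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical data frame; only maths and physics should be rescaled
results <- data.frame(roll = c(1, 2),
                      maths = c(50, 80),
                      physics = c(70, 90))

# Divide the two named columns by 10 and leave roll untouched
scaled <- mutate_at(results, c("maths", "physics"), ~ . / 10)

scaled
```

With current dplyr the same result can be written as mutate(results, across(c(maths, physics), ~ .x / 10)).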



Data cleaning and summarizing with dplyr package
• mutate_if() Example: All variables that match a specific condition are modified by the
mutate_if() function.
• The mutate_if() method can be used to round any numeric variables to the nearest whole
number using the following example code.
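
The slide's own code is not available here; a minimal stand-in, assuming a made-up data frame named body, could look like this.

```r
library(dplyr)

# Hypothetical data frame mixing character and numeric columns
body <- data.frame(name = c("Abhi", "Dimri"),
                   ht = c(46.4, 68.6),
                   wt = c(20.7, 31.2))

# Apply round() to every column for which is.numeric() returns TRUE
rounded <- mutate_if(body, is.numeric, round)

rounded
```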



mutate_if() Example: The mutate_if() function can be used to change any variables
of type factor to type character, as shown in the code below.
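
The slide's own code is not available here; the sketch below uses a made-up data frame named pupils whose text columns are stored as factors.

```r
library(dplyr)

# Hypothetical data frame with two factor columns and one numeric column
pupils <- data.frame(name = factor(c("Abhi", "Dimri")),
                     school = factor(c("yes", "no")),
                     age = c(7, 16))

# Convert every factor column to character; other columns are left as they are
converted <- mutate_if(pupils, is.factor, as.character)

# Check the class of each column after the conversion
sapply(converted, class)
```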



Data cleaning and summarizing with dplyr
package
summarise_all():
• summarise_all() applies a function to every column of a data frame. The output
is a data frame with one column per original column, each holding the result of
the function.
• Syntax: summarise_all(data, function)
• Arguments:
• data – The data frame whose columns are summarised
• function – The function to apply to all the data frame columns.

Data cleaning and summarizing with dplyr
package
summarise_all() Example:
# creating a data frame
df <- data.frame(col1 = 1:10, col2 = 11:20)
print("original dataframe")
print(df)
print("summarised dataframe")
# column means: col1 = 5.5, col2 = 15.5
dplyr::summarise_all(df, mean)

Data cleaning and summarizing with dplyr package
arrange():
• The arrange() function reorders the rows of a table by the values of the columns
(or expressions) passed to it.
• Syntax: arrange(x, expr)
• Parameters:
• x: data set to be reordered
• expr: one or more column names, or expressions of columns, to sort by
• Example:
#library(dplyr)
d <- data.frame( name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
age = c(7, 5, 9, 16) )
# Arranging name according to the age
d2<- dplyr::arrange(d, age)
print(d2)
Data cleaning and summarizing with dplyr
package
• arrange(): sorts in ascending order by default; other orderings are written as
follows.

• To arrange in descending order:
• arrange(d, desc(age))

• To arrange by one column and then by a second (here rollno stands for an
additional column; it is not part of d as defined earlier):
• arrange(d, age, rollno)
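
Both orderings can be tried on a small data frame. In this sketch the rollno column and its values are hypothetical additions, and the ages are altered so that two rows tie on age.

```r
library(dplyr)

# Hypothetical data: rollno is added for illustration, and two rows share age 7
d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 7, 16),
                rollno = c(12, 4, 3, 9))

# Oldest first
by_age_desc <- arrange(d, desc(age))

# Sort by age, and break ties on age using rollno
by_age_roll <- arrange(d, age, rollno)

by_age_desc
by_age_roll
```

In the second result Chaman (rollno 3) comes before Abhi (rollno 12), because the second column decides the order among rows with the same age.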

Data cleaning and summarizing with dplyr
package
• The count() function is part of the dplyr package, which is widely used for
data manipulation in R. It provides a convenient way to count the
occurrences of unique combinations of variables in a data frame.

• SYNTAX: count(data, ..., sort = FALSE)
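
A minimal sketch of count(), assuming a made-up pets data frame:

```r
library(dplyr)

# Hypothetical data with a repeated categorical value
pets <- data.frame(kind = c("dog", "cat", "dog", "dog", "cat", "fish"))

# One row per distinct value of kind; the count is stored in a column named n.
# sort = TRUE puts the most frequent value first.
by_kind <- count(pets, kind, sort = TRUE)

by_kind
```

count(pets, kind) is shorthand for grouping by kind and then summarising with n().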

Data cleaning and summarizing with dplyr
package
group_by():
• The group_by() function belongs to the dplyr package; it groups the rows of a
data frame by the values of one or more columns.
• On its own, group_by() does not change the data; it only records the grouping. It
is normally followed by summarise() with an appropriate aggregation. It works like
GROUP BY in SQL and a pivot table in Excel.
• Example:
library(dplyr)
df = read.csv("Sample_Superstore.csv")
df_grp_region = df %>% group_by(Region) %>%
summarise(total_sales = sum(Sales),
total_profits = sum(Profit),.groups = 'drop')
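
The example above depends on the file Sample_Superstore.csv. For a version that runs without that file, here is a sketch on a small made-up orders data frame that reuses the same column names (Region, Sales, Profit); the values are invented.

```r
library(dplyr)

# Hypothetical stand-in for the superstore data
orders <- data.frame(Region = c("East", "West", "East", "West", "South"),
                     Sales = c(100, 200, 150, 50, 300),
                     Profit = c(10, 40, 20, -5, 60))

# Split by Region, apply sum() within each group, combine into one row per region
by_region <- orders %>%
  group_by(Region) %>%
  summarise(total_sales = sum(Sales),
            total_profits = sum(Profit),
            .groups = "drop")

by_region
```

The .groups = "drop" argument returns an ungrouped result, so later verbs operate on the whole table again.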

Exploratory Data Analysis
• Exploratory Data Analysis or EDA is a statistical approach or technique for analysing data
sets in order to summarize their important and main characteristics generally by using
some visual aids.
• The EDA approach can be used to gather knowledge about the following aspects of data:
• Main characteristics or features of the data.
• The variables and their relationships.
• Finding out the important variables that can be used in our problem.
• Exploratory Data Analysis in R:
• In R Language, we are going to perform EDA under two broad classifications:
• Descriptive Statistics, which includes mean, median, mode, inter-quartile range, and so on.
• Graphical Methods, which includes Box plot, Histogram, Pie graph, Line chart, Barplot, Scatter
Plot and so on.
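
The descriptive statistics named above can be computed with base R, as in this sketch on the built-in mtcars dataset. Base R has no function for the statistical mode, so the helper stat_mode() is a hypothetical one written for this illustration.

```r
# Miles per gallon from the built-in mtcars dataset
mpg <- mtcars$mpg

mean(mpg)     # arithmetic mean
median(mpg)   # middle value
IQR(mpg)      # inter-quartile range (third quartile minus first quartile)
summary(mpg)  # minimum, quartiles, median, mean, maximum in one call

# Hypothetical helper: the most frequent value (first one wins on ties)
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

stat_mode(mpg)
```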

Exploratory Data Analysis
• Diagrammatic representation of data:
• Diagrams are among the clearest and most engaging ways of presenting data.
• They can be understood by audiences with or without a technical background.

Histograms
• A histogram displays the distribution of a numeric variable as adjacent
rectangles: each rectangle spans one interval (bin) of values, and its height is
proportional to the frequency of values in that interval.
• We can create histograms in R using the hist() function.
• Syntax: hist(v, main, xlab, xlim, ylim, breaks, col, border)
• v: a vector of the numeric values used in the histogram.
• main: the title of the chart.
• col: the fill colour of the bars.
• xlab: the label for the horizontal axis.
• border: the border colour of each bar.
• xlim: the range of values shown on the x-axis.
• ylim: the range of values shown on the y-axis.
• breaks: the suggested number of bins, or a vector of break points between bins.
Histograms
• Example:
# Create data for the graph.
v <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39)
# Create the histogram.
hist(v, xlab = "No.of Articles ",col = "green", border = "black")

Histograms
• Example:
# Create data for the graph.
v <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39)
# Create the histogram.
hist(v, xlab = "No. of Articles", col = "green", border = "black",
     xlim = c(0, 50), ylim = c(0, 5), breaks = 5)

Pie graph
• A pie chart is a circular statistical graphic, which is divided into slices to illustrate
numerical proportions.
• It depicts a special chart that uses “pie slices”, where each sector shows the relative
sizes of data.
• The circle is divided along its radii into sectors that describe relative
frequencies or magnitudes; the chart is also known as a circle graph.
• Syntax: pie(x, labels, main, col, clockwise)
• x: a vector of the numeric values used in the pie chart.
• labels: the descriptions of the slices in the pie chart.
• main: the title of the pie chart.
• clockwise: a logical value indicating whether the slices are
drawn clockwise or anti-clockwise.
• col: the colours of the slices.

Pie graph
• Example:
# write the chart to a PNG file
png(file = "out.png")
Temp <- c(23, 36, 50, 43)
Cities <- c("Bangalore", "Pune", "Chennai", "Amaravati")
# Plot the chart.
pie(Temp, Cities)
# close the device so that the file is saved
dev.off()

Pie graph
• Example:
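# write the chart to a PNG file
png(file = "out.png")
Temp <- c(23, 36, 50, 43)
Cities <- c("Bangalore", "Pune", "Chennai", "Amaravati")
# Plot the chart with a title and one colour per slice.
pie(Temp, Cities, main = "City pie chart",
    col = rainbow(length(Temp)))
# close the device so that the file is saved
dev.off()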

Barplot
• Bar charts are a popular and effective way to visually represent categorical data in a
structured manner.
• R uses the barplot() function to create bar charts. Here, both vertical and Horizontal
bars can be drawn.
• Syntax: barplot(H, xlab, ylab, main, names.arg, col)
• H: This parameter is a vector or matrix containing numeric values which are used in bar chart.
• xlab: This parameter is the label for x axis in bar chart.
• ylab: This parameter is the label for y axis in bar chart.
• main: This parameter is the title of the bar chart.
• names.arg: This parameter is a vector of names appearing under each bar in bar chart.
• col: This parameter is used to give colors to the bars in the graph.

Barplot
• Example:
# write the chart to a PNG file
png(file = "out.png")
# Create the data for the chart
A <- c(17, 32, 8, 53, 1)
# Plot the bar chart
barplot(A, xlab = "X-axis", ylab = "Y-axis", main = "Bar-Chart")
# close the device so that the file is saved
dev.off()

Barplot
• Example:
# Create the data for the chart
A <- c(17, 32, 8, 53, 1)
# Plot the bar chart
barplot(A, horiz = TRUE, xlab = "X-axis", ylab = "Y-axis", main ="Horizontal Bar Chart" )

Scatter Plots
• A "scatter plot" is a type of plot used to display the relationship between two
numerical variables, and plots one dot for each observation.
• It needs two vectors of same length, one for the x-axis (horizontal) and one for the
y-axis (vertical).
• Syntax: plot(x, y, main, xlab, ylab, xlim, ylim, axes)
• x: This parameter sets the horizontal coordinates.
• y: This parameter sets the vertical coordinates.
• xlab: This parameter is the label for horizontal axis.
• ylab: This parameter is the label for vertical axis.
• main: This parameter main is the title of the chart.
• xlim: the limits (minimum and maximum) of the x-axis.
• ylim: the limits (minimum and maximum) of the y-axis.
• axes: This parameter indicates whether both axes should be drawn on the plot.

Scatter Plots
• Example:
# Get the input values.
input <- mtcars[, c('wt', 'mpg')]
# Plot the chart for cars with weight between 1.5 and 4 and mileage between 10 and 25.
plot(x = input$wt, y = input$mpg,
     xlab = "Weight",
     ylab = "Mileage",
     xlim = c(1.5, 4),
     ylim = c(10, 25),
     main = "Weight vs Mileage"
)

Line Graphs
• A line graph is a chart that is used to display information in the form of a series of
data points.
• It utilizes points and lines to represent change over time.
• Line graphs are drawn by plotting different points on their X coordinates and Y
coordinates, then by joining them together through a line from beginning to end.
• Syntax: plot(v, type, col, xlab, ylab, main)
• v: a vector containing the numeric values to plot.
• type: takes one of the following values:
• "p": draws only the points.
• "l": draws only the lines.
• "o": draws both points and lines.
• xlab: the label for the x-axis of the chart.
• ylab: the label for the y-axis of the chart.
• main: the title of the chart.
• col: the colour of both the points and the lines.
Line Graphs
• Example:
# Create the data for the chart.
v <- c(17, 25, 38, 13, 41)

# Plot the line chart with both points and lines.
plot(v, type = "o")

Boxplots
• A boxplot (box-and-whisker plot) summarizes the distribution of a variable, and
can draw one box per group for comparison.
• The plot is based on the five-number summary (minimum, first quartile, median, third
quartile, and maximum).
• Syntax: boxplot(x, data, notch, varwidth, names, main)
• x: a vector or a formula.
• data: the data frame in which a formula is evaluated.
• notch: a logical value. Set to TRUE to draw a notch on each side of the box around the median.
• varwidth: a logical value. Set to TRUE to draw box widths proportional to the square root
of the sample size.
• main: the title of the chart.
• names: the group labels shown under each boxplot.

Boxplots
• Example:
# use head() to display the first six rows of the mtcars dataset
head(mtcars)

# boxplot of the mpg column of the mtcars dataset
boxplot(mtcars$mpg)

Boxplots
• Example:
# add title, labels, and colour; the formula draws one box per number of cylinders
boxplot(mpg ~ cyl, data = mtcars,
        main = "Mileage Data Boxplot",
        ylab = "Miles Per Gallon (mpg)",
        xlab = "No. of Cylinders",
        col = "orange")
