
Experiment 3

R can be used to create various data visualization plots including bar charts, box plots, histograms, line graphs, and scatter plots. Bar charts represent data using rectangular bars of varying lengths, and are created using the barplot() function. Box plots show the distribution of data through minimum, maximum, median and quartile values, and are made with boxplot(). Histograms group data into buckets and show frequencies, using hist(). Line graphs connect points over time using plot(), and multiple lines can be added with lines(). Scatter plots show the relationship between two variables using plot(), placing one on the x-axis and one on the y-axis.

Uploaded by

PUSHPITHA

Aim

To visualize data using various kinds of plots.
Types of data visualization techniques (plots) in R:
In R, we can read data from files stored outside the R environment, for example a CSV file:
• data <- read.csv("input.csv")
• print(data)
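As a minimal sketch of reading a file, the snippet below first writes a small CSV to a temporary file and then reads it back with read.csv(). The file contents, the column names and the temporary path are assumptions made for illustration; the contents of the original input.csv are not shown in this document.

```r
# Write a small illustrative CSV so that the example is self-contained.
# (The names and marks are made-up sample values.)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,marks", "Asha,72", "Ravi,65", "Meena,88"), csv_path)

# read.csv() returns a data frame; the first line is used as the header.
data <- read.csv(csv_path)
print(data)

# Inspect the column types before plotting.
str(data)
```

A numeric column such as data$marks can then be passed directly to the plotting functions described below.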
BAR CHART
• A bar chart represents data in rectangular bars with length of the bar
proportional to the value of the variable.
• R uses the function barplot() to create bar charts. R can draw both
vertical and horizontal bars in the bar chart.
The basic syntax to create a bar-chart in R is −
• barplot(H, xlab, ylab, main, names.arg, col)
• WHERE
• H is a vector or matrix containing numeric values used in bar chart.
• xlab is the label for x axis.
• ylab is the label for y axis.
• main is the title of the bar chart.
• names.arg is a vector of names appearing under each bar.
• col is used to give colors to the bars in the graph.
Example
A simple bar chart is created using just the input
vector and the name of each bar.
• # Create the data for the chart.
• H <- c(7,12,28,3,41)

• # Give the chart file a name.


• png(file = "barchart.png")

• # Plot the bar chart.


• barplot(H)
• # Save the file.
• dev.off()
Bar Chart Labels, Title and Colors
• The features of the bar chart can be expanded by adding more
parameters.
• The main parameter is used to add title. The col parameter is used to
add colors to the bars.
• The names.arg parameter is a vector with the same number of values as the input
vector; it gives the label shown under each bar.
Example
The following script will create and save the bar
chart in the current R working directory.
• # Create the data for the chart.
• H <- c(7,12,28,3,41)
• M <- c("Mar","Apr","May","Jun","Jul")

• # Give the chart file a name.


• png(file = "barchart_months_revenue.png")

• # Plot the bar chart.


• barplot(H, names.arg = M, xlab = "Month", ylab = "Revenue", col = "blue",
main = "Revenue chart", border = "red")
• # Save the file.
• dev.off()
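The text above notes that barplot() can draw horizontal bars and accepts a matrix as well as a vector. A possible sketch of both is given here; the second row of the matrix and the names store_a and store_b are hypothetical values added only to illustrate grouped bars.

```r
# Same values as the revenue example above.
H <- c(7, 12, 28, 3, 41)
M <- c("Mar", "Apr", "May", "Jun", "Jul")

# horiz = TRUE draws horizontal bars, so the axis labels swap places.
# barplot() returns the midpoint of each bar.
mids <- barplot(H, names.arg = M, horiz = TRUE, xlab = "Revenue",
                ylab = "Month", col = "blue", main = "Revenue chart")

# A matrix input gives one group of bars per column.
# The second row is hypothetical data for illustration.
revenue <- rbind(store_a = H, store_b = c(10, 9, 20, 8, 30))
colnames(revenue) <- M
grouped <- barplot(revenue, beside = TRUE, col = c("blue", "orange"),
                   legend.text = rownames(revenue),
                   main = "Revenue by store")
```

With beside = FALSE (the default) the same matrix would be drawn as stacked bars instead of side-by-side groups.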
Box Plot
• A boxplot summarizes how the values in a data set are distributed. Three
quartiles (the first quartile, the median and the third quartile) divide the data set into four parts.
• This graph represents the minimum, maximum, median, first quartile
and third quartile in the data set.
• It is also useful in comparing the distribution of data across data sets
by drawing boxplots for each of them.
Boxplots are created in R by using the
boxplot() function.
• Syntax:
boxplot(x, data, notch, varwidth, names, main)
• Where : x is a vector or a formula.
• data is the data frame.
• notch is a logical value. Set as TRUE to draw a notch.
• varwidth is a logical value. Set as TRUE to draw the width of each box
proportional to the sample size.
• names are the group labels which will be printed under each boxplot.
• main is used to give a title to the graph
• Example
• We use the data set "mtcars" available in the R environment to create
a basic boxplot.
• Let's look at the columns "mpg" and "cyl" in mtcars

• input <- mtcars[,c('mpg','cyl')]


• print(head(input))
Creating the Boxplot
• The below script will create a boxplot graph for the relation between mpg
(miles per gallon) and cyl (number of cylinders).

• # Give the chart file a name.


• png(file = "boxplot.png")
• # Plot the chart.
• boxplot(mpg ~ cyl, data = mtcars, xlab = "Number of Cylinders", ylab =
"Miles Per Gallon", main = "Mileage Data")
• # Save the file.
• dev.off()
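To see the numbers behind each box, boxplot() can be called with plot = FALSE, which returns the statistics instead of drawing. The sketch below uses only the built-in mtcars data and the varwidth parameter described above.

```r
# plot = FALSE returns the summary statistics without drawing anything.
b <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

print(b$names)  # group labels (number of cylinders)
print(b$n)      # number of cars in each group
print(b$stats)  # rows: lower whisker, lower hinge, median, upper hinge, upper whisker

# varwidth = TRUE makes wider boxes for groups with more observations.
boxplot(mpg ~ cyl, data = mtcars, varwidth = TRUE,
        xlab = "Number of Cylinders", ylab = "Miles Per Gallon",
        main = "Mileage Data")
```

Each column of b$stats corresponds to one box in the chart, in the same order as b$names.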
Histograms
• A histogram represents the frequencies of values of a variable
bucketed into ranges. A histogram is similar to a bar chart, but it
groups the values into continuous ranges. The height of each bar
represents the number of values that fall in that range.
• R creates histogram using hist() function.
• This function takes a vector as an input and uses some more
parameters to plot histograms.
• Syntax
• The basic syntax for creating a histogram using R is −
• hist(v,main,xlab,xlim,ylim,breaks,col,border)
• Where : v is a vector containing numeric values used in histogram.
• main indicates title of the chart.
• col is used to set color of the bars.
• border is used to set border color of each bar.
• xlab is used to give description of x-axis.
• xlim is used to specify the range of values on the x-axis.
• ylim is used to specify the range of values on the y-axis.
• breaks controls how the values are grouped into bars: either a vector of break points, or a single number suggesting how many bars to use.
Example
A simple histogram is created using input vector,
label, col and border parameters.
The script given below will create and save the
histogram in the current R working directory.
• # Create data for the graph.
• v <- c(9,13,21,8,36,22,12,41,31,33,19)

• # Give the chart file a name.


• png(file = "histogram.png")

• # Create the histogram.


• hist(v,xlab = "Weight",col = "yellow",border = "blue")
• # Save the file.
• dev.off()
Range of X and Y values

To specify the range of values shown on the X axis and Y axis, we can use the xlim and ylim parameters.
The number of bars, and therefore the width of each bar, can be controlled using breaks.
• # Create data for the graph.
• v <- c(9,13,21,8,36,22,12,41,31,33,19)

• # Give the chart file a name.


• png(file = "histogram_lim_breaks.png")

• # Create the histogram.


• hist(v,xlab = "Weight",col = "green",border = "red", xlim = c(0,40),
ylim = c(0,5), breaks = 5)

• # Save the file.


• dev.off()
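When breaks is a single number, R treats it as a suggestion and picks rounded break points itself. If exact bins are needed, a vector of break points can be passed instead. The sketch below uses the same vector v; the choice of bins of width 10 from 0 to 50 is an assumption made for illustration.

```r
v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)

# Explicit break points: five bins of width 10 covering 0 to 50.
h <- hist(v, breaks = seq(0, 50, by = 10), xlab = "Weight",
          col = "green", border = "red", main = "Histogram of v")

# hist() also returns the bin edges and the count in each bin.
print(h$breaks)
print(h$counts)
```

The counts always add up to the number of values in v, which is a quick check that no value fell outside the chosen break points.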
R - Line Graphs
• A line chart is a graph that connects a series of points by drawing line
segments between them.
• The points are ordered by one of their coordinates (usually the
x-coordinate). Line charts are usually used to identify
trends in data.
• The plot() function in R is used to create the line graph.
Syntax :
• plot(v, type, col, xlab, ylab, main)
• Where :
• v is a vector containing the numeric values.
• type takes the value "p" to draw only the points, "l" to draw only the
lines and "o" to draw both points and lines.
• xlab is the label for x axis.
• ylab is the label for y axis.
• main is the Title of the chart.
• col is used to give colors to both the points and lines.
• Example
• A simple line chart is created using the input vector and the type
parameter as "o". The below script will create and save a line chart in
the current R working directory.
• # Create the data for the chart.
• v <- c(7,12,28,3,41)

• # Give the chart file a name.


• png(file = "line_chart.png")

• # Plot the line chart.


• plot(v,type = "o")

• # Save the file.


• dev.off()
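The three values of the type parameter can be compared side by side. This is a small sketch using the same vector v; the three-panel layout set with par(mfrow = ...) is an illustrative choice and not part of the original example.

```r
v <- c(7, 12, 28, 3, 41)

# Draw three panels in one row and remember the previous layout.
old_par <- par(mfrow = c(1, 3))

# "p" = points only, "l" = lines only, "o" = both points and lines.
for (ty in c("p", "l", "o")) {
  plot(v, type = ty, main = paste("type =", ty))
}

# Restore the previous single-panel layout.
par(old_par)
```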
Multiple Lines in a Line Chart

• More than one line can be drawn on the same chart by using the
lines() function.
• After the first line is plotted, the lines() function can take an additional
vector as input to draw the second line in the chart.
• # Create the data for the chart.
• v <- c(7,12,28,3,41)
• t <- c(14,7,6,19,3)

• # Give the chart file a name.


• png(file = "line_chart_2_lines.png")

• # Plot the line chart.


• plot(v,type = "o",col = "red", xlab = "Month", ylab = "Rain fall", main
= "Rain fall chart")
• lines(t, type = "o", col = "blue")
• # Save the file.
• dev.off()
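Note that lines() draws inside the axes already set up by plot() and does not rescale them. A cautious sketch is to compute the y range over all series first and to add a legend so the lines can be told apart; the legend position and labels are illustrative choices.

```r
v <- c(7, 12, 28, 3, 41)
t <- c(14, 7, 6, 19, 3)

# Use the combined range so that no point of either line is cut off.
y_range <- range(v, t)

plot(v, type = "o", col = "red", ylim = y_range,
     xlab = "Month", ylab = "Rain fall", main = "Rain fall chart")
lines(t, type = "o", col = "blue")

# Label the two series.
legend("topleft", legend = c("v", "t"), col = c("red", "blue"),
       lty = 1, pch = 1)
```

In this particular data set the first vector already spans the range of the second, but the same pattern stays safe when the second series has larger or smaller values.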
R - Scatterplots
• Scatterplots show many points plotted in the Cartesian plane. Each
point represents the values of two variables.
• One variable is chosen in the horizontal axis and another in the
vertical axis.
• The simple scatterplot is created using the plot() function.
Syntax:
• plot(x, y, main, xlab, ylab, xlim, ylim, axes)
Where :
• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label in the horizontal axis.
• ylab is the label in the vertical axis.
• xlim is the limits of the values of x used for plotting.
• ylim is the limits of the values of y used for plotting.
• axes indicates whether both axes should be drawn on the plot.
Example
We use the data set "mtcars" available in the R
environment to create a basic scatterplot. Let's
use the columns "wt" and "mpg" in mtcars.
• input <- mtcars[,c('wt','mpg')]
• print(head(input))
Creating the Scatterplot
• The below script will create a scatterplot graph for the relation
between wt(weight) and mpg(miles per gallon).
• # Get the input values.
• input <- mtcars[,c('wt','mpg')]
• # Give the chart file a name.
• png(file = "scatterplot.png")
• # Plot the chart for cars with weight between 2.5 and 5 and mileage between 15 and 30.
• plot(x = input$wt, y = input$mpg, xlab = "Weight", ylab = "Mileage",
xlim = c(2.5,5), ylim = c(15,30), main = "Weight vs Mileage")
• # Save the file.
• dev.off()
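To make the relationship between the two variables easier to read, a straight trend line can be drawn over the scatterplot. The sketch below is one possible way to do this with lm() and abline(); fitting a linear model is an addition for illustration and not part of the original example.

```r
input <- mtcars[, c("wt", "mpg")]

# Scatterplot of weight against mileage (full range of the data).
plot(x = input$wt, y = input$mpg, xlab = "Weight", ylab = "Mileage",
     main = "Weight vs Mileage")

# Fit a straight line mpg = a + b * wt and draw it on the plot.
fit <- lm(mpg ~ wt, data = input)
abline(fit, col = "red")

# The correlation coefficient summarizes the direction of the relation.
r <- cor(input$wt, input$mpg)
print(r)
```

A negative slope and a negative correlation both indicate that heavier cars in mtcars tend to have lower mileage.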
Test Case-1:
• Create a student dataset consisting of 6 subject marks.
➢ Generate a bar plot.
Test Case-2:
• Create a student dataset consisting of 6 subject marks.
• Show the variance in subject 3 and subject 4.
• Show the variance in subject 1 and subject 5.
➢ Use a box plot.
Test Case-3:
• Create a 5-student attendance dataset for the month of December, consisting of
reg. no. and number of classes attended in each of 6 subjects.
➢ Visualize the students with the highest and lowest attendance values.
[Hint: use the multiple line plot function, lines()]
Solution - Test case 1: Create a student dataset
consisting of 6 subject marks.
Generate a bar plot.
• # Marks in the 6 subjects (the sixth value is illustrative, added so that every subject has a mark).
• H <- c(7,12,28,3,41,18)
• M <- c("sub1","sub2","sub3","sub4","sub5","sub6")

• # Plot the bar chart.

• barplot(H, names.arg = M, xlab = "subjects", ylab = "marks", col = "blue",
main = "Student chart", border = "red")
Solution - Test case 2: Create a student dataset
consisting of 6 subject marks.
Generate a box plot.
• # Marks of each student in the 6 subjects (the sixth value in each vector is illustrative).
• A <- c(7,12,28,3,41,18) # student 1 marks
• B <- c(3,5,15,21,23,30) # student 2 marks
• M <- c("sub1","sub2","sub3","sub4","sub5","sub6")

• # Combine the vectors into a data frame: one row per student, one column per subject.
• student <- as.data.frame(rbind(A, B))
• names(student) <- M

• # Plot the chart: one box per subject. More student rows give more informative boxes.
• boxplot(student[, c("sub3","sub4")], xlab = "subjects", ylab = "marks", main =
"student data")
• boxplot(student[, c("sub1","sub5")], xlab = "subjects", ylab = "marks", main =
"student data")
Solution - Test case 3: Create a 5-student attendance dataset for the month of December, consisting of
the number of classes attended in each of 6 subjects.
Visualize the second and fourth students' attendance values.
[Hint: use the multiple line plot function, lines()]

• A <- c(7,12,28,3,41)
• B <- c(5,12,14,23,25)
• C <- c(21,24,13,15,18)
• D <- c(21,24,9,10,4,7)
• E <- c(22,24,26,12,15)
• # Set xlim and ylim so that every value of both B and D fits inside the plot.
• plot(B, type = "o", col = "red", xlab = "subject", ylab = "attendance",
xlim = c(1, length(D)), ylim = range(B, D), main = "Attendance report")
• lines(D, type = "o", col = "blue")
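Test case 3 asks for the students with the highest and lowest attendance. One possible sketch is to put the attendance vectors into a matrix, total each row, and plot only the two selected students. It reuses the attendance values listed above; because D lists six values and the others five, only the first five values of D are used here, which is an assumption made so that all rows have the same length.

```r
# One row per student, one column per subject.
# Assumption: only the first five values of D are used.
attendance <- rbind(
  A = c(7, 12, 28, 3, 41),
  B = c(5, 12, 14, 23, 25),
  C = c(21, 24, 13, 15, 18),
  D = c(21, 24, 9, 10, 4),
  E = c(22, 24, 26, 12, 15)
)

# Total classes attended per student, then pick the extremes.
totals <- rowSums(attendance)
highest <- names(which.max(totals))
lowest <- names(which.min(totals))

# Plot the two selected students on a shared y range.
plot(attendance[highest, ], type = "o", col = "red",
     ylim = range(attendance), xlab = "subject", ylab = "attendance",
     main = "Highest and lowest attendance")
lines(attendance[lowest, ], type = "o", col = "blue")
legend("topright", legend = c(highest, lowest),
       col = c("red", "blue"), lty = 1, pch = 1)
```

Ranking by total attendance is one interpretation of "highest and lowest"; ranking by a single subject would only need a different column in place of rowSums().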
