P6ADBMS
▷ Plot
○ The plot() function is used to draw points (markers) in a diagram.
○ The function takes parameters for specifying points in the diagram.
■ Parameter 1 specifies points on the x-axis.
■ Parameter 2 specifies points on the y-axis.
○ At its simplest, you can use the plot() function to plot two numbers against each other:
○ Example - Draw one point in the diagram, at position 1 on the x-axis and position 3 on the y-axis:
plot(1, 3)
R Plotting
▷ Multiple Points
○ You can plot as many points as you like, as long as both axes have the same number of points:
○ Example
plot(c(1, 2, 3, 4, 5), c(3, 7, 8, 9, 12))
○ When you have many values, it is better to store them in variables:
○ Example
x <- c(1, 2, 3, 4, 5)
y <- c(3, 7, 8, 9, 12)
plot(x, y)
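○ plot() stops with an error when the two vectors differ in length, so a quick check before plotting can help. A minimal sketch, reusing the x and y values from the example above; the same_length variable is added here for illustration:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3, 7, 8, 9, 12)

# plot() needs one y value for every x value
same_length <- length(x) == length(y)
stopifnot(same_length)

plot(x, y)
```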
▷ Sequences of Points
○ If you want to draw dots in a sequence, on both the x-axis and the y-axis, use the : operator:
○ Example
plot(1:10)
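○ When plot() receives a single vector, it uses the positions 1, 2, 3, ... as the x values, so plot(1:10) draws the same points as plot(1:10, 1:10). A small sketch of this; the seq() call with a step of 0.5 is an added illustration and not part of the original example:

```r
# 1:10 expands to the integers 1 through 10
x <- 1:10

# With one vector, plot() uses the index (1, 2, ...) as the x value,
# so these two calls draw the same ten points
plot(x)
plot(x, x)

# seq() builds a sequence with a step other than 1
half_steps <- seq(1, 3, by = 0.5)
plot(half_steps)
```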
R Line
▷ Line Graphs
○ A line chart is a graph that connects a series of points by drawing line segments between them.
○ The points are ordered by one of their coordinates (usually the x-coordinate).
○ Line charts are usually used to identify trends in data.
○ The plot() function in R is used to create the line graph.
○ A line graph has a line that connects all the points in a diagram.
○ To create a line, use the plot() function and add the type parameter with a value of "l":
○ Example
plot(1:10, type="l")
▷ Line Color
○ The line color is black by default. To change the color, use the col parameter:
○ Example
plot(1:10, type="l", col="blue")
▷ Line Width
○ To change the width of the line, use the lwd parameter (1 is default, while 0.5 means 50%
smaller, and 2 means 100% larger):
○ Example
plot(1:10, type="l", lwd=2)
▷ Line Styles
○ The line is solid by default. Use the lty parameter with a value from 0 to 6 to specify the line
format.
○ For example, lty=3 will display a dotted line instead of a solid line:
○ Example
plot(1:10, type="l", lwd=5, lty=3)
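○ To compare the line types side by side, one option is to draw one horizontal line per lty value. A sketch using the standard names R gives to types 1 to 6 (lty = 0 is blank and draws nothing); the variable names and layout are chosen here for illustration:

```r
line_types <- 1:6
type_names <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")

# type = "n" sets up the axes without drawing any points
plot(1:10, type = "n", ylim = c(0, 7), xlab = "", ylab = "lty value")

# Draw one horizontal line per line type at height y = lty,
# with a text label just above it
for (i in line_types) {
  lines(c(1, 10), c(i, i), lty = i, lwd = 2)
  text(5.5, i + 0.3, paste0("lty = ", i, " (", type_names[i], ")"))
}
```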
R - Scatterplots
○ A simple scatterplot is created using the plot() function.
○ Syntax - The basic syntax for creating a scatterplot in R uses the following parameters:
■ x is the data set whose values are the horizontal coordinates.
■ y is the data set whose values are the vertical coordinates.
■ main is the title of the graph.
■ xlab is the label on the horizontal axis.
▷ Example
x <- c(5,7,8,7,2,2,9,4,11,12,9,6)
y <- c(99,86,87,88,111,103,87,94,78,77,85,86)
plot(x, y, main="Observation of Cars", xlab="Car age", ylab="Car speed")
▷ The observation in the example above should show the result of 12 cars passing by.
▷ The x-axis shows how old the car is.
▷ The y-axis shows the speed of the car when it passes.
R - Scatterplots
▷ Compare Plots
○ To compare the plot with another plot, use the points() function:
○ Example - Draw two plots on the same figure:
# day one, the age and speed of 12 cars:
x1 <- c(5,7,8,7,2,2,9,4,11,12,9,6)
y1 <- c(99,86,87,88,111,103,87,94,78,77,85,86)
# day two, the age and speed of 15 cars:
x2 <- c(2,2,8,1,15,8,12,9,7,3,11,4,7,14,12)
y2 <- c(100,105,84,105,90,99,90,95,94,100,79,112,91,80,85)
plot(x1, y1, main="Observation of Cars", xlab="Car age", ylab="Car speed", col="red", cex=2)
points(x2, y2, col="blue", cex=2)
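○ points() draws into the axes that the first plot() call set up, so day-two values outside the day-one range (for example a car age of 15) fall outside the visible area. One way to handle this, sketched here with the xlim and ylim parameters of plot(), is to compute the axis limits from both data sets first. The x_range and y_range names are added for illustration:

```r
# day one, the age and speed of 12 cars:
x1 <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y1 <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)
# day two, the age and speed of 15 cars:
x2 <- c(2, 2, 8, 1, 15, 8, 12, 9, 7, 3, 11, 4, 7, 14, 12)
y2 <- c(100, 105, 84, 105, 90, 99, 90, 95, 94, 100, 79, 112, 91, 80, 85)

# Axis limits that cover the values of both days
x_range <- range(c(x1, x2))
y_range <- range(c(y1, y2))

plot(x1, y1, main = "Observation of Cars", xlab = "Car age", ylab = "Car speed",
     col = "red", cex = 2, xlim = x_range, ylim = y_range)
points(x2, y2, col = "blue", cex = 2)
```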
R - Pie Charts
▷ Pie Charts
○ A pie chart is a circular graphical view of data. In R the pie chart is created using the pie() function
which takes positive numbers as a vector input. The additional parameters are used to control labels,
color, title etc.
○ Syntax - The basic syntax for creating a pie chart in R is:
pie(x, labels, radius, main, col, clockwise)
○ The parameters are described below:
■ x is a vector containing the numeric values used in the pie chart.
■ labels gives descriptions to the slices.
■ radius indicates the radius of the circle of the pie chart (a value between −1 and +1).
■ main indicates the title of the chart.
■ col indicates the color palette.
■ clockwise is a logical value indicating whether the slices are drawn clockwise or counterclockwise.
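○ A short sketch that sets each of these parameters in one call. The labels, colors, and title are made up for illustration:

```r
x <- c(10, 20, 30, 40)
slice_labels <- c("A", "B", "C", "D")
slice_colors <- c("blue", "yellow", "green", "black")

# radius = 1 uses the full plot area; clockwise = TRUE reverses
# the default counterclockwise drawing direction
pie(x, labels = slice_labels, radius = 1, main = "Pie Chart",
    col = slice_colors, clockwise = TRUE)
```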
▷ Example
# Create a vector of pies
x <- c(10,20,30,40)
pie(x)
▷ As you can see, the pie chart draws one pie for each value in the vector (in this case 10, 20, 30, 40).
▷ By default, the first pie starts at the x-axis and the pies are drawn counterclockwise.
▷ Note: The size of each pie is the value divided by the sum of all values: x/sum(x)
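▷ The formula can be checked directly, and the resulting shares can be used as labels. A minimal sketch; the percentage labels and the variable names are an added illustration:

```r
x <- c(10, 20, 30, 40)

# Share of each pie: the value divided by the sum of all values
shares <- x / sum(x)
print(shares)

# Turn the shares into percentage labels such as "10%"
pct_labels <- paste0(round(100 * shares), "%")

pie(x, labels = pct_labels)
```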
▷ Colors
○ You can add a color to each pie with the col parameter:
○ Example
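# Create a vector of pies
x <- c(10,20,30,40)
# Create a vector of labels
mylabel <- c("Apples", "Bananas", "Cherries", "Dates")
# Create a vector of colors
colors <- c("blue", "yellow", "green", "black")
# Display the pie chart with colors
pie(x, labels = mylabel, main = "Fruits", col = colors)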
▷ Legend
○ To add an explanation for each pie, use the legend() function:
○ The legend position can be one of: bottomright, bottom, bottomleft, left, topleft, top, topright, right, center
○ Example
# Create a vector of pies
x <- c(10,20,30,40)
# Create a vector of labels
mylabel <- c("Apples", "Bananas", "Cherries", "Dates")
# Create a vector of colors
colors <- c("blue", "yellow", "green", "black")
# Display the pie chart with colors
pie(x, labels = mylabel, main = "Pie Chart", col = colors)
# Display the explanation box
legend("bottomright", mylabel, fill = colors)
R Data Interfaces - Importing Data
▷ A csv file is a text file in which the values in the columns are separated by commas.
▷ You can create this file in Windows Notepad by copying and pasting the data below.
▷ Save the file as input.csv using the Save As dialog with the "All Files (*.*)" option in Notepad.
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
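▷ To load the file into a data frame, base R provides read.csv(). In the sketch below the rows are written to a temporary file so that the example runs on its own; with the file saved as described above, you would call read.csv("input.csv") from the working directory instead. The csv_lines and csv_path names are added for illustration:

```r
csv_lines <- c(
  "id,name,salary,start_date,dept",
  "1,Rick,623.3,2012-01-01,IT",
  "2,Dan,515.2,2013-09-23,Operations",
  "3,Michelle,611,2014-11-15,IT",
  "4,Ryan,729,2014-05-11,HR",
  "5,Gary,843.25,2015-03-27,Finance",
  "6,Nina,578,2013-05-21,IT",
  "7,Simon,632.8,2013-07-30,Operations",
  "8,Guru,722.5,2014-06-17,Finance"
)

# Write the rows to a temporary file that stands in for input.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# read.csv() returns a data frame with one column per field
data <- read.csv(csv_path)
print(is.data.frame(data))
print(data)
```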
▷ Get Information
○ Use the dim() function to find the dimensions of the data set, the names() function to view the names of the variables, and the rownames() function to view the row names:
dim(data)
names(data)
rownames(data)
▷ Once the data has been read into a data frame, we can apply all the functions applicable to data frames.
▷ Example - Get the maximum salary
# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)
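▷ The same idea extends to selecting rows. A sketch that finds the employee with the highest salary; it reads the sample data with read.csv(text = ...) so that it runs on its own, and the top_earner name is added for illustration:

```r
data <- read.csv(text = "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance")

# Keep only the row whose salary equals the maximum salary
top_earner <- subset(data, salary == max(salary))
print(top_earner)
```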
▷ Sort data
○ To sort the values, use the sort() function:
sort(data$salary)
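○ sort() returns only the sorted salary values. To reorder whole rows by salary, one common approach is order(). A sketch on the sample data, read with read.csv(text = ...) so that it runs on its own; the by_salary names are added for illustration:

```r
data <- read.csv(text = "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance")

# order() returns the row positions that put salary in ascending order
by_salary <- data[order(data$salary), ]
print(by_salary[, c("name", "salary")])

# decreasing = TRUE sorts from the highest salary to the lowest
by_salary_desc <- data[order(data$salary, decreasing = TRUE), ]
print(by_salary_desc[, c("name", "salary")])
```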
▷ The summary() function returns six statistical numbers for each numeric variable:
○ Min
○ First quartile (25th percentile)
○ Median
○ Mean
○ Third quartile (75th percentile)
○ Max
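▷ A sketch of summary() applied to the salary column and to the whole data frame. The sample data is read with read.csv(text = ...) so that the example runs on its own; the salary_summary name is added for illustration:

```r
data <- read.csv(text = "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance")

# Six-number summary of one numeric column
salary_summary <- summary(data$salary)
print(salary_summary)

# Applied to the data frame, summary() reports on every column
print(summary(data))
```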
○ The write.xlsx() function writes a data frame to an Excel file. Its parameters include:
■ file_names is the name of the file into which we want to write our data.
■ col.names and row.names are logical values that specify whether the column names and row names of the data frame are written to the file.
■ append is a logical value that indicates whether the data should be appended to an existing file.
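○ Base R's write.table() accepts col.names, row.names, and append arguments with similar meanings, which makes it a convenient way to try them without installing the xlsx package. This sketch swaps in write.table() with a comma separator and a temporary file; it is not the write.xlsx() call itself, and the two small data frames are made up for illustration:

```r
first_batch <- data.frame(id = 1:2, name = c("Rick", "Dan"))
second_batch <- data.frame(id = 3L, name = "Michelle")
out_path <- tempfile(fileext = ".csv")

# New file: write the header row, skip the row names
write.table(first_batch, out_path, sep = ",",
            col.names = TRUE, row.names = FALSE, append = FALSE)

# append = TRUE adds rows to the existing file; the header is left out
# so that it does not appear a second time
write.table(second_batch, out_path, sep = ",",
            col.names = FALSE, row.names = FALSE, append = TRUE)

# Read the combined file back to check the result
combined <- read.csv(out_path)
print(combined)
```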
▷ Example
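# Load the xlsx package, which provides read.xlsx() and write.xlsx().
library(xlsx)
# Read the first sheet of input.xlsx into a data frame.
data <- read.xlsx("input.xlsx", sheetIndex = 1)
# Keep only the employees who started after 2014-01-01.
empdata <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
# Write the filtered data into a sheet named Sheet2 in emp.xlsx.
write.xlsx(empdata, "emp.xlsx", col.names = TRUE,
           row.names = TRUE, sheetName = "Sheet2", append = TRUE)
# Read the new sheet back by name and print it.
newdata <- read.xlsx("emp.xlsx", sheetName = "Sheet2")
print(newdata)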