DAV Practical 7

This experiment aimed to perform data visualization in R using various libraries. It loaded Uber trip data from multiple months, combined the data, and transformed variables like date/time and weekday. It then created visualizations to show total trips by hour, trips by hour and month in bar plots, and a heatmap of trips by day and hour. The conclusions were that various R libraries were used to analyze and visualize the Uber dataset for predictive purposes and industrial applications.

Uploaded by

potake7704

EXPERIMENT NO.

AIM :- To perform data visualization in R using different libraries.


PROGRAM :-
library(ggplot2)
library(ggthemes)
library(lubridate)
library(dplyr)
library(tidyr)
library(tidyverse)
library(data.table)
library(scales)

colors = c("#CC1011", "#665555", "#05a399", "#cfcaca", "#f5e840", "#0683c9",
           "#e075b0")
colors

apr <- read.csv("../input/uberdataset/uber-raw-data-apr14.csv")
may <- read.csv("../input/uberdataset/uber-raw-data-may14.csv")
june <- read.csv("../input/uberdataset/uber-raw-data-jun14.csv")
july <- read.csv("../input/uberdataset/uber-raw-data-jul14.csv")
aug <- read.csv("../input/uberdataset/uber-raw-data-aug14.csv")
sept <- read.csv("../input/uberdataset/uber-raw-data-sep14.csv")

data <- rbind(apr, may, june, july, aug, sept)
cat("The dimensions of the data are:", dim(data))
# Output: The dimensions of the data are: 4534327 4
head(data)

# Parse the timestamp strings once; Date.Time is a POSIXct column from here on.
data$Date.Time <- as.POSIXct(data$Date.Time, format="%m/%d/%Y %H:%M:%S")
# Date.Time is already POSIXct, so no second parse is needed to extract the time.
data$Time <- format(data$Date.Time, format="%H:%M:%S")
# Derive calendar and clock components as factors for grouping and plotting.
data$day <- factor(day(data$Date.Time))
data$month <- factor(month(data$Date.Time))
data$year <- factor(year(data$Date.Time))
data$dayofweek <- factor(wday(data$Date.Time))
data$second <- factor(second(hms(data$Time)))
data$minute <- factor(minute(hms(data$Time)))
data$hour <- factor(hour(hms(data$Time)))
head(data)

hourly_data <- data %>%
  group_by(hour) %>%
  dplyr::summarize(Total = n())
data.table(hourly_data)
ggplot(hourly_data, aes(hour, Total)) +
  geom_bar(stat="identity",
           fill="steelblue",
           color="red") +
  ggtitle("Trips Every Hour", subtitle = "aggregated today") +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5)) +
  scale_y_continuous(labels=comma)

month_hour_data <- data %>%
  group_by(month, hour) %>%
  dplyr::summarize(Total = n())
ggplot(month_hour_data, aes(hour, Total, fill=month)) +
  geom_bar(stat = "identity") +
  ggtitle("Trips by Hour and Month") +
  scale_y_continuous(labels = comma)

day_month_data <- data %>%
  group_by(dayofweek, month) %>%
  dplyr::summarize(Trips = n())
day_month_data
ggplot(day_month_data, aes(dayofweek, Trips, fill = month)) +
  geom_bar(stat = "identity", aes(fill = month), position = "dodge") +
  ggtitle("Trips by Day and Month") +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = colors)

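# day_hour_data must exist before plotting: count trips per day of month and hour.
day_hour_data <- data %>%
  group_by(day, hour) %>%
  dplyr::summarize(Total = n())
ggplot(day_hour_data, aes(day, hour, fill = Total)) +
  geom_tile(color = "white") +
  ggtitle("Heat Map by Hour and Day")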
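The heat map above needs one trip count for every day and hour combination. The following is a minimal sketch of that aggregation on a small made-up sample, assuming the timestamps use the same `%m/%d/%Y %H:%M:%S` layout as the Uber CSV files. The sample rows, the names `toy`, `parsed` and `toy_day_hour`, and the UTC time zone are assumptions for illustration only. Base R is used so the sketch runs without the dataset or extra packages.

```r
# Hypothetical sample rows; the real data comes from the Uber CSV files.
toy <- data.frame(
  Date.Time = c("4/1/2014 0:11:00", "4/1/2014 0:17:00",
                "4/1/2014 17:21:00", "4/2/2014 17:05:00"),
  stringsAsFactors = FALSE
)

# Parse with an explicit format and a fixed time zone (UTC is an assumption).
parsed <- as.POSIXct(toy$Date.Time, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Day of month and hour of day, stored as factors for grouping.
toy$day  <- factor(as.integer(format(parsed, "%d")))
toy$hour <- factor(as.integer(format(parsed, "%H")))

# Give every trip a weight of 1 and sum the weights per (day, hour) pair.
toy_day_hour <- aggregate(Total ~ day + hour,
                          data = transform(toy, Total = 1L),
                          FUN = sum)
print(toy_day_hour)
```

The resulting data frame has the same three columns (`day`, `hour`, `Total`) that the `geom_tile()` call maps to x, y and fill. Combinations with no trips do not appear as rows, so they would show up as empty tiles in a plot.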

OUTPUT :-
CONCLUSION :- We studied data visualization in R and learned how R libraries can be used
to analyze a given dataset and support prediction. We also learned how this kind of analysis
is applied in industry.
