DV Lab Manual (Ex - No.1-10)


22CS307PC - DATA VISUALIZATION LAB

Ex.No.1:

Understanding Data: what is data, where to find data, foundations for building data
visualizations, and creating your first visualization.

Aim:

To understand what data is, where to find it, the foundations for building data
visualizations, and how to create a first visualization.

Solution:

What is Data?

Data refers to raw facts, statistics, or information collected or stored in a structured or
unstructured form. Data can take various forms, such as text, numbers, images, videos, and
more. It is the foundation of all information and knowledge and is used in various fields for
analysis, decision-making, and understanding trends and patterns.

Data can be categorized into two main types:

• Structured Data: This type of data is organized into a specific format, such as tables
or databases, and is easily searchable and analysable. Examples include spreadsheets,
relational databases, and CSV files.

• Unstructured Data: Unstructured data lacks a specific format and can include text
documents, social media posts, images, audio recordings, and more. Analysing
unstructured data often requires advanced techniques like natural language processing
and image recognition.

Where to Find Data?

You can find data from various sources, depending on your specific needs:

• Open Data Portals: Many governments and organizations provide free access to a
wide range of data through open data portals. Examples include Data.gov (United
States) and data.gov.uk (United Kingdom).

• Data Repositories: Academic institutions, research organizations, and data enthusiasts
often share datasets on platforms like Kaggle, GitHub, and the UCI Machine Learning
Repository.

• APIs (Application Programming Interfaces): Some websites and services offer APIs
that allow you to programmatically access and retrieve data. Examples include Twitter
API, Google Maps API, and financial market APIs.

• Web Scraping: You can extract data from websites using web scraping tools and
libraries like BeautifulSoup and Scrapy. However, be mindful of the website's terms of
use and legal restrictions.

• Surveys and Questionnaires: You can conduct your own surveys or collect data through
questionnaires and interviews.

• IoT Devices: Internet of Things (IoT) devices generate vast amounts of data that can
be used for various purposes.

• Commercial Data Providers: Some companies specialize in selling datasets for
specific industries, such as market research, finance, and healthcare.

Foundations for Building Data Visualizations:

Creating effective data visualizations requires a strong foundation in several key areas:

• Data Analysis: Before creating visualizations, you should thoroughly analyze your
data to understand its structure, relationships, and any patterns or trends. Exploratory
data analysis (EDA) techniques can help with this.
• Statistical Knowledge: Understanding basic statistics is essential for making
meaningful interpretations of data. Concepts like mean, median, standard deviation, and
correlation are commonly used in data visualization.
• Domain Knowledge: Having knowledge of the specific domain or subject matter
related to your data is crucial for creating contextually relevant visualizations. It helps
you ask the right questions and provide valuable insights.
• Visualization Tools: Familiarize yourself with data visualization tools and libraries
such as matplotlib, Seaborn, ggplot2, D3.js, and Tableau. Each tool has its strengths
and can be used for different types of visualizations.
• Design Principles: Study design principles, including color theory, typography, and
visual hierarchy, to create visually appealing and effective visualizations. Avoid
common pitfalls like misleading visualizations.
• Interactivity: Learn how to add interactive elements to your visualizations to engage
users and allow them to explore the data. This can be achieved using tools like
JavaScript, Python libraries, or dedicated visualization software.
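
As a small illustration of the statistics named under Statistical Knowledge, the sketch below computes them in R for the English and Maths marks listed in the student data of Ex. No. 3. The variable names are chosen here for illustration only.

```r
# Marks copied from the student data used in Ex. No. 3
english <- c(72, 95, 95, 75, 18, 93, 95, 61, 75, 99)
maths   <- c(80, 78, 90, 52, 52, 89, 71, 85, 70, 82)

mean_english   <- mean(english)        # arithmetic mean
median_english <- median(english)      # middle value of the sorted marks
sd_english     <- sd(english)          # sample standard deviation
cor_eng_maths  <- cor(english, maths)  # Pearson correlation, between -1 and 1

print(c(mean = mean_english, median = median_english, sd = sd_english))
print(cor_eng_maths)
```

The single low mark (18) pulls the mean (77.8) below the median (84), which is the kind of pattern worth knowing about before choosing a chart.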

Creating Your First Visualization:

To create your first data visualization, follow these general steps:

• Select Your Data: Choose a dataset that aligns with your goals and interests. Ensure
that the data is clean and well-structured.
• Define Your Objective: Clearly define what you want to communicate or explore with
your visualization. Are you looking to show trends, comparisons, or distributions?
• Choose the Right Visualization Type: Select a visualization type that suits your data
and objectives. Common types include bar charts, line charts, scatter plots, histograms,
and pie charts.
• Prepare and Transform Data: Preprocess your data as needed. This may involve
aggregating, filtering, or transforming the data to fit the chosen visualization.
• Create the Visualization: Use a suitable tool or library to create your visualization.
Customize it with labels, colors, and other design elements.
• Interactivity (Optional): If appropriate, add interactive features to your visualization
to allow users to interact with the data.
• Test and Iterate: Review your visualization for accuracy and clarity. Seek feedback
from others and make improvements as necessary.
• Publish or Share: Once you are satisfied with your visualization, publish it on a
platform, embed it in a report, or share it with your intended audience.
• Document and Explain: Provide context and explanations for your visualization.
Clearly communicate what the viewer should take away from it.
• Maintain and Update: If the data changes or new insights emerge, update your
visualization accordingly.
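
The steps above can be sketched in a few lines of base R. This is only one possible first visualization, assuming the built-in `airquality` dataset (daily New York readings, May to September 1973) as the data source; the object name `monthly_temp` is illustrative.

```r
# Select: airquality ships with base R (package 'datasets'), so no file is needed
data(airquality)

# Prepare: average temperature (degrees Fahrenheit) for each month
monthly_temp <- aggregate(Temp ~ Month, data = airquality, FUN = mean)

# Create: a labelled bar chart, one bar per month
barplot(monthly_temp$Temp,
        names.arg = month.abb[monthly_temp$Month],
        main = "Average Daily Temperature, New York, 1973",
        xlab = "Month", ylab = "Temperature (degrees F)",
        col = "steelblue")
```

A bar chart is used because the objective here is a comparison across a handful of categories; a line chart would suit a longer time series better.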

Ex. No.: 2 Date: 04/09/2023

LAB PROBLEM: Getting started with Tableau Software using Data file formats, connecting your Data to
Tableau, creating basic charts (line, bar charts, Tree maps), Using the Show me panel.

AIM: Create basic charts using Data file format and R graphics packages.

PROGRAMS:
Ex. No. 2(a): Create Line Chart using R Programming
# Three series of article counts, one value per month
v <- c(17, 25, 38, 13, 41)
t <- c(22, 19, 36, 19, 23)
m <- c(25, 14, 16, 34, 29)

# Draw the first series, then overlay the other two on the same axes
plot(v, type = "o", col = "blue", xlab = "Month", ylab = "Articles Written",
     main = "Articles Written Chart")
lines(t, type = "o", col = "red")
lines(m, type = "o", col = "green")

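Ex. No. 2(b): Creating Bar Charts using R Programming

# Maximum temperature recorded on each day of the week
temperatures <- c(22, 27, 26, 24, 23, 26, 28)
# The bars are vertical, so the days go on the x axis and the temperatures on the y axis;
# barplot() returns the x positions of the bar midpoints
result <- barplot(temperatures, main = "Maximum Temperatures in a Week",
                  xlab = "Day", ylab = "Degree Celsius",
                  names.arg = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
                  col = "blue", density = 20)
print(result)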

Ex. No. 2(c): Tree Maps using R Programming

library(plotly)
# Each label is placed under the parent at the same position; "" marks the root
fig <- plot_ly(type = "treemap",
               labels = c("Fruits", "Orange", "Apple", "Red Apple", "Green Apple",
                          "Grapes", "Mango", "Raw Mango", "Banana"),
               parents = c("", "Fruits", "Fruits", "Apple", "Apple",
                           "Fruits", "Fruits", "Mango", "Fruits"),
               values = c(160, 20, 20, 20, 20, 20, 20, 20, 20))
fig

OUTPUT: 2(a)

OUTPUT: 2(b)

OUTPUT: 2(c)

Result: The above experiments were successfully executed.
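
The lab problem mentions data file formats, while program 2(a) uses hard-coded vectors. A minimal sketch of the same line chart driven by a CSV file follows. The file is written to a temporary location first so that the example is self-contained, and the column names `v`, `t`, and `m` simply mirror the vectors in 2(a); a real data file would be read with `read.csv()` in the same way.

```r
# Write the 2(a) series to a temporary CSV so the example needs no external file
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(v = c(17, 25, 38, 13, 41),
                     t = c(22, 19, 36, 19, 23),
                     m = c(25, 14, 16, 34, 29)),
          csv_path, row.names = FALSE)

# Read it back, as one would with a real data file
articles <- read.csv(csv_path)

# matplot() draws one line per column and sizes the y axis to fit all of them
series_cols <- c("blue", "red", "green")
matplot(as.matrix(articles), type = "o", pch = 1, lty = 1, col = series_cols,
        xlab = "Month", ylab = "Articles Written",
        main = "Articles Written Chart")
legend("topleft", legend = names(articles), col = series_cols, lty = 1, pch = 1)
```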



Ex. No.: 3 Date: 11/09/2023

LAB PROBLEM: Tableau Calculations, Overview of SUM, AVG, and Aggregate features, Creating
custom calculations and fields.

AIM: Create custom calculations and fields using Aggregate features in R.

PROGRAM:
# Read the long-format student data; strip.white drops the spaces after the commas
df <- read.csv('D:/R_PRG/csv/student_data.csv', strip.white = TRUE)
# One aggregate() call per statistic, each grouped by subject
adf1 <- aggregate(df$marks, by = list(df$subject), FUN = sum)
adf2 <- aggregate(df$marks, by = list(df$subject), FUN = mean)
adf3 <- aggregate(df$marks, by = list(df$subject), FUN = min)
adf4 <- aggregate(df$marks, by = list(df$subject), FUN = max)
adf5 <- aggregate(df$marks, by = list(df$subject), FUN = length)
adf6 <- aggregate(df$marks, by = list(df$subject), FUN = sd)
# Combine the statistics into one table and give the columns readable names
adf <- cbind(adf1, adf2$x, adf3$x, adf4$x, adf5$x, adf6$x)
colnames(adf) <- c('Subject', 'Total', 'Average', 'Min', 'Max', 'count', 'Std. Deviation')
adf

INPUT:
                       Marks
sno  Student Name  English  Maths  Science
1    Bala          72       80     68
2    Damu          95       78     82
3    Gopu          95       90     92
4    John          75       52     95
5    Mary          18       52     86
6    Raju          93       89     27
7    Ram           95       71     90
8    Sita          61       85     88
9    Sudha         75       70     85
10   Syed          99       82     60

student_data.csv
sno, student, subject, marks
1, Bala , English, 72
2, Damu, English, 95
3, Gopu, English, 95
4, John , English, 75
5, Mary, English, 18
6, Raju , English, 93
7, Ram , English, 95
8, Sita, English, 61
9, Sudha, English, 75
10, Syed, English, 99
11, Bala, Maths, 80
12, Damu, Maths,78
13, Gopu, Maths, 90
14, John, Maths, 52
15, Mary, Maths, 52
16, Raju, Maths, 89
17, Ram, Maths, 71
18, Sita, Maths,85
19, Sudha, Maths, 70
20, Syed, Maths,82
21, Bala, Science, 68
22, Damu, Science, 82
23, Gopu, Science, 92
24, John, Science, 95
25, Mary, Science, 86
26, Raju, Science, 27
27, Ram, Science, 90
28, Sita, Science, 88
29, Sudha, Science, 85
30, Syed, Science, 60

OUTPUT:
Subject Total Average Min Max count Std. Deviation
English 778 77.8 18 99 10 24.66577
Maths 749 74.9 52 90 10 13.75540
Science 773 77.3 27 95 10 20.75813

Result: The above experiment was successfully executed.
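
The same summary can also be produced with a single `aggregate()` call. The sketch below is an alternative to the program above, not a replacement for it; it builds the data inline from the INPUT table so that it runs without the CSV file, and the names `marks_df`, `stats`, and `summary_df` are illustrative.

```r
# Marks copied from the INPUT table above, in long format (one row per mark)
marks_df <- data.frame(
  subject = rep(c("English", "Maths", "Science"), each = 10),
  marks = c(72, 95, 95, 75, 18, 93, 95, 61, 75, 99,   # English
            80, 78, 90, 52, 52, 89, 71, 85, 70, 82,   # Maths
            68, 82, 92, 95, 86, 27, 90, 88, 85, 60)   # Science
)

# One aggregate() call; FUN returns a named vector, giving one value per statistic
stats <- aggregate(marks ~ subject, data = marks_df,
                   FUN = function(x) c(Total = sum(x), Average = mean(x),
                                       Min = min(x), Max = max(x),
                                       Count = length(x), SD = sd(x)))

# aggregate() stores the statistics as a matrix column; flatten it to plain columns
summary_df <- do.call(data.frame, stats)
print(summary_df)
```

The flattened columns are named `marks.Total`, `marks.Average`, and so on, and can be renamed with `colnames()` as in the program above.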



Ex. No.: 4 Date: 25/09/2023

LAB PROBLEM: Applying new data calculations to your visualizations, Formatting Visualizations,
Formatting Tools and Menus, Formatting specific parts of the view.

AIM: To apply new data calculations and formatting to a visualization using an R package.

PROGRAM:
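library(plotly)
# Read the sales data; strip.white drops the spaces after the commas
df <- read.csv('D:/R_PRG/csv/sales-data.csv', strip.white = TRUE)
# New calculated field: price including tax
df$Total_Price <- df$Price + df$Tax
# Total sales per item group
agg_df <- aggregate(df$Total_Price, by = list(df$ITEM_GROUP), FUN = sum)
colnames(agg_df) <- c('Items', 'Price')
# Bar chart with the totals printed on the bars, then bold titles and red axes
fig <- plot_ly(type = 'bar', x = agg_df$Items, y = agg_df$Price, text = agg_df$Price)
fig <- fig %>% layout(title = '<b>Super Market - Sales Data</b>',
                      xaxis = list(title = "<b>Grocery Items Category</b>", color = 'red'),
                      yaxis = list(title = "<b>Total Sales (in Rupees)</b>", color = 'red'))
fig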

sales-data.csv
ITEM_GROUP, ITEM_NAME, Price, Tax
Fruit, Apple, 100, 5
Fruit, Banana, 50, 5
Fruit, Orange, 100, 10
Fruit, Mango, 60, 6
Vegetable, Potato, 50, 5
Vegetable, Brinjal, 40, 4
Vegetable, Radish, 40, 4
Dairy, Ghee, 100, 10
Dairy, Curd, 40, 4
Dairy, Milk, 50, 5

OUTPUT:

Result: The above experiment was successfully executed.
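
As a cross-check on the bar heights, the sketch below recomputes the calculated field and the per-group totals from the rows of sales-data.csv, entered inline so that it runs without the file. The object names `sales` and `totals` are illustrative.

```r
# Rows copied from sales-data.csv above
sales <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 4), rep("Vegetable", 3), rep("Dairy", 3)),
  Price = c(100, 50, 100, 60, 50, 40, 40, 100, 40, 50),
  Tax   = c(5, 5, 10, 6, 5, 4, 4, 10, 4, 5)
)

# New calculated field, then the per-group total that the bar chart plots
sales$Total_Price <- sales$Price + sales$Tax
totals <- aggregate(Total_Price ~ ITEM_GROUP, data = sales, FUN = sum)
print(totals)   # Dairy 209, Fruit 336, Vegetable 143
```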



Ex. No.: 5 Date: 16/10/2023

LAB PROBLEM: Editing and Formatting Axes, Manipulating Data in Tableau data, Pivoting Tableau
data.

AIM: To manipulate and pivot employee data using an R package.

PROGRAM:
## Data manipulation using the dplyr package
library(dplyr)
d1 <- read.csv('D:/R_PRG/csv/emp.csv')
d2 <- read.csv('D:/R_PRG/csv/dept.csv')

## Employees with a salary greater than 10000, sorted by first name
select(d1, EMP_ID, JOB_ID, F_NAME, L_NAME, SALARY) %>%
  filter(SALARY > 10000) %>%
  arrange(F_NAME) %>%
  rename(DESIGNATION = JOB_ID)

## Mutate and join: convert SALARY from USD to INR and rank within each department
df <- left_join(d1, d2, by = "DEPT_ID") %>%
  select(EMP_ID, F_NAME, L_NAME, DEPT_NAME, SALARY) %>%
  group_by(DEPT_NAME) %>%
  mutate(SALARY = SALARY * 83, rank = min_rank(desc(SALARY)))
df

## Summarize SALARY department-wise
df %>% group_by(DEPT_NAME) %>% summarise(sum(SALARY), mean(SALARY))

## Summarize SALARY of all employees
summarise(d1, sum(SALARY), mean(SALARY))

emp.csv
EMP_ID, F_NAME, L_NAME, JOB_ID, SALARY, DEPT_ID
100, Steven, King, PRESIDENT,24000,90
101, Neena, Kochhar, VICE PRESIDENT,17000,90
102, Lex, De Haan, VICE PRESIDENT,17000,90
103, Alexander, Hunold, IT_PROGRAMMER,9000,60
104, Bruce, Ernst, IT_PROGRAMMER,6000,60
105, David, Austin, IT_PROGRAMMER,4800,60
106, Valli, Pataballa, IT_PROGRAMMER,4800,60
107, Diana, Lorentz, IT_PROGRAMMER,4200,60
108, Nancy, Greenberg, FI_MANAGER,12008,100
109, Daniel, Faviet, ACCOUNTANT,9000,100
110, John, Chen, ACCOUNTANT,8200,100
111, Ismael, Sciarra, ACCOUNTANT,7700,100
112, Jose Manuel, Urman, ACCOUNTANT,7800,100
113, Luis, Popp, ACCOUNTANT,6900,100
114, Den, Raphaely, PU_MAN,11000,30
115, Alexander, Khoo, PU_CLERK,3100,30
116, Shelli, Baida, PU_CLERK,2900,30
117, Sigal, Tobias, PU_CLERK,2800,30
118, Guy, Himuro, PU_CLERK,2600,30
119, Karen, Colmenares, PU_CLERK,2500,30
120, Matthew,Weiss, ST_MAN,8000,50
121, Adam,Fripp, ST_MAN,8200,50
122, Payam,Kaufling, ST_MAN,7900,50
123, Shanta, Vollman, ST_MAN,6500,50
124, Kevin, Mourgos, ST_MAN,5800,50
125, Julia, Nayer, ST_CLERK,3200,50
126, Irene, Mikkilineni, ST_CLERK,2700,50
127, James, Landry, ST_CLERK,2400,50
128, Steven, Markle, ST_CLERK,2200,50
129, Laura, Bissot, ST_CLERK,3300,50
130, Mozhe, Atkinson, ST_CLERK,2800,50

dept.csv
DEPT_ID, DEPT_NAME
30, PRODUCTION UNIT
40, HUMAN RESOURCE
50, STORE
60, INFORMATION TECHOLOGY
90, ADMINISTRATIVE
100, FINANCE
110, ACCOUNTING

OUTPUT:
EMP_ID  DESIGNATION     F_NAME  L_NAME     SALARY
114     PU_MAN          Den     Raphaely   11000
102     VICE PRESIDENT  Lex     De Haan    17000
108     FI_MANAGER      Nancy   Greenberg  12008
101     VICE PRESIDENT  Neena   Kochhar    17000
100     PRESIDENT       Steven  King       24000

EMP_ID  F_NAME     L_NAME     DEPT_NAME                SALARY   rank
100     Steven     King       "ADMINISTRATIVE "        1992000  1
101     Neena      Kochhar    "ADMINISTRATIVE "        1411000  2
102     Lex        De Haan    "ADMINISTRATIVE "        1411000  2
103     Alexander  Hunold     "INFORMATION TECHOLOGY"  747000   1
104     Bruce      Ernst      "INFORMATION TECHOLOGY"  498000   2
105     David      Austin     "INFORMATION TECHOLOGY"  398400   3
106     Valli      Pataballa  "INFORMATION TECHOLOGY"  398400   3
107     Diana      Lorentz    "INFORMATION TECHOLOGY"  348600   5
108     Nancy      Greenberg  "FINANCE"                996664   1
109     Daniel     Faviet     "FINANCE"                747000   2
# ℹ 21 more rows

DEPT_NAME              sum(SALARY)  mean(SALARY)
ADMINISTRATIVE         4814000      1604667
FINANCE                4283464      713911
INFORMATION TECHOLOGY  2390400      478080
PRODUCTION UNIT        2066700      344450
STORE                  4399000      399909

sum(SALARY) mean(SALARY)
216308 6977.677

Result: The above experiment was successfully executed.
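
In the ranked output above, two employees share rank 3 and the next rank is 5. That is how `min_rank()` treats ties. The sketch below reproduces this for the Information Technology salaries and contrasts it with `dense_rank()`, which is shown only for comparison and is not used in the program; the data frame name `ranks` is illustrative.

```r
library(dplyr)

# IT salaries in INR, as shown in the output above (USD salary * 83)
salary <- c(747000, 498000, 398400, 398400, 348600)

ranks <- data.frame(
  SALARY     = salary,
  min_rank   = min_rank(desc(salary)),    # ties share the lowest rank; the next rank is skipped
  dense_rank = dense_rank(desc(salary))   # ties share a rank; no gap follows
)
print(ranks)
```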



Ex. No.: 6 Date: 16/10/2023

LAB PROBLEM: Structuring your data, Sorting and filtering Tableau data, Pivoting Tableau data.

AIM: To sort, filter, and pivot data using R packages.

PROGRAM:
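library(tidyverse)
# strip.white drops the spaces after the commas, so that colour == 'green' matches
fd <- read.csv('D:/R_PRG/csv/fruits.csv', strip.white = TRUE)
fd %>%
  arrange(desc(quantity)) %>%                        # sort by quantity, largest first
  filter(colour == 'green') %>%                      # keep only the green fruits
  mutate(fruits = fct_reorder(fruit, quantity)) %>%  # order the bars by quantity
  ggplot(aes(fruits, quantity, fill = colour)) +
  geom_bar(stat = "identity") +
  scale_y_continuous("Quantity") +                   # quantities are counts, not proportions
  coord_flip() +
  scale_fill_manual(values = c("orange" = "orange", "green" = "green",
                               "red" = "red", "yellow" = "yellow"))

fruits.csv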
fruit, colour, quantity
apples, green,15
apples, red,25
bananas, green,10
bananas, red,40
bananas, yellow,55
oranges, orange,35
mangos, green,25
mangos, yellow,20
grapes, green,60

OUTPUT:

Result: The above experiment was successfully executed.
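
The aim also mentions pivoting, which the program above does not demonstrate. A minimal sketch using tidyr's `pivot_wider()` and `pivot_longer()` on the same fruit data follows; the rows are entered inline so that it runs without the CSV file, and the names `wide` and `long` are illustrative.

```r
library(tidyr)

# Rows copied from fruits.csv above
fd <- data.frame(
  fruit    = c("apples", "apples", "bananas", "bananas", "bananas",
               "oranges", "mangos", "mangos", "grapes"),
  colour   = c("green", "red", "green", "red", "yellow",
               "orange", "green", "yellow", "green"),
  quantity = c(15, 25, 10, 40, 55, 35, 25, 20, 60)
)

# Long -> wide: one row per fruit, one column per colour; missing pairs become 0
wide <- pivot_wider(fd, names_from = colour, values_from = quantity,
                    values_fill = 0)
print(wide)

# Wide -> long again, which is the shape ggplot2 expects
long <- pivot_longer(wide, cols = -fruit,
                     names_to = "colour", values_to = "quantity")
```

The wide form is convenient for reading and for spreadsheet-style tables, while the long form is the one to pass to `ggplot()`.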



Ex. No.: 7 Date: 20/11/2023

LAB PROBLEM:
Advanced Visualization Tools: Using Filters, Using the Detail panel, using the Size panels, customizing
filters, Using and Customizing tooltips, Formatting your data with colors.

AIM: To customize filters and tooltips and to format data with colors.

PROGRAM:
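library(plotly)
# strip.white drops the spaces after the commas in the CSV values
basket <- read.csv("D:/R_PRG/csv/sales.csv", strip.white = TRUE)
fig <- plot_ly(
  data = basket,
  type = 'scatter',
  x = ~ITEM_NAME,
  y = ~Price + Tax,          # price including tax
  mode = 'markers',
  color = ~ITEM_GROUP,       # one trace (legend entry) per item group
  symbol = ~ITEM_GROUP,      # a different marker shape per group
  size = 2, alpha = 0.5,
  text = ~ITEM_GROUP,        # made available to the tooltip as %{text}
  # Custom tooltip: price to two decimals, item name, and group
  hovertemplate = paste('<b>Price:</b> Rs.%{y:.2f}', '<br><b>Item:</b> %{x}',
                        '<br><b>Group:</b> %{text}'),
  marker = list(size = 8, color = "yellow", line = list(color = "red", width = 1)),
  # Custom filter: show only the items whose price including tax exceeds 100
  transforms = list(list(type = 'filter', target = 'y', operation = '>', value = 100))
)
fig <- fig %>% layout(title = "<b>Super Market Items > Rs.100</b>",
                      xaxis = list(title = "<b>Items</b>", color = 'blue'),
                      yaxis = list(title = "<b>Price (in Rupees)</b>", color = 'blue'))
fig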

INPUT: sales.csv
ITEM_GROUP, ITEM_NAME, Price, Tax
Fruit, Apple, 200, 10
Fruit, Banana, 80, 4
Fruit, Orange, 100, 5
Fruit, Mango, 60, 3
Fruit, Papaya, 40, 2
Fruit, Lemon, 10, 1
Vegetable, Potato, 20, 1
Vegetable, Brinjal, 20, 1
Vegetable, Radish, 40, 2
Vegetable, Tomato, 40, 2
Vegetable, Onion, 40, 2
Vegetable, Cucumber, 40, 1
Dairy, Butter Milk, 10, 1
Dairy, Ghee, 200, 10
Dairy, Curd, 100, 5
Dairy, Cheese, 100, 5
Dairy, Milk, 60, 3
Dairy, Paneer, 100, 5

OUTPUT:

RESULT:
The above experiment is successfully executed and output is verified.
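
The filter can also be applied in R before plotting, which keeps the filtering logic in one place and makes it easy to inspect which points the chart should show. The sketch below is an alternative under that assumption; it enters the rows of sales.csv inline so that it runs without the file, and the names `Total` and `costly` are illustrative.

```r
# Rows copied from sales.csv above
basket <- data.frame(
  ITEM_GROUP = rep(c("Fruit", "Vegetable", "Dairy"), each = 6),
  ITEM_NAME = c("Apple", "Banana", "Orange", "Mango", "Papaya", "Lemon",
                "Potato", "Brinjal", "Radish", "Tomato", "Onion", "Cucumber",
                "Butter Milk", "Ghee", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(200, 80, 100, 60, 40, 10,
            20, 20, 40, 40, 40, 40,
            10, 200, 100, 100, 60, 100),
  Tax   = c(10, 4, 5, 3, 2, 1,
            1, 1, 2, 2, 2, 1,
            1, 10, 5, 5, 3, 5)
)

# Same condition as the plotly filter (y > 100), applied before plotting
basket$Total <- basket$Price + basket$Tax
costly <- subset(basket, Total > 100)
print(costly[, c("ITEM_GROUP", "ITEM_NAME", "Total")])
```

Six items pass the filter: Apple and Ghee at Rs.210, and Orange, Curd, Cheese, and Paneer at Rs.105. The resulting `costly` data frame could then be passed to `plot_ly()` without a filter of its own.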

Ex. No.: 8 Date: 27/11/2023

LAB PROBLEM:
Creating Dashboards and Storytelling, creating your first dashboard and Story, Design for different displays,
adding interactivity to your Dashboard, Distributing and Publishing your Visualization.

AIM: To create Sales Dashboard using Shiny.

PROGRAM:
library(shiny)
library(shinydashboard)
library(ggplot2)
library(dplyr)

df <- read.csv('d:/R_PRG/csv/Sales_Sample.csv', stringsAsFactors = FALSE, header = TRUE)

# Dashboard header and sidebar
header <- dashboardHeader(title = "Dashboard")

sidebar <- dashboardSidebar(sidebarMenu(
  menuItem("Sales Dashboard", tabName = "dashboard", icon = icon("dashboard"))))

# First row: two KPI value boxes
frow1 <- fluidRow(valueBoxOutput("value1"), valueBoxOutput("value2"))

# Second row: two plots, each inside a collapsible box
frow2 <- fluidRow(
  box(title = "Revenue by Sales Rep", status = "primary", solidHeader = TRUE,
      collapsible = TRUE, plotOutput("revenuebyRep", height = "300px")),
  box(title = "Regionwise Sales Data", status = "primary", solidHeader = TRUE,
      collapsible = TRUE, plotOutput("revenuebyRegion", height = "300px")))

# Combine the two fluid rows to make the body
body <- dashboardBody(frow1, frow2)

# Complete the UI part with dashboardPage
ui <- dashboardPage(title = 'Sales Dashboard', header, sidebar, body, skin = 'red')

# Create the server function for the dashboard
server <- function(input, output) {

  # Some data manipulation to derive the values of the KPI boxes
  total.sales <- sum(df$Sales)
  total.units <- sum(df$Units_Sold)

  output$value1 <- renderValueBox({
    valueBox(formatC(total.sales, format = "d", big.mark = ','), 'Total Sales',
             icon = icon("stats", lib = 'glyphicon'), color = "purple")})

  output$value2 <- renderValueBox({
    valueBox(formatC(total.units, format = "d", big.mark = ','), 'Total Units Sold',
             icon = icon("gbp", lib = 'glyphicon'), color = "green")})

  # Create the plot output content
  output$revenuebyRep <- renderPlot({
    ggplot(data = df, aes(x = QTR, y = Sales, fill = factor(SalesRep))) +
      geom_bar(position = "dodge", stat = "identity") +
      ylab("Sales in Rupees") + xlab("Quarter") +
      theme(legend.position = "bottom", plot.title = element_text(size = 15, face = "bold")) +
      ggtitle("Revenue by Sales Rep") + labs(fill = "Sales Rep")})

  output$revenuebyRegion <- renderPlot({
    ggplot(data = df, aes(x = QTR, y = Sales, fill = factor(Region))) +
      geom_bar(position = "dodge", stat = "identity") +
      ylab("Sales in Rupees") + xlab("Quarter") +
      theme(legend.position = "bottom", plot.title = element_text(size = 15, face = "bold")) +
      ggtitle("Regionwise Revenue") + labs(fill = "Region")})
}
shinyApp(ui, server)

INPUT: Sales_Sample.csv
SalesRep, Region, QTR, Sales, Units_Sold
Amy, North, Q1,24971,84
Amy, South, Q2,25749,557
Amy, East, Q3,24437,95
Amy, West, Q4,25355,706
Bob, North, Q1,25320,231
Bob, South, Q2,25999,84
Bob, East, Q3,22639,260
Bob, West, Q4,23949,109
Chuck, North, Q1,20280,453
Chuck, South, Q2,21584,114
Chuck, East, Q3,19625,83
Chuck, West, Q4,19832,70
Doug, North, Q1,25150,242
Doug, South, Q2,29061,146
Doug, East, Q3,27113,120
Doug, West, Q4,25953,81
John, North, Q1,34971,184
John, South, Q2,35749,657
John, East, Q3,34437,295
John, West, Q4,35355,806

OUTPUT:

RESULT: The above experiment is successfully executed and output is verified.
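
The text shown in the "Total Sales" value box can be checked outside Shiny. The sketch below sums the Sales column of Sales_Sample.csv (entered inline so that it runs without the file) and applies the same `formatC()` call as the dashboard; the names `sales`, `total_sales`, and `kpi_label` are illustrative.

```r
# Sales column copied from Sales_Sample.csv above, four quarters per sales rep
sales <- c(24971, 25749, 24437, 25355,   # Amy
           25320, 25999, 22639, 23949,   # Bob
           20280, 21584, 19625, 19832,   # Chuck
           25150, 29061, 27113, 25953,   # Doug
           34971, 35749, 34437, 35355)   # John

total_sales <- sum(sales)

# Same formatting call the dashboard uses for its value boxes
kpi_label <- formatC(total_sales, format = "d", big.mark = ",")
print(kpi_label)   # "527,529"
```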


Ex. No.: 9 Date: 04/12/2023

LAB PROBLEM:
Tableau file types, publishing to Tableau Online, Sharing your visualizations, printing, and Exporting.

AIM: To publish, share, print, and export R output files.

PROCEDURE:

R File Types: The most commonly used file types in R are:

R files
⁕ .R – R script file
⁕ .Rproj – R project file
⁕ .RData – R data file

Data files
⁕ .csv – comma-separated values (CSV) file
⁕ .txt – plain text file, often holding tab-separated (tab-delimited) data
⁕ .dta – Stata (a syllabic abbreviation of the words statistics and data) data file
⁕ .sav – SPSS (Statistical Package for the Social Sciences) data file
⁕ .xlsx – Microsoft Excel workbook

Publishing to R Online and Sharing your Visualizations


Sharing your R projects on GitHub or RStudio Cloud comes with many advantages. Collaborating with
other data analysts, developers, or researchers can give you valuable feedback and suggestions.

GitHub:
GitHub is a web-based platform that allows you to store, manage, and version control your code and files.
It is widely used by developers, researchers, and data analysts to collaborate on projects, track changes, and
host websites.

RStudio Cloud:
RStudio Cloud is a web-based platform that allows you to create, run, and share your R projects online. It
is similar to the RStudio IDE, but you don't need to install anything on your computer. You can access your
projects from any browser and any device. RStudio Cloud also lets you collaborate with others in real time,
share your code and data, and publish your results as websites or apps.

Step 1: Create Your Data Visualization


The first step in sharing your data visualization online is, of course, creating it. RStudio is a great tool for
creating data visualizations using R, and there are countless packages available for creating everything from
basic bar charts to complex interactive visualizations. If your visualization is an interactive HTML widget
(for example, a plotly figure), save it as an HTML file using the htmlwidgets package in R: call the
saveWidget() function with your visualization as the first argument and the file path where you want to
save the HTML file as the second argument.
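
A minimal sketch of this step follows, assuming the plotly and htmlwidgets packages are installed. It applies to htmlwidgets-based output such as plotly figures; static base R or ggplot2 plots are exported with graphics devices instead. The treemap reuses some labels from Ex. No. 2(c), and the output path is a hypothetical temporary location.

```r
library(plotly)
library(htmlwidgets)

# Any htmlwidget works here; this treemap reuses some labels from Ex. No. 2(c)
fig <- plot_ly(type = "treemap",
               labels  = c("Fruits", "Orange", "Apple", "Grapes"),
               parents = c("", "Fruits", "Fruits", "Fruits"))

# Hypothetical output location; a temporary folder keeps the example self-contained
html_path <- file.path(tempdir(), "treemap.html")

# selfcontained = FALSE writes the supporting files to a sibling folder instead of
# bundling them into the HTML file (bundling requires pandoc to be available)
saveWidget(fig, file = html_path, selfcontained = FALSE)
```

When the page is to be uploaded as a single file, `selfcontained = TRUE` is the more convenient setting, provided pandoc is installed.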

Step 2: Deploy Your Visualization to RStudio Connect

RStudio Connect is a platform for sharing R-based content, including data visualizations, with others. To
deploy your visualization to RStudio Connect, you will need to create an account on the platform and upload
your HTML file. To upload your HTML file to RStudio Connect, simply click on the “Upload” button in
the dashboard and select your file. You can then customize the settings for your visualization, such as who
can access it and whether it should be password-protected.

Step 3: Publish Your Visualization to GitHub Pages


GitHub Pages is a free hosting service provided by GitHub that allows you to publish your HTML files
online. To publish your visualization to GitHub Pages, you will need to create a repository on GitHub and
upload your HTML file to it. Once you have created your repository and uploaded your HTML file, you
can enable GitHub Pages by going to the repository settings and selecting the “Pages” tab. From there, you
can choose which branch you want to publish your visualization from and customize your site settings.

Step 4: Share Your Visualization


Now that your visualization is online, you can share it with others by simply sending them the URL. You
can also embed your visualization on other websites by using the iframe code provided by RStudio Connect
or GitHub Pages.

Printing & Exporting:


File formats for exporting plots:
⁕ pdf("rplot.pdf"): pdf file
⁕ png("rplot.png"): png file
⁕ jpeg("rplot.jpg"): jpeg file
⁕ postscript("rplot.ps"): postscript file
⁕ bmp("rplot.bmp"): bmp file
⁕ win.metafile("rplot.wmf"): windows metafile
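
Each of these functions opens a graphics device; the plot is drawn while the device is open, and the file is completed when the device is closed with `dev.off()`. A minimal sketch with `pdf()` follows. The output path is a hypothetical temporary location, and the plotted values are the first series from Ex. No. 2(a).

```r
# Hypothetical output path; tempdir() keeps the example self-contained
pdf_path <- file.path(tempdir(), "rplot.pdf")

# 1. Open the device (width and height are in inches for pdf())
pdf(pdf_path, width = 7, height = 5)

# 2. Draw as usual; the output goes to the file instead of the screen
plot(c(17, 25, 38, 13, 41), type = "o", col = "blue",
     xlab = "Month", ylab = "Articles Written")

# 3. Close the device so that the file is written completely
dev.off()
```

The same three steps apply to `png()`, `jpeg()`, and `bmp()`, where the width and height are given in pixels by default.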

If you are using RStudio, you can export a plot with the Export menu of the Plots pane.

The menu offers three options: save the plot as an image, save it as a PDF, or copy it to the
clipboard.
Save as image:
If you select Save as Image..., a dialog opens in which you can select the image format
(PNG, JPEG, TIFF, BMP, Metafile, SVG, EPS), the width and height in pixels,
the directory where the file will be saved, and the file name.
Save as PDF:
If you select Save as PDF..., you can select the PDF size, the orientation, whether to use the cairo
graphics API, the directory, and the file name.
Copy to clipboard:
The last option copies the image to the clipboard, as a Bitmap or a Metafile. You can also
specify the width and the height in pixels.

In R GUI, go to File → Save as and select the type of file you prefer. If you select Jpeg,
you can also specify the quality of the resulting image. The last option copies the image to the
clipboard.

Ex. No.: 10 Date: 18/12/2023

LAB PROBLEM: Creating custom charts, cyclical data and circular area charts, Dual Axis charts.

AIM: To create circular area and dual axis charts for cyclical data.

PROGRAM: 10(a) – Circular Area Chart


# 10(a): cyclical data and circular area chart
library(ggplot2)
df <- read.csv("D:/R_PRG/csv/temp.csv")
df$Year <- as.factor(df$Year)
# Keep the months in calendar order; a plain factor would sort them alphabetically
df$Month <- factor(df$Month, levels = toupper(month.abb))

# Dodged columns per month, wrapped around a circle with coord_polar()
p <- ggplot(df, aes(x = Month, y = Temperature, fill = Year)) +
  geom_col(position = "dodge") +
  ggtitle("Temperature Data [Year 2018 - 2022]") +
  coord_polar() +
  xlab("Year and Months") + ylab("Temperature in Celsius") +
  ylim(-20, 20) +
  scale_fill_viridis_d()
p

# Re-draw with formatted title, axis titles, and axis text
p + theme(plot.title = element_text(color = "brown", size = 18, face = "bold.italic"),
          axis.title.x = element_text(color = "blue", size = 14, face = "bold"),
          axis.text.x = element_text(color = "darkblue"),
          axis.title.y = element_text(color = "red", size = 14, face = "bold"),
          axis.text.y = element_text(color = "darkred"))

INPUT: temp.csv
Year, Month, Temperature
2018,JAN,4.1
2018,FEB,2.1
2018,MAR,3.8
2018,APR,8.6
2018,MAY,12.2
2018,JUN,14.6
2018,JUL,17.8
2018,AUG,16.2
2018,SEP,13
2018,OCT,10
2018,NOV,7.4
2018,DEC,5.4
2019,JAN,3.4
2019,FEB,6
2019,MAR,7
2019,APR,7.8
2019,MAY,10.3
2019,JUN,13.4
2019,JUL,17
2019,AUG,16.5
2019,SEP,13.4
2019,OCT,8.8
2019,NOV,5.6
2019,DEC,4.9
2020,JAN,5.7
2020,FEB,5.2
2020,MAR,5.8
2020,APR,9.1
2020,MAY,11.7
2020,JUN,14
2020,JUL,14.8
2020,AUG,16
2020,SEP,13
2020,OCT,9.5
2020,NOV,7.7
2020,DEC,4
2021,JAN,1.9
2021,FEB,4
2021,MAR,6.9
2021,APR,5.5
2021,MAY,9.2
2021,JUN,14.6
2021,JUL,16.9
2021,AUG,15.4
2021,SEP,15.3
2021,OCT,11.1
2021,NOV,7
2021,DEC,4.8
2022,JAN,4.3
2022,FEB,5.7
2022,MAR,6.9
2022,APR,8.2
2022,MAY,12.2
2022,JUN,14.4
2022,JUL,17.6
2022,AUG,17.2
2022,SEP,13.7
2022,OCT,11.7
2022,NOV,8
2022,DEC,2.9

OUTPUT:

PROGRAM: 10(b) – Dual Axis Chart


# Ex. No. 10(b): cyclical data and dual axis charts
library(latticeExtra)
df <- read.csv("D:/R_PRG/csv/rain_temp.csv")
Year <- df$year
Rainfall <- df$rainfall
Temperature <- df$temperature
df <- data.frame(Year, Rainfall, Temperature)
# Construct separate plots for each series
rf <- xyplot(Rainfall ~ Year, df, type = c("p", "l"), lwd = 2, pch = 10,
             main = "Rainfall and Temperature (2010-2022)")
tp <- xyplot(Temperature ~ Year, df, type = c("p", "l"), lwd = 2, pch = 10)
# Make the plot with a second y axis and a legend
doubleYScale(rf, tp, text = c("Rainfall (millimeters)", "Temperature (Celsius)"),
             add.ylab2 = TRUE)
# Re-plot with different styles
update(trellis.last.object(), par.settings = simpleTheme(col = c("blue", "red")))

INPUT: rain_temp.csv
year, rainfall, temperature
2010,638.3,28.16
2011,561.7,29.85
2012,943,28.81
2013,605.5,28.85
2014,731.2,30.01
2015,715.5,29.38
2016,678.2,29.38
2017,754.5,29.71
2018,621.6,29.64
2019,805.9,29.54
2020,733.8,29.73
2021,899.8,29.4
2022,610.2,30.24

OUTPUT:

RESULT: The above experiments are successfully executed and outputs are verified.
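
A dual axis chart can also be sketched with ggplot2, which has no direct counterpart to `doubleYScale()`: the second series is rescaled onto the first axis, and `sec_axis()` labels the right-hand axis with the inverse mapping. The code below is an alternative under that assumption, not the method used in program 10(b); the helper names `to_rain` and `to_temp` are hypothetical, and the rows of rain_temp.csv are entered inline so that it runs without the file.

```r
library(ggplot2)

# Rows copied from rain_temp.csv above
df <- data.frame(
  Year = 2010:2022,
  Rainfall = c(638.3, 561.7, 943, 605.5, 731.2, 715.5, 678.2,
               754.5, 621.6, 805.9, 733.8, 899.8, 610.2),
  Temperature = c(28.16, 29.85, 28.81, 28.85, 30.01, 29.38, 29.38,
                  29.71, 29.64, 29.54, 29.73, 29.4, 30.24)
)

# Linear map between the two value ranges: temperature -> rainfall scale, and back
r_rng <- range(df$Rainfall)
t_rng <- range(df$Temperature)
to_rain <- function(t) r_rng[1] + (t - t_rng[1]) * diff(r_rng) / diff(t_rng)
to_temp <- function(r) t_rng[1] + (r - r_rng[1]) * diff(t_rng) / diff(r_rng)

p <- ggplot(df, aes(x = Year)) +
  geom_line(aes(y = Rainfall), colour = "blue") +
  geom_line(aes(y = to_rain(Temperature)), colour = "red") +   # drawn on the rainfall scale
  scale_y_continuous(name = "Rainfall (millimeters)",
                     sec.axis = sec_axis(~ to_temp(.), name = "Temperature (Celsius)")) +
  ggtitle("Rainfall and Temperature (2010-2022)")
p
```

Because both mappings are linear and inverse to each other, the tick labels on the right-hand axis read as true temperatures even though the red line is drawn in rainfall units.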
