UNIT V Data Visualization and Prototype Application Development
Topics: Data Visualization options; Crossfilter, the JavaScript MapReduce library; Creating an
interactive dashboard with dc.js; Dashboard development tools.
Data Visualization options
In this scenario, a hospital pharmacy needs to categorize its medicine inventory based on
light sensitivity due to new government regulations. However, no specific list of light-
sensitive medicines is provided. Using text mining techniques, each medicine's leaflet is
analyzed to determine if it is "light sensitive" or "not light sensitive." This classification
helps the pharmacy decide on the type and quantity of containers needed. The Excel
table structure shown above represents a simplified view of the dataset, which can later
be visualized in a dashboard to assist the pharmacy in stock management and regulatory
compliance.
In this case study, you're working with time-series data that tracks the stock movement
of medicines across an entire year. Each medicine has 365 entries in the dataset,
resulting in a total of 10,585 lines of data for 29 medicines (29 × 365). Although the
medicines themselves are real, the other values in the dataset are generated for confidentiality.
Here are some popular data visualization options, especially suited for creating
interactive dashboards:
1. dc.js - A JavaScript library combining Crossfilter (for data manipulation) with d3.js (for
visuals), ideal for interactive dashboards where users need to explore data through filters
and real-time updates.
2. d3.js - A powerful and flexible JavaScript library for creating complex, custom data
visualizations directly in the browser.
3. Crossfilter - A JavaScript library for fast manipulation of large datasets, often used with
dc.js to allow real-time filtering in dashboards.
4. NVD3 - Built on top of d3.js, NVD3 provides reusable chart components, making it
easier to create visualizations without writing extensive custom code.
5. C3.js - An easy-to-use, flexible charting library based on d3.js, which simplifies the
process of creating common chart types like line, bar, and pie charts.
6. Dimple.js - A simple library for business analytics that extends d3.js, offering a variety
of chart options suitable for dashboards.
7. Chart.js - A straightforward, lightweight library for basic chart types like bar, line,
and pie, often used for simpler visualizations.
These tools range from highly customizable (like d3.js) to simpler, ready-to-use options
(like Chart.js), depending on your dashboard needs and data complexity.
To get an idea of what you're about to create, go to http://dc-js.github.io/dc.js/ and
scroll down to the NASDAQ example, shown in the figure below.
Figure: A dc.js interactive example on its official website
Crossfilter, the JavaScript MapReduce library
In simple terms, Crossfilter is a JavaScript library that processes and filters large
datasets efficiently within the browser. JavaScript is not the best language for
handling huge amounts of data, so developers at Square created Crossfilter, a
MapReduce-style library, to make filtering and grouping fast enough for interactive use.
Key Points:
• Data Transfer Issues: When you send too much data over the network (whether it's the
internet or an internal network), it can slow things down for everyone. Also, loading a
lot of data into the browser can cause it to freeze, especially when the dataset gets very
large (like 100,000 to 1 million lines).
• Browser Problems: If you try to load a huge dataset in the browser, it may take a long
time or even crash if the data is too wide or too large.
• How Crossfilter Helps: Once the data is loaded into the browser, Crossfilter takes over.
It processes and filters the data locally in the browser, making it faster and reducing the
load on the network. This means you don’t have to keep sending huge chunks of data
from the server to the browser.
Example:
In your case study, a pharmacist requested stock data for 29 medicines in 2015. Instead
of reloading the data each time, Crossfilter processes the data on the client-side (in the
browser), allowing the pharmacist to interact with the data smoothly and quickly.
Conclusion:
While you want to avoid sending huge datasets over the network, Crossfilter ensures that
once the data is in the browser, it's handled efficiently, so the user can work with it
without delays or crashes.
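To make the idea of client-side processing concrete, here is a minimal plain-JavaScript sketch of filtering and grouping rows that are already in memory. It is not Crossfilter's implementation (Crossfilter relies on sorted indexes and incremental updates to stay fast); the sample rows and the function names sumStockByMedicine and filterByMedicine are assumptions made for illustration.

```javascript
// Hypothetical sample rows; field names mirror the case study (MedName, Stock).
var sampleRows = [
  { MedName: "A", Stock: 10 },
  { MedName: "B", Stock: 5 },
  { MedName: "A", Stock: 20 },
  { MedName: "B", Stock: 15 }
];

// "Map" each row to a key (the medicine name), then "reduce" rows sharing a key to one total.
function sumStockByMedicine(rows) {
  var totals = {};
  rows.forEach(function (row) {
    totals[row.MedName] = (totals[row.MedName] || 0) + Number(row.Stock);
  });
  return totals;
}

// Filtering works on the array already loaded in the browser; no server round trip is needed.
function filterByMedicine(rows, name) {
  return rows.filter(function (row) {
    return row.MedName === name;
  });
}
```

Calling sumStockByMedicine(sampleRows) gives one total per medicine, and filterByMedicine(sampleRows, "A") returns only the rows for medicine A, all without contacting the server again.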
Setting up everything:
It’s time to build the actual application, and the ingredients of our small dc.js application
are as follows:
■ jQuery - To handle the interactivity
■ Crossfilter.js - A MapReduce library and prerequisite to dc.js
■ d3.js - A popular data visualization library and prerequisite to dc.js
■ dc.js - The visualization library you will use to create your interactive dashboard
■ Bootstrap - A widely used layout library you'll use to make it all look better
You'll write only three files:
■ index.html - The HTML page that contains your application
■ application.js - To hold all the JavaScript code you'll write
■ application.css - For your own CSS
To run your code on an HTTP server without setting up complex server software like
LAMP or WAMP, you can use Python's built-in HTTP server. Here's how to do it:
1. Open the Command Line: Use your terminal or command prompt (CMD).
2. Navigate to Your Project Folder: Use the cd command to go to the folder
where your index.html file is located.
3. Run the Python HTTP Server:
If you have Python 2.x, use:
python -m SimpleHTTPServer
If you have Python 3.x, use:
python -m http.server 8000
As you can see in figure 9.3, an HTTP server is started on localhost port 8000. In your
browser this translates to “localhost:8000”; putting “0.0.0.0:8000” won’t work.
Figure: Starting up a simple Python HTTP server
Make sure to have all the required files available in the same folder as your index.html.
You can download them from the Manning website or from their creators’ websites.
■ dc.css and dc.min.js - https://dc-js.github.io/dc.js/
■ d3.v3.min.js - http://d3js.org/
■ crossfilter.min.js - http://square.github.io/crossfilter/
Now we know how to run the code we’re about to create, so let’s look at the
index.html page, shown in the following listing.
CreateTable() requires three arguments:
■ data—The data it needs to put into a table.
■ variablesInTable—What variables it needs to show.
■ Title—The title of the table. It’s always nice to know what you’re looking at.
CreateTable() uses a predefined variable, tableTemplate, that contains our
overall table layout. CreateTable() can then add rows of data to this template.
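The original CreateTable() listing is not reproduced here, so the following is only a sketch of what such a helper could look like, not the author's code. It assumes the function returns an HTML string (which jQuery's append() accepts) instead of filling tableTemplate, and it introduces two hypothetical helpers, readPath() and escapeHtml(). readPath() allows dotted column names such as "value.stockAvg", which appear later when reduced groups are displayed.

```javascript
// Hypothetical helper: reads a possibly dotted path such as "value.stockAvg" from a record.
function readPath(record, path) {
  return path.split(".").reduce(function (obj, key) {
    return obj == null ? undefined : obj[key];
  }, record);
}

// Hypothetical helper: escapes characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Sketch of CreateTable(data, variablesInTable, title) that returns the table as an HTML string.
function CreateTable(data, variablesInTable, title) {
  var header = variablesInTable.map(function (name) {
    return "<th>" + escapeHtml(name) + "</th>";
  }).join("");
  var rows = data.map(function (record) {
    var cells = variablesInTable.map(function (name) {
      var value = readPath(record, name);
      // Missing values are shown as empty cells.
      return "<td>" + escapeHtml(value == null ? "" : value) + "</td>";
    }).join("");
    return "<tr>" + cells + "</tr>";
  }).join("");
  return '<table class="table">' +
    "<caption>" + escapeHtml(title) + "</caption>" +
    "<thead><tr>" + header + "</tr></thead>" +
    "<tbody>" + rows + "</tbody></table>";
}
```

With this sketch, $('#inputtable').append(CreateTable(rows, ['MedName', 'Stock'], 'Input')) would insert a titled table; the column names in that call are examples only.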
Now that you have your utilities, let’s get to the main function of the application, as
shown in the following listing.
You start off by showing your data on the screen, but preferably not all of it; the
first five entries will do, as shown in figure 9.4. Your data contains a date variable,
and you want to make sure Crossfilter recognizes it as a date later on, so you first
parse it and create a new variable called Day. For now the table shows the original
Date, but later on you'll use Day for all your calculations.
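As a rough sketch of this parsing step, the snippet below adds a Day field to each record. The date layout (month/day/year text such as "01/31/2015") and the helper names parseDay and addDayField are assumptions; adapt the parser to whatever format your Date column actually uses.

```javascript
// Assumption: Date is stored as text in month/day/year order, e.g. "01/31/2015".
function parseDay(dateText) {
  var parts = dateText.split("/");
  if (parts.length !== 3) {
    throw new Error("Unexpected date format: " + dateText);
  }
  var month = Number(parts[0]);
  var day = Number(parts[1]);
  var year = Number(parts[2]);
  // JavaScript months are zero-based, so January is month 0.
  return new Date(year, month - 1, day);
}

// Adds a Day field (a Date object) to every record and keeps the original Date text.
function addDayField(records) {
  records.forEach(function (d) {
    d.Day = parseDay(d.Date);
  });
  return records;
}
```

Keeping Date as text and Day as a real Date object means the table can show the original value while dimensions and time scales work on Day.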
Unleashing Crossfilter to filter the medicine data set
Here’s a streamlined example of how to implement filtering with Crossfilter on a
dataset of medicines. This example sets up the Crossfilter instance, defines a
dimension based on medicine names, and filters the data by a specific medicine,
showing the top five results.
function main() {
  // variablesInTable (the columns to show) is assumed to be defined earlier in the application.
  // Step 1: Initialize Crossfilter with the dataset
  var CrossfilterInstance = crossfilter(medicineData);
  // Step 2: Create a dimension based on medicine names
  var medNameDim = CrossfilterInstance.dimension(function(d) {
    return d.MedName;
  });
  // Step 3: Filter data for a specific medicine name
  var dataFiltered = medNameDim.filter('Grazax 75 000 SQ-T');
  // Step 4: Display the top five filtered results
  var filteredTable = $('#filteredtable');
  filteredTable.empty().append(
    CreateTable(dataFiltered.top(5), variablesInTable, 'Our First Filtered Table'));
}
Explanation:
1. Initialize Crossfilter: CrossfilterInstance = crossfilter(medicineData); creates
a Crossfilter instance with medicineData.
2. Define Dimension: medNameDim is a dimension for the MedName field.
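3. Filter by Medicine: medNameDim.filter() keeps only the records for
"Grazax 75 000 SQ-T" and returns the dimension itself, so dataFiltered refers to
the filtered medNameDim.
4. Display Results: dataFiltered.top(5) returns five of the matching records, and
CreateTable() formats and displays them in a table.
This code displays five records for the specified medicine in a table using
Crossfilter and JavaScript.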
To calculate the average stock per medicine with Crossfilter, you need to set up a
custom reduce() with three functions: one that initializes the result, one that adds a
record, and one that removes a record. Note that group.reduce() takes them in the order
add, remove, initial. Here's how to implement this:
function main() {
  // Step 1: Initialize Crossfilter
  var CrossfilterInstance = crossfilter(medicineData);
  // Step 2: Create a dimension based on medicine names
  var medNameDim = CrossfilterInstance.dimension(function(d) {
    return d.MedName;
  });
  // Step 3: Define the reduce functions
  var reduceInitAvg = function() {
    return { count: 0, stockSum: 0, stockAvg: 0 };
  };
  var reduceAddAvg = function(p, v) {
    p.count += 1;
    p.stockSum += Number(v.Stock);
    p.stockAvg = Math.round(p.stockSum / p.count);
    return p;
  };
  var reduceRemoveAvg = function(p, v) {
    p.count -= 1;
    p.stockSum -= Number(v.Stock);
    p.stockAvg = p.count === 0 ? 0 : Math.round(p.stockSum / p.count);
    return p;
  };
  // Step 4: Apply reduce function to calculate average stock per medicine
  var dataFiltered = medNameDim.group().reduce(reduceAddAvg, reduceRemoveAvg, reduceInitAvg);
  // Define which columns to display in the table
  var variablesInTable = ["key", "value.stockAvg"];
  // Step 5: Display the reduced table with average stock per medicine
  var filteredTable = $('#filteredtable');
  filteredTable.empty().append(
    CreateTable(dataFiltered.top(Infinity), variablesInTable, 'Reduced Table'));
}
Explanation:
1. Initialization (reduceInitAvg): Sets initial values to zero for count, stockSum,
and stockAvg.
2. Add Function (reduceAddAvg): Increments count by 1, adds the current
Stock value to stockSum, and recalculates stockAvg.
3. Remove Function (reduceRemoveAvg): Decreases count by 1, subtracts
Stock from stockSum, and recalculates stockAvg. If count is zero, stockAvg
resets to zero to avoid division by zero.
4. Apply Reduction: Groups by medicine name and reduces using
reduceAddAvg, reduceRemoveAvg, and reduceInitAvg.
5. Display Results: Uses CreateTable() to show the average stock for each
medicine.
This setup allows you to calculate and display the average stock per medicine using
Crossfilter’s custom reduce() functionality.
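To see how the three functions cooperate, the sketch below runs them by hand on a few made-up records, without Crossfilter. The records and the variable names demoRecords and avgState are assumptions for illustration; the three reduce functions are the ones described above.

```javascript
function reduceInitAvg() {
  return { count: 0, stockSum: 0, stockAvg: 0 };
}
function reduceAddAvg(p, v) {
  p.count += 1;
  p.stockSum += Number(v.Stock);
  p.stockAvg = Math.round(p.stockSum / p.count);
  return p;
}
function reduceRemoveAvg(p, v) {
  p.count -= 1;
  p.stockSum -= Number(v.Stock);
  p.stockAvg = p.count === 0 ? 0 : Math.round(p.stockSum / p.count);
  return p;
}

// Hypothetical stock records for a single medicine (Stock as text, as it might come from a CSV).
var demoRecords = [{ Stock: "10" }, { Stock: "20" }, { Stock: "33" }];

// Start from the initial value, then add each record in turn.
var avgState = reduceInitAvg();
demoRecords.forEach(function (record) {
  avgState = reduceAddAvg(avgState, record);
});
var avgAfterAdds = avgState.stockAvg;    // 63 / 3 = 21

// A filter elsewhere excludes the last record, so it is removed again.
avgState = reduceRemoveAvg(avgState, demoRecords[2]);
var avgAfterRemove = avgState.stockAvg;  // 30 / 2 = 15
```

Because the running count and sum are kept in the state, the average can be updated when a single record enters or leaves a filter without re-reading the whole dataset.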
Creating an interactive dashboard with dc.js
To create an interactive dashboard with dc.js, set up the HTML layout, write the
JavaScript for Crossfilter and dc.js, and then render the charts. The code can be
structured as follows:
1. Update index.html:
Add placeholders for the charts and the reset button:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Data Science Application</title>
  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.0/css/bootstrap.min.css">
  <link rel="stylesheet" href="dc.min.css">
</head>
<body>
  <main class='container'>
    <h1>Chapter 10: Data Science Application</h1>
    <!-- Input and Filtered Table Rows -->
    <div class="row">
      <div class='col-lg-12'>
        <div id="inputtable" class="well well-sm"></div>
      </div>
    </div>
    <div class="row">
      <div class='col-lg-12'>
        <div id="filteredtable" class="well well-sm"></div>
      </div>
    </div>
    <!-- Reset Button -->
    <button class="btn btn-success">Reset Filters</button>
    <!-- Dashboard Chart Rows -->
    <div class="row">
      <div class="col-lg-6">
        <div id="StockOverTime" class="well well-sm"></div>
        <div id="LightSensitiveStock" class="well well-sm"></div>
      </div>
      <div class="col-lg-6">
        <div id="StockPerMedicine" class="well well-sm"></div>
      </div>
    </div>
  </main>
  <script src="https://code.jquery.com/jquery-1.9.1.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.0/js/bootstrap.min.js"></script>
  <script src="crossfilter.min.js"></script>
  <script src="d3.v3.min.js"></script>
  <script src="dc.min.js"></script>
  <script src="application.js"></script>
</body>
</html>
2. Set up the Dashboard with application.js:
In application.js, use Crossfilter and dc.js to create the charts.
function main() {
  // Initialize Crossfilter with data
  var CrossfilterInstance = crossfilter(medicineData);

  // Dimensions for date, medicine name, and light sensitivity
  var DateDim = CrossfilterInstance.dimension(function(d) { return d.Day; });
  var medNameDim = CrossfilterInstance.dimension(function(d) { return d.MedName; });
  var lightSenDim = CrossfilterInstance.dimension(function(d) { return d.LightSen; });

  // Group data by dimensions
  var SummatedStockPerDay = DateDim.group().reduceSum(function(d) { return d.Stock; });
  var AvgStockMedicine = medNameDim.group().reduce(reduceAddAvg, reduceRemoveAvg, reduceInitAvg);
  var SummatedStockLight = lightSenDim.group().reduceSum(function(d) { return d.Stock; });

  // Determine min and max dates for x-axis
  var minDate = DateDim.bottom(1)[0].Day;
  var maxDate = DateDim.top(1)[0].Day;

  // Line Chart for Stock Over Time
  var StockOverTimeLineChart = dc.lineChart("#StockOverTime");
  StockOverTimeLineChart
    .width(null)
    .height(400)
    .dimension(DateDim)
    .group(SummatedStockPerDay)
    .x(d3.time.scale().domain([minDate, maxDate]))
    .xAxisLabel("Year 2015")
    .yAxisLabel("Stock")
    .margins({left: 60, right: 50, top: 50, bottom: 50});

  // Row Chart for Average Stock per Medicine
  var AverageStockPerMedicineRowChart = dc.rowChart("#StockPerMedicine");
  AverageStockPerMedicineRowChart
    .width(null)
    .height(1200)
    .dimension(medNameDim)
    .group(AvgStockMedicine)
    .margins({top: 20, left: 10, right: 10, bottom: 20})
    .valueAccessor(function(p) { return p.value.stockAvg; });

  // Pie Chart for Light Sensitive Stock
  var LightSensitiveStockPieChart = dc.pieChart("#LightSensitiveStock");
  LightSensitiveStockPieChart
    .width(null)
    .height(300)
    .dimension(lightSenDim)
    .radius(90)
    .group(SummatedStockLight);

  // Reset Filters Function (declared with var to avoid creating an implicit global)
  var resetFilters = function() {
    StockOverTimeLineChart.filterAll();
    LightSensitiveStockPieChart.filterAll();
    AverageStockPerMedicineRowChart.filterAll();
    dc.redrawAll();
  };

  // Link reset button with resetFilters function
  $('.btn-success').click(resetFilters);

  // Render all charts
  dc.renderAll();
}
// Reduce functions for custom average calculation
function reduceInitAvg() {
  return { count: 0, stockSum: 0, stockAvg: 0 };
}
function reduceAddAvg(p, v) {
  p.count += 1;
  p.stockSum += Number(v.Stock);
  p.stockAvg = Math.round(p.stockSum / p.count);
  return p;
}
function reduceRemoveAvg(p, v) {
  p.count -= 1;
  p.stockSum -= Number(v.Stock);
  p.stockAvg = p.count === 0 ? 0 : Math.round(p.stockSum / p.count);
  return p;
}
3. Explanation:
• HTML Structure: The placeholders for the input table,
filtered table, and three chart areas (StockOverTime,
StockPerMedicine, and LightSensitiveStock) are
defined, along with a reset button.
• Data Preparation in JavaScript:
o CrossfilterInstance initializes Crossfilter with
data, and three dimensions are created: DateDim,
medNameDim, and lightSenDim.
o Groups are defined for each dimension to
aggregate data.
• Charts Configuration:
o Stock Over Time (Line Chart): Uses DateDim
with aggregated stock data
(SummatedStockPerDay).
o Average Stock per Medicine (Row Chart): Uses
medNameDim and displays the average stock using
a custom reduce() function.
o Light Sensitive Stock (Pie Chart): Uses
lightSenDim to show stock grouped by light
sensitivity.
• Reset Button: Clears filters from all charts and
redraws them.
Once main() is called with medicineData loaded, this setup provides a dashboard with
linked charts: a filter applied on one chart updates the others, allowing interactive
data exploration.
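The linking between charts follows from how Crossfilter groups behave: a group reflects the filters set on every other dimension, but not the filter on its own dimension. The plain-JavaScript sketch below imitates that rule on a few made-up rows. It is an illustration only, not Crossfilter's or dc.js's implementation; demoRows, activeFilters, and groupSum are hypothetical names.

```javascript
// Hypothetical rows; field names follow the case study.
var demoRows = [
  { MedName: "A", LightSen: "Yes", Stock: 10 },
  { MedName: "B", LightSen: "No", Stock: 5 },
  { MedName: "C", LightSen: "Yes", Stock: 20 }
];

// One active filter value per field; null means "no filter on this field".
var activeFilters = { MedName: null, LightSen: null };

// Sums Stock per value of `field`, applying every filter except the one on `field` itself.
function groupSum(rows, field, filters) {
  var totals = {};
  rows.forEach(function (row) {
    // Every key stays in the result, even when all of its rows are filtered out.
    if (!(row[field] in totals)) {
      totals[row[field]] = 0;
    }
    var keep = Object.keys(filters).every(function (name) {
      return name === field || filters[name] === null || row[name] === filters[name];
    });
    if (keep) {
      totals[row[field]] += row.Stock;
    }
  });
  return totals;
}

// Clicking the "Yes" slice of the pie chart corresponds to a filter on LightSen.
activeFilters.LightSen = "Yes";
var perMedicine = groupSum(demoRows, "MedName", activeFilters);   // B drops to 0
var perLightSen = groupSum(demoRows, "LightSen", activeFilters);  // both slices keep their totals

// The Reset Filters button corresponds to clearing every filter.
Object.keys(activeFilters).forEach(function (name) {
  activeFilters[name] = null;
});
var perMedicineAfterReset = groupSum(demoRows, "MedName", activeFilters);
```

This is why selecting a pie slice shrinks the bars in the row chart while the pie itself still shows all of its slices, so the user can switch the selection.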