
Module 4-Data visualization to the end user


Uploaded by

sadhanakrishna05
Copyright
© © All Rights Reserved
Data visualization to the end user

Data visualization options


 You have several options for delivering a dashboard to your end users.
 Here we’ll focus on a single option, and by the end of this chapter you’ll be able to create
a dashboard yourself.
 This chapter’s case is that of a hospital pharmacy with a stock of a few thousand
medicines.
 The government came out with a new norm for all pharmacies: all medicines should be
checked for their sensitivity to light and be stored in new, special containers.
 One thing the government didn’t supply to the pharmacies was an actual list of light-
sensitive medicines. This is no problem for you as a data scientist because every
medicine has a patient information leaflet that contains this information.
 You distill the information with the clever use of text mining and assign a “light
sensitive” or “not light sensitive” tag to each medicine.
 This information is then uploaded to the central database.
 In addition, the pharmacy needs to know how many containers would be necessary. For
this they give you access to the pharmacy stock data.
 When you draw a sample with only the variables you require, the data set looks like
figure 9.1 when opened in Excel.

 As you can see, the information is time-series data for an entire year of stock movement,
so every medicine has 365 entries in the data set.
 Although the case study is an existing one and the medicines in the data set are real, the
values of the other variables presented here were randomly generated, as the original data
is classified.
 Also, the data set is limited to 29 medicines, a little more than 10,000 lines of data.
 Even though people do create reports using crossfilter.js (a JavaScript MapReduce
library) and dc.js (a JavaScript dashboarding library) with more than a million lines of
data, for the example’s sake you’ll use a fraction of this amount.
 Also, it’s not recommended to load your entire database into the user’s browser; the
browser will freeze while loading, and if it’s too much data, the browser will even crash.
 Normally data is pre-calculated on the server and parts of it are requested using, for
example, a REST service.
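The pre-calculation mentioned above could look roughly like the following. This is a minimal sketch in plain JavaScript, not the chapter's server code: the function name aggregateMonthly, the field names MedName, Date, and Stock, and the MM/DD/YYYY date format are all assumptions made for illustration. It collapses daily rows into one summary row per medicine per month, so far fewer lines have to travel to the browser.

```javascript
// Hypothetical row shape: { MedName, Date: 'MM/DD/YYYY', Stock: number }.
// Field names and date format are assumptions, not taken from the data set.
function aggregateMonthly(rows) {
  const totals = new Map();
  for (const row of rows) {
    const [month, , year] = row.Date.split('/');
    const key = `${row.MedName}|${year}-${month}`;
    const entry = totals.get(key) ||
      { MedName: row.MedName, Month: `${year}-${month}`, StockSum: 0, Days: 0 };
    entry.StockSum += row.Stock;
    entry.Days += 1;
    totals.set(key, entry);
  }
  // One summary row per medicine per month, holding the average stock level.
  return Array.from(totals.values()).map(e => ({
    MedName: e.MedName,
    Month: e.Month,
    AvgStock: e.StockSum / e.Days
  }));
}

// Three made-up daily rows collapse into two monthly rows.
const dailySample = [
  { MedName: 'MedA', Date: '01/01/2015', Stock: 100 },
  { MedName: 'MedA', Date: '01/02/2015', Stock: 80 },
  { MedName: 'MedA', Date: '02/01/2015', Stock: 60 }
];
const monthlySample = aggregateMonthly(dailySample);
console.log(monthlySample);
```

Whether monthly averages are the right level of detail depends on what the dashboard has to show; the point is only that the summary is computed before the data is sent.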
 To turn this data into an actual dashboard you have many options and you can find a
short overview of the tools later in this chapter.
 Among all the options, for this book we decided to go with dc.js, which is a crossbreed
between the JavaScript MapReduce library Crossfilter and the data visualization library
d3.js.
 Crossfilter was developed by Square Register, a company that handles payment
transactions; it’s comparable to PayPal but its focus is on mobile.
 Square developed Crossfilter to allow their customers extremely speedy slice and dice
on their payment history.
 Crossfilter is not the only JavaScript library capable of MapReduce processing, but it
most certainly does the job, is open source, is free to use, and is maintained by an
established company (Square).
 Example alternatives to Crossfilter are Map.js, Meguro, and Underscore.js.
 JavaScript might not be known as a data crunching language, but these libraries do give
web browsers that extra bit of punch in case data does need to be handled in the browser.
 d3.js can safely be called the most versatile JavaScript data visualization library available
at the time of writing; it was developed by Mike Bostock as a successor to his Protovis
library.
 Many JavaScript libraries are built on top of d3.js.
 NVD3, C3.js, xCharts, and Dimple offer roughly the same thing: an abstraction layer on
top of d3.js, which makes it easier to draw simple graphs.
 They mainly differ in the type of graphs they support and their default design.
 Visit their websites and find out for yourself:
 ■ NVD3—http://nvd3.org/
 ■ C3.js—http://c3js.org/
 ■ xCharts—http://tenxer.github.io/xcharts/
 ■ Dimple—http://dimplejs.org/
Many options exist. So why dc.js?
 The main reason: compared to what it delivers, an interactive dashboard where clicking
one graph will create filtered views on related graphs, dc.js is surprisingly easy to set up.
 Click around the dashboard and see the graphs react and interact when you select and
deselect data points.
 As stated before, dc.js has two big prerequisites: d3.js and crossfilter.js.
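The linked filtering described above can be illustrated without any library. The sketch below is plain JavaScript and does not use the dc.js or Crossfilter APIs; the function groupTotals and the field names MedName, LightSen, and Stock are hypothetical. It shows the basic idea: a selection made on one chart becomes a filter, and the totals behind another chart are recomputed from the rows that pass it.

```javascript
// Hypothetical records; field names and values are made up for illustration.
const stockRecords = [
  { MedName: 'MedA', LightSen: 'Yes', Stock: 40 },
  { MedName: 'MedB', LightSen: 'No', Stock: 25 },
  { MedName: 'MedC', LightSen: 'Yes', Stock: 10 },
  { MedName: 'MedD', LightSen: 'No', Stock: 5 }
];

// Sum Stock per value of groupField, keeping only rows that pass every
// active filter (an object mapping a field name to its selected value).
function groupTotals(rows, groupField, filters) {
  const totals = {};
  for (const row of rows) {
    const passes = Object.keys(filters).every(f => row[f] === filters[f]);
    if (!passes) continue;
    totals[row[groupField]] = (totals[row[groupField]] || 0) + row.Stock;
  }
  return totals;
}

// No selection: the per-medicine chart shows every medicine.
const allMedicines = groupTotals(stockRecords, 'MedName', {});
// "Clicking" the Yes slice of a light-sensitivity chart adds a filter,
// and the per-medicine totals are recomputed from the filtered rows.
const lightSensitiveOnly = groupTotals(stockRecords, 'MedName', { LightSen: 'Yes' });

console.log(allMedicines);
console.log(lightSensitiveOnly);
```

This naive version rescans every row on each click. Libraries such as Crossfilter exist precisely to avoid that cost on larger data sets.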
9.2 Crossfilter, the JavaScript MapReduce library
 JavaScript isn’t the greatest language for data crunching.
 But that didn’t stop people, like the folks at Square, from developing MapReduce
libraries for it. If you’re dealing with data, every bit of speed gain helps.
 You don’t want to send enormous loads of data over the internet or even your internal
network though, for these reasons:
1. Sending a bulk of data will tax the network to the point where it will bother other users.
2. The browser is on the receiving end, and while loading in the data it will temporarily
freeze. For small amounts of data this is unnoticeable, but when you start looking at
100,000 lines, it can become a visible lag. When you go over 1,000,000 lines, depending
on the width of your data, your browser could give up on you.
 Conclusion: it’s a balancing exercise. For the data you do send, Crossfilter can handle it
for you once it arrives in the browser.
 In our case study, the pharmacist asked the central server for the 2015 stock data of the
29 medicines she was particularly interested in.
9.2.1 Setting up everything
 It’s time to build the actual application, and the ingredients of our small dc.js application
are as follows:
 JQuery—To handle the interactivity
 Crossfilter.js—A MapReduce library and prerequisite to dc.js
 d3.js—A popular data visualization library and prerequisite to dc.js
 dc.js—The visualization library you will use to create your interactive dashboard
 Bootstrap—A widely used layout library you’ll use to make it all look better
You’ll write only three files:
 index.html—The HTML page that contains your application
 application.js—To hold all the JavaScript code you’ll write
 application.css—For your own CSS

 In addition, you’ll need to run our code on an HTTP server.


 You could go through the effort of setting up a LAMP (Linux, Apache, MySQL, PHP),
WAMP (Windows, Apache, MySQL, PHP), or XAMPP (Cross Environment, Apache,
MySQL, PHP, Perl) server.
 Instead we can do it with a single Python command. Use your command-line tool (Linux
shell or Windows CMD) and move to the folder containing your index.html (once it’s
there).
 For Python 2:
python -m SimpleHTTPServer 8000
 For Python 3:
python -m http.server 8000
 As you can see in figure 9.3, an HTTP server is started on localhost port 8000.
 In your browser this translates to “localhost:8000”; putting “0.0.0.0:8000” won’t work.

Make sure to have all the required files available in the same folder as your index.html.
You can download them from the Manning website or from their creators’ websites.
■ dc.css and dc.min.js—https://dc-js.github.io/dc.js/
■ d3.v3.min.js—http://d3js.org/
■ crossfilter.min.js—http://square.github.io/crossfilter/
Now we know how to run the code we’re about to create, so let’s look at the index.html page,
shown in the following listing.

 No surprises here. The header contains all the CSS libraries you’ll use, and the
JavaScript is loaded at the end of the HTML body.
 Using a JQuery onload handler, your application will be loaded when the rest of the page
is ready.
 You start off with two table placeholders: one to show what your input data looks like,
<div id="inputtable"></div>, and the other one will be used with Crossfilter to show a
filtered table, <div id="filteredtable"></div>.
 Several Bootstrap CSS classes were used, such as “well”, “container”, the Bootstrap grid
system with “row” and “col-xx-xx”, and so on.
 They make the whole thing look nicer but they aren’t mandatory.
 Now that you have your HTML set up, it’s time to show your data onscreen.
 For this, turn your attention to the application.js file you created.
 First, wrap all the code you’re about to write in a jQuery onload handler.
$(function() {
//All future code will end up in this wrapper
})
 Now we’re certain our application will be loaded only when all else is ready.
 This is important because we’ll use JQuery selectors to manipulate the HTML.
 It’s time to load in data.
d3.csv('medicines.csv',function(data) {
main(data)
});
 You don’t have a REST service ready and waiting for you, so for the example you’ll
draw the data from a .csv file.
 This file is available for download on Manning’s website. d3.js offers an easy function,
d3.csv, for loading it.
 After loading in the data you hand it over to your main application function in the d3.csv
callback function.
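It helps to know what shape the loaded data has: d3.csv hands the callback an array of objects, one per row, keyed by the column headers, and every value arrives as a string. The sketch below imitates that shape with a deliberately naive parser so it can run without d3.js. It is not how d3.csv is implemented, and the column names and values in the sample text are made up.

```javascript
// Naive CSV parser for illustration only: it assumes a header row, comma
// separators, and no quoted fields. It is not d3's implementation.
function parseSimpleCsv(text) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(',');
  return lines.slice(1).map(line => {
    const cells = line.split(',');
    const row = {};
    header.forEach((name, i) => { row[name] = cells[i]; });
    return row;
  });
}

// Hypothetical column names and values.
const csvText = 'MedName,Date,Stock\nMedA,01/01/2015,40\nMedA,01/02/2015,38';
const parsedRows = parseSimpleCsv(csvText);

console.log(parsedRows[0]);
console.log(typeof parsedRows[0].Stock); // values are still strings at this point
```

Because numbers and dates arrive as text, they have to be converted before they can be used in calculations.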
 Apart from the main function you have a CreateTable function, which you will use to…
you guessed it…create your tables, as shown in the following listing.
CreateTable() requires three arguments:
■ data—The data it needs to put into a table.
■ variablesInTable—What variables it needs to show.
■ Title—The title of the table.
 CreateTable() uses a predefined variable, tableTemplate, that contains our overall table
layout.
 CreateTable() can then add rows of data to this template.
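To make the three arguments concrete, here is a minimal sketch of a table builder with the same signature. It is not the chapter's listing: instead of cloning a jQuery tableTemplate it returns an HTML string, and the names createTableHtml and escapeHtml, the CSS class, and the sample row are all hypothetical.

```javascript
// Escape text so cell values cannot break the generated markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical string-based variant of CreateTable(data, variablesInTable, title).
function createTableHtml(data, variablesInTable, title) {
  const headerCells = variablesInTable
    .map(name => `<th>${escapeHtml(name)}</th>`).join('');
  // Only the requested variables end up in each row, in the requested order.
  const bodyRows = data.map(row => {
    const cells = variablesInTable
      .map(name => `<td>${escapeHtml(row[name])}</td>`).join('');
    return `<tr>${cells}</tr>`;
  }).join('');
  return '<table class="table">' +
    `<caption>${escapeHtml(title)}</caption>` +
    `<thead><tr>${headerCells}</tr></thead>` +
    `<tbody>${bodyRows}</tbody>` +
    '</table>';
}

const tableHtml = createTableHtml(
  [{ MedName: 'MedA', Stock: 40, Note: 'not shown' }],
  ['MedName', 'Stock'],
  'Input data'
);
console.log(tableHtml);
// In the browser the result could be placed with: $('#inputtable').html(tableHtml);
```

Building a string keeps the sketch runnable outside a browser. A template-based version like the one the text describes would fill the same three parts: caption, header row, and body rows.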
 Now that you have your utilities, let’s get to the main function of the application, as
shown in the following listing.
 You start off by showing your data on the screen, but preferably not all of it; only the first
five entries will do, as shown in figure 9.4.
 You have a date variable in your data and you want to make sure Crossfilter will
recognize it as such later on, so you first parse it and create a new variable called Day.
 The original, Date, still appears in the table for now, but later on you’ll use Day
for all your calculations.
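The two preparation steps described above, parsing the date and keeping only five rows for the preview, can be sketched as follows. This is plain JavaScript under stated assumptions, not the chapter's main function: the MM/DD/YYYY format, the helper names parseDay and prepare, and the generated sample rows are all hypothetical.

```javascript
// Assumes dates are 'MM/DD/YYYY' strings in a field called Date; both the
// format and the field names are assumptions for this sketch.
function parseDay(dateString) {
  const [month, day, year] = dateString.split('/').map(Number);
  return new Date(year, month - 1, day); // JavaScript months are zero-based
}

function prepare(data) {
  // Add Day next to the original Date string, leaving Date untouched.
  data.forEach(row => { row.Day = parseDay(row.Date); });
  // Only the first five entries are needed for the preview table.
  return data.slice(0, 5);
}

// Seven made-up rows, one per day.
const stockRows = [];
for (let d = 1; d <= 7; d++) {
  stockRows.push({ MedName: 'MedA', Date: `01/0${d}/2015`, Stock: 50 - d });
}
const preview = prepare(stockRows);
console.log(preview.length, preview[0].Date, preview[0].Day.getFullYear());
```

Keeping both Date and Day means the table can show the original text while calculations work on a real Date object.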
