1152cs191 Data Visualization Unit V

Uploaded by

Abhinav Kumar
Copyright
© © All Rights Reserved

School of Computing
Department of Computer Science & Engineering

1152CS191 - Data Visualization
Category: Program Elective
UNIT-V

Course Handling Faculty

12/10/2024 Department of Computer Science & Engineering Data Visualization


Course Outcomes

CO No: CO5
Course Outcome: Explore different visualization tools for various applications
Level of learning domain (based on revised Bloom's taxonomy): K2

Outcome column headings: Engineering Knowledge; Problem Analysis; Design / Development of Solutions; Conduct Investigations of Complex Problems; Modern Tool Usage; The Engineer and Society; Environment & Sustainability; Ethics; Individual & Team Work; Communication; Project Management & Finance; Life Long Learning; Mathematical Concepts; Software Development; Transferring Skills

Correlation of COs with Student Outcomes ABET EAC and CAC

CO    SO1  SO2  SO3  SO4  SO5  SO6  SO7
CO5   3    2    2    -    -    2    3

CO    SO1  SO2  SO3  SO4  SO5  SO6
CO5   3    2    2    -    -    2

Course Content
UNIT V Data Visualization Tools 9

• Trends in Data Visualization and Other Tools
• Tableau
• Data Wrangler
• Python
• D3.js
• R and Shiny
• Visualization for Genetic Network Reconstruction
• Reconstruction, Visualization and Analysis of Medical Images
• Exploratory Graphics of a Financial Dataset
• Graphical Data Representation in Bankruptcy Analysis
• Visualization Tools for Insurance Risk Processes

Trends in Data Visualization
Responsive and interactive analysis

• Software that responds quickly to user commands is becoming an important feature of most data visualization products on the market.
• This includes drag-and-drop user interfaces for creating dashboards and graphical illustrations.
• Interactive dashboards support active communication among the users connected through the software.
• An interactive dashboard helps a business communicate effectively with employees, business partners, and clients.

Trends in Data Visualization
Drag and Drop User-interface



Trends in Data Visualization
Deeply analyzing raw data for timely decisions

This trend is widely favored by businesses of all sizes that want to make timely decisions based on fast analysis of raw data.
Software that can use SQL to organize unstructured data serves as a powerful intelligence platform.
Using a common data modeling language results in impressive visualizations and smooth data consolidation.
Instant report generation is also a well-known feature of this kind of software.


Trends in Data Visualization
Integrates data into one platform for powerful analysis of data

• This trend allows data from all the departments of an organization to be shared securely.
• It also supports complete annotations and real-time data updates.
• It offers effective data analysis through more engaging graphical tools and designs.
• Team members work more effectively day to day because they can quickly create visualizations that showcase business processes and job descriptions.

Trends in Data Visualization
Availability of helpful tools

• Software purchased by many companies includes tools such as KPI widgets, pivot tables, and tabular view components, which are helpful in report generation.
• This type of software supports team collaboration by enabling accurate, complete, and timely report creation and decision-making.
• It also allows reports or dashboards to be embedded in the company's websites, blogs, and applications.

Trends in Data Visualization
Accessible through desktop computers and mobile devices

• The software can link, graph, and share data between desktop computers and mobile devices.
• Dashboards can be shared with team members and analyzed on desktop computers, laptops, and mobile devices such as smartphones and tablets.

Trends in Data Visualization
Social collaborations

• The dashboard provides real-time data, which makes collaboration more effective for team members who use the software.
• It also offers multi-function widgets, sparklines, and trend indicators to help the team understand data visualizations.
• User-to-user messaging supports effective communication between users of the software.
• This social sharing feature helps team members relay information successfully.

Trends in Data Visualization
Set defaults and customize data connectors and templates

• The design of the graphical visualizations used to classify and arrange a company's data is also setting a trend among software buyers.
• The customization features of some software let users adjust data connectors and graphical templates to their needs.

Data Visualization Software: Change Is Rapid

• Individual data visualization products offer many different combinations of features.
• The latest trends focus on the availability of graphical designs and templates, interactive capabilities, fast data transformation, data analysis assistance, helpful tools, accessibility from desktop and mobile devices, software compatibility, team communication support, and customization features.
• These are the latest trends in the global market for data visualization tools.
• The market changes every year as technology advances rapidly.

Data Wrangling

• Data wrangling involves processing data in various ways, such as merging, grouping, and concatenating, in order to analyse it or to prepare it for use with another set of data.
• Python, through libraries such as pandas, provides features for applying these wrangling methods to data sets to achieve the analytical goal.

Goals of Data Wrangling:

• Gather data from numerous sources to reveal deeper intelligence within it
• Put accurate, actionable data in the hands of business and data analysts in a timely manner
• Reduce the time spent collecting, organizing, and cleaning unruly data before it can be used
• Enable data analysts and scientists to focus on analysing data rather than wrangling it
• Help senior leaders in an organization make better decisions

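The three operations named above (concatenating, merging, grouping) can be sketched with pandas. This is a minimal illustration only: the store and revenue tables are invented sample data, and the column names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
sales_q1 = pd.DataFrame({"store_id": [1, 2, 3], "revenue": [100, 150, 90]})
sales_q2 = pd.DataFrame({"store_id": [1, 2, 3], "revenue": [120, 130, 110]})
stores = pd.DataFrame({"store_id": [1, 2, 3],
                       "region": ["North", "South", "North"]})

# Concatenating: stack the two quarters into one long table.
sales = pd.concat([sales_q1, sales_q2], ignore_index=True)

# Merging: attach each store's region by joining on the shared key.
merged = sales.merge(stores, on="store_id", how="left")

# Grouping: total revenue per region.
revenue_by_region = merged.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

A left merge is used here so that every sales row is kept even if a store were missing from the lookup table; other join types may suit other data sets better.
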
Steps to Data Wrangling

• Data Acquisition
  - Every dataset is different and has been created differently
  - Recognize the authenticity of the data obtained
  - Identify the source of the data
• Joining Data
• Data Cleansing

D3.js

• D3 stands for Data-Driven Documents. D3.js is a JavaScript library for manipulating documents based on data.
• D3.js is a framework for dynamic, interactive, online data visualizations and is used on a large number of websites.
• D3.js was written by Mike Bostock as a successor to an earlier visualization toolkit called Protovis.

D3.js - Need

• D3.js is a JavaScript library used to create interactive visualizations in the browser.
• The D3.js library allows us to manipulate elements of a webpage in the context of a data set.
• These elements can be HTML, SVG, or Canvas elements, and they can be introduced, removed, or edited according to the contents of the data set.
• It is a library for manipulating DOM objects.
• D3.js can be a valuable aid in data exploration: it gives control over data representation and adds interactivity.

D3.js - Need

• Extremely flexible.
• Easy to use and fast.
• Supports large datasets.
• Declarative programming.
• Code reusability.
• Offers a wide variety of curve-generating functions.
• Associates data with an element or group of elements in the HTML page.


D3.js - Benefits

D3.js is an open source project and works without any plugin.
It requires very little code and offers the following benefits:

• Great data visualization.
• It is modular: you can download only the piece of D3.js that you want to use, with no need to load the whole library every time.
• Easy to build a charting component.
• DOM manipulation.


D3.js - Development Environment Components

A D3.js development environment needs the following components:

• D3.js library
• Editor
• Web browser
• Web server


D3.js Library

• D3.js is an open-source library, and its source code is freely available on the web at https://d3js.org/.
• Visit the D3.js website and download the latest version of D3.js (d3.zip). The version cited in this material is 4.6.0; newer releases have since been published.
• After the download is complete, unzip the file and look for d3.min.js, the minified version of the D3.js source code.
• Copy the d3.min.js file into your project's root folder, or any other folder where you keep library files, and include it in your HTML page.


Example

<!DOCTYPE html>
<html lang="en">
   <head>
      <script src="/path/to/d3.min.js"></script>
   </head>
   <body>
      <script>
         // write your d3 code here
      </script>
   </body>
</html>

D3.js is an open source JavaScript library

• D3.js is an open source JavaScript library for:
  - Data-driven manipulation of the Document Object Model (DOM).
  - Working with data and shapes.
  - Laying out visual elements for linear, hierarchical, network and geographic data.
  - Enabling smooth transitions between user interface (UI) states.
  - Enabling effective user interaction.
• Web standards are heavily used in D3.js:
  - HyperText Markup Language (HTML)
  - Document Object Model (DOM)
  - Cascading Style Sheets (CSS)
  - Scalable Vector Graphics (SVG)
  - JavaScript

HyperText Markup Language (HTML)

HTML is used to structure the content of a webpage. It is stored in a text file with the extension ".html".
Example: a typical bare-bones HTML document looks like this:

<!DOCTYPE html>
<html lang="en">
   <head>
      <meta charset="UTF-8">
      <title></title>
   </head>
   <body>
   </body>
</html>

Document Object Model (DOM)

When an HTML page is loaded by a browser, it is converted into a hierarchical structure. Every tag in the HTML becomes an element (object) in the DOM, arranged in a parent-child hierarchy. This gives the page a logical structure, and once the DOM is formed it is easier to manipulate (add, modify, or remove) the elements on the page.
Let us understand the DOM using the following HTML document:

<!DOCTYPE html>
<html lang="en">
   <head>
      <title>My Document</title>
   </head>
   <body>
      <div>
         <h1>Greeting</h1>
         <p>Hello World!</p>
      </div>
   </body>
</html>

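The parent-child hierarchy of this document can also be inspected outside the browser. The sketch below uses Python's standard html.parser module to print one indented line per element; it is only an illustration of the nesting, not how a browser builds the DOM, and the TreePrinter class is a hypothetical helper written for this example.

```python
from html.parser import HTMLParser

HTML_DOC = """<!DOCTYPE html>
<html lang="en">
   <head>
      <title>My Document</title>
   </head>
   <body>
      <div>
         <h1>Greeting</h1>
         <p>Hello World!</p>
      </div>
   </body>
</html>"""


class TreePrinter(HTMLParser):
    """Collects one indented line per element to show the nesting."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # A child is indented one level deeper than its parent.
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append("  " * self.depth + repr(text))


printer = TreePrinter()
printer.feed(HTML_DOC)
tree_lines = printer.lines
print("\n".join(tree_lines))
```

The output shows html at the top, head and body as its children, and h1 and p nested inside the div, which mirrors the tree a browser would build from the same markup.
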
Cascading Style Sheets (CSS)

• HTML gives structure to the webpage; CSS styles make the webpage more pleasant to look at.
• CSS is a style sheet language used to describe the presentation of a document written in HTML or XML (including XML dialects such as SVG or XHTML).
• CSS describes how elements should be rendered on a webpage.

Scalable Vector Graphics (SVG)

• SVG is a way to render images on a webpage.
• SVG is not a direct image; it is a way to create images using text.
• As its name suggests, it is a scalable vector format. It scales with the size of the browser window, so resizing the browser does not distort the image.
• All browsers support SVG except IE 8 and below.
• Data visualizations are visual representations, and SVG is a convenient way to render them with D3.js.

JavaScript

• JavaScript is a loosely typed, client-side scripting language that executes in the user's browser.
• JavaScript interacts with HTML elements (DOM elements) to make the web user interface interactive.
• JavaScript implements the ECMAScript standard, which includes core features based on the ECMA-262 specification, as well as other features that are not part of the ECMAScript standard.
• JavaScript knowledge is a prerequisite for D3.js.


Tableau

• Tableau is a Business Intelligence tool for visually analyzing data.
• Users can create and distribute interactive, shareable dashboards that depict the trends, variations, and density of the data in the form of graphs and charts.
• Tableau can connect to files, relational sources, and Big Data sources to acquire and process data. The software supports data blending and real-time collaboration, which sets it apart.
• It is used by businesses, academic researchers, and many government organizations for visual data analysis.
• It has also been positioned as a leader in the Business Intelligence and Analytics Platforms category of the Gartner Magic Quadrant.

Tableau Features

• Speed of Analysis: Because it does not require a high level of programming expertise, any user with access to data can start using it to derive value from the data.
• Self-Reliant: Tableau does not need a complex software setup. The desktop version, which is used by most users, is easily installed and contains all the features needed to start and complete data analysis.
• Visual Discovery: The user explores and analyzes the data by using visual tools like colors, trend lines, charts, and graphs. There is very little script to be written, as nearly everything is done by drag and drop.
• Blend Diverse Data Sets: Tableau allows you to blend different relational, semistructured and raw data sources in real time, without expensive up-front integration costs. The users don't need to know the details of how data is stored.

Tableau Features

• Architecture Agnostic: Tableau works on all kinds of devices where data flows. Hence, the user need not worry about specific hardware or software requirements to use Tableau.
• Real-Time Collaboration: Tableau can filter, sort, and discuss data on the fly and embed a live dashboard in portals such as a SharePoint site or Salesforce.
• Centralized Data: Tableau Server provides a centralized location to manage all of the organization's published data sources.

Data Wrangling

Wrangling is the process of transforming "raw" data to make it more suitable for analysis; it also improves the quality of your data.

• Data Exploration: Checking feature data types and unique values, and describing the data.
• Null Values: Counting null values and deciding what to do with them.
• Reshaping and Feature Engineering: This step transforms raw data into a more useful format. Examples of feature engineering include one-hot encoding, aggregation, joins, and grouping.
• Text Processing: BeautifulSoup and regular expressions (among other tools) are often used to clean and extract web-scraped text from HTML and XML documents.

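The four steps above can be sketched on a small table. Everything in this example is an assumption made for illustration: the records are invented, the column names are hypothetical, and filling missing prices with the mean is just one possible policy. A plain regular expression stands in for BeautifulSoup to keep the sketch dependency-light.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical raw records, invented for illustration only.
raw = pd.DataFrame({
    "city": ["Chennai", "Delhi", None, "Chennai"],
    "price": [250.0, np.nan, 310.0, 275.0],
    "note": ["<p>good</p>", "<b>late</b> delivery", "ok", "<i>fragile</i>"],
})

# Data exploration: data types, unique values, summary statistics.
print(raw.dtypes)
print(raw["city"].nunique())
print(raw.describe())

# Null values: count them, then apply a policy (drop or fill).
null_counts = raw.isnull().sum()
clean = raw.dropna(subset=["city"]).copy()
clean["price"] = clean["price"].fillna(clean["price"].mean())

# Feature engineering: one-hot encode the categorical column.
encoded = pd.get_dummies(clean, columns=["city"])

# Text processing: strip HTML tags with a simple regular expression.
encoded["note"] = encoded["note"].apply(lambda s: re.sub(r"<[^>]+>", "", s))
print(encoded)
```

For real scraped pages a proper HTML parser is usually the safer choice, since a regular expression like the one above only handles simple, well-formed tags.
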
Importing Libraries

• Pandas: Used to navigate a data frame and check each column's data type, null values, and unique values.
• NumPy: This package is essential for any data science project. It provides many mathematical functions that operate on multi-dimensional arrays and data frames.
• Matplotlib & Seaborn: Plotting and graphing libraries used to visualize data in an intuitive way.

Python - Chart Properties

Creating a Chart

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10)
y = x ** 2  # square each value (x ^ 2 would be a bitwise XOR)

# Simple plot
plt.plot(x, y)
plt.show()

Labeling the Axes

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10)
y = x ** 2  # square each value (x ^ 2 would be a bitwise XOR)

# Labeling the axes and title
plt.title("Graph Drawing")
plt.xlabel("Time")
plt.ylabel("Distance")

# Simple plot
plt.plot(x, y)
plt.show()

Formatting Line type and Colour

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10)
y = x ** 2  # square each value (x ^ 2 would be a bitwise XOR)

# Labeling the axes and title
plt.title("Graph Drawing")
plt.xlabel("Time")
plt.ylabel("Distance")

# Formatting the line colour: 'r' draws a red line
plt.plot(x, y, 'r')

# Formatting the marker: '>' draws right-pointing triangles with no connecting line
plt.plot(x, y, '>')
plt.show()

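Matplotlib format strings can combine a colour, a marker, and a line style in one short argument. The sketch below is a small illustration under stated assumptions: it uses the non-interactive Agg backend so it runs without a display, and the labels and the offset of 10 on the second series are arbitrary choices for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10)
y = x ** 2

fig, ax = plt.subplots()

# 'r--' means red colour with a dashed line.
(dashed,) = ax.plot(x, y, "r--", label="red dashed line")

# 'g^' means green upward triangles and no connecting line.
(markers,) = ax.plot(x, y + 10, "g^", label="green triangle markers")

ax.set_title("Graph Drawing")
ax.set_xlabel("Time")
ax.set_ylabel("Distance")
ax.legend()
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() to display the figure.
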
Heatmaps

A heatmap represents each value to be plotted as a shade of the same colour.
Usually the darker shades of the chart represent higher values than the lighter shades.
A completely different colour can also be used for a value that differs greatly from the rest.

from pandas import DataFrame
import matplotlib.pyplot as plt

# Each inner list is one row; lists (unlike sets) keep order and duplicates
data = [[2, 3, 4, 1], [6, 3, 5, 2], [6, 3, 5, 4], [3, 7, 5, 4], [2, 8, 1, 5]]
index = ['I1', 'I2', 'I3', 'I4', 'I5']
cols = ['C1', 'C2', 'C3', 'C4']
df = DataFrame(data, index=index, columns=cols)

plt.pcolor(df)
plt.show()

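Whether darker shades mean higher values depends on the colormap. With Matplotlib's default colormap the highest values are drawn in the lightest colour, so a sequential map such as "Blues" is one way to get the darker-is-higher reading described above. The sketch below assumes that choice and adds a colour bar and cell labels; the Agg backend is used only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt
from pandas import DataFrame

data = [[2, 3, 4, 1], [6, 3, 5, 2], [6, 3, 5, 4], [3, 7, 5, 4], [2, 8, 1, 5]]
df = DataFrame(data,
               index=["I1", "I2", "I3", "I4", "I5"],
               columns=["C1", "C2", "C3", "C4"])

fig, ax = plt.subplots()

# Sequential colormap: low values are pale, high values are dark blue.
mesh = ax.pcolor(df.values, cmap="Blues")
fig.colorbar(mesh, ax=ax, label="value")

# Place tick labels at the centre of each cell.
ax.set_xticks([i + 0.5 for i in range(df.shape[1])])
ax.set_xticklabels(df.columns)
ax.set_yticks([i + 0.5 for i in range(df.shape[0])])
ax.set_yticklabels(df.index)
```

The colour bar makes the mapping from shade to value explicit, which matters most when the colormap is not the one the reader expects.
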
Scatterplots

Scatterplots show many points plotted in the Cartesian plane. Each point represents the values of two variables. One variable is placed on the horizontal axis and the other on the vertical axis.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# 50 rows of random values in four columns
df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])

# Plot column 'a' against column 'b'
df.plot.scatter(x='a', y='b')
plt.show()

Bubble Chart

Bubble charts display data as a cluster of circles. Creating a bubble chart requires the x and y coordinates, the size of each bubble, and the colour of each bubble. The colours can be supplied by the library itself.

import matplotlib.pyplot as plt
import numpy as np

# Create data
x = np.random.rand(40)
y = np.random.rand(40)
z = np.random.rand(40)        # bubble sizes
colors = np.random.rand(40)   # values mapped to colours by the colormap

# Use the scatter function; s sets the marker area
plt.scatter(x, y, s=z * 1000, c=colors)
plt.show()

3D Charts

A 3D plot is drawn with mpl_toolkits.mplot3d by adding a subplot with a 3D projection to an existing figure.

from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt

chart = plt.figure()
chart3d = chart.add_subplot(111, projection='3d')

# Create some test data
X, Y, Z = axes3d.get_test_data(0.08)

# Plot a wireframe
chart3d.plot_wireframe(X, Y, Z, color='r', rstride=15, cstride=10)
plt.show()

Time Series

A time series is a series of data points in which each data point is associated with a timestamp. A simple example is the price of a stock in the stock market at different points of time on a given day. Another example is the amount of rainfall in a region in different months of the year.

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('path_to_file/stock.csv')
df = pd.DataFrame(data, columns=['ValueDate', 'Price'])

# Set the date as the index
df['ValueDate'] = pd.to_datetime(df['ValueDate'])
df.index = df['ValueDate']
del df['ValueDate']

df.plot(figsize=(15, 6))
plt.show()

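The stock.csv file is not part of this material, so the sketch below builds a stand-in series to show what a date index makes possible. The prices are a seeded random walk invented for illustration, and the start date, the 90-day length, and the 7-day window are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: a random walk standing in for the CSV file.
rng = np.random.default_rng(seed=0)
dates = pd.date_range("2024-01-01", periods=90, freq="D")
prices = 100 + rng.normal(0, 1, size=len(dates)).cumsum()

df = pd.DataFrame({"ValueDate": dates, "Price": prices}).set_index("ValueDate")

# A DatetimeIndex enables time-aware operations such as resampling.
weekly_mean = df["Price"].resample("W").mean()

# A rolling mean smooths day-to-day noise; the first 6 values are NaN.
rolling_7d = df["Price"].rolling(window=7).mean()

print(weekly_mean.head())
```

Either derived series can be passed to the same plot call used above, for example to overlay a smoothed line on the daily prices.
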
Geographical Data

Many open source Python libraries have been created to draw geographical maps. They are highly customizable and offer a variety of maps depicting areas in different shapes and colours. One such package is Cartopy.

import matplotlib.pyplot as plt
import cartopy.crs as ccrs

fig = plt.figure(figsize=(15, 10))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())

# Restrict the map to a longitude/latitude window,
# given as (west, east, south, north) in degrees
ax.set_extent((60, 150, -25, 55))

ax.stock_img()
ax.coastlines()
ax.tissot(facecolor='purple', alpha=0.8)
plt.show()



R & Shiny

• Shiny is an open source package from RStudio that provides a web application framework for creating interactive web applications (visualizations) called "Shiny apps".
• These web applications seamlessly display R objects (such as plots and tables) and can be published live so that anyone can access them.
• Shiny provides automatic reactive binding between inputs and outputs.
• It also provides extensive pre-built widgets that make it possible to build elegant and powerful applications with minimal effort.

R & Shiny

Any Shiny app is built from two components:

1. ui.R: This file creates the user interface of a Shiny application. It provides interactivity by taking input from the user and dynamically displaying the generated output on the screen.
2. server.R: This file contains the series of steps that convert the input given by the user into the desired output to be displayed.

Visualization using R & Shiny

Creating interactive visualization for data sets

The basic layout for writing ui.R is:

library(shiny)
shinyUI(fluidPage(
  titlePanel("#Title"),
  sidebarLayout(
    sidebarPanel(),
    mainPanel(
      # write output
    )
  )
))

Similarly, the basic layout for writing server.R is:

library(shiny)
shinyServer(function(input, output) {
  # write server function
})

Visualization using R & Shiny

• Drawing histograms for the iris dataset in R using Shiny
• Drawing scatterplots for the iris dataset in R using Shiny
• Loan Prediction practice problem
• Explanatory analysis of multiple variables of the Loan Prediction practice problem

Visualization using R & Shiny

Advantages:

• Efficient response time: The response time of a Shiny app is very small, making it possible to deliver real-time output for a given input.
• Complete automation of the app: A Shiny app can be automated to perform a set of operations that produce the desired output from the input.
• Knowledge of HTML, CSS or JavaScript not required: A fully functional Shiny app can be created without any prior knowledge of HTML, CSS or JavaScript.

Visualization using R & Shiny

Advantages:

• Advanced analytics: Shiny apps are powerful and can be used to visualize complex data such as 3D plots and maps.
• Cost effective: The paid versions of shinyapps.io and Shiny Server provide a cost-effective, scalable solution for deploying Shiny apps online.
• Open source: Building a Shiny app and putting it online is free of cost if you deploy it on the free version of shinyapps.io.

Visualization using R & Shiny

Disadvantages:

• Requires timely updates: Functions used in the app can become outdated as packages release newer versions, so the app needs to be updated from time to time.
• No selective access and permissions: There is no provision for blocking someone's access to your app or providing selective access. Once the app is live on the web, it is freely accessible to everyone.
• Restriction on traffic in the free version: The free version of shinyapps.io allows only 25 active hours for your app per month per account.

Visualization for Genetic Network Reconstruction

Data Preprocessing

Data Augmentation

Clustering and Graphical Models

Dr.M.Kavitha Department of Computer Science & Engineering


12/10/2024 Data Visualization
Reconstruction, Visualization and Analysis of Medical Images

• Advances in medical imaging systems have made significant contributions to medical diagnosis and treatment by providing anatomic and functional information about the human body that is difficult to obtain without these techniques.
• These modalities also generate large quantities of noisy data that require modern techniques of computational statistics for image reconstruction, visualization and analysis.
• Computerized tomography (CT) is an important technique for obtaining accurate information about the interior of a human body based on observations detected outside the body.
• The precise reconstruction of images of the interior of the human body from this data is a challenge suited to computational statistics.

Reconstruction, Visualization and Analysis of Medical Images

• As the detected observations are only indirectly related to the target image, tomographic reconstruction is an inverse problem, which is often ill-posed or ill-conditioned.
• Because of this ill-posedness, reconstructing, for example, Positron Emission Tomography (PET) images by maximum likelihood estimation with the EM algorithm (MLE-EM), weighted least squares estimation (WLSE), or other methods without regularization produces images with edge and noise artifacts.
• Thus, computational statistical techniques must be used to integrate and fuse the correlated but incomplete structure information with other medical modalities, like X-ray CT, magnetic
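The unregularized MLE-EM update mentioned above can be sketched on a toy problem. This is an assumption-heavy illustration, not a PET reconstruction: the system matrix A is a small random dense matrix rather than a scanner geometry, the image has only 16 pixels, and the function names mlem and poisson_loglik are hypothetical helpers. The update shown is the standard multiplicative EM step for counts modelled as y ~ Poisson(A x).

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy problem (all values hypothetical): 16 unknown pixel intensities
# observed through 40 detector bins.
n_pixels, n_bins = 16, 40
A = rng.uniform(0.0, 1.0, size=(n_bins, n_pixels))  # system matrix
x_true = rng.uniform(1.0, 10.0, size=n_pixels)       # unknown image
y = rng.poisson(A @ x_true).astype(float)            # noisy counts


def mlem(A, y, n_iter=200, eps=1e-12):
    """Unregularized MLE-EM iterations for y ~ Poisson(A x)."""
    sensitivity = A.sum(axis=0)              # column sums, A^T 1
    x = np.ones(A.shape[1])                  # strictly positive start
    for _ in range(n_iter):
        ratio = y / np.maximum(A @ x, eps)   # measured / predicted counts
        x = x / sensitivity * (A.T @ ratio)  # multiplicative update
    return x


def poisson_loglik(A, x, y, eps=1e-12):
    """Poisson log-likelihood, up to a constant independent of x."""
    mean = np.maximum(A @ x, eps)
    return float(np.sum(y * np.log(mean) - mean))


x_hat = mlem(A, y)
print(poisson_loglik(A, np.ones(n_pixels), y), poisson_loglik(A, x_hat, y))
```

Each iteration keeps the estimate non-negative and does not decrease the likelihood. Nothing in the update penalizes rough images, which is why noise in the counts can be fitted as if it were signal when no regularization is added.
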
PET Images

The scanning, acquisition and reconstruction process of a PET system

PET Images

Coronal and sagittal images of a mouse reconstructed by PDEM (right) and FBP (left). The images reconstructed by PDEM have less noise than those reconstructed by FBP. The enlarged brain image reconstructed by PDEM has clearer boundaries than that reconstructed by FBP.

Ultrasound Images

A rosette map consisting of 42 Gabor filters. The half-peak magnitude frequency bandwidth is set to one octave for each Gabor filter. Only the half-peak supports of the filters are shown in the map.

Magnetic Resonance Images

Derivation of the distance map. The underlying liver ultrasound image is
decomposed into overlapping blocks of subimages. Each block is filtered
with a set of Gabor functions to derive its G-vector. The distance map is
formed from the G-vector lengths of all blocks.
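
The block-wise procedure in this caption can be sketched as follows. Several details are assumptions, because the slides do not give them: the 42 filters are built as 7 frequencies × 6 orientations, each G-vector entry is the magnitude of one filter's inner product with the mean-removed block, and the names `gabor_kernel`, `gabor_bank`, and `distance_map` are made up for this example.

```python
import numpy as np

def gabor_kernel(size, frequency, theta, bandwidth=1.0):
    """Complex Gabor kernel; sigma follows from the half-peak bandwidth (octaves)."""
    b = 2.0 ** bandwidth
    sigma = np.sqrt(np.log(2.0) / 2.0) / (np.pi * frequency) * (b + 1.0) / (b - 1.0)
    half = size // 2
    y, x = np.mgrid[-half:size - half, -half:size - half]
    along = x * np.cos(theta) + y * np.sin(theta)      # coordinate along the wave
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * frequency * along)

def gabor_bank(size, frequencies, n_orientations):
    """One kernel per (frequency, orientation) pair."""
    return [gabor_kernel(size, f, k * np.pi / n_orientations)
            for f in frequencies for k in range(n_orientations)]

def distance_map(image, bank, block=16, step=8):
    """Length of the G-vector for each overlapping block of the image."""
    rows = range(0, image.shape[0] - block + 1, step)
    cols = range(0, image.shape[1] - block + 1, step)
    out = np.zeros((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            patch = image[r:r + block, c:c + block]
            patch = patch - patch.mean()               # drop the DC component
            g_vector = np.array([abs(np.sum(patch * np.conj(k))) for k in bank])
            out[i, j] = np.linalg.norm(g_vector)
    return out

# Synthetic image standing in for an ultrasound scan.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
bank = gabor_bank(16, np.geomspace(0.05, 0.4, 7), 6)   # 7 x 6 = 42 filters (assumed split)
dmap = distance_map(image, bank)
```

A 16-pixel kernel truncates the lowest-frequency envelopes, so a real application would pick the block size to match the filter supports.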
Magnetic Resonance Images

Segmentation results are displayed for a sequence of MR images of
myocardium in the space domain. The top three images display
classification results and the bottom three images show prediction results.

Magnetic Resonance Images

Segmentation results are displayed for a sequence of MR images of
myocardium in the frequency domain. The top three images display
classification results and the bottom three images show prediction results.

Magnetic Resonance Images

Segmentation results are displayed for a sequence of MR images of
myocardium in the space–frequency domain. The top three images display
classification results and the bottom three images show prediction results.

Magnetic Resonance Images

Classification error rates for a sequence of MR images, where Obj represents
the object, Bg denotes the background, and Total refers to the average error
of the whole image.
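
The three error rates in this caption can be computed from a ground-truth mask and a predicted mask. The slides do not define them exactly, so this sketch assumes each rate is the fraction of misclassified pixels within the object, within the background, and over the whole image; `error_rates` and the tiny masks are made up for illustration.

```python
import numpy as np

def error_rates(truth, predicted):
    """Per-class and overall misclassification rates for binary masks."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    wrong = truth != predicted
    return {
        "Obj": wrong[truth].mean(),      # errors among true object pixels
        "Bg": wrong[~truth].mean(),      # errors among true background pixels
        "Total": wrong.mean(),           # errors over the whole image
    }

truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 0, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 1, 0],
                 [0, 0, 0, 0]])
rates = error_rates(truth, pred)
```

Reporting Obj and Bg separately matters when the object is small: one wrong pixel in each class gives very different rates here because the background has three times as many pixels.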

Magnetic Resonance Images

Prediction error rates for a sequence of MR images are reported, where Obj
represents the object, Bg denotes the background, and Total refers to the
average error of the whole image

Exploratory Graphics of a
Financial Dataset

Data

Variables in the Bankruptcy dataset

First Graphics

A bankruptcy bar chart. 506 of the 82626 records refer to bankruptcy

Bar Chart

A bar chart of the number of records by US region


Histogram

A histogram of the number of records per year

Spinogram

A spinogram of the number of records per year, with foreign registered
companies selected. The width of a bar in the spinogram is proportional to the
height of the corresponding bar in the original histogram.
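
The spinogram geometry described in this caption can be sketched numerically: bar widths come from the histogram counts, and the highlighted height in each bar is the selected share of that bin. The function name `spinogram_bars` and the ten made-up records are assumptions for illustration, not the Bankruptcy dataset.

```python
import numpy as np

def spinogram_bars(values, selected, bin_edges):
    """Return left edges, widths, and highlighted heights of spinogram bars."""
    values = np.asarray(values, dtype=float)
    selected = np.asarray(selected, dtype=bool)
    counts, _ = np.histogram(values, bins=bin_edges)
    sel_counts, _ = np.histogram(values[selected], bins=bin_edges)
    widths = counts / counts.sum()                     # width proportional to bin count
    lefts = np.concatenate(([0.0], np.cumsum(widths)[:-1]))
    with np.errstate(divide="ignore", invalid="ignore"):
        heights = np.where(counts > 0, sel_counts / counts, 0.0)
    return lefts, widths, heights

# Made-up records: year of each record and whether the company is foreign registered.
years = np.array([2000, 2000, 2000, 2000, 2001, 2001, 2002, 2002, 2002, 2002])
foreign = np.array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
lefts, widths, heights = spinogram_bars(years, foreign, bin_edges=[2000, 2001, 2002, 2003])
```

Because every bar has the same total height, the highlighted heights can be compared directly as proportions across years.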
Bar Chart

A bar chart of the number of records categorized by NAICS group

Histogram

A histogram of log(Total Assets), logTA, for the companies. The marks below
the axis to the left are interactive controls for the anchor point and bin width.
The horizontal (red) marks record bins where the count is too small to be
drawn.
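
The effect of the two controls mentioned in this caption can be sketched with NumPy: the anchor point fixes where bin edges fall and the bin width fixes their spacing. The helper `histogram_counts`, the base-10 logarithm, and the six asset values are assumptions for illustration.

```python
import numpy as np

def histogram_counts(values, anchor, binwidth):
    """Histogram whose bin edges lie at anchor + k * binwidth."""
    values = np.asarray(values, dtype=float)
    # First edge at or below the minimum that is aligned with the anchor.
    first = anchor + np.floor((values.min() - anchor) / binwidth) * binwidth
    n_bins = int(np.floor((values.max() - first) / binwidth)) + 1
    edges = first + binwidth * np.arange(n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    return edges, counts

# Made-up Total Assets values; the log base is an assumption.
log_ta = np.log10([120.0, 950.0, 1_500.0, 4_800.0, 52_000.0, 61_000.0])
edges_a, counts_a = histogram_counts(log_ta, anchor=0.0, binwidth=1.0)
edges_b, counts_b = histogram_counts(log_ta, anchor=0.5, binwidth=1.0)
```

The same six values look flat with one anchor and bimodal with the other, which is why interactive control over both settings is useful when exploring a distribution.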
Parallel Box Plot

Parallel boxplots of financial ratios. Each boxplot is individually scaled


Outliers

Parallel coordinate plot of financial ratios with skewed distributions. Seven
outliers have been selected.

Outliers

Scatterplots of TL.TA, the ratio of total liabilities to total assets, plotted against
Total Assets. The left-hand plot presents all of the data, and it shows that all high
values of liabilities are associated with low asset values. The right-hand plot
presents a zoom of about 10^-2 on the x-axis by 10^-3 on the y-axis, along with
some α-blending.

Outliers

Parallel coordinate plot of the financial ratios with skewed distributions. The
seven outliers selected have been removed. The plot’s (red) border is a sign that
not all data are displayed.

Scatterplots

Scatterplot of Sales vs. Total Assets with the seven outlying companies
highlighted (the lighter blob in the lower left corner)

Outliers

A histogram of the current assets ratio on the left and a weighted histogram of the
same variable, weighted by Total Assets, on the right

Scatterplots

Scatterplots of Cash.TA and Inv.TA, the ratios of cash and inventories to total
assets (left), and of CA.TA and Kap.TA, current assets and property to total assets
(right)

Scatterplots

The scatterplot with a smaller point size and α-blending to better display the
bivariate structures
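
Why α-blending reveals structure in dense regions can be seen from a small calculation. Assuming standard "over" compositing of identically coloured points (an assumption, since the slides do not name the blending rule), n overlapping points with opacity α give a combined opacity of 1 − (1 − α)^n. The function names are illustrative.

```python
import math

def stacked_opacity(alpha, n_points):
    """Combined opacity of n overlapping points, each drawn with opacity alpha."""
    return 1.0 - (1.0 - alpha) ** n_points

def points_to_reach(alpha, target):
    """Smallest number of overlapping points whose combined opacity reaches target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - alpha))

single = stacked_opacity(0.1, 1)       # one isolated point stays faint
dense = stacked_opacity(0.1, 10)       # ten overlapping points are much darker
needed = points_to_reach(0.1, 0.95)    # overlap needed to look nearly solid
```

With α = 0.1 an isolated point is barely visible, while it takes 29 overlapping points to reach 95% opacity, so darkness becomes a rough density scale.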

Scatterplots

Scatterplots of inventories against fixed assets (left) and of cash against current
assets (right)

Mosaic Plots

Fluctuation diagram of industry sectors by region



Mosaic Plots

Fluctuation diagram of sectors by regions weighted by Total Assets


Parallel Coordinate Plot

Parallel coordinate plot of financial ratios and logTA, excluding 55 outliers, with
bankrupt companies highlighted
Parallel Coordinate Plot

Parallel coordinate plot of financial ratios and logTA, excluding 55 outliers, with
bankrupt companies highlighted, α-blending = 0.1 only for unselected data

Parallel Coordinate Plot

Parallel coordinate plot of financial ratios and logTA, excluding 55 outliers, with
bankrupt companies highlighted, α-blending = 0.1 only for selected and
unselected data

Parallel Coordinate Plot

Scatterplots of the ratios of intangibles and cash to total assets, with companies
that went bankrupt selected. More α-blending has been used in the right-hand
plot
Parallel Coordinate Plot

Parallel boxplots of logTA by year from  to , all on the same
scale

Parallel Coordinate Plot

Parallel boxplots of financial ratios and logTA for all companies. The background boxplots are
for all of the data, and the superimposed standard boxplots are for the selected cases,
companies with Total Assets 1000

Parallel Coordinate Plot

Spinograms of the ratios Cash.TA, Inv.TA, Kap.TA, and Intg.TA. The companies with Total
Assets 1000 have been selected

Parallel Coordinate Plot

Parallel boxplots of financial ratios and logTA for the 18610 companies with Total
Assets 1000. Companies that went bankrupt have been selected
Graphical Data Representation
in Bankruptcy Analysis

Parallel Coordinate Plot

A classification example. The boundary between the classes of solvent (black triangles)
and insolvent (white squares) companies was estimated using DA, the logit regression
(two indistinguishable linear boundaries) and an SVM (a nonlinear boundary) for a
subsample of the Bundesbank data. The background corresponds to the PDs computed
with an SVM.

Parallel Coordinate Plot

One-year cumulative PDs evaluated for several financial ratios from the German
Bundesbank data. The ratios are net income change (K21), net interest ratio (K24),
interest coverage ratio (K29), and logarithm of total assets (K33). The k nearest
neighbors procedure was used with a window size of around 8% of all of the
observations. The total number of observations is 553500.
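
The k-nearest-neighbors estimate described in this caption can be sketched for a single ratio: at each grid point, the PD is the default share among the nearest 8% of observations. The function `knn_pd` and the synthetic data are assumptions for illustration; the Bundesbank data are not reproduced here.

```python
import numpy as np

def knn_pd(ratio, default, grid, window=0.08):
    """Estimate PD at each grid point from the nearest `window` share of observations."""
    ratio = np.asarray(ratio, dtype=float)
    default = np.asarray(default, dtype=float)
    k = max(1, int(round(window * ratio.size)))        # neighbours per estimate
    pd_hat = np.empty(len(grid))
    for i, x0 in enumerate(grid):
        # Indices of the k observations closest to x0 along this ratio.
        nearest = np.argpartition(np.abs(ratio - x0), k - 1)[:k]
        pd_hat[i] = default[nearest].mean()
    return pd_hat

# Synthetic example: companies with a low ratio value default, the others survive.
ratio = np.arange(100.0)
default = ratio < 50
pd_hat = knn_pd(ratio, default, grid=[5.0, 90.0], window=0.10)
```

A wider window gives smoother but more biased PD curves; the 8% in the caption is one such smoothing choice.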
SVM Approach

Variable Selection

Mapping from a two-dimensional data space to a three-dimensional space of
features
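
The slide does not state which mapping is drawn. A common textbook example, assumed here, is the quadratic map φ(x1, x2) = (x1², √2·x1·x2, x2²), whose inner product in the three-dimensional feature space equals the kernel (x·z)² evaluated in the original two-dimensional space. The names `feature_map` and `poly_kernel` are illustrative.

```python
import numpy as np

def feature_map(x):
    """Quadratic map from 2-D data space to 3-D feature space (assumed example)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, computed in data space."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
lhs = float(np.dot(feature_map(x), feature_map(z)))   # inner product in feature space
rhs = poly_kernel(x, z)                               # same value without the mapping
```

This identity is what lets an SVM fit a nonlinear boundary in the data space while only ever evaluating the kernel.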

AR Model

Accuracy ratios for univariate SVM models. Box plots are estimated based on
100 random subsamples. The AR for the model containing only the random
variable K10 is zero.
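
The slides do not define the accuracy ratio (AR). This sketch assumes the usual relation AR = 2·AUC − 1, with the AUC computed by comparing every defaulter's score with every non-defaulter's score, and repeats the calculation on random subsamples as the caption describes. Function names, the subsample fraction, and the data are illustrative.

```python
import numpy as np

def accuracy_ratio(score, default):
    """AR = 2 * AUC - 1, where a higher score is meant to signal higher risk."""
    score = np.asarray(score, dtype=float)
    default = np.asarray(default, dtype=bool)
    bad, good = score[default], score[~default]
    diff = bad[:, None] - good[None, :]                # all defaulter/survivor pairs
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size
    return 2.0 * auc - 1.0

def subsample_ars(score, default, n_sub=100, frac=0.5, seed=0):
    """ARs on random subsamples, i.e. the raw values behind one box plot."""
    rng = np.random.default_rng(seed)
    n = len(score)
    m = int(frac * n)
    ars = []
    for _ in range(n_sub):
        idx = rng.choice(n, size=m, replace=False)
        ars.append(accuracy_ratio(score[idx], default[idx]))
    return np.array(ars)

# Made-up scores: a perfectly ranking model and an uninformative constant score.
score = np.arange(200.0)
default = score >= 150
ar_perfect = accuracy_ratio(score, default)
ar_constant = accuracy_ratio(np.zeros(200), default)
ars = subsample_ars(score, default)
```

An uninformative score gives AR = 0, which matches the caption's remark about the purely random variable K10.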
AR Model

Accuracy ratios for SVM models with eight variables. Each model includes the
variables K5, K29, K7, K33, K18, K21, K24, and one of the remaining
variables. Box plots are estimated based on 100 random subsamples
AR Model



Probability of Default

Visualization Tools for Insurance
Risk Processes

Visualization Tools

• Ruin probability plot
• Density evolution plot
• Quantile lines
• Probability gates

All four of these techniques permit an immediate evaluation of
model adequacy and the risks faced by the company based on
visual inspection of the generated plots.



Property Claim Services

Graph of the Property Claim Services (PCS) catastrophe loss data

Log-normal, Pareto, and Burr distributions

Shapes of the mean excess function e(x) for the log-normal (dashed line),
gamma with α < 1 (dotted line), gamma with α > 1 (solid line) and a mixture of
two exponential distributions (long-dashed line).

Log-normal, Pareto, and Burr distributions

Shapes of the mean excess function e(x) for the Pareto (dashed line), Burr
(long-dashed line), Weibull with τ < 1 (solid line) and Weibull with τ > 1 (dotted
line) distributions. From XploRe

Log-normal, Pareto, and Burr distributions

The empirical mean excess function ê_n(x) for the PCS catastrophe loss
amounts in billions of USD (left panel) and waiting times in years (right
panel).
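
The empirical mean excess function plotted here can be sketched as the average amount by which observations above a threshold x exceed it. The strict inequality and the name `mean_excess` are assumptions for this example, and the five loss values are made up rather than taken from the PCS data.

```python
import numpy as np

def mean_excess(sample, thresholds):
    """Empirical mean excess: mean of (X - x) over observations with X > x."""
    sample = np.asarray(sample, dtype=float)
    out = np.full(len(thresholds), np.nan)             # NaN where nothing exceeds x
    for i, x in enumerate(thresholds):
        exceed = sample[sample > x]
        if exceed.size:
            out[i] = exceed.mean() - x
    return out

losses = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
e_hat = mean_excess(losses, [0.0, 2.0, 8.0, 16.0])
```

An increasing ê_n(x), as in this toy sample, points to a heavy-tailed law such as Pareto, while a roughly constant one suggests an exponential tail.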
Pareto Probability Plot

Pareto probability plot of the PCS loss data. Apart from the two very extreme
observations (Hurricane Andrew and Northridge Earthquake), the points (crosses)
more or less constitute a straight line, validating the choice of the Pareto
distribution. The inset is a magnification of the bottom left part of the original plot.
From the Ruin Probabilities Toolbox

Log-normal Probability Plot

Log-normal probability plot of the PCS loss data. The x-axis corresponds to
logarithms of the losses. The deviations from the straight line at both ends
question the adequacy of the log-normal law. From the Ruin Probabilities
Toolbox



Log-normal Probability Plot

Log-normal probability plot of the PCS waiting time data. The x-axis
corresponds to logarithms of the losses. From the Ruin Probabilities Toolbox

Exponential Probability Plot

Exponential probability plot of the PCS waiting time data. The plot deviates
from a straight line at the far end. From the Ruin Probabilities Toolbox
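
The coordinates of an exponential probability plot can be sketched as sorted observations against standard exponential quantiles. If the data are exponential, the points lie near a line through the origin whose slope estimates the mean. The plotting positions (i − 0.5)/n and the axis orientation are assumptions, not the conventions of the Ruin Probabilities Toolbox, and the waiting times are synthetic.

```python
import numpy as np

def exponential_probability_plot(sample):
    """Return theoretical quantiles, sorted data, and the fitted slope."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    p = (np.arange(1, n + 1) - 0.5) / n        # plotting positions (one common choice)
    q = -np.log1p(-p)                          # standard exponential quantiles
    slope = np.sum(x * q) / np.sum(q * q)      # least-squares line through the origin
    return q, x, slope

rng = np.random.default_rng(1)
waiting_times = rng.exponential(scale=0.5, size=200)   # synthetic, not the PCS data
q, x_sorted, mean_hat = exponential_probability_plot(waiting_times)
```

Systematic curvature at the far end, as the caption reports for the PCS waiting times, means the largest observations do not follow the fitted exponential law.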

Dr.M.Kavitha Department of Computer Science & Engineering


12/10/2024 Data Visualization
Exponential Probability Plot

Discontinuous visualization of the trajectories of a risk process.

Ruin Probability

Alternative (continuous) visualization of the trajectories of a risk process. The
bankruptcy time is denoted by a star.



Ruin Probability

Ruin probability plot with respect to the time horizon T (left axis, in months)
and the initial capital u (right axis, in million DKK).
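
A surface like the one in this plot can be approximated by Monte Carlo simulation. The sketch assumes the classical compound Poisson risk process R_t = u + c·t − (sum of claims up to t), checked for ruin at claim times only. The premium rate, claim intensity, exponential claim sizes, and grid values are illustrative assumptions, not the parameters behind the slide.

```python
import numpy as np

def ruin_probability(u, c, lam, claim_sampler, horizon, n_paths=2000, seed=0):
    """Share of simulated paths that fall below zero before `horizon`."""
    rng = np.random.default_rng(seed)
    ruined = 0
    for _ in range(n_paths):
        t, claims_total = 0.0, 0.0
        while True:
            t += rng.exponential(1.0 / lam)            # waiting time to the next claim
            if t > horizon:
                break                                  # survived the whole horizon
            claims_total += claim_sampler(rng)
            if u + c * t - claims_total < 0.0:         # capital negative: ruin
                ruined += 1
                break
    return ruined / n_paths

horizons = [1.0, 5.0, 10.0]      # time horizons T (arbitrary units)
capitals = [0.0, 2.0, 5.0]       # initial capital u (arbitrary units)
surface = np.array([[ruin_probability(u, 1.2, 1.0, lambda r: r.exponential(1.0),
                                      T, n_paths=500)
                     for u in capitals] for T in horizons])
```

Each entry of `surface` is one point of the ruin probability plot; a finer grid and more paths would be needed for a smooth figure.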

3D Visualization

Three-dimensional visualization of the density evolution of a risk process with
respect to the risk process value R_t (left axis) and time t (right axis).

2D Visualization

Two-dimensional projection of the density evolution. From the Ruin
Probabilities Toolbox

