Lecture9 InfoVis Intro


Big Data Visual Analytics (CS 661)

Instructor: Soumya Dutta


Department of Computer Science and Engineering
Indian Institute of Technology Kanpur (IITK)
email: [email protected]
Assignment 1 - Due: 18/02/23 11:59pm

• Part 1: Simplified isocontour algorithm for 2D data without handling
marching squares cases or ambiguity cases explicitly
• Traverse the cell vertices in counterclockwise order
• VTK’s contour filter is not allowed; write your own code following the
method we discussed in class
• You do not have to implement the entire marching squares algorithm
• You do not have to handle those cases separately
• This is a simplified version of the algorithm
• Part 2: VTK Volume Rendering, Transfer Function, and Shading
• Consult VTK’s manual and examples for help
• Read the instructions very carefully!!
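
A minimal sketch of the simplified approach described in Part 1, assuming the 2D scalar field is stored as a list of rows and that crossings are located by linear interpolation along cell edges. The function names (`interpolate`, `cell_segments`, `isocontour`) are illustrative and do not come from the assignment handout. Crossings are paired in the order they are found, so ambiguous cells get no special treatment.

```python
def interpolate(p0, p1, v0, v1, iso):
    """Linearly interpolate the point on edge p0-p1 where the field equals iso."""
    t = (iso - v0) / (v1 - v0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))


def cell_segments(corners, values, iso):
    """Return line segments for one cell.

    corners: four (x, y) points in counterclockwise order
    values:  scalar values at those corners, in the same order
    """
    crossings = []
    for i in range(4):
        j = (i + 1) % 4  # next vertex in counterclockwise order
        v0, v1 = values[i], values[j]
        # The isovalue crosses this edge if the endpoints lie on opposite sides.
        if (v0 < iso) != (v1 < iso):
            crossings.append(interpolate(corners[i], corners[j], v0, v1, iso))
    # Pair crossings in traversal order (assumption: no ambiguity handling).
    return [(crossings[k], crossings[k + 1]) for k in range(0, len(crossings) - 1, 2)]


def isocontour(grid, iso):
    """Collect contour segments over every cell of a 2D grid (list of rows)."""
    segments = []
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            # Counterclockwise when x is the column index and y the row index (y up).
            corners = [(c, r), (c + 1, r), (c + 1, r + 1), (c, r + 1)]
            values = [grid[r][c], grid[r][c + 1], grid[r + 1][c + 1], grid[r + 1][c]]
            segments.extend(cell_segments(corners, values, iso))
    return segments


segments = isocontour([[0, 0], [1, 1]], 0.5)
```

For the single-cell grid above, the isovalue 0.5 crosses the right and left edges at mid-height, so one horizontal segment is produced.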

IITK CS661: Big Data Visual Analytics: Soumya Dutta 2


Assignment 1 - Submission Process
• Submission through HelloIITK
• Only one group member needs to submit from each group
• Submit Python scripts in a single zipped file
• A README.txt file is mandatory, with detailed instructions on how to run
your code, how to pass parameters, and anything else the TA needs to
know to run your code
• Late submissions incur a 10% penalty per day after the deadline, up
to 20% (2 late days)
• After that, you get zero
• No deadline extension requests please!



Assignment Group Info
• If you are not listed in this Google spreadsheet, your lab will not be
graded!
• https://docs.google.com/spreadsheets/d/1sELK0bk5KNKjh9ZJzrzpSeQnEy6VY51I-vCZ3QJiMcs/edit?usp=sharing



Acknowledgements
• Some of the following slides are adapted from the excellent course
materials and tutorials made available by:
• Prof. Michelle Borkin (Northeastern University)



How to Say Nothing with Scientific Visualization
• Never include a color legend
• Avoid annotation
• Never mention error characteristics
• When in doubt, smooth
• Avoid providing performance data
• Never learn anything about the data or the discipline
• Never compare with others
• Never cite references of data
• Claim generalizability but show results on a single dataset
• Use view angle to hide shortcomings
• ‘This is easily extended to 3D’

Source: Fourteen Ways to Say Nothing with Scientific Visualization, Eric Raible, NASA, in IEEE Computer.

Information Visualization

Information Visualization (InfoVis)
[Figure: Table data]

• The use of computer-supported, interactive visual representations of
data to amplify cognition
• Data is not necessarily defined on a spatial domain
• Data is not always numerical
• Data is inherently discrete
• The study of transforming data, information, and knowledge into
interactive visual representations
[Figure: Graph data]



Information Visualization for Business Data
[Figure: visual summary of about 10,000 emails in 2008; axes: topic strength vs. time (year 2008)]
Source: Interactive, topic-based visual text summarization and analysis.

Information Visualization for Science Data

[Figure: exploratory data visualization tool for museum visitors]
Circle viewer with indicators of environmental variables at the selected
location. Silica: inorganic SiO2 concentration; Temp: temperature; Nutrient:
inorganic NO3 concentration; Light: photosynthetically available radiation.
Source: Living Liquid: Design and Evaluation of an Exploratory Visualization Tool for Museum Visitors

Information Visualization for Soccer Data

Source: SoccerStories: A Kick-off for Visual Soccer Analysis, https://www.youtube.com/watch?v=eFIorHSMiSQ

Information Visualization for ML Classifiers

• A detailed evaluation of classifiers for model selection and debugging
• An interactive, comparative, model-agnostic visualization system
Source: ConfusionFlow: A model-agnostic visualization for temporal analysis of classifier confusion

Information Visualization for ML Model Explainability

Source: https://gandissect.csail.mit.edu/


A Brief Taxonomy of InfoVis Techniques
• InfoVis Techniques
• Empirical Methods
• Interaction
• Frameworks
• Applications

Source: A survey on information visualization: recent advances and challenges, Liu et al.

Empirical Methods
• Empirical methods are categorized as Model and Evaluation
• Model
• Visual representation model
• Data-driven model
• Evaluation
• User studies are the most widely used evaluation method in InfoVis and offer a
scientifically sound way to measure visualization performance
• Statistical methods



Interaction
• Interaction is a fundamental aspect of InfoVis techniques
• Two interaction categories
• WIMP (windows, icons, menus, pointer)
• Post-WIMP
• Touch interfaces
• Another operation-based categorization of interactions
• select, explore, reconfigure, encode, abstract/elaborate, filter, and connect



Frameworks/Systems
• Researchers have proposed a variety of visualization systems, such as
Improvise, the InfoVis Toolkit, and Prefuse, to support the creation and
customization of visualization applications.
• More recently, a web-based library called Data-Driven Documents (D3)
has become a very popular toolkit for constructing interactive
visualizations on the web
• https://d3js.org/
• https://observablehq.com/@d3/gallery



Applications
• Four different types of data and applications
• Graph data visualization
• Text data visualization
• Map data visualization
• Multivariate data visualization



Exploratory Data Analysis

“The greatest value of a picture is when it forces us to notice what we
never expected to see.”
- John Tukey



InfoVis: Big Data Aspects
• Common objectives for big data visualization
• Decision initiation or modification
• Enhancing understanding
• Considerations for creating big data visualization systems
• Source data
• Information transfer to the audience
• Design choices and scalability
• Enhance visualizations with graphical overlays
• Highlights
• Encodings
• Summary statistics
• Annotations
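
The overlay types listed above can be sketched on a single chart. This is an illustrative example using Matplotlib, with made-up data and a hypothetical output file name (`overlays.png`); it is one possible way to combine a highlight, an encoding, a summary statistic, and an annotation, not a prescribed recipe.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

values = [3, 5, 4, 9, 6]  # made-up data for illustration
positions = list(range(len(values)))
mean_value = sum(values) / len(values)
peak_index = values.index(max(values))

fig, ax = plt.subplots()
# Encoding: bar height represents the value.
ax.bar(positions, values, color="lightgray")
# Highlight: redraw the largest bar in a contrasting color.
ax.bar(peak_index, values[peak_index], color="tomato")
# Summary statistic: dashed horizontal line at the mean.
ax.axhline(mean_value, linestyle="--", color="black", label=f"mean = {mean_value:.1f}")
# Annotation: text with an arrow pointing at the peak.
ax.annotate(
    "peak",
    xy=(peak_index, values[peak_index]),
    xytext=(peak_index - 1.5, values[peak_index] - 0.5),
    arrowprops=dict(arrowstyle="->"),
)
ax.legend()
fig.savefig("overlays.png")
```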
InfoVis: Issues and Risks
• Imprecision and Inaccuracy
• Visualizations display information at a lower level of precision and accuracy
than numerical or tabular formats
• Optical Significance
• A viewer may interpret a difference or pattern as meaningful based on
perception alone, sometimes without corresponding quantitative evidence to
support this interpretation
• Visualization Oversaturation
• A dramatic increase in deficient and flawed visualizations



Libraries for Data Analysis and Visualization



Libraries for Data Visualization: Matplotlib

• The most basic, de facto standard Python data visualization library
• A comprehensive library for creating static, animated, and interactive
visualizations in Python
• https://matplotlib.org/
• Examples: https://matplotlib.org/stable/gallery/index.html
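
As a rough illustration of the basic Matplotlib workflow, the sketch below draws a static line plot and saves it to disk. The data and the output file name (`line_plot.png`) are made up for the example.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Sample one period of a sine wave (illustrative data).
x = [i * 2 * math.pi / 50 for i in range(51)]
y = [math.sin(v) for v in x]

fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A static line plot")
ax.legend()
fig.savefig("line_plot.png")
```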



Libraries for Data Visualization: Seaborn

• Built on top of Matplotlib, with more polished default aesthetics
• Provides a high-level interface for drawing attractive and informative
statistical graphics
• https://seaborn.pydata.org/
• Examples: https://seaborn.pydata.org/examples
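
A short sketch of Seaborn's high-level interface, assuming pandas is available and using a small made-up table. A single call produces a statistical graphic (here a box plot per group) on a Matplotlib axes object.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import pandas as pd
import seaborn as sns

# Made-up scores for two groups.
df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "B", "B", "B"],
        "score": [1.0, 2.0, 3.0, 2.0, 4.0, 6.0],
    }
)

sns.set_theme(style="whitegrid")  # apply one of Seaborn's built-in themes
ax = sns.boxplot(data=df, x="group", y="score")  # returns a Matplotlib Axes
ax.set_title("Score distribution per group")
ax.figure.savefig("boxplot.png")
```

Because the return value is an ordinary Matplotlib axes, any Matplotlib call (titles, annotations, saving) can be applied afterwards.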



Libraries for Data Visualization: Bokeh
• Bokeh is a Python library for creating interactive
visualizations for modern web browsers.
• Build beautiful graphics, ranging from simple plots to
complex dashboards
• Create JavaScript-powered visualizations without
writing any JavaScript code
• https://docs.bokeh.org/en/latest/



Libraries for Data Visualization: Plotly Dash
• Dash is an open-source Python library for creating reactive, web-based
applications
• Built on top of Plotly.js and React.js
• User interface library for creating analytical web applications
• https://dash.plotly.com/
• https://dash.gallery/Portal/
• Dash is ‘React’ for Python
• React: A JavaScript library for building user interfaces



Libraries for Data Visualization: D3
• D3 - Data-Driven Documents
• D3.js is a JavaScript library for manipulating documents
based on data.
• D3 helps you bring data to life using HTML, SVG, and CSS.
• D3’s emphasis on web standards gives you the full
capabilities of modern browsers
• Combines powerful visualization components and a data-driven
approach to DOM manipulation
• https://d3js.org/



Plotly Python
https://plotly.com/python/



Plotly Python
• The Plotly Python library is an interactive, open-source plotting library that
supports over 40 unique visualization types
• Built on top of the Plotly.js library
• Plotly figures can be turned into web applications using the Dash library
• More details: https://plotly.com/python/getting-started/
• Tutorial: https://www.kaggle.com/code/kanncaa1/plotly-tutorial-for-beginners/notebook



Main Idea in Plotly
• Data, Layout, and Figure
• The Data object defines what we want to display in the chart (that is, the data)
• We define a collection of data and the specifications to display them as a trace.
• Think of a line chart with two lines representing two different categories: each line is a trace.
• The Layout object defines features that are not related to data (like title, axis
titles, and so on). We can also use the Layout to add annotations and shapes to
the chart.
• The Figure object creates the final object to be plotted. It's an object that
contains both data and layout.

Source: https://www.freecodecamp.org/news/how-and-why-i-used-plotly-instead-of-d3-to-visualize-my-lollapalooza-data-d48345e2ca68/


Bokeh Library
https://docs.bokeh.org/en/latest/



Bokeh
• Bokeh provides Python APIs for writing visualization routines
• Bokeh renders its plots using HTML and JavaScript with the help of the
BokehJS library, which runs in the browser
• Useful for web application development
• Open-source project
• Code at: https://github.com/bokeh/bokeh



Bokeh: Features
• Can interact with common Python data tools such as Pandas, Jupyter
Notebook, etc.
• Plots provide interactivity with the data
• Matplotlib and Seaborn plots are primarily static!
• Users can add custom JS code to customize functionalities of the
Bokeh library
• Bokeh plots can be easily integrated into Flask or Django applications
for building complex visual analytics systems
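
A small sketch of two of the features above: feeding a pandas DataFrame to Bokeh and adding interactivity through a hover tool. The table contents and tooltip fields are made up for illustration.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Made-up table; each row becomes one point.
df = pd.DataFrame(
    {"x": [1, 2, 3, 4], "y": [3, 1, 4, 2], "label": ["a", "b", "c", "d"]}
)
source = ColumnDataSource(df)  # Bokeh's column-oriented data container

p = figure(title="Hover over a point", tools="pan,wheel_zoom,reset")
p.scatter(x="x", y="y", size=12, source=source)

# The tooltip fields refer to columns of the data source via the "@" prefix.
p.add_tools(HoverTool(tooltips=[("label", "@label"), ("x", "@x"), ("y", "@y")]))
# from bokeh.io import show; show(p)  # uncomment to open the plot in a browser
```

For embedding in a Flask or Django template, `bokeh.embed.components(p)` returns a script and div pair that can be placed into the page.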



Interfaces to Bokeh
• bokeh.plotting
• High level interface for plotting data
• Contains definition of Bokeh’s figure class
• Figure class allows plotting vectorized graphics as glyphs
• Glyph: Building block of Bokeh plots
• Lines, circles, rectangles, other shapes
• bokeh.models
• Low level interface to Bokeh library
• More customizable and flexible than bokeh.plotting
• Users have to write more low-level code
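
To make the glyph idea concrete, here is a minimal sketch using the high-level `bokeh.plotting` interface. The data and labels are made up; each glyph method call adds one renderer to the figure.

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4]
y = [2, 5, 3, 6]

p = figure(title="Glyph example", x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2, legend_label="line glyph")  # line glyph
p.scatter(x, y, size=10, legend_label="marker glyph")  # marker glyph at each point
# from bokeh.io import show; show(p)  # uncomment to open the plot in a browser
```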

Figure source: Wikipedia
