IISE 2021 - Data Processing and Visualization

Data Processing and Visualization Training

Contents
1 What is Datviz?
2 Why Datviz?
3 How to do Datviz?
4 Do’s and Don’ts
5 Datviz Fundamentals
6 Anomaly Detection

The difference between data processing and data visualization

1 Data Processing
Data is collected and translated into usable information.
(Figure: Data Processing Life Cycle)

2 Data Visualization
The graphic representation of data.
(Figure: The most-used data visualization applications for daily work)

By using visual elements, data visualization tools provide an accessible way to see and
understand trends, outliers, and patterns in data.

Advantages of Data Processing and Data Visualization

Data Processing (1)
● Highly efficient
● Time-saving
● High-speed processing
● Reduces errors

Data Visualization (2)
● Quick, clear understanding of the information
● Identifying trends and opportunities
● Exploring and gaining business insights
● Speeds up the decision-making process

Source: 1 Source: 2
https://www.elprocus.com/data-processing-types-and-its-applications/
Data Visualization Starts with ‘Why’

“A beautiful chart that no one can read is just abstract art.”

● The choice of what type of visualization to use isn’t purely aesthetic, nor is it entirely personal.
● The wrong choice can lead your viewer to boredom, confusion, or both.
● Even worse, visualizing data inaccurately can constitute a breach of trust between you and your audience.

So, Start with a Clear Purpose or ‘Why’


Ask yourself, “What do I want to show?”

● Comparison: Show differences between values so as to quickly compare categories as well as value change over time.
● Relationship: Explore relationships between values. Find correlations, outliers, and clusters of data.
● Composition: Analyze how each component value affects the total value.
● Distribution: Explore how values are grouped in your data. Usually used when given clear classifications.

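The four purposes above can be turned into a small lookup that suggests a starting chart type. This is only a sketch: the chart names are common conventions and are not taken from the slides, and `CHART_SUGGESTIONS` and `suggest_charts` are hypothetical names used for illustration.

```python
# Hypothetical lookup table: purpose -> commonly used chart types.
# The chart names are general conventions, not a list from this training.
CHART_SUGGESTIONS = {
    "comparison": ["bar chart", "line chart"],
    "relationship": ["scatter plot", "bubble chart"],
    "composition": ["stacked bar chart", "pie chart"],
    "distribution": ["histogram", "box plot"],
}


def suggest_charts(purpose: str) -> list[str]:
    """Return candidate chart types for a stated purpose ("why")."""
    key = purpose.strip().lower()
    if key not in CHART_SUGGESTIONS:
        valid = ", ".join(sorted(CHART_SUGGESTIONS))
        raise ValueError(f"Unknown purpose {purpose!r}; expected one of: {valid}")
    return CHART_SUGGESTIONS[key]


if __name__ == "__main__":
    print(suggest_charts("Relationship"))
```

The suggestions are a starting point only; the final choice still depends on the data and the audience.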
Data Visualization Cheat Sheet

Source: https://www.kaggle.com/getting-started/160583
Data Visualization Grammar and Vocabulary

Datviz Vocabulary: https://github.com/ft-interactive/chart-doctor/tree/master/visual-vocabulary
Datviz Grammar: https://www.kdnuggets.com/2018/08/data-visualization-cheatsheet.html
Data Visualization: ‘I can read this!’ > ‘Wow Cool!’

Problems with donut chart
Although a donut chart looks correct at first glance, it causes major problems.
● Comparing sizes of different groups is error-prone.
● Viewers focus on the center of the chart, while useful data is provided in the background.

Good looking does not mean more readable!
A great-looking chart/plot can have flaws. Some of the usual flaws include:
● Skewing perceptions, making it difficult to rate proportions or even to get more value out of the plot.
● Adding additional objects to the chart may be obstructive.

Doing more things does not mean making a better chart!
Exaggerations are very ill-advised when making charts/plots. Why?
● Scaling purposes -- a chart must have a reliable scale.
● Focus may be shifted to the wrong object in the chart.

Heights are easier to compare
Just as a bar chart is more often, if not always, better than a pie chart, all
height-characterized charts are usually easier to read, more informative, and
more valuable than non-height-characterized charts.

Avoid non-standard shapes
Depending on the situation, a quirky or unique chart may please the
audience more than a simple one. However, as a best practice
(whenever you do not know clearly what you’re trying to achieve with a
unique chart), always refer to the simple ones for readability.

Data Visualization Do’s and Don’ts

DON’T
We often use pie charts or donut charts without percentages, but circular shapes are
harder to interpret. People recognize even percentages like 25%, 50%, 75%, or 100%
pretty well, but normally struggle with intermediate values.

DO
Consider using bar charts for accurate percentage indication.
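
To illustrate why bars help with intermediate percentages, the sketch below prints a sorted text bar chart with explicit percentage labels. The category names and counts are made up for illustration, and `percentage_bars` is a hypothetical helper, not part of any library.

```python
def percentage_bars(counts: dict[str, float], width: int = 40) -> list[str]:
    """Render each category's share of the total as a labelled text bar."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    rows = []
    # Largest share first, so the ranking is visible at a glance.
    for label, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        share = value / total
        bar = "#" * round(share * width)
        rows.append(f"{label:<10} {bar} {share:.1%}")
    return rows


if __name__ == "__main__":
    # Made-up example data: three close shares that are hard to rank in a pie chart.
    for row in percentage_bars({"Product A": 37, "Product B": 33, "Product C": 30}):
        print(row)
```

Shares of 37%, 33%, and 30% look nearly identical as pie slices, while the bar lengths and printed percentages make the order explicit.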


Data Visualization Do’s and Don’ts

DON’T
Don’t rely on the data to highlight itself. While sometimes data highlights itself
(like one column being much bigger than others), it often needs a little help.

DO
If you want people to focus on something that isn’t very obvious, then highlight it
in a different color. Try not to use color fill, in order to maintain good contrast.
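
One possible way to apply the highlighting advice is to build a per-category color list in which only the focus category gets a strong color. This is a minimal sketch: `highlight_colors` and the two hex values are assumptions chosen for illustration.

```python
HIGHLIGHT = "#D55E00"  # assumed accent color for the category to emphasize
MUTED = "#BBBBBB"      # assumed neutral grey for everything else


def highlight_colors(labels, focus, highlight=HIGHLIGHT, muted=MUTED):
    """Return one color per label: the accent for `focus`, grey for the rest."""
    if focus not in labels:
        raise ValueError(f"{focus!r} is not one of the labels")
    return [highlight if label == focus else muted for label in labels]


if __name__ == "__main__":
    # Made-up categories; only "Q3" should draw the eye.
    quarters = ["Q1", "Q2", "Q3", "Q4"]
    print(highlight_colors(quarters, focus="Q3"))
```

Such a list could, for example, be passed as the `color` argument of matplotlib's `Axes.bar`, which accepts one color per bar.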


Data Visualization Do’s and Don’ts

When using colour, the choice of a palette is crucial for good perception of the diagram.
By choosing one yourself, you risk a potential loss of clarity.
Why bother when there is a solid set of colours ready to be used?

Don't mix too many colors together (2-3 is optimal) and, if you
absolutely need to use colors, use palettes designed to be color-blind friendly.
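
As one example of a ready-made set, the sketch below uses the Okabe-Ito palette, a widely used color-blind-friendly palette. The slides do not name a specific palette, so this choice, the ordering of the colors, and the helper `pick_colors` are assumptions.

```python
import warnings

# Okabe-Ito color-blind-friendly palette (hex values).
# The ordering here is an arbitrary choice for this sketch.
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
    "#000000",  # black
]


def pick_colors(n: int, recommended_max: int = 3) -> list[str]:
    """Return the first n palette colors, warning above the 2-3 color guideline."""
    if not 1 <= n <= len(OKABE_ITO):
        raise ValueError(f"n must be between 1 and {len(OKABE_ITO)}")
    if n > recommended_max:
        warnings.warn(f"{n} colors requested; 2-3 is usually easier to read")
    return OKABE_ITO[:n]


if __name__ == "__main__":
    print(pick_colors(3))
```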
Early Anomaly Detection & Insight: Finding Pattern, Trend and Outlier

Trend
In order to know the future, sometimes we need to look back at past data.
A trend is usually used to know whether the future will be an uptrend or a downtrend.
e.g. Based on the graph, we can conclude that there was an increasing trend in the number of
fixed broadband subscriptions in Indonesia.

Pattern
Sometimes we can gain insight from certain behavior of the data itself.
In order to predict future outcomes, we need to analyse the pattern in the dataset.
e.g. There was a positive correlation between highway usage and city usage. We can conclude that
if there is high traffic on the highway, the city is more likely to be crowded.

Outlier
In real life, we often face several abnormalities (noise) in a dataset.
Outliers are the data points that need to be handled in data processing.
e.g. There are several outliers in the horsepower data. Later on, we need to eliminate them or
make several adjustments in order to overcome the problem.
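
The three ideas above can each be checked numerically. The sketch below is one possible approach, not the method used in the slides: a least-squares slope for the trend, Pearson correlation for the pattern, and the 1.5 x IQR rule for outliers. All function names and sample numbers are made up for illustration.

```python
from statistics import mean, quantiles


def trend_slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...).

    A positive slope suggests an uptrend, a negative slope a downtrend.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Made-up numbers, only to show how each helper is called.
    print(trend_slope([1, 2, 3, 4]))
    print(pearson_r([1, 2, 3], [2, 4, 6]))
    print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

Whether a flagged outlier should be removed or adjusted remains a judgment call that depends on how the data was collected.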

https://towardsdatascience.com/15-data-exploration-techniques-to-go-from-data-to-insights-93f66e6805df
Our Experiences
