CSC 428 - 4
Lecture Notes_4
In the world of Big Data, data visualization tools and technologies are essential to analyze
massive amounts of information and make data-driven decisions.
Advantages
Our eyes are drawn to colors and patterns. We can quickly identify red from blue, and
squares from circles. Our culture is visual, including everything from art and
advertisements to TV and movies. Data visualization is another form of visual art that
grabs our interest and keeps our eyes on the message. When we see a chart, we quickly
see trends and outliers. If we can see something, we internalize it quickly. It’s storytelling
with a purpose. If you’ve ever stared at a massive spreadsheet of data and couldn’t see
a trend, you know how much more effective a visualization can be.
Disadvantages
While there are many advantages, some of the disadvantages may seem less obvious.
For example, when viewing a visualization with many different datapoints, it’s easy to
make an inaccurate assumption. Sometimes the visualization itself is poorly designed,
so that it is biased or confusing.
• Correlation doesn’t always mean causation.
• Core messages can get lost in translation.
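To make the first caution concrete, the sketch below computes the Pearson correlation
coefficient for two made-up monthly series. The figures and the names used here
(pearson, ice_cream_sales, sunburn_cases) are hypothetical and chosen only for
illustration. Both series rise in warm weather, so they track each other closely on a
chart even though neither one causes the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures, invented for this example only.
ice_cream_sales = [20, 25, 40, 60, 85, 90]
sunburn_cases = [2, 3, 5, 8, 11, 12]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between the two series: r = {r:.3f}")
```

A coefficient this close to 1 would appear as a tight upward line on a scatter plot, yet
the plausible driver in this invented example is a third variable (temperature) that the
chart does not show.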
The importance of data visualization is simple: it helps people see, interact with, and
better understand data. Whether simple or complex, the right visualization can bring
everyone onto the same page, regardless of their level of expertise.
It’s hard to think of a professional industry that doesn’t benefit from making data more
understandable. Every STEM field benefits from understanding data—and so do fields in
government, finance, marketing, history, consumer goods, service industries, education,
sports, and so on.
While it is easy to wax poetic about data visualization, there are practical, real-life
applications that are undeniable. And, since visualization is so widespread, it’s also one
of the most useful professional skills to develop.
The better you can convey your points visually, whether in a dashboard or a slide deck,
the better you can leverage that information. The concept of the citizen data scientist is
on the rise. Skill sets are changing to accommodate a data-driven world. It is increasingly
valuable for professionals to be able to use data to make decisions and use visuals to tell
stories of when data informs the who, what, when, where, and how.
While traditional education typically draws a distinct line between creative storytelling and
technical analysis, the modern professional world also values those who can cross
between the two: data visualization sits right in the middle of analysis and visual
storytelling.
As the “age of Big Data” kicks into high gear, visualization is an increasingly key tool to
make sense of the trillions of rows of data generated every day. Data visualization helps
to tell stories by curating data into a form easier to understand, highlighting the trends
and outliers. A good visualization tells a story, removing the noise from data and
highlighting useful information.
However, it’s not simply as easy as just dressing up a graph to make it look better or
slapping on the “info” part of an infographic. Effective data visualization is a delicate
balancing act between form and function. The plainest graph could be too boring to catch
any notice or it may make a powerful point; the most stunning visualization could utterly fail
at conveying the right message or it could speak volumes. The data and the visuals need
to work together, and there’s an art to combining great analysis with great storytelling.
A cluster chart is one of the many different examples of data visualization that Tableau
can perform. Of course, one of the best ways to understand data visualization is to see it.
What a crazy concept! With public data visualization galleries and data everywhere
online, it can be overwhelming to know where to start. Tableau’s own public gallery shows
off loads of visualizations made with the free Tableau Public tool, Tableau offers some
common starter business dashboards as usable templates, and Viz of the Day collects
some of the best community creations. Tableau has also collected 10 of the best examples of
data visualization of all time, with examples that map historical conquests, analyze film
scripts, reveal hidden causes of mortality, and more.
When you think of data visualization, your first thought probably goes to
simple bar graphs or pie charts. While these may be an integral part of visualizing data
and a common baseline for many data graphics, the right visualization must be paired
with the right set of information. Simple graphs are only the tip of the iceberg. There’s a
whole selection of visualization methods to present data in effective and interesting ways.
Chart: Information presented in a tabular, graphical form with data displayed along two
axes. Can be in the form of a graph, diagram, or map.
Graph: A diagram of points, lines, segments, curves, or areas that represents certain
variables in comparison to each other, usually along two axes at a right angle.
Geospatial: A visualization that shows data in map form using different shapes and
colors to show the relationship between pieces of data and specific locations.
Infographic: A combination of visuals and words that represent data. Usually uses charts
or diagrams.
Dashboards: A collection of visualizations and data displayed in one place to help with
analyzing and presenting data.
Area Map: A form of geospatial visualization, area maps are used to show specific values
set over a map of a country, state, county, or any other geographic location. Two common
types of area maps are choropleths and isopleths.
Bar Chart: Bar charts represent numerical values compared to each other. The length of
the bar represents the value of each variable.
Box-and-whisker Plots: These show a selection of ranges (the box) across a set
measure (the bar).
Gantt Chart: Typically used in project management, Gantt charts are a bar chart
depiction of timelines and tasks.
Heat Map: A type of geospatial visualization in map form which displays specific data
values as different colors (this doesn’t need to be temperatures, but that is a common
use).
Highlight Table: A form of table that uses color to categorize similar data, allowing the
viewer to read it more easily and intuitively.
Histogram: A type of bar chart that splits a continuous measure into different bins to help
analyze the distribution.
Pie Chart: A circular chart with wedge-shaped segments that shows data as a percentage of
a whole.
Treemap: A type of chart that shows different, related values in the form of rectangles
nested together.
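The histogram and box-and-whisker entries above both reduce a list of numbers to a few
summary values before anything is drawn. The sketch below computes those values using
only the Python standard library. It is an illustration under stated assumptions, not a
description of how Tableau or any other tool works internally: the function names, the
equal-width binning rule, the "inclusive" quartile method, and the test scores are all
choices made for this example.

```python
from statistics import quantiles


def histogram_counts(values, bin_count):
    """Split the range of `values` into equal-width bins and count each bin."""
    if bin_count < 1 or not values:
        raise ValueError("need at least one value and one bin")
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / bin_count or 1.0
    counts = [0] * bin_count
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        index = min(int((v - lo) / width), bin_count - 1)
        counts[index] += 1
    return counts


def box_plot_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical test scores, invented for this example only.
scores = [52, 55, 61, 64, 67, 70, 71, 73, 78, 82, 88, 95]

counts = histogram_counts(scores, 4)
for i, count in enumerate(counts):
    # A text-mode stand-in for the bars a charting tool would draw.
    print(f"bin {i}: {'#' * count}")

summary = box_plot_summary(scores)
print("min, Q1, median, Q3, max:", summary)
```

Changing the number of bins changes the shape of the histogram, which is one reason the
same data can tell different stories depending on how the chart is configured.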
There are dozens of tools for data visualization and data analysis. These range from
simple to complex, from intuitive to obtuse. Not every tool is right for every person looking
to learn visualization techniques, and not every tool can scale to industry or enterprise
purposes. If you’d like to learn more about the options, detailed third-party analyses
such as the Gartner Magic Quadrant are a good place to start.
Also, remember that good data visualization theory and skills will transcend specific tools
and products. When you’re learning this skill, focus on best practices and explore your
own personal style when it comes to visualizations and dashboards. Data visualization
isn’t going away any time soon, so it’s important to build a foundation of analysis and
storytelling and exploration that you can carry with you regardless of the tools or software
you end up using.
Data visualization is one of the steps of the data science process, which states
that after data has been collected, processed and modeled, it must be visualized
for conclusions to be made. Data visualization is also an element of the broader
data presentation architecture (DPA) discipline, which aims to identify, locate,
manipulate, format and deliver data in the most efficient way possible.
Data visualization is important for almost every career. It can be used by teachers
to display student test results, by computer scientists exploring advancements in
artificial intelligence (AI) or by executives looking to share information with
stakeholders. It also plays an important role in big data projects. As businesses
accumulated massive collections of data during the early years of the big data
trend, they needed a way to get an overview of their data quickly and easily.
Visualization tools were a natural fit.
algorithms, it becomes important to visualize the outputs to monitor results and
ensure that models are performing as intended. This is because visualizations of
complex algorithms are generally easier to interpret than numerical outputs.
• the ability to absorb information quickly, improve insights and make faster
decisions;
• an increased understanding of the next steps that must be taken to
improve the organization;
• an improved ability to maintain the audience's interest with information
they can understand;
• an easy distribution of information that increases the opportunity to share
insights with everyone involved;
• a reduced reliance on data scientists, since data is more accessible and
understandable; and
• an increased ability to act on findings quickly and, therefore, achieve
success with greater speed and fewer mistakes.
and present information to business owners and stakeholders in ways they can
understand.
Big data visualization often goes beyond the typical techniques used in normal
visualization, such as pie charts, histograms and corporate graphs. It instead uses
more complex representations, such as heat maps and fever charts. Big data
visualization requires powerful computer systems to collect raw data, process it
and turn it into graphical representations that humans can use to quickly draw
insights.
While big data visualization can be beneficial, it can pose several disadvantages
to organizations. They are as follows:
• infographics
• bubble clouds
• bullet graphs
• heat maps
• fever charts
• time series charts
Line charts. This is one of the most basic and common techniques used. Line
charts display how variables can change over time.
Scatter plots. This technique displays the relationship between two variables.
A scatter plot takes the form of an x- and y-axis with dots to represent data points.
Treemaps. This method shows hierarchical data in a nested format. The size of
the rectangles used for each category is proportional to its percentage of the
whole. Treemaps are best used when multiple categories are present, and the goal
is to compare different parts of a whole.
Population pyramids. This technique uses a stacked bar graph to display the
complex social narrative of a population. It is best used when trying to display the
distribution of a population.
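The treemap rule described above, that each rectangle's size is proportional to its
category's share of the whole, can be sketched in a few lines. The example below uses a
simple "slice" layout that places rectangles side by side across a fixed canvas.
Production tools generally use more elaborate layouts, so treat this only as a minimal
illustration. The function name treemap_slices, the canvas dimensions, and the budget
figures are hypothetical.

```python
def treemap_slices(sizes, width, height):
    """Lay out one level of a treemap as side-by-side vertical slices.

    Returns a dict mapping each category to (x, y, w, h). Every rectangle
    spans the full height, so its area is proportional to its size.
    """
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("sizes must sum to a positive number")
    rects = {}
    x = 0.0
    for name, size in sizes.items():
        w = width * size / total  # width share equals the category's share
        rects[name] = (x, 0.0, w, height)
        x += w  # the next slice starts where this one ends
    return rects


# Hypothetical budget categories, invented for this example only.
budget = {"Salaries": 50, "Equipment": 30, "Travel": 15, "Other": 5}
canvas_width, canvas_height = 100.0, 40.0

rects = treemap_slices(budget, canvas_width, canvas_height)
for name, (x, y, w, h) in rects.items():
    share = (w * h) / (canvas_width * canvas_height)
    print(f"{name:<10} x={x:5.1f} width={w:5.1f} share={share:.0%}")
```

Hierarchical data can be handled by calling the same function again inside each
rectangle, using that rectangle's width and height as the new canvas.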
Sales and marketing. Research from market and consumer data provider Statista
estimated $566 billion was spent on digital advertising in 2022 and that number
will cross the $700 billion mark by 2025. Marketing teams must pay close attention
to their sources of web traffic and how their web properties generate revenue. Data
visualization makes it easy to see how marketing efforts affect traffic trends over
time.
Logistics. Shipping companies can use visualization tools to determine the best
global shipping routes.
1.6 The science of data visualization
The science of data visualization comes from an understanding of how humans
gather and process information. Daniel Kahneman and Amos Tversky collaborated on
research that defined two different methods for gathering and processing
information.
The generated images may also include interactive capabilities, enabling users to
manipulate them or look more closely into the data for questioning and analysis.
Indicators designed to alert users when data has been updated or when predefined
conditions occur can also be integrated.
As data visualization vendors extend the functionality of these tools, they are
increasingly being used as front ends for more sophisticated big data
environments. In this setting, data visualization software helps data engineers and
scientists keep track of data sources and do basic exploratory analysis of data sets
prior to or after more detailed advanced analyses.
The biggest names in the big data tools marketplace include Microsoft, IBM, SAP
and SAS. Some other vendors offer specialized big data visualization software;
popular names in this market include Tableau, Qlik and Tibco.
While Microsoft Excel continues to be a popular tool for data visualization, others
have been created that provide more sophisticated abilities:
• MicroStrategy
• Google Charts