Data Visualization 1

Data visualization is the graphical representation of data using visual elements like charts and graphs, making complex information easier to understand. It has a rich history spanning from prehistoric cave paintings to modern interactive tools, and is essential for discovering trends and insights in various fields. Effective data visualization requires clarity, simplicity, accuracy, and relevance, while also considering the audience's ability to interpret the visuals.

Definition of DATA VISUALIZATION

• Data visualization is the graphical representation of quantitative information and data using visual elements like
graphs, charts, and maps. Data visualization converts large and small data sets into visuals, which are easier for
humans to understand and process.
• Data visualization tools provide accessible ways to understand outliers, patterns, and trends in the data. In the world
of Big Data, data visualization tools and technologies are required to analyze vast amounts of information. Data
visualizations are common in everyday life, where they often appear in the form of graphs and charts. A
combination of multiple visualizations and bits of information is referred to as an infographic.
• Data visualizations are used to discover unknown facts and trends. Line charts display change over time. Bar and
column charts are useful for observing relationships and making comparisons. A pie chart is a common way to show
parts of a whole, and maps are a natural way to share geographical data visually.
• Today's data visualization tools go beyond the charts and graphs used in Microsoft Excel spreadsheets, displaying
data in more sophisticated ways such as dials and gauges, geographic maps, heat maps, pie charts,
and fever charts.
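
As a minimal sketch of what converting a data set into a visual means, the Python snippet below renders a small table as a text bar chart. The function name `text_bar_chart` and the quarterly figures are invented for illustration; real tools produce graphical output, but the idea of mapping each value to a proportional length is the same.

```python
def text_bar_chart(data, width=20):
    """Return one line per item: a label, then a bar whose length is proportional to the value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale so that the largest value fills the full width.
        bar = "#" * round(value * width / largest)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return lines


# Hypothetical quarterly sales, used only to illustrate the idea.
sales = {"Q1": 50, "Q2": 100, "Q3": 75, "Q4": 25}
for line in text_bar_chart(sales):
    print(line)
```

Even in this crude form, the longest and shortest bars stand out faster than the same numbers would in a table.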

What Makes Data Visualization Effective?

• Effective data visualizations are created where communication, data science, and design collide. Data visualizations
done right turn key insights from complicated data sets into something meaningful and natural. American statistician
and Yale professor Edward Tufte believes excellent data visualizations consist of “complex ideas communicated with
clarity, precision, and efficiency.”

• To craft an effective data visualization, you need to start with clean data that is well-sourced and complete. Once the
data is ready to visualize, you need to pick the right chart. After you have decided on the chart type, you need to design
and customize your visualization. Simplicity is essential: you don't want to add any elements that
distract from the data.

History of Data Visualization
The history of data visualization is a long and fascinating journey that spans centuries. Here is a brief overview of key
developments and milestones in the history of data visualization:
1. Prehistoric Period: Early humans used cave paintings, drawings, and symbols to convey information about their
surroundings, such as maps of hunting grounds or celestial charts.
2. Ancient and Classical Periods: Ancient civilizations like the Egyptians, Greeks, and Romans created visual
representations of data through maps, diagrams, and inscriptions. For instance, the Greeks produced some of the
earliest known geographic maps and anatomical diagrams.
3. Middle Ages: During this period, illuminated manuscripts and diagrams were used to illustrate complex ideas in
various fields, including theology, astronomy, and medicine.
4. Renaissance: The Renaissance saw a resurgence of interest in science and art, which led to the creation of detailed
maps, anatomical drawings, and early cartography. Leonardo da Vinci's notebooks are famous for their detailed
scientific illustrations.
5. 17th-18th Century: The Age of Enlightenment marked the beginning of more systematic data visualization.
Early statisticians like John Graunt compiled systematic mortality tables that paved the way for statistical graphics,
and advancements in astronomy led to the development of star charts and celestial maps.
6. Late 18th-19th Century: This period saw the emergence of statistical charts and thematic maps, particularly in the
field of geography. Pioneers like William Playfair and Charles Minard developed innovative ways to display data
using graphical representations.
7. Mid-19th Century: Florence Nightingale is known for her use of polar area diagrams to illustrate the significance of
proper sanitation practices in hospitals during the Crimean War.
8. 19th-Early 20th Century: With the advent of photography and the industrial revolution, there was an explosion in the
creation of data visualizations. Pioneers like Charles Joseph Minard (known for his flow maps), John Snow (famous
for his cholera map), and Otto Neurath (creator of Isotype) made significant contributions to the field.
9. 20th Century: The development of electronic computers in the mid-20th century marked a significant turning point
in data visualization. It became easier to create complex and interactive visualizations. Notable examples include the
work of Edward Tufte and the introduction of graphical user interfaces (GUIs) for data visualization.
10. Late 20th-21st Century: The digital age brought a revolution in data visualization. Software tools like Tableau, D3.js,
and ggplot2 in R made it easier for non-experts to create informative and interactive data visualizations. Infographics
and interactive web-based data visualization became increasingly popular.
11. Present and Future: Data visualization continues to evolve with advances in technology and the availability of big
data. Virtual reality, augmented reality, and machine learning techniques are being integrated into data visualization
to provide new ways of exploring and understanding data.
Today, data visualization is a critical tool in various fields, including business, science, journalism, and education. It plays
a vital role in conveying complex information in a visually engaging and comprehensible manner.

Why Use Data Visualization?


1. To make data easier to understand and remember.
2. To discover unknown facts, outliers, and trends.
3. To visualize relationships and patterns quickly.
4. To ask better questions and make better decisions.
5. To perform competitive analysis.
6. To improve insights.

Advantages of Data Visualization


1. Better Comparison : In business, we often need to compare the performance of two components or two
scenarios. The conventional approach is to go through the massive data for both situations and then analyze it.
This clearly takes a great deal of time.
2. A Better Method : Data visualization solves this difficulty by putting the data for both scenarios into pictorial
form, which gives a much better understanding of the situations. For instance, Google Trends
helps us understand data related to top searches or queries in pictorial or graphical form.
3. Easy Sharing of Data : With the visualization of data, organizations gain a new means of
communication. Rather than sharing cumbersome raw data, sharing visual information engages the audience and
conveys the information in a more digestible form.
4. Sales Analysis : With the help of data visualization, a salesperson can easily
understand the sales graph of products. With data visualization tools like heat maps, they can
understand the factors that are pushing sales numbers up as well as the reasons that are dragging
sales numbers down. Data visualization helps in understanding trends and also other factors
such as the types of customers interested in buying, repeat customers, the influence of geography, and so on.
5. Finding Relations Between Events : A business is affected by many factors. Finding a correlation
between these factors or events helps decision-makers understand the issues related to their business. For
instance, the e-commerce market is nothing new today. During certain festive seasons, like
Christmas or Thanksgiving, the sales graphs of online businesses go up.

Disadvantages of Data Visualization


1. Can be time-consuming: Creating visualizations can be a time-consuming process, especially when dealing with large
and complex datasets. This can slow down the machine learning workflow and reduce productivity.
2. Can be misleading: While data visualization can help identify patterns and relationships in data, it can also be
misleading if not done correctly. Visualizations can create the impression of patterns or trends that may not actually
exist, leading to incorrect conclusions and poor decision-making.
3. Can be difficult to interpret: Some types of visualizations, such as those that involve 3D or interactive elements, can
be difficult to interpret and understand. This can lead to confusion and misinterpretation of the data.
4. May not be suitable for all types of data: Certain types of data, such as text or audio data, may not lend themselves
well to visualization. In these cases, alternative methods of analysis may be more appropriate.
5. May not be accessible to all users: Some users may have visual impairments or other disabilities that make it difficult
or impossible for them to interpret visualizations. In these cases, alternative methods of presenting data may be
necessary to ensure accessibility.
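
The "can be misleading" point can be made concrete with a little arithmetic. The sketch below uses made-up numbers to compare how much taller one bar looks than another when the value axis starts at zero versus when it is truncated. The helper `apparent_ratio` is hypothetical and is not part of any charting library.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b when the axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Made-up values: 100 vs 110 is a true 10% difference.
honest = apparent_ratio(100, 110, baseline=0)      # axis starts at zero
truncated = apparent_ratio(100, 110, baseline=90)  # axis starts at 90

print(f"zero baseline: bar B looks {honest:.2f}x as tall as bar A")
print(f"baseline 90:   bar B looks {truncated:.2f}x as tall as bar A")
```

With the truncated axis, a 10% difference is drawn as a doubling, which is the kind of distortion that can lead viewers to see a trend the data does not support.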

Characteristics of Effective Data Visualization:


Data visualization is a powerful tool for conveying information and insights in a visual format. Here are some key
characteristics of effective data visualization:
1. Clarity: Data visualizations should be clear and easily understandable. The primary purpose is to convey information,
so the audience should be able to grasp the message quickly and accurately.
2. Simplicity: Less is often more. Simplify the visualization by removing unnecessary elements, labels, and decorations.
A clean design helps focus on the essential data.
3. Relevance: Ensure that the data and visual elements used are directly relevant to the message you want to convey.
Avoid distractions or unrelated details.
4. Accuracy: Data should be accurately represented. Errors or distortions in the data can lead to misinterpretations and
incorrect conclusions.
5. Consistency: Maintain consistency in design elements such as colors, fonts, and scales throughout the visualization.
This makes it easier for viewers to understand and compare data points.
6. Context: Provide context and background information to help viewers understand the data and its significance. This
may include titles, labels, and captions.
7. Interactivity: Interactive features can enhance data visualization by allowing users to explore the data further. Tools
like tooltips, zooming, and filtering can help users gain deeper insights.
8. Aesthetics: While aesthetics are important, they should not overshadow the data. Visual appeal should complement
the content without overwhelming it.
9. Efficiency: Effective data visualization should communicate information efficiently. Viewers should be able to extract
insights quickly without spending excessive time analyzing the visualization.
10. Storytelling: Data visualization should tell a story or convey a narrative. It should guide the viewer through the data
and highlight key points or trends.
11. Appropriateness: Choose the right type of visualization for the data and the message you want to convey. Whether
it's a bar chart, line graph, scatter plot, or more advanced visualizations like heatmaps or Sankey diagrams, the
choice should suit the data's characteristics.

Data visualization process:
1. Determine the decision you want to make
“One of the biggest pitfalls in data visualization is people worrying too much about making the visuals look a certain
way. The important work happens long before that point,” says Cook. In other words, don’t get wrapped up in colors and
other aesthetics too soon. Your first step is figuring out what decision you’re trying to make. You can have all the data in
the world, but it won’t mean much if you’re not sure what to do with it. Cook recommends posing the decision in the
form of a question so you’re clear on the answer you’re seeking. “If you aren’t clear on your decision, your visual won’t
be either,” he explains. Here’s an example of a clear decision question: During which fiscal quarter should we launch
our new product?
2. Identify the metrics that inform the decision
You likely have tons of data available to you, but only certain data points will be relevant to your decision. Before
getting overwhelmed by data sets, consider which specific points would be most helpful for answering your decision
question.
Once you identify the right metrics, determine whether you can actually collect them with any accuracy. You may
find that some data points either aren’t available or are inaccurate. In this case, you typically have two alternatives: Kick
off a project to collect the data (such as developing and distributing a survey) or revisit the first step and
adjust your question.
3. Develop the story you want to tell
Next up is developing a story from your data. Cook shares a few questions you can use to prepare your narrative:
• Is the data about comparison? You may be making a decision based on metrics being bigger or smaller — or faster
or slower.
• Is the data about changes over time? Your decision may concern entering a new market or tracking product launch
performance over time.
• Is the data about categorization? You may have a cost-based decision that needs to identify where the business
is losing money.
4. Select the appropriate visual
This part of the data visualization process is fairly simple, as most visuals naturally follow the type of story you want
to tell. Consider these examples:
• Comparison stories typically work best with bar graphs.
• Time-based stories pair well with line charts.
• Categorical stories typically necessitate tree charts.
5. Add relevant elements to the visual
“Now is the point in the data visualization process when you can focus on aesthetics,” says Cook. The purpose of this
step is to make choices about your visual that aid not only its appeal but also comprehension. You may need
to add callouts to your chart to emphasize certain data points or add important context. For instance, say you created a
chart that was missing a week of sales data. The audience may assume you made a mistake, but you left out the
data for good reason: a hurricane caused the business to close that week. A well-placed callout can prevent this
confusion. Color decisions can benefit from a designer’s eye and some common sense. For example,
people often associate red with negativity (recall the saying about sales being “in the red”). So if your chart is sharing
good news, you may want to avoid using that color.
6. Clearly label and review the visual
Where the previous step was about choosing visualization elements, this step is about making note of the choices you
made. Title the visual appropriately. Make sure units are correct (e.g., dollars vs euros) and incremented consistently.
Ensure there’s a legend to explain color meanings. “Here you’re just making sure the audience doesn’t have unnecessary
questions about what they’re viewing,” Cook explains.
7. Let a nonexpert review the visual
“The last step of the data visualization process is quite important. You need a different set of eyes on the visual you’ve
created, preferably eyes that don’t have the same knowledge or experience as your own,” says Cook. Giving your
visual to someone else to review, especially someone who doesn’t know much about the subject matter or underlying
data, is an important spot check. Ideally, they should be able to comprehend the story you’re trying to communicate
without any issues. If they have any trouble, Cook says you may need to go back a few steps. The most common problem
is using the wrong type of chart for the data you’re presenting. Otherwise, you may just need to add a callout or two to
fill in any blanks in the visual narrative. “But if you’ve followed these steps carefully and dedicated a reasonable amount
of time to the task, you should be set,” Cook says.
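
The pairing of story type and chart in step 4 can be written down as a small lookup, sketched here in Python. The dictionary keys and the function name `suggest_chart` are illustrative labels rather than an established API, and the mapping covers only the three story types listed above.

```python
# Mapping taken from step 4 of the process; the keys are illustrative labels.
CHART_FOR_STORY = {
    "comparison": "bar graph",
    "change over time": "line chart",
    "categorization": "tree chart",
}


def suggest_chart(story_type):
    """Return the chart suggested for a story type, or None if the story type is not covered."""
    return CHART_FOR_STORY.get(story_type.strip().lower())


print(suggest_chart("Comparison"))
print(suggest_chart("Change over time"))
print(suggest_chart("correlation"))  # not one of the three story types, so None
```

Returning `None` for an unknown story type is a deliberate choice in this sketch: it signals that the story needs more thought (step 3) rather than silently defaulting to some chart.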
Categories of Data Visualization
Data visualization is critical to market research, where both numerical and categorical data can be visualized. This
increases the impact of insights and reduces the risk of analysis paralysis. Data
visualization is divided into the following categories:
1. Numerical Data :
Numerical data is also known as Quantitative data. Numerical data is any data that represents an
amount, such as a person’s height, weight, or age. Numerical data is the easiest kind of data to visualize. Visualizing
it helps others digest large data sets and raw numbers in a way that is easier to interpret
and act on. Numerical data is divided into two categories :
• Continuous Data –
It can take any value within a range and can be measured to ever finer precision (Example: Height measurements).
• Discrete Data –
It takes only distinct, countable values (Example: Number of cars or children a household has).
The visualization techniques used to represent numerical data are Charts and Numerical
Values. Examples are Pie Charts, Bar Charts, Averages, Scorecards, etc.
2. Categorical Data :
Categorical data is also known as Qualitative data. Categorical data is any data that represents
groups. It consists of categorical variables that describe characteristics such as a person’s
ranking, a person’s gender, etc. Categorical data visualization is all about depicting key themes, establishing
connections, and lending context. Categorical data is classified into three categories :
• Binary Data –
In this, classification is based on exactly two opposing options (Example: Agrees or Disagrees).
• Nominal Data –
In this, classification is based on attributes that have no inherent order (Example: Male or Female).
• Ordinal Data –
In this, classification is based on ordering of information (Example: Timeline or processes).
The visualization techniques used to represent categorical data are Graphics, Diagrams, and
Flowcharts. Examples are Word clouds, Sentiment Mapping, Venn Diagram, etc.
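
The categories above can be turned into a rough rule of thumb in code. The Python sketch below guesses a category from the values in a column; `classify_column` is a hypothetical helper, and its rules are simplifying assumptions (for example, whole numbers are treated as discrete, and ordinal data is recognized only when the caller supplies the ranking).

```python
def classify_column(values, order=None):
    """Guess the data category of a non-empty column of values.

    `order` is an optional list giving the ranking of the categories; it is the
    only way this sketch can tell ordinal data from nominal data.
    """
    if not values:
        raise ValueError("values must not be empty")
    # bool is a subclass of int in Python, so exclude it from the numeric check.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        if all(float(v).is_integer() for v in values):
            return "numerical / discrete"
        return "numerical / continuous"
    distinct = set(values)
    if len(distinct) == 2:
        return "categorical / binary"
    if order is not None and distinct <= set(order):
        return "categorical / ordinal"
    return "categorical / nominal"


# Made-up sample columns.
print(classify_column([1.72, 1.65, 1.80]))             # heights in metres
print(classify_column([2, 0, 3, 1]))                   # cars per household
print(classify_column(["Agree", "Disagree", "Agree"]))
print(classify_column(["red", "green", "blue"]))
print(classify_column(["poor", "good", "fair"], order=["poor", "fair", "good"]))
```

A heuristic like this only inspects values, so it will misjudge some columns (heights recorded as whole centimetres would be labeled discrete). Knowing how the data was collected remains the reliable way to decide.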

Why is Data Visualization So Important?


1. Data Visualization Discovers the Trends in Data
The most important thing that data visualization does is discover the trends in data. After all, it is much easier to observe
data trends when all the data is laid out in front of you in a visual form as compared to data in a table. For example,
consider a Tableau visualization that shows the sum of sales made by each customer in descending order, where the
color red denotes loss while grey denotes profit. It is very easy to observe from such a visualization that even though
some customers may have huge sales, they are still at a loss. This would be very difficult to observe from a table.
2. Data Visualization Provides a Perspective on the Data
Data Visualization provides a perspective on data by showing its meaning in the larger scheme of things. It demonstrates
how particular data points stand with respect to the overall data picture. For example, a Tableau visualization of
sales against profit puts the two measures in perspective: it can show that
there are very few sales above 12K and that higher sales do not necessarily mean a higher profit.
3. Data Visualization Puts the Data into the Correct Context
It is very difficult to understand the context of the data without data visualization. Since context provides the whole
circumstances of the data, it is very difficult to grasp by just reading numbers in a table. For example, a Tree Map
in Tableau can be used to show the number of sales in each region of the United States. It is very easy to
understand from such a visualization that California has the largest number of sales out of the total, since the
rectangle for California is the largest. This information is not easy to grasp from the raw numbers alone, without
data visualization.
4. Data Visualization Saves Time
It is definitely faster to gather some insights from the data using data visualization rather than just studying a table. For
example, in a Tableau heat map of profit by state, it is very easy to identify the states that have suffered a net loss rather than a profit.
This is because all the cells with a loss are colored red, so it is obvious which states have suffered a loss.
Compare this to a normal table where you would need to check each cell to see if it has a negative value to determine a
loss. Data visualization saves a lot of time in this situation.
5. Data Visualization Tells a Data Story
Data visualization is also a medium to tell a data story to the viewers. The visualization can be used to present the data
facts in an easy-to-understand form while telling a story and leading the viewers to an inevitable conclusion. This data
story, like any other type of story, should have a good beginning, a basic plot, and an ending that it is leading towards.
For example, if a data analyst has to craft a data visualization for company executives detailing the profits on various
products, then the data story can start with the profits and losses of various products and move on to recommendations
on how to tackle the losses.
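
The time saving described in point 4 comes from a very simple rule: color a cell red when profit is negative. A minimal Python sketch of that rule follows. The state names and figures are invented for illustration and are not taken from the Tableau example, and `color_for_profit` is a hypothetical helper.

```python
def color_for_profit(profit):
    """Map a profit figure to a cell color: red for a loss, grey otherwise."""
    return "red" if profit < 0 else "grey"


# Invented figures for illustration only.
profit_by_state = {"State A": 1200.0, "State B": -350.0, "State C": 0.0, "State D": -40.5}

colors = {state: color_for_profit(profit) for state, profit in profit_by_state.items()}
loss_states = sorted(state for state, color in colors.items() if color == "red")
print(loss_states)
```

The code has to test every value, exactly as a reader of a plain table would. The visualization does that work once so that the viewer only has to look for red.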

Model of communication systems


The Communication Systems Model, often referred to as the Shannon-Weaver model or the Shannon-Weaver
communication model, is a fundamental framework used to understand how communication processes work. It was
developed by Claude Shannon and Warren Weaver in 1949 and has since been adapted and expanded upon in various
fields, including information theory, telecommunications, and general communication studies. The model breaks down
communication into several key components and illustrates the flow of information from a sender to a receiver. Here's
an in-depth explanation of each component:
1. Sender (Information Source): The sender is the individual, group, or entity that initiates the communication process.
• In data visualization, the sender could be the person or team responsible for creating the data visualization, such
as a data analyst, designer, or journalist.
2. Message: The message is the information, data, or content that the sender wishes to convey to the receiver.
• In data visualization, the message is the visual representation of data, including charts, graphs, tables, and
accompanying text or annotations.
3. Encoding: Encoding is the process of converting the message into a form suitable for transmission. This may involve
selecting the appropriate visual elements, colors, scales, and formats to effectively convey the data.
• In data visualization, encoding includes selecting the chart type, choosing data variables for axes, applying color
schemes, and creating labels.
4. Channel: The channel is the medium or method used to transmit the message from the sender to the receiver.
Channels can be physical (e.g., printed reports, presentations) or digital (e.g., websites, mobile apps).
• In data visualization, the channel could be a dashboard, a web application, a printed report, a presentation, or any
other platform for displaying visualized data.
5. Decoding: Decoding is the process by which the receiver interprets and makes sense of the message. It involves
understanding the visual elements and extracting meaning from the data visualization.
• In data visualization, decoding requires the viewer to interpret the charts, graphs, and other visual elements to
derive insights.
6. Receiver: The receiver is the individual or group for whom the message is intended. They are responsible for
receiving, interpreting, and, ideally, understanding the message.
• In data visualization, the receiver is the audience or users who are viewing and analyzing the visualized data.
7. Noise: Noise refers to any interference, distortion, or factors that can disrupt the communication process. It can
occur at any point in the model and affect the accuracy and clarity of the message.
• In data visualization, noise can manifest as inaccuracies in data, misleading visual elements, or distractions that
hinder the audience's understanding.
8. Feedback: Feedback is a vital component of the communication model. It represents the response or reaction of the
receiver to the message. Feedback can be used to assess the effectiveness of the communication and make
necessary adjustments.
• In data visualization, feedback may include user interactions, questions, comments, and reactions that help
improve the visualizations.
9. Context: The context includes the broader environment in which the communication takes place. It encompasses
factors such as the purpose of the communication, the cultural context, the sender's goals, and the expectations of
the receiver.
The Communication Systems Model emphasizes the importance of effective encoding, clear channels, accurate
decoding, and the feedback loop in successful communication. In the context of data visualization, it highlights the role
of the data visualizer (sender), the visual representation (message), the choice of platform (channel), the audience
(receiver), and the ongoing process of interpretation and feedback. Understanding this model can help data
practitioners create data visualizations that effectively convey insights and ensure that the intended message is received
and understood by the audience.
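
One way to make the encoding, channel, decoding, and noise components tangible is a toy round trip, sketched below in Python. It assumes a deliberately simple setup that is not part of the Shannon-Weaver model itself: the sender encodes each value as a bar length in whole characters, the "channel" is a display of limited width, and the receiver decodes by reading the lengths back. The function names and data are invented for illustration.

```python
def encode(values, max_value, width):
    """Sender side: map each value to a bar length in whole characters."""
    return [round(v * width / max_value) for v in values]


def decode(bar_lengths, max_value, width):
    """Receiver side: estimate each value back from its bar length."""
    return [length * max_value / width for length in bar_lengths]


message = [12, 47, 80, 100]  # made-up data the sender wants to convey
for width in (10, 50):       # a coarse and a finer "channel"
    received = decode(encode(message, 100, width), 100, width)
    worst = max(abs(sent - got) for sent, got in zip(message, received))
    print(f"width={width}: received={received}, worst error={worst}")
```

The rounding error plays the role of noise here: the narrower the display, the less faithfully the receiver can recover the original values, even though sender and receiver agree on the encoding.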

Types of Communication Problems: Technical, Semantic, and Effectiveness
Communication problems in data visualization can be categorized into three main types: technical, semantic, and
effectiveness issues. These categories help to differentiate the nature of the problems and guide efforts to resolve them:
Technical Communication Problems:
a. Performance Issues:
Problem: Slow loading times or unresponsive interactions in interactive data visualizations can frustrate users and deter
engagement.
Solution: Optimize the performance of data visualization tools and platforms to ensure smooth user experiences.
b. Compatibility and Rendering:
Problem: Data visualizations may not display correctly or interact as intended across different devices, browsers, or
screen sizes.
Solution: Test and ensure compatibility on various platforms and use responsive design to adapt to different screen
sizes.
c. Data Integration:
Problem: Data from multiple sources may not integrate seamlessly, leading to discrepancies or inconsistencies.
Solution: Implement data integration solutions and data cleansing processes to ensure data consistency.
d. Security and Privacy:
Problem: Security and privacy concerns may arise when sharing sensitive or confidential data visualizations.
Solution: Implement appropriate security measures, such as user access controls and encryption, to protect data.
Semantic Communication Problems:
a. Misinterpretation:
Problem: Data visualizations may be misunderstood due to ambiguity or a lack of clarity in the way
data is presented.
Solution: Ensure labels and legends are clear, and consider user testing to confirm that the visualization is interpreted as
intended.
b. Data Overloading:
Problem: Visualizations may present too much information at once, leading to cognitive overload.
Solution: Simplify the visualization, prioritize the most important information, and provide options for drilling down into
details.
c. Terminology and Jargon:
Problem: Using technical terms or jargon unfamiliar to the audience can hinder
comprehension.
Solution: Use plain language explanations and provide definitions for unfamiliar terms.
d. Cultural and Contextual Issues:
Problem: Differences in cultural context or local knowledge may impact how data
visualizations are understood.
Solution: Consider the cultural background of the audience and provide additional context or explanations when
necessary.
Effectiveness Communication Problems:
a. Lack of Engagement:
Problem: The audience may not find the visualization engaging, leading to a lack of interest in the
data.
Solution: Use storytelling techniques, compelling visuals, and interactivity to engage the audience.
b. Failure to Convey Insights:
Problem: Data visualizations may not effectively convey the intended insights or key
takeaways.
Solution: Ensure the message is clear and the visualization emphasizes the most important data points.
c. Irrelevant Information:
Problem: Visualizations may include data that is irrelevant to the audience's needs, resulting in information overload.
Solution: Customize visualizations to match the audience's specific interests and requirements.
d. Lack of Feedback and Iteration:
Problem: Failing to seek feedback or iterate on visualizations can result in missed opportunities for improvement.
Solution: Continuously gather feedback from users and colleagues and use it to refine and enhance visualizations.

By categorizing communication problems in data visualization as technical, semantic, or effectiveness issues, you can
identify the specific nature of the problem and apply appropriate solutions to enhance the clarity and impact of your
visualizations.
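
As one possible illustration of the "simplify and prioritize" solution to data overloading, the Python sketch below keeps the largest categories and folds the rest into a single "Other" entry before charting. The helper `top_n_with_other` and the sample counts are hypothetical.

```python
def top_n_with_other(counts, n):
    """Keep the n largest categories and sum the rest into a single 'Other' entry."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    result = dict(ranked[:n])
    remainder = sum(value for _, value in ranked[n:])
    if remainder:
        result["Other"] = remainder
    return result


# Hypothetical sales counts by product category.
sales = {"A": 120, "B": 95, "C": 40, "D": 12, "E": 7, "F": 3}
print(top_n_with_other(sales, 3))
```

The full breakdown is not discarded: the original dictionary can still back a drill-down view for readers who want the detail behind "Other".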

Data types in data visualization
In data visualization, the choice of data types plays a critical role in determining how information is conveyed and
understood. The appropriate data type depends on the nature of the data and the message you want to convey. Here
are some common data types used in data visualization:
1. Numerical Data:
Numerical data consists of numbers and can be further categorized into:
• Continuous Data: Represents values that can take any real number within a range. Examples include temperature,
height, and time.
• Discrete Data: Comprises countable values with gaps between them. Examples include the number of employees
in a company, the count of products sold, or the number of cars in a parking lot.
2. Categorical Data:
Categorical data represents distinct categories, and it is often used to group data into non-numeric labels. Examples
include:
• Nominal Data: Categories without any inherent order or ranking, such as colors or types of animals.
• Ordinal Data: Categories with a meaningful order, but the intervals between them are not necessarily equal.
For instance, education levels (e.g., high school, college, graduate) or customer satisfaction levels (e.g., poor,
fair, good).
3. Time Series Data:
Time series data represents values collected at different points in time, making it essential for visualizing trends,
patterns, and changes over time. Examples include stock prices, weather data, and sales figures over months or
years.
4. Text Data:
Text data visualization is used to convey information from textual sources. Common methods include word clouds,
text sentiment analysis, and textual network analysis.
5. Geospatial Data:
Geospatial data visualization is used to represent data with geographic or spatial components. Examples include
maps, heat maps, choropleth maps, and spatial distributions.
6. Hierarchical Data:
Hierarchical data visualization is suitable for data organized in a hierarchical structure. Examples include
organizational charts, family trees, and file directory structures.
7. Network Data:
Network data visualization is employed to depict relationships and connections between entities. Examples include
social network graphs, supply chain networks, and web page linking structures.
8. Multivariate Data:
Multivariate data visualization involves the representation of data with multiple variables, often in a two- or three-
dimensional space. Techniques include scatter plots, bubble charts, parallel coordinates, and radar charts.
9. Image and Video Data:
Image and video data visualization techniques are used for conveying visual information. Examples include medical
imaging, satellite imagery, and video analytics.
10. Financial Data:
Financial data visualization techniques are specifically designed for representing financial information such as stock
prices, portfolio performance, and economic indicators.
11. Temporal Data:
Temporal data visualization focuses on patterns and trends within a specific time frame, which can be a subset of
time series data. Examples include hourly temperature fluctuations, daily website traffic, or monthly sales data.
12. Big Data:
Big data visualization is concerned with handling and representing vast amounts of data that traditional methods
may not handle efficiently. Techniques include data aggregation, sampling, and interactive visualizations.
13. Scientific Data:
Scientific data visualization techniques are used in fields such as biology, physics, and chemistry to represent
complex scientific data, including molecule structures, biological pathways, and particle collisions.
Identifying the data type correctly and choosing a matching visualization method is crucial for conveying insights and
ensuring that the audience can understand and interpret the information correctly. The visualization you select should
align with the message you want to convey and the characteristics of the data you are working with.
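The distinction between continuous, discrete, nominal, and ordinal data decides which summaries (and therefore which charts) make sense. The following is a minimal sketch using only the Python standard library; the survey records, field names, and the `summarize` helper are hypothetical and exist only for illustration.

```python
from statistics import mean, median_high, mode

# Hypothetical survey records; all values are made up for illustration.
records = [
    {"height_cm": 171.5, "cars_owned": 1, "color": "blue", "satisfaction": "good"},
    {"height_cm": 164.2, "cars_owned": 0, "color": "red", "satisfaction": "poor"},
    {"height_cm": 180.0, "cars_owned": 2, "color": "blue", "satisfaction": "fair"},
    {"height_cm": 158.9, "cars_owned": 1, "color": "green", "satisfaction": "good"},
]

# Ordinal data needs an explicit order; sorting the labels alphabetically
# would put "fair" before "good" before "poor", which is not the intended ranking.
SATISFACTION_ORDER = ["poor", "fair", "good"]


def summarize(rows):
    """Apply a summary that suits each data type."""
    heights = [r["height_cm"] for r in rows]      # continuous: averaging is meaningful
    cars = [r["cars_owned"] for r in rows]        # discrete: counts can be totaled
    colors = [r["color"] for r in rows]           # nominal: only frequency is meaningful
    ranks = [SATISFACTION_ORDER.index(r["satisfaction"]) for r in rows]  # ordinal
    return {
        "mean_height": mean(heights),
        "total_cars": sum(cars),
        "most_common_color": mode(colors),
        # median_high returns an actual rank, so it maps back to a label.
        "median_satisfaction": SATISFACTION_ORDER[median_high(ranks)],
    }


summary = summarize(records)
```

Note that the sketch never averages the satisfaction labels: because the intervals between ordinal categories are not necessarily equal, a median is a safer summary than a mean.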
Relationships
In data visualization, relationships refer to the patterns, connections, and associations that can be revealed when data is
presented graphically. Visualizing relationships in data helps users better understand the underlying data and draw
meaningful insights from it.
Here are some key aspects of relationships in data visualization:
1. Correlation and Causation: Data visualization can help identify correlations between variables, where changes in one
variable are associated with changes in another. However, it's essential to be cautious about assuming causation
based solely on correlation.
2. Cluster Analysis: Data visualization can reveal natural groupings or clusters in your data. This is particularly useful in
techniques like k-means clustering, where data points are grouped based on similarity.
3. Trends and Patterns: Line charts and scatter plots are commonly used to depict trends and patterns in data. These
can show how one variable changes in relation to another, helping users understand relationships between them.
4. Heatmaps: Heatmaps display the strength or intensity of a relationship between two or more variables. They often
use color to represent the degree of association or dissimilarity between data points.
5. Network Diagrams: In cases where relationships involve connections between entities, network diagrams or graphs
are valuable. These show nodes (entities) and edges (connections) and can represent social networks, supply chains,
and more.
6. Regression Analysis: This statistical technique is used to model and visualize the relationship between a dependent
variable and one or more independent variables. Linear regression, for example, is used to understand the
relationship between variables by fitting a line to the data points.
7. Matrix Visualizations: These visualizations display relationships in a matrix format, where rows and columns
represent variables, and the cells represent the relationship between them. Heatmaps are a common example of
matrix visualizations.
8. Sankey Diagrams: These diagrams illustrate flows and relationships between different categories or stages of a
process. They are useful for visualizing the movement of resources, energy, or information.
9. Chord Diagrams: Chord diagrams show relationships between data points in a circular format. They are often used to
visualize connections between categories or entities.
10. Tree Maps and Sunburst Charts: These visualizations display hierarchical relationships among data elements. Each
level of the hierarchy is represented as a nested rectangle (in the case of tree maps) or as segments of a sunburst
chart.
11. Scatter Plots: Scatter plots are useful for showing the relationship between two continuous variables. Each data
point is plotted as a point on the chart, making it easy to identify trends or outliers.
12. Time-Series Analysis: For temporal data, time-series visualizations can help users understand how variables change
over time and if there are any relationships or seasonality in the data.
13. Geospatial Visualization: When data involves geographical locations, mapping tools can reveal relationships based
on location. For example, maps can show regional variations, proximity, or density of data points.
In summary, relationships in data visualization are about revealing connections and associations within the data.
Choosing the right visualization technique depends on the nature of your data and the specific relationships you want to
explore or communicate to your audience. Effective data visualization can simplify complex relationships, making it
easier for users to draw insights and make informed decisions.
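Two of the items above, correlation and linear regression, reduce to short formulas that are worth seeing once. The sketch below computes the Pearson correlation coefficient and an ordinary least-squares line with the Python standard library. The function names and the `ad_spend` and `sales` figures are assumptions made for illustration, not data from this document.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example values.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(ad_spend, sales)
slope, intercept = fit_line(ad_spend, sales)
```

A value of `r` close to 1 means the points lie near a rising straight line, which is what a scatter plot with the fitted line drawn through it would show. It does not by itself show that one variable causes the other.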

Visualization formats:
Data visualization can take various formats, depending on the type of data and the insights you want to convey. Some
common visualization formats include:
1. Bar Charts: Used to compare categories of data. They are effective for showing comparisons and trends.
2. Line Charts: Ideal for showing trends over time. They connect data points with lines, making it easy to see patterns.
3. Pie Charts: Show the composition of a whole. Each slice represents a proportion of the whole.
4. Scatter Plots: Display individual data points on a two-dimensional plane, often used to identify relationships or
correlations.
5. Heatmaps: Show data values as colors in a grid. They are excellent for displaying large datasets and identifying
patterns.
6. Histograms: Used for visualizing the distribution of data. They group data into bins and display the frequency of data
points in each bin.
7. Box Plots: Display the distribution of a dataset, including outliers, quartiles, and median values.

8. Treemaps: Hierarchical visualization that divides data into nested rectangles, often used for displaying hierarchical
data structures.
9. Network Diagrams: Visualize relationships between entities in a network, like social networks or organizational
structures.
10. Choropleth Maps: Use color-coding to represent data by geographic regions, such as countries or states.
11. Word Clouds: Display word frequency in a text document, with more frequently occurring words appearing larger.
12. Sankey Diagrams: Show the flow of resources or values between entities, often used for visualizing processes or
energy flows.
13. Bubble Charts: Similar to scatter plots but with the addition of bubble size to represent a third variable.
14. Radar Charts: Used to compare multiple variables for a single data point, often used in performance evaluation.
15. Gantt Charts: Display a timeline of tasks or events, showing their duration and dependencies.
16. 3D Visualizations: Add an additional dimension to visualizations, often used for complex data exploration.
The choice of visualization format depends on the nature of your data and the story you want to tell. It's important to
select the format that effectively communicates your data's insights to your audience.
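Histograms and box plots both rest on simple calculations: a histogram counts values per bin, and a box plot draws a five-number summary. The sketch below shows both with the Python standard library. The helper names and sample values are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`; plotting tools may use slightly different quartile conventions.

```python
from statistics import quantiles


def histogram_counts(values, bin_width, start):
    """Count how many values fall in each bin [left, left + bin_width)."""
    counts = {}
    for value in values:
        left = start + bin_width * int((value - start) // bin_width)
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, and maximum."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Made-up sample, for example customer ages.
ages = [12, 15, 21, 22, 22, 29, 31, 38, 45, 47]

bins = histogram_counts(ages, bin_width=10, start=10)
summary = five_number_summary(ages)
```

Here `bins` maps each bin's left edge to its frequency (the bar heights of the histogram), and `summary` holds the values a box plot would draw as whisker ends, box edges, and the median line.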

Basic principles for data visualization


Effective data visualization is essential for conveying insights and information clearly and efficiently. To create successful
data visualizations, you should adhere to several basic principles.
Here are some fundamental principles for data visualization:
1. Know Your Audience: Understand who your audience is and what they need from the visualization. Tailor your
visualizations to meet their expectations and level of expertise.
2. Simplify and Clarify: Keep your visualizations simple and clutter-free. Remove unnecessary elements, labels, and
decorations that don't add value. Clarity should be your top priority.
3. Use Appropriate Visual Encodings: Choose the right chart type and visual encoding to represent your data
accurately. Bar charts for comparisons, line charts for trends, pie charts for parts of a whole, etc.
4. Provide Context: Add context through titles, captions, axis labels, and legends. Ensure that viewers understand what
they are looking at and can interpret the data accurately.
5. Emphasize Key Insights: Highlight the most important data points or trends. Use color, size, or annotations to draw
attention to critical information.
6. Maintain Consistency: Maintain a consistent style throughout your visualization, including colors, fonts, and scales.
This consistency helps users interpret the data more easily.
7. Avoid Misleading Visuals: Be cautious about unintentionally creating misleading visualizations. Ensure that the scale
and axes accurately represent the data and that comparisons are fair.
8. Use Color Wisely: Choose a limited color palette, and use color purposefully to convey information. Avoid overly
vibrant or distracting colors. Consider colorblind accessibility.
9. Provide Interactivity: When appropriate, add interactive elements such as tooltips, zoom, or filters to allow users to
explore the data in more depth.
10. Tell a Story: Structure your visualization to tell a coherent story or present a clear message. Guide users through the
data and its insights in a logical sequence.
11. Consider Data Integrity: Ensure data accuracy and integrity. Check for outliers, errors, and inconsistencies in the data
before visualizing it.
12. Keep it Mobile-Friendly: Design your visualizations to be responsive and mobile-friendly, as many users access data
on various devices.
13. Aim for Simplicity: The "less is more" principle applies to data visualization. Simple visualizations are often more
effective in communicating insights.
14. Provide Feedback and Guidance: Offer explanations, context, and guidance within the visualization, so users can
interpret the data correctly. Annotations and descriptions can be helpful.
15. Consider Accessibility: Ensure that your visualizations are accessible to individuals with disabilities. Use alt text for
images, consider screen readers, and provide text alternatives for non-text elements.
By following these fundamental principles, you can create data visualizations that are informative, engaging, and easy
for your audience to understand. Effective data visualization enhances communication and decision-making, making it a
valuable tool in various fields, from business and science to education and journalism.

Principles of communicating data
(1) Know your goal
The goal of data visualization is to present data in a graphical or visual format that makes it easier to understand,
interpret, and derive insights from the data. Effective data visualization aims to:
Communicate Information: It should convey complex data in a clear and concise manner, allowing viewers to quickly
grasp key insights.
Facilitate Understanding: Visualizations should simplify complex data, patterns, and relationships, making it easier for
people to make informed decisions or draw conclusions.
Highlight Trends and Patterns: Visualizations can reveal trends, patterns, and anomalies in data that may not be
immediately apparent in raw numbers or text.
Support Decision-Making: They should aid decision-making processes by providing actionable insights and helping users
identify opportunities or areas that require attention.
Engage and Persuade: Visualizations can be used to engage and persuade an audience, whether it's for reporting,
storytelling, or advocacy.
Enhance Data Exploration: Interactive visualizations allow users to explore data, drilling down into details or changing
parameters to gain deeper insights.
Minimize Cognitive Load: Effective data visualizations reduce the cognitive load on viewers by presenting information in
a way that aligns with how the human brain processes visual information.
Ultimately, the goal of data visualization is to transform data into a visual narrative that enables better understanding,
decision-making, and communication of insights.
(2) Use the right data
Using the right data in data visualization is crucial for creating meaningful and informative visuals. Here are some tips:
Define Your Objective: Clearly define the purpose of your data visualization. Are you trying to show trends, compare
data, or highlight patterns? Understanding your objective will help you select the right data.
Data Relevance: Ensure that the data you use is directly related to your objective. Irrelevant data can confuse your
audience and dilute your message.
Quality Data: Verify the quality of your data. It should be accurate, up-to-date, and free from errors. Cleaning and
preprocessing your data may be necessary.
Data Integrity: Maintain data integrity. Ensure that there are no duplicate records or missing values that can skew your
visualization.
Consider the Audience: Think about who will be viewing your visualization. Tailor the data to their level of
understanding and the information they need.
Avoid Overloading: Don't overload your visualization with too much data. Keep it simple and focused on the key points.
Use Appropriate Visualizations: Choose the right type of chart or graph that best represents your data. Bar charts, line
graphs, pie charts, and scatter plots are some common options.
Labeling and Context: Provide clear labels and context for your data. Titles, axis labels, and legends should help your
audience understand what they're seeing.
Highlight Key Data Points: Emphasize the most important data points or trends that support your objective. Use colors
or annotations to draw attention.
Interactivity: If possible, add interactivity to your visualization to allow users to explore the data on their own.
Accessibility: Ensure that your data visualization is accessible to all, including those with disabilities. Use alt text for
images and consider color choices for those with color blindness.
Feedback and Iteration: Gather feedback on your data visualization and be willing to iterate and improve it based on
user input.
By following these principles and being selective about the data you use, you can create data visualizations that
effectively convey your message and insights to your audience.
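The data quality and integrity checks described above can be automated before any chart is drawn. The following is a minimal sketch that drops rows with missing values and rows that repeat an earlier row, and reports how many of each it found. The `clean_rows` helper, the field names, and the sample figures are hypothetical; dropping incomplete rows is only one possible policy, and other policies (such as filling in values) may suit a given dataset better.

```python
def clean_rows(rows, required_fields):
    """Remove rows with missing required values and rows that repeat
    the same required-field values as an earlier row."""
    seen = set()
    cleaned = []
    report = {"missing": 0, "duplicates": 0}
    for row in rows:
        # Treat an absent key or a None value as missing.
        if any(row.get(field) is None for field in required_fields):
            report["missing"] += 1
            continue
        key = tuple(row[field] for field in required_fields)
        if key in seen:
            report["duplicates"] += 1
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned, report


# Made-up monthly sales records with one gap and one repeated entry.
monthly_sales = [
    {"month": "Jan", "units": 120},
    {"month": "Feb", "units": None},
    {"month": "Jan", "units": 120},
    {"month": "Mar", "units": 95},
]

cleaned, report = clean_rows(monthly_sales, ["month", "units"])
```

Keeping the `report` alongside the cleaned rows makes it possible to tell the audience how much data was excluded, which supports the labeling and context tips above.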
(3) Select suitable visualization
The choice of a suitable visualization in data visualization depends on the type of data you have and the message you
want to convey. Here are some common types of visualizations and when to use them:
Bar Chart: Use bar charts to compare values between different categories or show changes over time.
Line Chart: Use line charts to display trends or changes over continuous intervals, such as time series data.
Pie Chart: Use pie charts to show the parts of a whole and their proportions, but be cautious as they can be less effective
than bar charts for precise comparisons.
Scatter Plot: Use scatter plots to visualize the relationship between two continuous variables and identify patterns or
correlations.
Histogram: Use histograms to visualize the distribution of a single variable, especially when dealing with large datasets.
Heatmap: Use heatmaps to represent data in a matrix format, often to show correlations or patterns in multivariate
data.
Box Plot (Box-and-Whisker Plot): Use box plots to display the distribution of a dataset, including median, quartiles, and
potential outliers.
Bubble Chart: Use bubble charts to display three dimensions of data, where the size of bubbles represents a third
variable.
Treemap: Use treemaps to show hierarchical data structures, often used in visualizing directory structures or nested
categories.
Sankey Diagram: Use Sankey diagrams to illustrate flow or proportionality between multiple categories or stages in a
process.
Choropleth Map: Use choropleth maps to visualize data on geographical regions, where color intensity represents a
value.
Word Cloud: Use word clouds to display the frequency or importance of words in text data.
Radar Chart: Use radar charts to compare multiple quantitative variables for a single data point, often used in
performance assessments.
Gantt Chart: Use Gantt charts to visualize project schedules and timelines.
Network Diagram: Use network diagrams to represent relationships between nodes, often used in social network
analysis or network flow analysis.
The choice of visualization should consider the nature of your data, your audience, and the insights you want to convey.
Experimenting with different types of visualizations can help you find the most suitable one for your
specific data and goals.
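The guidance above can be condensed into a rule-of-thumb lookup from the communication goal to a chart type. The sketch below only restates the list in code form; the goal names and the `suggest_chart` helper are hypothetical labels chosen for illustration, and real choices still depend on the audience and the data.

```python
# Rule-of-thumb mapping distilled from the list above; the keys are
# hypothetical labels, not an established vocabulary.
CHART_FOR_GOAL = {
    "compare_categories": "bar chart",
    "trend_over_time": "line chart",
    "part_of_whole": "pie chart",
    "relationship_two_variables": "scatter plot",
    "distribution_one_variable": "histogram",
    "distribution_with_outliers": "box plot",
    "three_variables": "bubble chart",
    "hierarchy": "treemap",
    "flow_between_stages": "Sankey diagram",
    "value_by_region": "choropleth map",
    "word_frequency": "word cloud",
    "project_schedule": "Gantt chart",
    "connections_between_entities": "network diagram",
}


def suggest_chart(goal):
    """Return the chart type suggested for a goal, or raise ValueError."""
    try:
        return CHART_FOR_GOAL[goal]
    except KeyError:
        known = ", ".join(sorted(CHART_FOR_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}") from None


suggestion = suggest_chart("trend_over_time")
```

A table like this is a starting point for experimentation rather than a fixed rule: a pie chart is listed for parts of a whole, for instance, but a bar chart may still be the better choice when precise comparison matters.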
(4) Design for aesthetics
Designing for aesthetics in data visualization is crucial to engage and effectively communicate with your audience. Here
are some key principles to consider:
Color Choice: Use a harmonious color palette that is visually appealing and ensures good contrast between data
elements. Avoid using too many colors, and consider colorblind-friendly options.
Typography: Choose clear and readable fonts. Use font size, style, and weight to emphasize important information.
Consistency in typography throughout your visualization is key.
Whitespace: Utilize whitespace to separate and group elements, making your visualization less cluttered and more
visually pleasing. It can also help guide the viewer's attention.
Layout: Organize your data in a logical and structured manner. Consider the placement of titles, labels, and legends to
make the visualization easy to understand.
Visual Hierarchy: Use visual cues like size, color, and position to establish a hierarchy of information. Important data
points should stand out and draw the viewer's attention.
Simplicity: Keep your visualization simple and focused on the core message. Avoid unnecessary decorations or
embellishments that distract from the data.
Consistency: Maintain a consistent style throughout your visualization. This includes consistent use of colors, fonts, and
visual elements.
Balance: Distribute visual elements evenly across the visualization to create a sense of balance. This helps prevent a
cluttered or chaotic appearance.
Engagement: Incorporate interactive elements if applicable to allow users to explore the data themselves. Interactivity
can enhance engagement and understanding.
Storytelling: Consider the narrative you want to convey through your data. Arrange your visualizations in a way that tells
a compelling story or highlights key insights.
Feedback: Test your visualization with a diverse group of users and gather feedback to make improvements. User
feedback can help refine the aesthetics and functionality.
Accessibility: Ensure that your visualization is accessible to all users, including those with disabilities. Use alt text for
images, provide text alternatives, and follow accessibility guidelines.
Remember that aesthetics should not overshadow the clarity and accuracy of your data. Striking a balance
between visual appeal and effective communication is key to creating aesthetically pleasing data visualizations.

(5) Choose an effective medium and channel
Choosing an effective medium and channel for data visualization depends on various factors such as your audience, the
complexity of the data, and your communication goals. Here are some options:
Static Infographics: These are suitable for simple data that can be presented in a single image. They are commonly
shared on social media and in reports.
Interactive Web Dashboards: Ideal for complex datasets that require exploration. Tools like Tableau, Power BI, or D3.js
allow users to interact with data, making it useful for data analysts and decision-makers.
Data Videos: Animated videos can effectively convey trends and narratives within data. They are engaging and can be
shared on platforms like YouTube.
Data-Driven Reports: For in-depth analysis, consider reports in PDF or web format. They allow for detailed explanations
alongside visualizations and are often used in business and academia.
Data Art: For creative and artistic presentations, consider using data to create visual art installations or exhibits in
physical or virtual spaces.
Data Storytelling: Use storytelling techniques to weave data into a compelling narrative. This can be done through
articles, blog posts, or presentations.
Augmented Reality (AR) or Virtual Reality (VR): These technologies can immerse users in data environments, making
them suitable for immersive data exploration or training simulations.
Mobile Apps: Develop apps that present data interactively, which can be especially useful for data that needs to be
accessed on-the-go.
Social Media: Share data in bite-sized chunks on platforms like Twitter or Instagram for quick engagement.
Printed Materials: Traditional mediums like posters or brochures are still effective for conveying data in physical
settings, such as conferences or exhibitions.
The choice should align with your audience's preferences and the story you want to tell with the data. Combining
multiple mediums and channels can also be effective, depending on your objectives.
(6) Check the result
Collect and Prepare Data: First, gather the data you want to visualize. Ensure that it's clean, organized, and relevant to
your analysis.
Choose the Right Visualization: Select the type of chart or graph that best represents your data and the insights you
want to convey. Common types include bar charts, line graphs, pie charts, scatter plots, and heatmaps.

Create the Visualization: Use a data visualization tool such as Excel, Tableau, Python with libraries like
Matplotlib or Seaborn, or an online tool like Google Data Studio (now Looker Studio). Input your data and design the
visualization to suit your message and audience.

Interpret the Visualization: Once you have the visualization, analyze it to draw conclusions. Look for trends, patterns,
outliers, or any insights that the visualization reveals.

Compare and Validate: Compare the visualization results with your initial hypotheses or expectations. Ensure that the
insights make sense in the context of your data.

Share and Communicate: Share the visualization with others who need to see the results. You can include it in reports,
presentations, or dashboards to convey your findings effectively.
Iterate if Necessary: If the initial visualization doesn't provide the insights you need, consider adjusting the visualization
type or exploring the data differently. Visualization is an iterative process.

Seek Feedback: Get feedback from colleagues or stakeholders to ensure that your interpretation aligns with their
understanding of the data.

Document and Save: Save your visualization and any related analysis for future reference. Documentation is crucial for
reproducibility.
Take Action: Based on the insights gained from the visualization, make informed decisions or take actions as needed.

Remember that effective data visualization is not just about creating pretty charts; it's about conveying meaningful
information from your data.
Data storytelling for social and market communication
Data storytelling is the process of translating data analyses into understandable terms in order to influence a business
decision or action. Data analysis focuses on creating valuable insights from data to give further context and
understanding to an intended audience.
Data storytelling in social and market communication involves using data, visualizations, and narratives to convey
information and insights that are relevant to social or marketing contexts. It is a strategic approach to communicate
complex data in a way that is engaging, persuasive, and relatable to your target audience.
Here's an explanation of data storytelling for social and market communication:
1. Understanding the Audience: Start by understanding the characteristics and preferences of your target
audience. What are their interests, needs, and pain points? This knowledge will guide how you craft and present
your data story.
2. Defining the Message: Clearly define the core message or insight you want to convey through your data
storytelling. Your message should be the central takeaway that you want your audience to remember and act upon.
3. Data Selection: Carefully select the data that supports your message. It should be relevant, reliable, and directly
tied to your communication goals. Ensure the data is accurate and up-to-date.
4. Narrative Structure: Organize your data story into a structured narrative that includes:
 Introduction: Set the stage by explaining the context and the issue or question your data addresses.
 Data Exploration: Present the data, allowing your audience to grasp the key trends and insights.
 Key Message: Highlight the central message or insight that you want to convey.
 Supporting Evidence: Provide additional data and visuals that reinforce your primary message.
 Conclusion: Summarize the key takeaways and discuss potential implications or actions.
5. Visualization Choices: Select the appropriate data visualizations that best illustrate your points. Common
visualization formats include bar charts, line charts, pie charts, heatmaps, and more.
6. Context and Annotations: Use labels, annotations, and context to help your audience interpret the data. Explain
data sources, units, and other relevant details to enhance understanding.
7. Engaging Storytelling: Incorporate storytelling techniques to make the data more engaging and relatable. Use
anecdotes, examples, and a compelling narrative structure to capture your audience's interest.
8. Emphasis on Impact: Explain how the data insights can have real-world impacts in social or market contexts.
Personalize the data by discussing how it affects individuals, businesses, or the target market.
9. Visual Enhancements: Use visuals, images, infographics, and graphics to enhance the storytelling experience and
make the data more digestible and visually appealing.
10. Comparisons and Benchmarks: Use benchmarks and comparisons to give your data context. Show
how the current situation compares to historical data or industry standards.
11. Simplicity and Clarity: Avoid technical jargon and overly complex charts. Simplify your language and visuals to
ensure that your message is accessible to a broad audience.
12. Interactivity (if applicable): If presenting data online or through interactive media, use interactive elements,
such as tooltips or filters, to allow users to explore the data in more detail.
13. Citations and Transparency: Always provide citations and references for your data sources to build credibility
and ensure transparency.
14. Feedback and Iteration: Test your data story on a small sample of your audience to gather feedback and make
improvements before sharing it more widely.
15. Distribution and Promotion: Share your data story through relevant channels, such as social media, email,
presentations, or marketing campaigns. Effectively promote it to reach your target audience.
16. Measurement of Impact: After sharing your data story, track its impact using metrics such as views, shares,
conversions, and comments to gauge its success and make adjustments for future communications.
Data storytelling in social and market communication is a potent way to connect with your audience, build trust, and
drive informed decisions or actions. By combining the engaging aspects of storytelling with the persuasive qualities of
data visualization, you can effectively communicate complex information in a way that resonates with your audience and
influences their behavior or choices.

The three components of data storytelling
Data storytelling comprises data, narrative and visualizations.
1. The data serves as the base of a data story. It's information from accurate data gathering and analysis. Data can be
gathered from such places as charts and dashboards using data analysis tools.
2. The narrative is a verbal or written storyline that's used to effectively communicate insights from the data. The
narrative should stay within the context of the data and give clear reasoning for any subsequent actions or
decisions. Narratives should be based on data and present a clear explanation of what the data means and why it
matters.
3. Visualizations act as further representations of both the data and narrative and are used to communicate the story
more clearly. Visualizations include graphs, charts, diagrams and photos.

Importance of data storytelling


 Data storytelling is a great way to convey insights to people who aren't formally trained in how to read
data gathered from the dashboards of data analysis tools. People who are easily overwhelmed by a massive
number of data points can find it difficult to find any meaning in, or remember, data presented to them in a typical
dashboard, chart or graph. Data storytelling frames that information in a way that's clear and memorable for those
people. A story engages them and presents the data in a way that lets them process and comprehend it, and
empathize with any effects the data shows.
 As opposed to, for example, a data scientist explaining the significance of gathered data to a board with only a
spreadsheet full of numbers, data storytelling helps convey the significance of what those numbers mean. This
makes the presented data much more compelling and memorable.

What makes a good data story?


A good data story must use data, a narrative and visualizations to be effective. However, to make the narrative, a data
story must also include the following:
 A setting: The setting should be based on the data. If, for example, the data is about internal systems, then the
setting would be inside an organization with the same internal setup.
 Characters: The characters could include customers, the organization, stakeholders or other key players the data
surrounds.
 A conflict: The conflict is any issue and the effects of that issue that the data might present. The conflict will affect
the characters or setting.
 Resolution: The resolution is a proposed solution to any apparent issues or anything that might help inform the
decision-making processes.
Data stories don't always need conflicts. However, if this element of the data story is skipped, the resolution is a
recommended course of action.
Each insight that the data shows can also be illustrated with a visualization to help the audience better follow along with
the story. Communicating an effective data story requires hard and soft skills.

Examples of good data stories


Data storytellers have become more popular inside and outside of the workplace.
The music app Spotify, for example, sends out recap stories annually to its users. Around the start of December each
year, Spotify Wrapped delivers users a wrap-up of their music and podcast consumption. These stories contain statistics
for each user based on all of the music they listened to that year. Seeing this data presented in such a way provides an
engaging way for users to understand the music they listen to the most.
Similarly, Slack sends an email to its customers that consists of a visual story that expresses key insights on how they've
used the service.
Durham, N.C.-based Automated Insights uses natural language generation software that turns data -- such as statistics
from a basketball or baseball game -- into Associated Press wire stories. The company hopes to provide a similar service
for businesses, where it turns sales or marketing data into news stories.
Beyond analysts, data scientists and other business users, data stories will increasingly be produced by new data tools.
Tools that combine data storytelling with artificial intelligence predictions can help generate forecasts from data
without extensive configuration.

Trends in market research
Market research has been significantly influenced by advances in data visualization and analytics. Several trends in
market research related to data visualization have emerged to provide deeper insights, improve decision-making, and
enhance communication. Here are some of the notable trends:

1. Interactive Data Dashboards: Interactive dashboards allow users to explore and interact with data in real time.
These dashboards enable researchers and decision-makers to filter, drill down, and manipulate data to uncover
insights, trends, and patterns.
2. Real-time Data Visualization: The demand for real-time market insights is growing. Businesses are using data
visualization tools to display data as it's collected, helping them make agile decisions and adapt to changing market
conditions swiftly.
3. Advanced Predictive Analytics: Market researchers are using predictive analytics models and visualization
techniques to forecast market trends, customer behavior, and demand patterns. This aids in proactive
decision-making and planning.
4. AI-Driven Data Visualization: Artificial intelligence (AI) and machine learning are being integrated into data
visualization tools to automate data analysis and generate insights. AI algorithms can help identify hidden trends and
anomalies in large datasets.
5. Geo-spatial Analysis: Location-based data visualization is becoming increasingly important. Companies use
geographical data and mapping tools to analyze regional market trends, consumer demographics, and location-based
marketing strategies.
6. Customer Journey Mapping: Data visualization is used to map and visualize the customer journey. This helps
businesses understand customer touchpoints, identify pain points, and improve the customer experience.
7. Big Data Visualization: As the volume of data generated continues to grow, market researchers are turning to big
data visualization tools to make sense of large and complex datasets. These tools are essential for deriving
meaningful insights from vast amounts of information.
8. Cross-platform and Omni-channel Insights: Market researchers need to analyze data from various sources
and channels, including social media, e-commerce platforms, offline retail, and customer feedback. Data visualization
solutions are evolving to provide a comprehensive view of customer behavior across these channels.
9. Custom Data Storytelling: Researchers are creating custom data stories using visualization tools to present
findings in a compelling and narrative format. These stories are designed to make data more accessible and relatable
to stakeholders.
10. Ethical Data Visualization: With concerns about data privacy and transparency, ethical data visualization
practices are gaining prominence. Researchers and businesses are taking measures to ensure that data is presented
accurately and ethically.
11. User-friendly Interfaces: Data visualization tools are becoming more user-friendly and accessible to
non-technical users. This democratizes data analysis and enables a wider range of professionals to create and interpret
visualizations.
12. Data Integration: Market research is increasingly focusing on integrating data from various sources, such as CRM
systems, social media, and third-party data providers, to create a holistic view of the market and consumer behavior.
13. Mobile Optimization: As mobile device usage continues to grow, market research data visualization is being
optimized for mobile platforms. This ensures that insights are accessible to professionals on the go.
14. Explainable AI in Visualization: As AI-driven insights become more common, the need for transparent and
explainable AI models and visualization techniques is increasing. Users want to understand the rationale behind
AI-driven recommendations and insights.

These trends reflect the evolving landscape of market research, driven by the need for more accessible, insightful, and
actionable data visualization solutions. As technology continues to advance, market researchers will continue to
leverage data visualization to stay competitive and make data-driven decisions.

Data visualization dashboards
A data visualization dashboard is a user interface that displays critical data and information in a visual,
easy-to-understand format. Dashboards are designed to give users a comprehensive, often real-time overview of
key performance indicators, metrics, and data in a single, centralized location.
Key Components of Data Visualization Dashboards:
1. Widgets or Visualizations: Dashboards consist of various visual elements or widgets, such as charts, graphs,
tables, maps, and other data representations. These visualizations provide a clear and graphical representation of
data.
2. Data Sources: Dashboards are connected to data sources, which can include databases, spreadsheets, APIs, or
other data repositories. Data is retrieved from these sources and displayed in the dashboard.
3. Interactivity: Dashboards are often interactive, allowing users to explore the data by interacting with the widgets.
Common interactive features include filtering, drilling down into details, and changing date ranges.
4. KPIs (Key Performance Indicators): Dashboards typically display KPIs prominently. KPIs are critical metrics that
provide a quick snapshot of performance and progress toward organizational goals.
5. Widget Arrangement: The arrangement and layout of widgets on the dashboard are designed to maximize user
comprehension and efficiency. Widgets are usually organized in a logical and intuitive manner.
Characteristics of Data Visualization Dashboards:
1. Real-time or Periodic Updates: Dashboards can be set to provide real-time updates or periodic refreshes,
ensuring that users always have access to the most current data.
2. Customization: Users can often customize dashboards to display the specific data and metrics most relevant to
their roles and objectives.
3. Accessibility: Dashboards are typically accessible through web browsers or mobile apps, enabling users to access
data from anywhere with an internet connection.
4. Role-based Access: Access to certain parts of a dashboard or specific data may be restricted based on user roles
and permissions.
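As a rough illustration of role-based access, the sketch below filters the panels of a dashboard by the viewer's role. The role names, panel names, and the `visible_panels` helper are hypothetical and are not taken from any particular dashboard product.

```python
# Hypothetical mapping of roles to the dashboard panels they may see.
ROLE_PANELS = {
    "executive": {"revenue", "profit_margin", "headcount"},
    "sales_rep": {"revenue"},
    "hr_manager": {"headcount"},
}


def visible_panels(role, dashboard_panels):
    """Return the panels of a dashboard that the given role may view."""
    allowed = ROLE_PANELS.get(role, set())  # unknown roles see nothing
    return [panel for panel in dashboard_panels if panel in allowed]


panels = ["revenue", "profit_margin", "headcount"]
print(visible_panels("sales_rep", panels))   # ['revenue']
print(visible_panels("executive", panels))   # all three panels
```

Real tools usually enforce such rules on the server side, so that restricted data never reaches the user's browser.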
Benefits of Data Visualization Dashboards:
1. Data Clarity: Dashboards provide a visual representation of data, making it easier for users to understand complex
information at a glance.
2. Decision Support: Users can make data-driven decisions and respond quickly to changing circumstances, as they
have real-time access to important metrics.
3. Efficiency: Dashboards save time by centralizing data, reducing the need to switch between multiple applications
or reports.
4. Collaboration: They facilitate collaboration by allowing teams to work from a shared data source and discuss
findings based on the same information.
5. Goal Tracking: Dashboards make it easy to track progress toward specific goals and objectives, helping
organizations stay on target.
6. Alerts and Notifications: Some dashboards can be configured to send alerts or notifications when certain
conditions or thresholds are met, enabling proactive management.
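The alerting idea in the last item can be sketched in a few lines of Python. This is only a minimal illustration: the metric names, the threshold values, and the `check_alerts` function are assumptions made for the example, not part of any specific dashboard tool.

```python
def check_alerts(metrics, thresholds):
    """Return one message for every metric that crosses its threshold.

    thresholds maps a metric name to (direction, limit), where direction
    is "above" or "below".
    """
    alerts = []
    for name, (direction, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric was not reported in this refresh
        if direction == "above" and value > limit:
            alerts.append(f"{name} is {value}, above the limit of {limit}")
        elif direction == "below" and value < limit:
            alerts.append(f"{name} is {value}, below the limit of {limit}")
    return alerts


# Made-up values for illustration.
metrics = {"bed_occupancy_pct": 93.0, "avg_response_minutes": 12.0}
thresholds = {
    "bed_occupancy_pct": ("above", 90.0),
    "avg_response_minutes": ("above", 15.0),
}
alerts = check_alerts(metrics, thresholds)
for message in alerts:
    print(message)
```

Only the occupancy metric triggers an alert here, because the response time is still under its limit.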
Use Cases for Data Visualization Dashboards:
1. Business Analytics: Monitoring sales, marketing, and financial performance.
2. Project Management: Tracking project progress, timelines, and resource allocation.
3. IT Operations: Monitoring system and network performance, security, and uptime.
4. Healthcare: Visualizing patient data, clinical outcomes, and resource utilization.
5. E-commerce: Monitoring website traffic, sales, and customer behavior.
6. Supply Chain Management: Tracking inventory levels, shipment status, and supplier performance.
In summary, data visualization dashboards are powerful tools for presenting and interacting with data, making it easier
for users to analyze and act upon information to support decision-making and improve organizational performance.
Some examples of data visualization dashboards:
Sales Performance Dashboard:
Example: A company's sales team uses a dashboard to track monthly revenue, sales by region, product performance,
and customer acquisition metrics. Users can filter by time period, product category, or salesperson to analyze
performance.
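The filtering in this sales example boils down to selecting rows and then aggregating them. A minimal Python sketch, using made-up rows and a hypothetical `monthly_revenue` helper, might look like this:

```python
from collections import defaultdict

# Made-up sample rows; a real dashboard would read these from a data source.
sales = [
    {"month": "2024-01", "region": "North", "category": "Laptops", "revenue": 1200.0},
    {"month": "2024-01", "region": "South", "category": "Phones", "revenue": 800.0},
    {"month": "2024-02", "region": "North", "category": "Phones", "revenue": 500.0},
    {"month": "2024-02", "region": "North", "category": "Laptops", "revenue": 700.0},
]


def monthly_revenue(rows, region=None, category=None):
    """Sum revenue per month, optionally filtered by region and category."""
    totals = defaultdict(float)
    for row in rows:
        if region is not None and row["region"] != region:
            continue
        if category is not None and row["category"] != category:
            continue
        totals[row["month"]] += row["revenue"]
    return dict(totals)


north = monthly_revenue(sales, region="North")
print(north)  # {'2024-01': 1200.0, '2024-02': 1200.0}
```

A dashboard filter control does the same thing interactively: each selection changes the arguments, and the chart is redrawn from the new totals.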
Marketing Analytics Dashboard:
Example: A digital marketing agency uses a dashboard to monitor website traffic, social media engagement, email
campaign performance, and conversion rates. It displays key metrics like click-through rates, bounce rates, and
conversion funnels.
Financial Dashboard:
Example: A CFO of a multinational corporation utilizes a dashboard to monitor financial health. It displays real-time data
on revenue, expenses, profit margins, cash flow, and debt. Users can drill down into specific financial statements and
time periods.
Healthcare Dashboard:
Example: A hospital administrator relies on a healthcare dashboard to track patient admissions, discharges, bed
occupancy, patient outcomes, and the availability of critical resources such as ventilators during a crisis.
Project Management Dashboard:
Example: A project manager overseeing a software development project uses a dashboard to monitor task progress,
resource allocation, budget status, and the project's critical path. It displays Gantt charts, milestone tracking, and issue
status.
E-commerce Analytics Dashboard:
Example: An online retailer uses a dashboard to visualize website traffic, conversion rates, cart abandonment, and
revenue. It tracks metrics such as average order value, customer acquisition cost, and top-selling products.
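The metrics named in this example are simple ratios. The sketch below uses commonly used definitions, which are an assumption here since the text does not define them, and made-up input numbers. It also assumes all counts are non-zero.

```python
def ecommerce_metrics(sessions, carts_created, orders, revenue,
                      marketing_spend, new_customers):
    """Compute common e-commerce KPIs from raw counts (all assumed non-zero)."""
    return {
        # share of visits that end in an order
        "conversion_rate": orders / sessions,
        # share of carts that never become an order
        "cart_abandonment_rate": 1 - orders / carts_created,
        # revenue earned per order
        "average_order_value": revenue / orders,
        # marketing money spent per newly acquired customer
        "customer_acquisition_cost": marketing_spend / new_customers,
    }


kpis = ecommerce_metrics(
    sessions=10_000,
    carts_created=800,
    orders=200,
    revenue=15_000.0,
    marketing_spend=4_000.0,
    new_customers=100,
)
for name, value in kpis.items():
    print(f"{name}: {value:.2f}")
```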
Supply Chain Dashboard:
Example: A supply chain manager uses a dashboard to monitor inventory levels, supplier performance, shipping
statuses, and demand forecasting. It shows metrics like lead times, order fulfillment rates, and inventory turnover.
Human Resources Dashboard:
Example: An HR manager uses a dashboard to track employee turnover, performance metrics, recruitment progress, and
diversity statistics. It includes data on retention rates, time-to-fill job openings, and training effectiveness.
Social Media Analytics Dashboard:
Example: A social media manager employs a dashboard to monitor brand sentiment, engagement metrics, follower
growth, and content performance across various platforms. It tracks metrics like likes, shares, comments, and click-
through rates.
Energy Consumption Dashboard:
Example: A facility manager uses a dashboard to visualize energy consumption patterns, identify energy-saving
opportunities, and track environmental impact metrics. It displays data on electricity, gas, and water usage.
Customer Support Dashboard:
Example: A customer support team uses a dashboard to monitor service request volumes, response times, customer
satisfaction scores, and common support issues. It provides insights into response efficiency and the effectiveness of
support teams.
Higher Education Dashboard:
Example: A university administration uses a dashboard to track student enrollment, retention rates, graduation rates,
and academic performance. It provides insights into which programs are thriving and where improvements are needed.
What is Tableau?
 Tableau is a powerful data visualization tool used by data analysts, data scientists, statisticians, and others to
visualize data and draw clear conclusions from data analysis. Tableau is popular because it can take in data
and produce the required data visualization output in a very short time. In short, it turns your data into
insights that can guide future action. Tableau also emphasizes security and aims to address security issues
as they arise or are reported by users.
 Tableau also allows you to prepare, clean, and format data of all types and ranges and then create data
visualizations to obtain actionable insights that can be shared with other users. You can use data queries to obtain
insights from your visualizations and also manage metadata using Tableau. In fact, it is a lifesaver for many people
in Business Intelligence as it allows you to handle data without having great technical knowledge. So you can use
Tableau as an individual data analyst or at a large scale for your business team and organization. In fact, there are
many organizations using Tableau such as Amazon, Lenovo, Walmart, Accenture, etc. There are different Tableau
products that are aimed at different types of users, whether they be individuals or organizations. So let’s see these
in detail now.
 Values in Tableau
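There are two types of values in Tableau:
 Dimensions: Qualitative, categorical values (such as names, dates, or geographic data) that are used to categorize
and segment data. Example: city name, product name, country name.
 Measures: Numeric, quantitative values that can be measured and aggregated (for example, summed or averaged).
Example: profit, sales, discount, population.
Whether a field is discrete or continuous is a separate property in Tableau: dimensions are usually discrete and
measures are usually continuous, but either kind of field can be treated as discrete or continuous.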
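One way to picture the distinction: a dimension acts as a grouping field, and a measure is the number that gets aggregated within each group. The sketch below shows this with plain Python and made-up rows. The `aggregate` helper is hypothetical and only mimics what a tool like Tableau does when a dimension and a measure are placed in a view.

```python
from collections import defaultdict

# Made-up rows: "city" and "product" are dimensions, "sales" and "profit" are measures.
rows = [
    {"city": "Pune", "product": "Chair", "sales": 120.0, "profit": 30.0},
    {"city": "Pune", "product": "Desk", "sales": 300.0, "profit": 75.0},
    {"city": "Delhi", "product": "Chair", "sales": 150.0, "profit": 40.0},
]


def aggregate(rows, dimension, measure):
    """Sum a measure for each distinct value of a dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


sales_by_city = aggregate(rows, "city", "sales")
profit_by_product = aggregate(rows, "product", "profit")
print(sales_by_city)      # {'Pune': 420.0, 'Delhi': 150.0}
print(profit_by_product)  # {'Chair': 70.0, 'Desk': 75.0}
```

Swapping the dimension changes how the data is sliced, while swapping the measure changes which number is summed.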

Advantages of Tableau
1. Create great visualizations
Of course, the first advantage of a data visualization tool is that you can create wonderful and detailed data
visualizations using data that initially wasn’t very ordered. You can use Tableau Prep to shape, clean, and combine the
data into desired forms so that it can be used for creating data charts, dashboards, visualizations, etc.
2. Obtain detailed insights
You can obtain detailed and unexpected insights from the data using Tableau. You can explore the data from different
angles to see if any patterns emerge, or you can ask open-ended questions of the data and perform various
comparisons to obtain unexpected insights. This effect is even stronger when you are using real-time data, as
it changes your viewpoint continuously.
3. User-friendly Approach
Tableau is created for people who don't have detailed technical skills or much coding experience, so its
user-friendly approach is its greatest strength. You can create detailed data visualizations in Tableau without
many technical skills, as most of its features use a drag-and-drop approach to put the correct parameters in the rows
and columns to create visualizations. This is simple and intuitive enough for non-technical users to manage.
4. Support for Different Data Sources
Tableau can connect to various data sources, data warehouses, and files that contain disparate data and exist in
different kinds of storage mediums. Tableau can access data from the cloud, data that is available in spreadsheets, big
data, non-relational data, etc. Tableau has the capacity to manage data from all these different data sources and
blend these different types of data to create complex and detailed data visualizations that are an asset to IT
companies.
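Blending data from two sources can be pictured as attaching fields from a secondary source to the rows of a primary source through a shared field. The sketch below is only a rough analogy in plain Python, with made-up rows and a hypothetical `blend` helper. It is not how Tableau implements blending internally.

```python
# Primary source, e.g. rows read from a spreadsheet (made-up values).
sales = [
    {"region": "North", "sales": 500.0},
    {"region": "South", "sales": 300.0},
    {"region": "East", "sales": 200.0},
]
# Secondary source, e.g. rows read from a database table (made-up values).
targets = [
    {"region": "North", "target": 450.0},
    {"region": "South", "target": 350.0},
]


def blend(primary, secondary, key, field):
    """Attach an aggregated secondary field to each primary row via a shared key."""
    lookup = {}
    for row in secondary:
        # Aggregate the secondary source per key before attaching it.
        lookup[row[key]] = lookup.get(row[key], 0.0) + row[field]
    # Every primary row is kept; rows without a match get None.
    return [{**row, field: lookup.get(row[key])} for row in primary]


blended = blend(sales, targets, key="region", field="target")
for row in blended:
    print(row)
```

The "East" row survives with a `target` of `None`, which mirrors the left-join-like behaviour of keeping every row from the primary source.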

Limitations of Tableau in data visualization


While Tableau is a powerful and versatile tool for data visualization and business intelligence, it does have some
limitations. Here are some common limitations of Tableau:
1. Cost: Tableau can be relatively expensive for individual users and small businesses, especially the more advanced
versions like Tableau Server or Tableau Online.
2. Steep Learning Curve: While Tableau's drag-and-drop interface is user-friendly, mastering the more advanced
features and calculations can take considerable time and practice.
3. Data Size and Performance: Large datasets may lead to performance issues, especially when working with live
connections to data sources. Data extracts can be created to improve performance, but this can increase storage
requirements.

4. Limited ETL Capabilities: Tableau has data preparation capabilities, but for complex ETL (Extract, Transform, Load)
tasks, users may need to rely on external tools or databases.
5. Limited Statistical and Predictive Analysis: Tableau is not a replacement for dedicated statistical or predictive
analytics tools like R or Python. While it has basic statistical functions, it's not designed for in-depth data analysis.
6. Limited Customization of Visualizations: Although Tableau offers various visualization types, there are limits to how
much you can customize their appearance. Users may find it challenging to create very customized or unique
visualizations.
7. Limited Version Control: Tableau's built-in version control is limited. Tableau Server and Tableau Online keep a
revision history of published content, but there is no branching or merging, which can be an issue when
collaborating on projects.
8. Complex Data Relationships: Handling complex data relationships and hierarchies can be challenging, and Tableau
may not always handle these situations gracefully.
9. Limited Natural Language Processing (NLP): While Tableau has improved its NLP capabilities, it's not as advanced as
some dedicated NLP tools for analyzing unstructured text data.
10. Offline Access: Tableau Online and Tableau Server are required for online sharing and collaboration. Users who need
offline access to their visualizations may find this limiting.

Features of Tableau
Tableau offers a wide range of features for data visualization and business intelligence, making it a popular tool for data
analysts, business professionals, and organizations looking to gain insights from their data. Here are some key features
of Tableau in data visualization:
1. Data Connectivity: Tableau can connect to various data sources, including databases, spreadsheets, cloud platforms,
and web services. This allows users to import and analyze data from multiple sources.
2. Data Transformation and Cleansing: Tableau provides data preparation tools that allow users to clean, transform,
and shape data for analysis. This includes functions for data cleaning, filtering, pivoting, and aggregating.
3. Drag-and-Drop Interface: Tableau's intuitive, user-friendly interface allows users to create visualizations by simply
dragging and dropping fields onto the canvas. No coding or complex programming is required.
4. Rich Visualization Types: Tableau supports a wide range of visualization types, including bar charts, line charts,
scatter plots, heatmaps, maps, treemaps, and more. Users can choose the most appropriate visualization for their
data.
5. Interactivity: Dashboards and reports created in Tableau are highly interactive. Users can filter data, highlight specific
data points, and drill down into details with ease, facilitating data exploration.
6. Mapping and Geographic Analysis: Tableau provides robust mapping capabilities, allowing users to create
geographic visualizations, plot locations on maps, and perform spatial analytics.
7. Aggregation and Granularity Control: Users can control the level of detail in their visualizations and easily aggregate
data to view high-level trends or drill down to specific details.
8. Data Blending: Tableau allows users to blend data from multiple sources, enabling cross-functional analysis and
insights.
9. Data Forecasting: Tableau supports forecasting for time-series data, making it possible to create predictive models
and visualize future trends.
10. Data Integration and ETL: While not a full ETL tool, Tableau offers data integration capabilities for basic data
transformation, blending, and joining. More advanced ETL can be performed with external tools if needed.
11. Collaboration and Sharing: Tableau allows users to share their visualizations, dashboards, and reports with
colleagues and stakeholders through Tableau Server, Tableau Online, or Tableau Public.
12. Data Security and Permissions: Users can control who has access to their data and dashboards, ensuring that
sensitive information is protected.
13. Performance Optimization: Tableau provides features like data extracts, data source filters, and query optimization
to improve the performance of dashboards, especially with large datasets.
14. Integration and APIs: Tableau can be integrated with various data sources, applications, and programming languages
like Python and R. It offers APIs for extending functionality and automation.
15. Mobile Accessibility: Tableau is designed to be responsive, allowing users to access and interact with visualizations
on various devices, including mobile phones and tablets.
16. Alerts and Notifications: Users can set up data alerts to receive notifications when specific data thresholds or
conditions are met.
These features make Tableau a versatile and powerful tool for data visualization, exploration, and analysis, helping
organizations make data-driven decisions and communicate insights effectively.
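To give a feel for the forecasting feature listed above: Tableau's documentation describes its forecasts as based on exponential smoothing. The sketch below shows only the simplest member of that family (no trend, no seasonality) and is not Tableau's implementation. The function name, the `alpha` value, and the sample series are assumptions made for the illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    Each new observation pulls the running level toward itself by a
    fraction alpha; older observations fade out geometrically.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start from the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


monthly_sales = [100.0, 110.0, 105.0, 120.0]  # made-up values
forecast = simple_exponential_smoothing(monthly_sales, alpha=0.5)
print(forecast)  # 112.5
```

A larger `alpha` makes the forecast react faster to recent values, while a smaller one produces a smoother, slower-moving estimate.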
Products Offered by Tableau
1. Tableau Desktop:
Description: Tableau Desktop is the core authoring and development tool. It is used by analysts, data scientists, and
business professionals to create interactive data visualizations, dashboards, and reports. With Tableau Desktop,
users can connect to various data sources, clean and transform data, and build visualizations using a drag-and-drop
interface.
Key Features:
 Data Connectivity: Tableau Desktop supports a wide range of data sources, including databases, spreadsheets,
web services, and cloud platforms.
 Visualizations: Users can create a variety of visualizations, from simple bar charts to complex geographic maps
and interactive dashboards.
 Data Preparation: It offers data preparation tools to clean and shape data for analysis.
 Calculated Fields: Users can create custom calculated fields to perform advanced calculations.
 Interactivity: Dashboards created in Tableau Desktop are highly interactive, allowing users to filter and explore
data.
Disadvantages:
 Cost: Tableau Desktop can be expensive for individual users.
 Learning Curve: Advanced features may require time to master.
 Limited Collaboration: Designed for individual authoring, not collaboration.
2. Tableau Server:
Description: Tableau Server is a web-based platform that enables organizations to share, collaborate, and govern
Tableau content. It provides centralized management of Tableau workbooks and dashboards, making them
accessible to authorized users through a web browser.
Key Features:
  Centralized Sharing: Tableau Server allows organizations to publish, share, and collaborate on Tableau content in a
secure and controlled environment.
 User Management: Administrators can manage user access, permissions, and security settings.
 Data Governance: It offers tools for monitoring usage, ensuring data security, and tracking performance.
 Authentication: Supports integration with existing authentication systems for seamless user access.
Disadvantages:
 Cost: Tableau Server can be costly to implement and maintain.
 IT Dependency: Requires IT support for setup and maintenance.
 Limited Mobile Capabilities: Limited mobile interactivity compared to Tableau Mobile.
3. Tableau Online:
Description: Tableau Online (renamed Tableau Cloud in 2022) is a cloud-based version of Tableau Server. It offers
the same functionality as Tableau Server but is hosted and managed by Tableau in the cloud. It's designed for
organizations that prefer not to maintain their own server infrastructure.
Key Features:
 Cloud Hosting: Tableau Online provides cloud-based hosting of Tableau content, eliminating the need for on-
premises infrastructure.
 Accessibility: Users can access their Tableau content from anywhere with an internet connection.
 Scalability: It can easily scale to accommodate growing data and user needs.
Disadvantages:
 Data Privacy Concerns: Data in the cloud may raise privacy and security concerns.
 Limited Customization: Fewer customization options compared to on-premises Tableau Server.
4. Tableau Prep:
Description: Tableau Prep is a data preparation tool that helps users clean, shape, and combine data from various
sources before analysis. It provides a visual and interactive interface for data cleaning and transformation.
Key Features:
 Data Cleaning: Users can clean and transform data, including tasks like pivoting, splitting, and aggregating.
 Data Blending: It allows users to combine data from multiple sources.
 Visual Data Flow: Data preparation is performed through a visual data flow interface, making it user-friendly.
Disadvantages:
 Limited ETL: Not as powerful as dedicated ETL tools for complex transformations.
 Limited Analytics: Primarily focused on data cleaning, not data analysis.
 Cost: May require an additional investment.
5. Tableau Mobile:
Description: Tableau Mobile is a mobile application that allows users to access and interact with Tableau
visualizations on their smartphones and tablets. It's designed for on-the-go access to business insights.
Key Features:
 Responsive Design: Tableau Mobile provides a responsive design to optimize the user experience on different
mobile devices.
 Interactivity: Users can interact with dashboards, apply filters, and explore data on mobile devices.
Disadvantages:
 Limited Offline Access: Requires an internet connection to access visualizations.
 Smaller Screen: Limited screen real estate for complex dashboards.
6. Tableau Public:
Description: Tableau Public is a free version of Tableau that allows users to create and share public data
visualizations and dashboards with the global Tableau community. However, data shared on Tableau Public is
publicly accessible and cannot be used for confidential or proprietary information.
Key Features:
 Free Access: It's a free platform, making it accessible to a wide audience.
 Community Engagement: Users can share their visualizations with the Tableau Public community and embed
them on websites and blogs.
Disadvantages:
 Public Access: Data shared is publicly accessible and cannot be used for sensitive or proprietary information.
 Limited Data Sources: Restricted to a few data sources.
7. Tableau Reader:
Description: Tableau Reader is a free desktop application that allows users to view and interact with Tableau
workbooks and dashboards created by Tableau Desktop. It is primarily for individual use and doesn't offer sharing or
collaboration capabilities.
Key Features:
 Viewing: Users can open and view Tableau workbooks and dashboards.
 Interactivity: Allows limited interactivity with the visualizations.
Disadvantages:
 Limited to Viewing: Cannot create or edit visualizations.
 Limited Collaboration: Designed for individual use, no collaboration features.
These are the core Tableau products and services, covering data visualization, data preparation, and collaboration
within organizations. Tableau's product lineup changes over time, so it's advisable to visit the official
Tableau website for the latest information.

Tableau Architecture
Tableau Server is designed to connect many data tiers. It can serve clients on mobile, web, and desktop. Tableau
Desktop is a powerful data visualization tool. Tableau Server is secure and highly available.
It can run on both physical machines and virtual machines. It is a multi-process, multi-user, and multi-threaded
system.
Providing such powerful features requires a distinctive architecture.
The different layers used in Tableau Server are given in the following architecture diagram:

1. Data server: The primary component of the Tableau architecture is the set of data sources it can connect to.
Tableau can connect to multiple data sources and blend the data from them. It can connect to
an Excel file, a database, and a web application at the same time. It can also define relationships between different
types of data sources.
2. Data connector: The data connectors provide an interface for connecting external data sources to the Tableau Data
Server.
Tableau has a built-in SQL/ODBC connector. This ODBC connector can connect to a database without using
its native connector. Tableau Desktop lets you choose between extract and live data, and you can easily switch
between the two depending on your needs.
o Real-time data or live connection: Tableau can work with real-time data by connecting directly to the external
database. It uses the existing database infrastructure by sending dynamic multidimensional expressions (MDX) and
SQL statements. This links Tableau to the live data rather than importing the data, so Tableau benefits from an
optimized, fast database system. In many enterprises the database is large and is updated periodically.
In these cases, Tableau works as a front-end visualization tool connected to the live data.
o Extracted or in-memory data: Tableau can also extract data from external data sources, making a local
copy in the form of a Tableau extract file. It can pull millions of records into the Tableau data engine with a single
click. Tableau's data engine uses storage such as RAM, disk, and cache memory to process and store data. Using
filters, Tableau can extract just a subset of records from a large dataset. This improves performance, especially when
working on massive datasets. Extracted data also allows users to visualize the data offline, without connecting to
the data source.
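The difference between a live connection and an extract can be illustrated with a small Python sketch. Here an in-memory SQLite database stands in for the external data source. The table, the rows, and the helper functions are made up for the example and do not reflect Tableau's actual extract format or API.

```python
import sqlite3

# Stand-in for an external database (in-memory SQLite, made-up rows).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],
)


def live_total(conn, region):
    """Live connection: every call queries the source, so results are current."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def make_extract(conn, region):
    """Extract: copy a filtered subset once into a local snapshot."""
    return conn.execute(
        "SELECT region, amount FROM orders WHERE region = ?", (region,)
    ).fetchall()


extract = make_extract(source, "North")                      # snapshot taken now
source.execute("INSERT INTO orders VALUES ('North', 25.0)")  # source changes later

live = live_total(source, "North")                   # sees the new row
from_extract = sum(amount for _, amount in extract)  # still the old snapshot
print(live, from_extract)  # 175.0 150.0
```

The extract is faster to work with and usable offline, but it goes stale until it is refreshed. The live query is always current, but it depends on the source being reachable and responsive.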
3. Components of Tableau Server: The main components of Tableau Server are:
o Application server
o VizQL server
o Data server

A. Application server: The application server handles authentication and authorization. It manages
permissions and administration for the mobile and web interfaces. It supports security by recording each session
ID on Tableau Server. The administrator can configure the default session timeout on the server.
B. VizQL server: The VizQL server converts query results from the data source into visualizations. Once a client
request is forwarded to the VizQL process, it sends the query directly to the data source and renders the retrieved
information as images. This visualization or image is presented to the user. Tableau Server caches visualizations to
reduce load time. The cache can be shared among the users who have permission to view the visualization.
C. Data server: The data server stores and manages the data from external data sources. It is a central data
management system. It provides data security, metadata management, data connections, driver requirements, and
data storage. It stores the related details of a data set, such as calculated fields, metadata, groups, sets, and
parameters. The data server can use extracts as well as live connections to external data sources.
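The visualization cache described for the VizQL server above can be pictured with a toy example: a rendered result is stored under a key built from the view and its filters, and any user with permission can reuse it. The class, the `fake_render` function, and the user names are all hypothetical. This is a sketch of the general caching idea, not Tableau's implementation.

```python
class VizCache:
    """Toy cache of rendered visualizations, keyed by view name and filters."""

    def __init__(self, render):
        self._render = render  # function that does the expensive rendering
        self._cache = {}
        self.render_calls = 0

    def get(self, user, view, filters, permissions):
        # Permission is checked on every request, even for cached results.
        if view not in permissions.get(user, set()):
            raise PermissionError(f"{user} may not view {view}")
        key = (view, tuple(sorted(filters.items())))
        if key not in self._cache:
            self.render_calls += 1
            self._cache[key] = self._render(view, filters)
        return self._cache[key]


def fake_render(view, filters):
    """Stand-in for querying the data source and drawing the image."""
    return f"<image of {view} with {sorted(filters.items())}>"


permissions = {"asha": {"sales"}, "ben": {"sales"}, "cara": set()}
cache = VizCache(fake_render)
first = cache.get("asha", "sales", {"year": 2024}, permissions)
second = cache.get("ben", "sales", {"year": 2024}, permissions)
print(cache.render_calls)  # 1: the second request reused the cached image
```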
4. Gateway: The gateway directs requests from users to Tableau components. When a client sends a request, it
is forwarded to the external load balancer for processing. The gateway distributes processes to the different
components. If there is no external load balancer, the gateway also works as a load balancer. In a single-server
configuration, one gateway or primary server manages all the processes. In a multi-server configuration, one physical
system works as the primary server, and the others are used as worker servers. Only one machine is used as the
primary server in a Tableau Server environment.
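How a gateway might spread requests across worker servers can be sketched with a simple round-robin policy. The text does not say which balancing policy Tableau uses, so round-robin is an assumption here, and the `Gateway` class and worker names are hypothetical.

```python
from itertools import cycle


class Gateway:
    """Toy gateway that hands requests to worker servers in round-robin order."""

    def __init__(self, workers):
        if not workers:
            raise ValueError("at least one worker is required")
        self._workers = cycle(workers)  # endless iterator over the workers

    def route(self, request):
        """Pick the next worker in turn and pair it with the request."""
        worker = next(self._workers)
        return worker, request


gateway = Gateway(["worker-1", "worker-2"])
assignments = [gateway.route(f"request-{i}")[0] for i in range(4)]
print(assignments)  # ['worker-1', 'worker-2', 'worker-1', 'worker-2']
```

With a single worker the same class models the single-server case, where one machine handles every request.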
5. Clients: The visualizations and dashboards on Tableau Server can be viewed and edited using different clients. Clients
include web browsers, mobile applications, and Tableau Desktop.
o Web Browser: Web browsers like Google Chrome, Safari, and Firefox support Tableau Server. The
visualizations and contents of a dashboard can be edited using these web browsers.
o Mobile Application: Dashboards from the server can be viewed interactively using the mobile application and
browser. It is used to edit and view the contents of the workbook.
o Tableau Desktop: Tableau Desktop is a business analytics tool. It is used to create, view, and publish
dashboards to Tableau Server. Users can access various data sources and build visualizations in Tableau Desktop.
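
The shared visualization cache described under the VizQL server can be illustrated with a small sketch. This is not Tableau's implementation; the class `SharedVizCache`, the `fake_render` function, the user names, and the view name are all hypothetical. It only shows the idea that a view is rendered once and the cached result is then served to every user who has permission to see it.

```python
from typing import Callable, Dict, Set


class SharedVizCache:
    """Toy cache: one rendered result per view, shared by permitted users."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render              # stands in for the query + render step
        self._cache: Dict[str, str] = {}   # view name -> rendered result
        self.render_calls = 0              # counts how often the render step ran

    def get(self, user: str, view: str, permissions: Dict[str, Set[str]]) -> str:
        # Check permission first, so a cached result is never returned to a
        # user who is not allowed to see the view.
        if view not in permissions.get(user, set()):
            raise PermissionError(f"{user} may not view {view}")
        if view not in self._cache:
            self.render_calls += 1
            self._cache[view] = self._render(view)
        return self._cache[view]


def fake_render(view: str) -> str:
    # Hypothetical placeholder for querying the data source and drawing the view.
    return f"<image of {view}>"


permissions = {"asha": {"Sales by Region"}, "ben": {"Sales by Region"}, "cara": set()}
cache = SharedVizCache(fake_render)
first = cache.get("asha", "Sales by Region", permissions)
second = cache.get("ben", "Sales by Region", permissions)  # served from the cache
```

In this sketch the second request does not trigger another render, which is the load-time saving the notes describe, while a user without permission gets an error instead of the cached image.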

Download and Install Tableau


To download and install Tableau for data visualization, follow these steps:
1. System Requirements: Ensure your computer meets the system requirements for Tableau. Visit the official Tableau
website to find the specific requirements for your operating system.
2. Download Tableau: Visit the Tableau download page on the official website
(https://www.tableau.com/products/desktop/download).
Choose the version of Tableau Desktop you want to download. You may have the option of downloading a trial
version or a licensed version, depending on your needs.
Click the appropriate download button.
3. Installation: Once the download is complete, locate the Tableau installer file (it should be an executable file with a
.exe extension for Windows and .dmg for macOS) and double-click it to begin the installation.
4. Install Tableau: Follow the on-screen prompts to install Tableau on your computer. You'll be guided through the
installation process, which typically involves agreeing to the license agreement, choosing the installation location,
and selecting options for file associations.
5. License Activation: If you are installing a licensed version of Tableau, you will be prompted to enter your product key
and activate the license during the installation process. For a trial version, you may need to provide some basic
information.
6. Complete Installation: After you've configured the installation options and activated your license (if applicable), click
"Install" to complete the installation process.
7. Launch Tableau:
Once the installation is finished, you can launch Tableau Desktop by finding it in your applications or programs list or
by clicking on the desktop shortcut, if created during installation.
8. Initial Configuration:
When you launch Tableau for the first time, you may be prompted to configure your data connection preferences
and sign in with your Tableau account.
9. Start Using Tableau:
After completing the initial configuration, you can start creating visualizations, connecting to data sources, and
exploring Tableau's features.

Using the Workspace Control Effectively
If you are accustomed to working with spreadsheets or other analysis tools, learning Tableau's desktop environment will
come easily. If you have no familiarity with spreadsheets or database terminology, you can still be using Tableau effectively
within a few days.
The Data Connection Page and Start Page
 Open Tableau, and you see the start page of Tableau Desktop.
 On the left side, the data window gives connection options. If you click Connect to Data, you are taken
to the data connection workspace. You can also access this page by clicking the hard disk tab next to the
Start button. If you want to connect to one of the data sources listed in the On a Server section, you must go to
Tableau's website and download a connector for the required database. There is no limit on the number of data
connection drivers you can install, but some vendors require that you hold a valid license for their software before
downloading their connector.
 On the right side of the Connect to Data page, you will see saved data connections. Tableau provides four
sample data sources for learning. Any other data connections you have saved (.tds files) are displayed there as well. Click the
Home button to return to the start page and look at the Workbooks area. The Workbooks area shows the last nine workbooks
you've opened. If you want to keep a frequently used workbook there, hover over the workbook image and click
the push pin. That prevents the workbook from being cycled out of view.
 To remove a saved workbook from the start page, click the red X that appears when you hover over the workbook's
image. At the bottom of the start page, the Getting Started area provides links to training videos and promotional
materials. The sample workbook area provides links to sample workbooks containing example material.
Clicking More Samples takes you to Tableau's visual gallery on the web, which has even more example workbooks.

Tableau User Interface


Tableau's user interface (UI) is designed to be intuitive and user-friendly, making it accessible to both data professionals
and business users. It provides a workspace for creating data visualizations and dashboards. Here's an overview of the
key elements and components of Tableau's UI for data visualization:
 Menu Bar: The menu bar at the top of the Tableau window provides access to various commands, including file
operations, data connections, formatting, and more.
 Data Pane: The Data pane on the left side allows you to connect to data sources, view the data structure, and drag
and drop fields onto the canvas for analysis.
 Sheets and Dashboards: The central part of the UI is where you work on creating your visualizations. You can switch
between different sheets (worksheets) and dashboards, each of which can contain various visualizations.
 Shelves: Shelves are located on the left, top, and bottom of the central canvas. They include the Columns shelf, Rows
shelf, and Marks card. These are where you place dimensions and measures to build your visualizations.
 Marks Card: The Marks card allows you to control the appearance and behavior of marks in your visualizations, such
as colors, size, labels, and tooltips.
 Show Me Panel: The Show Me panel on the right side of the toolbar provides quick access to various visualization
types, helping you choose the best chart type for your data.
 Data Connection: The Data Source tab on the left allows you to manage your data connections, apply filters, and edit
data source settings.
 Toolbar: The toolbar provides buttons for actions like saving your work, undo/redo, switching between sheets and
dashboards, and various formatting options.
 Filters Shelf: You can use the Filters shelf to add filters to your visualizations, allowing users to interact with the data
by selecting specific values.
 Pages Shelf: The Pages shelf is used for creating dynamic visualizations that can be animated over time or using a
specific dimension.
 Legends and Color Legends: These components provide information about the color schemes and legends used in
your visualizations.
 Worksheet Tabs: Each open worksheet or dashboard is represented by a tab at the bottom, making it easy to switch
between them.
 Status Bar: The status bar at the bottom provides information about data source connections, extract updates, and
other system information.

 Analytics Pane: The Analytics pane on the left allows you to add and customize various analytical calculations and
functions.
 Parameters and Set Controls: These components, located on the left, enable users to create dynamic parameters
and sets for their visualizations.
 Preview Pane: The preview pane on the right shows a preview of your visualization as you make changes to it.
 Web Edit Mode: Tableau provides a web-based editing mode where you can create and edit dashboards directly in a
web browser.
 Tableau Server Integration: If connected to Tableau Server or Tableau Online, users can publish their visualizations
to the server or online platform for collaboration and sharing.
Tableau's user interface is highly interactive and is designed to facilitate the creation of data visualizations and
dashboards with ease. Depending on your specific task or visualization requirements, you'll interact with these
components to design and analyze your data effectively. Keep in mind that the exact layout and features may vary
depending on the version of Tableau you are using.

Tableau Desktop Workspace


 Click the Tableau icon displayed on the left-hand side of the Tableau worksheet page to expose the contents of
the worksheet tab selected at the bottom of the screen. When you connect to a new data source, this is the
default workspace view.
 Go to the home page and select the Global Superstore Sales Excel sheet.
 When you open a connection to a saved data source, a blank worksheet should also open.
 You can open a workspace page in several ways. For example, if both the Tableau icon and a data source file are on
your desktop, dragging the data source icon and dropping it on the Tableau icon
opens Tableau's worksheet page for that data source. You can also open as many connections as you need
in Tableau by going to the data connection page or start page and selecting a new connection.
 Now, the worksheet is connected to the Global Superstore Sales Excel dataset.
Tableau Desktop Workspace Menu
The Tableau desktop workspace consists of various elements as given below:
1. Menu Bar: It consists of menu options like File, Data, Worksheet, Dashboard, Story, Analysis, Map, Format, Server,
Window, and Help. The options in the menu bar include features like data source connection, file saving, design,
table calculation options, and file export for creating a dashboard, worksheet, or story.
 File Menu: As in any Windows program, the File menu contains the New, Open, Close, Save, Save As, and Print functions.
The most frequently used feature in this menu is the Print to PDF option, which exports the
dashboard or worksheet as a PDF. If you don't remember where Tableau places files, or you want to change the
default file-save location, use the Repository Location option to review and change it. The Export Packaged
Workbook option creates a packaged workbook quickly.
 Data Menu: The Data menu is useful if you find some interesting tabular data on a website that you want to
analyze with Tableau. Highlight and copy the data from the site, then use the Paste Data option to bring it into
Tableau. Tableau then copies the data from the Windows clipboard and adds a data source in the
data window. The Edit Relationships option is used in data blending. It is needed if the field
names are not identical in two different data sources, and it lets you define the related fields correctly.
 Worksheet Menu: The Export option allows you to export the worksheet as an Excel crosstab, an image, or in
Access database file format. The Duplicate as Crosstab option creates a crosstab version of the worksheet and
places it in a new worksheet.
 Dashboard Menu: The Action Menu is a useful feature that is reachable from both the Worksheet Menu and the
Dashboard Menu.
 Analysis Menu: In this menu, you can access the Stack Marks and Aggregate Measures options. These switches
let you adjust default Tableau behaviors, which is useful if you need to build non-standard chart types. The
Create Calculated Field and Edit Calculated Field options are used to make new measures and dimensions that
don't exist in your data source.
 Map Menu: The Map menu is used to alter the base map color schemes. The other menu options relate to
replacing Tableau's standard maps with other map sources. You can also import geocoding for
custom locations using the Geocoding menu.

 Format Menu: This menu is not used very often because pointing at anything and right-clicking gets you to a
context-specific formatting menu more quickly. Occasionally, you may need to alter the cell size in a worksheet. If you
don't like the default workbook theme, use the Workbook Theme menu to select one of the other two options.
2. Toolbar Icon: Toolbar icon below the menu bar can be used to edit the workbook using different features like redo,
undo, new data source, save, slideshow, and so on.
3. Dimension Shelf: The dimensions present in the data source, for example customer (customer name, segment),
order (order date, order ID, ship date, and ship mode), and location (country, state, and city), can be viewed
in the dimension shelf.
4. Measure Shelf: The measures present in the data source, for example Discount, Profit, Profit Ratio, Quantity, and
Sales, can be viewed in the measure shelf.
5. Sets and Parameters Shelf: User-defined sets and parameters can be viewed in the sets and parameters shelf. It is also used
to edit existing sets and parameters.
6. Page Shelf: The page shelf is used to play the visualization like a video by placing the relevant field on the page shelf.
7. Filter Shelf: The filter shelf is used to filter the graphical view with the help of measures and dimensions.
8. Marks Card: The marks card is used to design the visualization. The data components of the visualization, such as size, color,
path, shape, label, and tooltip, can be modified in the marks card.
9. Worksheet: The worksheet is the space where the actual visualization, design, and functionalities are viewed in the
workbook.
10. Tableau Repository: Tableau repository is used to store all the files related to the Tableau desktop. It includes
various folders like Connectors, Bookmarks, Data sources, Logs, Extensions, Map sources, Shapes, Services, Tab
Online Sync Client, and Workbooks. My Tableau repository is located in the file path C:\Users\User\Documents\My
Tableau Repository.

Toolbar
Tableau's toolbar is an essential part of the user interface that provides quick access to various functions and tools for
creating, customizing, and interacting with data visualizations. The toolbar is located just below the menu bar in Tableau
and contains a range of icons that represent different actions and features.
Here's an overview of the key components and functions of the Tableau toolbar:
1. Open: The "Open" icon allows you to open existing Tableau workbooks or projects.
2. Save: The "Save" icon is used to save your current Tableau workbook or project.
3. Undo and Redo: These icons enable you to undo or redo recent actions in your workbook.
4. New Worksheet: Clicking this icon creates a new worksheet where you can build additional visualizations.
5. New Dashboard: This icon opens a new dashboard for combining multiple visualizations onto a single page.
6. Data Source: Clicking this icon allows you to connect to and configure data sources for your project.
7. Show Data Source: The "Show Data Source" icon displays the Data Source tab, allowing you to work with the data
source, apply data transformations, and shape your data.
8. Sheets and Dashboards: These icons provide quick access to the tabs for individual sheets and dashboards in your
project.
9. Show Me: The "Show Me" icon opens the "Show Me" panel, which provides predefined visualization templates and
helps you select the best chart type for your data.
10. Connect to Data: This icon allows you to establish a connection to different data sources and access the data you
want to visualize.
11. Data Preparation: The "Data Preparation" icon opens Tableau Prep, a tool for cleaning, transforming, and shaping
your data before visualization.
12. Annotations: You can use the "Annotations" icon to add notes, shapes, and lines to your visualizations, making it
easier to communicate insights.
13. Dashboard Options: The "Dashboard Options" icon opens the settings for configuring dashboard layout, size, and
other properties.
14. Worksheet Options: This icon allows you to configure the settings of the current worksheet, including titles,
formatting, and layout.
15. Publish to Tableau Server/Online: Clicking this icon is used to publish your workbook or dashboard to Tableau
Server or Tableau Online for sharing and collaboration.
16. Web Edit: The "Web Edit" icon opens the web editing mode for Tableau Server or Tableau Online, allowing you to
edit and create content directly in the web environment.
17. Server/Online Home: Clicking this icon takes you to the home page of Tableau Server or Tableau Online.
18. User Profile: The "User Profile" icon provides access to your Tableau account and user settings.
19. Tableau Help: The "Tableau Help" icon links to documentation and support resources, including online help and
community forums.
20. Feedback and Product Updates: Clicking this icon allows you to provide feedback on Tableau and check for product
updates.
The toolbar in Tableau streamlines the workflow for building data visualizations and dashboards. It offers quick access to
key functions, making it easier to design, analyze, and share data-driven insights. Keep in mind that the toolbar icons and
options may evolve with new versions of Tableau, so consult the latest Tableau documentation for the most up-to-date
information.

Sheets
In Tableau, a "sheet" is a fundamental component used to create and design visualizations. Sheets are part of a Tableau
workbook, and they allow you to build individual visualizations and explore different aspects of your data. Sheets serve
as the canvas where you can drag and drop data fields, define visualization types, and customize the appearance and
behavior of your charts and graphs.
Here's an overview of sheets in Tableau:
1. Creating a Sheet: To create a new sheet, you can click the "New Worksheet" icon in the toolbar or select
"New Worksheet" from the "Worksheet" menu. You can also duplicate existing sheets to create variations of your
visualizations.
2. Workspace: A sheet in Tableau provides a workspace or canvas where you can build and design your visualization.
The workspace typically includes columns and rows, which represent the X and Y axes.
3. Drag-and-Drop Interface: The core of creating visualizations in Tableau is the drag-and-drop interface. You can select
dimensions and measures from the data source and drop them onto the canvas to define what to visualize.
4. Shelves: Above and to the left of the canvas, you'll find shelves that allow you to place fields, which control various
aspects of your visualization. These shelves include:
 Columns: Defines how data is presented on the X-axis.
 Rows: Determines data placement on the Y-axis.
 Filters: Lets you filter the data displayed on the sheet.
 Marks: Allows you to customize the appearance of data marks.
5. Marks Card: The Marks card, located to the left of the canvas, is used to specify how the marks (data points) on your
visualization should look. You can control mark properties such as color, size, shape, and label from this card.
6. Data Shading: Depending on the visualization type, you may be able to apply data shading or color encoding to
convey additional information within the visualization.
7. Worksheet Options: Each sheet has a set of worksheet options accessible from the worksheet tab. These options
include formatting settings, titles, tooltips, and other properties specific to that sheet.
8. Field Pane: On the left side of the interface, you can access the Field Pane, which displays the list of available data
fields. You can drag and drop fields from the Field Pane onto the shelves or canvas.
9. Legends and Color Legends:
Legends provide key information about the data in your visualization, such as color encoding and data ranges.
10. Actions and Interactivity:
You can create actions that add interactivity to your visualizations, such as filtering one visualization based on
selections in another.
11. Dashboard Integration:
Sheets can be incorporated into dashboards, allowing you to combine multiple sheets and visualizations onto a
single dashboard.
12. Preview and Interact:
You can use the preview mode to interact with your visualization and see how it behaves when you apply filters or
select data points.
13. Data Source Preview:
You can use the Data Source tab to preview and inspect the data source to verify your field selections.
14. Annotations and Notes:
Sheets allow you to add annotations, notes, and reference lines to provide context to your visualizations.
Sheets in Tableau are versatile and flexible, allowing you to create a wide range of visualizations, from simple bar
charts and scatter plots to complex interactive dashboards. By combining multiple sheets and dashboards, you can
effectively communicate insights from your data to a wide audience.
Dashboards
In Tableau, dashboards are powerful tools for data visualization and reporting. Dashboards allow you to combine
multiple visualizations, filters, and interactive elements into a single, unified view. This enables you to present and
explore data insights in a cohesive and meaningful way.
Components and Features of a Tableau Dashboard:
1. Dashboard Size and Layout: You can customize the size and layout of your dashboard to fit your specific needs. You
can choose from different layout containers and size options.
2. Sheets and Objects: Dashboards allow you to add sheets, visualizations, and objects from your workbook. These
elements can be arranged and resized to create a cohesive layout.
3. Horizontal and Vertical Layout Containers: You can use containers to organize your dashboard elements. Horizontal
and vertical layout containers help you control the placement of visualizations and objects.
4. Sheets and Objects Shelf: The "Sheets" and "Objects" shelf on the left side of the dashboard interface provides a list
of available sheets and objects in your workbook. You can drag and drop these onto the dashboard.
5. Objects: You can add various objects to your dashboard, including images, text, web content, and blank objects,
which can be customized to provide context or explanations for your data.
6. Quick Filters: Quick filters allow users to interactively filter the data on the dashboard. You can add filter controls
that affect multiple visualizations at once.
7. Actions: Actions enable interactivity within a dashboard. You can create actions that let users click on one
visualization to affect the data displayed in another.
8. Legends: You can include legends in your dashboard to provide context for color coding and data range information
used in visualizations.
9. Titles and Text: Dashboards support text and title elements. You can add titles and captions to describe the content
and insights presented.
10. Background and Borders: You can customize the background color, image, and borders of your dashboard to create a
polished and branded look.
11. Layout Containers: Containers are used to group and control the placement of dashboard elements. You can nest
containers to create more complex layouts.
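
The idea behind a filter action (item 7 above) can be sketched outside Tableau. The example below is only an illustration in Python, not how Tableau implements actions; the `orders` rows and the function names `sales_by` and `category_view` are made up. Selecting a region in one view narrows the rows that feed a second view.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical sample rows standing in for a data source.
orders = [
    {"Region": "East", "Category": "Furniture", "Sales": 200.0},
    {"Region": "East", "Category": "Technology", "Sales": 350.0},
    {"Region": "West", "Category": "Furniture", "Sales": 125.0},
]


def sales_by(field: str, rows: List[dict]) -> Dict[str, float]:
    """Aggregate Sales by one field, as a bar chart would."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[field]] += row["Sales"]
    return dict(totals)


def category_view(selected_region: Optional[str] = None) -> Dict[str, float]:
    """Target view: filtered by whatever is selected in the region view."""
    # With no selection, every row passes the filter.
    rows = [r for r in orders if selected_region in (None, r["Region"])]
    return sales_by("Category", rows)


region_view = sales_by("Region", orders)   # source view
unfiltered = category_view()               # nothing selected yet
after_click = category_view("East")        # user clicks the East bar
```

Here `region_view` plays the role of the source sheet and `category_view` the target sheet; the selection is passed as a plain argument, whereas in a dashboard the action carries it between sheets.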
Advantages of Tableau Dashboards:
1. Data Integration: Dashboards allow you to combine multiple visualizations and data sources into one view, providing
a holistic understanding of your data.
2. Interactivity: Dashboards support user interactivity, such as filtering and parameter actions, making it easy to
explore data.
3. Visual Appeal: You can design visually appealing dashboards with customized layouts, fonts, colors, and images to
effectively communicate insights.
4. Storytelling: Dashboards enable storytelling by arranging visualizations and text in a narrative flow, making it easier
to convey a data-driven story.
5. Efficiency: Users can quickly analyze data from a single dashboard, reducing the need to navigate through multiple
worksheets.
Challenges of Tableau Dashboards:
1. Complexity: Building complex dashboards can be time-consuming and may require advanced skills.
2. Performance: Highly interactive dashboards with multiple components may require careful optimization to ensure
smooth performance.
3. Accessibility: Ensuring that dashboards are accessible to all users, including those with disabilities, can be a
challenge.
4. Maintenance: Regular updates and maintenance of dashboards to accommodate changing data sources and user
requirements are necessary.
How to create a data dashboard:
There are many different solutions to help you build dashboards: Tableau, Excel, or Google Sheets. But at a basic level,
here are important steps to help you build a dashboard:
1. Define your audience and goals: Ask who you are building this dashboard for and what they need to understand.
Once you know that, you can answer their questions more easily with selected visualizations and data.

2. Choose your data: Most businesses have an abundance of data from different sources. Choose only what’s relevant to
your audience and goal to avoid overwhelming your audience with information.
3. Double-check your data: Always make sure your data is clean and correct before building a dashboard. The last thing
you want is to realize in several months that your data was wrong the entire time.
4. Choose your visualizations: There are many different types of visualizations to use, such as charts, graphs, maps, etc.
Choose the best one to represent your data. For example, bar and pie charts can quickly become overwhelming when
they include too much information.
5. Use a template: When building a dashboard for the first time, use a template or intuitive software to save time and
headaches. Carefully choose the best one for your project and don’t try to shoehorn data into a template that doesn’t
work.
6. Keep it simple: Use similar colors and styles so your dashboard doesn’t become cluttered and overwhelming.
7. Iterate and improve: Once your dashboard is in a good place, ask for feedback from a specific person in your core
audience. Find out if it makes sense to them and answers their questions. Take that feedback to heart and make
improvements for better adoption and understanding.
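
Step 3 above (double-check your data) can be partly automated before the data reaches any dashboard tool. The following is a minimal sketch in plain Python, assuming the data has already been loaded as a list of rows; the function `check_rows`, the column names, and the sample values are hypothetical. It flags rows with missing required values and rows that exactly duplicate an earlier row.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def check_rows(rows: List[Row], required: List[str]) -> Dict[str, List[int]]:
    """Return row indexes with missing required values or exact duplicates."""
    problems: Dict[str, List[int]] = {"missing": [], "duplicate": []}
    seen = set()
    for index, row in enumerate(rows):
        # A required value counts as missing if it is absent, None, or empty.
        if any(row.get(name) in (None, "") for name in required):
            problems["missing"].append(index)
        # Sort the items so that column order does not affect the comparison.
        key = tuple(sorted(row.items()))
        if key in seen:
            problems["duplicate"].append(index)
        seen.add(key)
    return problems


rows = [
    {"Order ID": "A-1", "Region": "East", "Sales": "120.50"},
    {"Order ID": "A-2", "Region": "", "Sales": "88.00"},
    {"Order ID": "A-1", "Region": "East", "Sales": "120.50"},
]
report = check_rows(rows, required=["Order ID", "Region", "Sales"])
```

A report like this only catches structural problems; whether the values themselves are correct still needs a check against the original source.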

Data Window
The "Data Window" in Tableau is a critical part of the user interface that provides a preview of the data source you've
connected to in your Tableau workbook. It allows you to interact with, examine, and manipulate your data before
creating visualizations. Here's an overview of the Tableau Data Window:
Accessing the Data Window: To access the Data Window in Tableau, follow these steps:
 Open your Tableau workbook.
 In the Tableau interface, you'll typically find the Data Window located on the left side of the workspace.
Key Features and Functions of the Data Window:
1. Data Source Preview: The Data Window displays a preview of your connected data source, showing the initial rows of
your dataset. This helps you understand the structure and content of your data.
2. Data Fields: The Data Window lists all the data fields available in your dataset, which include dimensions (categorical
variables) and measures (quantitative variables). You can drag and drop these fields onto the canvas to create
visualizations.
3. Data Type Indicators: Next to each data field, Tableau provides data type indicators to inform you whether a field is
recognized as a dimension, measure, or attribute.
4. Field Metadata: When you click on a data field in the Data Window, you can access metadata and details about that
field, such as data type, number of distinct values, and the role assigned to it.
5. Data Sorting and Filtering: You can sort and filter the data in the Data Window to view specific data subsets or order
the data in a particular way.
6. Data Source Editing: The Data Window allows you to make changes to your data source, including creating calculated
fields, renaming fields, and modifying data roles.
7. Data Source Options: Right-clicking on a data field or selecting it provides access to a range of options, including
creating groups, hierarchies, sets, and more.
8. Data Source Split View: You can switch between a single view of your data source and a split view that displays both
your data source and a worksheet for visualization.
9. Data Source Filtering: The Data Window supports data source filtering, allowing you to apply filters to the data
before it's brought into the visualization. This can be useful for reducing data volume.
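
The field metadata mentioned in item 4 (data type, number of distinct values, and role) can be approximated with a short sketch. This is an illustration only; the function `describe_field`, the sample rows, and the rule that numeric fields default to measures are assumptions made for the example, not a description of Tableau's internal logic.

```python
from typing import Any, Dict, List


def describe_field(values: List[Any]) -> Dict[str, Any]:
    """Return a data type, a distinct-value count, and a guessed role."""
    present = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    return {
        "data_type": "number" if numeric else "string",
        "distinct_values": len(set(present)),
        # Assumed rule of thumb: numeric fields start as measures,
        # everything else starts as a dimension.
        "role": "measure" if numeric else "dimension",
    }


# Hypothetical rows in the style of a sales data set.
rows = [
    {"Segment": "Consumer", "Ship Mode": "First Class", "Sales": 261.96},
    {"Segment": "Corporate", "Ship Mode": "Second Class", "Sales": 731.94},
    {"Segment": "Consumer", "Ship Mode": "First Class", "Sales": 14.62},
]
metadata = {name: describe_field([row[name] for row in rows]) for name in rows[0]}
```

In this sketch `Segment` comes out as a dimension with two distinct values and `Sales` as a measure, matching the categorical versus quantitative split described in item 2.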
Advantages of the Data Window:
1. Data Exploration: The Data Window allows you to quickly explore your data, understand its structure, and identify
patterns or anomalies.
2. Data Preparation: You can perform data preparation tasks like cleaning, filtering, and shaping data within the Data
Window.
3. Field Management: It offers tools to manage data fields, such as renaming, changing data types, and defining custom
calculations.
4. Data Role Assignment: You can specify the role of each field, like dimension or measure, to control how it's used in
visualizations.

Challenges of the Data Window:
1. Data Volume: With large datasets, the Data Window may become challenging to work with due to the sheer volume
of data.
2. Complex Data Structures: If your dataset has complex structures or joins, understanding the data relationships in the
Data Window can be challenging.
3. Data Cleanup: Extensive data preparation tasks may require additional tools or external data cleansing processes.

Data Types
Tableau is an easy-to-use Business Intelligence tool used for data visualization. Its distinguishing features include
real-time collaboration and data blending. Through Tableau, users can connect to databases, files, and other big data
sources and create shareable dashboards from them. Tableau is mainly used by researchers, professionals, and
government organizations for data analysis and visualization.
A data type classifies a data value into a definite type: some values are characters (e.g. ‘Vansh’), some are integers
(e.g. 108), and some are floating-point numbers (e.g. 1.854). In this way, every data value falls under a certain data
type. Tableau too has a set of data types under which it classifies the data values present in it as field values.
In Tableau, we have six primary data types. Tableau automatically detects the data types of the various fields as soon
as the data is uploaded from the source and assigns them to the fields.
These six data types are:
(1) String Data type: A collection of characters makes up the string data type. A string is always enclosed within
single or double quotation marks. Examples of strings are “Vansh”, “Hi! How are you?”, and
“GeeksforGeeks”.
We can divide the String data type into two types, Char and Varchar.
Char string type- The Char data type normally stores alphanumeric data values of a fixed length. If the user enters a
string value that is longer than the fixed length of the Char data type, the system returns an error.
Varchar string type- The Varchar data type also stores alphanumeric data values. As the name suggests, Varchar stores data
values of variable length, so the user can enter string values of any length without facing any restriction
from the system.
(2) Numeric Data type: This data type covers both integer and floating-point values. Users often prefer the integer
type, because floating-point values lose precision beyond a certain number of decimal places. Tableau also provides
the ROUND() function, which rounds float values to a given number of decimal places.
(3) Date and Time Data type: Tableau supports all common date and time formats, such as dd-mm-yy or mm-dd-yyyy.
Date and time values can be expressed at the level of a decade, year, quarter, month, hour, minute, second, etc.
Whenever the user enters date and time values, Tableau automatically registers them under the Date or the
Date & Time data type.
(4) Boolean Data type: Boolean values are produced as the result of relational calculations and are either True or
False. When the result of a relational calculation is unknown, a Null value is used.
(5) Geographic Data type: All values that are used in maps come under the geographic data type. Examples of
geographic data values are country name, state name, city, region, and postal code.
(6) Cluster or Mixed Data type: Sometimes a data set contains values with a mixture of data types. Such values are
known as cluster group values or mixed data values. In such a situation, users have the option either to handle them
manually or to allow Tableau to operate on them.
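The idea of automatic type detection described above can be sketched in Python. This is only an illustration, not Tableau's actual detection logic: the function names, the list of date formats, and the "mixed" label are assumptions made for the example, and the geographic type is left out because it depends on a lookup of place names.

```python
from datetime import datetime

# Assumed date formats to try; Tableau's real rules are not given in these notes.
DATE_FORMATS = ("%d-%m-%y", "%m-%d-%Y", "%Y-%m-%d")


def infer_type(value):
    """Return a rough type label for one raw text value."""
    text = value.strip()
    if text == "":
        return "null"
    if text.lower() in ("true", "false"):
        return "boolean"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        pass
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return "date"
        except ValueError:
            continue
    return "string"


def infer_column_type(values):
    """Label a whole column; more than one non-null type counts as 'mixed'."""
    kinds = {infer_type(v) for v in values} - {"null"}
    if not kinds:
        return "null"
    if kinds == {"integer", "float"}:
        return "float"  # integers can be stored as floats
    return kinds.pop() if len(kinds) == 1 else "mixed"
```

For example, a column holding "1", "2.5" and an empty cell would be labelled float, while a column holding "1" and "abc" would be labelled mixed and left for the user to handle.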

File types:
Tableau uses several file types for various purposes. Here are some of the key Tableau file types in detail:
(1) Tableau Workbook (.twb):
This is the primary file format used by Tableau for workbooks. It contains information about data connections,
worksheets, dashboards, and layouts. .twb files do not embed the data; instead, they reference data sources, which can
be in various formats.
(2) Tableau Packaged Workbook (.twbx):
A .twbx file is a packaged workbook, which includes both the workbook (.twb) and the data source. All the necessary
data and metadata are bundled into a single file, making it easier to share with others. This is useful when you want to
ensure that recipients have access to the data along with the workbook.
(3) Tableau Data Extract (.hyper):

Tableau Data Extracts are highly optimized, columnar data storage files designed for performance. These files are
typically created when you extract data from a data source in Tableau. Extracts can be used to speed up data
visualization, especially with large datasets.
(4) Tableau Data Source (.tds):
A .tds file is used to save data source information separately from a workbook. It contains metadata about data
connections, custom calculations, and other data source settings. It is useful for sharing data source configurations
across multiple workbooks.
(5) Tableau Packaged Data Source (.tdsx):
Similar to a .tds file, but it also includes an embedded data extract (.hyper). This bundles the data source and the
extract together, which is useful for sharing self-contained data sources with others.
(6) Tableau Bookmark (.tbm):
A .tbm file is used to save individual worksheet or dashboard settings, including layout and formatting. It enables you
to share specific views or configurations of your Tableau work.
(7) Tableau Workbook Template (.twbt):
A .twbt file is a Tableau workbook template that can be used as a starting point for new workbooks. It contains
predefined formatting, layout, and other settings to maintain consistency across projects.
These file types enable Tableau users to create, save, and share their data visualizations, reports, and data sources
efficiently. The choice of file type depends on your specific needs, such as sharing workbooks with or without data,
packaging data sources, or creating templates.
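Packaged files (.twbx, .tdsx) are ordinary zip archives, so the bundling described above can be inspected with Python's standard zipfile module. The sketch below builds a tiny stand-in archive instead of using a real workbook; the file names inside it and all function names are made up for the illustration.

```python
import os
import tempfile
import zipfile


def build_demo_twbx(directory):
    """Create a tiny stand-in archive (hypothetical contents, not a real workbook)."""
    path = os.path.join(directory, "demo.twbx")
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("demo.twb", "<workbook/>")          # workbook definition
        archive.writestr("Data/demo.hyper", b"placeholder")  # bundled extract
    return path


def list_packaged_contents(path):
    """Return the names of the files bundled inside a packaged file."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def split_definition_and_data(names):
    """Separate the workbook/data-source definition from the bundled data files."""
    suffixes = (".twb", ".tds")
    definitions = [n for n in names if n.endswith(suffixes)]
    data_files = [n for n in names if not n.endswith(suffixes)]
    return definitions, data_files


with tempfile.TemporaryDirectory() as tmp:
    names = list_packaged_contents(build_demo_twbx(tmp))
    definitions, data_files = split_definition_and_data(names)
```

Here definitions holds the workbook file and data_files holds the bundled extract, which mirrors the difference between sharing a .twb (definition only) and a .twbx (definition plus data).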

Data Connection with Data Sources


Tableau is a powerful data visualization and business intelligence tool that allows you to connect to various data sources
to create interactive and insightful visualizations.
Here are the steps to connect Tableau to different data sources:
1. Launch Tableau: Open Tableau Desktop or Tableau Server, depending on your setup and licensing.
2. Connect to Data Source:
 Click on the "Connect to Data" option.
 You'll see a list of data source options. Tableau supports a wide range of data sources, including databases, cloud
services, spreadsheets, and more.
Some common data sources include:
 Database: You can connect to various database types such as MySQL, PostgreSQL, SQL Server, Oracle, etc.
 File: You can import data from Excel, CSV, JSON, and other file formats.
 Cloud: You can connect to cloud-based data sources like Amazon Redshift, Google BigQuery, or Salesforce.
 Web Data Connector: If the data source is on the web and offers a Tableau Web Data Connector, you can use
this option.
 Server: You can connect to data hosted on a Tableau Server or Tableau Online.
3. Select Data Source Type: Choose the data source type that matches your data, and click on it. For example, if you
want to connect to a MySQL database, select the "MySQL" option.
4. Provide Connection Details:
 Depending on your data source, you will need to provide connection details. This typically includes the
server/host name, port, database name, and authentication credentials.
 You might need to install the relevant database drivers if Tableau doesn't have them by default.
5. Connect to Data: Once you've entered the connection details, click the "Connect" or "Sign In" button.
6. Data Source Tab: Tableau will load a data source tab, where you can see tables, views, or files available in your data
source.
7. Data Preparation: You can perform data preparation tasks within Tableau to clean, transform, and shape the data as
needed.
8. Build Visualizations: After connecting to your data source and preparing the data, you can start building
visualizations. Drag and drop fields onto the Rows and Columns shelves to create charts, graphs, and dashboards.
9. Save and Publish: Save your Tableau workbook locally, and if you're using Tableau Server or Tableau Online, you can
publish your workbook to the server for sharing with others.
10. Scheduled Refresh (optional): If your data source is dynamic and updates frequently, you can set up scheduled
refreshes to keep your visualizations up-to-date.
How to make a connection with a text file?
Tableau can connect to the following text files:
(a) .csv
(b) .tsv
(c) .txt
(d) .tab
Step 1: Open the Tableau Desktop.
Step 2: Click on the Text File option available under Connect.
Step 3: Select the file to connect and click on the Open button.
Step 4: The connected file now appears on the left side of the Data Source page.
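What happens when a delimited text file is read can be sketched with Python's csv module. This is an analogy rather than Tableau's own parser: the mapping from file extension to delimiter is an assumption (a .txt file may use either commas or tabs), and read_text_table is a name chosen for the example.

```python
import csv
import io

# Assumed delimiter for each supported extension.
DELIMITERS = {".csv": ",", ".tsv": "\t", ".tab": "\t", ".txt": ","}


def read_text_table(text, extension):
    """Parse delimited text into a header row and a list of data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=DELIMITERS[extension])
    rows = list(reader)
    return rows[0], rows[1:]


header, rows = read_text_table("Region,Sales\nEast,100\nWest,250\n", ".csv")
```

The first row becomes the field names (Region, Sales) and every later row becomes one record. All values are still text at this point, so a type-detection step is needed before Sales can be treated as a number.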

How to make a connection with an Excel file?

Step 1: Open the Tableau Desktop.
Step 2: Click on the Microsoft Excel option available under Connect.
Step 3: Select the file to connect and click on the Open button.
Step 4: The Excel file now appears on the left side of the Data Source page.
Step 5: More than one sheet can be dragged from the sheets tab.

Tableau calculation with:


(1) Functions
(2) Fields
(3) Operator
(4) Literal
(5) Parameters
(6) Comments
Tableau is a powerful data visualization tool commonly used for business intelligence and data analytics. To create
calculations in Tableau, you can use various functions, fields, operators, literals, parameters, and comments. Here's an
overview of each:
(1) Functions: Tableau provides a wide range of built-in functions for data manipulation and calculations. These
include aggregate functions (e.g., SUM, AVG), date functions (e.g., DATEADD, DATEDIFF), string functions
(e.g., LEFT, RIGHT), and more. Functions allow you to perform operations on your data.
(2) Fields: Fields represent columns in your data source, and you can use them in calculations. For example, you
might create a calculated field that combines the values of two existing fields.
(3) Operators: Operators are symbols or keywords used to perform operations on fields or literals. Common
operators include + (addition), - (subtraction), * (multiplication), / (division), and logical operators like AND, OR.
(4) Literals: Literals are constant values used in calculations. For example, you might use a literal number (e.g., 5) or
a string (e.g., "Category A") in your calculations.
(5) Parameters: Parameters are dynamic values that allow users to input data and change calculations interactively.
You can use parameters to create more flexible and user-friendly dashboards.
(6) Comments: Comments do not affect the result of a calculation but are useful for documenting your work. In a
calculation, a comment starts with two forward slashes (//). You can add comments to explain the purpose of a
calculation or provide context for others who might view your Tableau workbook.
Here's a simple example of a Tableau calculation:
Calculation: Profit Margin
Formula: (SUM([Profit]) / SUM([Sales])) * 100
Explanation:
Functions: SUM is used to calculate the sum of Profit and Sales.
Fields: [Profit] and [Sales] are fields from your data source.
Operators: / is the division operator, and * is the multiplication operator.
Literals: 100 is a literal value used to convert the result into a percentage.
Parameters: Parameters can be used to make this calculation dynamic if needed.
Comments: You can add comments to explain the purpose or context of this calculation.
In Tableau, you can create calculated fields using the calculated field editor, and you can use calculated fields in your
visualizations to gain insights from your data.
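The Profit Margin formula above can be checked by hand with a few lines of Python. The two sample rows are invented for the illustration; the function simply mirrors (SUM([Profit]) / SUM([Sales])) * 100.

```python
# Invented sample rows standing in for a data source with Sales and Profit fields.
rows = [
    {"Sales": 200.0, "Profit": 100.0},
    {"Sales": 300.0, "Profit": 25.0},
]


def profit_margin(records):
    """Mirror of (SUM([Profit]) / SUM([Sales])) * 100; None when there are no sales."""
    total_sales = sum(r["Sales"] for r in records)
    total_profit = sum(r["Profit"] for r in records)
    if total_sales == 0:
        return None  # avoid dividing by zero
    return total_profit / total_sales * 100


margin = profit_margin(rows)  # (125 / 500) * 100 = 25.0
```

Note that the sums are taken before dividing. Averaging the row-level margins instead (50% and about 8.3%) would give roughly 29.2%, which is a different quantity from the overall margin of 25%.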
Functions:
Any data analysis involves a lot of calculations. In Tableau, the calculation editor is used to apply calculations to the fields
being analyzed. Tableau has a number of inbuilt functions which help in creating expressions for complex
calculations. The different categories of functions are described below:
(A) Number Functions :
These are the functions used for numeric calculations. They only take numbers as inputs. Following are some examples
of important number functions.

(B) String Functions:


String Functions are used for string manipulation. Following are some important string functions with examples

(C) Date Functions:
Tableau has a variety of date functions to carry out calculations involving dates. All the date functions use the date part
which is a string indicating the part of the date such as - month, day, or year. Following table lists some examples of
important date functions.

(D) Logical Functions:


These functions evaluate some single value or the result of an expression and produce a boolean output.

(E) Aggregate Functions:

Tableau – Operators:
Tableau provides a variety of operators that you can use for different purposes in your calculations and data
visualization. These operators can be categorized into four main types: General operators, Arithmetic operators,
Relational operators, and Logical operators. Here's an overview of each category:
1. General Operators:
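 Assignment (=): Tableau's calculation language has no assignment statement. A calculated field is defined by giving it
a name and a formula in the calculation editor, and inside a formula = tests for equality. For example, a calculated
field named New Field with the following formula takes the value of the existing field:
[Existing Field]
 Wildcard matching (*): Tableau formulas have no wildcard operator. Pattern matching on string data is done with the
Wildcard tab of a filter or with string functions. For example, to find all products with "apple" in their name, you can
use CONTAINS([Product Name], "apple").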
2. Arithmetic Operators: Arithmetic operators perform mathematical operations on numeric values.
 + (Addition): Used to add values together.
 - (Subtraction): Used to subtract values.
 * (Multiplication): Used for multiplication.
 / (Division): Used to divide values.
 % (Modulus): Returns the remainder of a division operation.
 ^ (Exponentiation): Raises a number to a power.
Examples:
[Sales] + [Profit]: Adds the "Sales" and "Profit" values.
[Year] - 1: Subtracts 1 from the "Year" field.
[Quantity] * [Price]: Multiplies "Quantity" by "Price".
[Total Sales] / [Number of Orders]: Divides total sales by the number of orders.
3. Relational Operators: Relational operators are used to compare values and return a Boolean result (true or
false). They are commonly used in logical expressions.
 = (Equal to): Checks if two values are equal.
 <> or != (Not equal to): Checks if two values are not equal.
 > (Greater than): Compares if one value is greater than another.
 < (Less than): Compares if one value is less than another.
 >= (Greater than or equal to): Checks if one value is greater than or equal to another.
 <= (Less than or equal to): Checks if one value is less than or equal to another.
Examples:
[Sales] = [Target Sales]: Compares if "Sales" is equal to "Target Sales."
[Profit] > 0: Checks if profit is greater than zero.
[Order Date] < #2023-01-01#: Compares if the order date is before January 1, 2023.
4. Logical Operators: Logical operators are used to create complex logical conditions by combining multiple
conditions.
 AND: Returns true if both conditions are true.
 OR: Returns true if at least one condition is true.
 NOT: Negates a condition (returns true if the condition is false).
Examples:
[Sales] > 1000 AND [Profit] > 100: Returns true if sales are greater than 1000 and profit is greater than 100.
[Category] = 'Electronics' OR [Category] = 'Appliances': Returns true if the category is either 'Electronics' or 'Appliances.'
NOT ([Region] = 'West'): Returns true if the region is not 'West.'
These operators are essential for creating calculated fields, filters, and other logic within Tableau to manipulate and
analyze your data effectively. Depending on your analysis requirements, you can combine these operators in various
ways to build complex calculations and conditions.
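The operator examples above can be tried out in Python on a single record. This is only an analogy: the record is invented, Python writes exponentiation as ** instead of ^ and equality as == instead of =, and date literals become date objects.

```python
from datetime import date

# One invented record standing in for a row of the data source.
order = {
    "Sales": 1500.0,
    "Profit": 120.0,
    "Category": "Electronics",
    "Region": "East",
    "Order Date": date(2022, 11, 30),
}

# Arithmetic operators
total = order["Sales"] + order["Profit"]  # [Sales] + [Profit]
remainder = 7 % 3                          # modulus
power = 2 ** 3                             # 2 ^ 3 in a Tableau formula

# Relational operators return a Boolean
is_profitable = order["Profit"] > 0
before_2023 = order["Order Date"] < date(2023, 1, 1)  # [Order Date] < #2023-01-01#

# Logical operators combine conditions
big_and_profitable = order["Sales"] > 1000 and order["Profit"] > 100
in_categories = order["Category"] == "Electronics" or order["Category"] == "Appliances"
not_west = not (order["Region"] == "West")
```

Every relational and logical expression here evaluates to True for this record, so it would pass a filter built from any of these conditions.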

Precedence of Operator:
The table below describes the order of precedence of the operators. The top row of the table has the highest
precedence, and operators in the same row have the same precedence.
If two operators have the same precedence, they are evaluated from left to right in the formula. Parentheses can be
used to change this order; inner parentheses are evaluated before outer parentheses.

Tableau basic filters
Tableau provides several basic filters that you can use in data visualization to interactively explore and analyze your
data. These filters allow you to control what data is displayed in your visualizations and dashboards. Here are some of
the fundamental filters in Tableau:
1. Quick Filters:
 Quick filters are easy-to-use filters that you can add to your dashboard to allow users to interactively explore the
data. They are applied to one or more worksheets on the dashboard.
 Quick filters can be created by right-clicking a field in the Data pane and selecting "Show Quick Filter" or by dragging
a field to the Filters shelf.
2. Filter Actions:
 Filter actions are interactive filters that allow you to control one visualization with another. For example, you can
click on a data point in one visualization to filter data in another.
 To create filter actions, go to the "Dashboard" menu and select "Actions." Then, choose "Filter" as the action type
and specify the source and target sheets.
3. Dimension Filters:
 Dimension filters are used to filter data based on discrete (categorical) fields. They provide a list of categories from
which you can select to filter the data.
 You can create dimension filters by dragging a dimension to the Filters shelf.
4. Measure Filters:
 Measure filters are used to filter data based on continuous (numeric) fields. They allow you to set a range of values
for the filter.
 To create a measure filter, right-click a measure in the Data pane, select "Show Filter," and then choose the "Range of
Values," "At Least," or "At Most" option, depending on the filter type.
5. Context Filters:
 Context filters are special filters that allow you to create a context in which other filters operate. When you apply a
context filter, Tableau creates a temporary subset of data based on the context filter's conditions.
 You can create a context filter by right-clicking a filter in the Filters shelf and selecting "Add to Context."
6. Top N Filters:
 Top N filters allow you to filter data to show only the top N items based on a specific measure. For example, you can
use a Top N filter to display the top 10 products by sales.
 To create a Top N filter, right-click a dimension or measure in the Filters shelf and select "Top" to set the number of
items to display.
7. Relative Date Filters:
 Relative date filters make it easy to filter data based on time periods relative to the current date. You can use
options like "Last N Days," "Next N Months," etc.
 To create a relative date filter, right-click a date field in the Filters shelf, and select "Add to Filters." Then, choose the
"Relative Date" option.
8. Combined Filters:
 You can combine multiple filters to create complex filter conditions. This can be useful when you want to filter data
based on multiple dimensions and measures.
 To combine filters, use the logical operators (AND, OR) within the filter conditions.
These basic filters in Tableau provide you with the flexibility to control the data displayed in your visualizations, enabling
you and your audience to explore and analyze data interactively and gain valuable insights.
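The effect of dimension, measure, Top N, and context filters can be sketched in Python on a small invented table. The function names are made up for the example. The last part assumes Tableau's usual evaluation order, in which a Top N filter is applied before ordinary dimension filters unless those are added to context.

```python
def dimension_filter(records, field, allowed):
    """Keep records whose categorical field is one of the allowed values."""
    return [r for r in records if r[field] in allowed]


def measure_filter(records, field, at_least=None, at_most=None):
    """Keep records whose numeric field lies within the given range."""
    kept = []
    for r in records:
        if at_least is not None and r[field] < at_least:
            continue
        if at_most is not None and r[field] > at_most:
            continue
        kept.append(r)
    return kept


def top_n(records, measure, n):
    """Keep the n records with the largest value of the measure."""
    return sorted(records, key=lambda r: r[measure], reverse=True)[:n]


products = [
    {"Product": "A", "Region": "East", "Sales": 500},
    {"Product": "B", "Region": "West", "Sales": 900},
    {"Product": "C", "Region": "East", "Sales": 300},
    {"Product": "D", "Region": "West", "Sales": 700},
    {"Product": "E", "Region": "East", "Sales": 100},
]

# Region filter used as a context: Top 2 is computed inside the East subset.
east = dimension_filter(products, "Region", {"East"})
top2_in_east = top_n(east, "Sales", 2)

# No context: Top 2 is computed on all products first, then Region is applied.
top2_overall = top_n(products, "Sales", 2)
east_among_top2 = dimension_filter(top2_overall, "Region", {"East"})
```

With the context, the result is products A and C. Without it, the overall top two are B and D, both in the West, so the East filter leaves nothing. This is the situation a context filter is meant to fix.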

Tableau Literal
In Tableau, a "literal" refers to a constant value or fixed text that you can include in calculated fields, parameters, or
other parts of your data visualization. Literals are static and do not change unless manually modified. They are used to
add specific values or text to your visualizations, which can be particularly useful for creating calculated fields or for
adding annotations to your charts.
There are a few types of literals you can use in Tableau:
1. Numeric Literal: A constant numeric value. For example, you might use a numeric literal in a calculated field to
perform mathematical operations. Here's an example that uses a numeric literal to calculate a 10% discount:

[Price] * 0.10
In this example, 0.10 is a numeric literal representing a 10% discount.
2. String Literal: A fixed text value enclosed in double quotation marks. String literals are often used for labels or
annotations in your visualizations. For instance, you might add a string literal as an annotation to a chart:
"Total Sales for 2023"
In this example, the string literal is "Total Sales for 2023."
3. Boolean Literal: A constant value representing either true or false, written as true or false. Boolean values are
commonly used in logical expressions. For example:
[Region] = "East"
In this example, the expression compares the field [Region] with the string literal "East" and evaluates to either true
or false.
4. Date Literal: A fixed date value that can be used in date calculations and date-based filters. Date literals are
typically expressed as #YYYY-MM-DD#. For example, you can use a date literal to create a filter for a specific date:
[Order Date] = #2023-10-15#
In this example, the date literal represents October 15, 2023.
Using literals in Tableau allows you to add context and specific values to your calculations, making your visualizations
more informative and tailored to your data. They are especially valuable in calculated fields where you need to perform
operations, create custom dimensions, or add labels to your charts.

Tableau Field
In Tableau, a "field" is a fundamental concept that represents a column of data in your dataset or data source. Fields are
essential for data visualization as they provide the structure and content for creating charts, graphs, tables, and other
types of visual representations.
Fields can be broadly categorized into two main types: dimensions and measures.
1. Dimensions:
Dimensions are categorical or discrete fields that represent qualitative data. They provide a way to segment and
categorize data into distinct groups or categories.
Examples of dimensions include:
 Product categories
 Geographic locations (e.g., countries, cities)
 Time-based data (e.g., months, days, years)
 Customer names
Dimensions are typically used to define the "slices" or "segments" in your visualizations. They determine how data is
grouped and categorized. You can drag and drop dimensions onto the Rows and Columns shelves to create categorical
views in your visualizations.
2. Measures:
Measures are quantitative or continuous fields that represent numeric data. They provide the basis for performing
mathematical calculations and aggregations.
Examples of measures include:
 Sales revenue
 Profit
 Quantity sold
 Temperature readings
Measures are used to perform various mathematical operations, such as SUM, AVG, MAX, MIN, etc., to analyze and
visualize numeric data. Measures are typically placed on the Rows or Columns shelf or on the Marks card in Tableau,
and they are used to create the "meat" of your charts and graphs.

Fields in Tableau are the building blocks for creating data visualizations. By combining dimensions and measures and
applying various functions and filters, you can generate a wide range of visualizations to explore, analyze, and
communicate insights from your data. Fields can be added to different parts of your visualization, such as the Rows and
Columns shelves, the Marks card, or the Filters shelf, to define the structure and appearance of your visualizations.
Fields are the core components that help you turn raw data into meaningful charts and dashboards in Tableau.
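How a dimension groups the data and a measure is aggregated within each group can be sketched in Python. The records and the aggregate function are invented for the illustration; in Tableau the same result comes from placing a dimension and an aggregated measure on the shelves.

```python
from collections import defaultdict


def aggregate(records, dimension, measure, func=sum):
    """Group records by a dimension and aggregate a measure within each group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[dimension]].append(r[measure])
    return {key: func(values) for key, values in groups.items()}


# Invented records: Category is a dimension, Sales is a measure.
sales = [
    {"Category": "Furniture", "Sales": 100},
    {"Category": "Furniture", "Sales": 200},
    {"Category": "Technology", "Sales": 400},
]

total_by_category = aggregate(sales, "Category", "Sales")     # like SUM([Sales])
max_by_category = aggregate(sales, "Category", "Sales", max)  # like MAX([Sales])
```

Here total_by_category is {"Furniture": 300, "Technology": 400}: one value per category, which is what a bar chart of Sales by Category would plot.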

Tableau Parameter
In Tableau, a "parameter" is a dynamic input that allows you to create more interactive and flexible data visualizations.
Parameters are a way to add an extra layer of control to your visualizations, allowing users to change specific values or
criteria and see how those changes affect the data and the visual representation. Parameters are particularly useful
when you want to provide users with options for customizing their views, performing what-if analysis, or exploring data
in various ways.
Here are some key points about parameters in data visualization:
1. Creating Parameters: You can create parameters in Tableau by defining a name and data type (e.g., string, number,
date) for the parameter. Parameters can be created from the "Data" pane or the "Analysis" menu.
2. Using Parameters: Parameters can be used in calculated fields, filters, and reference lines within your visualizations.
They serve as dynamic placeholders for values that users can change.
For example, you can use a parameter to create a calculated field that multiplies a measure by the parameter's
value, allowing users to adjust the multiplier.
3. Control Options: You can specify control options for parameters, such as setting a range of allowable values or
providing a list of predefined choices. Users can interact with the parameter through drop-down lists, sliders, or
input boxes, depending on how you configure it.
4. Dynamic Filtering: Parameters can be used as a way to create dynamic filters. For example, you can use a parameter
to filter data based on a specific dimension or measure, allowing users to change the filter criteria without editing
the view.
5. What-If Analysis: Parameters are valuable for conducting what-if analysis. Users can adjust parameter values to see
how changes impact the visualizations. For instance, you can create a parameter for a discount rate and observe how
different rates affect profit.
6. Dashboard Interaction: Parameters can be included in dashboards, enabling users to adjust values and instantly see
the impact on multiple visualizations. This enhances the interactivity of your dashboards.
7. Reference Lines and Bands: Parameters can be used to dynamically control reference lines and bands in your
visualizations. You can allow users to change reference values or thresholds using parameters.
8. Data Exploration: Parameters make it easier for users to explore data. They can quickly change criteria or apply
different filters without needing to access the underlying data source.
Here's a simplified example of how a parameter can be used in Tableau:
Suppose you have a parameter named "Discount Rate," and you create a calculated field that calculates discounted sales
using this parameter. Users can adjust the "Discount Rate" parameter, and the visualizations will automatically update to
reflect the new discounted sales figures.
Parameters in Tableau offer a powerful way to make your data visualizations more interactive, user-friendly, and
adaptable to different scenarios, allowing users to gain deeper insights from your data.
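The "Discount Rate" example can be sketched in Python, where the parameter becomes a function argument and the calculated field becomes the function body. The function name, the sales figures, and the allowed range of 0 to 1 are assumptions made for the illustration.

```python
def discounted_sales(sales_values, discount_rate):
    """Apply one user-chosen discount rate (the parameter) to every sales value."""
    if not 0.0 <= discount_rate <= 1.0:
        # Plays the role of the parameter's allowable range of values.
        raise ValueError("discount_rate must be between 0 and 1")
    return [s * (1 - discount_rate) for s in sales_values]


sales = [100.0, 200.0]  # invented sales figures

with_quarter_off = discounted_sales(sales, 0.25)  # [75.0, 150.0]
with_half_off = discounted_sales(sales, 0.5)      # [50.0, 100.0]
```

Changing only the rate changes every result, without touching the data or the formula. This is what a parameter control gives the user of a dashboard.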
Here's how parameters work in Tableau and their significance in data visualization:
1. Creating Parameters: Parameters are created through the Parameter dialog in Tableau. You specify the data type
(e.g., string, date, numeric), and you can define a range or list of allowable values.
2. Using Parameters: You can use parameters in various parts of your visualization, such as calculated fields, filters,
and reference lines. Essentially, you replace a constant value in your calculations with a parameter, making it
dynamic.
3. Parameter Controls: Parameters are typically displayed as a control in your visualization, often as a dropdown list,
slider, or input box. Users can interact with these controls to change the parameter's value.
4. Dynamic Data Visualization: Parameters enable dynamic data visualization. They allow users to change aspects of the
visualization, such as filtering data, adjusting thresholds, or selecting specific categories, on the fly.
Here are a few common use cases for parameters in data visualization:
 Filtering Data: Parameters can be used to create dynamic filters. For example, you can create a parameter for "Top
N" and allow users to select how many items should be displayed in a chart. This provides users with greater control
over what they see.
 Comparing Scenarios: You can create parameters to switch between different scenarios or measures in a dashboard.
For instance, you can create a parameter to switch between viewing sales revenue and profit.
 Adjusting Thresholds: Parameters can be used to adjust threshold values in your visualizations. Users can change a
parameter to set a different threshold for what is considered a "high" or "low" value.

 Highlighting Data: Parameters can control the color, size, or style of marks in a visualization. For example, you can
create a parameter to highlight specific data points based on user preferences.
 Custom Calculations: You can use parameters to allow users to input custom calculations. This can be particularly
useful when users want to perform ad-hoc calculations within a dashboard.

Tableau Comment
In Tableau, comments are a feature that allows you to add explanatory or descriptive notes to various elements within
your data visualization projects. Comments can be useful for providing context, explaining the purpose of a visualization,
sharing insights, or collaborating with team members on a specific project. Here's how you can use comments in
Tableau:

Worksheet and Dashboard Comments:


 You can add comments directly to worksheets and dashboards to provide explanations about the visualizations
and their components. To do this:
 Open the worksheet or dashboard you want to add a comment to.
 In the menu, go to "Worksheet" or "Dashboard," then select "Show Comments" to display the Comments
pane on the left side.
 Click on the "Add a Comment" button in the Comments pane.
 Type your comment in the text box and click "Save."
This is particularly useful when you want to explain the purpose of a dashboard, highlight key insights, or provide
context for specific charts and graphs.
Commenting on Data Fields:
 You can add comments to specific fields within your data source. This is helpful when you want to provide
information about the meaning or source of a particular field.
 In the Data Source tab, right-click on a field, then select "Create Comment" to add a comment about that
field.
Collaborative Comments:
 Tableau Online and Tableau Server offer collaborative features that allow team members to discuss and annotate
specific points on a visualization.
 You can add comments to a published workbook, which can be seen and responded to by other
authorized users.
 Select a point on the visualization, right-click, and choose "Add Comment" to start a conversation about a
particular data point.
Annotations:
 Annotations are similar to comments but are used to add explanatory text to specific data points in a
visualization.
 You can manually create annotations in a worksheet by selecting a data point and then choosing "Annotate"
from the toolbar.
 Annotations can also be created automatically when data points exceed certain thresholds or conditions set by
you.

Comments and annotations in Tableau help make your data visualizations more informative and facilitate collaboration
within your team. They provide a way to communicate insights, context, and data explanations directly within your
visualizations, which can be invaluable for decision-making and understanding complex data.

Simple Text Visualization:
Text visualization:
Definition: The text visualization chart is the graphical representation of qualitative data frequency, such as keywords or
customer feedback.
The graph gives greater prominence to words that appear more frequently in a source text. The larger the word, the
higher its frequency. You can use the chart to perform exploratory textual analysis by identifying words that frequently
appear in a set of interviews, documents, or other text. Also, you can use it to communicate the most salient points or
themes in the reporting stage.
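The rule "the larger the word, the higher its frequency" can be sketched in Python. This is a minimal illustration, not the method of any particular word cloud tool: the function names, the linear scaling, and the size range of 10 to 40 are assumptions.

```python
import re
from collections import Counter


def word_frequencies(text, stop_words=frozenset()):
    """Count how often each word occurs, ignoring case and any stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def font_sizes(frequencies, min_size=10, max_size=40):
    """Map each word's count linearly onto a font size between min_size and max_size."""
    if not frequencies:
        return {}
    low = min(frequencies.values())
    span = max(frequencies.values()) - low
    sizes = {}
    for word, count in frequencies.items():
        scale = (count - low) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


feedback = "Great service, great price, slow delivery. Great!"
counts = word_frequencies(feedback)
sizes = font_sizes(counts)
```

In this invented feedback text, "great" occurs three times and gets the largest size, while the words that occur once get the smallest. A drawing library would then place the words using these sizes.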
Uses of text visualization charts:
 Summarize Large Amounts of Text: Automatically highlight key terms in a series of texts, and categorize text by
topic, sentiment, and more, saving hours of reading time. With a text visualization or data visualization dashboard,
you can understand text data at a glance.
 Make Text Data Easy to Understand: Visual data is often said to be processed far faster by the brain than text and
numbers (a figure of 60,000 times is frequently quoted). Text visualization examples effectively simplify complex data
and communicate ideas and concepts to team managers.
 Find Insights in Qualitative Data: Customer feedback holds a trove of insights. Through text visualization examples,
you can get an overview of the features, products, and topics that are most important to your customers.
 Discover Hidden Trends and Patterns: You can easily analyze and visualize insights over time to detect fluctuations
and quickly find the root cause. Extracting reliable insights from qualitative data sets, such as keywords, should not
be a weak point in your analysis.
Why do we need text visualization?
 Text visualization can help reveal your audience’s thoughts: You can use the chart to understand your audience’s
feelings about a topic or situation. Besides, you can leverage the chart to summarize data-driven views. The chart can
help you summarize the market feedback using first-hand data.
 Quick and informative: You can easily get live feedback from your audience in real time.
 Exciting and emotional: The chart can help audiences feel part of your data story.
 Engaging: The word cloud is engaging and visually appealing to many audiences. The chart can be an
icebreaker or an entry point for a topic of discussion.
 Word clouds are visual: Visual content is often said to be processed far faster than text and numbers, which provides
a rationale for using a word cloud generator to analyze your textual data for actionable insights.
The top four text visualization examples are:
(1) Word Cloud

(2) Tag Cloud

(3) Slope Chart

(4) Sankey Chart

Visual displays of information as a form of Table
A data table, or a spreadsheet, is an efficient format for comparative data analysis on categorical objects. Usually, the
items being compared are placed in the columns, while the categorical objects are in the rows. The quantitative value is
then placed at the intersection of the row and column, called the cell. The following example demonstrates a data table.
This table compares monthly payments for buying or leasing various cars (categories). The first two columns are being
compared; the other columns contain additional, secondary information.

There are different types of graphs for displaying information:


Bar Graph
A bar graph should be used to avoid clutter when one data label is long or if you have more than 10 items to compare.
Best Use Cases for These Types of Graphs:
Bar graphs can help you compare data between different groups or track changes over time. Bar graphs are most useful
when there are big changes or to show how one group compares against other groups.
The example above compares the number of customers by business role. It makes it easy to see that individual
contributors account for more than twice as many customers as any other role. A bar graph also makes it easy to see
which group of data is highest or most common.
For example, at the start of the pandemic, online businesses saw a big jump in traffic. So, if you want to look at
monthly traffic for an online business, a bar graph would make it easy to see that jump.
Other use cases for bar graphs include:
- Product comparisons.
- Product usage.
- Category comparisons.
- Marketing traffic by month or year.
- Marketing conversions.
Design Best Practices for Bar Graphs:
- Use consistent colors throughout the chart, selecting accent colors to highlight meaningful data points or changes
over time.
- Use horizontal labels to improve readability.
- Start the y-axis at 0 to appropriately reflect the values in your graph.

Limitations of bar graphs:
1. Ineffectiveness for Continuous Data: Bar graphs are not well-suited for representing continuous data, where values
can take on any number within a range. For such data, histograms or line charts may be more appropriate.
2. Complexity with Many Categories: When dealing with a large number of categories or data points, creating a bar
graph can result in a cluttered and hard-to-read visualization. It becomes inefficient to represent numerous
categories using bars. In such cases, alternatives like treemaps or scatter plots may be more effective.
3. Single-Dimension Data: Bar graphs primarily display data for one variable or dimension at a time. They are not
designed for representing relationships or correlations between multiple variables. If you need to show how two or
more variables interact, other visualizations like scatter plots or bubble charts are more suitable.
4. Precision in Value Comparison: Precisely comparing the values of bars can be difficult, especially when the values
are very close to each other. Users often need to rely on visual estimation, which can lead to inaccuracies. For
precise value comparison, tables or dot plots may be better choices.
5. Challenges with Negative Values: Bar graphs are typically used for representing positive values. When negative
values are involved, it may be less intuitive and may require additional annotation or explanation.
6. Misleading Scaling: In some cases, manipulating the scaling of the y-axis can exaggerate or diminish differences
between data points. This can lead to visual distortions and misinterpretation. It is important to ensure that the
scaling is appropriate and does not misrepresent the data.
7. Not Ideal for Time-Series Data: While bar graphs can be used to display time-series data, they may not effectively
capture trends and patterns over time. Line charts or area charts are often better choices for representing
time-based data.
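The misleading-scaling limitation (item 6) and the advice to start the y-axis at 0 can be checked with a small calculation. The sketch below uses a hypothetical helper, `drawn_height_ratio`, and made-up values; it compares how tall two bars appear relative to each other when the axis starts at 0 and when it starts at a higher baseline.

```python
def drawn_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the y-axis starts at `baseline` instead of 0."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)

# Made-up values: the first bar is really 10% larger than the second.
honest = drawn_height_ratio(110, 100)          # axis starts at 0
truncated = drawn_height_ratio(110, 100, 90)   # axis starts at 90

print(honest, truncated)
```

With the axis truncated at 90, the first bar is drawn twice as tall as the second even though the underlying values differ by only 10%.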

Line Graph
A line graph reveals trends or progress over time, and you can use it to show many different categories of data. You
should use it when you chart a continuous data set.

Best Use Cases for These Types of Graphs:


Line graphs help users track changes over short and long periods. Because of this, these types of graphs are good for seeing small changes. Line graphs can help you compare changes for more than one group over the same period. They're also helpful for measuring how different groups relate to each other. A business might use this graph to compare sales rates for different products or services over time. These charts are also helpful for measuring service channel performance, for example with a line graph that tracks how many chats or emails your team responds to per month.

Design Best Practices for Line Graphs:


- Use solid lines only.
- Don't plot more than four lines to avoid visual distractions.
- Use the right height so the lines take up roughly 2/3 of the y-axis' height.
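The two-thirds guideline above can be turned into axis limits. The following is a minimal sketch, assuming the spare room is split evenly above and below the data; that split, the function name `y_limits_for_two_thirds`, and the sample values are choices made here for illustration.

```python
def y_limits_for_two_thirds(values, fill=2 / 3):
    """Return (bottom, top) axis limits so the data range
    occupies roughly `fill` of the y-axis height."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: there is no range to scale, so just add some room.
        return lo - 1, hi + 1
    total = span / fill        # full axis height needed
    pad = (total - span) / 2   # spare room, split above and below
    return lo - pad, hi + pad

bottom, top = y_limits_for_two_thirds([10, 25, 40])
print(bottom, top)
```

Most plotting libraries accept limits like these directly when setting the y-axis range.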

Features of Line Graphs:
1. Time-Series Representation: Line graphs are especially effective for representing data over time. They can show
trends, patterns, and fluctuations in data, making them ideal for tracking changes and developments.
2. Sequential Data: Line graphs are designed to display data points in a sequence. This sequential representation is
particularly useful for showing how data evolves over time, which is common in areas like finance, stock market
analysis, and weather forecasting.
3. Continuous Data: Line graphs are well-suited for continuous data, such as temperature, stock prices, or population
growth, where values can vary across a continuous range.
4. Clear Trend Identification: Line graphs make it easy to identify trends, whether they are upward, downward, or
stable. They are great for visualizing changes in data over time, helping users to spot patterns and irregularities.
5. Interpolation of Values: Line graphs allow for interpolation between data points, which helps in estimating values
between known data points. This can be especially useful for understanding the behavior of data within a time
period.
6. Comparison of Multiple Series: Line graphs can display multiple lines on the same chart, enabling the comparison of
several related datasets simultaneously. This makes it easy to analyze how different variables or categories evolve
over time.
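The interpolation mentioned in item 5 is what a reader does implicitly when reading a value off the line between two plotted points. A minimal sketch of that calculation, assuming straight segments between points and using made-up data, might look like this:

```python
def interpolate(points, x):
    """Estimate y at x by linear interpolation.

    `points` is a list of (x, y) pairs sorted by x, with distinct x values.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)   # position between the two points, 0..1
            return y0 + t * (y1 - y0)
    raise ValueError("x is outside the observed range")

# Made-up monthly readings: (month, value).
readings = [(1, 100.0), (3, 140.0), (4, 130.0)]
estimate = interpolate(readings, 2)    # month 2 was never measured
print(estimate)
```

The estimate is only as good as the assumption that the value changed steadily between the two neighbouring observations.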

Limitations of Line Graphs:


1. Not Suitable for Categorical Data: Line graphs are not appropriate for representing categorical or discrete data
where values don't have a natural order. For categorical data, bar charts or pie charts are more suitable.
2. Overcrowding with Many Data Points: When you have a large number of data points, a line graph can become
cluttered and hard to interpret. This is particularly an issue if the data is too densely packed along the x-axis.
3. Non-Specific Trends: While line graphs can show trends and patterns, they may not provide specific information
about individual data points. If precise values are required, additional data points or labels may be needed.
4. Implies Steady Change Between Points: A line graph joins consecutive data points with straight segments, which
suggests that values changed steadily between observations. If the data is sparsely sampled or follows complex
non-linear patterns, the line might not capture those nuances.
5. Negative Values Need a Reference: Line graphs can display negative values, but the chart should include a clearly
marked zero line so that readers can tell where values cross from positive to negative.
6. Scaling Issues: The scaling of the y-axis can significantly impact the interpretation of the data. Improper scaling can
make trends appear steeper or flatter than they are, leading to misinterpretations.
7. Difficulty in Comparing Across Categories: Line graphs are more focused on comparing changes over time within a
single category or variable. They may not be as efficient at comparing different categories or variables to each other.

Area Chart
An area chart is basically a line chart, but the space between the x-axis and the line is filled with a color or pattern. It is useful for showing part-to-whole relations, like showing individual sales reps’ contributions to total sales for a year. It helps you analyze both overall and individual trend information.

Best Use Cases for These Types of Charts:
Area charts help show changes over time. They work best for big differences between data sets and help visualize big trends. For example, the chart above shows users by creation date and life cycle stage. A line chart could show more subscribers than marketing qualified leads. But this area chart emphasizes how much bigger the number of subscribers is than any other group. These charts make the size of a group and how groups relate to each other more visually important than data changes over time.
Area graphs can help your business to:
- Visualize which product categories or products within a category are most popular.
- Show key performance indicator (KPI) goals vs. outcomes.
- Spot and analyze industry trends.
Design Best Practices for Area Charts:
- Use transparent colors so information isn't obscured in the background.
- Don't display more than four categories to avoid clutter.
- Organize highly variable data at the top of the chart to make it easy to read.
Features of Area Charts:
1. Time-Series Representation: Like line graphs, area charts are effective for representing data over time. They can
convey trends, patterns, and fluctuations, making them suitable for tracking changes and developments.
2. Sequential Data: Area charts are designed to display data points in a sequence, making them well-suited for showing
how data evolves over time. They are particularly useful when you want to emphasize the cumulative nature of data.
3. Continuous Data: Area charts work well with continuous data, just like line graphs. Continuous data, such as
temperature, stock prices, or population growth, can be visually depicted using area charts.
4. Emphasis on Cumulative Values: Area charts provide a clear emphasis on cumulative values. They are great for
illustrating the total effect of several variables or categories over time, showing how individual components
contribute to the whole.
5. Visualizing Change Over Time: Area charts are effective at highlighting changes and trends in data. They emphasize
not only the values at a specific point but also the overall effect of those values over the entire time period.
6. Comparison of Multiple Series: Just like line graphs, area charts can display multiple series on the same chart. This
allows for easy comparison of several related datasets or categories, showcasing their cumulative contributions over
time.
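The cumulative emphasis described in items 4 and 6 comes from stacking: each series is drawn on top of the running total of the series before it. The sketch below shows that calculation with a hypothetical `stack_series` helper and made-up sales figures.

```python
def stack_series(series):
    """Return {name: (lower, upper)} boundaries for a stacked area chart.

    `series` maps a name to a list of values, all of the same length.
    """
    length = len(next(iter(series.values())))
    baseline = [0] * length
    bands = {}
    for name, values in series.items():
        # Each band starts where the previous one ended.
        upper = [b + v for b, v in zip(baseline, values)]
        bands[name] = (baseline, upper)
        baseline = upper
    return bands

# Made-up quarterly sales for two sales reps.
sales = {"Rep A": [1, 2, 3], "Rep B": [4, 5, 6]}
bands = stack_series(sales)
print(bands["Rep B"])
```

The upper boundary of the last band is the overall total, which is why a stacked area chart shows both individual contributions and the whole.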
Limitations of Area Charts:
1. Not Suitable for Categorical Data: Area charts are not appropriate for representing categorical or discrete data
where values don't have a natural order. For categorical data, bar charts or pie charts are better choices.
2. Overcrowding with Many Data Points: When there is a large number of data points or categories, area charts can
become overcrowded and challenging to interpret. Clutter can obscure the trends or patterns.
3. Non-Specific Trends: Similar to line graphs, area charts may not provide specific information about individual data
points. If precise values are required, additional data points or labels may be needed.
4. Implies Steady Change Between Points: Like line graphs, area charts join consecutive data points with straight
segments, which suggests a steady change between observations. Complex non-linear patterns or multiple trends
may not be effectively represented.
5. Scaling Issues: The scaling of the y-axis can significantly impact the interpretation of the data. Inappropriate scaling
can exaggerate or diminish the apparent size of trends, leading to misinterpretations.
6. Difficulty in Comparing Across Categories: Area charts are focused on comparing cumulative values and trends
within a single category or variable over time. They may not be as efficient at comparing different categories or
variables to each other.

Pie Chart
A pie chart shows a static number and how categories represent part of a whole — the composition of something. A pie
chart represents numbers in percentages, and the total sum of all segments needs to equal 100%.
Best Use Cases for This Type of Chart:
The image above shows another example of customers by role in the company. The bar graph example shows you that
there are more individual contributors than any other role. But this pie chart makes it clear that they make up over 50%
of customer roles.
Pie charts make it easy to see a section in relation to the whole, so they are good for showing:
- Customer personas in relation to all customers.
- Revenue from your most popular products or product types in relation to all product sales.
- Percent of total profit from different store locations.

Design Best Practices for Pie Charts:
- Don't illustrate too many categories to ensure differentiation between slices.
- Ensure that the slice values add up to 100%.
- Order slices according to their size.
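The last two practices above (values that add up to 100% and slices ordered by size) can be sketched as a small calculation. The helper name `pie_slices` and the role counts are made up for illustration.

```python
def pie_slices(counts):
    """Return (label, percent, angle_in_degrees) per slice, largest first."""
    total = sum(counts.values())
    if total <= 0 or any(v < 0 for v in counts.values()):
        raise ValueError("pie charts need non-negative values with a positive total")
    # Order slices by size so the largest share is read first.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(label, 100 * v / total, 360 * v / total) for label, v in ordered]

# Made-up customer counts by role.
slices = pie_slices({"Manager": 1, "Individual contributor": 3})
for label, percent, angle in slices:
    print(label, percent, angle)
```

Because every percentage is computed from the same total, the slices always sum to 100% and the angles to 360 degrees, apart from floating-point rounding.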

Features of Pie Charts:


1. Part-to-Whole Representation: Pie charts are designed to show the relationship between parts and the whole. They
are ideal for illustrating how various components or categories contribute to a total, with each "slice" representing a
portion of the whole.
2. Percentage Representation: Each segment in a pie chart is typically represented as a percentage of the whole,
making it easy to see the relative contribution of each category. This can help in quickly identifying the largest and
smallest categories.
3. Simple and Intuitive: Pie charts are simple and intuitive for most audiences. People are generally familiar with the
concept of a pie chart, which makes it easy to understand at a glance.
4. Useful for Showing Relative Proportions: Pie charts excel at showing the relative proportions of categories within a
dataset. They are particularly effective when you want to emphasize the distribution or composition of a whole.
5. Limited Number of Categories: Pie charts work well when there is a limited number of categories to display. A small
number of slices (e.g., 2 to 6) ensures that the chart remains clear and easy to read.

Limitations of Pie Charts:


1. Inefficient for Many Categories: Pie charts become inefficient and confusing when you have a large number of
categories or slices. Too many slices make the chart cluttered and difficult to interpret.
2. Difficulty in Comparing Categories: Pie charts are not well-suited for comparing categories with similar sizes. It can
be challenging to distinguish small differences, leading to imprecise comparisons.
3. Lack of Precision: While pie charts are effective at showing relative proportions, they do not provide precise values
for each category. If you need precise numerical information, bar charts or tables are more suitable.
4. Difficulty in Ranking Data: Pie charts do not effectively show the ranking or order of categories. It's difficult to
determine which category is the largest or smallest without referring to the percentages or labels.
5. Common Misrepresentation: Pie charts are often misused or misinterpreted. They can be misleading when the size
of the slices is not proportionate to the data, or when the slices are not labeled accurately.
6. Limited Data Exploration: Pie charts are primarily designed for conveying a snapshot of relative proportions. They
are not well-suited for exploring complex data relationships or trends over time.
7. Ineffective for Negative Values: Pie charts cannot represent negative values or values that fall below zero. They are
designed for positive values only.
8. Less Effective in Showing Trends: Unlike other chart types, such as line charts or bar charts, pie charts are not
effective at showing trends or changes over time.

Scatter Plot Chart
A scatter plot, or scattergram, shows the relationship between two different variables or reveals distribution
trends. Use this chart when there are many different data points and you want to highlight similarities in the data set.
This is useful when looking for outliers or understanding your data's distribution.

Best Use Cases for These Types of Charts:


Scatter plots are helpful in situations where you have too much data to see a pattern quickly. They are best when you
use them to show relationships between two large data sets. In the example above, this chart shows how customer
happiness relates to the time it takes for them to get a response. This type of graph makes it easy to compare two data
sets. Use cases might include:
- Employment and manufacturing output.
- Retail sales and inflation.
- Visitor numbers and outdoor temperature.
- Sales growth and tax laws.
Try to choose two data sets that already have a positive or negative relationship. That said, this type of graph can also
make it easier to see data that falls outside of normal patterns.
Design Best Practices for Scatter Plots:
- Include more variables, like different sizes, to incorporate more data.
- Start the y-axis at 0 to represent data accurately.
- If you use trend lines, only use a maximum of two to make your plot easy to understand.

Features of Scatter Plots:


1. Relationship Exploration: Scatter plots are particularly effective for exploring relationships between two numeric
variables. They visually represent the distribution of data points and highlight patterns and trends.
2. Data Density: Scatter plots can handle a high density of data points, making them suitable for displaying large
datasets without overcrowding or compromising clarity.
3. Identifying Outliers: Scatter plots can easily identify outliers, which are data points that fall far from the main cluster
of data. Outliers can be identified as data points located away from the central trend in the plot.
4. Correlation Assessment: Scatter plots are valuable for assessing the strength and direction of correlation between
two variables. Positive, negative, or no correlation can be easily observed.
5. Pattern Identification: They can reveal various patterns, including linear relationships, exponential growth,
clustering, and non-linear trends, which can provide insights into data behavior.
6. Data Distribution: Scatter plots display the distribution of data points, showing the spread and concentration of data
values. This is particularly useful for understanding data characteristics.
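The correlation assessment in item 4 is usually done by eye, but it can be backed by a number. As a rough sketch, the Pearson correlation coefficient measures the strength and direction of a linear relationship; it is one common choice, not something the text above prescribes, and the data here is made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Made-up data: response time in hours and a satisfaction score.
response_hours = [1, 2, 3, 4, 5]
satisfaction = [9, 8, 6, 5, 3]
r = pearson_r(response_hours, satisfaction)
print(r)
```

Values near +1 or -1 correspond to points that fall close to a rising or falling straight line; values near 0 mean no linear pattern, although a non-linear one may still be visible in the plot.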
Limitations of Scatter Plots:
1. Limited for One-Dimensional Data: Scatter plots are designed for representing the relationship between two
variables. They are not suitable for one-dimensional data or data with more than two variables.
2. Complexity in Multivariate Data: When you need to analyze relationships between more than two variables, scatter
plots become less effective. In such cases, alternative visualization methods, such as parallel coordinate plots, may
be more appropriate.
3. Difficulty in Labeling Many Data Points: When there are many data points on the plot, labeling each point can be
impractical. It becomes challenging to identify individual data points.
4. Misleading in Non-Linear Scales: Scatter plots can be misleading if the axes are not displayed on a linear scale. Non-
linear scales may distort the relationship between data points.
5. Limited for Categorical Data: Scatter plots are primarily designed for numeric data. They may not effectively
represent categorical or discrete data, which is better visualized using bar charts or box plots.
6. Interpretation Complexity: Scatter plots require some level of interpretation, especially when there are numerous
data points or patterns. Users must be able to discern the trends and relationships within the plot.
7. Lack of Precision: While scatter plots provide a visual overview of data, they do not provide precise numerical values
for each data point. If you need precise values, you may need to supplement the plot with a table or another chart
type.

Bubble Chart
A bubble chart is similar to a scatter plot in that it can show distribution or relationship. There is a third data set shown
by the size of the bubble or circle.

Best Use Cases for This Type of Chart:


In the example above, the number of hours spent online isn't just compared to the user's age, as it would be on a scatter
plot chart. Instead, you can also see how the gender of the user impacts time spent online. This makes bubble charts
useful for seeing the rise or fall of trends over time. It also lets you add another option when you're trying to understand
relationships between different segments or categories.
For example, if you want to launch a new product, this chart could help you quickly see your new product's cost, risk,
and value. This can help you focus your energies on a low-risk new product with a high potential return.

You can also use bubble charts for:


- Top sales by month and location.
- Customer satisfaction surveys.
- Store performance tracking.
- Marketing campaign reviews.
Design Best Practices for Bubble Charts:
- Scale bubbles according to area, not diameter.
- Make sure labels are clear and visible.
- Use circular shapes only.
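Scaling by area rather than diameter matters because a circle's area grows with the square of its radius. A minimal sketch of the conversion follows; the function name `bubble_radius` and the pixel size are assumptions made for illustration.

```python
from math import sqrt

def bubble_radius(value, max_value, max_radius=30.0):
    """Radius that makes the bubble's AREA proportional to `value`.

    The largest value in the data gets `max_radius`.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("bubble sizes need non-negative values")
    # Area is proportional to radius squared, so take the square root.
    return max_radius * sqrt(value / max_value)

# Made-up values: one bubble represents a quarter of the largest value.
small = bubble_radius(25, 100)
large = bubble_radius(100, 100)
print(small, large)
```

Here the smaller bubble gets half the radius of the larger one, so its area is one quarter, matching the data. Scaling the radius directly by the value would have made it look sixteen times smaller instead of four.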
Features of Bubble Charts:
1. Three-Dimensional Data: Bubble charts represent data in three dimensions, where two numeric variables are
displayed on the x and y axes, and the third variable is represented by the size of the bubbles. This allows you to
explore relationships and patterns among these three variables simultaneously.
2. Size Encoding: The size of each bubble in the chart is used to encode a specific data attribute or value. This allows
you to convey additional information or context beyond what is shown in a typical scatter plot.
3. Data Density: Like scatter plots, bubble charts can handle a high density of data points. They are particularly useful
when you have a large dataset and want to visualize the distribution of data while showing the relative magnitude of
a third variable.
4. Comparison of Multiple Dimensions: Bubble charts are suitable for comparing data points across three dimensions,
making them valuable for multivariate analysis and identifying relationships or trends that may not be evident in
two-dimensional charts.
5. Identifying Outliers: The use of bubble size allows for the identification of outliers or data points that significantly
differ from the rest. Large bubbles can quickly capture attention and represent unusual or important data points.
6. Customization: Bubble charts often allow for customization of bubble size, color, and labeling, enabling you to
convey specific information and insights as needed.

Limitations of Bubble Charts:


1. Limited to Three Variables: Bubble charts are suitable for representing three variables at most—two on the x and y
axes and the third using bubble size. When you have more than three variables, other visualization techniques may
be required.
2. Complexity with Many Bubbles: As the number of bubbles increases, the chart can become cluttered and hard to
interpret. Labeling each bubble in a chart with many data points can be challenging.
3. Difficulty in Exact Comparison: It can be challenging to make precise comparisons between bubbles when their sizes
are used to encode data values. Accurate comparisons often require a legend or additional reference points.
4. Non-Linearity of Bubble Size: The perception of the relative magnitude of data values may not be linear with the
bubble size. Larger bubbles may appear much larger than the data values they represent, leading to potential
misinterpretation.
5. Challenges with Negative Values: Bubble charts are typically used for visualizing positive values. They may not
effectively represent negative values or values that fall below zero.
6. Overlapping Bubbles: When bubbles overlap due to similar x and y values, it can be challenging to distinguish
individual data points. This can be especially problematic when there are many overlapping bubbles.

Waterfall Chart
Use a waterfall chart to show how an initial value changes through intermediate values, either positive or negative, and
results in a final value. Use this chart to reveal the composition of a number. An example of this would be to showcase
how different departments influence overall company revenue and lead to a specific profit number.

Best Use Cases for This Type of Chart:


These types of charts make it easier to understand how internal and external factors impact a product or campaign as a
whole.
In the example above, the chart moves from the starting balance on the far left to the ending balance on the far right.
Factors in the center include deposits, transfers in and out, and bank fees.
A waterfall chart offers a quick visual, making complex processes and outcomes easier to see and troubleshoot. For
example, SaaS companies often measure customer churn. This format can help visualize changes in new, current, and
free trial users or changes by user segment.
You may also want to try a waterfall chart to show:
- Changes in revenue or profit over time.
- Inventory audits.
- Employee staffing reviews.

Design Best Practices for Waterfall Charts:


- Use contrasting colors to highlight differences in data sets.
- Choose warm colors to indicate increases and cool colors to indicate decreases.

Features of Waterfall Charts:
1. Change Analysis: Waterfall charts are primarily used for analyzing changes and variations in a particular measure or
value. They are effective at illustrating how an initial value transitions through various stages to reach a final value.
2. Component Breakdown: Waterfall charts break down the total change into its individual components. Each
component (e.g., increase, decrease, or neutral) is represented as a floating bar, showing how it contributes to the
overall change.
3. Sequential Representation: Waterfall charts display data in a sequential, step-by-step manner, making it easy to
follow the progression from one stage to the next. This allows for clear understanding of how each component
affects the total.
4. Value Attribution: Waterfall charts provide insight into the attribution of value to different factors. They help in
understanding the drivers of changes, such as the contribution of various expenses to profit or the breakdown of
revenue by product category.
5. Total to Components: The total of the waterfall chart is typically represented by a single column at the beginning or
end of the chart, and the components that contribute to the total are shown as floating bars. This provides a clear
visual transition from total to components.
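The floating bars described in items 2 and 5 are positioned from a running total: each bar starts where the previous one ended. The sketch below computes those positions; the helper name `waterfall_bars` and the bank balance figures are made up for illustration.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, top) for every bar in a waterfall chart.

    `changes` is a list of (label, delta) pairs applied in order.
    """
    # Total bars are anchored at zero; min/max handles a negative total.
    bars = [("Start", min(0, start), max(0, start))]
    running = start
    for label, delta in changes:
        new_total = running + delta
        # A floating bar spans from the old running total to the new one.
        bars.append((label, min(running, new_total), max(running, new_total)))
        running = new_total
    bars.append(("End", min(0, running), max(0, running)))
    return bars

# Made-up account activity.
bars = waterfall_bars(100, [("Deposits", 50), ("Bank fees", -20)])
for bar in bars:
    print(bar)
```

In a plotting library, the `bottom` value is typically passed as the bar's base and `top - bottom` as its height, with color indicating whether the change was an increase or a decrease.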
Limitations of Waterfall Charts:
1. Limited to Change Analysis: Waterfall charts are designed for change analysis and are less suitable for representing
static data or other types of data relationships. They may not effectively convey other types of insights.
2. Complexity with Many Components: When there are many components in a waterfall chart, it can become complex
and challenging to read. Overly detailed charts can lead to visual clutter and reduced clarity.
3. Not Suitable for Multivariate Analysis: Waterfall charts are primarily used for understanding the impact of individual
factors on a measure. They are not designed for analyzing multiple variables or comparing multiple measures.
4. Limited for Time-Series Data: While waterfall charts can represent changes over time, they may not be the best
choice for capturing complex trends and patterns in time-series data. Line charts or area charts may be more
effective for that purpose.
5. Interpretation Required: Waterfall charts may require some level of interpretation, especially when there are many
components. Users need to understand how to follow the sequence and interpret the chart's structure.
6. Limited for Categorical Data: Waterfall charts are most effective for showing changes in numeric data. They are not
suitable for representing categorical or discrete data, which can be better visualized using bar charts, pie charts, or
other methods.
Heat Map
A heat map shows the relationship between two items and provides rating information, such as high to low or poor to
excellent. This chart displays the rating information using varying colors or saturation.

Best Use Cases for Heat Maps:


In the example above, darker shades of green show where the majority of people agree. With enough data, heat
maps can make a viewpoint that might seem subjective more concrete. This makes it easier for a business to act on
customer sentiment.
There are many uses for these types of charts. In fact, many tech companies use heat map tools to gauge user
experience for apps, online tools, and website design.
Another common use for heat map graphs is location assessment. If you're trying to find the right location for your new
store, these maps can give you an idea of what the area is like in ways that a visit can't communicate.
Heat maps can also help with spotting patterns, so they're good for analyzing trends that change quickly, like ad
conversions. They can also help with:
- Competitor research.
- Customer sentiment.
- Sales outreach.
- Campaign impact.
- Customer demographics.
Design Best Practices for Heat Maps:
- Use a basic and clear map outline to avoid distracting from the data.
- Use a single color in varying shades to show changes in data.
- Avoid using multiple patterns.

Features of Heat Maps:
1. Data Density: Heat maps are excellent for visualizing high-density data, particularly when you have large datasets
with many data points. They effectively represent data distributions.
2. Two-Dimensional Data: Heat maps are primarily designed for displaying two-dimensional data, such as a matrix of
values or a grid of categories. They are particularly useful for visualizing relationships between two variables.
3. Color Encoding: Heat maps use color to encode data values, with different colors representing different values. This
allows users to quickly identify patterns and variations in the data.
4. Pattern Recognition: Heat maps are effective at revealing patterns, clusters, and trends within data. They can be
used for identifying hotspots (high values) and cold spots (low values) in the data.
5. Customization: Users can often customize the color scheme, making it possible to highlight specific ranges or
features within the data. This enables tailored visualization to convey different insights.
6. Hierarchical Data: Heat maps can be combined with dendrograms (clustered heat maps) to visualize hierarchical
structure, showing how rows and columns group into clusters and subclusters.
7. Interactivity: In digital environments, heat maps can be interactive. Users can hover over cells or regions to see
specific values or access additional details.

Limitations of Heat Maps:


1. Limited to Two Variables: Heat maps are primarily designed for representing two variables. If you have more than
two variables, it may be challenging to capture the full complexity of the data.
2. Color Blindness Issues: The effectiveness of heat maps relies on color distinctions. However, users with color
blindness may have difficulty interpreting these distinctions. Design considerations must be made for accessibility.
3. Complexity with Large Datasets: With very large datasets, heat maps can become overcrowded and challenging to
read. This is especially true if the data matrix has many rows and columns.
4. Non-Numerical Data: Heat maps are best suited for visualizing numerical data. They may not be suitable for
categorical or text data without some form of aggregation or transformation.
5. Interpretation Complexity: Heat maps require some level of interpretation, especially when there are many data
points. Users must be able to discern patterns and variations in the data.
6. Size of Cells: The size of cells in a heat map can impact the level of detail. Smaller cells can provide more granularity
but may be harder to read, while larger cells may lose detail.
7. Scaling Issues: The scaling and normalization of data in heat maps can influence the visual representation. Care must
be taken to ensure that the chosen scaling is appropriate for the data and does not distort patterns.
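The scaling step in item 7 is often a min-max normalization: every value is mapped to a number between 0 and 1, which then selects a shade from a single-color scale. This is one common choice rather than the only one, and the function name and data below are made up.

```python
def normalize_matrix(matrix):
    """Min-max scale every cell to the range 0..1 (0 = lightest, 1 = darkest)."""
    flat = [value for row in matrix for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # All cells are equal, so there is no variation to show.
        return [[0.0 for _ in row] for row in matrix]
    return [[(value - lo) / (hi - lo) for value in row] for row in matrix]

# Made-up survey agreement counts (rows: questions, columns: groups).
counts = [[0, 5], [10, 5]]
shades = normalize_matrix(counts)
print(shades)
```

Min-max scaling is sensitive to outliers: one extreme cell pushes all other cells toward the light end of the scale, which is the kind of distortion the item above warns about.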

Gantt Chart
The Gantt chart is a horizontal chart that dates back to 1917. This chart maps the different tasks completed over a period of time. Gantt charting is one of the most essential tools for project managers. It brings all the completed and uncompleted tasks into one place and tracks the progress of each. While the left side of the chart displays all the tasks, the right side shows the progress and schedule for each of these tasks.

This chart type allows you to:
- Break projects into tasks.
- Track the start and end of the tasks.
- Set important events, meetings, and announcements.
- Assign tasks to the team and individuals.
Best Use Cases for This Type of Chart:
Gantt charts are perfect for analyzing, roadmapping, and monitoring progress over a period of time. The chart above
divides the different tasks involved in product creation. Each of these tasks has a timeline that can be mapped on the
calendar view. From the vision and strategy to the seed funding round, the Gantt chart helps project management teams
build long-term strategies.
The best part? You can bring the stakeholders, project team, and managers to a single place.
You can use Gantt charts in various tasks, including:
- Tracking employee records in human resources.
- Tracking sales leads in a sales process.
- Planning and tracking construction work.
Design Best Practices for Gantt Charts:
- Use the same colors for a similar group of activities.
- Make sure to label the task dependencies to map project start and completion.
- Use light colors that align with the texts and grids of the chart.
Features of Gantt Charts:
1. Task Scheduling: Gantt charts are designed for scheduling and managing tasks and activities. They provide a visual
representation of when tasks start and end.
2. Time Sequencing: Gantt charts display tasks along a timeline, making it easy to see the sequence of tasks, their
durations, and overlaps.
3. Resource Allocation: Gantt charts can be used to allocate resources to tasks. This helps in balancing workloads and
ensuring resources are used efficiently.
4. Dependencies: Gantt charts allow you to show task dependencies, indicating which tasks must be completed before
others can begin. This is crucial for understanding the project's critical path.
5. Progress Tracking: Gantt charts enable real-time tracking of task progress. You can update and adjust task durations
as work is completed or delayed.
6. Customization: Gantt charts are often customizable, allowing you to add task details, assign responsible parties, and
add milestones or critical dates.
7. Resource Visualization: Gantt charts can display resource usage, helping you see when specific resources are
allocated to different tasks.
8. Project Planning: They are essential for project planning, making it easier to see the project's timeline, identify
bottlenecks, and ensure tasks are completed in a logical order.
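The scheduling and dependency features above can be sketched in a few lines. The example below is only an illustration under simple assumptions: durations are whole days, every task starts as soon as its prerequisites finish, and the task names and the `schedule` and `render` helpers are hypothetical.

```python
def schedule(tasks):
    """Return {name: (start_day, end_day)} using the earliest possible start.

    `tasks` maps a task name to (duration_in_days, [names of prerequisite tasks]).
    A task starts as soon as all of its prerequisites have finished.
    """
    result = {}

    def finish(name, visiting=()):
        if name in visiting:
            raise ValueError(f"circular dependency at {name!r}")
        if name not in result:
            duration, prerequisites = tasks[name]
            start = max((finish(p, visiting + (name,)) for p in prerequisites), default=0)
            result[name] = (start, start + duration)
        return result[name][1]

    for name in tasks:
        finish(name)
    return result


def render(plan, order):
    """Draw one text row per task: task names on the left, bars on the timeline on the right."""
    width = max(len(name) for name in order)
    lines = []
    for name in order:
        start, end = plan[name]
        lines.append(f"{name.ljust(width)} |{' ' * start}{'#' * (end - start)}")
    return "\n".join(lines)


# Hypothetical project plan.
tasks = {
    "Design": (3, []),
    "Build": (4, ["Design"]),
    "Test": (2, ["Build"]),
    "Docs": (2, ["Design"]),
}
plan = schedule(tasks)
print(render(plan, list(tasks)))
```

Each `#` stands for one day. The longest chain of dependent tasks (Design, Build, Test in this sample) determines the earliest finish date of the whole project, which is the idea behind the critical path.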
Limitations of Gantt Charts:
1. Complexity with Large Projects: Gantt charts can become complex and overcrowded with many tasks and
dependencies. This can make them challenging to read and interpret.
2. Difficulty with Non-Linear Projects: While Gantt charts are excellent for linear projects with well-defined
dependencies, they may not be the best choice for complex, non-linear projects with multiple interrelated tasks.
3. Limited for Multidimensional Data: Gantt charts are primarily designed for scheduling tasks along a single timeline.
They may not be suitable for projects that involve multiple dimensions of data, such as budget allocation, task cost,
or resource availability.
4. Resource Overallocation: Gantt charts may not effectively handle situations where resources are overallocated to
multiple tasks at the same time. Managing resource constraints can be challenging.
5. Inadequate for Continuous Monitoring: Gantt charts are less effective for continuous monitoring of dynamic
projects, as they may require frequent updates to reflect real-time changes accurately.
6. Non-Numerical Data: Gantt charts are not well-suited for visualizing non-numerical data or non-scheduling data.
They are task-oriented and may not effectively represent other types of information.
7. Complex Interpretation: Interpreting a Gantt chart can be challenging for individuals unfamiliar with the format, and
it may require some training to understand and work with them effectively.

What is Data Visualization Cluttering?
 Data visualization is the representation of data using graphics such as charts, infographics, animations, and
plots. These displays help readers understand the relationships between the labels and features in complex data.
Common techniques include tables, pie charts, stacked bars, line charts, area charts, histograms, scatter plots,
heat maps, and tree maps. The data cluttering problem occurs when the data has many dimensions. Data cluttering is a
disordered collection of graphical entities in a visualization. It can give a misleading picture of the data
entities, and it makes decision-making difficult because it hinders the reader's view of the patterns in the data.
 There is no single solution to the data cluttering problem, because the clutter depends on the visualization
technique and the analysis target.
 Data visualization cluttering, also known as visual clutter, refers to the presence of excessive or unnecessary visual
elements within a data visualization, which can hinder understanding and interpretation. Clutter can obscure the
meaningful information in a chart or graph, making it difficult for viewers to discern patterns, trends, and insights. It
occurs when there is an overload of visual cues, data points, labels, or design elements that overwhelm the viewer.
Common sources of clutter in data visualizations include:
1. Overlapping Data Points: When data points, markers, or labels overlap, it can be challenging to distinguish individual
elements, leading to confusion.
2. Too Many Data Points: A high density of data points, particularly in scatter plots or heat maps, can result in
overcrowding, making it difficult to see patterns or trends.
3. Excessive Labels: Adding labels to data points, categories, or axes is essential for understanding, but too many labels
can make the visualization messy and unreadable.
4. Redundant Information: Including redundant or unnecessary information, such as duplicating data in multiple ways,
can clutter the visualization without adding value.
5. Intricate Design Elements: Intricate or overly decorative design elements, like complex color schemes or ornate
chart backgrounds, can introduce clutter and distract from the data.
6. Too Many Categories or Dimensions: Representing too many categories or dimensions within a single visualization
can result in visual complexity and clutter.
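One common way to deal with overlapping or overly dense points is to count points per grid cell and draw the counts instead of the individual markers. The sketch below shows only the counting step; the function name, cell size, and sample points are assumptions for illustration.

```python
from collections import Counter


def bin_points(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell.

    Returns a Counter keyed by (column, row) cell index.
    """
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical points: three of them nearly coincide and would overlap as markers.
points = [(0.1, 0.2), (0.15, 0.22), (0.12, 0.25), (2.4, 3.1)]
density = bin_points(points, cell_size=1.0)
```

The resulting counts can then be shown as a heat map or as sized markers, so that a crowded region reads as one dense cell rather than a pile of overlapping points.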
[Figure: a cluttered visualization compared with a clutter-free one]

Data visualization cluttering can have several negative consequences, including:
1. Reduced interpretability: Clutter makes it difficult to interpret the data and draw meaningful conclusions from the
visualization.
2. Increased cognitive load: Viewers must expend more mental effort to process and make sense of a cluttered
visualization, potentially leading to information overload.
3. Loss of impact: Clutter can dilute the impact of the data and diminish the effectiveness of the visualization in
conveying its intended message.

To reduce clutter in data visualizations, consider the following strategies:
1. Simplify: Remove unnecessary data points, labels, or design elements. Focus on the most critical information.
2. Use Hierarchies: If dealing with a large dataset, consider hierarchical or layered visualizations to provide more detail
on demand.
3. Grouping: Group related data points or categories to reduce visual complexity. Stacked bar charts or treemaps are
examples of group-based visualizations.
4. Colors and Contrast: Use color sparingly and purposefully. Ensure there is adequate contrast between elements for
readability.
5. Interactivity: Implement interactive features that allow users to explore the data in a controlled manner, revealing
details as needed.
6. White Space: Use ample white space to separate and organize elements within the visualization.
7. Clear Labels: Ensure that labels are concise, meaningful, and placed appropriately to guide viewers.
By addressing clutter and designing clean and effective data visualizations, you can enhance the understanding and
impact of your data presentations.
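As a small illustration of the simplify and grouping strategies, the sketch below keeps the largest categories and merges the rest into a single bucket before plotting. The helper name `keep_top`, the label "Other", and the sample figures are hypothetical.

```python
def keep_top(values, n, other_label="Other"):
    """Keep the n largest categories and merge the rest into one bucket.

    `values` maps a category name to a number.
    """
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    result = dict(ranked[:n])
    rest = ranked[n:]
    if rest:
        result[other_label] = sum(value for _, value in rest)
    return result


# Hypothetical sales by product.
sales = {"A": 50, "B": 30, "C": 8, "D": 7, "E": 5}
simplified = keep_top(sales, n=2)
```

The total is unchanged, but a chart of `simplified` has three bars or slices instead of five. Whether merging small categories is acceptable depends on whether the audience needs to see them individually.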

GESTALT PRINCIPLES FOR DATA VISUALIZATION


Data visualization is not just about transforming data into understandable and good-looking charts. Becoming good at
data visualization requires foundational knowledge, because the reasons why certain techniques work better than others
have psychological roots. Whether or not you are aware of it, every time you design a visualization you are relying on
the Gestalt Principles.
Gestalt is a German word, commonly translated as “unified whole”, and is generally associated with the idea that the
whole is greater than the sum of its parts. It refers to the patterns that you perceive when presented with a few
graphical elements. The Gestalt Principles describe how the human brain organizes visual information. This section
covers six of them: proximity, similarity, enclosure, closure, continuity, and connection. People, especially
designers, who understand these principles can develop visuals that communicate information in the most effective ways.
1. PROXIMITY:
The nearer objects are to each other, the more readily we perceive them as belonging to the same group. This is the
simplest way to link data that you want to be seen together. All you need is enough white space to separate groups from
the other data that surrounds them. In dashboards, placing visuals closer together encourages users to think that the
grouped visuals share the same context. The way the objects are positioned in relation to each other can also lead the
user to unconsciously move their eyes from left to right and/or top to bottom.
2. SIMILARITY:
Objects of the same color, size, shape, and orientation appear to belong to the same group. Our tendency to group
things according to these attributes is also part of the Gestalt Principles. We associate categorical variables with
attributes, such as red for loss, green for profit, or triangles for cats. This principle works especially well as a
means of identifying different datasets in a graph. Even when data that we wish to link resides in separate locations
on a dashboard, the principle of similarity can be applied to establish that link, for example by using the color green
to represent revenue across various graphs. This technique can be useful for encouraging comparisons of related data in
various places, such as order count, order size, and order revenue.
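One way to apply the similarity principle across a dashboard is to build a single category-to-color mapping and reuse it in every chart. The sketch below assumes a small fixed palette; the hex values, the helper name, and the measure names are examples only.

```python
PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a"]  # example colors, not prescribed by the text


def build_color_map(categories, palette=PALETTE):
    """Assign each category one fixed color, in sorted order so the mapping is reproducible."""
    unique = sorted(set(categories))
    if len(unique) > len(palette):
        raise ValueError("more categories than colors; group some categories first")
    return {category: palette[i] for i, category in enumerate(unique)}


# Two hypothetical charts that share some of the same measures.
chart_a = ["revenue", "cost"]
chart_b = ["revenue", "profit", "cost"]

colors = build_color_map(chart_a + chart_b)
chart_a_colors = [colors[name] for name in chart_a]
chart_b_colors = [colors[name] for name in chart_b]
```

Because both charts look up the same dictionary, "revenue" has the same color wherever it appears, which lets the viewer link the two charts without reading the legends again.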
3. ENCLOSURE:
A group of objects can be enclosed by anything that forms a visual border around them (for example a line or a common
field of color). This enclosure causes the objects to appear to be set apart
in a region that is distinct from the rest of what we see.
This principle is exhibited frequently in the use of borders and fill colors or
shading in tables and graphs to group information and set it apart. Be
aware that it does not take a strong enclosure (e.g. bright, thick lines or
dominant colors) to create a strong perception of grouping.

4. CLOSURE:

Our eyes tend to add any missing pieces of a familiar shape. When faced with ambiguous objects that seem to be
incomplete, open, or unusual in form, we naturally perceive them as closed or whole. The principle of closure
asserts that we perceive open structures as closed, complete, and regular whenever there is a way that we can
reasonably do so.
We can apply this tendency to perceive whole structures in dashboards, especially in the design of graphs. For example,
this principle explains why only two axes, rather than full enclosure, are required on a graph to define the space in which
the data appears, like in a bar chart with x and y-axis values visible.
5. CONTINUITY:
We perceive objects as belonging together, as part of a single whole, if they are aligned with one another or appear to
form a continuation of one another. It is like the closure principle, but in addition to completing a shape visually,
we also perceive a visual direction as part of the continuation.
In a dashboard, things that are aligned with one another appear to belong to the same group. For example, in a pivot
table or matrix table, it is obvious which subgroups belong to which group when the hierarchy is expanded. We can see
the groupings without the need for vertical grid lines to delineate them; the distinct alignment alone makes the groups
easy to distinguish.
6. CONNECTION:

We perceive objects that are connected in some way, such as by a line, as part of the same group. Connection overrides
principles like proximity and similarity in visual grouping, because a direct connection between objects is a strong
cue that they belong together. The only principle that outweighs connection is enclosure.
The principle of connection is especially useful for tying together non-quantitative data, for example, to represent
relationships between steps in a process or between employees in an organization.
To wrap it up, the real purpose behind Gestalt Principles is for us to really understand how we perceive information. As
we have seen, these principles are powerful and when applied correctly and logically, can deliver the right and intended
effect to our audience from our data visualizations.

Types of visual clutter:
Visual clutter can significantly affect the readability and aesthetics of a design. Each of the following types of
visual clutter plays a role in how a design is perceived:
1. Lack of Visual Order:
Explanation: Visual order refers to the organization and structure of elements within a design. When there's a lack of
visual order, elements are positioned without a clear system or structure, leading to a chaotic and disorganized
appearance.
Impact: A lack of visual order can make it difficult for viewers to understand the layout and relationships between
elements, resulting in confusion and frustration.
Example: A poster with text and images randomly scattered on the page without clear alignment or grouping.
2. Alignment:
Explanation: Alignment involves the positioning of elements along a common axis or guide, ensuring that they are
visually connected and organized. Proper alignment contributes to a clean and structured design.
Impact: Misaligned elements disrupt the visual flow, making the design appear disorderly and unprofessional.
Alignment is critical for creating a harmonious and organized look.
Example: A brochure with text that is not vertically or horizontally aligned with images and graphics, causing a sense
of imbalance.
3. White Space:
Explanation: White space, also known as negative space, is the empty space around and between design elements.
Adequate use of white space provides breathing room and separation, while too little or too much white space can
create problems.
Impact: Inadequate white space can lead to overcrowding, making it challenging for viewers to distinguish between
elements. Excessive white space can create a sense of emptiness and disconnection.
Example: A website with excessive spacing between elements that makes the content feel scattered and difficult to
follow.
4. Non-Strategic Use of Contrast:
Explanation: Contrast involves variations in visual elements, such as color, size, font weight, or style, to create
emphasis and visual interest. When contrast is applied inconsistently or excessively, it can create visual chaos.
Impact: Non-strategic contrast can lead to confusion and make it challenging to focus on the main message. It can
also create a visually overwhelming design.
Example: An advertisement where different text elements use a multitude of colors, fonts, and sizes with no clear
hierarchy.
5. Pre-Attentive Attributes:
Explanation: Pre-attentive attributes are visual cues that the brain rapidly processes, often before conscious
attention is directed to them. These attributes include color, size, shape, and orientation.
Impact: Overusing or misusing pre-attentive attributes can introduce visual clutter, as they may conflict and disrupt
the viewer's ability to quickly grasp the intended message.
Example: A data visualization with a multitude of differently colored data points that lack a clear and consistent
scheme, making it hard to discern patterns.

Lack of visual order in clutter:


Visual clutter can disrupt the perception of a clear visual order in various ways. Some common shortcomings or issues
associated with visual clutter include:
(1) Overlapping Elements: When objects or elements overlap in a cluttered visual space, it becomes challenging to
discern their order and hierarchy.
(2) Inconsistent Alignment: Visual clutter often leads to inconsistencies in alignment, making it difficult to establish a
structured layout.
(3) Lack of Grouping: Without proper grouping or categorization of elements, it's hard to determine how items relate to
each other within the visual space.
(4) Insufficient White Space: A cluttered design may lack adequate white space, which is essential for creating a sense
of balance and order.
(5) Confusing Typography: Unorganized typography, such as varying fonts, sizes, and styles, can contribute to visual
disorder.

(6) Complex Color Schemes: Too many colors or conflicting color choices can disrupt the visual hierarchy and order.
(7) Visual Noise: Extraneous details, decorations, or irrelevant content can add noise and hinder the perception of
order.
(8) Lack of Focal Points: Without clear focal points, viewers may struggle to identify where their attention should be
directed within the cluttered composition.
(9) Ineffective Use of Visual Cues: Misuse or overuse of visual cues like arrows, lines, or icons can lead to confusion
rather than clarity.

Align clutter:
Aligning clutter refers to arranging the crowded elements of a visualization in a more orderly and visually pleasing
manner. It is a way to bring order to a chaotic layout. Depending on the context, you can align clutter by:
- Grouping similar items together.
- Using containers, such as panels or borders, to keep related elements organized.
- Sorting items by size, color, or function.
- Creating designated areas for specific kinds of content.
- Regularly de-cluttering and removing elements that are no longer needed.
The specific approach to aligning clutter may vary based on the nature of the clutter and the space you are working with.
What do you mean by white space and non-white space in data visualization?
White Space: White space is the empty space among the elements of a graphical composition. Good use of white space
increases readability and focuses the reader's attention. For example, within a text, white space splits big chunks
of text into small paragraphs, which makes them easier to understand. In addition, white space sets off and highlights
some elements of a visualization, and thus emphasizes the main content.
Non-white space: In data visualization, "non-white space" typically refers to the parts of a chart or graph that are
filled with data or meaningful content, as opposed to the empty or blank areas. White space, in this context, is the
empty space or margins around the data elements. Non-white space contains the visual representations of your data, such
as bars in a bar chart, data points in a scatter plot, or segments in a pie chart. It's the area where the data is
presented and where viewers focus their attention to interpret the information being conveyed.
In data visualization, there are generally two main types of white space:
(1) Macro White Space: This refers to the large gaps or empty areas in a visualization that help separate different
sections or group related elements. Macro white space can be used to improve readability and guide the viewer's
attention.
(2) Micro White Space: This is the smaller, finer spacing between individual elements within a chart or graph. Micro
white space is essential for clarity and to prevent visual clutter. It helps distinguish data points, labels, and other
elements from each other.
These two types of white space play a crucial role in creating effective and aesthetically pleasing data visualizations.
Properly managing white space can enhance the overall understanding of the data being presented.
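The difference between the two kinds of white space can be made concrete with a grouped bar chart: a small gap between bars inside a group (micro) and a larger gap between groups (macro). The sketch below only computes bar positions; the function name and the pixel values are assumptions chosen for illustration.

```python
def bar_positions(group_sizes, bar_width=20, micro_gap=4, macro_gap=24):
    """Return the left edge of every bar, group by group (units are arbitrary, e.g. pixels).

    micro_gap separates neighbouring bars inside a group;
    macro_gap separates one group from the next.
    """
    positions = []
    x = 0
    for size in group_sizes:
        group = []
        for _ in range(size):
            group.append(x)
            x += bar_width + micro_gap
        positions.append(group)
        # Replace the trailing micro gap with the larger gap between groups.
        x += macro_gap - micro_gap
    return positions


# Two hypothetical groups with two and three bars.
layout = bar_positions([2, 3])
```

Keeping the macro gap clearly larger than the micro gap lets the proximity principle do the grouping, without extra borders or labels.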

Contrast:
Contrast refers to the differences between two or more things, often used to highlight distinctions or make
comparisons. In various contexts, such as design, photography, literature, and more, contrast can have advantages and
disadvantages:
Advantages of Contrast:
(1) Clarity: Contrast can enhance clarity and make it easier to distinguish between elements or objects.
(2) Emphasis: It can draw attention to specific details or focal points, making them stand out.
(3) Visual Interest: Contrast can make a design or composition visually engaging and dynamic.
(4) Depth: In art and photography, contrast can create a sense of depth and dimension.
(5) Highlighting Differences: It's useful for highlighting differences, such as in data visualization.
Disadvantages of Contrast:
(1) Overwhelm: Excessive contrast can be overwhelming and create visual fatigue.
(2) Distraction: Too much contrast may distract from the main message or content.
(3) Inconsistency: In some cases, too much contrast can lead to an inconsistent or chaotic look.
(4) Accessibility: Contrast that is extremely harsh, or that relies on color alone, can be challenging for some
individuals with visual impairments.
(5) Subjectivity: The perception of contrast can vary among individuals, leading to different interpretations.
Strategic use of contrast:
Contrast is a powerful tool in data visualization. Here are some strategic ways to use it:
(1) Highlight Key Data: Use contrast to make important data elements stand out. For example, you can use bold colors
or larger font sizes for key data points to draw attention to them.
(2) Color Contrast: Choose color schemes that provide good contrast. High contrast between data elements and
backgrounds makes it easier for viewers to differentiate and interpret the information. Be mindful of colorblindness
considerations.
(3) Background vs. Data: Ensure that the background of your visualization is neutral and doesn't distract from the data.
Use a light background with dark data elements or vice versa.
(4) Text vs. Data: Contrast between text and data is crucial for readability. Use a readable font and make sure text labels
are clearly legible against the data points they describe.
(5) Grouping and Categorization: Use contrast to visually group and categorize data. For example, you can use different
colors or patterns to distinguish between categories or data series.
(6) Emphasizing Trends: To highlight trends or comparisons, you can use contrast to make certain lines or bars in a chart
more prominent, making it easier for viewers to identify patterns.
Remember that while contrast can be a powerful tool, it should be used thoughtfully to enhance understanding
rather than confuse or overwhelm the viewer. Test your visualizations with potential users to ensure that the use
of contrast aligns with their comprehension and needs.
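Color contrast can also be checked numerically. The text does not name a particular measure; the sketch below uses the contrast ratio defined in the Web Content Accessibility Guidelines (WCAG 2.x) as one possible choice. The function names are hypothetical.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) with channels 0..255."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(channel) for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Dark text on a white background, and a light grey that is much harder to read.
dark_on_white = contrast_ratio((40, 40, 40), (255, 255, 255))
grey_on_white = contrast_ratio((200, 200, 200), (255, 255, 255))
```

WCAG level AA asks for a ratio of at least 4.5:1 for normal-size text, which can serve as a rough threshold for labels and annotations. A good ratio does not replace testing with real users, and it says nothing about color combinations that are hard to tell apart for colorblind viewers.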

Pre-attentive attributes
Pre-attentive attributes in data visualization refer to visual properties that the human brain can quickly and effortlessly
process without conscious thought. These attributes include things like color, size, position, length, and orientation. They
are used to draw the viewer's attention to specific data points or patterns in a visualization.
Advantages of using pre-attentive attributes in data visualization:
1. Rapid Perception: Pre-attentive attributes are processed very quickly by the brain, allowing viewers to grasp
information at a glance.
2. Effective Highlighting: They can be used to emphasize important data points or trends, making it easier for viewers
to focus on what matters.
3. Reduced Cognitive Load: By leveraging these attributes, visualizations can reduce the cognitive load on viewers,
making it easier for them to understand complex data.
4. Enhanced Communication: Pre-attentive attributes can improve the clarity and effectiveness of data
communication, helping viewers make better decisions.
Disadvantages of using pre-attentive attributes:
1. Misinterpretation: Overuse or misuse of pre-attentive attributes can lead to misinterpretation of data or visual
clutter if not carefully implemented.
2. Subjectivity: The effectiveness of these attributes can vary depending on cultural, contextual, and individual factors,
making it challenging to create universally effective visualizations.
3. Limited Attributes: There are only a limited number of pre-attentive attributes, so it's essential to choose the right
ones for a specific dataset and visualization.
4. Potential Bias: Certain attributes, like color, can introduce bias if not used thoughtfully, potentially leading to
misleading interpretations.
Types of Pre-attentive Attributes:
(1) Color: Changes in color can be used to highlight or differentiate data points. For example, using different colors for
categories in a bar chart.
(2) Size: Varying the size of visual elements, such as points or bars, can represent quantitative values. Larger elements
typically indicate larger values.
(3) Shape: Different shapes can be used to represent different categories or data points. For instance, circles and
squares could represent different product types.
(4) Position: The spatial arrangement of data points on a chart can convey relationships or groupings. For example,
scatter plot points placed higher on the y-axis might indicate higher values.
(5) Length: The length of bars or lines can be used to represent quantities or values. Longer bars typically represent
larger values.
(6) Orientation: The orientation of lines or bars can convey information, such as the direction of change or
trends.
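When size is used to encode a quantity with circles, a common recommendation is to make the area of the circle, not its radius, proportional to the value; otherwise large values look far bigger than they are. The sketch below illustrates that rule. The function name and the default maximum radius are assumptions.

```python
import math


def radius_for(value, max_value, max_radius=20.0):
    """Radius of a circle whose area is proportional to value.

    The largest value gets max_radius; other radii shrink with the square root
    of the value so that area, not radius, tracks the data.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("values must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


# A value that is a quarter of the maximum gets half the radius, hence a quarter of the area.
full = radius_for(100, 100)
quarter = radius_for(25, 100)
```

For bars and lines the same concern does not arise, because length is read directly; this is one reason length and position are generally considered easier to compare accurately than area.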