Data Visualization 1
Data visualization is the graphical representation of quantitative information and data using visual elements such as
graphs, charts, and maps. Data visualization converts data sets, large and small, into visuals that are easy for humans
to understand and process.
Data visualization tools provide accessible ways to understand outliers, patterns, and trends in the data. In the world
of Big Data, data visualization tools and technologies are required to analyze vast amounts of information. Data
visualizations are common in everyday life, where they most often appear in the form of graphs and charts. A
combination of multiple visualizations and bits of information is referred to as an infographic.
Data visualizations are used to discover unknown facts and trends. You can see visualizations in the form of line
charts to display change over time. Bar and column charts are useful for observing relationships and making
comparisons. A pie chart is a great way to show parts-of-a-whole. And maps are the best way to share geographical
data visually.
Today's data visualization tools go beyond the standard charts and graphs of Microsoft Excel spreadsheets. They
display data in more sophisticated ways, such as dials and gauges, geographic maps, heat maps, pie charts,
and fever charts.
Effective data visualizations are created where communication, data science, and design collide. Done right, data
visualizations turn key insights from complicated data sets into something meaningful and natural. American
statistician and Yale professor Edward Tufte believes useful data visualizations consist of “complex ideas
communicated with clarity, precision, and efficiency.”
To craft an effective data visualization, you need to start with clean data that is well-sourced and complete. After the
data is ready to visualize, you need to pick the right chart. After you have decided the chart type, you need to design
and customize your visualization to your liking. Simplicity is essential - you don't want to add any elements that
distract from the data.
History of Data Visualization
The history of data visualization is a long and fascinating journey that spans centuries. Here is a brief overview of key
developments and milestones in the history of data visualization:
1. Prehistoric Period: Early humans used cave paintings, drawings, and symbols to convey information about their
surroundings, such as maps of hunting grounds or celestial charts.
2. Ancient and Classical Periods: Ancient civilizations like the Egyptians, Greeks, and Romans created visual
representations of data through maps, diagrams, and inscriptions. For instance, the Greeks developed the earliest
known geographic maps and anatomical diagrams.
3. Middle Ages: During this period, illuminated manuscripts and diagrams were used to illustrate complex ideas in
various fields, including theology, astronomy, and medicine.
4. Renaissance: The Renaissance saw a resurgence of interest in science and art, which led to the creation of detailed
maps, anatomical drawings, and early cartography. Leonardo da Vinci's notebooks are famous for their detailed
scientific illustrations.
5. 17th-18th Century: The Age of Enlightenment marked the beginning of more systematic data visualization.
Early statisticians like John Graunt compiled pioneering statistical tables, and advancements in astronomy led to the
development of star charts and celestial maps.
6. 19th Century: The 19th century saw the emergence of thematic maps, particularly in the field of geography. Early
statisticians and demographers like William Playfair and Charles Minard developed innovative ways to display data
using graphical representations.
7. Mid-19th Century: Florence Nightingale is known for her use of polar area diagrams to illustrate the significance of
proper sanitation practices in hospitals during the Crimean War.
8. Mid-19th to Early 20th Century: With the advent of photography and the Industrial Revolution, there was an explosion
in the creation of data visualizations. Pioneers like Charles Joseph Minard (known for his flow maps), John Snow (famous
for his cholera map), and Otto Neurath (creator of Isotype) made significant contributions to the field.
9. 20th Century: The development of electronic computers in the mid-20th century marked a significant turning point
in data visualization. It became easier to create complex and interactive visualizations. Notable examples include the
work of Edward Tufte and the introduction of graphical user interfaces (GUIs) for data visualization.
10. Late 20th-21st Century: The digital age brought a revolution in data visualization. Software tools like Tableau, D3.js,
and ggplot2 in R made it easier for non-experts to create informative and interactive data visualizations. Infographics
and interactive web-based data visualizations became increasingly popular.
11. Present and Future: Data visualization continues to evolve with advances in technology and the availability of big
data. Virtual reality, augmented reality, and machine learning techniques are being integrated into data visualization
to provide new ways of exploring and understanding data.
Today, data visualization is a critical tool in various fields, including business, science, journalism, and education. It plays
a vital role in conveying complex information in a visually engaging and comprehensible manner.
3. Easy Sharing of Information: Data visualization gives organizations a new means of communication. Rather than
sharing cumbersome raw data, they can share visual information that is engaging and easier to digest.
4. Sales Analysis: With the help of data visualization, a salesperson can easily understand the sales graph of products.
With visualization tools such as heat maps, they can identify the causes pushing sales numbers up as well as the
reasons dragging them down. Data visualization also helps in understanding trends and other factors, such as the
types of customers interested in buying, repeat customers, and the influence of geography.
5. Finding Relations Between Events: A business is affected by many factors. Finding correlations between these
factors or events helps decision-makers understand the issues related to their business. For instance, the
e-commerce market is nothing new today, and during festive seasons such as Christmas or Thanksgiving, the sales
graphs of online businesses go up.
Data visualization process:
1. Determine the decision you want to make
“One of the biggest pitfalls in data visualization is people worrying too much about making the visuals look a certain
way. The important work happens long before that point,” says Cook. In other words, don’t get wrapped up in colors and
other aesthetics too soon. Your first step is figuring out what decision you’re trying to make. You can have all the data in
the world, but it won’t mean much if you’re not sure what to do with it. Cook recommends posing the decision in the
form of a question so you’re clear on the answer you’re seeking. “If you aren’t clear on your decision, your visual won’t
be either,” he explains. Here’s an example of a clear decision question: During which fiscal quarter should we launch
our new product?
2. Identify the metrics that inform the decision
You likely have tons of data available to you, but only certain data points will be relevant to your decision. Before
getting overwhelmed by data sets, consider which specific points would be most helpful for answering your decision
question.
Once you identify the right metrics, determine whether you can actually collect them with any accuracy. You may
find that some data points either aren’t available or are inaccurate. In this case, you typically have two alternatives: Kick
off a project to collect the data (such as developing and distributing a survey) or revisit the first step and
adjust your question.
3. Develop the story you want to tell
Next up is developing a story from your data. Cook shares a few questions you can use to prepare your narrative:
• Is the data about comparison? You may be making a decision based on metrics being bigger or smaller — or faster
or slower.
• Is the data about changes over time? Your decision may concern entering a new market or tracking product launch
performance over time.
• Is the data about categorization? You may have a cost-based decision that needs to identify where the business
is losing money.
4. Select the appropriate visual
This part of the data visualization process is fairly simple, as most visuals naturally follow the type of story you want
to tell. Consider these examples:
• Comparison stories typically work best with bar graphs.
• Time-based stories pair well with line charts.
• Categorical stories typically necessitate tree charts.
5. Add relevant elements to the visual
“Now is the point in the data visualization process when you can focus on aesthetics,” says Cook. The purpose of this
step is to make choices about your visual that aid in not only its appeal but also fostering comprehension. You may need
to add callouts to your chart to emphasize certain data points or add important context. For instance, say you created a
chart that was missing a week of sales data. The audience may assume you made a mistake, but you didn’t include the
data for good reason —a hurricane caused the business to close that week. A well-placed callout can prevent this
confusion. Color decisions can benefit from a designer’s eye — and some common sense. For example,
people often associate red with negativity (recall the saying about sales being “in the red”). So if your chart is sharing
good news, you may want to avoid using that color.
6. Clearly label and review the visual
Where the previous step was about choosing visualization elements, this step is about making note of the choices you
made. Title the visual appropriately. Make sure units are correct (e.g., dollars vs euros) and incremented consistently.
Ensure there’s a legend to explain color meanings. “Here you’re just making sure the audience doesn’t have unnecessary
questions about what they’re viewing,” Cook explains.
7. Let a nonexpert review the visual
“The last step of the data visualization process is quite important. You need a different set of eyes on the visual you’ve
created — preferably eyes that don’t have the same knowledge or experience as your own,” says Cook. Giving your
visual to someone else to review, especially someone who doesn’t know much about the subject matter or underlying
data, is an important spot check. Ideally, they should be able to comprehend the story you’re trying to communicate
without any issues. If they have any trouble, Cook says you may need to go back a few steps. The most common problem
is using the wrong type of chart for the data you’re presenting. Otherwise, you may just need to add a callout or two to
fill in any blanks in the visual narrative. “But if you’ve followed these steps carefully and dedicated a reasonable amount
of time to the task, you should be set,” Cook says.
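The pairing of story type and chart described in step 4 can be sketched as a simple lookup. This is only an illustration: the dictionary `STORY_TO_CHART` and the function `suggest_chart` are hypothetical names, and the three entries are just the defaults listed in step 4, not a complete rule set.

```python
# Hypothetical lookup that mirrors the three pairings listed in step 4.
STORY_TO_CHART = {
    "comparison": "bar graph",
    "change over time": "line chart",
    "categorization": "tree chart",
}


def suggest_chart(story_type: str) -> str:
    """Return the default chart that step 4 pairs with a story type."""
    key = story_type.strip().lower()
    if key not in STORY_TO_CHART:
        raise ValueError(f"no default chart for story type: {story_type!r}")
    return STORY_TO_CHART[key]


print(suggest_chart("Comparison"))
```

Raising an error for an unknown story type, rather than guessing, reflects the advice in step 1: if the decision is unclear, the visual will be too.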
Categories of Data Visualization
Data visualization is critical to market research, where both numerical and categorical data can be visualized. Doing so
increases the impact of insights and reduces the risk of analysis paralysis. Data visualization is divided into the
following categories:
1. Numerical Data :
Numerical data is also known as quantitative data. It is any data that represents an amount, such as a person's
height, weight, or age. Numerical data is the easiest kind of data to visualize. Such visualizations generally help
others digest large data sets and raw numbers in a form that is easier to interpret and act on. Numerical data falls
into two categories:
Continuous Data –
It can take any value within a range and can be measured ever more finely (Example: Height measurements).
Discrete Data –
It can take only separate, countable values (Example: Number of cars or children a household has).
The visualization techniques used to represent numerical data are charts and numerical values. Examples are pie
charts, bar charts, averages, and scorecards.
2. Categorical Data :
Categorical data is also known as qualitative data. It is any data that represents groups. It consists of categorical
variables that describe characteristics such as a person’s ranking or gender. Categorical data visualization is all
about depicting key themes, establishing connections, and lending context. Categorical data is classified into three
categories:
Binary Data –
Classification is based on two opposing positions (Example: Agrees or Disagrees).
Nominal Data –
Classification is based on attributes that have no inherent order (Example: Male or Female).
Ordinal Data –
Classification is based on the ordering of information (Example: Timeline or processes).
The visualization techniques used to represent categorical data are graphics, diagrams, and flowcharts. Examples are
word clouds, sentiment mapping, and Venn diagrams.
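To make the categories above concrete, here is a minimal sketch that guesses a category from raw values. The function name `classify_values` and its rules are illustrative assumptions rather than a standard: whole numbers are treated as discrete, other numbers as continuous, two distinct labels as binary, and anything else as nominal. Ordinal data cannot be detected this way, because the order of the labels is domain knowledge.

```python
from numbers import Real


def classify_values(values):
    """Guess a coarse data category for a list of raw values (illustrative rules only)."""
    if not values:
        raise ValueError("values must not be empty")
    # bool is a subclass of int in Python, so exclude it from the numeric test.
    numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    if numeric:
        if all(isinstance(v, int) for v in values):
            return "numerical / discrete"      # countable whole numbers
        return "numerical / continuous"        # measurements on a continuous scale
    distinct = set(values)
    if len(distinct) == 2:
        return "categorical / binary"          # exactly two positions
    return "categorical / nominal"             # labels with no inferred order


print(classify_values([2, 0, 3, 1]))                      # cars per household
print(classify_values([170.2, 165.0, 181.4]))             # heights in cm
print(classify_values(["Agree", "Disagree", "Agree"]))
```

In practice the category is decided by what the values mean, not by their Python type, so a guess like this is at most a starting point.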
Types of Communication Problems: technical, semantic and effectiveness
Communication problems in data visualization can be categorized into three main types: technical, semantic, and
effectiveness issues. These categories help to differentiate the nature of the problems and guide efforts to resolve them:
Technical Communication Problems:
a. Performance Issues:
Problem: Slow loading times or unresponsive interactions in interactive data visualizations can frustrate users and deter
engagement.
Solution: Optimize the performance of data visualization tools and platforms to ensure smooth user experiences.
b. Compatibility and Rendering:
Problem: Data visualizations may not display correctly or interact as intended across different devices, browsers, or
screen sizes.
Solution: Test and ensure compatibility on various platforms and use responsive design to adapt to different screen
sizes.
c. Data Integration:
Problem: Data from multiple sources may not integrate seamlessly, leading to discrepancies or inconsistencies.
Solution: Implement data integration solutions and data cleansing processes to ensure data consistency.
d. Security and Privacy:
Problem: Security and privacy concerns may arise when sharing sensitive or confidential data visualizations.
Solution: Implement appropriate security measures, such as user access controls and encryption, to protect data.
Semantic Communication Problems:
a. Misinterpretation:
Problem: Data visualizations may be misunderstood due to ambiguity or a lack of clarity in the way data is presented.
Solution: Ensure labels and legends are clear, and consider user testing to confirm that the visualization is interpreted as
intended.
b. Data Overloading:
Problem: Visualizations may present too much information at once, leading to cognitive overload.
Solution: Simplify the visualization, prioritize the most important information, and provide options for drilling down into
details.
c. Terminology and Jargon:
Problem: Using technical terms or jargon unfamiliar to the audience can hinder comprehension.
Solution: Use plain language explanations and provide definitions for unfamiliar terms.
d. Cultural and Contextual Issues:
Problem: Differences in cultural context or local knowledge may impact how data visualizations are understood.
Solution: Consider the cultural background of the audience and provide additional context or explanations when
necessary.
Effectiveness Communication Problems:
a. Lack of Engagement:
Problem: The audience may not find the visualization engaging, leading to a lack of interest in the data.
Solution: Use storytelling techniques, compelling visuals, and interactivity to engage the audience.
b. Failure to Convey Insights:
Problem: Data visualizations may not effectively convey the intended insights or key takeaways.
Solution: Ensure the message is clear and the visualization emphasizes the most important data points.
c. Irrelevant Information:
Problem: Visualizations may include data that is irrelevant to the audience's needs, resulting in information overload.
Solution: Customize visualizations to match the audience's specific interests and requirements.
d. Lack of Feedback and Iteration:
Problem: Failing to seek feedback or iterate on visualizations can result in missed opportunities for improvement.
Solution: Continuously gather feedback from users and colleagues and use it to refine and enhance visualizations.
By categorizing communication problems in data visualization as technical, semantic, or effectiveness issues, you can
identify the specific nature of the problem and apply appropriate solutions to enhance the clarity and impact of your
visualizations.
Data types in data visualization
In data visualization, the choice of data types plays a critical role in determining how information is conveyed and
understood. The appropriate data type depends on the nature of the data and the message you want to convey. Here
are some common data types used in data visualization:
1. Numerical Data:
Numerical data consists of numbers and can be further categorized into:
Continuous Data: Represents values that can take any real number within a range. Examples include temperature,
height, and time.
Discrete Data: Comprises countable values with gaps between them. Examples include the number of employees
in a company, the count of products sold, or the number of cars in a parking lot.
2. Categorical Data:
Categorical data represents distinct categories, and it is often used to group data into non-numeric labels. Examples
include:
Nominal Data: Categories without any inherent order or ranking, such as colors or types of animals.
Ordinal Data: Categories with a meaningful order, but the intervals between them are not necessarily equal.
For instance, education levels (e.g., high school, college, graduate) or customer satisfaction levels (e.g., poor,
fair, good).
3. Time Series Data:
Time series data represents values collected at different points in time, making it essential for visualizing trends,
patterns, and changes over time. Examples include stock prices, weather data, and sales figures over months or
years.
4. Text Data:
Text data visualization is used to convey information from textual sources. Common methods include word clouds,
text sentiment analysis, and textual network analysis.
5. Geospatial Data:
Geospatial data visualization is used to represent data with geographic or spatial components. Examples include
maps, heat maps, choropleth maps, and spatial distributions.
6. Hierarchical Data:
Hierarchical data visualization is suitable for data organized in a hierarchical structure. Examples include
organizational charts, family trees, and file directory structures.
7. Network Data:
Network data visualization is employed to depict relationships and connections between entities. Examples include
social network graphs, supply chain networks, and web page linking structures.
8. Multivariate Data:
Multivariate data visualization involves the representation of data with multiple variables, often in a two- or three-
dimensional space. Techniques include scatter plots, bubble charts, parallel coordinates, and radar charts.
9. Image and Video Data:
Image and video data visualization techniques are used for conveying visual information. Examples include medical
imaging, satellite imagery, and video analytics.
10. Financial Data:
Financial data visualization techniques are specifically designed for representing financial information such as stock
prices, portfolio performance, and economic indicators.
11. Temporal Data:
Temporal data visualization focuses on patterns and trends within a specific time frame, which can be a subset of
time series data. Examples include hourly temperature fluctuations, daily website traffic, or monthly sales data.
12. Big Data:
Big data visualization is concerned with handling and representing vast amounts of data that traditional methods
may not handle efficiently. Techniques include data aggregation, sampling, and interactive visualizations.
13. Scientific Data:
Scientific data visualization techniques are used in fields such as biology, physics, and chemistry to represent
complex scientific data, including molecule structures, biological pathways, and particle collisions.
Choosing the appropriate data type and visualization method is crucial for effectively conveying insights and ensuring
that the audience can understand and interpret the information correctly. The selection of data type should align with
the message you want to convey and the characteristics of the data you are working with.
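The data aggregation mentioned under Big Data can be illustrated with a small sketch: a long series is collapsed into bucket averages before plotting, so the chart draws far fewer points. The function name `downsample_mean` and the sample readings are made up for illustration, and averaging is only one of several possible aggregation choices.

```python
def downsample_mean(values, bucket_size):
    """Collapse a long series into bucket averages so fewer points need plotting."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    averages = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]   # last bucket may be shorter
        averages.append(sum(bucket) / len(bucket))
    return averages


readings = list(range(1, 11))          # made-up data: 1, 2, ..., 10
print(downsample_mean(readings, 5))    # two averaged points instead of ten
```

Averaging smooths out spikes, so if outliers matter to the story, keeping the bucket minimum and maximum alongside the mean may be a better fit.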
Relationships
In data visualization, relationships refer to the patterns, connections, and associations that can be revealed when data is
presented graphically. Visualizing relationships in data helps users better understand the underlying data and draw
meaningful insights from it.
Here are some key aspects of relationships in data visualization:
1. Correlation and Causation: Data visualization can help identify correlations between variables, where changes in one
variable are associated with changes in another. However, it's essential to be cautious about assuming causation
based solely on correlation.
2. Cluster Analysis: Data visualization can reveal natural groupings or clusters in your data. This is particularly useful in
techniques like k-means clustering, where data points are grouped based on similarity.
3. Trends and Patterns: Line charts and scatter plots are commonly used to depict trends and patterns in data. These
can show how one variable changes in relation to another, helping users understand relationships between them.
4. Heatmaps: Heatmaps display the strength or intensity of a relationship between two or more variables. They often
use color to represent the degree of association or dissimilarity between data points.
5. Network Diagrams: In cases where relationships involve connections between entities, network diagrams or graphs
are valuable. These show nodes (entities) and edges (connections) and can represent social networks, supply chains,
and more.
6. Regression Analysis: This statistical technique is used to model and visualize the relationship between a dependent
variable and one or more independent variables. Linear regression, for example, is used to understand the
relationship between variables by fitting a line to the data points.
7. Matrix Visualizations: These visualizations display relationships in a matrix format, where rows and columns
represent variables, and the cells represent the relationship between them. Heatmaps are a common example of
matrix visualizations.
8. Sankey Diagrams: These diagrams illustrate flows and relationships between different categories or stages of a
process. They are useful for visualizing the movement of resources, energy, or information.
9. Chord Diagrams: Chord diagrams show relationships between data points in a circular format. They are often used to
visualize connections between categories or entities.
10. Tree Maps and Sunburst Charts: These visualizations display hierarchical relationships among data elements. Each
level of the hierarchy is represented as a nested rectangle (in the case of tree maps) or as segments of a sunburst
chart.
11. Scatter Plots: Scatter plots are useful for showing the relationship between two continuous variables. Each data
point is plotted as a point on the chart, making it easy to identify trends or outliers.
12. Time-Series Analysis: For temporal data, time-series visualizations can help users understand how variables change
over time and if there are any relationships or seasonality in the data.
13. Geospatial Visualization: When data involves geographical locations, mapping tools can reveal relationships based
on location. For example, maps can show regional variations, proximity, or density of data points.
In summary, relationships in data visualization are about revealing connections and associations within the data.
Choosing the right visualization technique depends on the nature of your data and the specific relationships you want to
explore or communicate to your audience. Effective data visualization can simplify complex relationships, making it
easier for users to draw insights and make informed decisions.
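The correlation and regression ideas in the list above can be sketched numerically. The example below computes the Pearson correlation coefficient and a least-squares line by hand, using only the standard library. The function names `pearson_r` and `fit_line` and the sample data are assumptions for illustration; a plotting library would normally draw the fitted line over a scatter plot.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (covariance sum, x variance sum, y variance sum, mean x, mean y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov, var_x, var_y, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson correlation: +1 or -1 for a perfect line, near 0 for no linear link."""
    cov, var_x, var_y, _, _ = _centered_sums(xs, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept of the line y = slope * x + intercept."""
    cov, var_x, _, mean_x, mean_y = _centered_sums(xs, ys)
    if var_x == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


ad_spend = [1, 2, 3, 4, 5]       # made-up values
sales = [2, 4, 6, 8, 10]         # exactly twice the ad spend
print(pearson_r(ad_spend, sales), fit_line(ad_spend, sales))
```

Even a perfect correlation like the one in this toy data says nothing about which variable drives the other, which is the caution raised in the first item of the list.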
Visualization formats:
Data visualization can take various formats, depending on the type of data and the insights you want to convey. Some
common visualization formats include:
1. Bar Charts: Used to compare categories of data. They are effective for showing comparisons and trends.
2. Line Charts: Ideal for showing trends over time. They connect data points with lines, making it easy to see patterns.
3. Pie Charts: Show the composition of a whole. Each slice represents a proportion of the whole.
4. Scatter Plots: Display individual data points on a two-dimensional plane, often used to identify relationships or
correlations.
5. Heatmaps: Show data values as colors in a grid. They are excellent for displaying large datasets and identifying
patterns.
6. Histograms: Used for visualizing the distribution of data. They group data into bins and display the frequency of data
points in each bin.
7. Box Plots: Display the distribution of a dataset, including outliers, quartiles, and median values.
8. Treemaps: Hierarchical visualization that divides data into nested rectangles, often used for displaying hierarchical
data structures.
9. Network Diagrams: Visualize relationships between entities in a network, like social networks or organizational
structures.
10. Choropleth Maps: Use color-coding to represent data by geographic regions, such as countries or states.
11. Word Clouds: Display word frequency in a text document, with more frequently occurring words appearing larger.
12. Sankey Diagrams: Show the flow of resources or values between entities, often used for visualizing processes or
energy flows.
13. Bubble Charts: Similar to scatter plots but with the addition of bubble size to represent a third variable.
14. Radar Charts: Used to compare multiple variables for a single data point, often used in performance evaluation.
15. Gantt Charts: Display a timeline of tasks or events, showing their duration and dependencies.
16. 3D Visualizations: Add an additional dimension to visualizations, often used for complex data exploration.
The choice of visualization format depends on the nature of your data and the story you want to tell. It's important to
select the format that effectively communicates your data's insights to your audience.
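As an illustration of how a histogram groups data into bins, here is a minimal sketch that counts values per bin and prints a text-only chart. The function name `histogram_counts`, the bin width, and the sample ages are assumptions chosen for the example. Empty bins are simply omitted from the result.

```python
def histogram_counts(values, bin_width, start):
    """Count values per half-open bin [lo, lo + bin_width), keyed by the lower edge."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = {}
    for value in values:
        index = int((value - start) // bin_width)   # which bin the value falls into
        lower_edge = start + index * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


ages = [21, 25, 29, 31, 34, 35, 38, 42, 47, 58]     # made-up sample
for lower_edge, count in histogram_counts(ages, bin_width=10, start=20).items():
    print(f"[{lower_edge}, {lower_edge + 10}): {'#' * count}")
```

The choice of bin width changes the shape of the chart noticeably, so it is worth trying a few widths before settling on one.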
Principles of communicating data
(1) Know your goal
The goal of data visualization is to present data in a graphical or visual format that makes it easier to understand,
interpret, and derive insights from the data. Effective data visualization aims to:
Communicate Information: It should convey complex data in a clear and concise manner, allowing viewers to quickly
grasp key insights.
Facilitate Understanding: Visualizations should simplify complex data, patterns, and relationships, making it easier for
people to make informed decisions or draw conclusions.
Highlight Trends and Patterns: Visualizations can reveal trends, patterns, and anomalies in data that may not be
immediately apparent in raw numbers or text.
Support Decision-Making: They should aid decision-making processes by providing actionable insights and helping users
identify opportunities or areas that require attention.
Engage and Persuade: Visualizations can be used to engage and persuade an audience, whether it's for reporting,
storytelling, or advocacy.
Enhance Data Exploration: Interactive visualizations allow users to explore data, drilling down into details or changing
parameters to gain deeper insights.
Minimize Cognitive Load: Effective data visualizations reduce the cognitive load on viewers by presenting information in
a way that aligns with how the human brain processes visual information.
Ultimately, the goal of data visualization is to transform data into a visual narrative that enables better understanding,
decision-making, and communication of insights.
(2) Use the right data
Using the right data in data visualization is crucial for creating meaningful and informative visuals. Here are some tips:
Define Your Objective: Clearly define the purpose of your data visualization. Are you trying to show trends, compare
data, or highlight patterns? Understanding your objective will help you select the right data.
Data Relevance: Ensure that the data you use is directly related to your objective. Irrelevant data can confuse your
audience and dilute your message.
Quality Data: Verify the quality of your data. It should be accurate, up-to-date, and free from errors. Cleaning and
preprocessing your data may be necessary.
Data Integrity: Maintain data integrity. Ensure that there are no duplicate records or missing values that can skew your
visualization.
Consider the Audience: Think about who will be viewing your visualization. Tailor the data to their level of
understanding and the information they need.
Avoid Overloading: Don't overload your visualization with too much data. Keep it simple and focused on the key points.
Use Appropriate Visualizations: Choose the right type of chart or graph that best represents your data. Bar charts, line
graphs, pie charts, and scatter plots are some common options.
Labeling and Context: Provide clear labels and context for your data. Titles, axis labels, and legends should help your
audience understand what they're seeing.
Highlight Key Data Points: Emphasize the most important data points or trends that support your objective. Use colors
or annotations to draw attention.
Interactivity: If possible, add interactivity to your visualization to allow users to explore the data on their own.
Accessibility: Ensure that your data visualization is accessible to all, including those with disabilities. Use alt text for
images and consider color choices for those with color blindness.
Feedback and Iteration: Gather feedback on your data visualization and be willing to iterate and improve it based on
user input.
By following these principles and being selective about the data you use, you can create data visualizations that
effectively convey your message and insights to your audience.
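The data quality and data integrity tips above can be sketched as a small cleaning pass that runs before any chart is drawn. The function name `clean_records`, the field names, and the sample rows are hypothetical. Dropping incomplete rows is only one possible policy, since missing values can also be filled in or flagged instead.

```python
def clean_records(records, required_fields):
    """Drop rows with missing required values, then drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in records:
        if any(row.get(field) is None for field in required_fields):
            continue                          # missing value would skew the chart
        key = tuple(sorted(row.items()))      # hashable fingerprint of the row
        if key in seen:
            continue                          # exact duplicate of an earlier row
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = [
    {"region": "North", "sales": 120},
    {"region": "North", "sales": 120},    # duplicate record
    {"region": "South", "sales": None},   # missing value
    {"region": "East", "sales": 95},
]
print(clean_records(rows, required_fields=["region", "sales"]))
```

Whatever policy is chosen, it helps to note how many rows were removed, so the audience is not surprised by gaps in the visual.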
(3) Select suitable visualization
The choice of a suitable visualization in data visualization depends on the type of data you have and the message you
want to convey. Here are some common types of visualizations and when to use them:
Bar Chart: Use bar charts to compare values between different categories or show changes over time.
Line Chart: Use line charts to display trends or changes over continuous intervals, such as time series data.
Pie Chart: Use pie charts to show the parts of a whole and their proportions, but be cautious as they can be less effective
than bar charts for precise comparisons.
Scatter Plot: Use scatter plots to visualize the relationship between two continuous variables and identify patterns or
correlations.
Histogram: Use histograms to visualize the distribution of a single variable, especially when dealing with large datasets.
Heatmap: Use heatmaps to represent data in a matrix format, often to show correlations or patterns in multivariate
data.
Box Plot (Box-and-Whisker Plot): Use box plots to display the distribution of a dataset, including median, quartiles, and
potential outliers.
Bubble Chart: Use bubble charts to display three dimensions of data, where the size of bubbles represents a third
variable.
Treemap: Use treemaps to show hierarchical data structures, often used in visualizing directory structures or nested
categories.
Sankey Diagram: Use Sankey diagrams to illustrate flow or proportionality between multiple categories or stages in a
process.
Choropleth Map: Use choropleth maps to visualize data on geographical regions, where color intensity represents a
value.
Word Cloud: Use word clouds to display the frequency or importance of words in text data.
Radar Chart: Use radar charts to compare multiple quantitative variables for a single data point, often used in
performance assessments.
Gantt Chart: Use Gantt charts to visualize project schedules and timelines.
Network Diagram: Use network diagrams to represent relationships between nodes, often used in social network
analysis or network flow analysis.
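As an illustration of the box plot entry above, the following minimal sketch computes the values a box plot displays: the median, the quartiles, and potential outliers. The helper name `box_plot_summary`, the sample data, and the 1.5 × IQR outlier rule are assumptions made for this example; charting tools may use slightly different quartile conventions.

```python
import statistics

def box_plot_summary(values):
    """Return the quartiles and outliers that a box plot displays."""
    data = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Hypothetical sample with one unusually large value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 40])
print(summary)
```

In this sample the value 40 lies far above the upper fence, so a box plot would draw it as a separate point rather than as part of the whisker.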
The choice of visualization should consider the nature of your data, your audience, and the insights you want to convey.
Experimenting with different types of visualizations can help you find the most suitable one for your
specific data and goals.
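The guidance in this section can be condensed into a small lookup, sketched below. The goal labels and the function `suggest_chart` are hypothetical names chosen for this example; the mapping only restates the recommendations listed above and is a starting point, not a rule.

```python
# Hypothetical goal labels mapped to chart types recommended in this section.
CHART_FOR_GOAL = {
    "compare_categories": "bar chart",
    "trend_over_time": "line chart",
    "part_of_whole": "pie chart",
    "relationship": "scatter plot",
    "distribution": "histogram",
    "hierarchy": "treemap",
    "flow": "Sankey diagram",
    "geography": "choropleth map",
    "schedule": "Gantt chart",
}

def suggest_chart(goal: str, n_variables: int = 2) -> str:
    """Suggest a chart type for a communication goal."""
    if goal not in CHART_FOR_GOAL:
        raise ValueError(f"unknown goal: {goal!r}")
    # A third variable in a relationship can be encoded as bubble size.
    if goal == "relationship" and n_variables == 3:
        return "bubble chart"
    return CHART_FOR_GOAL[goal]

print(suggest_chart("trend_over_time"))
print(suggest_chart("relationship", n_variables=3))
```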
(4) Design for aesthetics
Designing for aesthetics in data visualization is crucial to engage and effectively communicate with your audience. Here
are some key principles to consider:
Color Choice: Use a harmonious color palette that is visually appealing and ensures good contrast between data
elements. Avoid using too many colors, and consider colorblind-friendly options.
Typography: Choose clear and readable fonts. Use font size, style, and weight to emphasize important information.
Consistency in typography throughout your visualization is key.
Whitespace: Utilize whitespace to separate and group elements, making your visualization less cluttered and more
visually pleasing. It can also help guide the viewer's attention.
Layout: Organize your data in a logical and structured manner. Consider the placement of titles, labels, and legends to
make the visualization easy to understand.
Visual Hierarchy: Use visual cues like size, color, and position to establish a hierarchy of information. Important data
points should stand out and draw the viewer's attention.
Simplicity: Keep your visualization simple and focused on the core message. Avoid unnecessary decorations or
embellishments that distract from the data.
Consistency: Maintain a consistent style throughout your visualization. This includes consistent use of colors, fonts, and
visual elements.
Balance: Distribute visual elements evenly across the visualization to create a sense of balance. This helps prevent a
cluttered or chaotic appearance.
Engagement: Incorporate interactive elements if applicable to allow users to explore the data themselves. Interactivity
can enhance engagement and understanding.
Storytelling: Consider the narrative you want to convey through your data. Arrange your visualizations in a way that tells
a compelling story or highlights key insights.
Feedback: Test your visualization with a diverse group of users and gather feedback to make improvements. User
feedback can help refine the aesthetics and functionality.
Accessibility: Ensure that your visualization is accessible to all users, including those with disabilities. Use alt text for
images, provide text alternatives, and follow accessibility guidelines.
Remember that aesthetics should not overshadow the clarity and accuracy of your data. Striking a balance
between visual appeal and effective communication is key to creating aesthetically pleasing data visualizations.
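The advice on good contrast can be checked numerically. The sketch below computes the contrast ratio between two colors using the relative luminance formula from the Web Content Accessibility Guidelines (WCAG) 2.x. The function names and sample colors are assumptions made for this example; WCAG level AA asks for a ratio of at least 4.5:1 for normal-size text.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel (0-255) to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    """Relative luminance of an (R, G, B) color, between 0 (black) and 1 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)

# Hypothetical label colors on a white chart background.
white = (255, 255, 255)
for name, color in [("black", (0, 0, 0)), ("light gray", (200, 200, 200))]:
    ratio = contrast_ratio(color, white)
    print(f"{name}: {ratio:.2f}:1, passes AA for text: {ratio >= 4.5}")
```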
(5) Choose an effective medium and channel
Choosing an effective medium and channel for data visualization depends on various factors such as your audience, the
complexity of the data, and your communication goals. Here are some options:
Static Infographics: These are suitable for simple data that can be presented in a single image. They are commonly
shared on social media and in reports.
Interactive Web Dashboards: Ideal for complex datasets that require exploration. Tools like Tableau, Power BI, or D3.js
allow users to interact with data, making it useful for data analysts and decision-makers.
Data Videos: Animated videos can effectively convey trends and narratives within data. They are engaging and can be
shared on platforms like YouTube.
Data-Driven Reports: For in-depth analysis, consider reports in PDF or web format. They allow for detailed explanations
alongside visualizations and are often used in business and academia.
Data Art: For creative and artistic presentations, consider using data to create visual art installations or exhibits in
physical or virtual spaces.
Data Storytelling: Use storytelling techniques to weave data into a compelling narrative. This can be done through
articles, blog posts, or presentations.
Augmented Reality (AR) or Virtual Reality (VR): These technologies can immerse users in data environments, making
them suitable for immersive data exploration or training simulations.
Mobile Apps: Develop apps that present data interactively, which can be especially useful for data that needs to be
accessed on-the-go.
Social Media: Share data in bite-sized chunks on platforms like Twitter or Instagram for quick engagement.
Printed Materials: Traditional mediums like posters or brochures are still effective for conveying data in physical
settings, such as conferences or exhibitions.
The choice should align with your audience's preferences and the story you want to tell with the data. Combining
multiple mediums and channels can also be effective, depending on your objectives.
(6) Check the result
Collect and Prepare Data: First, gather the data you want to visualize. Ensure that it's clean, organized, and relevant to
your analysis.
Choose the Right Visualization: Select the type of chart or graph that best represents your data and the insights you
want to convey. Common types include bar charts, line graphs, pie charts, scatter plots, and heatmaps.
Create the Visualization: Use a data visualization tool or software like Excel, Tableau, Python with libraries like
Matplotlib or Seaborn, or online tools like Looker Studio (formerly Google Data Studio). Input your data and design the
visualization according to your preferences.
Interpret the Visualization: Once you have the visualization, analyze it to draw conclusions. Look for trends, patterns,
outliers, or any insights that the visualization reveals.
Compare and Validate: Compare the visualization results with your initial hypotheses or expectations. Ensure that the
insights make sense in the context of your data.
Share and Communicate: Share the visualization with others who need to see the results. You can include it in reports,
presentations, or dashboards to convey your findings effectively.
Iterate if Necessary: If the initial visualization doesn't provide the insights you need, consider adjusting the visualization
type or exploring the data differently. Visualization is an iterative process.
Seek Feedback: Get feedback from colleagues or stakeholders to ensure that your interpretation aligns with their
understanding of the data.
Document and Save: Save your visualization and any related analysis for future reference. Documentation is crucial for
reproducibility.
Take Action: Based on the insights gained from the visualization, make informed decisions or take actions as needed.
Remember that effective data visualization is not just about creating pretty charts; it's about conveying meaningful
information from your data.
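A few of the steps above (prepare the data, create the visualization, compare and validate) can be sketched in plain Python. The records, the region names, and the helper `text_bar_chart` are hypothetical, and a text chart stands in for a real charting tool so that the example has no dependencies.

```python
from collections import defaultdict

# Hypothetical raw records: (region, revenue). None marks a missing value.
raw = [("North", 120.0), ("South", 80.0), ("North", 60.0), ("East", None), ("South", 40.0)]

# Collect and prepare: drop records with missing revenue.
clean = [(region, value) for region, value in raw if value is not None]

# Aggregate per category, as a bar chart would display it.
totals = defaultdict(float)
for region, value in clean:
    totals[region] += value

def text_bar_chart(data, width=30):
    """Render a horizontal bar chart as plain text, scaled to the largest value."""
    largest = max(data.values())
    lines = []
    for label, value in sorted(data.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<6} {bar} {value:g}")
    return "\n".join(lines)

chart = text_bar_chart(totals)
print(chart)

# Compare and validate: the charted totals must add up to the cleaned source data.
assert sum(totals.values()) == sum(value for _, value in clean)
```

The final check is the part that corresponds to "checking the result": if the totals shown in the chart do not reconcile with the source data, the visualization should not be shared yet.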
Data storytelling for social and market communication
Data storytelling is the process of translating data analyses into understandable terms in order to influence a business
decision or action. Data analysis focuses on creating valuable insights from data to give further context and
understanding to an intended audience.
Data storytelling in social and market communication involves using data, visualizations, and narratives to convey
information and insights that are relevant to social or marketing contexts. It is a strategic approach to communicate
complex data in a way that is engaging, persuasive, and relatable to your target audience.
Here's an explanation of data storytelling for social and market communication:
1. Understanding the Audience: Start by understanding the characteristics and preferences of your target
audience. What are their interests, needs, and pain points? This knowledge will guide how you craft and present
your data story.
2. Defining the Message: Clearly define the core message or insight you want to convey through your data
storytelling. Your message should be the central takeaway that you want your audience to remember and act upon.
3. Data Selection: Carefully select the data that supports your message. It should be relevant, reliable, and directly
tied to your communication goals. Ensure the data is accurate and up-to-date.
4. Narrative Structure: Organize your data story into a structured narrative that includes:
Introduction: Set the stage by explaining the context and the issue or question your data addresses.
Data Exploration: Present the data, allowing your audience to grasp the key trends and insights.
Key Message: Highlight the central message or insight that you want to convey.
Supporting Evidence: Provide additional data and visuals that reinforce your primary message.
Conclusion: Summarize the key takeaways and discuss potential implications or actions.
5. Visualization Choices: Select the appropriate data visualizations that best illustrate your points. Common
visualization formats include bar charts, line charts, pie charts, heatmaps, and more.
6. Context and Annotations: Use labels, annotations, and context to help your audience interpret the data. Explain
data sources, units, and other relevant details to enhance understanding.
7. Engaging Storytelling: Incorporate storytelling techniques to make the data more engaging and relatable. Use
anecdotes, examples, and a compelling narrative structure to capture your audience's interest.
8. Emphasis on Impact: Explain how the data insights can have real-world impacts in social or market contexts.
Personalize the data by discussing how it affects individuals, businesses, or the target market.
9. Visual Enhancements: Use visuals, images, infographics, and graphics to enhance the storytelling experience and
make the data more digestible and visually appealing.
10. Comparisons and Benchmarks: Provide benchmarks and comparisons to provide context to your data. Show
how the current situation compares to historical data or industry standards.
11. Simplicity and Clarity: Avoid technical jargon and overly complex charts. Simplify your language and visuals to
ensure that your message is accessible to a broad audience.
12. Interactivity (if applicable): If presenting data online or through interactive media, use interactive elements,
such as tooltips or filters, to allow users to explore the data in more detail.
13. Citations and Transparency: Always provide citations and references for your data sources to build credibility
and ensure transparency.
14. Feedback and Iteration: Test your data story on a small sample of your audience to gather feedback and make
improvements before sharing it more widely.
15. Distribution and Promotion: Share your data story through relevant channels, such as social media, email,
presentations, or marketing campaigns. Effectively promote it to reach your target audience.
16. Measurement of Impact: After sharing your data story, track its impact using metrics such as views, shares,
conversions, and comments to gauge its success and make adjustments for future communications.
Data storytelling in social and market communication is a potent way to connect with your audience, build trust, and
drive informed decisions or actions. By combining the engaging aspects of storytelling with the persuasive qualities of
data visualization, you can effectively communicate complex information in a way that resonates with your audience and
influences their behavior or choices.
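The points on benchmarks and on measuring impact come down to simple arithmetic, sketched below. All numbers, the benchmark value, and the helper names `rate` and `relative_change` are hypothetical and chosen only for illustration.

```python
def rate(part: float, whole: float) -> float:
    """Return part / whole as a fraction, or 0.0 when the whole is zero."""
    return part / whole if whole else 0.0

def relative_change(current: float, benchmark: float) -> float:
    """Return how far `current` is above (+) or below (-) `benchmark`, as a fraction."""
    return (current - benchmark) / benchmark

# Hypothetical metrics collected after a data story was published.
views, shares, conversions = 2000, 50, 80

conversion_rate = rate(conversions, views)
share_rate = rate(shares, views)

# Hypothetical benchmark, for example the conversion rate of an earlier campaign.
benchmark_conversion_rate = 0.032
change = relative_change(conversion_rate, benchmark_conversion_rate)

print(f"Conversion rate: {conversion_rate:.1%}")
print(f"Share rate: {share_rate:.1%}")
print(f"Conversion rate vs. benchmark: {change:+.0%}")
```

Reporting the change against a benchmark, rather than the raw rate alone, gives the audience the context that the list above asks for.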
The three components of data storytelling
Data storytelling comprises data, narrative and visualizations.
1. The data serves as the base of a data story. It is information obtained through accurate data gathering and
analysis. Data can be gathered from sources such as charts and dashboards using data analysis tools.
2. The narrative is a verbal or written storyline that's used to effectively communicate insights from the data. The
narrative should be within the context of the data and aim to show a clear reasoning for following actions or
decisions. Narratives should be based on data and present a clear explanation of what the data means and its
importance.
3. Visualizations act as further representations of both the data and narrative and are used to communicate the story
more clearly. Visualizations include graphs, charts, diagrams and photos.
Trends in market research
Market research has been significantly influenced by advances in data visualization and analytics. Several trends in
market research related to data visualization have emerged to provide deeper insights, improve decision-making, and
enhance communication. Here are some of the notable trends:
1. Interactive Data Dashboards: Interactive dashboards allow users to explore and interact with data in real-time.
These dashboards enable researchers and decision-makers to filter, drill down, and manipulate data to uncover
insights, trends, and patterns.
2. Real-time Data Visualization: The demand for real-time market insights is growing. Businesses are using data
visualization tools to display data as it's collected, helping them make agile decisions and adapt to changing market
conditions swiftly.
3. Advanced Predictive Analytics: Market researchers are using predictive analytics models and visualization
techniques to forecast market trends, customer behavior, and demand patterns. This aids in proactive
decision-making and planning.
4. AI-Driven Data Visualization: Artificial intelligence (AI) and machine learning are being integrated into data
visualization tools to automate data analysis and generate insights. AI algorithms can help identify hidden trends and
anomalies in large datasets.
5. Geo-spatial Analysis: Location-based data visualization is becoming increasingly important. Companies use
geographical data and mapping tools to analyze regional market trends, consumer demographics, and location-based
marketing strategies.
6. Customer Journey Mapping: Data visualization is used to map and visualize the customer journey. This helps
businesses understand customer touchpoints, identify pain points, and improve the customer experience.
7. Big Data Visualization: As the volume of data generated continues to grow, market researchers are turning to big
data visualization tools to make sense of large and complex datasets. These tools are essential for deriving
meaningful insights from vast amounts of information.
8. Cross-platform and Omni-channel Insights: Market researchers need to analyze data from various sources
and channels, including social media, e-commerce platforms, offline retail, and customer feedback. Data visualization
solutions are evolving to provide a comprehensive view of customer behavior across these channels.
9. Custom Data Storytelling: Researchers are creating custom data stories using visualization tools to present
findings in a compelling and narrative format. These stories are designed to make data more accessible and relatable
to stakeholders.
10. Ethical Data Visualization: With concerns about data privacy and transparency, ethical data visualization
practices are gaining prominence. Researchers and businesses are taking measures to ensure that data is presented
accurately and ethically.
11. User-friendly Interfaces: Data visualization tools are becoming more user-friendly and accessible to
non-technical users. This democratizes data analysis and enables a wider range of professionals to create and interpret
visualizations.
12. Data Integration: Market research is increasingly focusing on integrating data from various sources, such as CRM
systems, social media, and third-party data providers, to create a holistic view of the market and consumer behavior.
13. Mobile Optimization: As mobile device usage continues to grow, market research data visualization is being
optimized for mobile platforms. This ensures that insights are accessible to professionals on the go.
14. Explainable AI in Visualization: As AI-driven insights become more common, the need for transparent and
explainable AI models and visualization techniques is increasing. Users want to understand the rationale behind
AI-driven recommendations and insights.
These trends reflect the evolving landscape of market research, driven by the need for more accessible, insightful, and
actionable data visualization solutions. As technology continues to advance, market researchers will continue to
leverage data visualization to stay competitive and make data-driven decisions.
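To make the predictive analytics trend above more concrete, the following minimal sketch fits a straight trend line to past values with ordinary least squares and extends it by one period. The sales figures and the function names are hypothetical; real forecasting models in market research are usually more elaborate (seasonality, uncertainty ranges), so this shows only the basic idea.

```python
def fit_trend_line(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast_next(values):
    """Extend the fitted trend line to the next period."""
    slope, intercept = fit_trend_line(values)
    return intercept + slope * len(values)

# Hypothetical monthly sales figures.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
print(f"Forecast for next month: {forecast_next(monthly_sales):.1f}")
```

On a chart, the fitted line would typically be drawn over the observed points, with the forecast shown as a dashed extension.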
Data visualization dashboards
A data visualization dashboard is a user interface tool that displays critical data and information in a visual,
easy-to-understand format. Dashboards are designed to provide users with a comprehensive, real-time overview of
key performance indicators, metrics, and data in a single, centralized location. Here's an explanation of data visualization
dashboards:
Key Components of Data Visualization Dashboards:
1. Widgets or Visualizations: Dashboards consist of various visual elements or widgets, such as charts, graphs,
tables, maps, and other data representations. These visualizations provide a clear and graphical representation of
data.
2. Data Sources: Dashboards are connected to data sources, which can include databases, spreadsheets, APIs, or
other data repositories. Data is retrieved from these sources and displayed in the dashboard.
3. Interactivity: Dashboards are often interactive, allowing users to explore the data by interacting with the widgets.
Common interactive features include filtering, drilling down into details, and changing date ranges.
4. KPIs (Key Performance Indicators): Dashboards typically display KPIs prominently. KPIs are critical metrics that
provide a quick snapshot of performance and progress toward organizational goals.
5. Widgets Arrangement: The arrangement and layout of widgets on the dashboard are designed to maximize user
comprehension and efficiency. They are often organized in a logical and intuitive manner.
Characteristics of Data Visualization Dashboards:
1. Real-time or Periodic Updates: Dashboards can be set to provide real-time updates or periodic refreshes,
ensuring that users always have access to the most current data.
2. Customization: Users can often customize dashboards to display the specific data and metrics most relevant to
their roles and objectives.
3. Accessibility: Dashboards are typically accessible through web browsers or mobile apps, enabling users to access
data from anywhere with an internet connection.
4. Role-based Access: Access to certain parts of a dashboard or specific data may be restricted based on user roles
and permissions.
Benefits of Data Visualization Dashboards:
1. Data Clarity: Dashboards provide a visual representation of data, making it easier for users to understand complex
information at a glance.
2. Decision Support: Users can make data-driven decisions and respond quickly to changing circumstances, as they
have real-time access to important metrics.
3. Efficiency: Dashboards save time by centralizing data, reducing the need to switch between multiple applications
or reports.
4. Collaboration: They facilitate collaboration by allowing teams to work from a shared data source and discuss
findings based on the same information.
5. Goal Tracking: Dashboards make it easy to track progress toward specific goals and objectives, helping
organizations stay on target.
6. Alerts and Notifications: Some dashboards can be configured to send alerts or notifications when certain
conditions or thresholds are met, enabling proactive management.
Use Cases for Data Visualization Dashboards:
1. Business Analytics: Monitoring sales, marketing, and financial performance.
2. Project Management: Tracking project progress, timelines, and resource allocation.
3. IT Operations: Monitoring system and network performance, security, and uptime.
4. Healthcare: Visualizing patient data, clinical outcomes, and resource utilization.
5. E-commerce: Monitoring website traffic, sales, and customer behavior.
6. Supply Chain Management: Tracking inventory levels, shipment status, and supplier performance.
In summary, data visualization dashboards are powerful tools for presenting and interacting with data, making it easier
for users to analyze and act upon information to support decision-making and improve organizational performance.
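The KPI, goal tracking, and alert ideas described above can be sketched as a small data structure. The class `Kpi`, its fields, and the sample values are hypothetical and do not correspond to any particular dashboard product.

```python
from dataclasses import dataclass

@dataclass
class Kpi:
    """One key performance indicator shown on a dashboard (hypothetical model)."""
    name: str
    value: float
    target: float
    alert_below: float  # send an alert when the value drops under this threshold

    @property
    def progress(self) -> float:
        """Fraction of the target reached so far (goal tracking)."""
        return self.value / self.target

    @property
    def needs_alert(self) -> bool:
        return self.value < self.alert_below

def pending_alerts(kpis):
    """Return the names of all KPIs whose alert condition is met."""
    return [kpi.name for kpi in kpis if kpi.needs_alert]

# Hypothetical dashboard content.
dashboard = [
    Kpi("Monthly revenue", value=90_000, target=100_000, alert_below=80_000),
    Kpi("Uptime %", value=98.5, target=99.9, alert_below=99.0),
]

for kpi in dashboard:
    print(f"{kpi.name}: {kpi.progress:.0%} of target")
print("Alerts:", pending_alerts(dashboard))
```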
Some examples of data visualization dashboards:
Sales Performance Dashboard:
Example: A company's sales team uses a dashboard to track monthly revenue, sales by region, product performance,
and customer acquisition metrics. Users can filter by time period, product category, or salesperson to analyze
performance.
Marketing Analytics Dashboard:
Example: A digital marketing agency uses a dashboard to monitor website traffic, social media engagement, email
campaign performance, and conversion rates. It displays key metrics like click-through rates, bounce rates, and
conversion funnels.
Financial Dashboard:
Example: A CFO of a multinational corporation utilizes a dashboard to monitor financial health. It displays real-time data
on revenue, expenses, profit margins, cash flow, and debt. Users can drill down into specific financial statements and
time periods.
Healthcare Dashboard:
Example: A hospital administrator relies on a healthcare dashboard to track patient admissions, discharges, bed
occupancy, patient outcomes, and the availability of critical resources such as ventilators during a crisis.
Project Management Dashboard:
Example: A project manager overseeing a software development project uses a dashboard to monitor task progress,
resource allocation, budget status, and the project's critical path. It displays Gantt charts, milestone tracking, and issue
status.
E-commerce Analytics Dashboard:
Example: An online retailer uses a dashboard to visualize website traffic, conversion rates, cart abandonment, and
revenue. It tracks metrics such as average order value, customer acquisition cost, and top-selling products.
Supply Chain Dashboard:
Example: A supply chain manager uses a dashboard to monitor inventory levels, supplier performance, shipping
statuses, and demand forecasting. It shows metrics like lead times, order fulfillment rates, and inventory turnover.
Human Resources Dashboard:
Example: An HR manager uses a dashboard to track employee turnover, performance metrics, recruitment progress, and
diversity statistics. It includes data on retention rates, time-to-fill job openings, and training effectiveness.
Social Media Analytics Dashboard:
Example: A social media manager employs a dashboard to monitor brand sentiment, engagement metrics, follower
growth, and content performance across various platforms. It tracks metrics like likes, shares, comments, and click-
through rates.
Energy Consumption Dashboard:
Example: A facility manager uses a dashboard to visualize energy consumption patterns, identify energy-saving
opportunities, and track environmental impact metrics. It displays data on electricity, gas, and water usage.
Customer Support Dashboard:
Example: A customer support team uses a dashboard to monitor service request volumes, response times, customer
satisfaction scores, and common support issues. It provides insights into response efficiency and the effectiveness of
support teams.
Higher Education Dashboard:
Example: A university administration uses a dashboard to track student enrollment, retention rates, graduation rates,
and academic performance. It provides insights into which programs are thriving and where improvements are needed.
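Several of the metrics named in these examples are simple ratios. As an illustration, the sketch below computes three e-commerce metrics mentioned above from raw counts. The input numbers are hypothetical, and the definitions used are common ones (teams sometimes define conversion per visitor rather than per session).

```python
def average_order_value(revenue: float, orders: int) -> float:
    """Revenue divided by the number of orders."""
    return revenue / orders

def conversion_rate(orders: int, sessions: int) -> float:
    """Share of sessions that ended in an order."""
    return orders / sessions

def cart_abandonment_rate(orders: int, carts_created: int) -> float:
    """Share of created carts that did not turn into an order."""
    return 1 - orders / carts_created

# Hypothetical figures for one reporting period.
sessions, carts_created, orders, revenue = 10_000, 800, 200, 15_000.0

print(f"Average order value: {average_order_value(revenue, orders):.2f}")
print(f"Conversion rate: {conversion_rate(orders, sessions):.1%}")
print(f"Cart abandonment rate: {cart_abandonment_rate(orders, carts_created):.1%}")
```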
What is Tableau?
Tableau is a data visualization tool used by data analysts, data scientists, statisticians, and others to
visualize data and draw conclusions from the analysis. It is widely used because it can take in data and
produce the required visualization in a short time, turning data into insights that can guide future
action. Tableau also provides security controls and aims to address security issues as they arise or are
reported by users.
Tableau also lets you prepare, clean, and format data of many types and then create data
visualizations to obtain actionable insights that can be shared with other users. You can query data to obtain
insights from your visualizations and manage metadata in Tableau. It is especially useful in Business
Intelligence because it lets you work with data without deep technical knowledge, whether you are an
individual data analyst or part of a large business team or organization. Organizations using Tableau include
Amazon, Lenovo, Walmart, and Accenture. Tableau offers different products aimed at different types of users,
from individuals to organizations.
Values in Tableau
There are two types of values in Tableau:
Dimensions: Fields that hold qualitative, categorical values used to group and segment data. Tableau treats
dimensions as discrete by default. Example: city name, product name, country name.
Measures: Fields that hold numeric, quantitative values that can be aggregated (for example, summed or
averaged). Tableau treats measures as continuous by default. Example: profit, sales, discount, population.
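To show how the two kinds of values work together, the sketch below groups rows by a dimension (city) and sums a measure (sales) within each group, which is roughly what happens when a dimension and a measure are placed in a Tableau view. The rows and the helper `sum_by` are hypothetical, and the code is plain Python, not a Tableau API.

```python
from collections import defaultdict

# Hypothetical rows: "city" is a dimension, "sales" is a measure.
rows = [
    {"city": "Pune", "sales": 100.0},
    {"city": "Delhi", "sales": 200.0},
    {"city": "Pune", "sales": 50.0},
]

def sum_by(rows, dimension, measure):
    """Group rows by a dimension and sum a measure within each group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

sales_by_city = sum_by(rows, dimension="city", measure="sales")
print(sales_by_city)
```

The dimension decides how many marks (bars, points) appear in the view, while the measure decides the size of each mark.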
Advantages of Tableau
1. Create great visualizations
Of course, the first advantage of a data visualization tool is that you can create wonderful and detailed data
visualizations using data that initially wasn’t very ordered. You can use Tableau Prep to shape, clean, and combine the
data into desired forms so that it can be used for creating data charts, dashboards, visualizations, etc.
2. Obtain detailed insights
You can obtain detailed and unexpected insights from the data using Tableau. You can explore the data from different
angles to see if any patterns emerge, or you can ask open-ended questions of the data and perform various
comparisons to obtain unexpected insights. This effect is heightened when you are using real-time data, as
it changes your viewpoint continuously.
3. User-friendly Approach
Tableau is created for people who don’t have detailed technical skills or much coding experience, so its
user-friendly approach is its greatest strength. You can create detailed data visualizations in Tableau without
many technical skills, as most of its features use a drag-and-drop approach to put the correct parameters in the rows
and columns to create visualizations. This is simple and intuitive enough that even a layman can manage it.
4. Support for Different Data Sources
Tableau can connect to various data sources, data warehouses, and files that contain disparate data and exist in
different kinds of storage mediums. Tableau can access data from the cloud, data that is available in spreadsheets, big
data, non-relational data, etc. Tableau has the capacity to manage data from all these different data sources and
blend these different types of data to create complex and detailed data visualizations that are an asset to IT
companies.
4. Limited ETL Capabilities: Tableau has data preparation capabilities, but for complex ETL (Extract, Transform, Load)
tasks, users may need to rely on external tools or databases.
5. Limited Statistical and Predictive Analysis: Tableau is not a replacement for dedicated statistical or predictive
analytics tools like R or Python. While it has basic statistical functions, it's not designed for in-depth data analysis.
6. Limited Customization of Visualizations: Although Tableau offers various visualization types, there are limits to how
much you can customize their appearance. Users may find it challenging to create very customized or unique
visualizations.
7. Limited Version Control: Tableau doesn't offer built-in version control for workbooks and dashboards, which can be
an issue when collaborating on projects.
8. Complex Data Relationships: Handling complex data relationships and hierarchies can be challenging, and Tableau
may not always handle these situations gracefully.
9. Limited Natural Language Processing (NLP): While Tableau has improved its NLP capabilities, it's not as advanced as
some dedicated NLP tools for analyzing unstructured text data.
10. Offline Access: Tableau Online and Tableau Server are required for online sharing and collaboration. Users who need
offline access to their visualizations may find this limiting.
Features of Tableau
Tableau offers a wide range of features for data visualization and business intelligence, making it a popular tool for data
analysts, business professionals, and organizations looking to gain insights from their data. Here are some key features
of Tableau in data visualization:
1. Data Connectivity: Tableau can connect to various data sources, including databases, spreadsheets, cloud platforms,
and web services. This allows users to import and analyze data from multiple sources.
2. Data Transformation and Cleansing: Tableau provides data preparation tools that allow users to clean, transform,
and shape data for analysis. This includes functions for data cleaning, filtering, pivoting, and aggregating.
3. Drag-and-Drop Interface: Tableau's intuitive, user-friendly interface allows users to create visualizations by simply
dragging and dropping fields onto the canvas. No coding or complex programming is required.
4. Rich Visualization Types: Tableau supports a wide range of visualization types, including bar charts, line charts,
scatter plots, heatmaps, maps, treemaps, and more. Users can choose the most appropriate visualization for their
data.
5. Interactivity: Dashboards and reports created in Tableau are highly interactive. Users can filter data, highlight specific
data points, and drill down into details with ease, facilitating data exploration.
6. Mapping and Geographic Analysis: Tableau provides robust mapping capabilities, allowing users to create
geographic visualizations, plot locations on maps, and perform spatial analytics.
7. Aggregation and Granularity Control: Users can control the level of detail in their visualizations and easily aggregate
data to view high-level trends or drill down to specific details.
8. Data Blending: Tableau allows users to blend data from multiple sources, enabling cross-functional analysis and
insights.
9. Data Forecasting: Tableau supports forecasting for time-series data, making it possible to create predictive models
and visualize future trends.
10. Data Integration and ETL: While not a full ETL tool, Tableau offers data integration capabilities for basic data
transformation, blending, and joining. More advanced ETL can be performed with external tools if needed.
11. Collaboration and Sharing: Tableau allows users to share their visualizations, dashboards, and reports with
colleagues and stakeholders through Tableau Server, Tableau Online, or Tableau Public.
12. Data Security and Permissions: Users can control who has access to their data and dashboards, ensuring that
sensitive information is protected.
13. Performance Optimization: Tableau provides features like data extracts, data source filters, and query optimization
to improve the performance of dashboards, especially with large datasets.
14. Integration and APIs: Tableau can be integrated with various data sources, applications, and programming languages
like Python and R. It offers APIs for extending functionality and automation.
15. Mobile Accessibility: Tableau is designed to be responsive, allowing users to access and interact with visualizations
on various devices, including mobile phones and tablets.
16. Alerts and Notifications: Users can set up data alerts to receive notifications when specific data thresholds or
conditions are met.
These features make Tableau a versatile and powerful tool for data visualization, exploration, and analysis, helping
organizations make data-driven decisions and communicate insights effectively.
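The idea behind aggregation and granularity control (feature 7 above) can be illustrated outside Tableau with a minimal Python sketch. The order records and the aggregate_sales helper are hypothetical and only mimic what Tableau does when a date field is viewed at the year level versus the month level.

```python
from collections import defaultdict

# Hypothetical order records: (order date as YYYY-MM-DD, region, sales amount).
ORDERS = [
    ("2023-01-15", "East", 120.0),
    ("2023-01-20", "West", 80.0),
    ("2023-02-03", "East", 200.0),
    ("2024-01-09", "West", 50.0),
]


def aggregate_sales(orders, level):
    """Sum sales at the requested level of detail: 'year' or 'month'."""
    # Number of leading characters of the date string that identify each level.
    prefix_length = {"year": 4, "month": 7}[level]
    totals = defaultdict(float)
    for order_date, _region, sales in orders:
        totals[order_date[:prefix_length]] += sales
    return dict(totals)


yearly = aggregate_sales(ORDERS, "year")    # high-level trend
monthly = aggregate_sales(ORDERS, "month")  # drill-down to finer detail
print(yearly)
print(monthly)
```

The same rows produce two different views: the yearly totals show the overall trend, while the monthly totals expose more detail.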
Products Offered by Tableau
1. Tableau Desktop:
Description: Tableau Desktop is the core authoring and development tool. It is used by analysts, data scientists, and
business professionals to create interactive data visualizations, dashboards, and reports. With Tableau Desktop,
users can connect to various data sources, clean and transform data, and build visualizations using a drag-and-drop
interface.
Key Features:
Data Connectivity: Tableau Desktop supports a wide range of data sources, including databases, spreadsheets,
web services, and cloud platforms.
Visualizations: Users can create a variety of visualizations, from simple bar charts to complex geographic maps
and interactive dashboards.
Data Preparation: It offers data preparation tools to clean and shape data for analysis.
Calculated Fields: Users can create custom calculated fields to perform advanced calculations.
Interactivity: Dashboards created in Tableau Desktop are highly interactive, allowing users to filter and explore
data.
Disadvantages:
Cost: Tableau Desktop can be expensive for individual users.
Learning Curve: Advanced features may require time to master.
Limited Collaboration: Designed for individual authoring, not collaboration.
2. Tableau Server:
Description: Tableau Server is a web-based platform that enables organizations to share, collaborate, and govern
Tableau content. It provides centralized management of Tableau workbooks and dashboards, making them
accessible to authorized users through a web browser.
Key Features:
Centralized Sharing: Tableau Server allows organizations to publish, share, and collaborate on Tableau content in a
secure and controlled environment.
User Management: Administrators can manage user access, permissions, and security settings.
Data Governance: It offers tools for monitoring usage, ensuring data security, and tracking performance.
Authentication: Supports integration with existing authentication systems for seamless user access.
Disadvantages:
Cost: Tableau Server can be costly to implement and maintain.
IT Dependency: Requires IT support for setup and maintenance.
Limited Mobile Capabilities: Limited mobile interactivity compared to Tableau Mobile.
3. Tableau Online:
Description: Tableau Online is a cloud-based version of Tableau Server. It offers the same functionality as Tableau
Server but is hosted and managed by Tableau in the cloud. It's designed for organizations that prefer not to maintain
their own server infrastructure.
Key Features:
Cloud Hosting: Tableau Online provides cloud-based hosting of Tableau content, eliminating the need for on-
premises infrastructure.
Accessibility: Users can access their Tableau content from anywhere with an internet connection.
Scalability: It can easily scale to accommodate growing data and user needs.
Disadvantages:
Data Privacy Concerns: Data in the cloud may raise privacy and security concerns.
Limited Customization: Fewer customization options compared to on-premises Tableau Server.
4. Tableau Prep:
Description: Tableau Prep is a data preparation tool that helps users clean, shape, and combine data from various
sources before analysis. It provides a visual and interactive interface for data cleaning and transformation.
Key Features:
Data Cleaning: Users can clean and transform data, including tasks like pivoting, splitting, and aggregating.
Data Blending: It allows users to combine data from multiple sources.
Visual Data Flow: Data preparation is performed through a visual data flow interface, making it user-friendly.
Disadvantages:
Limited ETL: Not as powerful as dedicated ETL tools for complex transformations.
Limited Analytics: Primarily focused on data cleaning, not data analysis.
Cost: May require an additional investment.
5. Tableau Mobile:
Description: Tableau Mobile is a mobile application that allows users to access and interact with Tableau
visualizations on their smartphones and tablets. It's designed for on-the-go access to business insights.
Key Features:
Responsive Design: Tableau Mobile provides a responsive design to optimize the user experience on different
mobile devices.
Interactivity: Users can interact with dashboards, apply filters, and explore data on mobile devices.
Disadvantages:
Limited Offline Access: Requires an internet connection to access visualizations.
Smaller Screen: Limited screen real estate for complex dashboards.
6. Tableau Public:
Description: Tableau Public is a free version of Tableau that allows users to create and share public data
visualizations and dashboards with the global Tableau community. However, data shared on Tableau Public is
publicly accessible and cannot be used for confidential or proprietary information.
Key Features:
Free Access: It's a free platform, making it accessible to a wide audience.
Community Engagement: Users can share their visualizations with the Tableau Public community and embed
them on websites and blogs.
Disadvantages:
Public Access: Data shared is publicly accessible and cannot be used for sensitive or proprietary information.
Limited Data Sources: Restricted to a few data sources.
7. Tableau Reader:
Description: Tableau Reader is a free desktop application that allows users to view and interact with Tableau
workbooks and dashboards created by Tableau Desktop. It is primarily for individual use and doesn't offer sharing or
collaboration capabilities.
Key Features:
Viewing: Users can open and view Tableau workbooks and dashboards.
Interactivity: Allows limited interactivity with the visualizations.
Disadvantages:
Limited to Viewing: Cannot create or edit visualizations.
Limited Collaboration: Designed for individual use, no collaboration features.
These are the core Tableau products and services, designed to cover various aspects of data visualization, data
preparation, and collaboration within organizations. Tableau may have introduced new products or changed its
offerings since these notes were written, so it is advisable to visit the official Tableau website for the latest
information.
Tableau Architecture
Tableau Server is designed to connect to many data tiers. It can serve clients on mobile, web, and desktop. Tableau
Desktop is a powerful data visualization tool. Tableau Server is very secure and highly available.
It can run on both physical machines and virtual machines. It is a multi-process, multi-user, and multi-threaded
system.
Providing such powerful features requires a unique architecture.
The different layers of the Tableau Server architecture are as follows:
1. Data server: The primary component of the Tableau architecture is the set of data sources it can connect to.
Tableau can connect to multiple data sources and blend the data from them. It can connect to an Excel file, a
database, and a web application at the same time. It can also define relationships between different types of data
sources.
2. Data connector: The data connectors provide an interface for connecting external data sources to the Tableau Data
Server.
Tableau has a built-in SQL/ODBC connector. This ODBC connector can connect to databases without using their native
connectors. Tableau Desktop offers both live and extract connections, and you can easily switch between live and
extracted data depending on your needs.
o Real-time data or live connection: Tableau can connect to real-time data by linking directly to the external
database. It uses the existing database infrastructure by sending dynamic multidimensional expressions (MDX) and
SQL statements. This links Tableau to the live data rather than importing the data, and takes advantage of an
optimized, fast database system. In most enterprises, the database is large and is updated periodically. In these
cases, Tableau works as a front-end visualization tool connected to the live data.
o Extracted or in-memory data: Tableau has an option to extract data from external data sources. It makes a local
copy in the form of a Tableau extract file. It can extract millions of records into the Tableau data engine with a
single click. Tableau's data engine uses available storage such as RAM and cache memory to process and store data.
Using filters, Tableau can extract a few records from a large dataset. This improves performance, especially when
working on massive datasets. Extracted data allows users to visualize the data offline, without connecting to
the data source.
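The difference between a live connection and an extract can be sketched in Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the external data source, and live_total and build_extract are hypothetical helper names, not Tableau functions.

```python
import sqlite3

# An in-memory SQLite database stands in for the external data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],
)


def live_total(region):
    """Live connection: every request sends a query to the source."""
    row = source.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def build_extract(region):
    """Extract: copy a filtered snapshot of the rows into local memory."""
    return source.execute(
        "SELECT region, amount FROM sales WHERE region = ?", (region,)
    ).fetchall()


east_extract = build_extract("East")

# The source changes after the extract was taken.
source.execute("INSERT INTO sales VALUES ('East', 25.0)")

live_value = live_total("East")                            # sees the new row
extract_value = sum(amount for _, amount in east_extract)  # snapshot only
print(live_value, extract_value)
```

The live query reflects the newly inserted row, while the extract keeps the values from the moment it was built and can be used without contacting the source again.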
3. Components of Tableau Server: The components of Tableau Server are:
o Application server
o VizQL server
o Data server
A. Application server: The application server provides authentication and authorization. It handles permissions and
administration for the mobile and web interfaces. It helps ensure security by recording each session ID on Tableau
Server. The administrator can configure the default session timeout on the server.
B. VizQL server: The VizQL server converts the results of queries to the data source into visualizations. Once a client
request is forwarded to the VizQL process, it sends the query directly to the data source and retrieves the information,
which it renders as images. This visualization or image is presented to the user. Tableau Server creates a cache of
visualizations to reduce load time. The cache can be shared among the users who have permission to view the
visualization.
C. Data server: The data server is used to store and manage the data from external data sources. It is a central data
management system. It provides data security, metadata management, data connection, driver requirements, and
data storage. It stores details related to a data set such as calculated fields, metadata, groups, sets, and parameters.
The data server can extract data as well as make live connections to external data sources.
4. Gateway: The gateway directs requests from users to the Tableau components. When a client sends a request, it
is forwarded to the external load balancer for processing. The gateway distributes requests to the different
components. If there is no external load balancer, the gateway also works as a load balancer. In a single-server
configuration, one gateway or primary server manages all the processes. In a multi-server configuration, one physical
system works as the primary server, and the others are used as worker servers. Only one machine is used as the
primary server in a Tableau Server environment.
5. Clients: The visualizations and dashboards on Tableau Server can be viewed and edited using different clients: a web
browser, mobile applications, and Tableau Desktop.
o Web Browser: Web browsers like Google Chrome, Safari, and Firefox support Tableau Server. The
visualizations and contents of a dashboard can be edited using these web browsers.
o Mobile Application: Dashboards from the server can be viewed interactively using the mobile application or a
mobile browser. It is used to view and edit the contents of the workbook.
o Tableau Desktop: Tableau Desktop is a business analytics tool. It is used to create, view, and publish
dashboards to Tableau Server. Users can access various data sources and build visualizations in Tableau Desktop.
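The request flow described above (the gateway distributes requests, and the VizQL process queries the data source only when a visualization is not already cached) can be sketched as a toy model in Python. The class names, the round-robin policy, and the permission table are illustrative assumptions and do not reflect Tableau Server's actual implementation. For brevity, the permission check that the application server performs is folded into the gateway class here.

```python
from itertools import cycle


class VizQLProcess:
    """Toy stand-in for a VizQL process that renders views and caches them."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache      # cache shared by all processes
        self.queries_sent = 0   # how many times the data source was queried

    def render(self, view):
        if view not in self.cache:
            self.queries_sent += 1  # cache miss: query the data source
            self.cache[view] = f"image of {view}"
        return self.cache[view]


class Gateway:
    """Distributes incoming requests across processes in round-robin order."""

    def __init__(self, processes, permissions):
        self._next_process = cycle(processes)
        self.permissions = permissions  # user -> set of views they may see

    def handle(self, user, view):
        if view not in self.permissions.get(user, set()):
            raise PermissionError(f"{user} may not view {view}")
        return next(self._next_process).render(view)


shared_cache = {}
workers = [
    VizQLProcess("worker-1", shared_cache),
    VizQLProcess("worker-2", shared_cache),
]
gateway = Gateway(workers, {"ana": {"sales"}, "raj": {"sales"}})

first = gateway.handle("ana", "sales")   # rendered by worker-1, fills the cache
second = gateway.handle("raj", "sales")  # routed to worker-2, served from cache
total_queries = sum(worker.queries_sent for worker in workers)
print(first, second, total_queries)
```

Both users receive the same rendered view, but the data source is queried only once, because the second request is served from the shared cache.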
Using the Workspace Control Effectively
If you are accustomed to working with spreadsheets or other analysis tools, learning Tableau's desktop environment will
come easily. If you have no familiarity with spreadsheets or database terminology, you can still be using Tableau
effectively within a few days.
The Data Connection Page and Start Page
Open Tableau, and you see the start page of Tableau Desktop.
On the left side, the Data window gives connection options. If you click Connect to Data, you are taken
to the data connection workspace. You can also access this page by clicking the hard disk tab next to the
Start button. If you want to connect to one of the data sources listed in the On a Server section, you must go to
Tableau's website and download a connector for the required database. There is no limit on the number of data
connection drivers you can install, but some vendors require that you hold a valid license to their software before
downloading their connector.
On the right side of the Connect to Data page, you will see saved data connections. Tableau provides four as
sample data for learning. Any other connections you have saved (.tds files) are displayed there as well. Use the
Home button to return to the start page and look at the Workbooks area. The Workbooks area shows the last nine
workbooks you've opened. If you want to keep a frequently used workbook there, hover over the workbook image and
click the push pin. That will prevent the workbook from being cycled out of view.
To remove a saved workbook from the start page, click the red X that appears when you hover over the workbook's
image. At the bottom of the start page, the Getting Started area provides links to training videos and promotional
materials. The sample workbook area provides links to sample workbooks containing excellent example material.
Clicking on More Samples takes you to Tableau's visual gallery on the web with even more example workbooks.
Analytics Pane: The Analytics pane on the left allows you to add and customize various analytical calculations and
functions.
Parameters and Set Controls: These components, located on the left, enable users to create dynamic parameters
and sets for their visualizations.
Preview Pane: The preview pane on the right shows a preview of your visualization as you make changes to it.
Web Edit Mode: Tableau provides a web-based editing mode where you can create and edit dashboards directly in a
web browser.
Tableau Server Integration: If connected to Tableau Server or Tableau Online, users can publish their visualizations
to the server or online platform for collaboration and sharing.
Tableau's user interface is highly interactive and is designed to facilitate the creation of data visualizations and
dashboards with ease. Depending on your specific task or visualization requirements, you'll interact with these
components to design and analyze your data effectively. Keep in mind that the exact layout and features may vary
depending on the version of Tableau you are using.
Format Menu: This menu is not used very often because pointing at an item and right-clicking gets you to a
context-specific formatting menu more quickly. You may occasionally need to alter the cell size in a worksheet. If you
don't like the default workbook theme, use the Workbook Theme menu to select one of the other two options.
2. Toolbar Icons: The toolbar icons below the menu bar can be used to edit the workbook using features like undo,
redo, new data source, save, slideshow, and so on.
3. Dimension Shelf: The dimensions present in the data source can be viewed in the dimension shelf, for example
customer (customer name, segment), order (order date, order ID, ship date, and ship mode), and location (country,
state, and city).
4. Measure Shelf: The measures present in the data source can be viewed in the measure shelf, for example Discount,
Profit, Profit Ratio, Quantity, and Sales.
5. Sets and Parameters Shelf: User-defined sets and parameters can be viewed in the sets and parameters shelf. It is
also used to edit existing sets and parameters.
6. Page Shelf: The page shelf is used to view the visualization as an animated sequence by placing the relevant field
on the page shelf.
7. Filter Shelf: The filter shelf is used to filter the graphical view with the help of measures and dimensions.
8. Marks Card: The marks card is used to design the visualization. Properties of the marks such as size, color,
path, shape, label, and tooltip can be modified in the marks card.
9. Worksheet: The worksheet is the space where the actual visualization, design, and functionalities are viewed in the
workbook.
10. Tableau Repository: The Tableau repository is used to store all the files related to Tableau Desktop. It includes
various folders like Connectors, Bookmarks, Data sources, Logs, Extensions, Map sources, Shapes, Services, Tab
Online Sync Client, and Workbooks. In this example, the repository is located at the file path
C:\Users\User\Documents\My Tableau Repository.
Toolbar
Tableau's toolbar is an essential part of the user interface that provides quick access to various functions and tools for
creating, customizing, and interacting with data visualizations. The toolbar is located just below the menu bar in Tableau
and contains a range of icons that represent different actions and features.
Here's an overview of the key components and functions of the Tableau toolbar:
1. Open: The "Open" icon allows you to open existing Tableau workbooks or projects.
2. Save: The "Save" icon is used to save your current Tableau workbook or project.
3. Undo and Redo: These icons enable you to undo or redo recent actions in your workbook.
4. New Worksheet: Clicking this icon creates a new worksheet where you can build additional visualizations.
5. New Dashboard: This icon opens a new dashboard for combining multiple visualizations onto a single page.
6. Data Source: Clicking this icon allows you to connect to and configure data sources for your project.
7. Show Data Source: The "Show Data Source" icon displays the Data Source tab, allowing you to work with the data
source, apply data transformations, and shape your data.
8. Sheets and Dashboards: These icons provide quick access to the tabs for individual sheets and dashboards in your
project.
9. Show Me: The "Show Me" icon opens the "Show Me" panel, which provides predefined visualization templates and
helps you select the best chart type for your data.
10. Connect to Data: This icon allows you to establish a connection to different data sources and access the data you
want to visualize.
11. Data Preparation: The "Data Preparation" icon opens Tableau Prep, a tool for cleaning, transforming, and shaping
your data before visualization.
12. Annotations: You can use the "Annotations" icon to add notes, shapes, and lines to your visualizations, making it
easier to communicate insights.
13. Dashboard Options: The "Dashboard Options" icon opens the settings for configuring dashboard layout, size, and
other properties.
14. Worksheet Options: This icon allows you to configure the settings of the current worksheet, including titles,
formatting, and layout.
15. Publish to Tableau Server/Online: Clicking this icon is used to publish your workbook or dashboard to Tableau
Server or Tableau Online for sharing and collaboration.
16. Web Edit: The "Web Edit" icon opens the web editing mode for Tableau Server or Tableau Online, allowing you to
edit and create content directly in the web environment.
17. Server/Online Home: Clicking this icon takes you to the home page of Tableau Server or Tableau Online.
18. User Profile: The "User Profile" icon provides access to your Tableau account and user settings.
19. Tableau Help: The "Tableau Help" icon links to documentation and support resources, including online help and
community forums.
20. Feedback and Product Updates: Clicking this icon allows you to provide feedback on Tableau and check for product
updates.
The toolbar in Tableau streamlines the workflow for building data visualizations and dashboards. It offers quick access to
key functions, making it easier to design, analyze, and share data-driven insights. Keep in mind that the toolbar icons and
options may evolve with new versions of Tableau, so consult the latest Tableau documentation for the most up-to-date
information.
Sheets
In Tableau, a "sheet" is a fundamental component used to create and design visualizations. Sheets are part of a Tableau
workbook, and they allow you to build individual visualizations and explore different aspects of your data. Sheets serve
as the canvas where you can drag and drop data fields, define visualization types, and customize the appearance and
behavior of your charts and graphs.
Here's an overview of sheets in Tableau:
1. Creating a Sheet: To create a new sheet, you can click the "New Worksheet" icon in the toolbar or select
"Worksheet" from the "Sheet" menu. You can also duplicate existing sheets to create variations of your
visualizations.
2. Workspace: A sheet in Tableau provides a workspace or canvas where you can build and design your visualization.
The workspace typically includes columns and rows, which represent the X and Y axes.
3. Drag-and-Drop Interface: The core of creating visualizations in Tableau is the drag-and-drop interface. You can select
dimensions and measures from the data source and drop them onto the canvas to define what to visualize.
4. Shelves: Around the canvas, you'll find shelves and cards where you place fields to control various aspects of your
visualization. These include:
Columns: Defines how data is presented on the X-axis.
Rows: Determines data placement on the Y-axis.
Filters: Lets you filter the data displayed on the sheet.
Marks: Allows you to customize the appearance of data marks.
5. Marks Card: The Marks Card, located to the left of the canvas, is used to specify how the marks (data points) on
your visualization should look. You can control mark properties such as color, size, shape, and label from this card.
6. Data Shading: Depending on the visualization type, you may be able to apply data shading or color encoding to
convey additional information within the visualization.
7. Worksheet Options: Each sheet has a set of worksheet options accessible from the worksheet tab. These options
include formatting settings, titles, tooltips, and other properties specific to that sheet.
8. Field Pane: On the left side of the interface, you can access the Field Pane, which displays the list of available data
fields. You can drag and drop fields from the Field Pane onto the shelves or canvas.
9. Legends and Color Legends:
Legends provide key information about the data in your visualization, such as color encoding and data ranges.
10. Actions and Interactivity:
You can create actions that add interactivity to your visualizations, such as filtering one visualization based on
selections in another.
11. Dashboard Integration:
Sheets can be incorporated into dashboards, allowing you to combine multiple sheets and visualizations onto a
single dashboard.
12. Preview and Interact:
You can use the preview mode to interact with your visualization and see how it behaves when you apply filters or
select data points.
13. Data Source Preview:
You can use the Data Source tab to preview and inspect the data source to verify your field selections.
14. Annotations and Notes:
Sheets allow you to add annotations, notes, and reference lines to provide context to your visualizations.
Sheets in Tableau are versatile and flexible, allowing you to create a wide range of visualizations, from simple bar
charts and scatter plots to complex interactive dashboards. By combining multiple sheets and dashboards, you can
effectively communicate insights from your data to a wide audience.
Dashboards
In Tableau, dashboards are powerful tools for data visualization and reporting. Dashboards allow you to combine
multiple visualizations, filters, and interactive elements into a single, unified view. This enables you to present and
explore data insights in a cohesive and meaningful way.
Components and Features of a Tableau Dashboard:
1. Dashboard Size and Layout: You can customize the size and layout of your dashboard to fit your specific needs. You
can choose from different layout containers and size options.
2. Sheets and Objects: Dashboards allow you to add sheets, visualizations, and objects from your workbook. These
elements can be arranged and resized to create a cohesive layout.
3. Horizontal and Vertical Layout Containers: You can use containers to organize your dashboard elements. Horizontal
and vertical layout containers help you control the placement of visualizations and objects.
4. Sheets and Objects Shelf: The "Sheets" and "Objects" shelf on the left side of the dashboard interface provides a list
of available sheets and objects in your workbook. You can drag and drop these onto the dashboard.
5. Objects: You can add various objects to your dashboard, including images, text, web content, and blank objects,
which can be customized to provide context or explanations for your data.
6. Quick Filters: Quick filters allow users to interactively filter the data on the dashboard. You can add filter controls
that affect multiple visualizations at once.
7. Actions: Actions enable interactivity within a dashboard. You can create actions that let users click on one
visualization to affect the data displayed in another.
8. Legends: You can include legends in your dashboard to provide context for color coding and data range information
used in visualizations.
9. Titles and Text: Dashboards support text and title elements. You can add titles and captions to describe the content
and insights presented.
10. Background and Borders: You can customize the background color, image, and borders of your dashboard to create a
polished and branded look.
11. Layout Containers: Containers are used to group and control the placement of dashboard elements. You can nest
containers to create more complex layouts.
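The effect of a filter action (item 7 above), where a selection in one visualization changes the data shown in another, can be sketched in Python. The rows and the helper names sales_by and filter_action are hypothetical and only imitate the behavior; in Tableau the action is configured through the dashboard interface, not written as code.

```python
# Hypothetical rows behind two sheets on the same dashboard.
ROWS = [
    {"region": "East", "category": "Furniture", "sales": 100},
    {"region": "East", "category": "Technology", "sales": 300},
    {"region": "West", "category": "Furniture", "sales": 150},
]


def sales_by(rows, field):
    """Aggregate sales by one field, as a bar chart sheet would."""
    totals = {}
    for row in rows:
        totals[row[field]] = totals.get(row[field], 0) + row["sales"]
    return totals


def filter_action(rows, field, selected_value):
    """Keep only rows matching the mark selected in the source sheet."""
    return [row for row in rows if row[field] == selected_value]


region_sheet = sales_by(ROWS, "region")  # source sheet
# The user clicks the "East" bar; the target sheet is recomputed from the filtered rows.
category_sheet = sales_by(filter_action(ROWS, "region", "East"), "category")
print(region_sheet)
print(category_sheet)
```

The source sheet still shows every region, while the target sheet shows only the categories for the selected region.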
Advantages of Tableau Dashboards:
1. Data Integration: Dashboards allow you to combine multiple visualizations and data sources into one view, providing
a holistic understanding of your data.
2. Interactivity: Dashboards support user interactivity, such as filtering and parameter actions, making it easy to
explore data.
3. Visual Appeal: You can design visually appealing dashboards with customized layouts, fonts, colors, and images to
effectively communicate insights.
4. Storytelling: Dashboards enable storytelling by arranging visualizations and text in a narrative flow, making it easier
to convey a data-driven story.
5. Efficiency: Users can quickly analyze data from a single dashboard, reducing the need to navigate through multiple
worksheets.
Challenges of Tableau Dashboards:
1. Complexity: Building complex dashboards can be time-consuming and may require advanced skills.
2. Performance: Highly interactive dashboards with multiple components may require careful optimization to ensure
smooth performance.
3. Accessibility: Ensuring that dashboards are accessible to all users, including those with disabilities, can be a
challenge.
4. Maintenance: Regular updates and maintenance of dashboards to accommodate changing data sources and user
requirements are necessary.
How to create a data dashboard:
There are many different solutions to help you build dashboards: Tableau, Excel, or Google Sheets. But at a basic level,
here are important steps to help you build a dashboard:
1. Define your audience and goals: Ask who you are building this dashboard for and what they need to understand.
Once you know that, you can answer their questions more easily with selected visualizations and data.
2. Choose your data: Most businesses have an abundance of data from different sources. Choose only what’s relevant to
your audience and goal to avoid overwhelming your audience with information.
3. Double-check your data: Always make sure your data is clean and correct before building a dashboard. The last thing
you want is to realize in several months that your data was wrong the entire time.
4. Choose your visualizations: There are many different types of visualizations to use, such as charts, graphs, maps, etc.
Choose the best one to represent your data. For example, bar and pie charts can quickly become overwhelming when
they include too much information.
5. Use a template: When building a dashboard for the first time, use a template or intuitive software to save time and
headaches. Carefully choose the best one for your project and don’t try to shoehorn data into a template that doesn’t
work.
6. Keep it simple: Use similar colors and styles so your dashboard doesn’t become cluttered and overwhelming.
7. Iterate and improve: Once your dashboard is in a good place, ask for feedback from a specific person in your core
audience. Find out if it makes sense to them and answers their questions. Take that feedback to heart and make
improvements for better adoption and understanding.
Data Window
The "Data Window" in Tableau is a critical part of the user interface that provides a preview of the data source you've
connected to in your Tableau workbook. It allows you to interact with, examine, and manipulate your data before
creating visualizations. Here's an overview of the Tableau Data Window:
Accessing the Data Window: To access the Data Window in Tableau, follow these steps:
Open your Tableau workbook.
In the Tableau interface, you'll typically find the Data Window located on the left side of the workspace.
Key Features and Functions of the Data Window:
1. Data Source Preview: The Data Window displays a preview of your connected data source, showing the initial rows of
your dataset. This helps you understand the structure and content of your data.
2. Data Fields: The Data Window lists all the data fields available in your dataset, which include dimensions (categorical
variables) and measures (quantitative variables). You can drag and drop these fields onto the canvas to create
visualizations.
3. Data Type Indicators: Next to each data field, Tableau provides data type indicators to inform you whether a field is
recognized as a dimension, measure, or attribute.
4. Field Metadata: When you click on a data field in the Data Window, you can access metadata and details about that
field, such as data type, number of distinct values, and the role assigned to it.
5. Data Sorting and Filtering: You can sort and filter the data in the Data Window to view specific data subsets or order
the data in a particular way.
6. Data Source Editing: The Data Window allows you to make changes to your data source, including creating calculated
fields, renaming fields, and modifying data roles.
7. Data Source Options: Right-clicking on a data field or selecting it provides access to a range of options, including
creating groups, hierarchies, sets, and more.
8. Data Source Split View: You can switch between a single view of your data source and a split view that displays both
your data source and a worksheet for visualization.
9. Data Source Filtering: The Data Window supports data source filtering, allowing you to apply filters to the data
before it's brought into the visualization. This can be useful for reducing data volume.
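The split between dimensions (categorical fields) and measures (quantitative fields) described in item 2 can be illustrated with a small Python sketch. The rule used here, numeric fields become measures and everything else becomes a dimension, is a simplifying assumption; Tableau's own default assignment has more cases, and roles can be changed manually. The sample rows and the assign_roles helper are hypothetical.

```python
from numbers import Number

# Hypothetical sample rows from a connected data source.
SAMPLE = [
    {"Customer Name": "Asha", "Segment": "Consumer", "Sales": 120.5, "Quantity": 2},
    {"Customer Name": "Ben", "Segment": "Corporate", "Sales": 80.0, "Quantity": 1},
]


def assign_roles(rows):
    """Label a field a measure if all its values are numeric, else a dimension."""
    roles = {}
    for field in rows[0]:
        values = [row[field] for row in rows]
        is_numeric = all(
            isinstance(value, Number) and not isinstance(value, bool)
            for value in values
        )
        roles[field] = "measure" if is_numeric else "dimension"
    return roles


roles = assign_roles(SAMPLE)
print(roles)
```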
Advantages of the Data Window:
1. Data Exploration: The Data Window allows you to quickly explore your data, understand its structure, and identify
patterns or anomalies.
2. Data Preparation: You can perform data preparation tasks like cleaning, filtering, and shaping data within the Data
Window.
3. Field Management: It offers tools to manage data fields, such as renaming, changing data types, and defining custom
calculations.
4. Data Role Assignment: You can specify the role of each field, like dimension or measure, to control how it's used in
visualizations.
Challenges of the Data Window:
1. Data Volume: With large datasets, the Data Window may become challenging to work with due to the sheer volume
of data.
2. Complex Data Structures: If your dataset has complex structures or joins, understanding the data relationships in the
Data Window can be challenging.
3. Data Cleanup: Extensive data preparation tasks may require additional tools or external data cleansing processes.
Data Types
Tableau is an easy-to-use Business Intelligence tool for data visualization. Its distinguishing features include real-time
collaboration and data blending. With Tableau, users can connect to databases, files, and other big data sources and
build shareable dashboards from them. Tableau is mainly used by researchers, professionals, and government
organizations for data analysis and visualization.
A data type classifies a data value into a definite type: some values are characters (e.g. ‘Vansh’), some are integers
(e.g. 108), and some are floating-point numbers (e.g. 1.854). In this way, every data value belongs to a certain data type.
Tableau likewise has a set of data types under which it classifies field values. Tableau has six primary data types. As
soon as data is loaded from the source, Tableau automatically detects the data type of each field and assigns it to
the field.
These six data types are:
(1) String Data type: A string is a sequence of characters. A string value is always enclosed within single or double
quotation marks. Examples of strings are “Vansh”, “Hi! How are you?”, and “GeeksforGeeks”.
We can divide the String data type into two types, Char and Varchar.
Char string type- The Char data type stores alphanumeric data values of a fixed length. If the user enters a
string value that is longer than the fixed length of the Char data type, the system returns an error.
Varchar string type- The Varchar data type also stores alphanumeric data values. As the name suggests, Varchar stores
data values of variable length, so the user can enter strings of different lengths without hitting a fixed-length
restriction.
(2) Numeric Data type: This data type covers both integer and floating-point values. Users often prefer the integer type
over the floating-point type, because floating-point values lose precision beyond a certain number of decimal places.
Tableau also provides the ROUND() function, which rounds floating-point values to a specified number of digits.
(3) Date and Time Data type: Tableau supports many date and time formats, such as dd-mm-yy or mm-dd-yyyy. Date and
time values can be expressed at the level of decade, year, quarter, month, hour, minute, second, etc. Whenever the
user enters date and time values, Tableau automatically registers them under the Date data type or the Date & Time
data type.
(4) Boolean Data type: Boolean values are produced as the result of relational calculations. Boolean data values are
either True or False. When the result of a relational calculation is unknown, a Null value is used instead.
(5) Geographic Data type: All values that are used in maps come under the geographic data type. Examples of
geographic data values are country name, state name, city, region, postal codes, etc.
(6) Cluster or Mixed Data type: Sometimes a data set contains values with a mixture of data types. Such values are
known as cluster group values or mixed data values. In such a situation, users can either handle the values manually
or let Tableau handle them.
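The idea of automatic type detection can be sketched in a few lines of Python. This is only an illustration of the concept, not Tableau's actual detection logic; the helper names `infer_type` and `infer_column_type` and the two date formats are assumptions made for the example.

```python
from datetime import datetime


def infer_type(value: str) -> str:
    """Guess a coarse data type for one raw text value (hypothetical helper)."""
    text = value.strip()
    if text.lower() in ("true", "false"):
        return "boolean"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        pass
    # Only two date layouts are tried here; real tools recognize many more.
    for fmt in ("%d-%m-%Y", "%Y-%m-%d"):
        try:
            datetime.strptime(text, fmt)
            return "date"
        except ValueError:
            pass
    return "string"


def infer_column_type(values) -> str:
    """Return the single type shared by all values, or 'mixed' otherwise."""
    kinds = {infer_type(v) for v in values}
    return kinds.pop() if len(kinds) == 1 else "mixed"
```

A column such as `["108", "Vansh"]` comes back as "mixed", which corresponds to the cluster or mixed case described above.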
File Types:
Tableau uses several file types for various purposes. Here are some of the key Tableau file types in detail:
(1) Tableau Workbook (.twb):
This is the primary file format used by Tableau for workbooks. It contains information about data connections,
worksheets, dashboards, and layouts. .twb files do not embed the data; instead, they reference data sources, which can
be in various formats.
(2) Tableau Packaged Workbook (.twbx):
A .twbx file is a packaged workbook, which includes both the workbook (.twb) and the data source. All the necessary
data and metadata are bundled into a single file, making it easier to share with others. Useful when you want to ensure
that recipients have access to the data along with the workbook.
(3) Tableau Data Extract (.hyper):
Tableau Data Extracts are highly optimized, columnar data storage files designed for performance. These files are
typically created when you extract data from a data source in Tableau. Extracts can be used to speed up data
visualization, especially with large datasets.
(4) Tableau Data Source (.tds):
A .tds file is used to save data source information separately from a workbook. It contains metadata about data
connections, custom calculations, and other data source settings. Useful for sharing data source configurations across
multiple workbooks.
(5) Tableau Packaged Data Source (.tdsx):
Similar to a .tds file, but it also includes an embedded data extract (.hyper). This allows you to package the data source
and the extract together for easier sharing.
(6) Tableau Bookmark (.tbm):
A .tbm file is used to save individual worksheet or dashboard settings, including layout and formatting. It enables you to
share specific views or configurations of your Tableau work.
(7) Tableau Workbook Template (.twbt):
A .twbt file is a Tableau workbook template that can be used as a starting point for new workbooks. It contains
predefined formatting, layout, and other settings to maintain consistency across projects.
These file types enable Tableau users to create, save, and share their data visualizations, reports, and data sources
efficiently. The choice of file type depends on your specific needs, such as sharing workbooks with or without data,
packaging data sources, or creating templates.
(C) Date Functions:
Tableau has a variety of date functions to carry out calculations involving dates. Many date functions take a date part
argument, which is a string indicating the part of the date, such as 'month', 'day', or 'year'. The following table lists
some examples of important date functions.
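To make the role of the date part concrete, here is a minimal Python sketch that mimics the behavior of Tableau's DATEPART and DATETRUNC functions. The Python function names `date_part` and `date_trunc` are stand-ins chosen for this example, and only four date parts are handled.

```python
from datetime import date


def date_part(part: str, d: date) -> int:
    """Return one component of a date, selected by a date part string."""
    parts = {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "day": d.day,
    }
    return parts[part]


def date_trunc(part: str, d: date) -> date:
    """Truncate a date to the start of the given date part."""
    if part == "year":
        return date(d.year, 1, 1)
    if part == "quarter":
        first_month = 3 * ((d.month - 1) // 3) + 1
        return date(d.year, first_month, 1)
    if part == "month":
        return date(d.year, d.month, 1)
    if part == "day":
        return d
    raise ValueError(f"unsupported date part: {part}")
```

In both helpers the date part string decides which level of the date the calculation works at, which is the same role it plays in the Tableau functions.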
(E) Aggregate Functions:
Tableau – Operators:
Tableau provides a variety of operators that you can use for different purposes in your calculations and data
visualization. These operators can be categorized into four main types: General operators, Arithmetic operators,
Relational operators, and Logical operators. Here's an overview of each category:
1. General Operators:
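Assignment Operator (=): A calculated field gives a name to the result of a formula. For example, you can create a
calculated field that takes its value from an existing field:
[New Field] = [Existing Field]
Inside a formula, = compares two values rather than assigning one (see Relational Operators).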
Wildcard Operator (*): Used for pattern matching in string data. For example, to find all products with "apple" in
their name, you can use "*apple*".
2. Arithmetic Operators: Arithmetic operators perform mathematical operations on numeric values.
+ (Addition): Used to add values together.
- (Subtraction): Used to subtract values.
* (Multiplication): Used for multiplication.
/ (Division): Used to divide values.
% (Modulus): Returns the remainder of a division operation.
^ (Exponentiation): Raises a number to a power.
Examples:
[Sales] + [Profit]: Adds the "Sales" and "Profit" values.
[Year] - 1: Subtracts 1 from the "Year" field.
[Quantity] * [Price]: Multiplies "Quantity" by "Price".
[Total Sales] / [Number of Orders]: Divides total sales by the number of orders.
3. Relational Operators: Relational operators are used to compare values and return a Boolean result (true or
false). They are commonly used in logical expressions.
= (Equal to): Checks if two values are equal.
<> or != (Not equal to): Checks if two values are not equal.
> (Greater than): Compares if one value is greater than another.
< (Less than): Compares if one value is less than another.
>= (Greater than or equal to): Checks if one value is greater than or equal to another.
<= (Less than or equal to): Checks if one value is less than or equal to another.
Examples:
[Sales] = [Target Sales]: Compares if "Sales" is equal to "Target Sales."
[Profit] > 0: Checks if profit is greater than zero.
[Order Date] < #2023-01-01#: Compares if the order date is before January 1, 2023.
4. Logical Operators: Logical operators are used to create complex logical conditions by combining multiple
conditions.
AND: Returns true if both conditions are true.
OR: Returns true if at least one condition is true.
NOT: Negates a condition (returns true if the condition is false).
Examples:
[Sales] > 1000 AND [Profit] > 100: Returns true if sales are greater than 1000 and profit is greater than 100.
[Category] = 'Electronics' OR [Category] = 'Appliances': Returns true if the category is either 'Electronics' or 'Appliances.'
NOT ([Region] = 'West'): Returns true if the region is not 'West.'
These operators are essential for creating calculated fields, filters, and other logic within Tableau to manipulate and
analyze your data effectively. Depending on your analysis requirements, you can combine these operators in various
ways to build complex calculations and conditions.
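For readers who know a general-purpose language, the operator categories above map closely onto Python. The sketch below evaluates the earlier examples against one hypothetical record; the `row` dictionary and its values are invented for illustration, and the spelling differences (`**` for `^`, `==` for `=`) are specific to Python.

```python
from datetime import date

# Hypothetical record standing in for one row of a data source.
row = {
    "Sales": 1500.0,
    "Profit": 120.0,
    "Quantity": 7,
    "Category": "Electronics",
    "Region": "East",
    "Order Date": date(2022, 11, 3),
}

# Arithmetic: Tableau's % and ^ correspond to Python's % and **.
total = row["Sales"] + row["Profit"]
remainder = row["Quantity"] % 2
squared = row["Quantity"] ** 2

# Relational: each comparison yields a Boolean.
is_profitable = row["Profit"] > 0
before_2023 = row["Order Date"] < date(2023, 1, 1)  # like [Order Date] < #2023-01-01#

# Logical: AND / OR / NOT correspond to and / or / not.
high_value = row["Sales"] > 1000 and row["Profit"] > 100
in_group = row["Category"] == "Electronics" or row["Category"] == "Appliances"
not_west = not (row["Region"] == "West")
```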
Precedence of Operator:
The table below describes the order of precedence of the operators. The top row of the table has the highest
precedence. Operators in the same row have the same precedence.
If two operators have the same precedence, they are evaluated from left to right in the formula. Parentheses can be
used to change this order, and inner parentheses are evaluated before outer parentheses.
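The effect of precedence, left-to-right evaluation, and parentheses can be checked with a few small expressions. The sketch uses Python, whose rules agree with the ones described here for these particular cases (multiplication before addition, AND before OR); it is not a full statement of Tableau's precedence table.

```python
# Multiplication binds tighter than addition.
no_parens = 2 + 3 * 4        # evaluated as 2 + (3 * 4)
with_parens = (2 + 3) * 4    # parentheses force the addition first

# Operators of equal precedence are evaluated left to right.
left_to_right = 10 - 4 - 3   # evaluated as (10 - 4) - 3
regrouped = 10 - (4 - 3)     # parentheses change the grouping

# Inner parentheses are evaluated before outer ones.
nested = ((8 - 2) * (1 + 2)) / 3

# AND is evaluated before OR unless parentheses say otherwise.
and_before_or = True or False and False
or_forced_first = (True or False) and False
```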
Tableau Basic Filters
Tableau provides several basic filters that you can use in data visualization to interactively explore and analyze your
data. These filters allow you to control what data is displayed in your visualizations and dashboards. Here are some of
the fundamental filters in Tableau:
1. Quick Filters:
Quick filters are easy-to-use filters that you can add to your dashboard to allow users to interactively explore the
data. They are applied to one or more worksheets on the dashboard.
Quick filters can be created by right-clicking a field in the Data pane and selecting "Show Quick Filter" or by dragging
a field to the Filters shelf.
2. Filter Actions:
Filter actions are interactive filters that allow you to control one visualization with another. For example, you can
click on a data point in one visualization to filter data in another.
To create filter actions, go to the "Dashboard" menu and select "Actions." Then, choose "Filter" as the action type
and specify the source and target sheets.
3. Dimension Filters:
Dimension filters are used to filter data based on discrete (categorical) fields. They provide a list of categories from
which you can select to filter the data.
You can create dimension filters by dragging a dimension to the Filters shelf.
4. Measure Filters:
Measure filters are used to filter data based on continuous (numeric) fields. They allow you to set a range of values
for the filter.
To create a measure filter, right-click a measure in the Data pane, select "Show Filter," and then choose the "Range of
Values" or "At Least/At Most" options, depending on the filter type.
5. Context Filters:
Context filters are special filters that allow you to create a context in which other filters operate. When you apply a
context filter, Tableau creates a temporary subset of data based on the context filter's conditions.
You can create a context filter by right-clicking a filter in the Filters shelf and selecting "Add to Context."
6. Top N Filters:
Top N filters allow you to filter data to show only the top N items based on a specific measure. For example, you can
use a Top N filter to display the top 10 products by sales.
To create a Top N filter, open the filter dialog for a dimension on the Filters shelf and use the "Top" tab to set the
number of items to display and the measure to rank by.
7. Relative Date Filters:
Relative date filters make it easy to filter data based on time periods relative to the current date. You can use
options like "Last N Days," "Next N Months," etc.
To create a relative date filter, drag a date field to the Filters shelf. Then, choose the "Relative Date" option.
8. Combined Filters:
You can combine multiple filters to create complex filter conditions. This can be useful when you want to filter data
based on multiple dimensions and measures.
To combine filters, use the logical operators (AND, OR) within the filter conditions.
These basic filters in Tableau provide you with the flexibility to control the data displayed in your visualizations, enabling
you and your audience to explore and analyze data interactively and gain valuable insights.
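The following Python sketch shows what dimension, measure, and Top N filters do to a set of rows, and why a context filter matters: the context filter is applied first, so a Top N filter then ranks only the rows that survived it. The sample rows and the function names are hypothetical and only illustrate the logic.

```python
# Hypothetical rows; product, region, and sales values are made up.
rows = [
    {"Product": "A", "Region": "East", "Sales": 500},
    {"Product": "B", "Region": "East", "Sales": 300},
    {"Product": "C", "Region": "East", "Sales": 100},
    {"Product": "D", "Region": "West", "Sales": 900},
    {"Product": "E", "Region": "West", "Sales": 800},
]


def dimension_filter(rows, field, allowed):
    """Keep rows whose categorical field is one of the selected values."""
    return [r for r in rows if r[field] in allowed]


def measure_filter(rows, field, at_least=None, at_most=None):
    """Keep rows whose numeric field lies within the given bounds."""
    kept = []
    for r in rows:
        if at_least is not None and r[field] < at_least:
            continue
        if at_most is not None and r[field] > at_most:
            continue
        kept.append(r)
    return kept


def top_n(rows, dimension, measure, n):
    """Keep rows belonging to the n dimension members with the largest totals."""
    totals = {}
    for r in rows:
        totals[r[dimension]] = totals.get(r[dimension], 0) + r[measure]
    keep = set(sorted(totals, key=totals.get, reverse=True)[:n])
    return [r for r in rows if r[dimension] in keep]


# Region acts as the context: Top 2 is computed within East only.
context_first = top_n(dimension_filter(rows, "Region", {"East"}), "Product", "Sales", 2)
# Without a context, Top 2 is computed over all rows, then Region is applied.
top_first = dimension_filter(top_n(rows, "Product", "Sales", 2), "Region", {"East"})
```

With these rows, `context_first` contains products A and B, while `top_first` is empty because the overall top two products are both in the West region.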
Tableau Literal
In Tableau, a "literal" refers to a constant value or fixed text that you can include in calculated fields, parameters, or
other parts of your data visualization. Literals are static and do not change unless manually modified. They are used to
add specific values or text to your visualizations, which can be particularly useful for creating calculated fields or for
adding annotations to your charts.
There are a few types of literals you can use in Tableau:
1. Numeric Literal: A constant numeric value. For example, you might use a numeric literal in a calculated field to
perform mathematical operations. Here's an example that uses a numeric literal to calculate a 10% discount:
[Price] * 0.10
In this example, 0.10 is a numeric literal representing a 10% discount.
2. String Literal: A fixed text value enclosed in single or double quotation marks. String literals are often used for
labels or annotations in your visualizations. For instance, you might add a string literal as an annotation to a chart:
"Total Sales for 2023"
In this example, the string literal is "Total Sales for 2023."
3. Boolean Literal: A constant value representing either true or false, written as TRUE or FALSE. Boolean values are
commonly used in logical expressions. For example:
[Region] = "East"
In this example, the expression compares the field [Region] with the string literal "East" and evaluates to true or false.
4. Date Literal: A fixed date value that can be used in date calculations and date-based filters. Date literals are
typically expressed as #YYYY-MM-DD#. For example, you can use a date literal to create a filter for a specific date:
[Order Date] = #2023-10-15#
In this example, the date literal represents October 15, 2023.
Using literals in Tableau allows you to add context and specific values to your calculations, making your visualizations
more informative and tailored to your data. They are especially valuable in calculated fields where you need to perform
operations, create custom dimensions, or add labels to your charts.
Tableau Field
In Tableau, a "field" is a fundamental concept that represents a column of data in your dataset or data source. Fields are
essential for data visualization as they provide the structure and content for creating charts, graphs, tables, and other
types of visual representations.
Fields can be broadly categorized into two main types: dimensions and measures.
1. Dimensions:
Dimensions are categorical or discrete fields that represent qualitative data. They provide a way to segment and
categorize data into distinct groups or categories.
Examples of dimensions include:
Product categories
Geographic locations (e.g., countries, cities)
Time-based data (e.g., months, days, years)
Customer names
Dimensions are typically used to define the "slices" or "segments" in your visualizations. They determine how data is
grouped and categorized. You can drag and drop dimensions onto the Rows and Columns shelves to create categorical
views in your visualizations.
2. Measures:
Measures are quantitative or continuous fields that represent numeric data. They provide the basis for performing
mathematical calculations and aggregations.
Examples of measures include:
Sales revenue
Profit
Quantity sold
Temperature readings
Measures are used to perform various mathematical operations, such as SUM, AVG, MAX, MIN, etc., to analyze and
visualize numeric data. Measures are typically placed on the Rows or Columns shelf or on the Marks card, where they
are aggregated, and they are used to create the "meat" of your charts and graphs.
Fields in Tableau are the building blocks for creating data visualizations. By combining dimensions and measures and
applying various functions and filters, you can generate a wide range of visualizations to explore, analyze, and
communicate insights from your data. Fields can be added to different parts of your visualization, such as the Rows and
Columns shelves, the Marks card, or the Filters shelf, to define the structure and appearance of your visualizations.
Fields are the core components that help you turn raw data into meaningful charts and dashboards in Tableau.
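The interplay of dimensions and measures can be sketched as a group-and-aggregate step: the dimension decides the groups and the measure supplies the numbers that are aggregated within each group. The Python below is a conceptual illustration with made-up rows, not a description of how Tableau computes its views.

```python
from collections import defaultdict


def aggregate(rows, dimension, measure, func=sum):
    """Group rows by a dimension and aggregate a measure within each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[dimension]].append(r[measure])
    return {member: func(values) for member, values in groups.items()}


# Hypothetical rows: "Category" is a dimension, "Sales" is a measure.
rows = [
    {"Category": "Furniture", "Sales": 200},
    {"Category": "Technology", "Sales": 450},
    {"Category": "Furniture", "Sales": 100},
    {"Category": "Technology", "Sales": 50},
]

sales_by_category = aggregate(rows, "Category", "Sales")          # SUM
max_by_category = aggregate(rows, "Category", "Sales", func=max)  # MAX
```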
Tableau Parameter
In Tableau, a "parameter" is a dynamic input that allows you to create more interactive and flexible data visualizations.
Parameters are a way to add an extra layer of control to your visualizations, allowing users to change specific values or
criteria and see how those changes affect the data and the visual representation. Parameters are particularly useful
when you want to provide users with options for customizing their views, performing what-if analysis, or exploring data
in various ways.
Here are some key points about parameters in data visualization:
1. Creating Parameters: You can create parameters in Tableau by defining a name and data type (e.g., string, number,
date) for the parameter. Parameters can be created from the "Data" pane or the "Analysis" menu.
2. Using Parameters: Parameters can be used in calculated fields, filters, and reference lines within your visualizations.
They serve as dynamic placeholders for values that users can change.
For example, you can use a parameter to create a calculated field that multiplies a measure by the parameter's
value, allowing users to adjust the multiplier.
3. Control Options: You can specify control options for parameters, such as setting a range of allowable values or
providing a list of predefined choices. Users can interact with the parameter through drop-down lists, sliders, or
input boxes, depending on how you configure it.
4. Dynamic Filtering: Parameters can be used as a way to create dynamic filters. For example, you can use a parameter
to filter data based on a specific dimension or measure, allowing users to change the filter criteria without editing
the view.
5. What-If Analysis: Parameters are valuable for conducting what-if analysis. Users can adjust parameter values to see
how changes impact the visualizations. For instance, you can create a parameter for a discount rate and observe how
different rates affect profit.
6. Dashboard Interaction: Parameters can be included in dashboards, enabling users to adjust values and instantly see
the impact on multiple visualizations. This enhances the interactivity of your dashboards.
7. Reference Lines and Bands: Parameters can be used to dynamically control reference lines and bands in your
visualizations. You can allow users to change reference values or thresholds using parameters.
8. Data Exploration: Parameters make it easier for users to explore data. They can quickly change criteria or apply
different filters without needing to access the underlying data source.
Here's a simplified example of how a parameter can be used in Tableau:
Suppose you have a parameter named "Discount Rate," and you create a calculated field that calculates discounted sales
using this parameter. Users can adjust the "Discount Rate" parameter, and the visualizations will automatically update to
reflect the new discounted sales figures.
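The "Discount Rate" example can be sketched as a function whose second argument plays the role of the parameter. The function name, the 0 to 1 range check, and the sample sales values are assumptions made for this illustration.

```python
def discounted_sales(sales: float, discount_rate: float) -> float:
    """Apply a user-chosen discount rate (0.0 to 1.0) to a sales value."""
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return sales * (1 - discount_rate)


# Hypothetical sales values; changing the rate changes every result.
sales_values = [200.0, 400.0]
at_quarter_off = [discounted_sales(s, 0.25) for s in sales_values]
at_half_off = [discounted_sales(s, 0.5) for s in sales_values]
```

Changing the single rate value recomputes every discounted figure, which mirrors how adjusting a parameter control updates all views that use it.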
Parameters in Tableau offer a powerful way to make your data visualizations more interactive, user-friendly, and
adaptable to different scenarios, allowing users to gain deeper insights from your data.
Here's how parameters work in Tableau and their significance in data visualization:
1. Creating Parameters: Parameters are created through the Parameter dialog in Tableau. You specify the data type
(e.g., string, date, numeric), and you can define a range or list of allowable values.
2. Using Parameters: You can use parameters in various parts of your visualization, such as calculated fields, filters,
reference lines, and calculated field expressions. Essentially, you replace a constant value in your calculations with a
parameter, making it dynamic.
3. Parameter Controls: Parameters are typically displayed as a control in your visualization, often as a dropdown list,
slider, or input box. Users can interact with these controls to change the parameter's value.
4. Dynamic Data Visualization: Parameters enable dynamic data visualization. They allow users to change aspects of the
visualization, such as filtering data, adjusting thresholds, or selecting specific categories, on the fly.
Here are a few common use cases for parameters in data visualization:
Filtering Data: Parameters can be used to create dynamic filters. For example, you can create a parameter for "Top
N" and allow users to select how many items should be displayed in a chart. This provides users with greater control
over what they see.
Comparing Scenarios: You can create parameters to switch between different scenarios or measures in a dashboard.
For instance, you can create a parameter to switch between viewing sales revenue and profit.
Adjusting Thresholds: Parameters can be used to adjust threshold values in your visualizations. Users can change a
parameter to set a different threshold for what is considered a "high" or "low" value.
Highlighting Data: Parameters can control the color, size, or style of marks in a visualization. For example, you can
create a parameter to highlight specific data points based on user preferences.
Custom Calculations: You can use parameters to allow users to input custom calculations. This can be particularly
useful when users want to perform ad-hoc calculations within a dashboard.
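Two of the use cases above, switching between measures and adjusting a threshold, can be sketched as follows. The names `ALLOWED_MEASURES`, `selected_measure`, and `label` are hypothetical; the list of allowed values plays the role of a parameter with a predefined list of choices.

```python
# Allowed parameter values, like a parameter defined with a list of choices.
ALLOWED_MEASURES = ("Sales", "Profit")


def selected_measure(row, measure_choice):
    """Return the measure the user picked through the parameter."""
    if measure_choice not in ALLOWED_MEASURES:
        raise ValueError(f"measure_choice must be one of {ALLOWED_MEASURES}")
    return row[measure_choice]


def label(value, threshold):
    """Classify a value against a user-adjustable threshold parameter."""
    return "high" if value >= threshold else "low"


# Hypothetical row; the same row gives different results as parameters change.
row = {"Sales": 1200, "Profit": 90}
shown_as_sales = selected_measure(row, "Sales")
shown_as_profit = selected_measure(row, "Profit")
```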
Tableau Comment
In Tableau, comments are a feature that allows you to add explanatory or descriptive notes to various elements within
your data visualization projects. Comments can be useful for providing context, explaining the purpose of a visualization,
sharing insights, or collaborating with team members on a specific project. Here's how you can use comments in
Tableau:
Comments and annotations in Tableau help make your data visualizations more informative and facilitate collaboration
within your team. They provide a way to communicate insights, context, and data explanations directly within your
visualizations, which can be invaluable for decision-making and understanding complex data.
Simple Text Visualization:
Text visualization:
Definition: The text visualization chart is the graphical representation of qualitative data frequency, such as keywords or
customer feedback.
The graph gives greater prominence to words that appear more frequently in a source text. The larger the word, the
higher its frequency. You can use the chart to perform exploratory textual analysis by identifying words that frequently
appear in a set of interviews, documents, or other text. Also, you can use it to communicate the most salient points or
themes in the reporting stage.
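The rule "the larger the word, the higher its frequency" can be sketched as a word count followed by a linear mapping from counts to font sizes. The helper names, the size range of 10 to 40, and the simple tokenization are assumptions for this example; word cloud tools usually also drop common stop words.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def font_sizes(freqs, min_size=10, max_size=40):
    """Scale counts linearly so the rarest word gets min_size, the most frequent max_size."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {w: min_size + (c - lo) * (max_size - min_size) / span for w, c in freqs.items()}


# Hypothetical feedback text.
feedback = "good service good price good staff slow service"
freqs = word_frequencies(feedback)
sizes = font_sizes(freqs)
```

In this sample, "good" appears three times and receives the largest size, while words that appear once receive the smallest.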
Uses of text visualization charts:
Summarize Large Amounts of Text: Automatically highlight key terms in a series of texts, and categorize text by
topic, sentiment, and more, saving hours of reading time. With a text visualization or data visualization dashboard,
you can understand text data at a glance.
Make Text Data Easy to Understand: It is often claimed that our brains process visual data 60,000 times faster than
text and numbers. Text visualization examples effectively simplify complex data and communicate ideas and concepts
to team managers.
Find Insights in Qualitative Data: Customer feedback holds a trove of insights. Through text visualization examples,
you can get an overview of the features, products, and topics that are most important to your customers.
Discover Hidden Trends and Patterns: You can easily analyze and visualize insights over time to detect fluctuations,
and quickly find the root cause. Extracting reliable insights from qualitative data sets, such as keywords, should never
be an Achilles' heel for you.
Why do we need text visualization?
Text visualization can help reveal your audience’s thoughts: You can use the chart to understand your audience’s
feelings about a topic or situation. Besides, you can leverage the chart to summarize data-driven views. The chart can
help you summarize the market feedback using first-hand data.
Quick and informative: You can easily get live feedback from your audience in real time.
Exciting and emotional: The chart can help audiences feel part of your data story.
Engaging: The Word Cloud is incredibly engaging and visually appealing to many audiences. The chart can be an
icebreaker or an entry point for a topic of discussion.
Word Clouds are visual: It is often claimed that our brains process visual content 60,000 times faster than text and
numbers. This is the rationale for using a Word Cloud generator to analyze your textual data for actionable insights.
The top 4 text visualization examples are:
(1) Word Cloud
(2) Tag Cloud
Visual displays of information as a form of Table
A data table, or a spreadsheet, is an efficient format for comparative data analysis on categorical objects. Usually, the
items being compared are placed in a column, while the categorical objects are in the rows. The quantitative value is
then placed at the intersection of the row and column, called the cell. The following examples demonstrate data tables.
This table compares monthly payments for buying or leasing various cars (categories). The first two columns are being
compared; the other columns contain additional, secondary information.
Limitations of bar graphs:
1. Ineffectiveness for Continuous Data: Bar graphs are not well-suited for representing continuous data, where values
can take on any number within a range. For such data, histograms or line charts may be more appropriate.
2. Complexity with Many Categories: When dealing with a large number of categories or data points, creating a bar
graph can result in a cluttered and hard-to-read visualization. It becomes inefficient to represent numerous
categories using bars. In such cases, alternatives like treemaps or scatter plots may be more effective.
3. Single-Dimension Data: Bar graphs primarily display data for one variable or dimension at a time. They are not
designed for representing relationships or correlations between multiple variables. If you need to show how two or
more variables interact, other visualizations like scatter plots or bubble charts are more suitable.
4. Precision in Value Comparison: Precisely comparing the values of bars can be difficult, especially when the values
are very close to each other. Users often need to rely on visual estimation, which can lead to inaccuracies. For
precise value comparison, tables or dot plots may be better choices.
5. Challenges with Negative Values: Bar graphs are typically used for representing positive values. When negative
values are involved, it may be less intuitive and may require additional annotation or explanation.
6. Misleading Scaling: In some cases, manipulating the scaling of the y-axis can exaggerate or diminish differences
between data points. This can lead to visual distortions and misinterpretation. It is important to ensure that the
scaling is appropriate and does not misrepresent the data.
7. Not Ideal for Time-Series Data: While bar graphs can be used to display time-series data, they may not effectively
capture trends and patterns over time. Line charts or area charts are often better choices for representing
time-based data.
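The misleading scaling problem from the list above can be quantified with a small calculation: when the y-axis does not start at zero, the ratio of drawn bar heights no longer matches the ratio of the values. The function name and the sample values 100 and 110 are made up for this sketch.

```python
def bar_height_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn height of bar b to bar a when the y-axis starts at axis_start."""
    if a <= axis_start:
        raise ValueError("axis_start must be below the smaller value")
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: 110 is 10% larger than 100.
honest = bar_height_ratio(100, 110)                    # y-axis starts at 0
truncated = bar_height_ratio(100, 110, axis_start=90)  # y-axis starts at 90
```

With the axis starting at 90, the second bar is drawn twice as tall as the first even though the value is only 10% larger.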
Line Graph
A line graph reveals trends or progress over time, and you can use it to show many different categories of data. You
should use it when you chart a continuous data set.
Features of Line Graphs:
1. Time-Series Representation: Line graphs are especially effective for representing data over time. They can show
trends, patterns, and fluctuations in data, making them ideal for tracking changes and developments.
2. Sequential Data: Line graphs are designed to display data points in a sequence. This sequential representation is
particularly useful for showing how data evolves over time, which is common in areas like finance, stock market
analysis, and weather forecasting.
3. Continuous Data: Line graphs are well-suited for continuous data, such as temperature, stock prices, or population
growth, where values can vary across a continuous range.
4. Clear Trend Identification: Line graphs make it easy to identify trends, whether they are upward, downward, or
stable. They are great for visualizing changes in data over time, helping users to spot patterns and irregularities.
5. Interpolation of Values: Line graphs allow for interpolation between data points, which helps in estimating values
between known data points. This can be especially useful for understanding the behavior of data within a time
period.
6. Comparison of Multiple Series: Line graphs can display multiple lines on the same chart, enabling the comparison of
several related datasets simultaneously. This makes it easy to analyze how different variables or categories evolve
over time.
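The interpolation mentioned in the list above is, for a standard line graph, linear interpolation along the straight segment between two neighboring data points. A minimal Python sketch follows, assuming the points are sorted by x with distinct x values; the function name and sample points are invented for the example.

```python
def interpolate(points, x):
    """Estimate y at x from the straight segment between the two surrounding points.

    points: list of (x, y) pairs sorted by x, with distinct x values.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
            return y0 + t * (y1 - y0)
    raise ValueError("x is outside the range of the data")


# Hypothetical readings at times 1, 3, and 4.
readings = [(1, 10.0), (3, 20.0), (4, 18.0)]
estimate_at_2 = interpolate(readings, 2)
estimate_at_3_5 = interpolate(readings, 3.5)
```

Such estimates are only as good as the assumption that the value changes steadily between the two measured points.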
Area Chart
An area chart is basically a line chart, but the space between the x-axis and the line is filled with a color or pattern. It is
useful for showing part-to-whole relations, like showing individual sales reps’ contributions to total sales for a year. It
helps you analyze both overall and individual trend information.
and life cycle stage. A line chart could show more subscribers than marketing qualified leads. But this area chart
emphasizes how much bigger the number of subscribers is than any other group. These charts make the size of a group
and how groups relate to each other more visually important than data changes over time.
Area graphs can help your business to:
- Visualize which product categories or products within a category are most popular.
- Show key performance indicator (KPI) goals vs. outcomes.
- Spot and analyze industry trends.
Design Best Practices for Area Charts:
- Use transparent colors so information isn't obscured in the background.
- Don't display more than four categories to avoid clutter.
- Organize highly variable data at the top of the chart to make it easy to read.
Features of Area Charts:
1. Time-Series Representation: Like line graphs, area charts are effective for representing data over time. They can
convey trends, patterns, and fluctuations, making them suitable for tracking changes and developments.
2. Sequential Data: Area charts are designed to display data points in a sequence, making them well-suited for showing
how data evolves over time. They are particularly useful when you want to emphasize the cumulative nature of data.
3. Continuous Data: Area charts work well with continuous data, just like line graphs. Continuous data, such as
temperature, stock prices, or population growth, can be visually depicted using area charts.
4. Emphasis on Cumulative Values: Area charts provide a clear emphasis on cumulative values. They are great for
illustrating the total effect of several variables or categories over time, showing how individual components
contribute to the whole.
5. Visualizing Change Over Time: Area charts are effective at highlighting changes and trends in data. They emphasize
not only the values at a specific point but also the overall effect of those values over the entire time period.
6. Comparison of Multiple Series: Just like line graphs, area charts can display multiple series on the same chart. This
allows for easy comparison of several related datasets or categories, showcasing their cumulative contributions over
time.
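The cumulative emphasis of a stacked area chart comes from drawing each series on top of the running total of the series before it. The Python sketch below computes those bands; the function name and the two sales reps' figures are hypothetical.

```python
def stack_series(series):
    """Compute the (lower, upper) band of each series in a stacked area chart.

    series: dict mapping a name to a list of values, all of the same length.
    Returns the bands per series and the overall totals per time point.
    """
    length = len(next(iter(series.values())))
    baseline = [0.0] * length
    bands = {}
    for name, values in series.items():
        top = [b + v for b, v in zip(baseline, values)]
        bands[name] = list(zip(baseline, top))
        baseline = top  # the next series is drawn on top of this one
    return bands, baseline


# Hypothetical sales per rep for two periods.
sales = {"Rep A": [10, 20], "Rep B": [5, 15]}
bands, totals = stack_series(sales)
```

The upper edge of the last band is the total, so the chart shows both each rep's contribution and the overall trend.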
Limitations of Area Charts:
1. Not Suitable for Categorical Data: Area charts are not appropriate for representing categorical or discrete data
where values don't have a natural order. For categorical data, bar charts or pie charts are better choices.
2. Overcrowding with Many Data Points: When there is a large number of data points or categories, area charts can
become overcrowded and challenging to interpret. Clutter can obscure the trends or patterns.
3. Non-Specific Trends: Similar to line graphs, area charts may not provide specific information about individual data
points. If precise values are required, additional data points or labels may be needed.
4. Assumes Linearity Between Points: Area charts connect consecutive data points with straight segments, which implies
that values change linearly between observations. Complex non-linear patterns or multiple trends may not be
effectively represented.
5. Scaling Issues: The scaling of the y-axis can significantly impact the interpretation of the data. Inappropriate scaling
can exaggerate or diminish the apparent size of trends, leading to misinterpretations.
6. Difficulty in Comparing Across Categories: Area charts are focused on comparing cumulative values and trends
within a single category or variable over time. They may not be as efficient at comparing different categories or
variables to each other.
Pie Chart
A pie chart shows a static number and how categories represent part of a whole — the composition of something. A pie
chart represents numbers in percentages, and the total sum of all segments needs to equal 100%.
Best Use Cases for This Type of Chart:
The image above shows another example of customers by role in the company. The bar graph example shows you that
there are more individual contributors than any other role. But this pie chart makes it clear that they make up over 50%
of customer roles.
Pie charts make it easy to see a section in relation to the whole, so they are good for showing:
- Customer personas in relation to all customers.
- Revenue from your most popular products or product types in relation to all product sales.
- Percent of total profit from different store locations.
Design Best Practices for Pie Charts:
- Don't illustrate too many categories to ensure differentiation between slices.
- Ensure that the slice values add up to 100%.
- Order slices according to their size.
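The three practices above can be sketched in a few lines of Python. The sketch converts raw counts to percentages of the whole, orders slices by size, and merges the smallest categories into a single slice. The function name `pie_slices`, the `max_slices` parameter, the "Other" label, and the role counts are all hypothetical choices made for this illustration.

```python
def pie_slices(counts, max_slices=5):
    """Turn raw counts into pie-chart slices as (label, percent) pairs.

    Slices are ordered from largest to smallest. If there are more than
    `max_slices` categories, the smallest ones are merged into one
    "Other" slice, which is placed last regardless of its size.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ordered) > max_slices:
        head, tail = ordered[:max_slices - 1], ordered[max_slices - 1:]
        ordered = head + [("Other", sum(value for _, value in tail))]
    return [(label, 100.0 * value / total) for label, value in ordered]


# Hypothetical customer counts by role (made-up numbers).
roles = {
    "Individual contributor": 52,
    "Manager": 20,
    "Director": 12,
    "VP": 8,
    "C-level": 5,
    "Intern": 3,
}
slices = pie_slices(roles, max_slices=4)
for label, percent in slices:
    print(f"{label}: {percent:.1f}%")
```

Because every percentage is computed against the same total, the slices add up to 100% by construction, apart from floating-point rounding.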
Scatter Plot Chart
A scatter plot (or scattergram) shows the relationship between two different variables or reveals distribution
trends. Use this chart when there are many different data points, and you want to highlight similarities in the data set.
This is useful when looking for outliers or understanding your data's distribution.
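As a rough numeric companion to what a scatter plot shows visually, the sketch below measures the strength of a linear relationship with the Pearson correlation coefficient and flags points that lie far from the least-squares line. The function names, the threshold of two standard deviations, and the data are assumptions made for this example; they are one possible way to look for outliers, not the only one.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residual_outliers(xs, ys, threshold=2.0):
    """Return the (x, y) points whose vertical distance from the
    least-squares line exceeds `threshold` standard deviations
    of the residuals."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = sqrt(sum(r * r for r in residuals) / n)
    if spread == 0:
        return []  # every point lies exactly on the line
    return [(x, y) for x, y, r in zip(xs, ys, residuals)
            if abs(r) > threshold * spread]


# Made-up data: y is roughly 2 * x, except for one unusual point at x = 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 20, 10, 12, 14, 16]
print("correlation:", round(pearson_r(xs, ys), 2))
print("outliers:", residual_outliers(xs, ys))
```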
Bubble Chart
A bubble chart is similar to a scatter plot in that it can show distribution or relationship. In addition, a third variable
is shown by the size of each bubble or circle.
Features of Bubble Charts:
1. Three-Dimensional Data: Bubble charts represent data in three dimensions, where two numeric variables are
displayed on the x and y axes, and the third variable is represented by the size of the bubbles. This allows you to
explore relationships and patterns among these three variables simultaneously.
2. Size Encoding: The size of each bubble in the chart is used to encode a specific data attribute or value. This allows
you to convey additional information or context beyond what is shown in a typical scatter plot.
3. Data Density: Like scatter plots, bubble charts can handle a high density of data points. They are particularly useful
when you have a large dataset and want to visualize the distribution of data while showing the relative magnitude of
a third variable.
4. Comparison of Multiple Dimensions: Bubble charts are suitable for comparing data points across three dimensions,
making them valuable for multivariate analysis and identifying relationships or trends that may not be evident in
two-dimensional charts.
5. Identifying Outliers: The use of bubble size allows for the identification of outliers or data points that significantly
differ from the rest. Large bubbles can quickly capture attention and represent unusual or important data points.
6. Customization: Bubble charts often allow for customization of bubble size, color, and labeling, enabling you to
convey specific information and insights as needed.
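Size encoding needs a rule for turning a value into a bubble size. A common convention, assumed here rather than taken from the text above, is to make the bubble's area proportional to the value, which means the radius grows with the square root of the value. The function name `bubble_radii` and the `max_radius` parameter are hypothetical.

```python
from math import sqrt


def bubble_radii(values, max_radius=30.0):
    """Scale non-negative values to bubble radii.

    The largest value gets `max_radius`; all other radii are chosen so
    that bubble AREA (not radius) is proportional to the value.
    """
    if any(v < 0 for v in values):
        raise ValueError("bubble sizes cannot encode negative values")
    largest = max(values)
    if largest == 0:
        raise ValueError("at least one value must be positive")
    return [max_radius * sqrt(v / largest) for v in values]


# Made-up third variable, e.g. revenue per product.
revenue = [25, 100, 400]
print(bubble_radii(revenue, max_radius=20.0))
```

Scaling the radius directly would make a value that is four times larger look sixteen times larger in area, which is why the square root is used in this sketch.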
Waterfall Chart
Use a waterfall chart to show how an initial value changes through intermediate values, either positive or negative, and
results in a final value. Use this chart to reveal the composition of a number. An example of this would be to showcase
how different departments influence overall company revenue and lead to a specific profit number.
Features of Waterfall Charts:
1. Change Analysis: Waterfall charts are primarily used for analyzing changes and variations in a particular measure or
value. They are effective at illustrating how an initial value transitions through various stages to reach a final value.
2. Component Breakdown: Waterfall charts break down the total change into its individual components. Each
component (e.g., increase, decrease, or neutral) is represented as a floating bar, showing how it contributes to the
overall change.
3. Sequential Representation: Waterfall charts display data in a sequential, step-by-step manner, making it easy to
follow the progression from one stage to the next. This allows for clear understanding of how each component
affects the total.
4. Value Attribution: Waterfall charts provide insight into the attribution of value to different factors. They help in
understanding the drivers of changes, such as the contribution of various expenses to profit or the breakdown of
revenue by product category.
5. Total to Components: The total of the waterfall chart is typically represented by a single column at the beginning or
end of the chart, and the components that contribute to the total are shown as floating bars. This provides a clear
visual transition from total to components.
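The floating bars described above can be derived from a running total. The sketch below returns, for each bar, where it starts and ends on the value axis: the first and last bars rest on zero, while each intermediate bar floats between the running total before and after its change. The function name `waterfall_bars`, the "Start" and "End" labels, and the figures are hypothetical.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, top) for each bar of a waterfall chart.

    `start` is the initial value; `changes` is a list of (label, delta)
    pairs applied in order. Total bars span from zero to the total,
    intermediate bars span the gap between consecutive running totals.
    """
    running = float(start)
    bars = [("Start", min(0.0, running), max(0.0, running))]
    for label, delta in changes:
        new_total = running + delta
        bars.append((label, min(running, new_total), max(running, new_total)))
        running = new_total
    bars.append(("End", min(0.0, running), max(0.0, running)))
    return bars


# Made-up figures: opening balance plus gains and costs.
bars = waterfall_bars(100, [
    ("Product sales", 40),
    ("Services", 25),
    ("Operating costs", -70),
    ("Taxes", -15),
])
for label, bottom, top in bars:
    print(f"{label:<16} from {bottom:>6.1f} to {top:>6.1f}")
```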
Limitations of Waterfall Charts:
1. Limited to Change Analysis: Waterfall charts are designed for change analysis and are less suitable for representing
static data or other types of data relationships. They may not effectively convey other types of insights.
2. Complexity with Many Components: When there are many components in a waterfall chart, it can become complex
and challenging to read. Overly detailed charts can lead to visual clutter and reduced clarity.
3. Not Suitable for Multivariate Analysis: Waterfall charts are primarily used for understanding the impact of individual
factors on a measure. They are not designed for analyzing multiple variables or comparing multiple measures.
4. Limited for Time-Series Data: While waterfall charts can represent changes over time, they may not be the best
choice for capturing complex trends and patterns in time-series data. Line charts or area charts may be more
effective for that purpose.
5. Interpretation Required: Waterfall charts may require some level of interpretation, especially when there are many
components. Users need to understand how to follow the sequence and interpret the chart's structure.
6. Limited for Categorical Data: Waterfall charts are most effective for showing changes in numeric data. They are not
suitable for representing categorical or discrete data, which can be better visualized using bar charts, pie charts, or
other methods.
Heat Map
A heat map shows the relationship between two items and provides rating information, such as high to low or poor to
excellent. This chart displays the rating information using varying colors or saturation.
Features of Heat Maps:
1. Data Density: Heat maps are excellent for visualizing high-density data, particularly when you have large datasets
with many data points. They effectively represent data distributions.
2. Two-Dimensional Data: Heat maps are primarily designed for displaying two-dimensional data, such as a matrix of
values or a grid of categories. They are particularly useful for visualizing relationships between two variables.
3. Color Encoding: Heat maps use color to encode data values, with different colors representing different values. This
allows users to quickly identify patterns and variations in the data.
4. Pattern Recognition: Heat maps are effective at revealing patterns, clusters, and trends within data. They can be
used for identifying hotspots (high values) and cold spots (low values) in the data.
5. Customization: Users can often customize the color scheme, making it possible to highlight specific ranges or
features within the data. This enables tailored visualization to convey different insights.
6. Hierarchical Data: Heat maps can be used to visualize hierarchical data, such as tree maps or dendrogram heat
maps, which show relationships between categories and subcategories.
7. Interactivity: In digital environments, heat maps can be interactive. Users can hover over cells or regions to see
specific values or access additional details.
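Color encoding boils down to a mapping from a value to a color. The sketch below uses linear interpolation between a "low" and a "high" RGB color, which is one simple option; the function names and the default colors are assumptions for this illustration, and real tools usually offer carefully designed color scales instead.

```python
def value_to_color(value, vmin, vmax, low=(255, 255, 255), high=(178, 24, 43)):
    """Map `value` in [vmin, vmax] to an RGB tuple between `low` and `high`."""
    if vmax == vmin:
        t = 0.0  # all values are equal, so use the low color everywhere
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


def heat_map_colors(matrix):
    """Return a matrix of RGB tuples, one per cell of a 2-D list of numbers."""
    flat = [v for row in matrix for v in row]
    vmin, vmax = min(flat), max(flat)
    return [[value_to_color(v, vmin, vmax) for v in row] for row in matrix]


# Made-up ratings for two rows and three columns.
ratings = [
    [1, 4, 9],
    [3, 5, 2],
]
for row in heat_map_colors(ratings):
    print(row)
```

The smallest value in the matrix gets the low color and the largest gets the high color, so hotspots and cold spots stand out at the two ends of the scale.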
Gantt Chart
The Gantt chart is a horizontal chart that dates back to 1917. This chart maps the different tasks completed over a
period of time. Gantt charting is one of the most essential tools for project managers. It brings all the completed and
uncompleted tasks into one place and tracks the progress of each. While the left side of the chart displays all the tasks,
the right side shows the progress and schedule for each of these tasks.
This chart type allows you to:
- Break projects into tasks.
- Track the start and end of the tasks.
- Set important events, meetings, and announcements.
- Assign tasks to the team and individuals.
Best Use Cases for This Type of Chart:
Gantt charts are perfect for analyzing, road mapping, and monitoring progress over a period of time. The chart above
divides the different tasks involved in product creation. Each of these tasks has a timeline that can be mapped on the
calendar view. From the vision and strategy to the seed funding round, the Gantt chart helps project management teams
build long-term strategies.
The best part? You can bring the stakeholders, project team, and managers to a single place.
You can use Gantt charts in various tasks, including:
- Tracking employee records in human resources.
- Tracking sales leads in a sales process.
- Planning and tracking construction work.
Design Best Practices for Gantt Charts:
- Use the same color for a group of similar activities.
- Make sure to label the task dependencies to map project start and completion.
- Use light colors that align with the texts and grids of the chart.
Features of Gantt Charts:
1. Task Scheduling: Gantt charts are designed for scheduling and managing tasks and activities. They provide a visual
representation of when tasks start and end.
2. Time Sequencing: Gantt charts display tasks along a timeline, making it easy to see the sequence of tasks, their
durations, and overlaps.
3. Resource Allocation: Gantt charts can be used to allocate resources to tasks. This helps in balancing workloads and
ensuring resources are used efficiently.
4. Dependencies: Gantt charts allow you to show task dependencies, indicating which tasks must be completed before
others can begin. This is crucial for understanding the project's critical path.
5. Progress Tracking: Gantt charts enable real-time tracking of task progress. You can update and adjust task durations
as work is completed or delayed.
6. Customization: Gantt charts are often customizable, allowing you to add task details, assign responsible parties, and
add milestones or critical dates.
7. Resource Visualization: Gantt charts can display resource usage, helping you see when specific resources are
allocated to different tasks.
8. Project Planning: They are essential for project planning, making it easier to see the project's timeline, identify
bottlenecks, and ensure tasks are completed in a logical order.
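Task scheduling, time sequencing, and dependencies can be sketched together in plain Python. In the example below each task starts as soon as all the tasks it depends on have finished, and the result is drawn as a text-only Gantt chart with tasks on the left and bars on the right. The function names, the task list, and the durations are hypothetical; project management tools handle calendars, resources, and progress in far more detail.

```python
def schedule(tasks):
    """Compute (start, end) day offsets for each task.

    `tasks` is a list of (name, duration_in_days, dependencies) tuples.
    Tasks must be listed after the tasks they depend on; a dependency
    that has not been scheduled yet raises a KeyError.
    """
    spans = {}
    for name, duration, deps in tasks:
        # A task can begin once its latest-finishing dependency is done.
        start = max((spans[dep][1] for dep in deps), default=0)
        spans[name] = (start, start + duration)
    return spans


def render_gantt(spans):
    """Draw one text row per task: name on the left, bar on the right."""
    width = max(len(name) for name in spans)
    rows = []
    for name, (start, end) in spans.items():
        rows.append(f"{name:<{width}} |{' ' * start}{'#' * (end - start)}")
    return "\n".join(rows)


# Made-up product plan; durations are in days.
tasks = [
    ("Vision", 3, []),
    ("Design", 4, ["Vision"]),
    ("Build", 5, ["Design"]),
    ("Marketing", 3, ["Vision"]),
    ("Launch", 1, ["Build", "Marketing"]),
]
spans = schedule(tasks)
print(render_gantt(spans))
```

In this made-up plan "Launch" cannot start before day 12 because it waits for "Build", even though "Marketing" finishes much earlier; the chain Vision, Design, Build, Launch is the longest path through the project.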
Limitations of Gantt Charts:
1. Complexity with Large Projects: Gantt charts can become complex and overcrowded with many tasks and
dependencies. This can make them challenging to read and interpret.
2. Difficulty with Non-Linear Projects: While Gantt charts are excellent for linear projects with well-defined
dependencies, they may not be the best choice for complex, non-linear projects with multiple interrelated tasks.
3. Limited for Multidimensional Data: Gantt charts are primarily designed for scheduling tasks along a single timeline.
They may not be suitable for projects that involve multiple dimensions of data, such as budget allocation, task cost,
or resource availability.
4. Resource Overallocation: Gantt charts may not effectively handle situations where resources are overallocated to
multiple tasks at the same time. Managing resource constraints can be challenging.
5. Inadequate for Continuous Monitoring: Gantt charts are less effective for continuous monitoring of dynamic
projects, as they may require frequent updates to reflect real-time changes accurately.
6. Non-Numerical Data: Gantt charts are not well-suited for visualizing non-numerical data or non-scheduling data.
They are task-oriented and may not effectively represent other types of information.
7. Complex Interpretation: Interpreting a Gantt chart can be challenging for individuals unfamiliar with the format, and
it may require some training to understand and work with them effectively.
What is Data Visualization Cluttering?
Data visualization is the representation of data using graphics such as charts, infographics, animations, and
plots. These displays help readers understand the relationships between the different labels and features in
complex data. Generally, data visualization includes techniques such as tables, pie charts, stacked bars, line charts,
area charts, histograms, scatter plots, heat maps, and tree maps. The data cluttering problem occurs when the data
dimension is high. Data cluttering is a disordered collection of graphical entities in a data visualization. Clutter can
misinform readers about the data entities, and it makes decision-making difficult because it hinders the reader's
ability to observe patterns in the data.
There is no single solution to the data cluttering problem, as each case of clutter results from a variation in the
visualization technique and the analysis target.
Data visualization cluttering, also known as visual clutter, refers to the presence of excessive or unnecessary visual
elements within a data visualization, which can hinder understanding and interpretation. Clutter can obscure the
meaningful information in a chart or graph, making it difficult for viewers to discern patterns, trends, and insights. It
occurs when there is an overload of visual cues, data points, labels, or design elements that overwhelm the viewer.
Common sources of clutter in data visualizations include:
1. Overlapping Data Points: When data points, markers, or labels overlap, it can be challenging to distinguish individual
elements, leading to confusion.
2. Too Many Data Points: A high density of data points, particularly in scatter plots or heat maps, can result in
overcrowding, making it difficult to see patterns or trends.
3. Excessive Labels: Adding labels to data points, categories, or axes is essential for understanding, but too many labels
can make the visualization messy and unreadable.
4. Redundant Information: Including redundant or unnecessary information, such as duplicating data in multiple ways,
can clutter the visualization without adding value.
5. Intricate Design Elements: Intricate or overly decorative design elements, like complex color schemes or ornate
chart backgrounds, can introduce clutter and distract from the data.
6. Too Many Categories or Dimensions: Representing too many categories or dimensions within a single visualization
can result in visual complexity and clutter.
[Figure: the same chart shown cluttered and clutter-free]
To reduce clutter in data visualizations, consider the following strategies:
1. Simplify: Remove unnecessary data points, labels, or design elements. Focus on the most critical information.
2. Use Hierarchies: If dealing with a large dataset, consider hierarchical or layered visualizations to provide more detail
on demand.
3. Grouping: Group related data points or categories to reduce visual complexity. Stacked bar charts or treemaps are
examples of group-based visualizations.
4. Colors and Contrast: Use color sparingly and purposefully. Ensure there is adequate contrast between elements for
readability.
5. Interactivity: Implement interactive features that allow users to explore the data in a controlled manner, revealing
details as needed.
6. White Space: Use ample white space to separate and organize elements within the visualization.
7. Clear Labels: Ensure that labels are concise, meaningful, and placed appropriately to guide viewers.
By addressing clutter and designing clean and effective data visualizations, you can enhance the understanding and
impact of your data presentations.
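For dense scatter data, one possible way to apply the "Simplify" and "Grouping" strategies is to aggregate points into grid cells and show one count per cell instead of thousands of overlapping markers. The sketch below is a minimal version of that idea; the function name `bin_points`, the cell size, and the points are assumptions made for the example.

```python
from collections import Counter


def bin_points(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell.

    Returns a Counter keyed by (column, row) cell indices. The counts
    can then be drawn as a heat map instead of individual markers.
    """
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Made-up points: two dense clusters and one isolated point.
points = [(0.2, 0.3), (0.8, 0.1), (1.5, 0.5), (1.7, 0.9), (1.1, 0.2), (3.9, 3.9)]
counts = bin_points(points, cell_size=1.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```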
4. CLOSURE:
Our eyes tend to add any missing pieces of a familiar shape. When faced with ambiguous objects that seem to be
incomplete, open, or in an unusual form, we naturally perceive them as closed or whole. The principle of closure
asserts that we perceive open structures as closed, complete, and regular whenever there is a way that we can
reasonably do so.
We can apply this tendency to perceive whole structures in dashboards, especially in the design of graphs. For example,
this principle explains why only two axes, rather than full enclosure, are required on a graph to define the space in which
the data appears, like in a bar chart with x and y-axis values visible.
5. CONTINUITY:
We perceive objects as belonging together, as part of a single whole, if they are aligned with one another or appear to
form a continuation of one another. It is like the closure principle, but in addition to the visual connection that forms a
shape, we also attach visual direction as part of the continuation.
In a dashboard, things that are aligned with one another appear to belong to the same group. For example, in a pivoted
table or matrix table, it is obvious which groups belong to the subgroup when the hierarchy is expanded. We can see the
groupings without the need for vertical grid lines to delineate them; the distinct alignment alone makes the groupings
easy to distinguish.
6. CONNECTION:
We perceive objects that are connected in some way, such as by a line, as part of the same group. It supersedes other
principles like proximity and similarity in terms of visual grouping perception because putting a direct connection
between objects is a strong factor in determining the grouping of objects. Connection is only weaker when compared to
enclosure.
The principle of connection is especially useful for tying together non-quantitative data, for example, to represent
relationships between steps in a process or between employees in an organization.
To wrap it up, the real purpose behind Gestalt Principles is for us to really understand how we perceive information. As
we have seen, these principles are powerful and when applied correctly and logically, can deliver the right and intended
effect to our audience from our data visualizations.
Types of visual clutter:
Visual clutter can significantly affect the readability and aesthetics of a design. Each type of visual clutter listed below
plays a role in how a design is perceived:
1. Lack of Visual Order:
Explanation: Visual order refers to the organization and structure of elements within a design. When there's a lack of
visual order, elements are positioned without a clear system or structure, leading to a chaotic and disorganized
appearance.
Impact: A lack of visual order can make it difficult for viewers to understand the layout and relationships between
elements, resulting in confusion and frustration.
Example: A poster with text and images randomly scattered on the page without clear alignment or grouping.
2. Alignment:
Explanation: Alignment involves the positioning of elements along a common axis or guide, ensuring that they are
visually connected and organized. Proper alignment contributes to a clean and structured design.
Impact: Misaligned elements disrupt the visual flow, making the design appear disorderly and unprofessional.
Alignment is critical for creating a harmonious and organized look.
Example: A brochure with text that is not vertically or horizontally aligned with images and graphics, causing a sense
of imbalance.
3. White Space:
Explanation: White space, also known as negative space, is the empty space around and between design elements.
Adequate use of white space provides breathing room and separation, while too little or too much white space can
create problems.
Impact: Inadequate white space can lead to overcrowding, making it challenging for viewers to distinguish between
elements. Excessive white space can create a sense of emptiness and disconnection.
Example: A website with excessive spacing between elements that makes the content feel scattered and difficult to
follow.
4. Non-Strategic Use of Contrast:
Explanation: Contrast involves variations in visual elements, such as color, size, font weight, or style, to create
emphasis and visual interest. When contrast is applied inconsistently or excessively, it can create visual chaos.
Impact: Non-strategic contrast can lead to confusion and make it challenging to focus on the main message. It can
also create a visually overwhelming design.
Example: An advertisement where different text elements use a multitude of colors, fonts, and sizes with no clear
hierarchy.
5. Pre-Attentive Attributes:
Explanation: Pre-attentive attributes are visual cues that the brain rapidly processes, often before conscious
attention is directed to them. These attributes include color, size, shape, and orientation.
Impact: Overusing or misusing pre-attentive attributes can introduce visual clutter, as they may conflict and disrupt
the viewer's ability to quickly grasp the intended message.
Example: A data visualization with a multitude of differently colored data points that lack a clear and consistent
scheme, making it hard to discern patterns.
6. Complex Color Schemes: Too many colors or conflicting color choices can disrupt the visual hierarchy and order.
7. Visual Noise: Extraneous details, decorations, or irrelevant content can add noise and hinder the perception of
order.
8. Lack of Focal Points: Without clear focal points, viewers may struggle to identify where their attention should be
directed within the cluttered composition.
9. Ineffective Use of Visual Cues: Misuse or overuse of visual cues like arrows, lines, or icons can lead to confusion
rather than clarity.
Align clutter:
Alignment of clutter typically refers to arranging or organizing cluttered objects or items in a more orderly or visually
pleasing manner. It's a way to bring some order to a chaotic space. Depending on the context, you can align clutter by:
- Grouping similar items together.
- Using containers or storage solutions to keep things organized.
- Sorting items by size, color, or function.
- Creating designated spaces for specific items.
- Regularly de-cluttering and getting rid of items you no longer need.
The specific approach to aligning clutter may vary based on the nature of the clutter and the space you're working with.
What do you mean by white space and non-white space in data visualization?
White Space: White space is the empty space among the elements of a graphical composition. Good use of white
space increases readability and focuses the reader's attention. For example, within a text, white space splits big chunks
of text into small paragraphs, which makes them easier to understand. In addition, white space sets off and highlights
some elements of a visualization, and thus emphasizes the main content.
Non-white space: In data visualization, "non-white space" typically refers to the parts of a chart or graph that are filled
with data or meaningful content, as opposed to the empty or blank areas. White space, in this context, is the empty
space or margins around the data elements. Non-white space contains the visual representations of your data, such as
bars in a bar chart, data points in a scatter plot, or segments in a pie chart. It's the area where the data is presented and
where viewers focus their attention to interpret the information being conveyed.
In data visualization, there are generally two main types of white space:
(1) Macro White Space: This refers to the large gaps or empty areas in a visualization that help separate different
sections or group related elements. Macro white space can be used to improve readability and guide the viewer's
attention.
(2) Micro White Space: This is the smaller, finer spacing between individual elements within a chart or graph. Micro
white space is essential for clarity and to prevent visual clutter. It helps distinguish data points, labels, and other
elements from each other.
These two types of white space play a crucial role in creating effective and aesthetically pleasing data visualizations.
Properly managing white space can enhance the overall understanding of the data being presented.
Contrast:
Contrast refers to the differences between two or more things, often used to highlight distinctions or make
comparisons. In various contexts, such as design, photography, literature, and more, contrast can have advantages and
disadvantages:
Advantages of Contrast:
(1) Clarity: Contrast can enhance clarity and make it easier to distinguish between elements or objects.
(2) Emphasis: It can draw attention to specific details or focal points, making them stand out.
(3) Visual Interest: Contrast can make a design or composition visually engaging and dynamic.
(4) Depth: In art and photography, contrast can create a sense of depth and dimension.
(5) Highlighting Differences: It's useful for highlighting differences, such as in data visualization.
Disadvantages of Contrast:
(1) Overwhelm: Excessive contrast can be overwhelming and create visual fatigue.
(2) Distraction: Too much contrast may distract from the main message or content.
(3) Inconsistency: In some cases, too much contrast can lead to an inconsistent or chaotic look.
(4) Accessibility: High contrast can be challenging for individuals with visual impairments.
(5) Subjectivity: The perception of contrast can vary among individuals, leading to different interpretations.
Strategic use of contrast:
Contrast is a powerful tool in data visualization. Here are some strategic ways to use it:
(1) Highlight Key Data: Use contrast to make important data elements stand out. For example, you can use bold colors
or larger font sizes for key data points to draw attention to them.
(2) Color Contrast: Choose color schemes that provide good contrast. High contrast between data elements and
backgrounds makes it easier for viewers to differentiate and interpret the information. Be mindful of colorblindness
considerations.
(3) Background vs. Data: Ensure that the background of your visualization is neutral and doesn't distract from the data.
Use a light background with dark data elements or vice versa.
(4) Text vs. Data: Contrast between text and data is crucial for readability. Use a readable font and make sure text labels
are clearly legible against the data points they describe.
(5) Grouping and Categorization: Use contrast to visually group and categorize data. For example, you can use different
colors or patterns to distinguish between categories or data series.
(6) Emphasizing Trends: To highlight trends or comparisons, you can use contrast to make certain lines or bars in a chart
more prominent, making it easier for viewers to identify patterns.
Remember that while contrast can be a powerful tool, it should be used thoughtfully to enhance understanding
rather than confuse or overwhelm the viewer. Test your visualizations with potential users to ensure that the use
of contrast aligns with their comprehension and needs.
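Whether two colors provide "good contrast" can also be checked numerically. The sketch below uses the contrast ratio defined in the Web Content Accessibility Guidelines (WCAG 2.x), which ranges from 1:1 for identical colors to 21:1 for black on white. Using WCAG here is an assumption of this example, not something the text above prescribes, and the function names and sample colors are hypothetical. WCAG asks for at least 4.5:1 for normal-size text at level AA.

```python
def _linearize(channel):
    """Convert an 8-bit sRGB channel (0-255) to linear light, per WCAG 2.x."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio between two colors; the order does not matter."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


white = (255, 255, 255)
dark_gray = (51, 51, 51)      # candidate label color
light_gray = (200, 200, 200)  # candidate label color

for name, color in [("dark gray", dark_gray), ("light gray", light_gray)]:
    ratio = contrast_ratio(color, white)
    verdict = "ok for text" if ratio >= 4.5 else "too faint for text"
    print(f"{name} on white: {ratio:.2f}:1 ({verdict})")
```

A check like this complements, but does not replace, testing the visualization with real viewers, and it does not by itself address color vision deficiencies.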
Pre-attentive attributes
Pre-attentive attributes in data visualization refer to visual properties that the human brain can quickly and effortlessly
process without conscious thought. These attributes include things like color, size, position, length, and orientation. They
are used to draw the viewer's attention to specific data points or patterns in a visualization.
Advantages of using pre-attentive attributes in data visualization:
1. Rapid Perception: Pre-attentive attributes are processed very quickly by the brain, allowing viewers to grasp
information at a glance.
2. Effective Highlighting: They can be used to emphasize important data points or trends, making it easier for viewers
to focus on what matters.
3. Reduced Cognitive Load: By leveraging these attributes, visualizations can reduce the cognitive load on viewers,
making it easier for them to understand complex data.
4. Enhanced Communication: Pre-attentive attributes can improve the clarity and effectiveness of data
communication, helping viewers make better decisions.
Disadvantages of using pre-attentive attributes:
1. Misinterpretation: Overuse or misuse of pre-attentive attributes can lead to misinterpretation of data or visual
clutter if not carefully implemented.
2. Subjectivity: The effectiveness of these attributes can vary depending on cultural, contextual, and individual factors,
making it challenging to create universally effective visualizations.
3. Limited Attributes: There are only a limited number of pre-attentive attributes, so it's essential to choose the right
ones for a specific dataset and visualization.
4. Potential Bias: Certain attributes, like color, can introduce bias if not used thoughtfully, potentially leading to
misleading interpretations.
Types of Pre-attentive Attributes:
(1) Color: Changes in color can be used to highlight or differentiate data points. For example, using different colors for
categories in a bar chart.
(2) Size: Varying the size of visual elements, such as points or bars, can represent quantitative values. Larger elements
typically indicate larger values.
(3) Shape: Different shapes can be used to represent different categories or data points. For instance, circles and
squares could represent different product types.
(4) Position: The spatial arrangement of data points on a chart can convey relationships or groupings. For example,
scatter plot points placed higher on the y-axis might indicate higher values.
(5) Length: The length of bars or lines can be used to represent quantities or values. Longer bars typically represent
larger values.
(6) Orientation: The orientation of lines or bars can convey information, such as the direction of change or
trends.
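As a small illustration of using color as a pre-attentive attribute, the sketch below assigns one accent color to the categories that should draw the eye and a muted gray to everything else. The function name `highlight_colors`, the hex colors, and the region names are hypothetical; the resulting list could be passed to whatever charting tool draws the bars.

```python
def highlight_colors(labels, focus, accent="#d62728", muted="#b0b0b0"):
    """Return one color per label: `accent` for labels in `focus`,
    `muted` for all others, so the focus stands out at a glance."""
    focus = set(focus)
    return [accent if label in focus else muted for label in labels]


# Made-up bar chart categories; only "West" should draw the eye.
regions = ["North", "South", "East", "West"]
colors = highlight_colors(regions, focus=["West"])
for region, color in zip(regions, colors):
    print(region, color)
```

Keeping the number of accent-colored elements small is what makes the highlight work; if every bar gets its own strong color, nothing stands out.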