Unit 2
Data visualization is a powerful tool for understanding and communicating data insights. The
foundations of data visualization encompass a range of principles and techniques that ensure that visual
representations are effective, accurate, and meaningful. Here are the key foundations of data
visualization.
The foundations of data visualization involve understanding the types of data, adhering to principles of
effective visualization, choosing appropriate visualization types, considering design elements, using
appropriate tools, and maintaining ethical standards. By following these principles, you can create
visualizations that effectively communicate data insights and support informed decision-making.
Quantitative Data: Numeric data that represents quantities (e.g., sales figures,
temperatures). Suitable visualizations include line charts, histograms, and scatter plots.
Categorical Data: Data that represents categories or labels (e.g., product names,
regions). Visualizations like bar charts, pie charts, and grouped bar charts work well.
Temporal Data: Data that changes over time (e.g., stock prices, temperature changes).
Time series plots or Gantt charts are often used.
Geospatial Data: Data with geographic information (e.g., locations, maps). Heatmaps,
choropleth maps, and geographical plots are used for this kind of data.
Clarity: The visualization should be easy to interpret. Avoid clutter and unnecessary
visual elements.
Accuracy: Represent data accurately without misleading through inappropriate scaling,
truncation, or exaggeration.
Consistency: Use consistent colors, shapes, and scales to make comparisons easier.
Focus: Highlight the most important information or trends.
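To see why truncation can mislead, the following minimal sketch compares how two bars are drawn when the axis starts at zero and when it starts just below the smaller value. The sales figures and the baseline of 95 are made-up illustration values, and `drawn_ratio` is a hypothetical helper, not part of any library.

```python
def drawn_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar heights for values a and b when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical figures: region B sold 5% more than region A.
sales_a, sales_b = 100.0, 105.0

honest = drawn_ratio(sales_a, sales_b)                    # axis starts at 0
truncated = drawn_ratio(sales_a, sales_b, baseline=95.0)  # axis starts at 95

print(f"Axis from 0:  bar B is {honest:.2f}x the height of bar A")
print(f"Axis from 95: bar B is {truncated:.2f}x the height of bar A")
```

With the truncated axis, a 5% difference is drawn as a bar twice as tall, which is the kind of exaggeration the Accuracy principle warns against.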
Programming Libraries: Python's Matplotlib, Seaborn, and Plotly; R's ggplot2; and D3.js for
web-based visualizations.
Business Intelligence Tools: Tableau, Microsoft Power BI, QlikView.
6. Ethics in Visualization
User Testing: Gather feedback from target users to understand whether they can interpret
the data correctly.
Iterative Design: Continuously improve visualizations based on feedback and
performance metrics like readability and clarity.
Data visualization is not just about showing data but telling a story:
Identify the key message or insight you want to convey.
Use narrative elements like annotations, highlights, and progression to guide the audience
through the data.
Visualization stages
Data visualization is a multi-stage process that transforms raw data into visual representations
that help in understanding, analyzing, and communicating insights. Each stage plays a crucial
role in ensuring that the final visualization is effective and accurate. Here’s a detailed breakdown
of the stages involved in data visualization:
1. Problem Definition
Objective: Clearly define the goals and questions you want the visualization to address.
Understand what you want to achieve with the visualization. Are you exploring data,
comparing values, showing trends, or highlighting relationships?
Identify who will use the visualization and what needs and expectations they have in
relation to your objectives.
Example: You want to visualize monthly sales trends to understand seasonal patterns and
compare performance across different regions.
2. Data Collection
Source Identification: Determine where your data will come from (databases, APIs,
spreadsheets, etc.).
3. Data Preparation
Data Cleaning: Handle missing values, remove duplicates, and correct errors.
Data Transformation: Convert data types, normalize or scale numerical values, and
encode categorical variables.
Data Aggregation: Summarize data as needed, such as grouping by time periods or
categories.
Example: Clean sales data by filling missing values, converting dates to a standard format, and
aggregating sales figures by month and region.
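The preparation steps in the example above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the records, the two date formats, and the choice to fill missing sales with the mean are all hypothetical, and `parse_date` is a helper written for this sketch.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records: (date string, region, sales). None marks a missing value.
raw = [
    ("2024-01-05", "North", 120.0),
    ("2024-01-05", "North", 120.0),   # exact duplicate
    ("05/01/2024", "South", None),    # missing sales, day/month/year format
    ("2024-01-20", "South", 80.0),
    ("2024-02-03", "North", 150.0),
    ("03/02/2024", "South", 90.0),
]


def parse_date(text: str) -> datetime:
    """Convert the two date formats assumed in this sketch to datetime."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


# 1. Cleaning: remove exact duplicates while keeping the original order.
unique = list(dict.fromkeys(raw))

# 2. Cleaning: fill missing sales with the mean of the known values
#    (one simple strategy among many; choose what suits your data).
known = [s for _, _, s in unique if s is not None]
mean_sales = sum(known) / len(known)
filled = [(d, r, mean_sales if s is None else s) for d, r, s in unique]

# 3. Transformation and aggregation: standardize dates, then sum by (month, region).
monthly = defaultdict(float)
for date_text, region, sales in filled:
    month = parse_date(date_text).strftime("%Y-%m")
    monthly[(month, region)] += sales

for key in sorted(monthly):
    print(key, round(monthly[key], 2))
```

For larger datasets a library such as pandas would normally replace these loops; the standard-library version is used here only to keep each step visible.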
Objective: Design and develop the visualizations based on the insights from exploratory data
analysis (EDA) and the problem definition.
Choose Visualization Type: Select the appropriate chart or graph based on the data and
objectives (e.g., line chart for trends, bar chart for comparisons).
Design Layout: Create a layout that clearly presents the information, including axes,
labels, legends, and titles.
Example: Design a line chart to show monthly sales trends, including labels for axes, a legend to
differentiate regions, and interactive filters to view different time periods or regions.
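A static version of the line chart described in this example might look like the sketch below, using Matplotlib. The monthly figures and region names are invented for illustration. Interactive filters would need a tool such as Plotly or a BI dashboard and are not shown here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical monthly sales per region (illustration values only).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = {
    "North": [120, 150, 140, 160, 170, 165],
    "South": [190, 90, 110, 130, 125, 150],
}

fig, ax = plt.subplots(figsize=(8, 4))
for region, values in sales.items():
    ax.plot(months, values, marker="o", label=region)  # one line per region

ax.set_title("Monthly Sales by Region")
ax.set_xlabel("Month")
ax.set_ylabel("Sales (units)")
ax.set_ylim(bottom=0)        # start at zero to avoid exaggerating differences
ax.legend(title="Region")    # legend differentiates the regions
fig.tight_layout()
```

Each design element named above (axes, labels, legend, title) maps to one call, which makes it easy to check that none has been left out.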
Example: Validate that the line chart correctly reflects sales data and check with users to ensure
they can easily understand and interact with the visualization.
Objective: Publish and share the visualization with the intended audience.
Choose Platform: Decide where and how to deploy the visualization (e.g., web
dashboard, report, presentation).
Format and Export: Export the visualization in the appropriate format (e.g., interactive
web application, static image, PDF).
Example: Publish the line chart on a company dashboard and share it with sales teams via email
or a shared reporting platform.
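As a small sketch of the Format and Export step, the same Matplotlib figure can be written to more than one static format. The placeholder data, file names, and temporary output directory are assumptions made for this example; publishing to a dashboard depends on the platform and is not covered.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [10, 12, 9])   # placeholder data
ax.set_title("Example chart")

out_dir = tempfile.mkdtemp()      # hypothetical output location
png_path = os.path.join(out_dir, "chart.png")
pdf_path = os.path.join(out_dir, "chart.pdf")

fig.savefig(png_path, dpi=200)    # raster image for slides or email
fig.savefig(pdf_path)             # vector PDF for printed reports
```

Matplotlib picks the output format from the file extension, so the choice of format stays a one-line decision.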
Objective: Collect feedback and make improvements based on user input and evolving needs.
Example: Collect feedback from sales managers on the usability of the sales trends dashboard
and make updates to address any issues or incorporate additional features.
In the field of data visualization, visual variables are fundamental elements used to encode
information in graphical representations. Jacques Bertin, in his seminal work "Sémiologie
Graphique," identified several visual variables that can be used to represent data effectively.
These visual variables help in distinguishing different data elements and conveying various types
of information.
Data Type Taxonomy:
This taxonomy categorizes visualizations based on the type of data they represent, such
as quantitative, qualitative, temporal, and geospatial data.
Visual Encoding Taxonomy:
This taxonomy categorizes visualizations based on the type of visual encoding used to represent
data, such as position, shape, and color.
1. Position - the location of an element on a visual display, such as a graph or a map. For
example, a scatter plot shows the position of data points relative to two axes.
2. Shape - the form of an element on a visual display. For example, different shapes can be
used to represent different types of data in a scatter plot.
3. Color - the hue of an element on a visual display. For example, a heat map uses color to
represent the magnitude of a variable.
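The three visual variables listed above can be combined in a single scatter plot. The sketch below uses Matplotlib with invented data: position encodes the x and y values, marker shape encodes a category, and color encodes a magnitude. The records, category names, and marker choices are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical records: (x, y, category, magnitude)
points = [
    (1.0, 2.0, "A", 0.2),
    (2.0, 3.5, "A", 0.6),
    (3.0, 1.5, "B", 0.9),
    (4.0, 4.0, "B", 0.4),
]
markers = {"A": "o", "B": "s"}   # shape encodes the category

fig, ax = plt.subplots()
for category, marker in markers.items():
    group = [p for p in points if p[2] == category]
    scatter = ax.scatter(
        [p[0] for p in group],               # position: x value
        [p[1] for p in group],               # position: y value
        c=[p[3] for p in group],             # color: magnitude
        cmap="viridis", vmin=0.0, vmax=1.0,  # shared scale so colors are comparable
        marker=marker,
        label=f"Category {category}",
    )

fig.colorbar(scatter, ax=ax, label="Magnitude")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```

Fixing `vmin` and `vmax` keeps the color scale identical across both groups, in line with the Consistency principle.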
Task Taxonomy:
This taxonomy categorizes visualizations based on the type of task they are designed to support,
such as exploration, comparison, or explanation.
Domain Taxonomy:
This taxonomy categorizes visualizations based on the domain or field of application, such as
scientific visualization, research, or business intelligence.
Technology Taxonomy:
This taxonomy categorizes visualizations based on the technology used to create them, such as
interactive, animated, geospatial, or network visualizations.
Experimental Semiotics Based on Perception: Gibson's Affordance Theory
• Experimental semiotics is a field that explores how humans use and interpret signs and
symbols in communication. One influential theory in this field is the Affordance Theory,
developed by psychologist James J. Gibson.
• Semiotics is the study of signs and symbols, focusing on how meaning is generated and
communicated.
Gibson's Affordance Theory suggests that perception is an active process that involves the
interpretation of environmental cues (cues around a person that inform them what is
happening and how to respond) in relation to the individual's goals and intentions.
According to this theory, objects and environments have inherent affordances, or potential
actions that they enable or constrain. For example, a chair affords sitting, a door affords opening,
and a staircase affords climbing.
• Experimental semiotics based on Affordance Theory seeks to investigate how people use
and interpret signs and symbols in relation to their perceived affordances.
• For example, a stop sign affords stopping, and a green traffic light affords moving
forward. By manipulating the signs and symbols presented to participants, researchers
can explore how they interpret and respond to different affordances.
• One example of an experimental semiotics study based on Affordance Theory is a study
that investigated how people interpret and respond to road signs. Participants were
presented with a series of road signs with different colors, shapes, and symbols, and were
asked to indicate what action they would take in response to each sign. The results
showed that participants' responses were strongly influenced by the perceived affordances
of the signs, highlighting the importance of affordance-based interpretation in
communication.
1. Sensory Input: The person detects the visual stimulus of the streetlight.
5. Memory: The person stores the information about the streetlight in their memory.
6. Response: The person continues walking down the street without reacting to the
streetlight.
• Illustrations, photographs, video, concept maps, graphs and charts, and many other visual
stimuli can be used to great effect in the classroom. Photos and art depicting historical
events, for example, can help students connect with the past. Graphs and charts are
excellent ways to illustrate comparisons and changes.