Data Visualization Notes

The document provides an overview of data visualization in Python, highlighting its importance in simplifying complex data and aiding decision-making. It introduces Matplotlib, a key library for creating various visualizations, and explains the role of NumPy in handling numerical data for plotting. Additionally, it covers different types of visualizations, basic rules for effective plotting, and essential nomenclature for understanding plots.

Uploaded by

purveshnagre4
Copyright
© All Rights Reserved

Data Visualization in Python

1. Introduction to Data Visualization


Data Visualization is the process of converting raw data or information into visual
formats such as charts, graphs, maps, histograms, and plots. These visuals make it
easier for humans to understand complex data, identify trends, compare values, and
make decisions quickly.
When you look at numbers in a table, it might take time to find patterns. But if the same
data is shown in a bar chart or line graph, it becomes easier to spot increases, decreases,
outliers, and overall trends.

Key Benefits of Data Visualization:


- Makes complex data simple and understandable
- Reveals trends, patterns, and correlations
- Helps in quick and informed decision-making
- Allows easy comparison of values over time or between groups
- Makes reports and presentations more attractive and engaging

Real-Life Uses of Data Visualization:


Area             Example
Education        Teachers use bar charts to show students' performance in subjects.
Business         Managers use sales graphs to track revenue, profit, and losses.
Health           COVID-19 dashboards showed daily cases, recoveries, and deaths using line and bar charts.
Social Media     Platforms like YouTube and Instagram show analytics (likes, views, growth, engagement rate) using pie charts and line graphs.
Government       Population data, literacy rates, and budget expenditure are visualized through graphs to inform the public.
Weather Reports  Temperature trends, rainfall, or wind speed are shown with line or bar graphs.
Finance          Stock market apps show prices through line charts.

Simple Example:
Suppose a teacher wants to show how many students passed in each subject.
Subject   Students Passed
Maths     40
Science   35
English   45
Hindi     38
Instead of just reading this table, a bar chart would instantly show which subject had the
highest or lowest pass count.
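
As a rough sketch of that idea, the pass counts from the table can be drawn as a bar chart with Matplotlib (the library introduced in the next section). The function name plot_pass_counts and the variable names are illustrative choices, not part of the original notes.

```python
import matplotlib.pyplot as plt

# Pass counts copied from the table above
subjects = ["Maths", "Science", "English", "Hindi"]
students_passed = [40, 35, 45, 38]


def plot_pass_counts(subjects, counts):
    """Draw one bar per subject and return the figure and axes."""
    fig, ax = plt.subplots()
    ax.bar(subjects, counts, color="skyblue")
    ax.set_title("Students Passed per Subject")
    ax.set_xlabel("Subject")
    ax.set_ylabel("Students Passed")
    return fig, ax


fig, ax = plot_pass_counts(subjects, students_passed)

# The tallest and shortest bars correspond to the highest and lowest pass counts
highest = subjects[students_passed.index(max(students_passed))]
lowest = subjects[students_passed.index(min(students_passed))]
print("Highest:", highest, "| Lowest:", lowest)
```

Calling plt.show() afterwards displays the chart. English has the tallest bar and Science the shortest, which is much quicker to see than by reading the table.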

2. Introduction to Matplotlib
Matplotlib is a powerful Python library for creating static, interactive, and animated
visualizations. It works well with NumPy and Pandas. It is used to create a wide variety of
data visualizations such as:

- Line charts
- Bar charts
- Pie charts
- Histograms
- Scatter plots
- And more...

It is especially useful in data analysis and machine learning to visually understand the
data.

Key Features of Matplotlib:

- Versatile: Can create simple to highly customized plots.
- Interactive: Allows zooming, panning, and saving the graph using the tools on the plot window.
- Static Plots: You can export your plots as files (PNG, JPG, PDF, etc.).
- Works with NumPy & Pandas: Easily accepts arrays (from NumPy) and tables (from Pandas) as input.
- Customizable: You can change colors, styles, labels, titles, legends, and more.
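
Two of these features, customization and static export, can be illustrated with a short sketch. The sales figures below are made-up sample data, and the in-memory buffer is only one way to call savefig(); passing a filename such as "sales.png" works as well.

```python
import io

import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]   # made-up sample data
sales = [120, 150, 90, 180]

fig, ax = plt.subplots()
# Customizable: colour, line style, marker and legend label are keyword arguments
ax.plot(months, sales, color="green", linestyle="--", marker="o", label="Sales")
ax.set_title("Monthly Sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units Sold")
ax.legend()

# Static export: savefig() accepts a filename or a file-like object;
# an in-memory buffer is used here to avoid writing to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
print("PNG size in bytes:", len(png_bytes))
```

The same figure could be exported as a PDF by passing format="pdf" instead.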

3. NumPy (used in visualization)

NumPy (Numerical Python) is a Python library used to work with numerical data.
In data visualization, we often work with large collections of numbers like values,
measurements, or statistics. Matplotlib uses NumPy arrays to plot data efficiently.

Why NumPy is Useful in Visualization with Matplotlib:

1. Efficient Data Storage
   ➤ NumPy stores data in arrays (similar to lists, but faster and more memory-efficient
   for numbers), which Matplotlib can plot directly.
2. Generates Data for Plotting
   ➤ You can generate large sets of numbers with NumPy, e.g., 1000 random numbers or the
   values of a math function such as a sine wave.
3. Mathematical Operations
   ➤ NumPy makes it easy to apply math functions (square, square root, sin, cos) to whole
   arrays at once, which is often required before plotting.
4. Faster Calculations
   ➤ NumPy operations run in optimized compiled code, so large datasets can be prepared
   for plotting much faster than with plain Python loops.
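
The first three points can be sketched in a few lines. The array contents and the seed value are arbitrary choices made for illustration, and the speed advantage in point 4 is not measured here.

```python
import numpy as np

# 1. Efficient storage: a Python list converted to a typed array
marks = np.array([4, 9, 16, 25])

# 2. Generating data: 1000 random numbers and 100 evenly spaced x values
rng = np.random.default_rng(seed=0)   # the seed value is an arbitrary choice
random_values = rng.standard_normal(1000)
x = np.linspace(0, 2 * np.pi, 100)

# 3. Mathematical operations apply to every element at once, without a loop
squares = marks ** 2
roots = np.sqrt(marks)
wave = np.sin(x)

print(squares)
print(roots)
print(random_values.shape, wave.shape)
```

Each result is itself an array, so it can be passed straight to a Matplotlib plotting function.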

Example: NumPy with Matplotlib

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100) # 100 points between 0 and 10
y = np.sin(x) # apply sine function
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.grid(True)
plt.show()

Explanation:

- np.linspace(0, 10, 100) creates an array of 100 evenly spaced numbers from 0 to 10.
- np.sin(x) calculates the sine value for each x.
- plt.plot(x, y) uses these arrays to draw a smooth sine wave.

4. Installing Matplotlib
Use the following command to install Matplotlib using pip:

pip install matplotlib
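
One simple way to check that the installation worked is to import the library in Python and print its version. This is a minimal sketch; the version number printed depends on what pip installed.

```python
import matplotlib

# If the import succeeds, Matplotlib is installed; print which version was found
version = matplotlib.__version__
print("Matplotlib version:", version)
```

If the import fails with ModuleNotFoundError, the package was probably installed for a different Python interpreter than the one running the script.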

5. Types of Data Visualizations with Python Examples

1. Line Chart
Line charts are used to show trends or changes over time.

import matplotlib.pyplot as plt


x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]
plt.plot(x, y)
plt.title("Line Chart")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()

2. Bar Chart
Bar charts are used to compare quantities across different categories.

import matplotlib.pyplot as plt


subjects = ['Maths', 'Science', 'English', 'History']
marks = [80, 70, 90, 65]
plt.bar(subjects, marks, color='skyblue')
plt.title("Bar Chart - Marks in Subjects")
plt.xlabel("Subjects")
plt.ylabel("Marks")
plt.show()

3. Histogram
Histograms show the distribution of numerical data.

import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)  # 1000 random values from a standard normal distribution
plt.hist(data, bins=20, color='purple')
plt.title("Histogram")
plt.xlabel("Values")
plt.ylabel("Frequency")
plt.show()
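
To see what a histogram actually counts, the binning step can be done on its own with np.histogram(), which plt.hist() relies on to bin the data. The sketch below uses a seeded random generator (the seed is an arbitrary choice) instead of np.random.randn() so that the output is repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # arbitrary seed for repeatable output
data = rng.standard_normal(1000)       # 1000 values, as in the example above

# counts[i] is the number of values falling between edges[i] and edges[i + 1]
counts, edges = np.histogram(data, bins=20)

print(len(counts), len(edges))   # 20 bins need 21 edges
print(counts.sum())              # every value lands in exactly one bin
```

Each bar in the plotted histogram has a height equal to one entry of counts and spans one pair of neighbouring edges.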

4. Pie Chart
Pie charts show proportions of different categories in a circular format.

import matplotlib.pyplot as plt


labels = ['Python', 'Java', 'C++', 'Ruby']
values = [40, 30, 20, 10]
plt.pie(values, labels=labels, autopct='%1.1f%%')
plt.title("Pie Chart - Programming Language Popularity")
plt.show()

5. Scatter Plot
Scatter plots show the relationship between two variables.

import matplotlib.pyplot as plt


x = [5, 7, 8, 7, 2, 17, 2, 9]
y = [99, 86, 87, 88, 100, 86, 103, 87]
plt.scatter(x, y, color='red')
plt.title("Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()

6. Basic Visualization Rules


- Use proper labels and title
- Choose suitable chart type
- Use readable fonts and colors
- Avoid clutter

Program to show the use of markers:
import pandas as pd
import matplotlib.pyplot as plt
# Creating a Pandas Series
data = pd.Series([10, 15, 13, 18, 20])
# Plotting the line chart with customization
data.plot(kind='line', marker='o', linewidth=2, color='green', linestyle='--')
# Adding title and labels
plt.title("Customized Line Chart")
plt.xlabel("Index")
plt.ylabel("Values")
plt.grid(True)
plt.show()

Explanation:
marker='o': Adds circular markers at each data point.
linewidth=2: Increases the thickness of the line.
color='green': Sets the line color to green.
linestyle='--': Makes the line dashed.
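
Pandas passes these keyword arguments on to Matplotlib, so the same styling can be written without Pandas. The following is a sketch of that equivalent form, not a replacement for the program above; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

values = [10, 15, 13, 18, 20]   # same numbers as the Series above

fig, ax = plt.subplots()
# With only y-values given, Matplotlib uses 0, 1, 2, ... as the x positions
(line,) = ax.plot(values, marker="o", linewidth=2, color="green", linestyle="--")
ax.set_title("Customized Line Chart")
ax.set_xlabel("Index")
ax.set_ylabel("Values")
ax.grid(True)

# The returned line object records the styling that was requested
print(line.get_marker(), line.get_linewidth(), line.get_color(), line.get_linestyle())
```

Calling plt.show() afterwards displays the chart.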

7. Basic Nomenclature of a Plot


Important parts of a plot:
1. Title
2. X-axis Label
3. Y-axis Label
4. Legend
5. Grid
6. Axis Ticks

Important Parts of a Plot (Plot Nomenclature)

1. Title
   - The name of the graph, written at the top.
   - It tells what the graph is about.
   Example: "Sales Report (2024)"
2. X-axis Label
   - Describes what the horizontal axis (left to right) represents.
   - Usually represents categories or time.
   Example: "Months"
3. Y-axis Label
   - Describes what the vertical axis (bottom to top) represents.
   - Usually shows measured values.
   Example: "Number of Products Sold"
4. Legend
   - Explains what different colors or lines in the graph mean (used in multi-line or multi-bar plots).
   - Helps in identifying which line/bar represents what data.
   Example: Blue line = "Class A", Green line = "Class B"
5. Grid
   - Horizontal and vertical lines across the plot that make it easier to read and compare values.
   Use: Helps align points and values visually.
6. Axis Ticks
   - The small marks on both axes (x and y) that show scale or values.
   - Ticks help to read exact values from the plot.
   Example: 10, 20, 30 on y-axis; Jan, Feb, Mar on x-axis.
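
The six parts can be seen together in one small plot. This is a sketch that reuses the example labels from the list above; the numbers for "Class A" and "Class B" are made-up sample values.

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
class_a = [10, 20, 15, 30]   # made-up sample values
class_b = [12, 18, 22, 25]

fig, ax = plt.subplots()
ax.plot(months, class_a, color="blue", label="Class A")
ax.plot(months, class_b, color="green", label="Class B")

ax.set_title("Sales Report (2024)")         # 1. Title
ax.set_xlabel("Months")                     # 2. X-axis label
ax.set_ylabel("Number of Products Sold")    # 3. Y-axis label
ax.legend()                                 # 4. Legend, built from the label= values
ax.grid(True)                               # 5. Grid
ax.set_yticks([10, 20, 30])                 # 6. Axis ticks at chosen values

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

Calling plt.show() afterwards displays the chart with every part labelled as described.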
