Unit 4 python

Bca 3rd sem Bangalore University

Uploaded by

bhuvaneshnair21

Unit IV

Data visualization in Python is the process of creating charts, graphs, and plots to
represent data visually, making it easier to understand patterns, trends, and insights.

General Steps for Data Visualization:

1. Collect and Prepare Data: Gather, clean, and preprocess data to ensure
accuracy and relevance.
2. Choose the Visualization Type: Select a chart or plot type (e.g., bar chart,
scatter plot, histogram) that best represents the data.
3. Plot the Data: Use a visualization library (e.g., Matplotlib, Plotly, or
seaborn) to create the chart.
4. Customize and Enhance: Add titles, labels, legends, and gridlines to improve
readability and interpretability.
5. Present or Save: Share the visualization in reports, presentations, or files for
further use.
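
The five steps above can be sketched in a few lines of Matplotlib. This is only a minimal
illustration: the month names, sales figures, and the output file name monthly_sales.png
are made-up values, not data from these notes.

```python
import matplotlib.pyplot as plt

# Step 1: collect and prepare the data (made-up monthly sales figures)
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 150, 170]

# Step 2: choose the visualization type (a bar chart suits a few categories)
plt.figure()

# Step 3: plot the data
plt.bar(months, sales, color="steelblue")

# Step 4: customize and enhance
plt.title("Monthly Sales")
plt.xlabel("Month")
plt.ylabel("Units Sold")
plt.grid(True, axis="y")

# Step 5: save the chart to a file (hypothetical file name)
plt.savefig("monthly_sales.png")
```

Calling plt.show() instead of plt.savefig() would display the chart in a window.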

Advantages of Data Visualization in Python

1. Easy to Understand: Simplifies complex data into visual formats like charts
and graphs.
2. Quick Insights: Helps identify patterns and trends faster.
3. Customizable: Offers flexibility to design visuals as needed using various
libraries.
4. Supports Big Data: Handles large datasets effectively.
5. Interactive Visuals: Creates interactive and dynamic plots for better analysis.

Different ways of generating the data

• Generating Data with CSV

CSV (Comma-Separated Values) is a plain-text format that stores tabular data as rows,
with the values in each row separated by commas, which makes it easy to export and
share. Python's built-in csv module reads and writes such files.
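
As a rough sketch of how the csv module might be used, the block below writes a small
table to a file and reads it back. The file name sales.csv and the values in it are
assumptions made for illustration.

```python
import csv

# Made-up sample rows: a header followed by yearly sales figures
rows = [
    ["Year", "Sales"],
    [2021, 150],
    [2022, 180],
    [2023, 210],
]

# Write the rows to a CSV file (newline="" avoids blank lines on Windows)
with open("sales.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back; every value comes back as a string
with open("sales.csv", newline="") as f:
    records = list(csv.DictReader(f))

# Convert the strings to integers before using them as numbers
years = [int(r["Year"]) for r in records]
sales = [int(r["Sales"]) for r in records]
print(years, sales)
```

DictReader uses the first row as column names, so each record can be accessed by name
rather than by position.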

• Generating Data with randint

randint from Python's random module returns a random whole number within a specified
range, with both endpoints included. It is useful for creating random test data.

import random
num = random.randint(1, 10)  # random integer from 1 to 10, inclusive
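
To produce a whole dataset rather than a single number, randint can be called inside a
list comprehension. The sketch below is one possible way to do it; the score range and
the seed value are arbitrary choices.

```python
import random

random.seed(42)  # optional: fix the seed so the same numbers appear on every run

# Simulate 10 test scores, each between 50 and 100 (both ends included)
scores = [random.randint(50, 100) for _ in range(10)]

print(scores)
print("Lowest:", min(scores), "Highest:", max(scores))
```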

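• Generating Data with arange

arange from NumPy creates a sequence of evenly spaced numbers within a specified
range. The start value is included and the stop value is excluded.

import numpy as np
data = np.arange(0, 11, 2)  # array([0, 2, 4, 6, 8, 10])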
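
A short sketch of why arange is handy for plotting: the array it returns supports
element-wise arithmetic, so the y-values can be computed from the x-values in one line.
The squaring step is just an illustrative choice.

```python
import numpy as np

x = np.arange(0, 11, 2)   # start=0, stop=11 (excluded), step=2
y = x ** 2                # element-wise: squares every value in x

print(x.tolist())         # [0, 2, 4, 6, 8, 10]
print(y.tolist())         # [0, 4, 16, 36, 64, 100]
```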


Python provides several libraries, each with different features, for visualizing data.
Examples: Matplotlib, Plotly.

Matplotlib

Matplotlib is an easy-to-use, low-level data visualization library that is built on
NumPy arrays. It consists of various plots like scatter plot, line plot, histogram, etc.
Matplotlib provides a lot of flexibility.

Features of matplotlib:

1. Versatility: Matplotlib can generate a wide range of plots, including line plots,
scatter plots, bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control every
aspect of the plot, such as line styles, colors, markers, labels, and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with NumPy,
making it easy to plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable for
publication with fine-grained control over the plot aesthetics.

Installing matplotlib:
To install Matplotlib, type the command below in the terminal.
pip install matplotlib

Types of visualization in matplotlib


Matplotlib supports a wide range of visualizations. Here are the main types:

Line Plots

1. Used to show trends over time or continuous data.
2. Example: plt.plot()

Scatter Plots

1. Displays individual data points.
2. Useful for showing relationships between two variables.
3. Example: plt.scatter()

Bar Charts

1. Represents categorical data with rectangular bars.
2. Example: plt.bar() for vertical bars, plt.barh() for horizontal bars.

Histograms

1. Shows the distribution of data by dividing it into intervals (bins).
2. Example: plt.hist()

Pie Charts

1. Displays proportions or percentages as slices of a pie.
2. Example: plt.pie()
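
As an illustration of two of the types listed above, the sketch below draws a scatter
plot and a histogram side by side. The "hours studied" and "marks" values are randomly
generated, made-up data, and the use of plt.subplot() to place two charts in one figure
is a choice made for this example.

```python
import random
import matplotlib.pyplot as plt

random.seed(1)  # fixed seed so the made-up data is repeatable

# Made-up data: hours studied and exam marks for 50 students
hours = [random.randint(1, 10) for _ in range(50)]
marks = [8 * h + random.randint(0, 20) for h in hours]

plt.figure(figsize=(10, 4))

# Left panel: scatter plot showing the relationship between two variables
plt.subplot(1, 2, 1)
plt.scatter(hours, marks)
plt.title("Hours Studied vs Marks")
plt.xlabel("Hours Studied")
plt.ylabel("Marks")

# Right panel: histogram showing how the marks are distributed over 5 bins
plt.subplot(1, 2, 2)
counts, bin_edges, _ = plt.hist(marks, bins=5, edgecolor="black")
plt.title("Distribution of Marks")
plt.xlabel("Marks")
plt.ylabel("Number of Students")

plt.tight_layout()
```

Calling plt.show() afterwards displays the figure. plt.hist() also returns the count in
each bin and the bin edges, which can be printed to check the chart.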

Plotting a simple line graph using matplotlib:

Step-by-step procedure to plot a simple line graph

• Importing matplotlib.pyplot: The matplotlib.pyplot module is imported as plt to
simplify function calls.
• Initializing the data: Two lists, x and y, are defined to represent the data points
for the X and Y axes.
• Plotting the data: The plt.plot() function is used to plot the x and y data as a line
chart.
• Customizing the plot:
   • A title is added with plt.title().
   • Labels for the X-axis and Y-axis are set with plt.xlabel() and plt.ylabel().
• Displaying the plot: The plt.show() function renders the chart in a window.

Line Chart
A line chart shows the relationship between two variables, X and Y, plotted on separate
axes. It is drawn with the plot() function, as in the example below.

import matplotlib.pyplot as plt

# Initializing the data
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

# Plotting the data
plt.plot(x, y)

# Adding a title
plt.title('Line Chart')

# Adding a label on the y-axis
plt.ylabel('Y-Axis')

# Adding a label on the x-axis
plt.xlabel('X-Axis')

# Displaying the plot
plt.show()

Important Methods of Matplotlib

Method     Description                                        Example
xlabel()   Sets the label for the x-axis.                     plt.xlabel("X-Axis Label")
ylabel()   Sets the label for the y-axis.                     plt.ylabel("Y-Axis Label")
title()    Sets the title for the plot.                       plt.title("Plot Title")
legend()   Displays a legend for labeled plot elements.       plt.legend(["Line 1", "Line 2"])
grid()     Adds a grid to the plot for better readability.    plt.grid(True)
xticks()   Customizes the ticks on the x-axis.                plt.xticks([0, 1, 2], ["Low", "Medium", "High"])
yticks()   Customizes the ticks on the y-axis.                plt.yticks([10, 20, 30], ["Poor", "Average", "Excellent"])
show()     Displays the plot.                                 plt.show()
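
The methods in the table can be combined in a single script. The sketch below is one
possible arrangement; the two data series and the names "Team A" and "Team B" are made
up for illustration.

```python
import matplotlib.pyplot as plt

# Made-up scores for two teams at three difficulty levels
levels = [0, 1, 2]
team_a = [10, 20, 30]
team_b = [15, 18, 27]

plt.figure()
plt.plot(levels, team_a)
plt.plot(levels, team_b)

plt.title("Score by Difficulty Level")
plt.xlabel("Difficulty")
plt.ylabel("Score")
plt.legend(["Team A", "Team B"])                  # names follow the order of the plot() calls
plt.grid(True)
plt.xticks([0, 1, 2], ["Low", "Medium", "High"])  # replace numeric ticks with text labels
```

Calling plt.show() afterwards displays the figure.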

Customizing Line Graphs with plot() Method in Matplotlib

Parameter         Description                                Example
color             Sets the color of the line. Can be a       plt.plot(x, y, color="blue")
                  string (e.g., 'red', 'blue') or a hex
                  code.
linestyle         Specifies the style of the line (solid,    plt.plot(x, y, linestyle="--")
                  dashed, etc.).
linewidth         Controls the width of the line.            plt.plot(x, y, linewidth=2)
marker            Specifies markers for each data point      plt.plot(x, y, marker="o")
                  (e.g., 'o' for circle, 's' for square).
markersize        Defines the size of the markers.           plt.plot(x, y, marker="o", markersize=6)
markerfacecolor   Sets the fill color of the marker.         plt.plot(x, y, marker="o", markerfacecolor="red")
markeredgecolor   Sets the color of the marker edge.         plt.plot(x, y, marker="o", markeredgecolor="black")
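
Several of these parameters can be passed in one plot() call. The following is a small
sketch using the same x and y values as the line chart example; the particular colors
and sizes are arbitrary.

```python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

plt.figure()
# plot() returns a list of line objects; here it contains a single line
lines = plt.plot(
    x, y,
    color="green",            # line color
    linestyle="--",           # dashed line
    linewidth=2,              # line thickness
    marker="o",               # circle at each data point
    markersize=8,             # marker size
    markerfacecolor="red",    # fill color of the markers
    markeredgecolor="black",  # outline color of the markers
)
plt.title("Customized Line Graph")
```

Calling plt.show() afterwards displays the figure.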

Multiline Graph

A multiline graph (also known as a multiple line graph) is a type of plot that
displays multiple lines on the same graph, each representing a different dataset or
variable.

Example of a Multiline Graph:

A plot showing the sales of multiple products (Product A, Product B, and Product C)
over several months.

import matplotlib.pyplot as plt

# Data for three products
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
product_a_sales = [200, 220, 240, 260, 280]
product_b_sales = [150, 170, 190, 210, 230]
product_c_sales = [100, 120, 140, 160, 180]

# Plotting multiple lines
plt.plot(months, product_a_sales, label='Product A', color='blue')
plt.plot(months, product_b_sales, label='Product B', color='green')
plt.plot(months, product_c_sales, label='Product C', color='red')

# Adding title, labels, and legend
plt.title('Monthly Sales by Product')
plt.xlabel('Months')
plt.ylabel('Sales')
plt.legend()  # shows the label given to each line

# Displaying the plot
plt.show()

Random Walks
A random walk is a process where an object or value takes steps in random directions,
creating an unpredictable path or sequence.

Applications of random walks:

• Stock Market Modeling: Random walks are used to model the unpredictable
movement of stock prices over time, based on the assumption that each price
change is random.
• Disease Prediction and Spread: Random walk models predict how diseases
spread by simulating the random movement of people in a population.

1D Random Walk:
In a 1-dimensional random walk, the object moves back and forth along a single axis
(e.g., left or right).
Example:
import matplotlib.pyplot as plt
import numpy as np

# Generate 1000 random steps of -1 or +1
steps = np.random.choice([-1, 1], size=1000)
# The cumulative sum of the steps gives the position after each step
position = np.cumsum(steps)
plt.plot(position)
plt.xlabel("Step")
plt.ylabel("Position")
plt.show()

2D Random Walk: In a 2-dimensional random walk, the object moves in a plane,
with choices to move in any of the four cardinal directions (up, down, left, or right).

import matplotlib.pyplot as plt
import numpy as np

# The four cardinal moves as (dx, dy): right, left, up, down
moves = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
# Pick one of the four moves at random for each of the 1000 steps
steps = moves[np.random.randint(0, 4, size=1000)]
# Compute the cumulative sum to get positions in x and y directions
x_position = np.cumsum(steps[:, 0])
y_position = np.cumsum(steps[:, 1])
# Plotting the 2D random walk
plt.plot(x_position, y_position)
plt.xlabel("X Position")
plt.ylabel("Y Position")
plt.show()

3D Random Walk: In a 3-dimensional random walk, the object moves in a 3D space,
with choices to move along the x, y, or z axes.
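
The notes give no code for the 3D case, so the following is only a sketch of one way it
could be done. It assumes each step moves one unit forward or backward along exactly one
randomly chosen axis, and it uses Matplotlib's built-in 3D projection for the plot.

```python
import numpy as np
import matplotlib.pyplot as plt

n_steps = 1000

# For every step, pick an axis (0 = x, 1 = y, 2 = z) and a direction (-1 or +1)
axes = np.random.randint(0, 3, size=n_steps)
directions = np.random.choice([-1, 1], size=n_steps)

# Build an (n_steps, 3) array where each row has a single non-zero entry
steps = np.zeros((n_steps, 3), dtype=int)
steps[np.arange(n_steps), axes] = directions

# The cumulative sum of the steps gives the position after each step
position = np.cumsum(steps, axis=0)

# Plot the path on 3D axes
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(position[:, 0], position[:, 1], position[:, 2])
ax.set_xlabel("X Position")
ax.set_ylabel("Y Position")
ax.set_zlabel("Z Position")
ax.set_title("3D Random Walk")
```

Calling plt.show() afterwards displays the figure, which can be rotated with the mouse in
an interactive window.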

Difference between plotly and matplotlib

Plotly

Plotly is an open-source library used to create interactive graphs and visualizations in
Python. It supports various chart types like line charts, scatter plots, and 3D plots,
making data visualization more dynamic.

Creating a line plot using Plotly

import plotly.express as px

# Example data
x = [1, 2, 3, 4, 5]
y = [10, 11, 12, 13, 14]

# Create a line graph using Plotly Express
fig = px.line(x=x, y=y, title='Simple Line Plot')

# Show the plot
fig.show()

To install Plotly, type the command below in the terminal:

pip install plotly

Rolling dice with plotly
Rolling dice with Plotly means simulating dice rolls and using Plotly to create visual
charts, like bar graphs, to show the results.

import random
import plotly.graph_objects as go

# Roll the die 100 times and store the results
rolls = [random.randint(1, 6) for _ in range(100)]

# Count the occurrences of each die face (1 to 6)
counts = [rolls.count(i) for i in range(1, 7)]

# Create a bar chart using Plotly
fig = go.Figure(data=[go.Bar(x=list(range(1, 7)), y=counts)])
fig.show()

To visualize a CSV file in Python, follow these steps:

1. Install Required Libraries: Make sure you have the necessary libraries installed,
such as pandas for data handling and matplotlib or plotly for visualization.
2. Import Libraries: Import the required libraries in your Python script.
3. Read the CSV File: Use pandas to load the CSV file into a DataFrame.
4. Select the Data to Visualize: Choose the columns or data that you want to visualize.
5. Plot the Data: Use matplotlib or plotly to create different types of plots (e.g., bar
chart, line chart, histogram).
6. Customize and Save the Plot (Optional): Customize the plot (e.g., labels, titles,
colors) and save the image if necessary.
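
The whole process, including the optional save at the end, can be sketched as follows. To
keep the example self-contained, a made-up CSV text is read from memory with io.StringIO
in place of a real file; the column names, values, and output file name are assumptions.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# Made-up CSV text standing in for a real file on disk
csv_text = """Year,Sales,Profit
2020,100,20
2021,120,25
2022,150,40
2023,170,45
"""

# Read the CSV data into a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# Select the data to visualize
years = df["Year"]
sales = df["Sales"]

# Plot the data
plt.figure()
plt.plot(years, sales, marker="o")

# Customize and save the plot
plt.title("Sales Over Time")
plt.xlabel("Year")
plt.ylabel("Sales")
plt.xticks(years)  # one tick per year instead of fractional values
plt.savefig("sales_over_time.png", dpi=150)
```

With a real file, pd.read_csv() would be given the file path instead of the StringIO
object.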

Creating a line graph from a CSV file using matplotlib:

import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV file into a DataFrame
df = pd.read_csv('your_file.csv')

# Plot a line graph using two columns (e.g., 'Year' and 'Sales')
plt.plot(df['Year'], df['Sales'])

# Customize the plot
plt.xlabel('Year')
plt.ylabel('Sales')
plt.title('Sales Over Time')

# Show the plot
plt.show()

Creating a multiline graph from a CSV file using matplotlib:

import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV file into a DataFrame
df = pd.read_csv('your_file.csv')

# Plot multiple lines (e.g., 'Sales' and 'Profit')
plt.plot(df['Year'], df['Sales'], label='Sales')
plt.plot(df['Year'], df['Profit'], label='Profit')

# Customize the plot
plt.xlabel('Year')
plt.ylabel('Amount')
plt.title('Sales and Profit Over Time')
plt.legend()  # identifies which line is Sales and which is Profit

# Show the plot
plt.show()

Creating a bar chart from CSV files using matplotlib

import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV file into a DataFrame
df = pd.read_csv('your_file.csv')

# Plot a bar chart using two columns (e.g., 'Year' and 'Sales')
plt.bar(df['Year'], df['Sales'])

# Customize the plot
plt.xlabel('Year')
plt.ylabel('Sales')
plt.title('Sales Over Time')

# Show the plot
plt.show()

Creating pie chart from CSV using matplotlib

import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV file into a DataFrame
df = pd.read_csv('your_file.csv')

# Plot a pie chart using the 'Sales' column, with 'Year' as labels
plt.pie(df['Sales'], labels=df['Year'])

# Customize the plot
plt.title('Sales Distribution by Year')

# Show the plot
plt.show()

JSON
JSON (JavaScript Object Notation) is a lightweight data format used for storing
and exchanging data. It is easy to read and write, structured as key-value pairs, and
widely used for APIs and data transfer between systems.

Features of JSON

1. Lightweight Format: JSON is simple and efficient, making it easy to transfer
data over the web.
2. Human-Readable: The syntax is easy to understand and write, with a clear
structure using key-value pairs.
3. Language-Independent: JSON is supported by most programming languages
and can be easily parsed and generated.
4. Supports Complex Structures: It can represent nested data using arrays and
objects, making it versatile for various data types.
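
A brief sketch of these features using Python's built-in json module: a nested record is
converted to JSON text and back. The student record, including the address and subjects,
is made up for illustration.

```python
import json

# A made-up record with a nested object and an array
student = {
    "name": "Sam",
    "age": 21,
    "address": {"city": "Bengaluru", "pin": "560001"},
    "subjects": ["Python", "Statistics"],
}

# Python dict -> JSON text (indent=2 makes it easier to read)
text = json.dumps(student, indent=2)
print(text)

# JSON text -> Python dict
restored = json.loads(text)
print(restored["address"]["city"])  # access a nested object
print(restored["subjects"][0])      # access an array element
```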

Creating a JSON file and visualizing it in Python

Open Notepad, type the data below, and save it as sample.json
[
{"name":"Sam","age":21},
{"name":"Ram","age":22}
]

Then, in a Python script saved in the same folder, load the file and plot it:

import json
import plotly.express as px

# Load the JSON data from 'sample.json'
with open('sample.json') as f:
    data = json.load(f)

# Create a line graph using Plotly Express
fig = px.line(data, x='name', y='age')

# Show the plot
fig.show()

Mapping Global Datasets in JSON Format means organizing data in a clear and
structured way. Each data entry is stored as an object with related information
grouped together. This makes the data easy to read and use by computers.
[
{"country": "USA", "population": 331000000, "GDP": 21.43},
{"country": "India", "population": 1393409038, "GDP": 2.87},
{"country": "China", "population": 1444216107, "GDP": 14.34}
]

Steps to Visualize Data from a JSON File

1. Load the JSON file: Use the json or pandas library to read the data.
2. Parse the data: Convert JSON into a Python object or DataFrame.
3. Inspect the data: View the structure and contents of the JSON file.
4. Extract specific fields: Filter and retrieve required values (e.g., years, sales).
5. Choose a visualization library: Select Plotly for interactive or Matplotlib for
static graphs.
6. Create the chart: Use the extracted data to plot graphs like bar charts or pie
charts.
7. Display the visualization: Show the graph in a window or save it to a file.
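
The steps above can be sketched with the country data shown earlier. To keep the example
self-contained, the JSON is held in a string rather than read from a file, and only the
population field is plotted; scaling it to millions is a choice made for readability.

```python
import json
import matplotlib.pyplot as plt

# Steps 1-2: load and parse (a string stands in for the file contents here)
raw = """[
  {"country": "USA", "population": 331000000, "GDP": 21.43},
  {"country": "India", "population": 1393409038, "GDP": 2.87},
  {"country": "China", "population": 1444216107, "GDP": 14.34}
]"""
records = json.loads(raw)

# Step 3: inspect the structure
print(type(records).__name__, "with", len(records), "entries")
print(list(records[0].keys()))

# Step 4: extract specific fields
countries = [r["country"] for r in records]
populations = [r["population"] / 1_000_000 for r in records]  # in millions

# Steps 5-6: create the chart with Matplotlib
plt.figure()
plt.bar(countries, populations)
plt.title("Population by Country")
plt.xlabel("Country")
plt.ylabel("Population (millions)")
```

For step 7, plt.show() displays the chart and plt.savefig() writes it to a file. With a
real file, json.load() on an open file object would replace json.loads().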

What is API and Web API?

API (Application Programming Interface):

1. A set of rules and protocols that allow different software applications
to communicate with each other.
2. Acts as an intermediary between two applications, enabling data
exchange and functionality sharing.

Web API:

1. A type of API that uses web protocols like HTTP to send and receive
data over the internet.
2. Web APIs are commonly used to access data or services from web
servers.

Advantages of Using Web API

1. Real-Time Data Access: Provides up-to-date information from remote servers
without manual intervention.
2. Automation: Simplifies repetitive tasks by programmatically fetching and
processing data.
3. Cross-Platform Compatibility: Allows integration of data and services
across different systems and languages.
4. Dynamic Data Visualization: Enables creating visualizations with live,
constantly updated data.

API Key

An API key is a unique code provided by an API service to authenticate and identify
users or applications accessing the API. It acts like a password, ensuring that only
authorized users can make requests to the API. API keys are used to track usage and
limit access to prevent abuse.
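
How a key is sent depends on the service: some expect it in a request header, others in
the URL. The sketch below shows the header approach under stated assumptions. The
address api.example.com, the environment variable EXAMPLE_API_KEY, the "Bearer" header
format, and the city parameter are all hypothetical. The request is only prepared, not
sent, so the example runs without network access.

```python
import os
import requests

# Read the key from an environment variable instead of writing it in the code
# (EXAMPLE_API_KEY is a hypothetical name; "demo-key" is a placeholder fallback)
api_key = os.environ.get("EXAMPLE_API_KEY", "demo-key")

# Build the request; the URL, header format, and parameter are hypothetical
request = requests.Request(
    "GET",
    "https://api.example.com/v1/data",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"city": "Bengaluru"},
)

# prepare() assembles the final URL and headers without sending anything
prepared = request.prepare()
print(prepared.url)
print("Authorization" in prepared.headers)
```

With a real service, passing the same headers and params to requests.get() would send the
request.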

Requests Library:

The Requests library in Python is a simple and user-friendly tool that allows you to
send HTTP requests, which are commonly used to interact with web services and
APIs.

Features of requests library

1. Simplified HTTP Requests: Allows easy sending of HTTP requests (GET, POST,
PUT, DELETE) with just a few lines of code.
2. Handles HTTP Headers: Sets common request headers automatically and lets you add
custom headers, such as content type or authentication headers.
3. Supports Authentication: Provides built-in support for basic and digest
authentication, accepts custom authentication schemes, and works with OAuth through
add-on packages such as requests-oauthlib.
4. JSON and Form Data Support: Easily sends and receives data in formats like JSON
and form-encoded data.
5. Error Handling: Simplifies error handling by providing built-in exceptions for
issues like timeouts, connection problems, or invalid responses.
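
The error-handling feature might be used as in the sketch below. The helper name
fetch_json and the 10-second timeout are assumptions for illustration; the exception
classes are the ones provided by the Requests library.

```python
import requests


def fetch_json(url, timeout=10):
    """Return the decoded JSON body, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raise HTTPError for 4xx/5xx status codes
        return response.json()
    except requests.exceptions.Timeout:
        print("The request timed out")
    except requests.exceptions.ConnectionError:
        print("Could not connect to the server")
    except requests.exceptions.RequestException as err:
        # Base class for the library's errors, e.g. an HTTP error or a malformed URL
        print("Request failed:", err)
    return None


# A URL without "https://" is rejected before any network traffic is sent
result = fetch_json("not-a-valid-url")
print(result)
```

Returning None on failure lets the calling code decide whether to retry, skip the chart,
or show a message.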

Example:

import requests

# Fetch the most-starred Python projects on GitHub
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
response = requests.get(url)
data = response.json()

# Print the total number of matches, then the name and stars of the top project
print(data["total_count"])
x = data["items"][0]
print(x["name"])
print(x["stargazers_count"])
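
To see how the response is structured without calling the API, the sketch below works on
a trimmed, made-up payload that has the same keys the example reads (total_count, items,
name, stargazers_count). The project names and star counts are invented.

```python
# A made-up payload shaped like the parsed JSON response
data = {
    "total_count": 3,
    "items": [
        {"name": "project-a", "stargazers_count": 500},
        {"name": "project-b", "stargazers_count": 1200},
        {"name": "project-c", "stargazers_count": 800},
    ],
}

# Sort the repositories from most to fewest stars
top = sorted(data["items"], key=lambda repo: repo["stargazers_count"], reverse=True)

# Build parallel lists that could be passed to a bar chart
names = [repo["name"] for repo in top]
stars = [repo["stargazers_count"] for repo in top]

for name, count in zip(names, stars):
    print(f"{name}: {count} stars")
```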

Visualizing GitHub repositories using matplotlib

import requests
import matplotlib.pyplot as plt

# Fetch the most-starred Python projects on GitHub
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
response = requests.get(url)
data = response.json()

# Extract project names and stars
projects = [r['name'] for r in data['items']]
stars = [r['stargazers_count'] for r in data['items']]

# Plot the bar chart
plt.bar(projects, stars, color='skyblue', edgecolor='black')
plt.title('Top Python Projects on GitHub')
plt.xlabel('Projects')
plt.ylabel('Stars')
plt.xticks(rotation=90)  # rotate the project names so they do not overlap
plt.show()
