Unit 4 python
Data visualization in Python is the process of creating charts, graphs, and plots to
represent data visually, making it easier to understand patterns, trends, and insights.
Steps in data visualization:
1. Collect and Prepare Data: Gather, clean, and preprocess data to ensure
accuracy and relevance.
2. Choose the Visualization Type: Select a chart or plot type (e.g., bar chart,
scatter plot, histogram) that best represents the data.
3. Plot the Data: Use a visualization library (e.g., Matplotlib, Plotly,
Seaborn) to create the chart.
4. Customize and Enhance: Add titles, labels, legends, and gridlines to improve
readability and interpretability.
5. Present or Save: Share the visualization in reports, presentations, or files for
further use.
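The five steps above can be sketched in a few lines. This is only an illustration: the monthly sales figures, the chart labels, and the output file name are made up for the example.

```python
import matplotlib.pyplot as plt

# Step 1: collect and prepare data (made-up values; drop the missing entry)
raw_sales = {"Jan": 120, "Feb": None, "Mar": 150, "Apr": 170}
sales = {month: value for month, value in raw_sales.items() if value is not None}

# Steps 2 and 3: choose a bar chart and plot the data
plt.bar(list(sales.keys()), list(sales.values()))

# Step 4: customize and enhance
plt.title("Monthly Sales")
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.grid(axis="y")

# Step 5: save the chart to a file (hypothetical file name)
plt.savefig("monthly_sales.png")
```

Replacing plt.savefig(...) with plt.show() opens the chart in a window instead of writing a file.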
Advantages of data visualization in Python:
1. Easy to Understand: Simplifies complex data into visual formats like charts
and graphs.
2. Quick Insights: Helps identify patterns and trends faster.
3. Customizable: Offers flexibility to design visuals as needed using various
libraries.
4. Supports Large Datasets: Works with libraries such as NumPy and pandas, which can summarize or sample large datasets before plotting.
5. Interactive Visuals: Creates interactive and dynamic plots for better analysis.
CSV (Comma-Separated Values) is a plain-text format that stores tabular data: each
line is a row, and the values in a row are separated by commas. This makes the data
easy to export and share. Python's built-in csv module reads and writes such files.
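A minimal sketch of writing and reading a CSV file with the csv module. The file name and the sales figures are made up for the example.

```python
import csv

rows = [
    ["year", "sales"],  # header row
    [2021, 150],
    [2022, 180],
]

# Write the rows to a CSV file (newline="" avoids blank lines on Windows)
with open("sales_example.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back; csv.reader returns every value as a string
with open("sales_example.csv", newline="") as f:
    loaded = list(csv.reader(f))

print(loaded)  # [['year', 'sales'], ['2021', '150'], ['2022', '180']]
```

Because the values come back as strings, numeric columns need to be converted (for example with int()) before plotting.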
randint(a, b) from Python's random module returns a random whole number N such that
a <= N <= b (both endpoints are included). It is useful for creating random data.
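A short sketch of randint in use. The seed value and the number of rolls are arbitrary choices for the example.

```python
from random import randint, seed

seed(42)  # optional: fixes the random sequence so the run is repeatable

# randint(1, 6) can return 1, 2, 3, 4, 5, or 6, like a six-sided die
rolls = [randint(1, 6) for _ in range(10)]
print(rolls)
```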
arange from NumPy returns an array of evenly spaced numbers that starts at a given
start value and stops before (not at) a given stop value, using a given step size.
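A few illustrative calls to arange. The variable names are chosen only for this example.

```python
import numpy as np

first_five = np.arange(5)          # start defaults to 0: 0, 1, 2, 3, 4
evens = np.arange(2, 10, 2)        # stop value 10 is excluded: 2, 4, 6, 8
quarters = np.arange(0, 1, 0.25)   # fractional step: 0, 0.25, 0.5, 0.75

print(first_five)
print(evens)
print(quarters)
```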
Matplotlib
Features of matplotlib:
1. Versatility: Matplotlib can generate a wide range of plots, including line plots,
scatter plots, bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control every
aspect of the plot, such as line styles, colors, markers, labels, and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with NumPy,
making it easy to plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable for
publication with fine-grained control over the plot aesthetics.
Installing matplotlib:
To install matplotlib, type the command below in the terminal.
pip install matplotlib
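Line Plots
1. Show how a value changes across an ordered sequence, such as time.
2. Example: plt.plot()
Scatter Plots
1. Show the relationship between two variables as individual points.
2. Example: plt.scatter()
Bar Charts
1. Compare values across categories using rectangular bars.
2. Example: plt.bar()
Histograms
1. Show the distribution of data by dividing it into intervals (bins).
2. Example: plt.hist()
Pie Charts
1. Show each category's share of a whole as slices of a circle.
2. Example: plt.pie()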
Displaying the plot: The plt.show() function renders the chart in a window.
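As an illustration of one of these plot types, here is a minimal histogram sketch. The exam scores are randomly generated for the example, and the bin count, labels, and file name are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data: 200 exam scores centered around 70
rng = np.random.default_rng(0)
scores = rng.normal(loc=70, scale=10, size=200)

# plt.hist splits the value range into 10 bins and counts the scores in each
counts, bin_edges, _ = plt.hist(scores, bins=10, edgecolor="black")
plt.xlabel("Score")
plt.ylabel("Number of students")
plt.title("Distribution of Scores")
plt.savefig("scores_hist.png")
```

Replacing plt.savefig(...) with plt.show() renders the chart in a window instead of saving it.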
Line Chart
A line chart shows the relationship between two variables, X and Y, plotted on the
horizontal and vertical axes. It is drawn with the plot() function. Let's see the example below.
plt.show()
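A minimal, self-contained sketch of a line chart. The x and y values, the labels, and the file name are made up for the example.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# plot() joins the (x, y) points with a line; marker="o" also draws each point
line, = plt.plot(x, y, marker="o")
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Line Chart")
plt.savefig("line_chart.png")
```

To view the chart in a window rather than saving it, end the script with plt.show() instead of plt.savefig(...).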
Important Methods of Matplotlib
Method     Description                                      Example
xlabel()   Sets the label for the x-axis.                   plt.xlabel("X-Axis Label")
ylabel()   Sets the label for the y-axis.                   plt.ylabel("Y-Axis Label")
title()    Sets the title for the plot.                     plt.title("Plot Title")
legend()   Displays a legend for labeled plot elements.     plt.legend(["Line 1", "Line 2"])
grid()     Adds a grid to the plot for better readability.  plt.grid(True)
xticks()   Customizes the ticks on the x-axis.              plt.xticks([0, 1, 2], ["Low", "Medium", "High"])
yticks()   Customizes the ticks on the y-axis.              plt.yticks([10, 20, 30], ["Poor", "Average", "Excellent"])
show()     Displays the plot.                               plt.show()
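The methods in the table can be combined on a single chart. The sketch below reuses the tick values from the table; the data points, labels, and file name are made up.

```python
import matplotlib.pyplot as plt

levels = [0, 1, 2]
scores = [10, 20, 30]

plt.plot(levels, scores, marker="o", label="Score")
plt.xlabel("Level")                                   # x-axis label
plt.ylabel("Rating")                                  # y-axis label
plt.title("Rating by Level")                          # plot title
plt.xticks([0, 1, 2], ["Low", "Medium", "High"])      # rename x ticks
plt.yticks([10, 20, 30], ["Poor", "Average", "Excellent"])  # rename y ticks
plt.grid(True)                                        # add gridlines
plt.legend()                                          # uses the label= given to plot()
plt.savefig("methods_demo.png")
```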
Multiline Graph
A multiline graph (also known as a multiple line graph) is a type of plot that
displays multiple lines on the same graph, each representing a different dataset or
variable.
A plot showing the sales of multiple products (Product A, Product B, and Product C)
over several months.
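A sketch of such a multiline graph. The sales figures for the three products are invented for the example.

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = {
    "Product A": [20, 25, 30, 28, 35],
    "Product B": [15, 18, 22, 27, 30],
    "Product C": [30, 28, 26, 29, 33],
}

# Each call to plot() adds one more line to the same axes
for product, values in sales.items():
    plt.plot(months, values, marker="o", label=product)

plt.xlabel("Month")
plt.ylabel("Units sold")
plt.title("Monthly Sales by Product")
plt.legend()  # one legend entry per product
plt.savefig("multiline_sales.png")
```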
Random Walks
A random walk is a process where an object or value takes steps in random directions,
creating an unpredictable path or sequence.
Applications of random walks:
1. Stock Market Modeling: Random walks are used to model the unpredictable
movement of stock prices over time, based on the assumption that each price
change is random.
2. Disease Prediction and Spread: Random walk models predict how diseases
spread by simulating the random movement of people in a population.
1D Random Walk:
In a 1-dimensional random walk, the object moves back and forth along a single axis
(e.g., left or right).
Example:
import matplotlib.pyplot as plt
import numpy as np
steps = np.random.choice([-1, 1], size=1000) # Generate 1000 random steps of -1 or 1
position = np.cumsum(steps)
plt.plot(position)
plt.show()
Plotly
# Example data
x = [1, 2, 3, 4, 5]
y = [10, 11, 12, 13, 14]
import random
import plotly.graph_objects as go
# Roll the die 100 times and store the results
rolls = [random.randint(1, 6) for _ in range(100)]
Steps to visualize CSV data:
1. Install Required Libraries: Make sure the necessary libraries are installed,
such as pandas for data handling and matplotlib or plotly for visualization.
2. Read the CSV File: Use pandas to load the CSV file into a DataFrame.
3. Select the Data to Visualize: Choose the columns that you want to plot.
4. Plot the Data: Use matplotlib or plotly to create different types of plots (e.g., bar
chart, line chart, histogram).
5. Customize and Save the Plot (Optional): Customize the plot (e.g., labels, titles,
colors) and save the image if necessary.
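import pandas as pd
import matplotlib.pyplot as plt
# Load the CSV file into a DataFrame ("sales.csv" is a placeholder file name)
df = pd.read_csv("sales.csv")
# Plot a line graph using two columns (e.g., 'Year' and 'Sales')
plt.plot(df['Year'], df['Sales'])
plt.show()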
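import pandas as pd
import matplotlib.pyplot as plt
# Load the CSV file into a DataFrame ("sales.csv" is a placeholder file name)
df = pd.read_csv("sales.csv")
# Plot a bar chart using two columns (e.g., 'Year' and 'Sales')
plt.bar(df['Year'], df['Sales'])
plt.show()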
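import pandas as pd
import matplotlib.pyplot as plt
# Load the CSV file into a DataFrame ("sales.csv" is a placeholder file name)
df = pd.read_csv("sales.csv")
# Plot a pie chart using the 'Sales' column, with 'Year' as labels
plt.pie(df['Sales'], labels=df['Year'])
plt.show()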
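The snippets above assume a CSV file with 'Year' and 'Sales' columns already exists. A self-contained sketch that first creates such a file and then follows the same steps might look like this. The file names and figures are made up.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Create a small CSV file so the example can run on its own (made-up numbers)
pd.DataFrame(
    {"Year": [2020, 2021, 2022, 2023], "Sales": [120, 150, 170, 210]}
).to_csv("sales_demo.csv", index=False)

# Read the CSV file into a DataFrame
df = pd.read_csv("sales_demo.csv")

# Select the columns to visualize
years = df["Year"]
sales = df["Sales"]

# Plot, customize, and save
plt.bar(years, sales)
plt.xticks(list(years))  # show one tick per year
plt.xlabel("Year")
plt.ylabel("Sales")
plt.title("Sales by Year")
plt.savefig("sales_by_year.png")
```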
JSON
JSON (JavaScript Object Notation) is a lightweight data format used for storing
and exchanging data. It is easy to read and write, structured as key-value pairs, and
widely used for APIs and data transfer between systems.
Features of JSON
import json
import plotly.express as px
Mapping Global Datasets in JSON Format means organizing data in a clear and
structured way. Each data entry is stored as an object with related information
grouped together. This makes the data easy to read and use by computers.
[
{"country": "USA", "population": 331000000, "GDP": 21.43},
{"country": "India", "population": 1393409038, "GDP": 2.87},
{"country": "China", "population": 1444216107, "GDP": 14.34}
]
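A short sketch of how Python reads this structure: json.loads turns the JSON array into a list of dictionaries, one per country. The variable names are chosen for the example.

```python
import json

text = """
[
  {"country": "USA", "population": 331000000, "GDP": 21.43},
  {"country": "India", "population": 1393409038, "GDP": 2.87},
  {"country": "China", "population": 1444216107, "GDP": 14.34}
]
"""

# Each JSON object becomes a Python dict; the array becomes a list
countries = json.loads(text)

names = [entry["country"] for entry in countries]
largest = max(countries, key=lambda entry: entry["population"])

print(names)               # ['USA', 'India', 'China']
print(largest["country"])  # China
```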
Steps to visualize JSON data:
1. Load the JSON file: Use the json or pandas library to read the data.
2. Parse the data: Convert JSON into a Python object or DataFrame.
3. Inspect the data: View the structure and contents of the JSON file.
4. Extract specific fields: Filter and retrieve required values (e.g., years, sales).
5. Choose a visualization library: Select Plotly for interactive or Matplotlib for
static graphs.
6. Create the chart: Use the extracted data to plot graphs like bar charts or pie
charts.
7. Display the visualization: Show the graph in a window or save it to a file.
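The steps above can be sketched with the json module and Matplotlib. The example first writes the country data from these notes to a file so that it can run on its own; the file names and chart labels are assumptions.

```python
import json
import matplotlib.pyplot as plt

# Create a small JSON file to work with (same records as in the notes)
records = [
    {"country": "USA", "population": 331000000, "GDP": 21.43},
    {"country": "India", "population": 1393409038, "GDP": 2.87},
    {"country": "China", "population": 1444216107, "GDP": 14.34},
]
with open("countries.json", "w") as f:
    json.dump(records, f)

# Steps 1 and 2: load the file and parse it into Python objects
with open("countries.json") as f:
    data = json.load(f)

# Step 3: inspect the structure
print(type(data), len(data))
print(data[0])

# Step 4: extract the fields to plot
countries = [entry["country"] for entry in data]
gdp = [entry["GDP"] for entry in data]

# Steps 5 and 6: create a static bar chart with Matplotlib
plt.bar(countries, gdp)
plt.xlabel("Country")
plt.ylabel("GDP")
plt.title("GDP by Country")

# Step 7: save the chart (plt.show() would display it in a window instead)
plt.savefig("gdp_by_country.png")
```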
Web API:
1. A type of API that uses web protocols like HTTP to send and receive
data over the internet.
2. Web APIs are commonly used to access data or services from web
servers.
API Key
An API key is a unique code provided by an API service to authenticate and identify
users or applications accessing the API. It acts like a password, ensuring that only
authorized users can make requests to the API. API keys are used to track usage and
limit access to prevent abuse.
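How a key is sent depends on the service; many APIs expect it in a request header or a query parameter. The sketch below uses the Requests library to build (but not send) a request to a hypothetical endpoint. The URL, the "X-API-Key" header name, and the key value are all placeholders, not a real API.

```python
import requests

API_KEY = "your-api-key-here"            # placeholder, not a real key
url = "https://api.example.com/v1/data"  # hypothetical endpoint

# Build the request without sending it, to see what would go over the network
request = requests.Request(
    "GET",
    url,
    headers={"X-API-Key": API_KEY},  # header name is an assumption; check the API's docs
    params={"q": "python"},
)
prepared = request.prepare()

print(prepared.url)                   # https://api.example.com/v1/data?q=python
print(prepared.headers["X-API-Key"])  # your-api-key-here
```

Keys should be kept out of shared code, for example by reading them from an environment variable.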
Requests Library:
The Requests library in Python is a simple and user-friendly tool that allows you to
send HTTP requests, which are commonly used to interact with web services and
APIs.
Example:
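import requests
import matplotlib.pyplot as plt
# Fetch top Python projects, sorted by number of stars
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
response = requests.get(url)
response.raise_for_status()  # stop here if the request failed
data = response.json()
print(data["total_count"])
# Extract the name and stars of the first (most-starred) project
x = data["items"][0]
print(x["name"])
print(x["stargazers_count"])
# Plot the stars of the top five projects
top = data["items"][:5]
plt.bar([repo["name"] for repo in top], [repo["stargazers_count"] for repo in top])
plt.ylabel("Stars")
plt.show()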