Unit 5 Python Packages (Matplotlib)
Matplotlib is a Python library for visualizing and analyzing data. Its graphical
and pictorial plots make data easier to understand. It is a comprehensive
library for creating static, animated, and interactive visualizations.
import numpy as np
Output
np.array():
[1 3 5]
np.zeros():
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
np.ones():
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
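Here, np.array() creates an array from a list, np.zeros() creates an array
filled with zeros, and np.ones() creates an array filled with ones.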
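A minimal sketch that reproduces the output above. The shapes (3, 3) and (2, 4) are inferred from the printed arrays, and the variable names are illustrative.

```python
import numpy as np

# create an array from a Python list
array1 = np.array([1, 3, 5])
print("np.array():\n", array1)

# create a 3x3 array filled with zeros (shape inferred from the output)
array2 = np.zeros((3, 3))
print("np.zeros():\n", array2)

# create a 2x4 array filled with ones (shape inferred from the output)
array3 = np.ones((2, 4))
print("np.ones():\n", array3)
```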
import numpy as np
# create a 1D array
array1 = np.array([1, 3, 5, 7, 9, 11])
Original array:
[ 1  3  5  7  9 11]
Reshaped array:
[[ 1  3  5]
 [ 7  9 11]]
Transposed array:
[[ 1  7]
 [ 3  9]
 [ 5 11]]
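In this example, the 1D array is reshaped into an array with 2 rows and 3
columns, and that array is then transposed so that its rows become columns.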
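A minimal sketch of the reshape and transpose steps, consistent with the output above. The use of np.reshape() and np.transpose() and the names array2 and array3 are assumptions; the array methods array1.reshape(2, 3) and array2.T would give the same result.

```python
import numpy as np

# create a 1D array with six elements
array1 = np.array([1, 3, 5, 7, 9, 11])

# reshape it into 2 rows and 3 columns
array2 = np.reshape(array1, (2, 3))

# transpose the 2D array: rows become columns
array3 = np.transpose(array2)

print("Original array:\n", array1)
print("Reshaped array:\n", array2)
print("Transposed array:\n", array3)
```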
import numpy as np
Sum of arrays:
[ 5 11 19 29 41]
Difference of arrays:
[ -3 -7 -13 -21 -31]
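A minimal sketch of element-wise arithmetic that yields the sum and difference shown above. The two input arrays are not given in the notes; the values below are worked back from the printed results, and the variable names are illustrative.

```python
import numpy as np

# input arrays inferred from the printed sum and difference
first_array = np.array([1, 2, 3, 4, 5])
second_array = np.array([4, 9, 16, 25, 36])

# the + and - operators work element by element on arrays of equal shape
sum_result = first_array + second_array
diff_result = first_array - second_array

print("Sum of arrays:\n", sum_result)
print("Difference of arrays:\n", diff_result)
```

np.add(first_array, second_array) and np.subtract(first_array, second_array) are equivalent to the operators used here.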
NumPy's statistical functions compute basic statistics such as the mean,
median, and variance. They can also find the maximum or minimum element
in an array.
import numpy as np
Output
Mean: 77.2
Median: 78.0
Minimum marks: 66
Maximum marks: 85
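A minimal sketch of the statistical functions behind this output. The marks array is an assumption: the original values are not shown, so these five were chosen to be consistent with the printed mean, median, minimum, and maximum.

```python
import numpy as np

# hypothetical marks, chosen to match the statistics printed above
marks = np.array([76, 78, 81, 66, 85])

print("Mean:", np.mean(marks))
print("Median:", np.median(marks))
print("Minimum marks:", np.min(marks))
print("Maximum marks:", np.max(marks))
```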
Matplotlib:
Matplotlib is a popular data visualization library in Python that provides a
variety of plotting functions. Here are some major functions in Matplotlib:
1. plt.plot()
● Description: Creates a line plot.
● Example:
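A minimal sketch of a line plot with plt.plot(). The data values, axis labels, and styling are illustrative and do not come from the original notes.

```python
import matplotlib.pyplot as plt

# illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# plt.plot() returns a list of Line2D objects, one per line drawn
lines = plt.plot(x, y, label='Line Plot', color='green', marker='o')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
```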
2. plt.scatter()
● Description: Creates a scatter plot for visualizing individual data points.
● Example:
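A minimal sketch of a scatter plot with plt.scatter(). The points and styling are illustrative and do not come from the original notes.

```python
import matplotlib.pyplot as plt

# illustrative data: each (x, y) pair is drawn as one marker
x = [1, 2, 3, 4, 5]
y = [5, 7, 4, 6, 8]

# plt.scatter() returns a PathCollection holding the markers
points = plt.scatter(x, y, label='Scatter Plot', color='red', marker='o')
plt.legend()
plt.show()
```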
4. plt.hist()
● Description: Creates a histogram for displaying the distribution of a dataset.
● Example:
import matplotlib.pyplot as plt
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]
plt.hist(data, bins=5, label='Histogram', color='blue', alpha=0.7)
plt.legend()
plt.show()
5. plt.pie()
● Description: Generates a pie chart for illustrating the composition of a whole.
● Example:
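A minimal sketch of a pie chart with plt.pie(). The labels and sizes are illustrative and do not come from the original notes.

```python
import matplotlib.pyplot as plt

# illustrative data: each size becomes one wedge of the pie
labels = ['A', 'B', 'C', 'D']
sizes = [40, 30, 20, 10]

# autopct formats the percentage shown inside each wedge
wedges, texts, autotexts = plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title('Pie Chart')
plt.show()
```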
1. Introduction:
Pandas is an open-source data manipulation and analysis library for
Python.
It provides data structures like DataFrame and Series, designed for
efficient data cleaning, exploration, and analysis.
Install pandas
pip install pandas
import pandas
mydataset = {
    'cars': ["Volvo", "Ford"],
    'speed': [70, 120]
}
myvar = pandas.DataFrame(mydataset)
print(myvar)
To check pandas version:
import pandas as pd
print(pd.__version__)
Series:
A one-dimensional array-like object representing a column or row of
data.
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
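Unless specified otherwise, the values are labeled with their index number,
starting at 0. A label can be used to access a value:
import pandas as pd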
a = [3, 7, 2]
myvar = pd.Series(a)
print(myvar[1])
A Series can also be created from a dictionary, in which case the keys
become the labels:
import pandas as pd
speed = {"car1": 120, "car2": 80, "car3": 90}
myvar = pd.Series(speed)
print(myvar)
DataFrame:
A two-dimensional, table-like data structure with rows and columns.
import pandas as pd
data = {
    "car": [1, 2, 3],
    "speed": [50, 140, 45]
}
myvar = pd.DataFrame(data)
print(myvar)
Use loc to return a single row by its index; the result is a Series:
import pandas as pd
data = {
    "car": [1, 2, 3],
    "speed": [50, 140, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df.loc[0])
Pass a list of indexes to loc to return several rows as a DataFrame:
import pandas as pd
data = {
    "calories": [400, 380, 390],
    "duration": [40, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df.loc[[0, 1]])
With the index argument, you can name your own
indexes.
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)
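Once rows have named indexes, loc can look a row up by its name. The sketch below reuses the same data; the variable name row is illustrative.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# look up a row by its named index; the result is a Series
row = df.loc["day2"]
print(row)
```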
To load a comma-separated values (CSV) file into a DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
Pandas supports reading data from various file formats (CSV, Excel,
SQL, etc.).
df = pd.read_csv('data.csv')
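A self-contained sketch of pd.read_csv(). Because the contents of data.csv are not shown here, the example reads a small, made-up CSV from an in-memory string through io.StringIO; the column names and values are assumptions for illustration only.

```python
import io
import pandas as pd

# made-up CSV content standing in for data.csv
csv_text = """Duration,Pulse,Calories
60,110,409.1
45,117,282.4
30,103,195.1
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

pd.read_excel() and pd.read_sql() play the same role for Excel files and SQL queries; they need an extra Excel engine or a database connection respectively.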
5. Basic Operations:
Viewing Data:
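A minimal sketch of common ways to inspect a DataFrame, using head(), tail(), shape, and info(). The sample data is made up for illustration.

```python
import pandas as pd

# made-up sample data
df = pd.DataFrame({
    "calories": [420, 380, 390, 410, 400, 395],
    "duration": [50, 40, 45, 60, 30, 45]
})

first_rows = df.head(3)   # first 3 rows (5 by default)
last_rows = df.tail(2)    # last 2 rows (5 by default)

print(first_rows)
print(last_rows)
print(df.shape)           # (number of rows, number of columns)
df.info()                 # column names, non-null counts, and dtypes
```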
7. Data Exploration:
Descriptive Statistics:
df.describe() # Generate descriptive statistics
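A minimal sketch of describe() on a small, made-up DataFrame. The result is itself a DataFrame, so individual statistics can be read back with loc.

```python
import pandas as pd

# made-up sample data
df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
})

# one row per statistic (count, mean, std, min, quartiles, max),
# one column per numeric column of df
stats = df.describe()
print(stats)

# read a single statistic from the summary
mean_calories = stats.loc["mean", "calories"]
print(mean_calories)
```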
GroupBy:
df.groupby('Column').mean() # Group data and calculate mean
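A minimal sketch of groupby() followed by mean(), on made-up data. Passing numeric_only=True is a precaution added here: in recent pandas versions, mean() raises an error if a group contains non-numeric columns.

```python
import pandas as pd

# made-up sample data: two speed readings per car
df = pd.DataFrame({
    "car": ["Volvo", "Ford", "Volvo", "Ford"],
    "speed": [70, 120, 90, 100]
})

# group rows by car, then average the numeric columns within each group
mean_speed = df.groupby("car").mean(numeric_only=True)
print(mean_speed)
```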
8. Data Manipulation:
df.to_csv('output.csv', index=False)
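A minimal sketch of what index=False does in to_csv(). To avoid writing a file, the example calls to_csv() without a path, in which case pandas returns the CSV text as a string; the data is made up.

```python
import pandas as pd

# made-up sample data with named row indexes
df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}, index=["day1", "day2", "day3"])

# index=False leaves the row labels (day1, day2, day3) out of the output
csv_text = df.to_csv(index=False)
print(csv_text)
```

With index=True (the default), each line would start with the row label.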