PYTHON UNIT-5 Part-C

Pandas is an open-source Python library that provides high-performance data structures and analysis tools, primarily using Series and DataFrame objects. Series is a one-dimensional labeled array, while DataFrame is a two-dimensional table-like structure that can be created from various data types. Additionally, Matplotlib is introduced as a library for creating 2D graphics, with functions for plotting data and customizing visualizations.

Uploaded by

diagnosant
Copyright: © All Rights Reserved

Pandas: Pandas is an open-source library providing high-performance, easy-to-use data
structures and data analysis tools for the Python programming language. Pandas deals with two
data structures: 1. Series 2. DataFrame
1. Pandas Series: A Series is a one-dimensional labeled array capable of holding any data type
(integers, strings, floating point numbers, Python objects, etc.). The axis labels are
collectively referred to as the index. The basic method to create a Series is to call:
import pandas as <identifier name>
<Series_name> = <identifier name>.Series(data, index=index)
Steps to create a pandas Series-
i. Consider a list:
games = ['Cricket', 'Volleyball', 'Judo', 'Hockey']
ii. Now create a pandas Series from the above list and print it:
import pandas as ps
s = ps.Series(games)
print(s)
In the printed output, 0, 1, 2, 3 are the default index labels of the list values. We can also
create our own index for each value as-
import pandas as ps
games = ['Cricket', 'Volleyball', 'Judo', 'Hockey']
s = ps.Series(games, index=['G1','G2','G3','G4'])
print(s)
In a similar manner, we can create a pandas Series from other kinds of data (tuple, dictionary, object).
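As a minimal sketch of the tuple and dictionary cases mentioned above (the variable names games_tuple and games_dict are chosen here only for illustration), a tuple behaves like the list and gets the default integer index, while the keys of a dictionary become the index labels:

```python
import pandas as ps

# Series from a tuple: values get the default integer index 0, 1, 2, 3
games_tuple = ('Cricket', 'Volleyball', 'Judo', 'Hockey')
s_tuple = ps.Series(games_tuple)
print(s_tuple)

# Series from a dictionary: the keys become the index labels
games_dict = {'G1': 'Cricket', 'G2': 'Volleyball', 'G3': 'Judo', 'G4': 'Hockey'}
s_dict = ps.Series(games_dict)
print(s_dict)
```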
Creating a Series from a scalar value (the value is repeated for every index label)-
import pandas as ps
s= ps.Series(50, index =['G1','G2','G3','G4'])
print(s)
Mathematical operations on Series-
import pandas as ps
s1 = ps.Series([1,2,3,4,5])
s2 = ps.Series([2,3,4,5,6])
s3 = s1*2 # new Series: each value of s1 multiplied by 2
print(s3)
print(s1**2) # square each value of s1
print(s1[s1>2]) # print the values of s1 that are greater than 2
print(s1+s2) # add s1 and s2 by index label; a label present in only one Series gives NaN
print(s1.add(s2,fill_value=0)) # add s1 and s2; a value missing from one Series is treated as 0
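In the example above, s1 and s2 share the same index, so NaN never actually appears. The following sketch uses two made-up Series, a and b, whose indexes do not fully match, to show the difference between + and add() with fill_value:

```python
import pandas as ps

a = ps.Series([1, 2, 3], index=['x', 'y', 'z'])
b = ps.Series([10, 20], index=['x', 'y'])   # b has no label 'z'

total = a + b                      # 'z' exists only in a, so the result there is NaN
filled = a.add(b, fill_value=0)    # the missing 'z' in b is treated as 0

print(total)
print(filled)
```

Here total holds NaN at label 'z', while filled holds 3.0 at that label.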
head(n) and tail(n) functions: head(n) returns the first n rows of a Series and tail(n) returns
the last n rows. If n is omitted, 5 rows are returned.
import pandas as ps
s1= ps.Series([1,2,3,4,5,6,7,8,9,10])
print(s1.head()) # print first 5 rows of s1
print(s1.tail()) # print last 5 rows of s1
print(s1.head(2)) # print first 2 rows of s1
print(s1.tail(2)) # print last 2 rows of s1
Selection in Series: A Series supports label-based indexing with loc, integer-position indexing with iloc, and [] to access its elements.
e.g. series_name.loc[StartLabel:StopLabel] # label indexing (StopLabel is included)
series_name.iloc[StartRange:StopRange] # integer position indexing (StopRange is excluded)
series_name[StartRange:StopRange] or series_name[index]
series_name.index # the index labels of the Series
Slicing in Series: series_name[start:end:step]
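The selection forms above can be sketched on a small Series with made-up values and labels. Note that a label slice with loc includes the stop label, while a position slice with iloc excludes the stop position:

```python
import pandas as ps

s = ps.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

by_label = s.loc['b':'d']      # label slice: includes both 'b' and 'd'
by_position = s.iloc[1:3]      # position slice: positions 1 and 2 only
every_second = s[::2]          # start:end:step slicing with a step of 2
single = s['c']                # a single value selected by its index label

print(by_label)
print(by_position)
print(every_second)
print(single)
print(list(s.index))           # index labels converted to a plain list
```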
2. DataFrame- It is a two-dimensional object that is useful in representing data in the form
of rows and columns. It is similar to a spreadsheet or an SQL table. This is the most
commonly used pandas object. Once we store the data in a DataFrame, we can
perform various operations that are useful in analyzing and understanding the data.

A DataFrame can be created using any of the following-
1. Series 2. Lists 3. Dictionary 4. A NumPy 2D array
Creating empty dataframe:
import pandas as ps
d= ps.DataFrame()
print(d)
Creating dataframe from series:
import pandas as ps
s=ps.Series(['a','b','c','d'])
d= ps.DataFrame(s)
print(d)
DataFrame from a dictionary of Series:
import pandas as ps
name = ps.Series(['hardik','virat'])
team = ps.Series(['MI','RCB'])
data = {'Name':name,'Team':team} # named data so the built-in dict is not shadowed
d = ps.DataFrame(data)
print(d)
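The remaining two sources in the list, lists and a NumPy 2D array, can be sketched in the same way. The column names and the array values below are made up for illustration:

```python
import numpy as np
import pandas as ps

# DataFrame from a list of lists: each inner list becomes one row
rows = [['hardik', 'MI'], ['virat', 'RCB']]
d_list = ps.DataFrame(rows, columns=['Name', 'Team'])
print(d_list)

# DataFrame from a NumPy 2D array: shape (2, 3) gives 2 rows and 3 columns
arr = np.array([[1, 2, 3], [4, 5, 6]])
d_arr = ps.DataFrame(arr, columns=['A', 'B', 'C'])
print(d_arr)
```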
Matplotlib: It is a Python library that provides many interfaces and functions to present data as
2D graphics. It provides a submodule, pyplot, with methods that construct 2D graphs easily.
For plotting using Matplotlib, we need to import its pyplot module using the following
command: import matplotlib.pyplot as plt

To plot x versus y, we can write plt.plot(x,y). The show() function is used to display the figure
created using the plot() function. By default, plot() draws a line chart, e.g.
import matplotlib.pyplot as plt # import the pyplot module
date=["25/12","26/12","27/12"] #list storing date in string format
temp=[8.5,10.5,6.8] #list storing temperature values
plt.plot(date, temp) #create a figure plotting temp versus date
plt.show() #show the figure
We can click the save button in the output window to save the plot as an image. A figure
can also be saved by using the savefig() function. The file name for the figure is passed to the
function as a parameter, e.g. plt.savefig('x.png').
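A minimal sketch of saving the temperature plot with savefig() is given below. The file name temp_plot.png and the use of the temporary directory are assumptions made for this example, and the Agg backend is selected only so that the script runs without opening a window. In an interactive session, savefig() is best called before show(), since the figure may already be closed once the window is dismissed.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

date = ["25/12", "26/12", "27/12"]   # list storing dates in string format
temp = [8.5, 10.5, 6.8]              # list storing temperature values

plt.plot(date, temp)                 # create a figure plotting temp versus date

# hypothetical output location chosen for this sketch
out_path = os.path.join(tempfile.gettempdir(), "temp_plot.png")
plt.savefig(out_path)                # write the current figure to a PNG file
plt.close()                          # release the figure once it is saved
```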
Plotting a line:
import matplotlib.pyplot as plt # importing the required module
x = [1,2,3,4,5] # x axis values
y = [1,4,9,16,25] # y axis values
plt.plot(x, y) # plotting the points
plt.xlabel('x - axis') # naming the x axis
plt.ylabel('y - axis') # naming the y axis
plt.title('My first graph!') # giving a title to my graph
plt.show() # function to show the plot
Plotting multiple lines:
import matplotlib.pyplot as plt
food = ['meat','banana','avocado','sweet potato','spinach']
calories = [250,130,140,120,20]
plt.plot(food, calories, label = "calories")
potassium = [40,55,20,30,40]
plt.plot(food, potassium, label = "potassium")
fat = [8,5,3,6,1]
plt.plot(food, fat, label = "fat")
plt.xlabel('Food')
plt.ylabel('Value')
plt.title('Food calories chart!')
plt.legend() # show the label of each line
plt.show()
List of pyplot functions to plot different charts-
• plot(x, y, color='green', linestyle='dashed', linewidth=3, marker='o', markerfacecolor='blue', markersize=12) - Plot y versus x as lines and/or markers
• bar(x, height, width, color, label) - Make a bar plot
• hist(x, bins, range, density, weights, ...) - Plot a histogram
• pie(x, explode, labels, radius, colors, autopct, ...) - Plot a pie chart
• scatter(x, y, s, c, marker) - A scatter plot of y versus x; s sets the marker size and c the color
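The chart functions listed above can be tried together in one figure. The sketch below is only an illustration: the histogram values in marks are made up, plt.subplot() is used to place the four charts in a 2 x 2 grid, and the Agg backend is an assumption so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: headless backend; omit in an interactive session
import matplotlib.pyplot as plt

food = ['meat', 'banana', 'avocado']
calories = [250, 130, 140]
marks = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # made-up sample values for the histogram

plt.figure(figsize=(8, 6))

plt.subplot(2, 2, 1)                  # top-left panel: bar chart
bars = plt.bar(food, calories, color='green')
plt.title('bar')

plt.subplot(2, 2, 2)                  # top-right panel: histogram with 5 bins
counts, edges, patches = plt.hist(marks, bins=5)
plt.title('hist')

plt.subplot(2, 2, 3)                  # bottom-left panel: pie chart with percentages
plt.pie(calories, labels=food, autopct='%1.1f%%')
plt.title('pie')

plt.subplot(2, 2, 4)                  # bottom-right panel: scatter plot
plt.scatter([1, 2, 3], calories, s=40, c='blue', marker='o')
plt.title('scatter')

plt.tight_layout()                    # keep the panel titles from overlapping
# plt.show() would display the figure in an interactive session
```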
