100% found this document useful (3 votes)

208 views66 pages

1 IP 12 NOTES PythonPandas 2022 PDF

The document provides information about data handling using the Pandas library in Python. It discusses the Pandas and Matplotlib libraries and their uses in data manipulation and visualization. It then describes the key data types and data structures used in Pandas - Series and DataFrames. The remainder of the document focuses on describing Series in more detail - how they can be created from various data types like lists, dictionaries, scalars, and NumPy arrays. It discusses accessing, selecting, and slicing data from Series using labels or positions. Series support homogeneous data and allow mutable values.

Uploaded by

Krrish Kumar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (3 votes)

208 views66 pages

1 IP 12 NOTES PythonPandas 2022 PDF

Uploaded by

Krrish Kumar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 66

UNIT- I

Syllabus:

i) Data Handling using Pandas -I

• Introduction to Python libraries - Pandas, Matplotlib.
• Data structures in Pandas - Series and Data Frames.
• Series: Creation of Series from – ndarray, dictionary, scalar value; mathematical operations; Head
and Tail functions, Selection, Indexing and Slicing.
• Data Frames: creation - from dictionary of Series, list of dictionaries, Text/CSV files; display;
iteration;
• Operations on rows and columns: add, select, delete, rename; Head and Tail functions; Indexing
using Labels,
• Boolean Indexing; Joining, Merging and Concatenation.
• Importing/Exporting Data between CSV files and Data Frames.

ii) Data handling using Pandas – II

• Descriptive Statistics: max, min, count, sum, mean, median, mode, quartile, Standard
deviation, variance.
• DataFrame operations: Aggregation, group by, Sorting, Deleting and Renaming Index, Pivoting.
• Handling missing values – dropping and filling.
• Importing/Exporting Data between MySQL database and Pandas.
iii) Data Visualization
• Purpose of plotting; drawing and saving following types of plots using Matplotlib – line plot, bar
graph.
• Histogram, pie chart, frequency polygon, box plot and scatter plot.
• Customizing plots: color, style (dashed, dotted), width; adding label, title, and legend in plots.

NB. RED marks have been excluded

===============================================================================
Pandas is an open-source Python Library providing high-performance Data Science for data manipulation and
analysis tool using its powerful data structures. The name Pandas is derived from the word Panel Data – an
Econometrics from Multidimensional data.

The MatPotLib Python library, developed by John Hunter and many other contributors, is used to create
high-quality graphs, charts, and figures. The library is extensive and capable of changing very minute details
of a figure. Some basic concepts and functions provided in matplotlib are:
• Figure and axes: The entire illustration is called a figure and each plot on it is an axes (do not confuse
Axes with Axis).
• Plotting: The very first thing required to plot a graph is data. A dictionary of key-value pairs can be
declared, with keys and values as the x and y values. After that, scatter(), bar(), and pie(), along with
tons of other functions, can be used to create plot.
• Axis: The figure and axes obtained using subplots() can be used for modification. Properties of the x-
axis and y-axis (labels, minimum and maximum values, etc.) can be changed using Axes.set().
Data Type of Pandas
• integer
• string
• float
• object

Data Structure of Pandas:

• Series: 1-D structure to store homogeneous (same data type) and mutable (can be modified/added)
data, but size of the series is immutable.
• DataFrame: 2-D structure to store heterogeneous (multiple data type) and mutable data.
• Panel: It is 3-D way of storing data.
Series
Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, python
objects, etc.). The axis labels are collectively called index. For example, the following series is a collection of
integers 10, 23, 56,…
10 23 56 17 52 61 73 90 26 72
Key Points:
• Homogeneous data
• Size Immutable
• Values of Data Mutable

A series can be created using various inputs like:

• List
• Dictionary
• NdArray
• Scalar value or constant

Create an Empty Series:

Example:
#import the pandas library and aliasing as pd
import pandas as pd
s = pd.Series()
print (s)

Output:
Series([], dtype: float64)

Create a Series from List:

If data is a list, then index passed must be of the same length. If no index is passed, then by default index will
be range(n) where n is array length, i.e., 0,1,2,3…. range(len(list))-1].
Example:
#import the pandas library and aliasing as pd
x=[10, 20, 30, 40, 50] # List
import pandas as pd
s = pd.Series(x) # Series
print (s)
Output:
0 10
1 20
2 30
3 40
4 50
dtype: object

Create a Series from ndarray:

If data is a ndarray, then index passed must be of the same length. If no index is passed, then by default
index will be range(n) where n is array length, i.e., *0,1,2,3…. range(len(array))-1].
Example: With integer values in a list
import pandas
x=[10,20,30,40]
print("The original list: ")
print(x)
s1=pandas.Series(x)
print("The Series: ")
print(s1)
print("With user given index: ")
s2=pandas.Series(x, index=[100,200,300,400])
print(s2)

Output:
The original list:
[10, 20, 30, 40]

The Series:
0 10
1 20
2 30
3 40
dtype: int64
With user given index:
100 10
200 20
300 30
400 40
dtype: int64

# With Text value:

import pandas as pd
x=['India', 'UK', 'USA', 'China','Russia']
print("The original list: ")
print(x)
s1=pd.Series(x)
print("The Series: ")
print(s1)
print("With user given index: ")
s2=pd.Series(x, index=[100,200,300,400,500])
print(s2)

Output:

The original list:

['India', 'UK', 'USA', 'China', 'Russia']
The Series:
0 India
1 UK
2 USA
3 China
4 Russia
dtype: object
With user given index:
100 India
200 UK
300 USA
400 China
500 Russia
dtype: object

x=['a','b','c','d'] # List
import pandas as pd import
numpy as np
data = np.array(x) # Array created by NumPy
s = pd.Series(data) # Series
print (s)

Output:
0 a
1 b
2 c
3 d
dtype: object
Array with defined indexes:
Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data, index=[100,101,102,103])
print (s)
Output:
100 a
101 b
102 c
103 d
dtype: object

Creating series on a dictionary:

A dict can be passed as input and if no index is specified, then the dictionary keys are taken in a sorted order
to construct index. If index is passed, the values in data corresponding to the labels in the index will be pulled
out.

import pandas as pd
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print (s)
Output:
a 0.0
b 1.0
c 2.0
dtype: float64

Observe − Dictionary keys are used to construct the index.

Example
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data, index=['b','c','d','a'])
print (s)

Output:
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64

Observe: Index order is persisted and the missing element is filled with NaN (Not a Number).
Example:
x={'a': “Africa”, 'b': ”Britain“, 'c': “Canada” , 'd': “Denmark”}
import pandas as pd
s = pd.Series(data) # Series
print (s)
Output:
a Africa
b Britain
c Canada
d Denmar
dtype: string

Example:

import pandas as pd
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print (s)
s1=pd.Series(data, index=['a', 'c', 'd'])
print(s1)
s2=pd.Series(data, index=['b','c','a','x'])
print(s2)
print(s.index)
print(s.dtype)
print(s.shape)

Output:
a 0.0
b 1.0
c 2.0
dtype: float64

a 0.0
c 2.0
d NaN
dtype: float64

b 1.0
c 2.0
a 0.0
x NaN
dtype: float64

Index(['a', 'b', 'c'], dtype='object')

float64

(3,)

Create a Series from Scalar:

If data is a scalar value, an index must be provided then the value will be repeated to match the length of index
import pandas as pd
s = pd.Series(15, index=[0, 1, 2, 3, 4])
print (s)

Output:
0 15
1 15
2 15
3 15
4 15
dtype: int64

Accessing Data from Series with Position:

Data in the series can be accessed similar to that in an ndarray.

Example
Retrieve/access the first element. As we already know, the counting starts from zero for the array, which
means the first element is stored at zeroth position and so on.
import pandas as pd
s = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e'])
#retrieve the first element
print (s[0])

Output:
1

Example
Retrieve/access/index the first three elements in the Series. If a: is inserted in front of it, all items from that
index onwards will be extracted. If two parameters (with: between them) is used, items between the two
indexes (not including the stop index)

import pandas as pd
s = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e']) #retrieves the first three element
print (s[:3])

Output:
a 1
b 2
c 3

dtype: int64

Example
Retrieve (slicing) the last three elements.

import pandas as pd
s = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e']) #retrieve the last three element
print (s[-3:])
Output:
c 3
d 4
e 5
dtype: int64

Retrieve Data Using Label (Index)

A Series is like a fixed-size dict in that you can get and set values by index label.

Example
Retrieve a single element using index label value.

import pandas as pd
s = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e']) #retrieve a single element
print (s['a'])

Output:
1
Example
Retrieve multiple elements using a list of index label values.

import pandas as pd
s = pd.Series([1,2,3,4,5] ,index = ['a','b','c','d','e']) #retrieve multiple elements
print (s[ ['a','c','d'] ])

Output:
a 1
c 3
d 4
dtype: int64

Example 3
If a label is not contained, an exception is raised.

import pandas as pd
s = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e'])
print (s['f']) # No output, as ‘f’ is not present

Output:
KeyError: 'f'

Creating Series using Mathematical Operations:

Example:
import numpy as np
import pandas as pd
s1=np.arange(10, 15) # Spelling is arange() [10,11,12,13,14]
print(s1)
s2=pd.Series(data=s1*2, index=s1)
print(s2)
Output:
[10, 11, 12, 13, 14]
dtype: int64

10 20
11 22
12 24
13 26
14 28
dtype: int64
Example:
import pandas as pd
s = pd.Series(range(2,15,2))
print(s)
s1=pd.Series(range(2,15,2), index=[10,11,12,13,14,16,17])
print(s1)

Output:
0 2
1 4
2 6
3 8
4 10
5 12
6 14
dtype: int64

10 2
11 4
12 6
13 8
14 10
16 12
17 14
dtype: int64

# Example: Slicing using iloc and loc

import pandas as pd
s = pd.Series(range(2,10,2))
print("#Prints the values of Series: ")
print(s)
s1=pd.Series(range(2,12,2), index=[10,11,12,13,14])
print("#Prints the of Series with user index: ")
print(s1)
print("#Prints the values of Series with default index: ")
print(s[1:4])
print("#Prints the same values of Series with default index using iloc: ")
print(s.iloc[1:4])
print("#Can't print with iloc as 11:14 are not default indexes: ")
print(s1.iloc[11:14])
print("#Prints the values of Series with user index using loc: ")
print(s1.loc[11:14]) # It prints all the values of the range (:) of labels (user index)

Output:
3 8
14 10
dtype: int64

#Prints the values of Series with default index:

1 4
2 6
3 8
dtype: int64

#Prints the same values of Series with default index using iloc:
1 4
2 6
3 8
dtype: int64

#Can't print with iloc as 11:14 are not default indexes:

Series([], dtype: int64)

#Prints the values of Series with user index using loc:

11 4
12 6
13 8
14 10
dtype: int64

# Example: Indexing and accessing using iloc and loc:

import pandas as pd
s=pd.Series([1, 2, 3, 4, 5] , index=['a', 'b','c','d','e'])
print(s)
print()
print(s.iloc[1 : 4]) # for indexing or selecting based on position
print()
print(s.loc['b' : 'e']) # Rule of range of values doesn't work here

Output:
a 1
b 2
c 3
d 4
e 5
dtype: int64

b 2
c 3
d 4
dtype: int64

b 2
c 3
d 4
e 5
dtype: int64

Difference between loc , iloc & slicing, range

import pandas as pd
s = pd.Series([11,22,33,44,55,66,77,88,99,100], index=[49,48,47,46,45, 1, 2, 3, 4, 5])
print(s.loc[:3]) # Prints the values till user's index
print()
print(s.loc[1:3]) # Prints the values till user's index
print()
print(s[:3]) # Prints the values till default index-1
print()
print(s[1:3]) # Prints the values till default index-1
print()
print(s.iloc[:3]) # Prints the values same as default index rules
print()
print(s.iloc[1:3])

Output:
-------
49 11
48 22
47 33
46 44
45 55
1 66
2 77
3 88
dtype: int64

1 66
2 77
3 88
dtype: int64

49 11
48 22
47 33
dtype: int64
49 11
48 22
47 33
dtype: int64

Head and Tail functions

To view a small sample of a Series or the DataFrame object, use the head() and the tail() methods.

head() returns the first n rows(observe the index values). The default number of elements to display is five,
but you may pass a custom number.

Example:
import pandas as pd
s = pd.Series([10,20,30,40,50], index=[1,2,3,4,5])
print(s)
print(“Head=>”)
print(s.head()) # top 5 rows by default
print(s.head(3)) # to 3 rows

Output:
1 10
2 20
3 30
4 40
5 50
dtype: int64

Head=>
1 10
2 20
3 30
4 40
5 50

dtype: int64

1 10
2 20
3 30
dtype: int64

Example:
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The original series is:")
print (s)
print ("The first two rows of the data series:")
print s.head(2)
Output:
The original series is:
0 0.720876
1 -0.765898
2 0.479221
3 -0.139547
dtype: float64

The first two rows of the data series:

0 0.720876
1 -0.765898
dtype: float64

tail() returns the last n rows(observe the index values). The default number of elements to display is five, but
you may pass a custom number.

Example:

import pandas as pd
s = pd.Series([10,20,30,40,50], index=[1,2,3,4,5])
print(s)
print(“Tail=> “)
print(s.tail()) # By default print 5 lowermost rows
print(s.tail(3)) # Prints 3 rows from bottom of the series

Output:
1 10
2 20
3 30
4 40
5 50
dtype: int64

Tail=>
1 10
2 20
3 30
4 40
5 50
dtype: int64

3 30
4 40
5 50
dtype: int64
Example:

import pandas as pd import

numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The original series is:")
print (s)
print ("The last two rows of the data series:")
print s.tail(2)

Output:
The original series is:
0 -0.655091
1 -0.881407
2 -0.608592
3 -2.341413
dtype: float64

The last two rows of the data series:

2 -0.608592
3 -2.341413
dtype: float64

Access/Replace values with Condition (Series.where() ):

Pandas Series.where() function replace values where the input condition is False for the given Series object. It takes
another object as an input which will be used to replace the value from the original object.

Syntax: Series.where(cond, other=nan, inplace=False, axis=None, level=None, errors=’raise’, try_cast=False,

raise_on_error=None)

Example: Print the series more than 50

import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s > 50)
print(a)

Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 60.0
6 70.0
7 80.0
8 90.0
9 100.0
dtype: float64
Example#2: Print the series has a value 50
import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s == 50)
print(a)

Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 50.0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
dtype: float64

Example#3: Print the series less than 50

import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s <= 50)
print(a)

Output:
0 10.0
1 20.0
2 30.0
3 40.0
4 50.0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
dtype: float64

Example: Show/Filter the values with condition in 3 different ways

import pandas as pd
s = pd.Series(range(2,10,2))
print(s)
print(s>3) # Shows only Ture those meet the condition & rest False
print(s[s>3]) # Shows the values meet the condition
print(s.where(s>3)) # Shows the values those meet the condition & rest shows NaN
Output:

0 2
1 4
2 6
3 8
dtype: int64

0 False
1 True
2 True
3 True
dtype: bool

1 4
2 6
3 8
dtype: int64

0 NaN
1 4.0
2 6.0
3 8.0
dtype: float64

Example #4: Use Series.where() function to replace values in the given Series object with some other value when
the passed condition is not satisfied.

import pandas as pd
# Creating the First Series
sr1 = pd.Series(['New York', 'Chicago', 'Toronto', 'Lisbon', 'Rio'])
sr1.index = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']
print(sr1)

# Creating the Second Series

sr2 = pd.Series(['New York', 'Bangkok', 'London', 'Lisbon', 'Brisbane'])
sr2.index = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']
print(sr2)

# Replace the values using Series.where()

a=sr1.where(sr1 == 'Rio', sr2)
print(a)

Output:
# Serites of sr1
City 1 New York
City 2 Chicago
City 3 Toronto
City 4 Lisbon
City 5 Rio
dtype: object
# Serites of sr2
City 1 New York
City 2 Bangkok
City 3 London
City 4 Lisbon
City 5 Brisbane
dtype: object

# ‘Brisbane’ of sr2 of index city5 has been replaced by ‘Rio’ of sr1 of same index
City 1 New York
City 2 Bangkok
City 3 London
City 4 Lisbon
City 5 Rio
dtype: object

Example #5 : Use Series.where() function to replace values in the given Series object with some other value when
the passed condition is not satisfied.

import pandas as pd
sr1 = pd.Series([22, 18, 19, 20, 21], index = ['Student 1', 'Student 2', 'Student 3', 'Student 4', 'Student 5'])
print(sr1)
print()

# Creating the second Series

sr2 = pd.Series([19, 16, 22, 20, 18] , index = ['Student 1', 'Student 2', 'Student 3', 'Student 4', 'Student 5'])
print(sr2)
print()
print("#Replace the values with Series.where()")
b=sr1.where(sr1>20, sr2)
print(b)

# Replace the value(s) of sr2 by sr1 with Series.where()

b=sr1.where(sr1 >20, sr2)
print(b)

Output:
Student 1 22
Student 2 18
Student 3 19
Student 4 20
Student 5 21
dtype: int64

Student 1 19
Student 2 16
Student 3 22
Student 4 20
Student 5 18
dtype: int64
#Replacing the values of sr2 by sr1 where the corresponding value of sr1 is greater than 20
Student 1 22
Student 2 16
Student 3 22
Student 4 20
Student 5 21
dtype: int64

Mathematical Operations:
+ add()
- sub(), subtract()
* mul(), multiply()
/ div(), divide()
// floordiv()
% mod()
** pow()

Example:

import pandas as pd
s1 = pd.Series([1,2,3,4])
s2 = pd.Series([10,20,30,40])
print (s1)
print (s2)
print”ADD:”, (s2+s1) # print(s2.add(s1))
print(”SUB:”, s2-s1) # print(s2.sub(s1))
print(”MUL:”, s2*s1) # print(s2.multiply(s1))
print(”DIV:”, s2/s1) # print(s2.div(s1))
print(”F.DIV:”, s2//s1) # print(s2.floordiv(s1))
print(”MOD:”, s2%s1) # print(s2.mod(s1))
print(”POW:”, s2**s1) # print(s2.pow(s1))
Output:
0 1
1 2
2 3
3 4
dtype: int64

0 10
1 20
2 30
3 40
dtype: int64

ADD:
0 11
1 22
2 33
3 44
dtype: int64
SUB:
0 9
1 18
2 27
3 36

dtype: int64

MUL:
0 11
1 22
2 33
3 44
dtype: int64

DIV:
0 10.0
1 10.0
2 10.0
3 10.0
dtype: float

F.DIV:
0 10
1 10
2 10
3 10
dtype: int64

Using range() and for loop:

import pandas as pd
s = pd.Series(range(1 , 15 , 3), index=(x for x in ‘abcde’)
print (s)

Output:
a 1
b 4
c 7
d 10
e 13
dtype: float64

###
DataFrame
DataFrame is a two-dimensional array with heterogeneous data, like a table with rows and columns.

A pandas DataFrame can be created using various inputs like

• Lists
• dict
• Series
• Numpy ndarrays
• Another DataFrame

Key Points of DataFrame:

• Heterogeneous data
• Size Mutable
• Data Mutable
• Can Perform Arithmetic operations on rows and columns

For example in the following table (DataStructure):

Name Age Gender Rating

Raman 32 Male 3.45
Jayati 28 Female 4.6
Saurav 45 Male 3.9
Kritika 38 Female 2.78

The table represents the data of a team of an organization with their overall performance rating. The data is
represented in rows and columns. Each column represents an attribute and each row represents a person.

Data Type of Columns:

The data types of the four columns are as follows −

Column Type
Name String
Age Integer
Gender String
Rating Float

1. Create an Empty DataFrame

Example:

#import the pandas library and aliasing as pd

import pandas as pd
df= pd.DataFrame()
print (df)
Output:

Empty DataFrame
Columns: []
Index: []
2. Create a DataFrame from Lists:
Example:

import pandas as pd
df= pd.D ataF rame([10, 20, 30, 40, 50])
print (df)

Output:
0
0 10
1 20
2 30
3 40
4 50

3. Create a DataFrame from Nested Lists:

Example:

import pandas as pd
df = pd.DataFrame([ [1, 2, 3, 4, 5] , [10, 20, 30, 40, 50] ])
print (df)

Output:
0 1 2 3 4

0 1 2 3 4 5 # Comes from 1st List

1 10 20 30 40 50 # Comes from 2nd List

Example: Create a DataFrame with Column names

import pandas as pd
x = [[‘XII’,101] , [‘XI’,210] , [‘X’,301]]
df = pd.DataFrame(x , columns=['Class' , '90% Score'])
print (df)

Output:
Class 90% Score
0 XII 101
1 XI 201
2 X 301

Example:
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data , columns=['Name' , 'Age'] , dtype=float)
print (df)

Output:

Name Age
0 Alex 10.0
1 Bob 12.0
2 Clarke 13.0

4. Create a DataFrame from Dictionary of ndarrays

All the ndarrays must be of same length. If index is passed, then the length of the index should equal to the
length of the arrays. If no index is passed, then by default, index will be range(n), where n is the array length.

Example:

import pandas as pd
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(d)
print (df)

Output:
col1 col2
0 1 3
1 2 4

Example

import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'] , 'Age':[28,34,29,42]}
x = pd.DataFrame(data)
print(x)

Output:
Name Age
0 Tom 28
1 Jack 34
2 Steve 29
3 Ricky 42

Example:

import pandas as pd
nme = ["aparna", "pankaj", "sudhir", "Geeku"]
deg = ["MBA", "BCA", "M.Tech", "MBA"]
scr = [90, 40, 80, 98]
dict = {'Name': nme, 'Degree': deg, 'Score': scr}
df = pd.DataFrame(dict)
print(df)
Output:

Name Degree Score

0 Aparna MBA 90
1 Pankaj BCA 40
2 Sudhir M.Tech 80
3 Geeku MBA 98

Example:

import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1' , 'rank2' , 'rank3' , 'rank4'])
print (df)

Output:
Age Name
rank1 28 Tom
rank2 34 Jack
rank3 29 Steve
rank4 42 Ricky

4.1 Example (Changing Row indexes):

import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'] ,
'Height': [5.1 , 6.2 , 5.1 , 5.2] ,
'Qualification': ['Msc' , 'MA' , 'Msc' , 'Msc']
}
df = pd.DataFrame(data)
print(df)
print("") # Print/insert a blank line
df1 = pd.DataFrame(data, index=['one', 'two', 'three', 'four']) # Assigning index
print(df1)

Output:
Height Name Qualification
0 5.1 Jai Msc
1 6.2 Princ MA
2 5.1 Gaurav Msc
3 5.2 Anuj Msc

Height Name Qualification

one 5.1 Jai Msc
two 6.2 Princ MA
three 5.1 Gaurav Msc
four 5.2 Anuj Msc
5. Sorting of data in DataFrame:

5.1 Example:

import pandas as pd
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
'Price': [22000,25000,27000,35000],
'Year': [2015,2013,2018,2018]
}
df = pd.DataFrame(cars, columns= ['Brand', 'Price','Year'])
print (df)

Output:
Brand Price Year
0 Honda Civic 22000 2015
1 Toyota Corolla 25000 2013
2 Ford Focus 27000 2018
3 Audi A4 35000 2018

5.2 Example:
import pandas as pd
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
'Price': [22000,25000,27000,35000],
'Year': [2015,2013,2018,2018]
}

_
df = pd.DataFrame(cars, columns= ['Brand', 'Price','Year']) sd=df.sort values(by=['Brand'],
ascending=False) # Descending order
print (sd)

Output:

Brand Price Year

1 Toyota Corolla 25000 2013
0 Honda Civic 22000 2015
2 Ford Focus 27000 2018
3 Audi A4 35000 2018

5.3 Example:
import pandas as pd
cars = {'Brand': ['Honda Civic' , 'Toyota Corolla', 'Ford Focus' , 'Audi A4'],
'Price': [22000, 25000, 27000, 35000],
'Year': [2015, 2013, 2018, 2018]
}
df = pd.DataFrame(cars, columns= ['Brand', 'Price' , 'Year'])
sd=df.sort_values(by=['Price'], ascending=False) # Descending order
print (sd)
Output:

Brand Price Year

3 Audi A4 35000 2018
2 Ford Focus 27000 2018
1 Toyota Corolla 25000 2013
0 Honda Civic 22000 2015

5.4 Example:

Output:
Brand Price Year
1 Toyota Corolla 25000 2013
0 Honda Civic 22000 2015
2 Ford Focus 27000 2018
3 Audi A4 35000 2018
6. Renaming a column name in DataFrame

6.1 Example (Renaming one field):

import pandas as pd
L1=[10,30,50,70,90]
print(L1) # Prints List
df=pd.DataFrame(L1)
print(df) #Prints DataFrame
df.columns=['Code'] # Renaming column
print(df)

Output:
[10 , 30 , 50 , 70 , 90]
0
0 10
1 30
2 50
3 70
4 90
Code 10
0
1 30
2 50
3 70
4 90
6.2 Renaming column using function rename()
import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print(df)
df.rename(columns={"A": "a", "B": "c"}, inplace=True)
print(df)

Output:
a c
0 1 4
1 2 5
2 3 6

Note: When inplace = True , the data is modified in place, which means it will return nothing and the
dataframe is now updated. When inplace = False , which is the default, then the operation is performed
and it returns a copy of the object. You then need to save it to something.
6.3 Example (Get new column names):

import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'] ,
'Height': [5.1 , 6.2 , 5.1 , 5.2] ,
'Qualification': ['Msc' , 'MA' , 'Msc' , 'Msc']}
df = pd.DataFrame(data, index=['one', 'two', 'three', 'four'])
print(df)
df.columns=['N. Height' , 'N. Name', 'N. Qualification'] # New column names
print(df)

Output:
Height Name Qualification
one 5.1 Jai Msc
two 6.2 Princ MA
three 5.1 Gaurav Msc
four 5.2 Anuj Msc

N. Hight N. Name N. Qualification

one 5.1 Jai Msc
two 6.2 Princ MA
three 5.1 Gaurav Msc
four 5.2 Anuj Msc

6.3 Example (Replace a specific column name):

Output:

N. Hight N. Name Degree

one 5.1 Jai Msc
two 6.2 Princ MA
three 5.1 Gaurav Msc
four 5.2 Anuj Msc

7 Adding column to a DataFrame

7.1 Example:
import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'] ,
'Height': [5.1 , 6.2 , 5.1 , 5.2] ,
'Qualification': ['Msc' , 'MA' , 'Msc' , 'Msc']
}
df = pd.DataFrame(data)
print(df) # Prints the DataFrame
print(“”)
addr = ['Delhi', 'Bangalore', 'Chennai', 'Patna'] # Declare a list
df['New Address'] = addr # Prints the DataFrame with a new column
print(df)

Output:

Name Height Qualification

0 Jai 5.1 Msc
1 Princ 6.2 MA
2 Gaurav 5.1 Msc
3 Anju 3.2 Msc

Name Height Qualification New Address

0 Jai 5.1 Msc Delhi
1 Princ 6.2 MA Bangalore
2 Gaurav 5.1 Msc Chennai
3 Anuj 5.2 Msc Patna

7.2 Example: Addition of Series to DataFrame row & column wise

import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
print(x)
print()
y = pd.Series([1, 2, 3])
print(y)
print()
new_x = x.add(y, axis=0) # Adding series to DF row-wise for axis=0 on 0th col using add() function
print(new_x)
print()
new_y = x.add(y, axis=1) # Adding series to DF col-wise for axis=1 on 0th row using add() function
print(new_y)

output;

0 1 2
0 1 4 7
1 2 5 8
2 3 6 9

0 1
1 2
2 3
dtype: int64

0 1 2
0 2 5 8
1 4 7 10
2 6 9 12

0 1 2
0 2 6 10
1 3 7 11
2 4 8 12

7.3 Example: Binary operation of DataFrame with DtaFrame row /column wise: addition
import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
print(x)
print()
print(y)
print()
x1 = x.add(y, axis=0) # Adding series to DF row-wise as axis=0 on 0th col
print(x1)
print()
y1 = x.add(y, axis=1) # Adding series to DF col-wise as axis=1 on 0th row
print(y1)

Output:
0 1 2
0 1 4 7
1 2 5 8
2 3 6 9
dtype: int64
0 1 2
0 1 4 7
1 2 5 8
2 3 6 9
dtype: int64

0 1 2
0 2 8 14
1 4 10 16
2 6 12 18
dtype: int64

Example: Binary mathematical operations of within DataFrames row wise

import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
print(x)
print()
print(y)
print("\n Addition: \n")
x1 = x.add(y, axis=0) # Row wise addition
print(x1)
print("\n Subtraction: \n")
x2 = x.sub(y, axis=0) # Row wise subtraction
print(x2)
print("\n Multiplication: \n")
x3 = x.mul(y, axis=0) # Row wise multiplication
print(x3)
print("\n Division: \n")
x4 = x.div(y, axis=0) # Row wise division
print(x4)

Output:
0 1 2
0 1 4 7
1 2 5 8
2 3 6 9

0 1 2
0 1 4 7
1 2 5 8
2 3 6 9
Addition:
0 1 2
0 2 8 14
1 4 10 16
2 6 12 18

Subtraction:
0 1 2
0 0 0 0
1 0 0 0
2 0 0 0

Multiplication:
0 1 2
0 1 16 49
1 4 25 64
2 9 36 81

Division:
0 1 2
0 1.0 1.0 1.0
1 1.0 1.0 1.0
2 1.0 1.0 1.0

7.5 Example: Binary mathematical operations of within DataFrames column wise

import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
print(x)
print()
print(y)
print("\n Addition: \n")
x1 = x.add(y, axis=1) # Col wise addition
print(x1)
print("\n Subtraction: \n")
x2 = x.sub(y, axis=1) # Col wise subtraction
print(x2)
print("\n Multiplication: \n")
x3 = x.mul(y, axis=1) # Colwise multiplication
print(x3)
print("\n Division: \n")
x4 = x.div(y, axis=1) # Col wise division
print(x4)

Output:
0 1 2
0 1 4 7
1 2 5 8
2 3 6 9
Addition:

0 1 2
0 2 8 14
1 4 10 16
2 6 12 18

Subtraction:

0 1 2
0 0 0 0
1 0 0 0
2 0 0 0

Multiplication:

0 1 2
0 1 16 49
1 4 25 64
2 9 36 81

Division:

0 1 2
0 1.0 1.0 1.0
1 1.0 1.0 1.0
2 1.0 1.0 1.0

8 Selecting columns from DataFrame:

8.1 Example:
import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd']
}
df = pd.DataFrame(data)
x1=df[df.columns[1 : 4]]
print(x1) # From 2nd col to 4th col
print(“”)
x2=df[df.columns[1:3] # From 2nd col to 3rd col
print(x2)
x3=df[df.columns[:]]
print(x3) # All columns
print(“”)
x4=df[df.columns[:2]]
print(x4) # First 1st col to 2nd col
Output:
Age Address Qualification
0 27 Delhi Msc
1 24 Kanpur MA
2 22 Allahabad MCA
3 32 Kannauj Phd

Age Address
0 27 Delhi
1 24 Kanpur
2 22 Allahabad
3 32 Kannauj

Name Age Address Qualification

0 Jai 27 Delhi Msc
1 Princ 24 Kanpur MA
2 Gaurav 22 Allahabad MCA
3 Anuj 32 Kannauj Phd

Name Age
0 Jai 27
1 Princ 24
2 Gaurav 22
3 Anuj 32

8.1 Selecting data from DataFrame with iteration:

import pandas as pd
# import numpy as np1
raw_data1 = { 'name': ['freya', 'mohak'],
'age': [10, 1],
'favorite_color': ['pink', 'blue'],
'grade': [88, 92]}
df1 = pd.DataFrame(raw_data1, columns = ['name', 'age', 'favorite_color', 'grade'])
for index, row in df1.iterrows(): # Reads data with index row wise & returns to vriable row
print (row["name"], row["age"])

Output:
freya 10
mohak 1

9 Adding a row to a DataFrame (1st way)

import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'] ,
'Height': [5.1 , 6.2 , 5.1 , 5.2] ,
'Qualification': ['Msc' , 'MA' , 'Msc' , 'Msc']
}
df = pd.DataFrame(data)
print(df) # Prints the DataFrame
x = {'Name': 'Amit', 'Height': 5.5, 'Qualification': ‘PhD’}
df = df.append(x, ignore_index = True)
# ignore_index : bool, default False
# If True, do not use the index labels, the resulting axis will be labeled 0, 1,..., n-1.
print(df)

Output:

Name Height Qualification

0 Jai 5.1 Msc
1 Princ 6.2 MA
2 Gaurav 5.1 Msc
3 Anju 3.2 Msc

Name Height Qualification

0 Jai 5.1 Msc
1 Princ 6.2 MA
2 Gaurav 5.1 Msc
3 Anju 3.2 Msc
4 Amit 5.5 PhD

9.1 Adding a row to a DataFrame (2nd way)

import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'] ,
'Height': [5.1 , 6.2 , 5.1 , 5.2] ,
'Qualification': ['Msc' , 'MA' , 'Msc' , 'Msc']
}
df = pd.DataFrame(data)
print(df) # Prints the DataFrame
print("")
df.loc[len(df.index)] = ['Amit', 5.5, 'PhD']
# df.loc locates the index to place values of the new row
# len(df.index) returens number of indexes = no. of rows+1 i.e. last index+1 i.e. new index
print(df)

Name Height Qualification

0 Jai 5.1 Msc
1 Princ 6.2 MA
2 Gaurav 5.1 Msc
3 Anuj 5.2 Msc

Name Height Qualification

0 Jai 5.1 Msc
1 Princ 6.2 MA
2 Gaurav 5.1 Msc
3 Anuj 5.2 Msc
4 Amit 5.5 PhD
10 Select Row / Column using iloc (Index Location):

Syntax: df.iloc[ range of rows(x:y) , range of columns(a:b)]

df.iloc[x:y , a:b]

11. Example:

import pandas as pd
data = { 'Name': ['Jai', 'Princ', 'Gaurav', 'Anuj'], 'Age': [27,
24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd']
}
df = pd.DataFrame(data)
,
x1=df.iloc[ : 1 : 4] # First part is range of rows , Second part is range of columns
print(x1)
x2=df.iloc[1:2 , 1:4]
print(x2)
x3=df.iloc[1: , 1:]
print(x3)
4=df.iloc[: , :]
print(x4)

Output:

Age Address Qualification

0 27 Delhi Msc
1 24 Kanpur MA
2 22 Allahabad MCA
3 32 Kannauj Phd

Age 24 Address Qualification

1 Kanpur
MA
Age Address Qualification
1 24 Kanpur MA
2 22 Allahabad MCA
3 32 Kannauj Phd

Name Age Address Qualification

0 Jai 27 Delhi Msc
1 Princ 24 Kanpur MA
2 Gaurav 22 Allahabad MCA
3 Anuj 32 Kannauj Phd
Crating DataFrame with Series and adding columns to DataFrame

import pandas as pd
d={ 'one' : pd .Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd .Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df=pd.DataFrame(d)
print(df)
print()
df['three']=pd.Series([10,20,30],index=['a','b','c']) # Adding a new column
print(df)
print()
df ['four']=df ['one']+df ['three'] # Adding a new column made by sum of other columns
print(df )
print()
print (df['one']) # Displaying a column data

Output:
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4

one two three

a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN

one two three four

a 1.0 1 10.0 11.0
b 2.0 2 20.0 22.0
c 3.0 3 30.0 33.0
d NaN 4 NaN NaN

a 1.0
b 2.0
c 3.0
d NaN
Name: one, dtype: float64

4 Deleting Row / Column:

10.1 Example:

Output:
Name Age Address Qualification
0 Jai 27 Delhi Msc
1 Princ 24 Kanpur MA
2 Gaurav 22 Allahabad MCA
3 Anuj 32 Kannauj Phd

Deleting a column using del:

Output:
Age Address Qualification
0 27 Delhi Msc
1 24 Kanpur MA
2 22 Allahabad MCA
3 32 Kannauj Phd

Deleting a column using pop():

Output:

Name Address Qualification

0 Jai Delhi Msc
1 Princ Kanpur MA
2 Gaurav Allahabad MCA
3 Anuj Kannauj Phd
Deleting a columns using drop():

Output:
Name Age Qualification
0 Jai 27 Msc
1 Princ 24 MA
2 Gaurav 22 MCA
3 Anuj 32 Phd

Deleting a rows using drop():

Output:
Name Age Address Qualification
0 Jai 27 Delhi Msc
3 Anuj 32 Kannauj Phd

Joining DataFrames with append() function:

import pandas as pd
df1 = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df1 = df1.append(df2)
print (df1)

Output;
a b
0 1 2
1 3 4
0 5 6
1 7 8

Joining DataFrames concat() function:

11.1 Example (Default index):

import pandas as pd
data1 = { 'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}

data2 = { 'Name':['Abhi', 'Ayushi', 'Dhiraj', 'Hitesh'],

'Age':[17, 14, 12, 52],
'Address':['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
'Qualification':['Btech', 'B.A', 'Bcom', 'B.hons']}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

res1 = pd.concat( [df1,df2], axis=0) #Works on Rows for each Column

print(res1)
print(“”)
res2 = pd.concat([df1,df2], axis=1) # Works on Columns for each Row
print(res2)

Output:
Name Age Address Qualification
0 Jai 27 Nagpur Msc
1 Princi 24 Kanpur MA
2 Gaurav 22 Allahabad MCA
3 Anuj 32 Kannuaj Phd
0 Abhi 17 Nagpur Btech
1 Ayushi 14 Kanpur B.A
2 Dhiraj 12 Allahabad Bcom
3 Hitesh 52 Kannuaj B.hons

[8 rows x 4 columns]
Name Age Address Qualification Name Age Address Qualification
0 Jai 27 Nagpur Msc Abhi 17 Nagpur Btech
1 Princi 24 Kanpur MA Aydhi 14 Kanpur B.A
2 Gaurav 22 Allahabad MCA Dhiraj 12 Allahabad Bcom
3 Anuj 32 Kannuaj Phd Hitesh 52 Kannuaj B.hons
[4 rows x 8 columns]

NOTE:
• axis=0 acts on all the ROWS in each COLUMN
• axis=1 acts on all the COLUMNS in each ROW
• by default axis=0
Merging data within DataFrames using merge():

import pandas as pd
x = pd.DataFrame({'id':[1,2],'Name': ['anil', 'vishal'], 'subject_id':['sub1','sub2']})
y = pd.DataFrame({'id':[1,2],'Name': ['sumer', 'salil'], 'subject_id':['sub2','sub4']})
print(pd.merge(x , y , on='id'))

Output:
id Name_x subject_id_x Name_y subject_id_y
0 1 anil sub1 sumer sub2
1 2 vishal sub2 salil sub4

11. Boolean Indexing:

Boolean indexing helps us to select the data from the DataFrames using a boolean vector. We need a
DataFrame with a boolean index to use the boolean indexing. Let's see how to achieve the Boolean. indexing.

• Create a dictionary of data.

• Convert it into a DataFrame object with a boolean index
• Now, access the data using boolean indexing.

import pandas as pd
data = {
'Name': ['Hafza', 'Srikanth', 'Rakesh'],
'Age': [19, 20, 19]
}
df = pd.DataFrame(data, index = [True, False, True]) # Creating a DataFrame with boolean index
print(df)
print(“”) #Prints a blank line
print(df.loc[True])
print(“”)
print(df.iloc[1])

Output:

Name Age
True Hafza 19
False Srikanth 20
True Rakesh 19

Name Age
True Hafza 19
True Rakesh 19

Name Srikanth
Age 20
Selecting data from DataFrame with boolean indexing:

import pandas as pd
dict = {'name':['Mohak', "Freya", "Roshni"],
'degree': ["MBA", "BCA", "M.Tech"],
'score':[90, 40, 80]}
# creating a dataframe with boolean index
df = pd.DataFrame(dict, index = [True, False, True]) # accessing a dataframe using .loc[] function
print(df.loc[True]) #it will return rows of Mohak and Roshni only(matching true only)

Output:
name degree score
True Mohak MBA 90
True Roshni M.Tech 80

Python | Pandas Series.where()

Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric
python packages. Pandas is one of those packages and makes importing and analyzing data much easier.

Pandas series is a One-dimensional ndarray with axis labels. The labels need not be unique but must be a hashable
type. The object supports both integer- and label-based indexing and provides a host of methods for performing
operations involving the index.

Syntax: Series.where(cond, other=nan, inplace=False, axis=None, level=None, errors=’raise’, try_cast=False,

raise_on_error=None)

Parameters :
cond : boolean NDFrame, array-like, or callable
other : scalar, NDFrame, or callable
inplace : boolean, default False
axis : int, default None
level : int, default None
errors : str, {‘raise’, ‘ignore’}, default raise
try_cast : boolean, default False

Example : Print the series more than 50

import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s > 50)
print(a)

Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 60.0
6 70.0
7 80.0
8 90.0
9 100.0
dtype: float64

Example : Print the series has a value 50

import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s == 50)
print(a)

Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 50.0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
dtype: float64

Example#3: Print the series less than 50

import pandas as pd
s = pd.Series([10,20,30,40,50,60,70,80,90,100])
a=s.where(s <= 50)
print(a)

Output:
0 10.0
1 20.0
2 30.0
3 40.0
4 50.0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
dtype: float64
Example #1: Use Series.where() function to replace values in the given Series object with some other value when the
passed condition is not satisfied.

# importing pandas as pd
import pandas as pd

# Creating the First Series

sr1 = pd.Series(['New York', 'Chicago', 'Toronto', 'Lisbon', 'Rio'])

# Creating the row axis labels

sr1.index = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']

# Print the series

print(sr1)

# Creating the second Series

sr2 = pd.Series(['New York', 'Bangkok', 'London', 'Lisbon', 'Brisbane'])

# Creating the row axis labels

sr2.index = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']

# Print the series

print(sr2)

# replace the values

sr1.where(sr1 == 'Rio', sr2)

Example #2 : Use Series.where() function to replace values in the given Series object with some other value when
the passed condition is not satisfied.

# importing pandas as pd
import pandas as pd

# Creating the First Series

sr1 = pd.Series([22, 18, 19, 20, 21])
# Creating the row axis labels
sr1.index = ['Student 1', 'Student 2', 'Student 3', 'Student 4', 'Student 5']

# Print the series

print(sr1)

# Creating the second Series

sr2 = pd.Series([19, 16, 22, 20, 18])

# Creating the row axis labels

sr2.index = ['Student 1', 'Student 2', 'Student 3', 'Student 4', 'Student 5']

# Print the series

print(sr2)

# replace the values

sr1.where(sr1 >20, sr2)

12.1 Accessing/importing/reading csv File in Pandas

A CSV is a comma-separated values file, which allows data to be saved in a tabular format. CSVs look like a
garden-variety spreadsheet but with a .csv extension. CSV files can be used with most any spreadsheet
program, such as Microsoft Excel or Google Spreadsheets.
emp1.xlsx
Emp ID Emp Name Emp Role
1 Pankaj Kumar Admin
2 David Lee Editor
3 Lisa Ray Author

emp1.csv
Emp ID, Emp Name, Emp Role
1, Pankaj Kumar, Admin
2, David Lee, Editor 3,
3, Lisa Ray, Author
import pandas as pd
df = pd.read _csv(c:/mydata/class12/'emp1.csv')
print(df)

Output:
Emp ID Emp Name Emp Role
0 1 Pankaj Kumar Admin
1 2 David Lee Editor
2 3 Lisa Ray Author

import pandas as pd
# importing Data here from remote source of .csv file
df = pd.read_csv('emp1.csv')
x=[50000 , 55000, 60000]
df['Salary'] = x
print(df)

Output:
Emp ID Emp Name Emp Role Salary
0 1 Pankaj Kumar Admin 50000
1 2 David Lee Editor 55000
2 3 Lisa Ray Author 60000

12.2 Exporting DataFrame to csv File

import pandas as pd

df.to_csv('emp2.csv', index=True)
df.to_csv('emp2.csv', index=False) # Without indexing

Output: (index=True)
Emp ID Emp Name Emp Role Salary
0 0 1 Pankaj Kumar Admin 50000
1 1 2 David Lee Editor 55000
2 2 3 Lisa Ray Author 60000

Output: (index=False)
Emp ID Emp Name Emp Role Salary
0 1 Pankaj Kumar Admin 50000
1 2 David Lee Editor 55000
2 3 Lisa Ray Author 60000

13. DataFrame Functions:

Example of a DataFrame:
import pandas as pd
x = { 'Name': ['Jai', 'Prince', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd'] }
df = pd.DataFrame(x)
print(df)

Output:
Name Age Address Qualification
0 Jai 27 Delhi Msc
1 Prince 24 Kanpur MA
2 Gaurav 22 Allahabad MCA
3 Anuj 32 Kannauj Phd

13.1 Example: max()

import pandas as pd
x = { 'Name': ['Jai', 'Prince', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd'] }
df = pd.DataFrame(x)
print(df['Age'].max())

Output:
32

13.2 Example: min()

Output:
22

13.3 Example: count() # It counts the number of values present in the column
import pandas as pd
x = { 'Name': ['Jai', 'Prince', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd'] }
df = pd.DataFrame(x)
print(df['Age'].count())

Output:
4

13.4 Example: sum() # Finds the total/addition of the values of the column
import pandas as pd
x = { 'Name': ['Jai', 'Prince', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd'] }
df = pd.DataFrame(x)
print(df['Age'].sum())

Output:
105

15.2 Altering / Renaming Series Label

import pandas as pd
import numpy as np
data = np.array([54,76,88,99,34])
s1 = pd.Series(data,index=['a','b','c','d','e'])
print (s1)
s2=s1.rename(index={'a':0, 'b':1})
print(“After reindexing: \n“, s2)

OUTPUT
a 54
b 76
c 88
d 99
e 34

dtype: int32

0 54
1 76
c 88
d 99
e 34
dtype: int32

15.3 Re-indexing Rows in Pandas Dataframe:

import pandas as pd
import numpy as np
table = {"name": ['vishal', 'anil', 'mayur', 'viraj','mahesh'],
'age':[15, 16, 15, 17,16],
'weight': [51, 48, 49, 51,48],
'height': [5.1, 5.2, 5.1, 5.3,5.1],
}
d = pd.DataFrame(table)
print("DATA OF DATAFRAME")
print(d)
print("DATA OF DATAFRAME AFTER REINDEX")
df=d.reindex([2,1, 0,4,3])
print(df)

Output:
DATA OF DATAFRAME
name age weight height
0 vishal 15 51 5.1
1 anil 16 48 5.2
2 mayur 15 49 5.1
3 viraj 17 51 5.3
4 mahesh 16 48 5.1

DATA OF DATAFRAME AFTER REINDEX

name age weight height
2 mayur 15 49 5.1
1 anil 16 48 5.2
0 vishal 15 51 5.1
4 mahesh 16 48 5.1
3 viraj 17 51 5.3

15.4 Re-Indexing Columns in Pandas

DataFrame: import pandas as pd
import numpy as np
table = {"name": ['vishal', 'anil', 'mayur', 'viraj','mahesh'],
'age':[15, 16, 15, 17,16],
'weight': [51, 48, 49, 51,48],
'height': [5.1, 5.2, 5.1, 5.3,5.1],
}
d = pd.DataFrame(table)
print("DATA OF DATAFRAME=>")
print(d)
print()
print("DATA OF DATAFRAME AFTER REINDEX=>")
df=d.reindex(columns=['name','weight','age'])
print(df)

Output:

DATA OF DATAFRAME=>
name age weight height
0 vishal 15 51 5.1
1 anil 16 48 5.2
2 mayur 15 49 5.1
3 viraj 17 51 5.3
4 mahesh 16 48 5.1

DATA OF DATAFRAME AFTER REINDEX=>

name weight age
0 vishal 51 15
1 ani 48 16
2 mayur 49 15
3 viraj 51 17
4 mahesh 48 16
17. Data Visualisation:
There are various types of Data Visualisation techniques, are as follows:
• Histogram
• Line plot
• Bar chart

17.1 Histogram:

A histogram is an accurate graphical representation of the distribution of numerical data. It is an

estimate of the probability distribution of a continuous variable (quantitative variable) and was first
introduced by Karl Pearson. It is a kind of bar graph. To construct a histogram, the first step is to “bin”
the range of values — that is, divide the entire range of values into a series of intervals — and then
count how many values fall into each interval. The bins are usually specified as consecutive, non-
overlapping intervals of a variable. The bins (intervals) must be adjacent, and are often (but are not
required to be) of equal size. A histogram shows the number of occurrences of different values in a
dataset.

Histogram in Python:

Drawing a histogram in Python is very easy. All we have to do is code for 3-4 lines of code. But
complexity is involved when we are trying to deal with live data for visualization.
To draw histogram in python following concepts must be clear.
Title: To display heading of the histogram.
Color : To show the colour of the bar.
Axis: y-axis and x-axis.
Data: The data can be represented as an array.
Height and width of bars. This is determined based on the analysis.
The width of the bar is called bin or intervals. Default size = 10
Border colour: To display border colour of the bar.

Matplotlib is the whole python package/ library used to create 2D graphs and plots by using python
scripts. pyplot is a module in matplotlib, which supports a very wide variety of graphs and plots namely
- histogram, bar charts, power spectra, error charts etc. It is used along with NumPy to provide an
environment for MatLab.

Pyplot provides the state-machine interface to the plotting library in matplotlib.It means that figures
and axes are implicitly and automatically created to achieve the desired plot. For example, calling plot
from pyplot will automatically create the necessary figure and axes to achieve the desired plot. Setting
a title will then automatically set that title to the current axes object.The pyplot interface is generally
preferred for non-interactive plotting (i.e., scripting).

Difference between a histogram and a bar chart / graph:

A bar chart majorly represents categorical data (data that has some labels associated with it), they
are usually represented using rectangular bars with lengths proportional to the values that they
represent. While histograms on the other hand, is used to describe distributions. Given a set of data,
are their distributions.
17.1.1 Program of Histogram in python:

import pandas as pd

import matplotlib.pyplot as plt # MAT-PLOT-LIB <DOT>PY-PLOT

x={"AvMark": [90,95,95, 93, 94,78,69,85,74,86,75,79,98] }
df=pd.DataFrame(x)
print(df)
df.hist() # hist() Converts the df in Histogram (Chart/Bar)

plt.show() # Prints the histogram

AvMark
0 90
1 95
2 95
3 93
4 94
5 78
6 69
7 85
8 74
9 86
10 75
11 79
12 98

17.2.1a Example: Plot a Histogram to show the various frequencies (bin with distribution) of given marks

import matplotlib.pyplot as plt

marks=[90, 95, 80, 80, 80, 91, 75, 75, 77, 50, 65,65,55, 69, 74, 75, 85, 85 ]
range=[0,10,20,30,40,50,60,70,80,90,100] # Works as Bin value
plt.xticks(range) # Fix the values in x-axis
plt.yticks([0,1,2,3,4,5,6,7,8,9,10, 11,12,13,14,15]) # Fix the values in y-axis
plt.xlabel("The Rage of marks")
plt.ylabel("Student's Marks")
plt.title("Histogram of Result")
plt.hist(marks, range, width=5) # Width of each bar has been reduced
plt.show()

17.2.1b Example:

import pandas as pd
import matplotlib.pyplot as plt
x = {'Age': [27, 24, 22, 32, 33, 32], 'Points': [3,5,7, 9, 7, 9] }
df = pd.DataFrame(x)
hist = df.hist()
plt.show()
17.1.2 Example:
import pandas as pd
import matplotlib.pyplot as plt
data={ 'length': [15, 5, 12, 12, 12, 5], 'width': [7, 2, 15, 2, 5, 7] }
df = pd.DataFrame(data)
hist = df.hist(bins=3) # Bin is 3
plt.show()

17.1.3 Example:
import pandas as pd
import matplotlib.pyplot as plt
data={'length': [15, 5, 12, 12, 12, 5], 'width': [7, 2, 15, 2, 5, 7] }
df = pd.DataFrame(data)
hist = df.hist(bins=5) # Bin is 5
plt.show()

17.1.4 Example:
import pandas as pd
import matplotlib.pyplot as plt
data={'length': [15, 5, 12, 12, 12, 5], 'width': [7, 2, 15, 2, 5, 7] }
df = pd.DataFrame(data)
hist = df.hist(bins=10) # Bin is 10
plt.show()

17.1.5 Example of a Histogram

Jeff is the branch manager at a local bank.
Recently, Jeff‟s been receiving customer
feedback saying that the wait times for a client
to be served by a customer service
representative are too long. Jeff decides to
observe and write down the time spent by each
customer on waiting. Here are his findings from
observing and writing down the wait times
spent by 20 customers:

17.1.6 Histogram Data - Example

The corresponding histogram with 5-second bins (5-second intervals) would look as follows:
We can see that:

There are 3 customers waiting between 1 and 35 seconds

There are 5 customers waiting between 1 and 40 seconds
There are 5 customers waiting between 1 and 45 seconds
There are 5 customers waiting between 1 and 50 seconds
There are 2 customers waiting between 1 and 55 seconds
43.1 , 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6, 45.2, 54.1,
45.6, 36.5, 43.1

Syntax:
import matplotlib.pyplot as plt
x = [value1, value2, value3,. .. ]
plt.hist(x, bins = number of bins)
plt.show()

17.1.7 Example:
import matplotlib.pyplot as plt
wt=[43.1 , 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6,
45.2, 54.1, 45.6, 36.5, 43.1]
plt.hist(wt, 10)
plt.show()

17.1.8 Example: (No bins i.e. default bins=10)

import matplotlib.pyplot as plt

wt=[43.1, 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6, 45.2,
54.1, 45.6, 36.5, 43.1]
plt.hist(wt, edgecolor='red')
plt.show()

17.1.9 Example:

import matplotlib.pyplot as plt

wt=[43.1 , 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6, 45.2,
54.1, 45.6, 36.5, 43.1]
plt.hist(wt, bins=5, edgecolor='red')
plt.show()
17.1.9 Example:
import matplotlib.pyplot as plt
wt=[43.1 , 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4,
35.6, 45.2, 54.1, 45.6, 36.5, 43.1]
bn=[30,35,40,45,50,55]
plt.hist(wt, bn, edgecolor='red')
plt.show()
17.1.10 Example:

import matplotlib.pyplot as plt

wt=[43.1, 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2,
47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6, 45.2, 54.1, 45.6, 36.5, 43.1]
plt.hist(wt, bins=10, edgecolor='red')
plt.xlabel('Value=>',fontsize=15)
plt.ylabel('Frequency=>',fontsize=15)
plt.title('Normal Distribution Histogram', fontsize=15)
plt.show()

Example: With customization (Label, Titile, Font size, Edge colour, Face colour. )

import matplotlib.pyplot as plt

wt=[43.1, 36.6, 37.6, 36.5, 45.3, 43.5, 40.3, 50.2, 47.3, 31.2, 42.2, 45.5, 30.3, 31.4, 35.6, 45.2, 54.1, 45.6,
36.5, 43.1]
plt.hist(wt, bins=10, edgecolor='y', facecolor='r')
plt.xteicks()
plt.xlabel('Value=>', fontsize=15)
plt.ylabel('Frequency=>', fontsize=15)
plt.title('Normal Distribution Histogram', fontsize=15)
plt.savefig('Wait_time.jpeg') # savefig() func. Saves the graph in picture format (jpeg, png…)
plt.show()
17.1.11 Example:

import numpy as np
import matplotlib.pyplot as plt
plt.hist([5,15,25,35,45,55], bins=[0,10,20,30,40,50, 60], weights=[20,10,45,33,6,8], edgecolor="red")
plt.show()

# First argument of hist() method is position (x , y Coordinate) of weight, where weight is to be

displayed. No of coordinates must match No of weight otherwise error will generate
# Second argument is interval / Bins
# Third argument is weight for bars
# Fourth argument is the Border-colour of each bar

Creating Histogram on array of data

For better understanding we develop the same program with minor change.

import matplotlib.pyplot as plt

plt.hist([5,15,25,35,15, 55], bins=[0,10,20,30,40,50, 60], weights=[20,10,45,33,6,8], edgecolor="red")
plt.show()

# At interval (bin) 40 to 50 no bar because we have not mentioned position from 40 to 50 in first
argument(list) of hist method. Where as in interval 10 to 20 width is being Displayed as 16 (10+6
both weights are added) because 15 is twice In first argument.
Example (Histogram Type „step‟):

import numpy as np
import matplotlib.pyplot as plt
data = [1,11,21,31,41]
plt.hist([5,15,25,35,15, 55], bins=[0,10,20,30,40,50, 60], weights=[20,10,45,33,6,8],
edgecolor="red", histtype='step') #plt.hist(data, bins=20, histtype='step')
plt.xlabel('Value')
plt.ylabel('Probability')
plt.title('Histogram')
plt.show()

Note: Bin: In histogram total range of data set

(minimum to maximum) is divided into 8 to 15
equal parts. These equal parts are known as
Bins or class-intervals.

17.2 PyPlot Application to display a Line Chart:

17.2.1 Example:

import matplotlib.pyplot as plt

plt.plot([1,2,3,4,5]) # Generates or co-ordinates of the values in reference to Y-Axis
plt.show()
Example:
import matplotlib.pyplot as plt
plt.grid() # Reformat the layout with grid
plt.plot([1,2,3,4,5]) # Generates or co-ordinates of the values in reference to Y-Axis
plt.show()

17.2.2 Example:

import matplotlib.pyplot as plt

plt.grid() # Reformat the layout with grid
plt.plot([1,2,3,4,5] , [2,4,6,8,5]) # Generates or co-ordinates of the values in reference to Y-Axis
plt.show()
17.2.3 PyPlot application to display multiple lines

Example:

import matplotlib.pyplot as plt

plt.plot([2,4,6,8,10,12],[1, 2, 3, 4, 5, 4], 'g') # Creates a line in ref to X-Axis & Y-Axis respectability of Green colour(‘g’)
plt.plot([1,2,3,4,5,6], [1, 2, 3, 4, 5, 4], 'b') # Creates a line in ref to X-Axis & Y-Axis respectability of Blue colour(‘b’)
plt.grid() # Reformat the layout with grid
plt.title('My Pyplot') # Displays title of the layout at the top
plt.ylabel('some numbers') # Displays label of Y-Axis
plt.xlabel('values') # Displays label of X-Axis
plt.show()

17.2.4 Application of arrange() function:

Example:

import numpy as np
import matplotlib.pyplot as plt
x=np.arange(1,5,1)
plt.plot(x, 'r') # ‘r’ makes red colour of the line generated according to value of x
plt.plot(x+1, 'y') # ‘y’ makes yellow colour of the line according to value of x+1
plt.plot(x+2, 'b') # ‘b’ makes blue colour of the line according to value of x+2
plt.show()
X=[1, 2, 3, 4]
X+1 = [2 , 3, 4, 5]
X+2= [3 , 4, 5, 6]

Note:
arange(a,b,c): arange() function generates values from starting value(a) up to before stop value (b)
incremented by third value(c) which is optional. In the above example, [1, 2, 3, 4] values will be
generated for x variable, where initial value is 1, final is 5 and increment value is 1.

17.3 BAR CHART:

A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with
heights or lengths proportional to the values that they represent. The bars can be plotted vertically or
horizontally.

A bar graph shows comparisons among discrete categories. One axis of the chart shows the specific
categories being compared, and the other axis represents a measured value.

Matplotlib API provides the bar() function that can be used in the MATLAB style use as well as object
oriented API.

17.3.1 Example:

import matplotlib.pyplot as plt

langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
plt.bar(langs, students, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
plt.show()
17.3.2 Example

17.3.3 Example:

import matplotlib.pyplot as plt

langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
plt.bar(langs , students, width=.5, color='g')
plt.xlabel('X-Axis==>', fontsize=20)
plt.ylabel('Y-Axis==>', fontsize=20)
plt.xticks(langs, fontsize=10, rotation=90)
plt.yticks(students, fontsize=15, rotation=60)
plt.title('Bar Graph of Result=> ', fontsize=20)
plt.show()

17.3.4 Example:

import matplotlib.pyplot as plt

x=['MATHS', 'SCIENCE', 'SST', 'ENG', 'IT'] y=[98,
89,90,91,99]
langs = ['C', 'C++', 'Java', 'Python', 'PHP'] students =
[23,17,35,29,12]
plt.bar(x , y, width=0.5 , color='r')
plt.bar(langs,students, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
plt.show()
17.3.5 Example:

import matplotlib.pyplot as plt

x=['MATHS', 'SCIENCE', 'SST', 'ENG', 'IT']
y=[98, 89,90,91,99]
x=['C', 'C++', 'Java', 'Python', 'PHP']
y=[8, 5,9,6,7]
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [3,7,5,9,2]
plt.bar(x , y, width=0.5 , color='r')
plt.bar(langs,students, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
plt.show()

17.3.6 Example:

import matplotlib.pyplot as plt

x=['Sc', 'Math', 'Sst', 'Eng', 'IT']
y=[1,3,5,7,9]
langs = ['C', 'C++', 'Java', 'Python', 'PHP'] students =
[2,4,6,8,10]
plt.bar(x , y, width=0.5 , color='r')
plt.bar(langs,students, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
plt.show()

17.3.7 Example: Using legend()

import matplotlib.pyplot as plt

x=[1,3,5,7,9]
y=['Sc', 'Math', 'Sst', 'Eng', 'IT']
x1 = [2,4,6,8,10]
y1 = ['C', 'C++', 'Java', 'Python', 'PHP']
a=plt.bar(x , y, width=0.5 , color='r')
b=plt.bar(x1,y1, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
plt.legend([‘a’,’b’]) # Legend
plt.show()
17.3.8 Example:

import matplotlib.pyplot as plt

y=[10,30,20,50,40]
x=[1,3,5,7,9]
y1 = [11,22,55, 33, 44]
x1 = [2,4,6,8,10]
plt.bar(x , y, width=0.5 , color='r')
plt.bar(x1,y1, width=.5, color='g')
plt.xlabel('X-Axis==>')
plt.ylabel('Y-Axis==>')
plt.title('Bar Graph of Result=> ')
#plt.legend()
plt.show()

17.3.9 Example: Using Panda‟s DataFrame:

import pandas as pd
import matplotlib.pyplot as plot
data = {"Production":[10000, 12000, 14000],
"Sales":[9000, 10500, 12000]
}
index= ["2017", "2018", "2019"]
df = pd.DataFrame(data=data, index=index) # data variable has two sets of values as X-axis
df.plot.bar(rot=15, title="Annual Production Vs Annual Sales")
plot.show()

Unit I: Data Handling Using Pandas and Data Visualization: Marks:25
No ratings yet
Unit I: Data Handling Using Pandas and Data Visualization: Marks:25
135 pages
Pandas
No ratings yet
Pandas
86 pages
Python Matplotlib: Gaurav Verma
100% (1)
Python Matplotlib: Gaurav Verma
21 pages
Pandas Class XII (2021-22)
No ratings yet
Pandas Class XII (2021-22)
246 pages
Financial Analytics With Python
100% (1)
Financial Analytics With Python
40 pages
XII-IP - Data Visualisation
No ratings yet
XII-IP - Data Visualisation
65 pages
Pandas
No ratings yet
Pandas
167 pages
Python Pandas
No ratings yet
Python Pandas
177 pages
A CR CCP 702 PF 001 Red Star IG
No ratings yet
A CR CCP 702 PF 001 Red Star IG
730 pages
03-Python Libraries - Numpy - Matplotlib
No ratings yet
03-Python Libraries - Numpy - Matplotlib
56 pages
기존 시설물 (기초및지반) 내진성능 평가요령 (안)
No ratings yet
기존 시설물 (기초및지반) 내진성능 평가요령 (안)
216 pages
Pandas
No ratings yet
Pandas
41 pages
Study Material IP XII
No ratings yet
Study Material IP XII
116 pages
Motivational Cognitive Behavioural Therapy Distinctive Features 1st Edition Optimized EPUB Download
100% (19)
Motivational Cognitive Behavioural Therapy Distinctive Features 1st Edition Optimized EPUB Download
16 pages
Python Programming Notes
No ratings yet
Python Programming Notes
143 pages
Data Structures Cheat Sheet
100% (1)
Data Structures Cheat Sheet
1 page
Designing Machine Learning Workflows in Python Chapter2
No ratings yet
Designing Machine Learning Workflows in Python Chapter2
39 pages
Interfacing of LED 8051
No ratings yet
Interfacing of LED 8051
16 pages
Sustainability - Feature Story
No ratings yet
Sustainability - Feature Story
2 pages
? Gallery Walk Scoring Rubric
No ratings yet
? Gallery Walk Scoring Rubric
2 pages
Top 50 Pandas Interview Questions and Answers (2024)
No ratings yet
Top 50 Pandas Interview Questions and Answers (2024)
34 pages
Python Interview Questions
No ratings yet
Python Interview Questions
61 pages
Numpy Python
No ratings yet
Numpy Python
36 pages
Python For Finance - The Complete Beginner's Guide - by Behic Guven - Jul, 2020 - Towards Data Science PDF
100% (1)
Python For Finance - The Complete Beginner's Guide - by Behic Guven - Jul, 2020 - Towards Data Science PDF
12 pages
Data Analysis and Visualisation With Python
No ratings yet
Data Analysis and Visualisation With Python
75 pages
Brosur Grolen HP19R
No ratings yet
Brosur Grolen HP19R
2 pages
DSL Pandas
No ratings yet
DSL Pandas
87 pages
Iot Systems - Logical Design Using Python: Bahga & Madisetti, © 2015
No ratings yet
Iot Systems - Logical Design Using Python: Bahga & Madisetti, © 2015
31 pages
Chapter 1 Review of Python Basicseng PDF
No ratings yet
Chapter 1 Review of Python Basicseng PDF
51 pages
Class XII Data Handlinng Using PandasI
No ratings yet
Class XII Data Handlinng Using PandasI
46 pages
Python Cheat Sheet For Excel Users
No ratings yet
Python Cheat Sheet For Excel Users
5 pages
Data Visualization - Matplotlib PDF
100% (1)
Data Visualization - Matplotlib PDF
15 pages
Panda Cheatsheet
No ratings yet
Panda Cheatsheet
17 pages
Informatics Practices Practical List22-2323
100% (1)
Informatics Practices Practical List22-2323
7 pages
BS 1449-1 - 1983
100% (7)
BS 1449-1 - 1983
39 pages
Sas#4 - Ite 303-Sia
No ratings yet
Sas#4 - Ite 303-Sia
10 pages
CodeAcademyNotes Python
No ratings yet
CodeAcademyNotes Python
42 pages
NumPy Notes
No ratings yet
NumPy Notes
13 pages
Numpy Complete Material
No ratings yet
Numpy Complete Material
19 pages
Sheet Metal Shop Exp 1.3
No ratings yet
Sheet Metal Shop Exp 1.3
30 pages
Data Handling Using Pandas-1
No ratings yet
Data Handling Using Pandas-1
25 pages
Tuple in Python PDF
No ratings yet
Tuple in Python PDF
20 pages
1 Pandas Basics
No ratings yet
1 Pandas Basics
13 pages
Python Database Connectivity
No ratings yet
Python Database Connectivity
12 pages
LMRS Ip 2020 21
No ratings yet
LMRS Ip 2020 21
21 pages
ML Lab8
No ratings yet
ML Lab8
28 pages
Pandas
No ratings yet
Pandas
41 pages
Customer Segmentation Clustering
No ratings yet
Customer Segmentation Clustering
35 pages
Block 1-Data Handling Using Pandas DataFrame
No ratings yet
Block 1-Data Handling Using Pandas DataFrame
17 pages
Pandas Handbook
No ratings yet
Pandas Handbook
33 pages
Python Program
No ratings yet
Python Program
7 pages
Getting Started With NumPy in Data Analytics
No ratings yet
Getting Started With NumPy in Data Analytics
45 pages
Introduction To Data Visualization in Python
No ratings yet
Introduction To Data Visualization in Python
16 pages
Mosi Debat
No ratings yet
Mosi Debat
8 pages
Data Analytics With Python-1
No ratings yet
Data Analytics With Python-1
12 pages
A Conversation With William Rathje-Anthropology Today
No ratings yet
A Conversation With William Rathje-Anthropology Today
7 pages
3.IP.12.NOTES - Societal Impacts2022 PDF
No ratings yet
3.IP.12.NOTES - Societal Impacts2022 PDF
14 pages
Project Name: Wilmont's Pharmacy Drone Case: Qualitative Risk Analysis
100% (1)
Project Name: Wilmont's Pharmacy Drone Case: Qualitative Risk Analysis
3 pages
Pandas in Python 16sept2022
No ratings yet
Pandas in Python 16sept2022
8 pages
Pandas Notes
No ratings yet
Pandas Notes
4 pages
Red Zone Equipment Checklist
No ratings yet
Red Zone Equipment Checklist
4 pages
Delineating The Epistemological Trajectory of Learning Theories: Implications For Mathematics Teaching and Learning
No ratings yet
Delineating The Epistemological Trajectory of Learning Theories: Implications For Mathematics Teaching and Learning
18 pages
Pandas Notes
No ratings yet
Pandas Notes
6 pages
Collaborative Learning
No ratings yet
Collaborative Learning
7 pages
Unit 4 - Week 2: Introduction To Python: Assignment 2
No ratings yet
Unit 4 - Week 2: Introduction To Python: Assignment 2
4 pages
XS Series E Appen 7 Installation PDF
No ratings yet
XS Series E Appen 7 Installation PDF
101 pages
Python Generators: How To Create A Generator in Python?
No ratings yet
Python Generators: How To Create A Generator in Python?
8 pages
De Chuyen Anh Vinh Phuc 2018-2019
No ratings yet
De Chuyen Anh Vinh Phuc 2018-2019
6 pages
Artificial Intelligence Lab Manual: Python
No ratings yet
Artificial Intelligence Lab Manual: Python
15 pages
Reflow Soldering
No ratings yet
Reflow Soldering
6 pages
Python Numpy (1) : Intro To Multi-Dimensional Array & Numerical Linear Algebra
100% (1)
Python Numpy (1) : Intro To Multi-Dimensional Array & Numerical Linear Algebra
27 pages
International Indian School, Riyadh WORKSHEET (2020-2021) Grade - Xii - Informatics Practices - Second Term
No ratings yet
International Indian School, Riyadh WORKSHEET (2020-2021) Grade - Xii - Informatics Practices - Second Term
9 pages
Working2022 PDF
No ratings yet
Working2022 PDF
9 pages
Teacher Notes and Answers 8 Fluid Mechanics
No ratings yet
Teacher Notes and Answers 8 Fluid Mechanics
3 pages
Kinetic Theory & Thermal Properties Notes IGCSE AVG
100% (3)
Kinetic Theory & Thermal Properties Notes IGCSE AVG
12 pages
EDA With Pandas
No ratings yet
EDA With Pandas
8 pages
Matplotlib PDF
No ratings yet
Matplotlib PDF
16 pages
Strings PDF
No ratings yet
Strings PDF
14 pages
MSA Case Studies
No ratings yet
MSA Case Studies
10 pages
IP 12 BoardPracPracticeQuestions PDF
No ratings yet
IP 12 BoardPracPracticeQuestions PDF
2 pages
Pandas Notes Basic To Advance
No ratings yet
Pandas Notes Basic To Advance
21 pages
2 Ip 12 Notes RDBMS 2022 PDF
No ratings yet
2 Ip 12 Notes RDBMS 2022 PDF
16 pages
14 AAU - Level 6 - Test - Challenge - Unit 4
100% (13)
14 AAU - Level 6 - Test - Challenge - Unit 4
5 pages
Salary Prediction LinearRegression
100% (1)
Salary Prediction LinearRegression
7 pages
Evidence and Practice
No ratings yet
Evidence and Practice
18 pages
Solutions To Pandas Basic Questions
No ratings yet
Solutions To Pandas Basic Questions
1 page
Python Date Time
No ratings yet
Python Date Time
6 pages
Molo District Health Center: AP (Pre-Natal)
No ratings yet
Molo District Health Center: AP (Pre-Natal)
2 pages
Numpy Cheat Sheet & Quick Reference
100% (1)
Numpy Cheat Sheet & Quick Reference
6 pages
Department of Education: Daily Lesson Log (DLL)
0% (1)
Department of Education: Daily Lesson Log (DLL)
2 pages
vb8 Datasheet
No ratings yet
vb8 Datasheet
9 pages
Pandas Guide
No ratings yet
Pandas Guide
64 pages
11 2 Multi-Step Subtraction Problems
No ratings yet
11 2 Multi-Step Subtraction Problems
2 pages
1.safety Inspection Check List
No ratings yet
1.safety Inspection Check List
2 pages