Pandas Dataframe

A DataFrame is a two-dimensional data structure where data is aligned in a tabular format with rows and columns. It can be created from many different input types like lists, dictionaries, and NumPy arrays. DataFrames allow labeling of rows and columns and can perform arithmetic operations on rows and columns. Common operations on DataFrames include selecting, adding, deleting, and renaming columns as well as selecting, adding, deleting, and sorting rows.

Uploaded by

James Prakash


 A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.

 Features of DataFrame
 Columns can be of different types
 Size – Mutable
 Labeled axes (rows and columns)
 Can perform arithmetic operations on rows and columns
 A pandas DataFrame can be created using the following constructor −
 pandas.DataFrame(data, index, columns, dtype, copy)
 A pandas DataFrame can be created using various
inputs like −
 Lists
 dict
 Series
 Numpy ndarrays
 Another DataFrame

In the subsequent slides of this lecture, we will see how to create a DataFrame using these inputs.
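The list of inputs includes NumPy ndarrays and another DataFrame, neither of which gets its own slide. The following is a minimal sketch of both, using the constructor parameters named above; the labels r1, r2, x, y and z are made up for illustration.

```python
import numpy as np
import pandas as pd

# A 2x3 NumPy array; each inner list becomes one row
arr = np.array([[1, 2, 3], [4, 5, 6]])

# index labels the rows, columns labels the columns, dtype sets the element type
df = pd.DataFrame(arr, index=['r1', 'r2'], columns=['x', 'y', 'z'], dtype=float)
print(df)

# Building from another DataFrame with copy=True gives an independent copy
df_copy = pd.DataFrame(df, copy=True)
df_copy.loc['r1', 'x'] = 100.0
print(df.loc['r1', 'x'])  # the original frame is unchanged
```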
 Create an Empty DataFrame
 The most basic DataFrame that can be created is an empty DataFrame.
# Import the pandas library and alias it as pd
import pandas as pd
df = pd.DataFrame()
print(df)
 Create a DataFrame from Lists
 The DataFrame can be created using a single list or a list of lists.
 Example 1
import pandas as pd
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data)
print(df)
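 Example 3
import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
# Cast only the numeric column to float
df['Age'] = df['Age'].astype(float)
print(df)
 Note − Observe that the cast changes the type of the Age column to floating point. Older pandas releases accepted dtype=float in the constructor for this data; recent releases raise an error because the Name column cannot be converted.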
 Create DataFrame from Dictionary using default Constructor
 The DataFrame constructor accepts a data object that can be an ndarray, a dictionary, etc.
 If we pass a dictionary as data, its values should be list-like objects such as Series, arrays or lists, i.e.
# Dictionary with list objects as values
studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}
 On initialising a DataFrame object with this kind of dictionary, each item (key/value pair) in the dictionary is converted to one column, i.e. the key becomes the column name and the list in the value field becomes the column data.
# Dictionary with list objects as values
import pandas as pd
studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}
dfObj = pd.DataFrame(studentData)
print(dfObj)
 All the ndarrays must be of the same length. If an index is passed, then the length of the index should equal the length of the arrays.
 If no index is passed, then by default the index will be range(n), where n is the array length.
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
df = pd.DataFrame(data)
print(df)
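The rule about index length can be checked directly. This sketch reuses the same data; the labels rank1 to rank4 are illustrative, not from the slides.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}

# Four values per column, so the index needs exactly four labels
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
print(df)

# An index of the wrong length is rejected with a ValueError
try:
    pd.DataFrame(data, index=['rank1', 'rank2'])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```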
 A list of dictionaries can be passed as input data to create a DataFrame. The dictionary keys are by default taken as column names.
 The following example shows how to create a DataFrame by passing a list of dictionaries.
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
# The first dictionary has no key 'c', so that cell is NaN
print(df)
 Example 2
 The following example shows how to create a DataFrame by passing a list of dictionaries and the row indices.
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print(df)
 The following example shows how to create a DataFrame with a list of dictionaries, row indices, and column indices.
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
# Two column labels, both matching dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
# Two column labels, one of which ('b1') is not a dictionary key, so that column is all NaN
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
 A dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all the Series indexes passed.
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
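Because the index is a union, some cells have no source value. A short sketch of what happens to them, using the same two Series:

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# Row 'd' exists only in 'two', so 'one' holds NaN there
print(df['one'].isna())

# The NaN forces column 'one' to a floating-point dtype; 'two' stays integer
print(df.dtypes)
```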
 To work with columns, we perform basic operations such as selecting, deleting, adding and renaming.
 Column Selection:
import pandas as pd
# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into a DataFrame
df = pd.DataFrame(data)
# Select two columns
print(df[['Name', 'Qualification']])
 Column Addition:
 To add a column to a pandas DataFrame, we can declare a new list and assign it as a column of an existing DataFrame.
import pandas as pd
# Define a dictionary containing students' data
dic = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
       'Height': [5.1, 6.2, 5.1, 5.2],
       'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
# Convert the dictionary into a DataFrame
df = pd.DataFrame(data=dic)
# Declare a list that is to be converted into a column
address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']
# Use 'Address' as the column name and assign the list to it
df['Address'] = address
# Observe the result
print(df)
 Column Addition:
import pandas as pd
dic = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
       'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(dic)
# Add a new column to an existing DataFrame by assigning a Series under a new column label
df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(df)
# Add a new column computed from existing columns
df['four'] = df['one'] + df['three']
print(df)
 The key data structure in pandas is called?

 A. Keyframe
B. DataFrame
C. Statistics
D. Econometrics
 Which of the following inputs can be accepted by DataFrame?
a) Structured ndarray
b) Series
c) DataFrame
d) All of the mentioned
 Identify the correct statement:
 A. The standard marker for missing data in Pandas
is NaN
 B. Series act in a way similar to that of an array
 C. Both of the above
 D. None of the above
 If data is an ndarray, index must be the same
length as data.
a) True
b) False
 Column Deletion
 Columns can be deleted, popped or dropped.
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
     'three': pd.Series([10, 20, 30], index=['a', 'b', 'c'])}
df = pd.DataFrame(d)
print(df)
# Using the del statement
del df['one']
print(df)
# Using the pop method, which removes the column and returns it
df.pop('two')
print(df)
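The slide mentions dropping but only shows del and pop. A minimal sketch of the third option, DataFrame.drop, on a small made-up frame:

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6], 'three': [7, 8, 9]})

# drop returns a new DataFrame; df itself still has all three columns
smaller = df.drop(columns=['one', 'three'])
print(smaller)
print(list(df.columns))
```

Unlike del and pop, drop leaves the original frame untouched unless the result is assigned back.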
import pandas as pd
# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into a DataFrame
df = pd.DataFrame(data)
# Select all rows and the second to fourth columns
print(df[df.columns[1:4]])
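Renaming is listed among the basic column operations but is not illustrated on the slides. One common way is DataFrame.rename with a columns mapping; the new name FullName below is only an example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Age': [27, 24]})

# Map old column names to new ones; unlisted columns keep their names
renamed = df.rename(columns={'Name': 'FullName'})
print(renamed)
```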
 Selection by Label
 Rows can be selected by passing a row label to the loc indexer.
 loc is label-based, which means that you specify rows and columns by their row and column labels.
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.loc['b'])
 Selection by integer location
 Rows can be selected by passing an integer location to the iloc indexer.
 iloc is integer-position based, so you specify rows and columns by their integer positions.
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.iloc[2])
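Both indexers also accept a column selector, and they slice differently. A brief side-by-side sketch on the same frame:

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# Label-based: row 'b', column 'two'
by_label = df.loc['b', 'two']
# Position-based: second row, second column (the same cell)
by_position = df.iloc[1, 1]
print(by_label, by_position)

# loc slices include the end label; iloc slices exclude the end position
label_slice = df.loc['a':'c']
position_slice = df.iloc[0:2]
print(label_slice)
print(position_slice)
```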
 Select Multiple Rows
 Multiple rows can be selected using the ':' (slice) operator.
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Rows at positions 2 and 3 (labels 'c' and 'd')
print(df[2:4])
 Addition of Rows
 Add new rows to a DataFrame by concatenating another DataFrame with pd.concat, which places the new rows at the end. (The older DataFrame.append method was removed in pandas 2.0.)
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = pd.concat([df, df2])
print(df)
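When rows from two frames are combined, each frame's original row labels are kept by default. The sketch below, using pd.concat, shows the resulting duplicate labels and the ignore_index option that renumbers them.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# Default: original labels are kept, so 0 and 1 each appear twice
stacked = pd.concat([df, df2])
print(list(stacked.index))

# ignore_index=True renumbers the rows from 0 to n-1
renumbered = pd.concat([df, df2], ignore_index=True)
print(list(renumbered.index))
```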
 Deletion of Rows
 Use an index label to delete or drop rows from a DataFrame. If the label is duplicated, then multiple rows will be dropped.
 In the example below, the labels are duplicated because the two frames are concatenated without renumbering.
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = pd.concat([df, df2])
# Drop rows with label 0 (both rows carrying that label are removed)
df = df.drop(0)
print(df)
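If only one of the duplicated rows should go, the labels need to be made unique first. One possible approach, sketched here with reset_index (not something the slides prescribe):

```python
import pandas as pd

a = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
b = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = pd.concat([a, b])

# Label 0 occurs twice, so two rows are removed
dropped = df.drop(0)
print(dropped)

# Renumber the rows first, then drop removes exactly one row
unique = df.reset_index(drop=True)
only_first = unique.drop(0)
print(only_first)
```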
 A DataFrame is two-dimensional, and df.shape gives its shape as (rows, columns).
 This is a tuple, so if we need to store the number of rows and columns in separate variables, we can unpack it: rows, cols = df.shape
 The pandas head() method returns the top n (5 by default) rows of a DataFrame or Series.
 We can get a summary of the numeric data in the DataFrame, such as its max, min and mean, with just one command: df.describe()
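The three inspection tools above can be tried on a small frame. The data here is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Age': [27, 24, 22, 32, 29, 41],
                   'Height': [5.1, 6.2, 5.1, 5.2, 5.8, 5.5]})

# shape is a (rows, columns) tuple and can be unpacked
rows, cols = df.shape
print(rows, cols)

# head() returns the first 5 rows by default; head(2) returns the first 2
first_five = df.head()
first_two = df.head(2)
print(first_two)

# describe() returns a DataFrame of summary statistics, one column per numeric column
stats = df.describe()
print(stats.loc[['min', 'max', 'mean']])
```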
 The function to see the first few observations in a data frame is
 A. dataframe_object.head()
 B. dataframe_object.start()
 C. head()
 D. All
 What is the syntax to remove column from
dataframe
 A. del dataframe_object(Column_name)
 B. del Column_name
 C. del dataframe_object()
 D.None of the above
 The syntax to check uniqueness of labels
 A.df.index.is_unique
 B. df.is_unique
 C. index.is_unique
 D. None of the above
 What is the method for generating multiple
statistics
 A. df.explain()
 B. df.stat()
 C. df.describe()
 D. All
 What is the syntax for reading a csv file into
dataframe in pandas
 A. df = pd.read_csv(file_name.csv)
 B. df = pd.read_csv()
 C. df = read_csv(file_name.csv)
 D. All
 What function is used to fill missing data
 A. df.fillna(value)
 B. fillna(value)
 C. df.fillna()
 D. fillna()
 The operator used for concatenation of
strings is
 A. :
 B. +
 C. *
 D. All
 The index of last character in the string is
 A. 0
 B. 1
 C. N
 D. N -1
