Pandas DataFrame Basics Guide

Pandas DataFrame is a two-dimensional data structure that allows labeling of rows and columns for analysis of tabular data. It consists of series objects containing data, rows, and columns. Basic operations on a DataFrame include creating, selecting, adding, and deleting rows and columns. Missing data is represented by NaN values and can be checked, filled, or dropped from the DataFrame.

Uploaded by

Ben Ten


Python Data Frame

PREPARED BY
R.AKILA.AP(SG)/CSE
BSACIST
Pandas

At the very basic level, Pandas objects can be thought of as enhanced versions of NumPy structured arrays.
The rows and columns are identified with labels rather than simple integer indices.
Pandas provides a host of useful tools, methods, and functionality on top of these data structures.
Three fundamental Pandas data structures: Series, DataFrame, and Index.
Pandas Series
A Pandas Series is a one-dimensional array of indexed data. It can be created from a list or array.
The Series has both a sequence of values and a sequence of indices, which we can access with the values and index attributes. The values are simply a familiar NumPy array.
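As a minimal sketch of the points above, the following builds a Series from a list and reads its values and index attributes. The sample numbers are illustrative assumptions, not data from the slides.

```python
import pandas as pd

# Build a Series from a plain Python list (illustrative values).
data = pd.Series([0.25, 0.5, 0.75, 1.0])

values = data.values  # the underlying NumPy array
index = data.index    # the default integer index: 0, 1, 2, 3

print(values)
print(index)
```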
The essential difference is the presence of the index: while the NumPy array has an implicitly defined integer index used to access the values, the Pandas Series has an explicitly defined index associated with the values.
This explicit index definition gives the Series object additional capabilities. For example, the index need not be an integer, but can consist of values of any desired type. If we wish, we can use strings as an index.
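A short sketch of a string index, assuming the hypothetical labels "a" to "d" and illustrative values:

```python
import pandas as pd

# Attach an explicit string index to the values.
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=["a", "b", "c", "d"])

# Items are then looked up by label rather than by integer position.
item = data["b"]
print(item)
```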
Series as Specialized Dictionary

A dictionary is a structure that maps arbitrary keys to a set of arbitrary values, and a Series is a structure that maps typed keys to a set of typed values.
This typing is important: just as the type-specific compiled code behind a NumPy array makes it more efficient than a Python list for certain operations, the type information of a Pandas Series makes it much more efficient than Python dictionaries for certain operations.
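The dictionary analogy can be sketched by constructing a Series directly from a Python dict. The subject names and scores below are hypothetical. Unlike a dict, the Series also supports array-style operations such as label slicing.

```python
import pandas as pd

# A hypothetical dict of subject -> score.
scores_dict = {"math": 90, "physics": 85, "chemistry": 78}

# The dict keys become the index and the dict values become the data.
scores = pd.Series(scores_dict)

# Dictionary-style lookup by key.
physics_score = scores["physics"]

# Array-style slicing by label; both endpoints are included.
first_two = scores["math":"physics"]

print(scores)
print(first_two)
```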
Pandas DataFrame

A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
A Pandas DataFrame consists of three principal components: the data, the rows, and the columns.
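The three components can be inspected on any DataFrame. This is a minimal sketch that assumes a small hypothetical frame with row labels "r1" and "r2":

```python
import pandas as pd

# A small hypothetical frame with explicit row labels.
df = pd.DataFrame(
    {"Name": ["Jai", "Princi"], "Height": [5.1, 6.2]},
    index=["r1", "r2"],
)

print(df.values)   # the data, as a 2-D NumPy array
print(df.index)    # the row labels
print(df.columns)  # the column labels
```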

Basic operation on Pandas DataFrame

Creating a DataFrame
Dealing with Rows and Columns
Indexing and Selecting Data
Working with Missing Data
Iterating over rows and columns
In the real world, a Pandas DataFrame is usually created by loading a dataset from existing storage, such as a SQL database, a CSV file, or an Excel file.
A Pandas DataFrame can also be created from lists, from a dictionary, from a list of dictionaries, etc.
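As a sketch of loading from storage, the following reads CSV text with pd.read_csv. To keep the example self-contained, the CSV content is a hypothetical in-memory string wrapped in io.StringIO; in practice a file path would normally be passed instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "Name,Height\nJai,5.1\nPrinci,6.2\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```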
Creating a DataFrame using a list
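A minimal sketch of building a DataFrame from a list. The list contents and the column name "Word" are illustrative assumptions.

```python
import pandas as pd

# A hypothetical list of strings.
words = ["pandas", "makes", "tables", "easy"]

# Each list element becomes one row; the column name is supplied explicitly.
df = pd.DataFrame(words, columns=["Word"])
print(df)
```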

Creating DataFrame from dict of ndarray/lists

To create a DataFrame from a dict of ndarrays/lists, all the ndarrays must be of the same length.
If an index is passed, then the length of the index should be equal to the length of the arrays.
If no index is passed, then by default the index will be range(n), where n is the array length.
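The rules above can be sketched as follows, using hypothetical names, ages, and index labels. The last part shows that lists of unequal length are rejected with a ValueError.

```python
import pandas as pd

# Hypothetical data: every list has the same length (4).
data = {"Name": ["Tom", "Nick", "Krish", "Jack"], "Age": [20, 21, 19, 18]}

# No index passed: the index defaults to 0..n-1.
df_default = pd.DataFrame(data)

# Explicit index: its length must match the length of the lists.
df_labeled = pd.DataFrame(data, index=["rank1", "rank2", "rank3", "rank4"])

# Lists of different lengths raise a ValueError.
try:
    pd.DataFrame({"a": [1, 2, 3], "b": [1, 2]})
    raised = False
except ValueError:
    raised = True

print(df_default)
print(df_labeled)
print(raised)
```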
Dealing with Rows and Columns

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
We can perform basic operations on rows/columns like selecting, deleting, adding, and renaming.
Column Selection: In order to select a column in a Pandas DataFrame, we can access the column by its column name.
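Column selection by name might look like the sketch below, which reuses the sample data from these slides. A single name returns a Series, while a list of names returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj"],
    "Height": [5.1, 6.2, 5.1, 5.2],
    "Qualification": ["Msc", "MA", "Msc", "Msc"],
})

# One column name selects a single column as a Series.
names = df["Name"]

# A list of column names selects several columns as a DataFrame.
subset = df[["Name", "Qualification"]]

print(names)
print(subset)
```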
Column Addition: A new column can be added by assigning a list to a new column name.
import pandas as pd

# Define a dictionary containing the data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}

# Convert the dictionary into a DataFrame
df = pd.DataFrame(data)

# Declare a list that is to be converted into a column
address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

# Use 'Address' as the column name and assign the list to it
df['Address'] = address

# Observe the result
print(df)
DataFrame after adding the new column
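Deleting and renaming, the other two basic operations listed earlier, can be sketched with drop() and rename(). The new column name "Degree" is a hypothetical choice. Both methods return a new DataFrame by default and leave the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj"],
    "Height": [5.1, 6.2, 5.1, 5.2],
    "Qualification": ["Msc", "MA", "Msc", "Msc"],
})

# Rename a column ("Degree" is a hypothetical new name).
renamed = df.rename(columns={"Qualification": "Degree"})

# Delete a column by name.
without_height = df.drop(columns=["Height"])

# Delete a row by its index label.
without_first_row = df.drop(index=[0])

print(renamed)
print(without_height)
print(without_first_row)
```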
Row Selection: Pandas provides dedicated indexers to retrieve rows from a DataFrame.
DataFrame.loc[] is used to retrieve rows from a Pandas DataFrame by label.
Rows can also be selected by passing an integer location to iloc[].
Dealing with rows

# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv", index_col="Name")

# retrieving rows by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]

print(first, "\n\n\n", second)

Selecting a single row

Indexing a DataFrame using .iloc[]:

This indexer allows us to retrieve rows and columns by position.
In order to do that, we need to specify the positions of the rows that we want, and the positions of the columns that we want as well.
The df.iloc indexer is very similar to df.loc but only uses integer locations to make its selections.
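A minimal sketch of positional selection with iloc, compared with label-based selection through loc. It reuses the sample data from these slides and assumes hypothetical row labels "a" to "d".

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Jai", "Princi", "Gaurav", "Anuj"],
        "Height": [5.1, 6.2, 5.1, 5.2],
        "Qualification": ["Msc", "MA", "Msc", "Msc"],
    },
    index=["a", "b", "c", "d"],  # hypothetical row labels
)

# First row by integer position.
first_row = df.iloc[0]

# Rows at positions 1 and 2, columns at positions 0 and 1 (end is exclusive).
block = df.iloc[1:3, 0:2]

# The same first row, selected by label with loc.
same_row = df.loc["a"]

print(first_row)
print(block)
```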
Working with Missing Data

Missing data can occur when no information is provided for one or more items or for a whole unit.
Missing data is a very big problem in real-life scenarios.
Missing data is also referred to as NA (Not Available) values in pandas.
Checking for missing values using isnull() and notnull():
In order to check for missing values in a Pandas DataFrame, we use the functions isnull() and notnull().
Both functions help in checking whether a value is NaN or not.
These functions can also be used on a Pandas Series in order to find null values in the Series.
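The checks described above might be sketched like this, assuming a small hypothetical score table with two missing entries:

```python
import numpy as np
import pandas as pd

# Hypothetical scores with two missing entries.
df = pd.DataFrame({
    "First Score": [100, 90, np.nan, 95],
    "Second Score": [30, 45, 56, np.nan],
})

# True where a value is missing.
missing_mask = df.isnull()

# True where a value is present (the exact opposite of isnull()).
present_mask = df.notnull()

# Number of missing values in each column.
missing_per_column = df.isnull().sum()

print(missing_mask)
print(missing_per_column)
```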
Filling missing values using fillna(), replace() and interpolate():
In order to fill null values in a dataset, we use the fillna(), replace() and interpolate() functions. These functions replace NaN values with some value of their own.
All these functions help in filling null values in the datasets of a DataFrame.
The interpolate() function is also used to fill NA values in the DataFrame, but it uses various interpolation techniques to fill the missing values rather than hard-coding the value.
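The three filling approaches can be compared on one small Series. The values and the replacement constants (0 and -1) are illustrative assumptions. With no arguments, interpolate() uses linear interpolation.

```python
import numpy as np
import pandas as pd

# A hypothetical Series with two gaps.
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

# Replace every NaN with a fixed value.
filled = s.fillna(0)

# replace() can also target NaN explicitly.
replaced = s.replace(np.nan, -1)

# Estimate each gap from its neighbours (linear by default).
interpolated = s.interpolate()

print(filled)
print(replaced)
print(interpolated)
```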
Dropping missing values

Dropping missing values using dropna():

In order to drop null values from a DataFrame, we use the dropna() function. This function drops rows/columns of datasets with null values in different ways.
For example, we can drop rows with at least one NaN value (null value).
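A minimal sketch of dropna() on a hypothetical three-column frame, showing row-wise dropping, column-wise dropping, and the how="all" option:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: columns A and B each have one missing value.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [4.0, 5.0, np.nan],
    "C": [7.0, 8.0, 9.0],
})

# Drop rows that contain at least one NaN (the default behaviour).
rows_any = df.dropna()

# Drop columns that contain at least one NaN.
cols_any = df.dropna(axis=1)

# Drop only rows in which every value is NaN (none here).
rows_all = df.dropna(how="all")

print(rows_any)
print(cols_any)
print(rows_all)
```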
