Notes On Pandas.

Pandas is a Python library designed for data manipulation and analysis, providing functions for cleaning, exploring, and analyzing datasets. It includes data structures like Series (one-dimensional) and DataFrames (two-dimensional) for organizing data, and allows for operations such as data cleaning, correlation analysis, and loading data from CSV files. Pandas was created by Wes McKinney in 2008; its name references both 'Panel Data' and 'Python Data Analysis'.

Uploaded by

tajeshwarsingh400

What is Pandas?

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was
created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

What Can Pandas Do?

Pandas gives you answers about the data, such as:

● Is there a correlation between two or more columns?

● What is the average value?
● Max value?
● Min value?
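
The questions above can be sketched in a few lines. This is a minimal illustration, not part of the original notes: the sample values are hypothetical (borrowed from the calories/duration examples later in these notes), and it uses a DataFrame, which is introduced further down.

```python
import pandas as pd

# Hypothetical sample data, reusing the calories/duration columns from later examples
df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

# Pearson correlation between two columns
correlation = df["calories"].corr(df["duration"])

# Average, maximum, and minimum of one column
average = df["calories"].mean()
highest = df["calories"].max()
lowest = df["calories"].min()

print(correlation)
print(average, highest, lowest)
```

A correlation close to 1 suggests the two columns rise and fall together. With only three rows, as here, the number is for illustration only.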

Pandas can also delete rows that are not relevant or that contain wrong values, such as empty or
NULL values. This is called cleaning the data.
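
As a rough sketch of cleaning, the example below removes rows with empty values using dropna(). The data is hypothetical and was made up for this illustration. Dropping rows is only one possible cleaning strategy.

```python
import pandas as pd

# Hypothetical data with one missing value (None is stored as NaN)
df = pd.DataFrame({
    "calories": [420, None, 390],
    "duration": [50, 40, 45],
})

# dropna() returns a new DataFrame without the rows that contain missing values;
# the original df is left unchanged
cleaned = df.dropna()

print(cleaned)
```

The row with the missing calories value is dropped, so only the rows labeled 0 and 2 remain.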

What is a Series?
A Pandas Series is like a column in a table.

It is a one-dimensional array holding data of any type.

Create a simple Pandas Series from a list:

import pandas as pd
a = [1, 7, 2]

data = pd.Series(a)

print(data)

Labels
If nothing else is specified, the values are labeled with their index number. The first value has
index 0, the second value has index 1, and so on.

This label can be used to access a specified value.
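
For instance, a small sketch of reading one value by its default label, reusing the list from the example above:

```python
import pandas as pd

a = [1, 7, 2]

data = pd.Series(a)

# With the default labels, 0 refers to the first value
first = data[0]

print(first)
```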

Create Labels
With the index argument, you can name your own labels.
Create your own labels:

import pandas as pd

a = [1, 7, 2]

data = pd.Series(a, index = ["x", "y", "z"])

print(data)

Output:

x    1
y    7
z    2
dtype: int64
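
Once labels are set, a value can be looked up by its label. A minimal sketch, reusing the same values and labels as above:

```python
import pandas as pd

a = [1, 7, 2]

data = pd.Series(a, index=["x", "y", "z"])

# Look up a value by its custom label
value = data["y"]

print(value)
```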

Key/Value Objects as Series

You can also use a key/value object, like a dictionary, when creating a Series.
Create a simple Pandas Series from a dictionary:

import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}

data = pd.Series(calories)

print(data)

Output:

day1    420
day2    380
day3    390
dtype: int64

To select only some of the items in the dictionary, use the index argument and specify only the
items you want to include in the Series.

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

data = pd.Series(calories, index = ["day1", "day2"])

print(data)

Output:

day1    420
day2    380
dtype: int64

DataFrames
Datasets in Pandas are usually multi-dimensional tables, called DataFrames.
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table
with rows and columns.

A Series is like a single column; a DataFrame is the whole table.

Create a DataFrame from a dictionary of two lists (each list becomes a column):

import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}

# load the dictionary into a DataFrame
df = pd.DataFrame(data)

print(df)

Output:

   calories  duration
0       420        50
1       380        40
2       390        45

Locate Row
As you can see from the result above, the DataFrame is like a table with rows and columns.

Pandas uses the loc attribute to return one or more specified rows.

Example 1

#refer to the row index:

print(df.loc[0])

Output:

calories    420
duration     50
Name: 0, dtype: int64

Note: This example returns a Pandas Series.

Example 2

#use a list of indexes:

print(df.loc[[0, 1]])

Output:

   calories  duration
0       420        50
1       380        40

Note: When a list of indexes is passed inside the brackets, as in loc[[0, 1]], the result is a Pandas DataFrame.

Named Indexes
With the index argument, you can name your own indexes.
Example

Add a list of names to give each row a name:

import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}

# give each row a name through the index argument
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

Output:

      calories  duration
day1       420        50
day2       380        40
day3       390        45

Locate Named Indexes

Use the named index in the loc attribute to return the specified row(s).

Example
Return "day2":

#refer to the named index:

print(df.loc["day2"])

Output:

calories    380
duration     40
Name: day2, dtype: int64
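
Several named rows can be returned at once by passing a list of names to loc. The following is a minimal sketch that rebuilds the same DataFrame so it can run on its own. The variable name subset is chosen only for this illustration.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}

df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A list of named indexes returns a DataFrame with just those rows
subset = df.loc[["day1", "day3"]]

print(subset)
```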

Read CSV Files

A simple way to store large data sets is to use CSV files (comma-separated values files).

CSV files contain plain text in a well-known format that can be read by most tools, including
Pandas.

In our examples we will be using a CSV file called 'data.csv'.

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Example

Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd

df = pd.read_csv('data.csv')

print(df)
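
The contents of 'data.csv' are not shown in these notes, so the sketch below uses a small in-memory stand-in instead. The CSV text is hypothetical. It relies on read_csv accepting a file-like object as well as a file path, which lets the example run without any file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'data.csv'
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

# read_csv accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # the first rows of the DataFrame
```

The first line of the CSV text is used as the column names, and each following line becomes one row.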
