CSL 410 L13

Program: B.Tech (CSE), IV Semester, II Year

CSL-410: Data Science using Python

Unit No. 2
Introduction of Pandas Library

Lecture No. 13

Dr. Sanjay Jain

Associate Professor, CSA/SOET
Outline
• Pandas-Introduction
• Pandas-Key Features
• Pandas-Environment Setup
• Pandas-Data Structures
– Series
– DataFrame
– Panel
• References
Student Effective Learning Outcomes (SELO)
01: Ability to understand subject-related concepts clearly, along with
contemporary issues.
02: Ability to use up-to-date tools, techniques and skills for effective
domain-specific practice.
03: Understanding of the available tools and products, and ability to use
them effectively.
Pandas: Introduction
• Pandas is an open-source Python library that provides high-performance data
manipulation and analysis tools built on its powerful data structures.
• The name Pandas is derived from "panel data", an econometrics term for
multidimensional data sets.
• In 2008, developer Wes McKinney started developing Pandas when he needed
a high-performance, flexible tool for data analysis.
• Prior to Pandas, Python was used mainly for data munging and
preparation. It contributed very little to data analysis itself.
• Pandas solved this problem. Using Pandas, we can accomplish five typical
steps in the processing and analysis of data, regardless of the origin of the
data: load, prepare, manipulate, model, and analyze.
• Python with Pandas is used in a wide range of academic and commercial
domains, including finance, economics, statistics, analytics, etc.
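The five steps named above can be sketched on a tiny data set. This is only an illustration: the CSV text, the column names (city, month, sales) and the use of a per-city average as a stand-in for the "model" step are assumptions, not part of the lecture.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a real file or database export.
raw = io.StringIO(
    "city,month,sales\n"
    "Delhi,Jan,100\n"
    "Delhi,Feb,\n"
    "Mumbai,Jan,80\n"
    "Mumbai,Feb,120\n"
)

# 1. Load: read the text into a DataFrame.
df = pd.read_csv(raw)

# 2. Prepare: replace the missing sales figure with 0.
df["sales"] = df["sales"].fillna(0)

# 3. Manipulate: add a derived column (sales in thousands).
df["sales_k"] = df["sales"] / 1000

# 4. Model: a per-city average, used here as a very simple summary model.
city_mean = df.groupby("city")["sales"].mean()

# 5. Analyze: find the city with the highest average sales.
best_city = city_mean.idxmax()
print(city_mean)
print(best_city)
```

In real work the load step would usually read from a file path or a database, and the model step would use a statistics or machine learning library.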

<SELO: 1> <Reference No.: R1,R4>

Pandas: Key Features
• Fast and efficient DataFrame object with default and customized indexing.
• Tools for loading data into in-memory data objects from different file
formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
• High performance merging and joining of data.
• Time Series functionality.
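Several of the features listed above can be tried on a small, made-up table. The sketch below assumes hypothetical column names (rep, region, amount, target) and invented values; it is meant to show where each feature appears in the API, not to serve as a reference implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; one amount is missing (NaN).
sales = pd.DataFrame({
    "rep": ["A", "B", "A", "C"],
    "region": ["North", "South", "North", "East"],
    "amount": [200.0, np.nan, 150.0, 300.0],
})

# Integrated handling of missing data: count it, then fill it.
n_missing = int(sales["amount"].isna().sum())
filled = sales.fillna({"amount": 0.0})

# Label-based slicing and subsetting with .loc.
north = sales.loc[sales["region"] == "North", ["rep", "amount"]]

# Columns can be inserted and deleted.
filled["amount_with_tax"] = filled["amount"] * 1.1
filled = filled.drop(columns="amount_with_tax")

# Group by for aggregation.
totals = filled.groupby("rep")["amount"].sum()

# Reshaping and pivoting: one row per rep, one column per region.
wide = filled.pivot_table(index="rep", columns="region",
                          values="amount", aggfunc="sum")

# Merging and joining two tables on a shared key.
targets = pd.DataFrame({"rep": ["A", "B", "C"], "target": [300, 100, 250]})
report = totals.reset_index().merge(targets, on="rep", how="left")

# Data alignment: values are matched by index label, not by position.
s1 = pd.Series([1, 2], index=["x", "y"])
s2 = pd.Series([10, 20], index=["y", "z"])
aligned = s1 + s2  # only the shared label "y" has a numeric result

print(report)
print(aligned)
```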

<SELO: 1> <Reference No.: R1,R4>

Pandas: Environment Setup
• The standard Python distribution does not come bundled with the Pandas
module. A lightweight way to get it is to install Pandas with the popular
Python package installer, pip:
pip install pandas
• If you install the Anaconda Python distribution, Pandas is installed by
default.
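After either installation route, a quick check from Python confirms that the library can be imported. This is a minimal sketch; the variable name version is only for illustration, and the version string printed depends on the local installation.

```python
import pandas as pd

# If the import above succeeds, Pandas is installed.
# Print the installed version string to confirm which release is in use.
version = pd.__version__
print(version)
```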

<SELO: 1> <Reference No.: R1,R4>

Pandas: Environment Setup
• Windows
– Anaconda (from https://www.continuum.io) is a free Python distribution for the
SciPy stack. It is also available for Linux and Mac.
– Canopy (https://www.enthought.com/products/canopy/) is available as both a free
and a commercial distribution with the full SciPy stack for Windows, Linux and Mac.
– Python (x,y) is a free Python distribution with the SciPy stack and Spyder IDE
for Windows OS. (Downloadable from http://python-xy.github.io/)
• Linux
– The package manager of the respective Linux distribution is used to install one
or more packages of the SciPy stack.
• For Ubuntu Users
– sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
• For Fedora Users
– sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy python-nose atlas-devel
<SELO: 1> <Reference No.: R1,R4>
Pandas: Introduction to Data Structures
• Pandas deals with the following three data structures:
– Series
– DataFrame
– Panel (deprecated in pandas 0.20 and removed in 0.25)
• These data structures are built on top of NumPy arrays, which means they
are fast.
• The best way to think of these data structures is that a higher-dimensional
data structure is a container of its lower-dimensional data structure. For
example, a DataFrame is a container of Series, and a Panel is a container of
DataFrames.
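The container relationship can be seen directly in code. The following sketch uses two hypothetical Series (name and age, with invented values) to build a DataFrame and then pulls one column back out.

```python
import numpy as np
import pandas as pd

# Two hypothetical one-dimensional Series.
name = pd.Series(["Asha", "Ravi"], name="name")
age = pd.Series([30, 41], name="age")

# A DataFrame can be assembled from Series, one per column.
df = pd.concat([name, age], axis=1)

# Selecting a single column gives back a Series.
col = df["age"]
print(type(col).__name__)

# The values of a Series can be exposed as a NumPy array.
values = col.to_numpy()
print(type(values).__name__)
```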

<SELO: 1> <Reference No.: R1,R4>

Pandas: Introduction to Data Structures
• Building and handling arrays of two or more dimensions is tedious: the
burden is on the user to keep track of the orientation of the data set when
writing functions.
• Pandas data structures reduce this mental effort.
• For example, with tabular data (DataFrame) it is more semantically helpful
to think of the index (the rows) and the columns rather than axis 0 and
axis 1.
• Mutability
All Pandas data structures are value mutable (their contents can be changed).
All except Series are also size mutable; a Series is size immutable.
• DataFrame is the most widely used and one of the most important data
structures. Panel is rarely used.
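A short sketch of both points, using a hypothetical table of quarterly figures (the labels north, south, q1, q2, q3 and all numbers are invented): the axes can be named "index" and "columns" instead of 0 and 1, and a DataFrame accepts both changed values and added columns.

```python
import pandas as pd

df = pd.DataFrame(
    {"q1": [10, 20], "q2": [30, 40]},
    index=["north", "south"],
)

# axis="index" is the same as axis=0: add down the rows, one total per column.
col_totals = df.sum(axis="index")

# axis="columns" is the same as axis=1: add across the columns, one total per row.
row_totals = df.sum(axis="columns")

# Value mutability: change a single cell in place.
df.loc["north", "q1"] = 15

# Size mutability: add a new column to the existing DataFrame.
df["q3"] = [50, 60]

print(col_totals)
print(row_totals)
print(df.shape)
```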

<SELO: 1> <Reference No.: R1,R4>

Pandas: Series
• Series is a one-dimensional, array-like structure with homogeneous data.
For example, the following series is a collection of integers 10, 23, 56, …

• Key Points
– Homogeneous data
– Size Immutable
– Values of Data Mutable
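A Series like the one described can be created as follows. Only the first three values (10, 23, 56) come from the slide; the remaining values are elided there, so this sketch stops at three.

```python
import pandas as pd

# The three integer values shown on the slide.
s = pd.Series([10, 23, 56])

# Homogeneous data: the whole Series has a single dtype.
is_int = pd.api.types.is_integer_dtype(s.dtype)
print(s.dtype, is_int)

# Values are mutable: overwrite the first element by position.
s.iloc[0] = 11
print(s.tolist())
```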

<SELO: 1> <Reference No.: R1,R4>

Pandas: DataFrame
• DataFrame is a two-dimensional, tabular data structure whose columns can
hold different data types (heterogeneous data).

• For example, a table can represent the sales team of an organization with
their overall performance rating. The data is arranged in rows and columns:
each column represents an attribute and each row represents a person.
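A table of this kind might be built as shown below. The slide's own table is not available here, so the four column names (name, age, gender, rating) and every value are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical sales-team data: one row per person, one column per attribute.
team = pd.DataFrame({
    "name": ["Person A", "Person B", "Person C"],
    "age": [32, 28, 45],
    "gender": ["F", "M", "F"],
    "rating": [3.45, 4.6, 3.9],
})

# One row (a person), selected by its index label.
row = team.loc[1]

# One column (an attribute), selected by its name.
col = team["rating"]

print(team)
print(row["name"], col.max())
```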

<SELO: 1> <Reference No.: R1,R4>

Pandas: DataFrame
• The data types of the four columns are as follows:

• Key Points
– Heterogeneous data
– Size Mutable
– Data Mutable
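The key points can be checked on a small DataFrame. As before, the column names and values are hypothetical; the sketch inspects the per-column data types and then changes both the size and the contents of the table.

```python
import pandas as pd

# Hypothetical four-column table with mixed data types.
team = pd.DataFrame({
    "name": ["Person A", "Person B", "Person C"],
    "age": [32, 28, 45],
    "gender": ["F", "M", "F"],
    "rating": [3.45, 4.6, 3.9],
})

# Heterogeneous data: each column carries its own dtype.
print(team.dtypes)
age_is_int = pd.api.types.is_integer_dtype(team["age"])
rating_is_float = pd.api.types.is_float_dtype(team["rating"])

# Size mutable: add one column and drop another.
team["region"] = ["North", "South", "East"]
team = team.drop(columns="gender")

# Data mutable: overwrite a single value.
team.loc[0, "rating"] = 4.0

print(team)
```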

<SELO: 1> <Reference No.: R1,R4>

Pandas: Panel
• Panel is a three-dimensional data structure with heterogeneous data. A
panel is hard to depict graphically, but it can be pictured as a container
of DataFrames.
• Key Points
– Heterogeneous data
– Size Mutable
– Data Mutable
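The Panel class is not available in recent pandas releases, so the "container of DataFrames" idea is sketched here with a different technique: stacking several DataFrames under a two-level (hierarchical) row index. The quarter and region labels and all numbers are hypothetical.

```python
import pandas as pd

# Two hypothetical DataFrames with the same rows and columns.
q1 = pd.DataFrame({"sales": [10, 20], "returns": [1, 2]},
                  index=["north", "south"])
q2 = pd.DataFrame({"sales": [30, 40], "returns": [3, 4]},
                  index=["north", "south"])

# Stack them; the dictionary keys become the outer level of the row index.
stacked = pd.concat({"Q1": q1, "Q2": q2})
stacked = stacked.rename_axis(["quarter", "region"])

# Pull one "slice" back out: all rows for quarter Q2.
q2_back = stacked.loc["Q2"]

# Address a single cell by (quarter, region) and column.
value = stacked.loc[("Q1", "south"), "sales"]

print(stacked)
print(value)
```

A plain Python dictionary of DataFrames is another simple option when the tables do not need to be queried together.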

<SELO: 1> <Reference No.: R1,R4>

Learning Outcomes

Students have learned and understood the following:

• Pandas-Introduction
• Pandas-Key Features
• Pandas-Environment Setup
• Pandas-Data Structures
– Series
– DataFrame
– Panel
References

1. Anaconda distribution for Python (includes Jupyter Notebook and Spyder IDE)
https://www.anaconda.com/products/individual
2. Python for Windows
https://www.python.org/downloads/
3. Google Colab online Python notebook
https://colab.research.google.com/notebooks
Thank you

