Unit 3: Data Analysis using Pandas
Disclaimer
The content is curated from online and offline resources and is used for educational purposes only.
Learning Objectives
• Introduction to Pandas
• Why Pandas?
• Applications of Pandas
• Installation of Pandas
• Pandas Objects
• Pandas Sort
• Working with Text Data
• Statistical Functions
• Indexing and Selecting Data
Introduction to Pandas
• Pandas is an open-source Python library that provides high-performance data manipulation and analysis through powerful data structures.
• It offers a variety of data structures and operations for manipulating numerical data and time series.
• The library is built on top of NumPy.
Why Pandas?
Installation of Pandas
• The first step in using pandas is to check whether it is already installed in your Python environment.
• If it is not, install it using the pip command:
pip install pandas
• After installing pandas on your system, you'll need to import the library.
• This module is typically imported as follows:
import pandas as pd
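The installation check described above can also be done from Python itself. The following is a minimal sketch, not the only way to do it; the helper name is_installed is hypothetical and introduced only for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    # find_spec() returns None when the package is not available
    return importlib.util.find_spec(package_name) is not None


if is_installed("pandas"):
    import pandas as pd
    print("pandas version:", pd.__version__)
else:
    print("pandas is not installed; run: pip install pandas")
```

This avoids a failed import at the top of a script and tells the user what to do next.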
Pandas Objects
What is a Series?
• A Pandas Series is a labelled one-dimensional array that can hold any type of data (integers, strings, floats, Python objects, and so on).
• A Series is comparable to a single column in an Excel spreadsheet.
• Using the pd.Series() constructor, we can easily convert a list, tuple, or dictionary into a Series.
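As a short illustration of the conversions mentioned above, the sketch below builds a Series from a list, a tuple, and a dictionary. The sample values and labels are made up for the example.

```python
import pandas as pd

# From a list: pandas assigns the default integer labels 0, 1, 2
from_list = pd.Series([10, 20, 30])

# From a tuple, with explicit labels supplied through the index argument
from_tuple = pd.Series((1.5, 2.5, 3.5), index=["a", "b", "c"])

# From a dictionary: the keys become the labels
from_dict = pd.Series({"math": 90, "physics": 85})

print(from_list)
print(from_tuple)
print(from_dict)
```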
Pandas Index
• The Pandas Index holds the row or column labels of a Series or DataFrame and makes it efficient to extract particular rows and columns.
• Its job is to organize data and make it easily accessible.
• We can also define our own index; like an address, it lets us reach any item in the Series or DataFrame by its label.
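To make the "address" idea concrete, here is a small sketch with a user-defined index on a Series and on a DataFrame. The names and marks are invented sample data.

```python
import pandas as pd

# A Series with a user-defined index instead of the default 0, 1, 2
marks = pd.Series([72, 88, 95], index=["asha", "ravi", "meena"])
ravi_marks = marks["ravi"]  # look up one value by its label

# A DataFrame with the same labels as its row index
df = pd.DataFrame(
    {"marks": [72, 88, 95], "grade": ["B", "A", "A"]},
    index=["asha", "ravi", "meena"],
)
meena_row = df.loc["meena"]  # .loc selects a row by its label

print(ravi_marks)
print(meena_row)
```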
Creating Index
First, we load a CSV file that contains the data to be indexed.
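A minimal sketch of this step follows. The slides do not name the CSV file, so the example reads a small in-memory CSV through io.StringIO; the column names roll_no, name, and marks are assumptions made for illustration.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk (hypothetical sample data)
csv_text = "roll_no,name,marks\n101,Asha,72\n102,Ravi,88\n103,Meena,95\n"

# Option 1: read the file, then promote a column to be the index
df = pd.read_csv(io.StringIO(csv_text))
indexed = df.set_index("roll_no")

# Option 2: choose the index column while reading
direct = pd.read_csv(io.StringIO(csv_text), index_col="roll_no")

print(indexed)
print(indexed.loc[102])  # select a row by its index label
```

With a real file, the io.StringIO(...) argument would be replaced by the file path.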
Pandas DataFrame
• A DataFrame is a two-dimensional data structure with labelled rows and columns.
• DataFrames are similar to spreadsheets in Excel or Calc, or to SQL tables.
• A Pandas DataFrame consists of three main components: the data, the index, and the columns.
Creating a Pandas DataFrame
• Creating a dataframe using List: DataFrame can
be created using a single list or a list of lists.
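The two cases above can be sketched as follows. The sample values and column names are invented for the example; the final lines print the three components of a DataFrame (index, columns, data).

```python
import pandas as pd

# From a single list: each element becomes one row of a single column
languages = ["Python", "R", "Julia"]
single = pd.DataFrame(languages, columns=["language"])

# From a list of lists: each inner list becomes one row
rows = [["Asha", 21], ["Ravi", 23], ["Meena", 22]]
people = pd.DataFrame(rows, columns=["name", "age"])

print(single)
print(people)

# The three components of a DataFrame
print(people.index)       # row labels
print(people.columns)     # column labels
print(people.to_numpy())  # the underlying data
```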
Reindexing
• Reindexing changes the row labels and column labels of a DataFrame.
• To reindex means to conform the data to a given set of labels along a particular axis. Reindexing enables us to carry out a variety of operations, including:
• Inserting missing-value (NaN) markers at label locations where there was previously no data for the label.
• Reordering existing data to match a new set of labels.
• To reindex a DataFrame, use the reindex() method.
• Labels in the new index that have no matching record in the DataFrame are given the value NaN by default.
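A minimal sketch of reindex() on a small DataFrame follows; the labels and values are made up. The new index both reorders existing rows and introduces a label, "d", that has no data.

```python
import pandas as pd

df = pd.DataFrame({"marks": [72, 88, 95]}, index=["a", "b", "c"])

# "c" and "a" already exist and are reordered; "d" is new and gets NaN
reindexed = df.reindex(["c", "a", "d"])

print(reindexed)
```

Note that reindex() returns a new DataFrame and leaves the original unchanged.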
• Labels that are new to the index are populated with NaN values.
• We can fill in these missing values using the fill_value parameter of reindex().
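The fill_value parameter can be sketched as below, using invented sample data. The value 0 is an arbitrary choice for the example; any suitable placeholder can be used.

```python
import pandas as pd

df = pd.DataFrame({"marks": [72, 88, 95]}, index=["a", "b", "c"])

# Without fill_value the new label "d" would hold NaN; here it holds 0
filled = df.reindex(["c", "a", "d"], fill_value=0)

print(filled)
```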
Pandas Sort
There are two kinds of sorting available in Pandas:
• By label
• By actual value
Order of Sorting
The sort order is controlled by passing a Boolean value to the ascending parameter: True (the default) sorts in ascending order, and False sorts in descending order.
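As an illustration, the sketch below sorts a DataFrame by its row labels with sort_index() in both directions. The data is invented for the example.

```python
import pandas as pd

# Row labels are deliberately out of order
df = pd.DataFrame({"marks": [88, 72, 95]}, index=[2, 0, 1])

ascending_order = df.sort_index()                  # ascending=True is the default
descending_order = df.sort_index(ascending=False)  # reverse the order

print(ascending_order)
print(descending_order)
```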
Sort the Columns
Passing axis=1 to sort_index() sorts by the column labels. By default (axis=0), sorting is done by the row labels.
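A short sketch of sorting by column labels, with made-up column names chosen so that the original order is not alphabetical:

```python
import pandas as pd

df = pd.DataFrame({"c": [1, 2], "a": [3, 4], "b": [5, 6]})

# axis=1 sorts the column labels; the values move with their columns
by_columns = df.sort_index(axis=1)

print(by_columns)
```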
By Value
Just as sort_index() sorts by labels, sort_values() sorts by values. It accepts a by argument, which names the DataFrame column (or list of columns) whose values determine the order.
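The by argument can be sketched as follows, using invented sample data with a "marks" column:

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Asha", "Ravi", "Meena"], "marks": [72, 95, 88]}
)

# Sort rows by the values in the "marks" column, lowest first
by_marks = df.sort_values(by="marks")

# The same sort in descending order, highest first
top_first = df.sort_values(by="marks", ascending=False)

print(by_marks)
print(top_first)
```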
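Working with Text Data
• lower(): converts the strings in a Series or Index to lowercase.
• upper(): converts the strings in a Series or Index to uppercase.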
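Assuming these two names refer to the string methods that pandas exposes through the .str accessor, a minimal sketch looks like this. The sample strings are made up.

```python
import pandas as pd

libraries = pd.Series(["Pandas", "NumPy", "Matplotlib"])

# The .str accessor applies a string method to every element
lowered = libraries.str.lower()
uppered = libraries.str.upper()

print(lowered)
print(uppered)
```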
Statistical Functions
• With pandas, many complex statistical operations in Python can be reduced to a single line of code.
• This section covers some of the most popular and practical statistical operations.
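The slides do not list which operations are meant, so the sketch below assumes a few common ones (mean, median, maximum, standard deviation, and describe()) applied to invented sample data.

```python
import pandas as pd

df = pd.DataFrame({"marks": [72, 88, 95, 65], "hours": [5, 8, 9, 4]})

# Each statistic is a single method call on a column
mean_marks = df["marks"].mean()
median_marks = df["marks"].median()
max_marks = df["marks"].max()
std_marks = df["marks"].std()

# describe() reports count, mean, std, min, quartiles, and max per column
summary = df.describe()

print(mean_marks, median_marks, max_marks, std_marks)
print(summary)
```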
Summary
• In this section, we covered:
• What Pandas is
• Applications of Pandas
• The structure of Pandas: Series, Index, and DataFrame
• How to import the Pandas library
• How to import files using Pandas
• Indexing in Pandas
• The sort methods in Pandas
• We have performed different types of data analysis.
• We will use this knowledge in machine learning, data analysis, visualization, and mathematical operations.
Quiz
1. Pandas stands for _________
2. _________ is an important library used for analyzing data.
a) Math
b) Random
c) Pandas
d) None of the above
Answer: c) Pandas
3. _________ is used when data is in tabular format.
a) NumPy
b) Pandas
c) Matplotlib
d) All of the above
Answer: b) Pandas
4. Which of the following commands is used to install Pandas?
5. A _________ is a one-dimensional array.
a) Data Frame
b) Series
c) Both of the above
d) None of the above
Answer: b) Series