Module 1: Data Visualization and Data Exploration: Prepared by Dr. Ganesha Prasad, Dept. of AI & ML
Module 1 covers Data Visualization and Exploration, emphasizing the importance of visual representation of data through various tools and libraries. It discusses statistical concepts, operations using Numpy and Pandas, and the advantages and disadvantages of data visualization. Additionally, it highlights applications in business intelligence, financial analysis, and healthcare, along with data wrangling techniques to prepare data for visualization.
Module 1: Data Visualization and Data Exploration
• Introduction: Data Visualization, Importance of Data Visualization, Data Wrangling, Tools and Libraries for Visualization
• Overview of Statistics: Measures of Central Tendency, Measures of Dispersion, Correlation, Types of Data, Summary Statistics
• NumPy: NumPy Operations - Indexing, Slicing, Splitting, Iterating, Filtering, Sorting, Combining, and Reshaping
• Pandas: Advantages of pandas over NumPy, Disadvantages of pandas, Pandas operations - Indexing, Slicing, Iterating, Filtering, Sorting and Reshaping using Pandas

Data Visualization
• Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.
• Data visualization translates complex data sets into visual formats that are easier for the human brain to comprehend. This can include a variety of visual tools such as:
• Charts: Bar charts, line charts, pie charts, etc.
• Graphs: Scatter plots, histograms, etc.
• Maps: Geographic maps, heat maps, etc.

Tools for Visualization of Data
• The following are 10 of the best data visualization tools:
1. Tableau
2. Looker
3. Zoho Analytics
4. Sisense
5. IBM Cognos Analytics
6. Qlik Sense
7. Domo
8. Microsoft Power BI
9. Klipfolio
10. SAP Analytics Cloud

Advantages of Data Visualization:
• Enhanced Comparison: Visualizing the performance of two elements or scenarios streamlines analysis, saving time compared to traditional data examination.
• Improved Methodology: Representing data graphically offers a superior understanding of situations, exemplified by tools like Google Trends illustrating industry trends in graphical form.
• Efficient Data Sharing: Visual data presentation facilitates effective communication, making information more digestible and engaging compared to sharing raw data.
• Sales Analysis: Data visualization aids sales professionals in comprehending product sales trends, identifying influencing factors through tools like heat maps, and understanding customer types, geographic impacts, and repeat-customer behaviors.
• Identifying Event Relations: Discovering correlations between events helps businesses understand external factors affecting their performance, such as online sales surges during festive seasons.
• Exploring Opportunities and Trends: Data visualization empowers business leaders to uncover patterns and opportunities within vast datasets, enabling a deeper understanding of customer behaviors and insights into emerging business trends.

Disadvantages of Data Visualization:
• Can be time-consuming: Creating visualizations can be a time-consuming process, especially when dealing with large and complex datasets.
• Can be misleading: While data visualization can help identify patterns and relationships in data, it can also be misleading if not done correctly. Visualizations can create the impression of patterns or trends that do not exist, leading to incorrect conclusions and poor decision-making.
• Can be difficult to interpret: Some types of visualizations, such as those that involve 3D or interactive elements, can be difficult to interpret and understand.
• May not be suitable for all types of data: Certain types of data, such as text or audio data, may not lend themselves well to visualization. In these cases, alternative methods of analysis may be more appropriate.
• May not be accessible to all users: Some users may have visual impairments or other disabilities that make it difficult or impossible for them to interpret visualizations.
In these cases, alternative methods of presenting data may be necessary to ensure accessibility.

Applications of Data Visualization
• Business Intelligence and Reporting
• Financial Analysis
• Healthcare
• Marketing and Sales
• Human Resources
Data Wrangling
• Data wrangling is the process of transforming raw data into a suitable representation for various tasks. It is the discipline of augmenting, cleaning, filtering, standardizing, and enriching data in a way that allows it to be used in a downstream task, which in our case is data visualization.
• Data wrangling is also known as data munging.
• Example: A book-selling website wants to show the top-selling books of different domains according to user preference. For example, if a new user searches for motivational books, the site wants to show the motivational books that sell the most or have high ratings.
• But on the website there is plenty of raw data from different users. Here the concept of data munging, or data wrangling, is used. Data wrangling is not done by the system itself; it is done by data scientists. A data scientist will wrangle the data in such a way that it surfaces the motivational books that are sold the most, have high ratings, or are frequently bought together with other books. On that basis, the new user can make a choice. This illustrates the importance of data wrangling.

Tools and Libraries for Visualization
Commonly used tools are:
• Non-coding tools – Tableau, Power BI
• Coding tools – Python, MATLAB, and R
Note: these libraries will be discussed in detail in the coming chapters.

Overview of Statistics
• Statistics is a combination of the analysis, collection, interpretation, and representation of numerical data.
• Probability is a measure of the likelihood that an event will occur and is quantified as a number between 0 and 1.
• A probability distribution is a function that provides the probability of every possible event. A probability distribution is frequently used for statistical analysis. The higher the probability, the more likely the event. There are two types of probability distributions: discrete and continuous.
Discrete probability distribution
• A discrete probability distribution shows all the values that a random variable can take, together with their probabilities. For example, if we have a six-sided die, we can roll each number between 1 and 6. We have six events that can occur based on the number that is rolled. There is an equal probability of rolling any of the numbers, so the individual probability of any of the six events occurring is 1/6.

Continuous probability distribution
• A continuous probability distribution defines the probabilities of each possible value of a continuous random variable. An example is the distribution of the time needed to drive home: in most cases, around 60 minutes is needed, but sometimes less time is needed because there is no traffic, and sometimes much more time is needed if there are traffic jams.

Measures of Central Tendency
• Mean: The arithmetic average is computed by summing up all measurements and dividing the sum by the number of observations. For n observations x1, x2, ..., xn, the mean is (x1 + x2 + ... + xn) / n.
• Median: This is the middle value of the ordered dataset. If there is an even number of observations, the median is the average of the two middle values. The median is less prone to outliers than the mean, where outliers are extreme values that are distinct from the rest of the data.
• Mode: Our last measure of central tendency, the mode, is defined as the most frequent value. There may be more than one mode in cases where multiple values are equally frequent.

Example
• A die was rolled 10 times, and we got the following numbers: 4, 5, 4, 3, 4, 2, 1, 1, 2, and 1.
The mean is calculated by summing all the events and dividing the sum by the number of observations: (4+5+4+3+4+2+1+1+2+1)/10 = 2.7.
To calculate the median, the die rolls have to be ordered according
to their values. The ordered values are as follows: 1, 1, 1, 2, 2, 3, 4, 4, 4, 5. Since we have an even number of die rolls, we need to take the average of the two middle values. The average of the two middle values is (2+3)/2=2.5.
The modes are 1 and 4, since they are the two most frequent events.

Measures of Dispersion
• Dispersion, also called variability, is the extent to which a probability distribution is stretched or squeezed. Common measures of dispersion are the range, the variance, and the standard deviation.
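A minimal sketch of these central-tendency and dispersion calculations in Python, using NumPy and the standard library's `statistics` module (the `rolls` data is the die-roll example above; the variable names are my own):

```python
import numpy as np
from statistics import multimode

rolls = [4, 5, 4, 3, 4, 2, 1, 1, 2, 1]

mean = np.mean(rolls)              # (4+5+4+3+4+2+1+1+2+1)/10 = 2.7
median = np.median(rolls)          # average of the two middle values: (2+3)/2 = 2.5
modes = sorted(multimode(rolls))   # [1, 4] — both occur three times

# Two common measures of dispersion:
variance = np.var(rolls)           # population variance
std_dev = np.std(rolls)            # population standard deviation (sqrt of variance)
```

Note that `np.var` and `np.std` compute the *population* variance and standard deviation by default; pass `ddof=1` for the sample versions.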
Correlation
Correlation describes the statistical relationship between two variables:
• In a positive correlation, both variables move in the same direction.
• In a negative correlation, the variables move in opposite directions.
• With zero correlation, there is no linear relationship between the variables.
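A small sketch of the positive and negative cases using NumPy's `np.corrcoef`, with made-up data:

```python
import numpy as np

hours = np.array([1, 2, 3, 4, 5])
score = np.array([52, 58, 65, 71, 80])   # rises with hours: positive correlation

r = np.corrcoef(hours, score)[0, 1]      # Pearson correlation coefficient, close to +1

# Negating one variable flips the direction of the relationship
r_neg = np.corrcoef(hours, -score)[0, 1] # close to -1
```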
Types of Data
NumPy
• NumPy provides support for large n-dimensional arrays and has built-in support for many high-level mathematical and statistical operations, such as the mean, median, variance, and standard deviation.
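The original slides show these operations as screenshots; a rough code equivalent, with illustrative data of my own:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

np.mean(data)           # mean of all elements: 3.5
np.mean(data, axis=0)   # per-column means: [2.5, 3.5, 4.5]
np.median(data)         # 3.5
np.var(data)            # population variance
np.std(data)            # population standard deviation
```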
Basic NumPy Operations Indexing • Indexing elements in a NumPy array, at a high level, works the same as with built-in Python lists. Therefore, we can index elements in multi-dimensional matrices:
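For illustration, here is indexing on a small 3x3 matrix (the data is made up):

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

matrix[0]       # first row: [1, 2, 3]
matrix[1][2]    # chained, list-style indexing: 6
matrix[1, 2]    # NumPy's comma syntax selects the same element: 6
matrix[-1, -1]  # negative indices work as in Python lists: 9
```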
• Slicing: Being able to easily slice parts of an ndarray into new ndarrays is very helpful when handling large amounts of data.
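A short sketch of slicing, using an illustrative 4x4 dataset:

```python
import numpy as np

dataset = np.arange(16).reshape(4, 4)   # 4x4 matrix of the values 0..15

dataset[1:3]        # rows 1 and 2
dataset[:, 1:3]     # columns 1 and 2 of every row
dataset[1:3, 1:3]   # the inner 2x2 block: [[5, 6], [9, 10]]
dataset[::2]        # every second row
```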
Splitting • Splitting data can be helpful in many situations, from plotting only half of your time-series data to separating test and training data for machine learning algorithms. There are two ways of splitting your data: horizontally and vertically. Horizontal splitting can be done with the hsplit method; vertical splitting can be done with the vsplit method.
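A minimal example of both methods on a 4x4 dataset (data is illustrative):

```python
import numpy as np

dataset = np.arange(16).reshape(4, 4)

left, right = np.hsplit(dataset, 2)   # horizontal split: 2 equal blocks of columns
top, bottom = np.vsplit(dataset, 2)   # vertical split: 2 equal blocks of rows

top.shape    # (2, 4) — the first two rows
left.shape   # (4, 2) — the first two columns
```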
Iterating • Iterating over NumPy's data structures, ndarrays, is also possible. It steps over the whole array, visiting every single element once. Considering that ndarrays can have several dimensions, indexing gets complex; nditer is a multi-dimensional iterator object that iterates over a given array. If we also need the index of each element, ndenumerate will give us exactly that.
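A small sketch of both iterators on a 2x2 array (data is illustrative):

```python
import numpy as np

dataset = np.array([[1, 2],
                    [3, 4]])

# nditer visits every element once, regardless of dimensionality
values = [int(v) for v in np.nditer(dataset)]   # [1, 2, 3, 4]

# ndenumerate also yields each element's multi-dimensional index
pairs = list(np.ndenumerate(dataset))           # [((0, 0), 1), ((0, 1), 2), ...]
```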
Filtering • Filtering is a very powerful tool that can be used to clean up your data if you want to avoid outlier values.
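A minimal sketch of boolean-mask filtering, with made-up data where large values stand in for outliers:

```python
import numpy as np

dataset = np.array([1, 105, 3, 98, 4, 230, 2])

# Boolean mask: keep only values below 100, dropping suspected outliers
cleaned = dataset[dataset < 100]          # [1, 3, 98, 4, 2]

# np.extract expresses the same filter as a function call
same = np.extract(dataset < 100, dataset)
```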
Sorting • Sorting each row of a dataset can be really useful. Using NumPy, we are also able to sort on other dimensions, such as columns. • In addition, argsort gives us the possibility to get a list of indices, which would result in a sorted list:
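A short sketch of sorting along different axes and of argsort (data is illustrative):

```python
import numpy as np

dataset = np.array([[3, 1, 2],
                    [9, 7, 8]])

np.sort(dataset)          # sorts along the last axis (each row): [[1, 2, 3], [7, 8, 9]]
np.sort(dataset, axis=0)  # sorts each column instead
np.argsort(dataset[0])    # indices that would sort the first row: [1, 2, 0]
```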
Combining • Stacking rows and columns onto an existing dataset can be helpful when you have two datasets of the same dimension saved to different files. • Given two datasets, we use vstack to stack dataset_1 on top of dataset_2, which will give us a combined dataset with all the rows from dataset_1, followed by all the rows from dataset_2. • If we use hstack, we stack our datasets "next to each other," meaning that the elements from the first row of dataset_1 will be followed by the elements of the first row of dataset_2. This will be applied to each row:
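The vstack/hstack behavior described above can be sketched with two small made-up datasets:

```python
import numpy as np

dataset_1 = np.array([[1, 2],
                      [3, 4]])
dataset_2 = np.array([[5, 6],
                      [7, 8]])

np.vstack((dataset_1, dataset_2))  # 4x2: rows of dataset_1, then rows of dataset_2
np.hstack((dataset_1, dataset_2))  # 2x4: each row of dataset_1 extended by dataset_2's row
```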
Reshaping • Reshaping can be crucial for some algorithms. Depending on the nature of your data, it might help you to reduce dimensionality to make visualization easier.
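A minimal sketch of reshape and its inverse (data is illustrative):

```python
import numpy as np

dataset = np.arange(6)          # [0, 1, 2, 3, 4, 5]

dataset.reshape(2, 3)           # 2 rows, 3 columns
dataset.reshape(3, -1)          # -1 lets NumPy infer the other dimension: 3x2
dataset.reshape(2, 3).ravel()   # flatten back to one dimension
```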
Pandas • The pandas Python library provides data structures and methods for manipulating different types of data, such as numerical and temporal data. These operations are easy to use and highly optimized for performance. Data formats such as CSV and JSON, as well as databases, can be used to create DataFrames. • DataFrames are the internal representations of data and are very similar to tables, but are more powerful since they allow you to efficiently apply operations such as multiplications, aggregations, and even joins. Importing and reading both files and in-memory data is abstracted into a user-friendly interface. When it comes to handling missing data, pandas provides built-in solutions to clean up and augment your data, meaning it fills in missing values with reasonable values.
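A minimal sketch of creating a DataFrame, here from an in-memory dict rather than a file (`pd.read_csv` and `pd.read_json` produce DataFrames the same way); the book data is made up:

```python
import pandas as pd

df = pd.DataFrame({
    "title":  ["Book A", "Book B", "Book C"],
    "rating": [4.5, 3.9, 4.8],
    "sold":   [120, 80, 200],
})

df["rating"].mean()   # column aggregation: (4.5 + 3.9 + 4.8) / 3
df["sold"].sum()      # 400
```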
• Integrated indexing and label-based slicing in combination with fancy indexing (what we already saw with NumPy) make handling data simple. More complex techniques, such as reshaping, pivoting, and melting data, together with the possibility of easily joining and merging data, provide powerful tooling so that you can handle your data correctly.
Basic Operations of pandas
Indexing:
• Indexing with pandas is a bit more complex than with NumPy. We can only access columns with a single bracket. To access rows by their integer position, we need the iloc method. If we want to access rows by their label (the index_col that was set in the read_csv call), we need to use the loc method.
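A sketch of the three access styles, using a hypothetical in-memory dataset as a stand-in for the slides' `read_csv(..., index_col=...)` call:

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv("...", index_col="country")
df = pd.DataFrame(
    {"population": [83, 67, 60]},
    index=["Germany", "France", "Italy"],
)

df["population"]                 # a single bracket selects a column
df.iloc[0]                       # row by integer position (Germany)
df.loc["France"]                 # row by index label
df.loc["France", "population"]   # a single value: 67
```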
Series
• A pandas Series is a one-dimensional labeled array that is capable of holding any type of data. We can create a Series by loading datasets from a .csv file, Excel spreadsheet, or SQL database.
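A Series can also be built directly from in-memory data; a minimal sketch with made-up values:

```python
import pandas as pd

# Labels default to 0, 1, 2, ...
s = pd.Series([10, 20, 30], name="sales")

# With explicit labels, a Series behaves like a labeled 1-D array
monthly = pd.Series([10, 20, 30], index=["Jan", "Feb", "Mar"])
monthly["Feb"]   # 20
```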
Advanced pandas Operations
Filtering
• Filtering in pandas has a higher-level interface than NumPy. You can still use simple bracket-based conditional filtering. However, you are also able to use more complex queries, for example, filtering rows or columns based on label likeness, which allows us to search for a substring using the like argument, and even full regular expressions using the regex argument.
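A sketch of all three styles on a hypothetical dataset with two "score" columns:

```python
import pandas as pd

df = pd.DataFrame(
    {"overall score": [7.5, 6.2, 8.1], "quality score": [8.0, 5.9, 7.7]},
    index=["Finland", "Greece", "Iceland"],
)

df[df["overall score"] > 7]       # bracket-based conditional filtering on rows
df.filter(like="score", axis=1)   # columns whose label contains the substring "score"
df.filter(regex="^qual", axis=1)  # columns whose label matches a regular expression
```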
Sorting • Sorting a dataset based on a given row or column helps you analyze your data and find the ranking of entries. With pandas, we can do this easily. Sorting in ascending or descending order is controlled by the ascending parameter; the default order is ascending. More complex sorting can be done by providing more than one column in the by = [ ] list: subsequent columns are used to sort values for which the earlier columns are equal.
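A minimal sketch of sort_values, including tie-breaking with a second column (data is made up):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "score":   [7.0, 7.0, 6.5, 8.2],
    "rank":    [2, 1, 4, 3],
})

df.sort_values(by="score")                   # ascending is the default order
df.sort_values(by="score", ascending=False)  # descending order
df.sort_values(by=["score", "rank"])         # ties in "score" are broken by "rank"
```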
Reshaping • Reshaping can be crucial for easier visualization and algorithms. However, depending on your data, this can get really complex:
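A small sketch of two common pandas reshaping operations, pivot (long to wide) and melt (wide back to long), on a made-up sales dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "city":  ["Oslo", "Oslo", "Bergen", "Bergen"],
    "year":  [2020, 2021, 2020, 2021],
    "sales": [10, 12, 7, 9],
})

# pivot: long format -> wide format (one column per year)
wide = df.pivot(index="city", columns="year", values="sales")

# melt: wide format -> long format again
long = wide.reset_index().melt(id_vars="city", var_name="year", value_name="sales")
```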