
Module - 3

Data Aggregation and Group Operations


Data Aggregation
What is data aggregation in pandas?
Aggregating Data with Pandas
Data aggregation is the process of gathering data and expressing it in a
summary form. This typically corresponds to summary statistics for
numerical and categorical variables in a data set.
Pandas Groupby: Summarising, Aggregating,
and Grouping data in Python
GroupBy is a simple idea: we split the rows into groups of categories and apply
a function to each group. Although simple, it is an extremely valuable technique
that is widely used in data science. In real data science projects you deal with
large amounts of data and repeat the same calculations many times, so the GroupBy
concept matters for efficiency: it lets you summarize, aggregate, and group data
quickly.
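For instance, grouping a small made-up sales table by city and summing the amounts already shows the split-apply-combine idea (a minimal sketch; the table and column names are illustrative):
import pandas as pd

sales = pd.DataFrame({'city': ['Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
                      'amount': [100, 200, 150, 250]})
# split the rows by city, apply sum to each group, combine the results
print(sales.groupby('city')['amount'].sum())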
Summarize
Summarization means counting and describing all the data present in the data frame.
We can summarize the data in the data frame using the describe() method, which
returns the count, mean, standard deviation, minimum, maximum and quartile values
for each numeric column.
describe(): This method generates descriptive statistics for the numeric columns of the data frame.
Syntax:
dataframe_name.describe()
unique(): This method returns all unique values from the given column.
Syntax:
dataframe['column_name'].unique()
nunique(): This method is similar to unique(), but it returns the count of unique
values.
Syntax:
dataframe_name['column_name'].nunique()
info(): This method prints a concise summary of the data frame: column names, non-null counts and data types.
Syntax:
dataframe.info()
columns: This attribute lists all the column names present in the data frame.
Syntax:
dataframe.columns
Example:
We are going to analyze the student marks data in this example.
# importing pandas as pd for using data frame
import pandas as pd
# creating dataframe with student details
dataframe = pd.DataFrame({'id': [7058, 4511, 7014, 7033],
'name': ['sravan', 'manoj', 'aditya', 'bhanu'],
'Maths_marks': [99, 97, 88, 90],
'Chemistry_marks': [89, 99, 99, 90],
'telugu_marks': [99, 97, 88, 80],
'hindi_marks': [99, 97, 56, 67],
'social_marks': [79, 97, 78, 90], })
print(dataframe)
# describing the data frame
print(dataframe.describe())
print("-----------------------------")
# finding unique values
print(dataframe['Maths_marks'].unique())
print("-----------------------------")
# counting unique values
print(dataframe['Maths_marks'].nunique())
print("-----------------------------")
# display the columns in the data frame
print(dataframe.columns)
print("-----------------------------")
# information about the dataframe (info() prints its summary directly)
dataframe.info()
In the below program we will aggregate data.
# getting the minimum value of every column in the dataframe
print(dataframe.min())
print("-----------------------------------------")
# minimum value of a particular column in the data frame
print(dataframe['Maths_marks'].min())
print("-----------------------------------------")
# computing maximum values
print(dataframe.max())
print("-----------------------------------------")
# computing column sums (numeric_only=True skips the string columns)
print(dataframe.sum(numeric_only=True))
print("-----------------------------------------")
# finding the count of non-null values per column
print(dataframe.count())
print("-----------------------------------------")
# computing standard deviation (numeric columns only)
print(dataframe.std(numeric_only=True))
print("-----------------------------------------")
# computing variance (numeric columns only)
print(dataframe.var(numeric_only=True))
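Several of these statistics can also be computed in a single call with agg(). A small sketch, reusing the same student-marks dataframe and restricting it to its numeric columns:
# aggregating several statistics at once for the numeric mark columns
numeric_cols = ['Maths_marks', 'Chemistry_marks', 'telugu_marks', 'hindi_marks', 'social_marks']
print(dataframe[numeric_cols].agg(['min', 'max', 'sum', 'std']))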
Grouping
A dataframe is grouped on one or more of its columns with the
groupby() method. Groupby refers to a process involving one or
more of the following steps:
Splitting: the data is split into groups by applying some condition
on the dataset.
Applying: a function is applied to each group independently.
Combining: the per-group results are combined into a single data structure.
In many situations we split the data into sets and apply some
functionality to each subset. In the apply step we can perform the
following operations (each is sketched in the short example after this list):

Aggregation: computing a summary statistic

Transformation: performing some group-specific computation

Filtration: discarding data according to some condition
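A short sketch of all three operations on a small made-up DataFrame (the table and column names here are illustrative):
import pandas as pd

demo_df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B'],
                        'points': [10, 20, 30, 40, 50]})
grouped = demo_df.groupby('team')

# Aggregation: one summary value per group
print(grouped['points'].sum())

# Transformation: the result has the same length as the original column
print(grouped['points'].transform(lambda x: x - x.mean()))

# Filtration: keep only the groups with at least 3 rows
print(grouped.filter(lambda g: len(g) >= 3))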


#import the pandas library
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)

print(df)
A pandas object can be split on any of its axes. There are multiple
ways to split an object, for example:
obj.groupby('key')
obj.groupby(['key1','key2'])
obj.groupby(key,axis=1)
Let us now see how a grouping object can be created from the
DataFrame object:
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)

print(df.groupby('Team'))
View Groups
# import the pandas library
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year':
[2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
print(df.groupby('Team').groups)
Group by with multiple columns:
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)

print(df.groupby(['Team', 'Year']).groups)
Iterating through Groups
# import the pandas library
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Year')
for name, group in grouped:
    print(name)
    print(group)
In each iteration, name is the value of the grouping key (here, the year) and group is the corresponding sub-DataFrame.
Select a Group
Using the get_group() method, we can select a single group.
# import the pandas library
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year':
[2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Year')
print(grouped.get_group(2014))
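If the grouping was done on multiple columns, get_group() expects a tuple of key values. A small sketch reusing the same df as above:
grouped_multi = df.groupby(['Team', 'Year'])
print(grouped_multi.get_group(('Riders', 2014)))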
Aggregations
An aggregation function returns a single aggregated value for each group. Once
the groupby object is created, several aggregation operations can be
performed on the grouped data.
An obvious one is aggregation via the aggregate (or the equivalent agg) method:
import pandas as pd
import numpy as np
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Year')
print(grouped['Points'].agg(np.mean))
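In recent pandas versions it is often simpler to pass the aggregation by name as a string, which also avoids a deprecation warning for NumPy callables; the same result can be obtained with, for example:
print(grouped['Points'].agg('mean'))
# or, equivalently
print(grouped['Points'].mean())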
Aggregating functions are the ones that reduce the dimension of the returned objects.
Some common aggregating functions are tabulated below:
Function Description
mean() Compute mean of groups
sum() Compute sum of group values
size() Compute group sizes
count() Compute count of group values
std() Standard deviation of groups
var() Compute variance of groups
sem() Standard error of the mean of groups
describe() Generates descriptive statistics
first() Compute first of group values
last() Compute last of group values
nth() Take nth value, or a subset if n is a list
min() Compute min of group values
max() Compute max of group values
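A few of these can be sketched directly on the ipl_data DataFrame defined above:
grouped = df.groupby('Team')
print(grouped.size())                # number of rows in each team
print(grouped['Points'].first())     # first Points value per team
print(grouped['Points'].describe())  # descriptive statistics of Points per team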
import pandas as pd
import numpy as np

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Team')
print(grouped.agg(np.size))
Applying Multiple Aggregation Functions at Once
With a grouped Series, you can also pass a list or dict of functions to
aggregate with, and generate a DataFrame as output:
import pandas as pd
import numpy as np
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)

grouped = df.groupby('Team')
print(grouped['Points'].agg([np.sum, np.mean, np.std]))
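A dict can map each column to its own aggregation, and named aggregation lets you name the output columns explicitly. A short sketch using the same grouped object:
# one aggregation per column, chosen via a dict
print(grouped.agg({'Points': 'mean', 'Rank': 'min'}))
# named aggregation: output column name on the left, (column, function) on the right
print(grouped.agg(avg_points=('Points', 'mean'), best_rank=('Rank', 'min')))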
Transformation
Transformation is an operation on a group or column that performs some
group-specific computation and returns an object that is indexed the same
(and therefore has the same size) as the group it was computed from.
# import the pandas library
import pandas as pd
import numpy as np
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Team')
score = lambda x: (x - x.mean()) / x.std()*10
print(grouped.transform(score))
A lambda function is a small anonymous function.
A lambda function can take any number of arguments, but can only
have one expression.
Syntax
lambda arguments : expression
The expression is executed and the result is returned:
Add 10 to argument a, and return the result:
x = lambda a : a + 10
print(x(5))

x = lambda a, b : a * b
print(x(5, 6))
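Lambdas are also handy for one-off groupby aggregations; for example, the range of Points within each team (a small sketch reusing the ipl_data DataFrame from the examples above):
print(df.groupby('Team')['Points'].agg(lambda x: x.max() - x.min()))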
Pivot Table
A pivot table lets you calculate, summarize and aggregate your data. MS
Excel has this feature built in and provides an elegant way to create a
pivot table from data. It is a powerful tool that allows you to aggregate
the data with calculations such as sum, count, average, max and min, to
configure the rows and columns of the pivot table, and to apply filters
and sort orders once the pivot table has been created. In Python, pandas
can build pivot tables and crosstabs from a DataFrame or from lists of data.
Let us create a dataframe of different e-commerce sites and their monthly sales in
different categories:

import pandas as pd
import numpy as np
df = pd.DataFrame({'site' : ['walmart', 'amazon', 'alibaba',
'flipkart','alibaba','flipkart','walmart', 'amazon', 'alibaba', 'flipkart'],
'Product_Category' : ['Kitchen', 'Home-Decor', 'Gardening', 'Health',
'Beauty', 'Garments',
'Gardening', 'Health', 'Beauty', 'Garments'] ,
'Product' : ['Oven','Sofa-set','digging spade','fitness
band','sunscreen','pyjamas','digging spade',
'fitness band','sunscreen','pyjamas'],
'Sales' : [2000,3000,4000,5000,6000,9000,3000,2500,1020,950]})
print(df)
There are 4 sites and 6 different product categories. We will now use this
data to create the pivot table. Before using the pandas pivot table
feature, we have to ensure the dataframe is created.
Create Pivot Table
df.pivot_table( index=['Product_Category', 'Product'], values=['Sales'],
columns=['site'])
The index parameter lists the columns to use as the rows of the pivot table,
values names the data to aggregate, and columns names the column(s) whose
values become the pivot table's columns. Here we want to see each
Product_Category and Product with their sales data for each site as a column.
By default the aggregate function is mean.
Pandas Pivot Table Aggfunc
Let us see another parameter, aggfunc, which takes one function or a list of
functions. As we have seen, if you do not mention this parameter explicitly,
the default function is mean. Now let us try other aggregation functions such
as sum, min, max or count.
df.pivot_table( index=['Product_Category', 'Product'], values=['Sales'],
columns=['site'], aggfunc=min)
List of Aggfunc
Let us pass two aggregation functions as a list, i.e. min and sum:
df.pivot_table( index=['Product_Category', 'Product'], values=['Sales'],
columns=['site'], aggfunc=[min,sum])
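Two other commonly used pivot_table parameters are fill_value, which replaces the NaN shown where a site has no sales for a combination, and margins, which adds row and column totals. A short sketch:
df.pivot_table(index=['Product_Category', 'Product'], values=['Sales'],
               columns=['site'], aggfunc='sum', fill_value=0, margins=True)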
Pandas Crosstabs

A crosstab is a tabular structure showing the relationship between different variables.

There is not much difference between pandas crosstab and pivot_table; they work
almost the same way. The difference is that crosstab works with Series or lists of
values, whereas pivot_table works with a DataFrame, and internally crosstab calls
the pivot table function. So when you have lists of data or Series, use crosstab;
when the data is already in a DataFrame, go for pivot_table.
Let us take the same dataframe as above and apply the same use cases using
crosstab. The default aggfunc here is count, which means it finds the frequency
of each row and column combination:
pd.crosstab([df.Product_Category,df.Product],df.site)
Crosstab Rownames and Column Names
Let us change the row and column names using the rownames and colnames
parameters. Here Product_Category is shown as PC, Product as P and site as S:
pd.crosstab([df.Product_Category, df.Product], df.site, rownames=['PC', 'P'], colnames=['S'])
Crosstab Aggfunc
pd.crosstab([df.Product_Category, df.Product], df.site, values=df.Sales, aggfunc=sum, rownames=['PC', 'P'], colnames=['S'])
List of Aggfunc
Let us take a list of aggregation functions, i.e. sum and min. These functions are stored in a list and passed as aggfunc:
pd.crosstab([df.Product_Category, df.Product], df.site, values=df.Sales, aggfunc=[sum, min], rownames=['PC', 'P'], colnames=['S'])
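crosstab can also report proportions instead of counts through its normalize parameter; for example:
# fraction of rows in each (category, site) cell; normalize='index' or 'columns' also work
pd.crosstab(df.Product_Category, df.site, normalize='all')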
Python - Time Series

A time series is a series of data points in which each data point is
associated with a timestamp. A simple example is the price of a stock in
the stock market at different points of time on a given day. Another
example is the amount of rainfall in a region in different months of the
year.

In the example we take the value of stock prices every day for a quarter
for a particular stock symbol. We capture these values in a csv file and
then organize them into a dataframe using the pandas library. We then set
the date field as the index of the dataframe by making the ValueDate
column the index and deleting the original ValueDate column.
Sample Data
Below is the sample data for the price of the stock on different days of a given quarter. The data is
saved in a file named stock.csv.

ValueDate, Price
01-01-2018, 1042.05
02-01-2018, 1033.55
03-01-2018, 1029.7
04-01-2018, 1021.3
05-01-2018, 1015.4
...
...
...
...
23-03-2018, 1161.3
26-03-2018, 1167.6
27-03-2018, 1155.25
28-03-2018, 1154
Creating Time Series
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('path_to_file/stock.csv')
df = pd.DataFrame(data, columns=['ValueDate', 'Price'])
# Set the date as the index (the dates in the file are in day-first, DD-MM-YYYY format)
df['ValueDate'] = pd.to_datetime(df['ValueDate'], format='%d-%m-%Y')
df.index = df['ValueDate']
del df['ValueDate']
df.plot(figsize=(15, 6))
plt.show()
Output: a line plot of the stock Price against the date index over the quarter.
Python Basic date and time types

To manipulate dates and times in Python there is a module called datetime. There are two kinds
of date and time objects: naive and aware.
A naive object does not contain enough information to unambiguously locate itself relative to
other date/time objects; whether it represents Coordinated Universal Time (UTC), local time or
some other timezone is left to the program.
An aware object carries additional information about algorithmic and political time adjustments,
such as timezone and daylight saving data, and can therefore represent a specific moment in
time unambiguously.
To use this module, we should import it using:

import datetime
There are different classes, constants and methods in this module.
The constants are:
datetime.MINYEAR
The smallest year number allowed in a date or datetime
object. The value is 1.
datetime.MAXYEAR
The largest year number allowed in a date or datetime
object. The value is 9999.
The available types are:
date
A date object. It uses the Gregorian calendar and has year, month and day attributes.
time
A time object, independent of any particular day. It has hour, minute, second,
microsecond and tzinfo attributes.
datetime
A combination of a date and a time.
timedelta
A duration expressing the difference between two date, time or datetime values, with microsecond resolution.
tzinfo
An abstract base class that holds time zone information. It is used by the datetime and time
classes.
timezone
A class that implements tzinfo as a fixed offset from UTC.
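A short sketch showing several of these types together:
import datetime as dt

d = dt.date(2018, 1, 1)                                  # a date: year, month and day
t = dt.time(14, 30, 0)                                   # a time of day, independent of any date
dtm = dt.datetime(2018, 1, 1, 14, 30)                    # a combined date and time
delta = dt.timedelta(days=7)                             # a duration
print(d, t)
print(dtm + delta)                                       # datetime arithmetic with a timedelta
print(dt.datetime(2018, 1, 1, tzinfo=dt.timezone.utc))   # an aware datetime with a fixed UTC offset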
Date Type Object
A date object represents a date with day, month and year parts. It uses the Gregorian
calendar, in which January 1 of year 1 is day number 1, January 2 of year 1 is day number 2, and so on.
Some date-related methods are:
Method date(year, month, day)
The constructor for a date object. All arguments are required and must be integers. The year
must be in the range MINYEAR to MAXYEAR. If the given date is not valid, a ValueError is raised.
Method date.today()
Returns the current local date.
Method date.fromtimestamp(timestamp)
Returns the date corresponding to a POSIX timestamp. If the timestamp value is out of range, an
OverflowError is raised.
Method date.fromordinal(ordinal)
Returns the date corresponding to a proleptic Gregorian calendar ordinal, i.e. the count of days
from January 1 of year 1.
Method date.toordinal()
Returns the proleptic Gregorian calendar ordinal of the date.
Method date.weekday()
Returns the day of the week as an integer, where Monday is 0, Tuesday is 1,
and so on.
Method date.isoformat()
Returns the date as a string in ISO 8601 format, YYYY-MM-DD.
import datetime as dt

new_date = dt.date(1998, 9, 5)  # store the date 5th September, 1998
print("The Date is: " + str(new_date))
print("Ordinal value of given date: " + str(new_date.toordinal()))
print("The weekday of the given date: " + str(new_date.weekday()))  # Monday is 0
my_date = dt.date.fromordinal(732698)  # create a date from an ordinal value
print("The Date from ordinal is: " + str(my_date))
td = my_date - new_date  # subtracting two dates gives a timedelta object
print('td Type: ' + str(type(td)) + '\nDifference: ' + str(td))
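The remaining methods listed above, today(), fromtimestamp() and isoformat(), can be sketched as:
import datetime as dt
import time

print(dt.date.today())                     # the current local date
print(dt.date.fromtimestamp(time.time()))  # date from a POSIX timestamp
print(dt.date(1998, 9, 5).isoformat())     # '1998-09-05'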
