Utils Py

This module provides helper functions for exploratory data analysis (EDA) and modeling exercises. It reads and cleans demand and promotion data, merges demand with promotions, extends each promotion over multiple days, and aggregates the daily data to a weekly format.
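As a rough illustration of the input the readers expect, the sketch below builds a small demand table in memory. The column names (date, supermarket, sku, demand) are the ones the module refers to; the rows themselves are invented, and the use of parse_dates and index_col is an alternative to the module's own date parsing, not the module's method.

```python
import io

import pandas as pd

# Illustrative rows only: column names follow the module, values are made up.
demand_csv = io.StringIO(
    "date,supermarket,sku,demand\n"
    "2020-01-01,A,1,10\n"
    "2020-01-02,A,1,\n"
    "2020-01-03,A,1,12\n"
)

# parse_dates + index_col produce a DatetimeIndex named "date" in one step.
demand = pd.read_csv(demand_csv, parse_dates=["date"], index_col="date")
print(demand)
```

The empty demand field on the second row is read as NaN, which is the kind of gap the cleaning helpers are meant to fill.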

Uploaded by

salimshaik045


"""
This module contains helper functions for the EDA and modelling
exercises. Feel free to use them to get you started more quickly.
"""
import pandas as pd
import numpy as np
from datetime import datetime

def parse_time(s):
    return datetime.strptime(s, "%Y-%m-%d").date()

def read_demand(path):
    df = pd.read_csv(path)
    df = df.assign(date=lambda df: df.date.apply(parse_time))
    df = df.set_index("date")
    df.index = pd.DatetimeIndex(df.index)
    return df

def read_promotions(path):
    df = pd.read_csv(path, index_col=0)
    df = df.assign(
        promotion_date=lambda df: df.promotion_date.apply(parse_time)
    )
    df = df.set_index("promotion_date")
    df.index = pd.DatetimeIndex(df.index)
    return df

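def clean(ts: pd.Series) -> pd.Series:
    # Replaces missing values: back-fill first, then use the series mean
    # for any trailing gaps that have no later value to copy from
    return ts.bfill().fillna(ts.mean())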

def clean_demand_per_group(demand: pd.DataFrame) -> pd.DataFrame:
    """Fills missing demand values separately for each (supermarket, sku) pair.

    Applies clean() to the "demand" column of every group. The input
    frame is modified in place and also returned.
    """
    sus = demand.supermarket.unique()
    skus = demand.sku.unique()
    for su in sus:
        for sku in skus:
            mask = (demand.sku == sku) & (demand.supermarket == su)
            demand.loc[mask, "demand"] = clean(demand.loc[mask, "demand"])
    return demand

def merge(demand: pd.DataFrame, promotions: pd.DataFrame) -> pd.DataFrame:
    # Flag every promotion row, then align it with demand on supermarket, sku and date
    promotions = promotions.rename_axis("date").assign(promotion=True)
    demand = demand.merge(
        promotions,
        on=["supermarket", "sku", "date"],
        how="outer",
    )
    # Rows without a matching promotion get promotion=False
    demand = demand.assign(promotion=lambda df: df.promotion.fillna(False))
    return demand

def extend_promotions_days(promotions, n_days):
    """Extends the promotions to have multiple rows for a specific
    number of days.

    The input promotions is assumed to be specified with a single row
    with a starting date.
    The output extends the input promotions with multiple days, one row
    for each day of the promotion.
    """
    n_promotions = len(promotions)
    initial_promotions = promotions.copy()
    promotion_id = np.arange(n_promotions)
    extended_promotions = promotions.copy().assign(promotion_id=promotion_id)
    for days_to_add in range(1, n_days):
        additional_promotion_days = initial_promotions.copy().assign(
            promotion_id=promotion_id
        )
        additional_promotion_days.index += pd.Timedelta(days_to_add, "d")
        # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
        extended_promotions = pd.concat(
            [extended_promotions, additional_promotion_days]
        )
    return extended_promotions

def aggregate_to_weekly(df):
    grouped = df.groupby(["sku", "supermarket"])
    # Performs a simplistic aggregation of promotion. If a promotion
    # occurred during the week this variable will be true.
    weekly = grouped.apply(
        lambda group: group.resample("W").agg({"demand": "sum", "promotion": "max"})
    )
    weekly = weekly.reset_index().set_index("date")
    return weekly
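The steps of the module can be replayed on a tiny in-memory frame, as in the sketch below. It does not import the module (its import path is not given here) and instead repeats the relevant pandas calls inline. All values are invented, the single supermarket "A" and SKU 1 are hypothetical, and the promotion flag is set with a simple index lookup that only works because there is one group; the module itself uses an outer merge.

```python
import numpy as np
import pandas as pd

# Made-up daily demand for one SKU in one supermarket, with two gaps.
dates = pd.date_range("2020-01-06", periods=14, freq="D").rename("date")
demand = pd.DataFrame(
    {
        "supermarket": "A",
        "sku": 1,
        "demand": [10.0, np.nan, 12.0, 11.0, 9.0, 8.0, 7.0,
                   10.0, 10.0, 10.0, 10.0, 10.0, 10.0, np.nan],
    },
    index=dates,
)

# Same strategy as clean(): back-fill, then use the mean for trailing gaps.
series = demand["demand"]
demand["demand"] = series.bfill().fillna(series.mean())

# One promotion starting on 2020-01-07, extended to n_days consecutive days.
start = pd.DataFrame(
    {"supermarket": ["A"], "sku": [1], "promotion_id": [0]},
    index=pd.DatetimeIndex(["2020-01-07"], name="date"),
)
n_days = 3
pieces = []
for days_to_add in range(n_days):
    piece = start.copy()
    piece.index = piece.index + pd.Timedelta(days=days_to_add)
    pieces.append(piece)
promotions = pd.concat(pieces)

# Simplified stand-in for merge(): valid here only because there is one group.
demand["promotion"] = demand.index.isin(promotions.index)

# Weekly aggregation per (sku, supermarket): sum of demand, and
# promotion is True if any day of the week had a promotion.
weekly = (
    demand.groupby(["sku", "supermarket"])[["demand", "promotion"]]
    .apply(lambda g: g.resample("W").agg({"demand": "sum", "promotion": "max"}))
    .reset_index()
    .set_index("date")
)
print(weekly)
```

In this toy data the gap in the middle is back-filled from the next day, while the final gap has no later value and falls back to the mean of the observed values. The "W" resampling rule produces weeks ending on Sunday, so the 14 days starting on a Monday collapse into exactly two weekly rows.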
