H AHA2

Uploaded by

PRANJAY ROHILLA


Data Preprocessing Techniques
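
The answers below operate on a DataFrame named df1 that this document never defines. To try the snippets, a small stand-in can be built by hand. The column names are the ones the answers use; every value is hypothetical and is not taken from the original data set.

```python
import pandas as pd

# Hypothetical sample rows; the real data set is not included in this document.
df1 = pd.DataFrame({
    "Transaction Date": ["2024-01-15", "2024-02-03", "2024-04-20",
                         "2024-07-09", "2024-07-30", "2024-11-11"],
    "Category": ["Electronics", "Clothing", "Electronics",
                 "Grocery", "Clothing", "Grocery"],
    "Region": ["North", "South", "North", "East", "South", "East"],
    "Quantity Sold": [2, 5, 1, 10, 3, 8],
    "Sales Amount": [400.0, 120.0, 250.0, 90.0, 300.0, 60.0],
})
print(df1.head())
```

Dates are stored as strings here on purpose, since Q3 and Q5 convert them with pd.to_datetime before grouping.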

Q1. Calculate the total sales generated by each product.
# Sum the sales amount within each product category
ts = df1.groupby("Category")["Sales Amount"].sum()
ts = ts.reset_index()
ts.columns = ["Product", "Total sales"]
print("The total sales generated by each product:\n", ts)

Q2. Calculate the average quantity sold in each region.
# Average the quantity sold within each region
avg = df1.groupby("Region")["Quantity Sold"].mean()
avg = avg.reset_index()
avg.columns = ["Region", "Average quantity sold"]
print("The average quantity sold in each region:\n", avg)

Q3. Count the total number of transactions in each month.
import pandas as pd
# Parse the dates so that the .dt accessor is available
df1["Transaction Date"] = pd.to_datetime(df1["Transaction Date"])
# Group by month number (1-12) and count the rows in each group
t = df1.groupby(df1["Transaction Date"].dt.month)["Category"].count()
t = t.reset_index()
t.columns = ["Month", "Transaction Count"]
print("The total number of transactions in each month:\n", t)
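
Grouping by .dt.month uses only the month number, so the same month in different years lands in one group. If the data spans more than one year (not stated in this document), grouping by a monthly period keeps the years apart. A minimal sketch with hypothetical dates:

```python
import pandas as pd

# Hypothetical dates spanning two years
dates = pd.to_datetime(pd.Series(["2023-01-10", "2024-01-05", "2024-01-20", "2024-02-14"]))

# Month number only: January 2023 and January 2024 are pooled together
by_month = dates.groupby(dates.dt.month).count()

# Monthly period: each calendar month of each year is its own group
by_period = dates.groupby(dates.dt.to_period("M")).count()

print(by_month)
print(by_period)
```

For data from a single year the two groupings give the same counts.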

Q4. Display the regions generating maximum and minimum sales.
# Total sales per region; idxmax/idxmin return the region labels
rs = df1.groupby("Region")["Sales Amount"].sum()
print("The region generating maximum sales is", rs.idxmax())
print("The region generating minimum sales is", rs.idxmin())

Q5. Display the total sales of every quarter.
df1["Transaction Date"] = pd.to_datetime(df1["Transaction Date"])
# Group by quarter number (1-4) and sum the sales in each group
sq = df1.groupby(df1["Transaction Date"].dt.quarter)["Sales Amount"].sum()
sq = sq.reset_index()
sq.columns = ["Quarter", "Total sales"]
print("The total sales of every quarter:\n", sq)

Q6. Normalize the Sales column.
df = df1.copy()
# Min-max normalization: (x - min) / (max - min) maps the values into [0, 1]
df["Normalized Sales"] = (df["Sales Amount"] - df["Sales Amount"].min()) / (df["Sales Amount"].max() - df["Sales Amount"].min())
print(df[["Sales Amount", "Normalized Sales"]])
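
The min-max formula divides by max - min, which is zero when every value in the column is the same. The sketch below wraps the formula in a helper that handles that case; the name min_max_normalize and the choice to return zeros for a constant column are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def min_max_normalize(s: pd.Series) -> pd.Series:
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    span = s.max() - s.min()
    if span == 0:
        # Constant column: the formula would divide by zero, so return zeros.
        return pd.Series(0.0, index=s.index)
    return (s - s.min()) / span

sales = pd.Series([100.0, 150.0, 300.0])  # hypothetical values
normalized = min_max_normalize(sales)
print(normalized.tolist())  # [0.0, 0.25, 1.0]
```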

Q7. Apply a log transformation to the Sales column to reduce its variance.
import numpy as np
df = df1.copy()
# Natural logarithm; defined only for positive sales amounts
df["Log Transformed Sales"] = np.log(df["Sales Amount"])
print(df[["Sales Amount", "Log Transformed Sales"]])
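
np.log returns -inf for zero and NaN for negative inputs. Whether the sales column contains zeros is not stated here; if it does, np.log1p, which computes log(1 + x), is one common alternative. A minimal sketch with hypothetical values:

```python
import numpy as np
import pandas as pd

sales = pd.Series([0.0, 10.0, 100.0, 1000.0])  # hypothetical, includes a zero

# Plain log: the zero entry becomes -inf (the warning is silenced here)
with np.errstate(divide="ignore"):
    plain_log = np.log(sales)

# log1p: log(1 + x), so the zero entry maps to 0
log1p_sales = np.log1p(sales)

print(plain_log.tolist())
print(log1p_sales.tolist())
print(sales.var(), log1p_sales.var())
```

In this sample the variance of the transformed values is far smaller than that of the raw values, which is the effect the question asks for.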

Q8. Perform binning on the Sales column in order to categorize the sales into low, medium and high.
df = df1.copy()
# Bin edges: (0, 150] is Low, (150, 350] is Medium, (350, 5001] is High
bins = [0, 150, 350, 5001]
labels = ["Low", "Medium", "High"]
df["Sales Category"] = pd.cut(df["Sales Amount"], bins, labels=labels)
print(df[["Sales Amount", "Sales Category"]])
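
By default pd.cut builds intervals that are open on the left and closed on the right, so a value equal to the first edge (0 here) falls outside every bin and becomes NaN. The sketch below checks values that sit on the edges; the sample amounts are hypothetical.

```python
import pandas as pd

amounts = pd.Series([0, 150, 151, 350, 5001])  # hypothetical values on the bin edges
bins = [0, 150, 350, 5001]
labels = ["Low", "Medium", "High"]

# Default intervals: (0, 150], (150, 350], (350, 5001]
default = pd.cut(amounts, bins, labels=labels)

# include_lowest=True closes the first interval on the left, so 0 counts as Low
with_lowest = pd.cut(amounts, bins, labels=labels, include_lowest=True)

print(default.tolist())
print(with_lowest.tolist())
```

If a sale of exactly 0 can occur, include_lowest=True keeps it in the Low category instead of leaving it unlabelled.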
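
Q9. Find the highly correlated columns of the given data set.
df = df1.copy()
# Encode the non-numeric columns as integer codes so that corr() can use them
df["Category"] = df["Category"].astype("category").cat.codes
df["Transaction Date"] = df["Transaction Date"].astype("category").cat.codes
df["Region"] = df["Region"].astype("category").cat.codes
threshold = 0.8  # assumed cutoff; the original leaves threshold undefined
c = df.corr()
# Flatten the correlation matrix into (Column1, Column2, Correlation) rows
cp = c.unstack().reset_index()
cp.columns = ["Column1", "Column2", "Correlation"]
# Keep strong correlations and drop each column's correlation with itself
cp = cp[(cp["Correlation"] > threshold) & (cp["Column1"] != cp["Column2"])]
# Each pair appears twice (A-B and B-A); keep one row per correlation value
cp = cp.drop_duplicates(subset=["Correlation"])
print("Highly correlated columns of this dataset are:\n", cp)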
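
Dropping duplicates by correlation value removes the mirrored pair (B, A), but it would also discard a different pair that happens to have exactly the same coefficient. One alternative, sketched here on a hypothetical three-column frame with an assumed threshold of 0.8, is to keep only the cells above the diagonal of the correlation matrix before filtering.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data: b is an exact multiple of a
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 3, 4, 1, 2],
})
threshold = 0.8  # assumed cutoff

c = df.corr()
# True only strictly above the diagonal, so each pair is kept once
mask = np.triu(np.ones(c.shape, dtype=bool), k=1)
pairs = c.where(mask).stack().reset_index()
pairs.columns = ["Column1", "Column2", "Correlation"]
# Masked cells are NaN and never pass the comparison
high = pairs[pairs["Correlation"] > threshold]
print(high)
```

Comparing with `> threshold` reports only positive correlations; to include strong negative ones as well, the comparison could be applied to the absolute value of the Correlation column.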
