Data Wrangling

Data wrangling, also known as data munging, is a crucial step in data science that involves processing data in various formats for analysis. It includes functionalities such as data exploration, dealing with missing values, reshaping, and filtering data, all of which prepare datasets for further use in machine learning and visualization. Python provides built-in features to facilitate these wrangling methods, such as merging datasets to create meaningful insights.

Uploaded by

Anshu Yadav

DATA WRANGLING:-

Data wrangling means processing data that arrives in various formats, using operations such as merging, grouping,
and concatenating, so that it can be analysed or combined with another set of data. Python, typically through the
pandas library, provides features for applying these wrangling methods to data sets to reach an analytical goal.
Data wrangling is also known as data munging.

Importance Of Data Wrangling

Data wrangling is a very important step in a data science project. The following example illustrates its
importance:

A book-selling website wants to show the top-selling books in different domains according to user preference.
For example, if a new user searches for motivational books, the website wants to show the motivational
books that sell the most or have the highest ratings.

The website, however, holds plenty of raw data from different users, and this is where data munging, or data
wrangling, comes in. Data wrangling is not done by the system itself; it is carried out by data scientists.
The data scientist wrangles the data so that the motivational books are sorted by sales or by rating, or grouped
with the other books that users bought together with them. The new user then makes a choice on the basis of
that result.

Data wrangling in Python covers the following functionalities:

1. Data exploration: The data is studied, analyzed, and understood, typically by visualizing
representations of it.

2. Dealing with missing values: Most large datasets contain missing values, which appear as NaN. They
need to be handled, either by replacing them with the column's mean or mode (its most frequent value)
or simply by dropping the rows that contain a NaN value.

3. Reshaping data: The data is manipulated according to the requirements; new data can be added or
existing data can be modified.

4. Filtering data: Sometimes datasets contain unwanted rows or columns, which need to be removed or
filtered out.

5. Other: After the raw dataset has been processed with the functionalities above, we get a dataset that
fits our requirements and can be used for purposes such as data analysis, machine learning, data
visualization, or model training.
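
The filtering and row-dropping steps in the list above can be sketched as follows. This is a minimal illustration that assumes pandas is installed; the column names, the values, and the threshold of 80 are hypothetical and not taken from the original data.

```python
import pandas as pd

# Hypothetical records; REMARKS stands in for an unwanted column.
df = pd.DataFrame({
    "NAME": ["Asha", "Ravi", "Meena", "Karan"],
    "MARKS": [85.0, float("nan"), 72.0, 90.0],
    "REMARKS": ["", "", "", ""],
})

# Filtering columns: drop a column that is not needed for the analysis.
trimmed = df.drop(columns=["REMARKS"])

# Dropping rows: remove every row that contains a NaN value.
complete = trimmed.dropna()

# Filtering rows: keep only the rows that satisfy a condition.
high_scorers = complete[complete["MARKS"] >= 80]
print(high_scorers)
```

Each step returns a new dataframe, so the original `df` stays available for comparison.
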

Data exploration in Python

In data exploration, we load the data into a dataframe and then view it in tabular form.
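
A minimal sketch of this step, assuming pandas is installed. The MARKS and GENDER column names follow the text; the NAME column and every value are hypothetical.

```python
import pandas as pd

# Hypothetical student records; names and values are illustrative only.
data = {
    "NAME": ["Asha", "Ravi", "Meena", "Karan", "Divya"],
    "GENDER": ["F", "M", "F", "M", "F"],
    "MARKS": [85.0, float("nan"), 72.0, 90.0, float("nan")],
}
df = pd.DataFrame(data)

# Tabular view of the data.
print(df)

# Quick structural checks: dimensions and missing values per column.
print(df.shape)
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Counting the NaN values per column at this stage shows which columns need attention before any analysis.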

Dealing with missing values in Python

The MARKS column of the dataframe contains NaN values, which are missing values. Data wrangling takes
care of them by replacing them with the column mean.
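
Replacing the missing MARKS values with the column mean could look like the sketch below. It assumes pandas is installed; the NAME column and the values are hypothetical.

```python
import pandas as pd

# Hypothetical records with two missing MARKS values.
df = pd.DataFrame({
    "NAME": ["Asha", "Ravi", "Meena", "Karan", "Divya"],
    "MARKS": [85.0, float("nan"), 72.0, 90.0, float("nan")],
})

# mean() skips NaN by default, so this is the mean of the observed marks.
marks_mean = df["MARKS"].mean()

# Assign the filled column back to the dataframe.
df["MARKS"] = df["MARKS"].fillna(marks_mean)
print(df)
```

Assigning the result back to the column is a deliberate choice here: it avoids relying on in-place modification of a column selection.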

Data Replacing in Data Wrangling

In the GENDER column, we can replace the values by encoding each category as a different number.
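
One way to do this is to map each category to a number, as in the sketch below. It assumes pandas is installed; the labels "M" and "F" and the codes 0 and 1 are hypothetical choices, not values from the original data.

```python
import pandas as pd

# Hypothetical records with a categorical GENDER column.
df = pd.DataFrame({
    "NAME": ["Asha", "Ravi", "Meena", "Karan"],
    "GENDER": ["F", "M", "F", "M"],
})

# Hypothetical encoding: one number per category.
gender_codes = {"M": 0, "F": 1}

# map() looks up each value in the dictionary.
df["GENDER"] = df["GENDER"].map(gender_codes)
print(df)
```

With `map`, any value that is missing from the dictionary becomes NaN, so it is worth checking the column for NaN values after encoding.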

Data Wrangling Using Merge Operation

The merge operation combines two raw datasets into a single dataset in the desired format.

Syntax: pd.merge(data_frame1, data_frame2, on="field")

EXAMPLE: Suppose a teacher has two sets of data: the first holds the details of the students, and the second
holds the pending fee status obtained from the accounts office. The teacher can use the merge operation to
combine the two and give the data meaning. The merged data is easier to analyze, and merging this way saves
the teacher the time and effort of merging manually.

Creating Second Dataframe to Perform Merge operation using Data Wrangling:
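
A minimal sketch of this step for the teacher example, assuming pandas is installed. The ID, NAME, and PENDING column names and all values are hypothetical.

```python
import pandas as pd

# First dataframe: hypothetical student details.
details = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "NAME": ["Asha", "Ravi", "Meena", "Karan"],
})

# Second dataframe: hypothetical pending fees from the accounts office.
# Student 104 has no entry here.
fees = pd.DataFrame({
    "ID": [101, 102, 103],
    "PENDING": [5000, 0, 2500],
})
print(fees)
```

Both dataframes share the ID column, which is what makes a later merge on that field possible.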

Data Wrangling Using Merge Operation:
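
The merge itself could then be sketched as follows, assuming pandas is installed. The two dataframes, their column names, and their values are hypothetical stand-ins for the student details and the pending fee data.

```python
import pandas as pd

# Hypothetical student details.
details = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "NAME": ["Asha", "Ravi", "Meena", "Karan"],
})

# Hypothetical pending fees; student 104 has no fee record.
fees = pd.DataFrame({
    "ID": [101, 102, 103],
    "PENDING": [5000, 0, 2500],
})

# Default merge is an inner join: only IDs present in both frames are kept.
merged = pd.merge(details, fees, on="ID")
print(merged)

# A left join keeps every student; a missing fee record shows up as NaN.
merged_left = pd.merge(details, fees, on="ID", how="left")
print(merged_left)
```

The choice between the two joins depends on whether students without a fee record should remain in the result.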
