How To Prepare Messy Data for Analysis using SQL?
Problem:
SELECT *
FROM SalesTransactions
UPDATE SalesTransactions
SELECT PricePerUnit
FROM Products
SalesTransactions
the row:
(Products).
value 10.
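The fragments above point to an UPDATE on SalesTransactions that reads PricePerUnit from Products. The full statement did not survive, so the following is only a minimal sketch of that pattern, not the original query. It assumes the missing column is TotalAmount and that it can be recomputed as QuantitySold times PricePerUnit. The sample rows and the use of SQLite through Python's sqlite3 module are also assumptions, made so the SQL can be run as-is.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the original slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID TEXT PRIMARY KEY, PricePerUnit REAL);
CREATE TABLE SalesTransactions (
    TransactionID INTEGER,
    ProductID     TEXT,
    QuantitySold  INTEGER,
    TotalAmount   REAL
);
INSERT INTO Products VALUES ('P1', 20.0), ('P2', 5.0);
-- Transaction 2 is missing its TotalAmount.
INSERT INTO SalesTransactions VALUES (1, 'P1', 2, 40.0), (2, 'P2', 3, NULL);
""")

# Fill only the missing values, looking up the unit price per row.
# Assumption: TotalAmount = QuantitySold * PricePerUnit.
conn.execute("""
UPDATE SalesTransactions
SET TotalAmount = QuantitySold * (
    SELECT PricePerUnit
    FROM Products
    WHERE Products.ProductID = SalesTransactions.ProductID
)
WHERE TotalAmount IS NULL
""")

filled = conn.execute(
    "SELECT TransactionID, TotalAmount FROM SalesTransactions ORDER BY TransactionID"
).fetchall()
print(filled)
```

Restricting the update with WHERE TotalAmount IS NULL leaves rows that already have a value untouched.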
WITH DuplicateCheck AS (
FROM SalesTransactions
WHERE TransactionID IN (
);
The duplicate row has been deleted, leaving only one record.
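The DuplicateCheck query is only partly legible above, so here is a hedged sketch of one common way to get the same result: number the copies of each row with ROW_NUMBER() and delete every copy after the first. The partition columns, the sample rows, and the use of SQLite's rowid to tell identical rows apart are assumptions. Other databases need a different row identifier.

```python
import sqlite3

# Window functions require SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesTransactions (
    TransactionID   INTEGER,
    CustomerID      INTEGER,
    TransactionDate TEXT,
    ProductID       TEXT,
    QuantitySold    INTEGER,
    TotalAmount     REAL
);
-- Hypothetical sample rows; transaction 2 appears twice.
INSERT INTO SalesTransactions VALUES
    (1, 10, '2024-01-05', 'P1', 2, 40.0),
    (2, 11, '2024-01-06', 'P2', 3, 15.0),
    (2, 11, '2024-01-06', 'P2', 3, 15.0);
""")

# Rows that match on every column share a partition; rn = 1 is the copy to keep.
conn.execute("""
WITH DuplicateCheck AS (
    SELECT rowid AS rid,
           ROW_NUMBER() OVER (
               PARTITION BY TransactionID, CustomerID, TransactionDate,
                            ProductID, QuantitySold, TotalAmount
               ORDER BY rowid
           ) AS rn
    FROM SalesTransactions
)
DELETE FROM SalesTransactions
WHERE rowid IN (SELECT rid FROM DuplicateCheck WHERE rn > 1)
""")

remaining_ids = [
    row[0]
    for row in conn.execute(
        "SELECT TransactionID FROM SalesTransactions ORDER BY TransactionID"
    )
]
print(remaining_ids)
```

Deleting by rowid rather than by TransactionID matters here: the duplicates share the same TransactionID, so filtering on that column alone would remove every copy.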
UPDATE SalesTransactions
SELECT *
FROM SalesTransactions
than 1,000), and then removed this transaction from the data.
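The outlier step described above can be sketched as a two-stage check: first list the rows that exceed the threshold, then delete them. The threshold of 1,000 comes from the text. Applying it to the QuantitySold column is an assumption, as are the sample rows and the SQLite harness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesTransactions (
    TransactionID INTEGER,
    ProductID     TEXT,
    QuantitySold  INTEGER
);
-- Hypothetical sample rows; transaction 3 has an unrealistic quantity.
INSERT INTO SalesTransactions VALUES
    (1, 'P1', 2),
    (2, 'P2', 3),
    (3, 'P1', 5000);
""")

# Inspect the suspicious rows before removing anything.
flagged = conn.execute(
    "SELECT TransactionID, QuantitySold FROM SalesTransactions WHERE QuantitySold > 1000"
).fetchall()
print(flagged)

# Remove them using the same predicate that was just reviewed.
conn.execute("DELETE FROM SalesTransactions WHERE QuantitySold > 1000")

kept = [
    row[0]
    for row in conn.execute(
        "SELECT TransactionID FROM SalesTransactions ORDER BY TransactionID"
    )
]
print(kept)
```

Running the SELECT first makes it possible to confirm that only the intended rows match before they are deleted.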
TransactionID  CustomerID  TransactionDate  ProductID  QuantitySold  TotalAmount
1004 3
200 Deleted
Data Analytics 2024 Data Cleaning
SELECT *
FROM SalesTransactions
The row with the invalid ProductID = XYZ has been deleted.
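A minimal sketch of how such an invalid ProductID could be removed, assuming "invalid" means the value has no matching row in Products. The original statement is not shown, so this is one possible approach rather than the one used in the slides. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID TEXT PRIMARY KEY);
CREATE TABLE SalesTransactions (TransactionID INTEGER, ProductID TEXT);
INSERT INTO Products VALUES ('P1'), ('P2');
-- Hypothetical sample rows; 'XYZ' does not exist in Products.
INSERT INTO SalesTransactions VALUES (1, 'P1'), (2, 'XYZ'), (3, 'P2');
""")

# Delete transactions whose ProductID has no match in the reference table.
# NOT EXISTS is used instead of NOT IN so that a NULL in Products.ProductID
# cannot silently turn the filter into a no-op.
conn.execute("""
DELETE FROM SalesTransactions
WHERE NOT EXISTS (
    SELECT 1
    FROM Products
    WHERE Products.ProductID = SalesTransactions.ProductID
)
""")

valid = conn.execute(
    "SELECT TransactionID, ProductID FROM SalesTransactions ORDER BY TransactionID"
).fetchall()
print(valid)
```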
Summary
values
realistic data