Data Cleaning in SQL
By Muhammad Ikhwan Fadillah
This project was carried out using SQL queries in
Microsoft SQL Server Management Studio.
Data Brief
Nashville Housing Dataset
This is the core dataset for the project. It contains the following
columns:
1. UniqueID
2. ParcelID
3. LandUse
4. PropertyAddress
5. SaleDate
6. SalePrice
7. LegalReference
8. SoldAsVacant
9. OwnerName
10. OwnerAddress
11. Acreage
12. TaxDistrict
13. LandValue
14. BuildingValue
15. TotalValue
16. YearBuilt
17. Bedrooms
18. FullBath
19. HalfBath
Total rows: 56,477
Data Cleaning Process
1. Standardize Date Format: Convert the SaleDate column from datetime format to date format.
2. Populate Property Address Data: Fill in blank values in the PropertyAddress column by
matching rows that share the same ParcelID.
3. Break Out Address Columns: Split the PropertyAddress column into address and city columns,
and the OwnerAddress column into address, city, and state columns.
4. Replace Sold as Vacant Data: Replace "Y" and "N" with "Yes" and "No".
5. Delete Unused Columns: Delete columns that are no longer needed, such as PropertyAddress,
OwnerAddress, and SaleDate.
Standardize Date Format
Solution
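The conversion from datetime to date could look roughly like the sketch below. The table name NashvilleHousing and the new column SaleDateConverted are assumptions for illustration; only SaleDate comes from the dataset.

```sql
-- Assumption: the data lives in a table named NashvilleHousing.
-- SaleDateConverted is a hypothetical new column that holds the date-only value.
ALTER TABLE NashvilleHousing
ADD SaleDateConverted DATE;
GO  -- SSMS batch separator, so the new column exists before the UPDATE is compiled

-- Copy SaleDate into the new column, dropping the time portion.
UPDATE NashvilleHousing
SET SaleDateConverted = CONVERT(DATE, SaleDate);
GO

-- Compare the original and converted values side by side.
SELECT SaleDate, SaleDateConverted
FROM NashvilleHousing;
```

Writing to a new column keeps the original SaleDate untouched until the result has been checked.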
Populate Property Address Data
Solution
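One way to fill the blank addresses is a self-join on ParcelID, sketched below. It assumes the table is named NashvilleHousing, that blank addresses are stored as NULL, and that UniqueID identifies each row.

```sql
-- Assumption: table name NashvilleHousing; blank addresses are stored as NULL.

-- Preview: rows with a missing address, next to another row
-- that has the same ParcelID and a known address.
SELECT a.ParcelID,
       a.PropertyAddress,
       b.PropertyAddress AS AddressFromMatchingParcel
FROM NashvilleHousing AS a
JOIN NashvilleHousing AS b
  ON a.ParcelID = b.ParcelID
 AND a.UniqueID <> b.UniqueID          -- same parcel, different row
WHERE a.PropertyAddress IS NULL
  AND b.PropertyAddress IS NOT NULL;

-- Fill the missing addresses from the matching rows.
UPDATE a
SET PropertyAddress = b.PropertyAddress
FROM NashvilleHousing AS a
JOIN NashvilleHousing AS b
  ON a.ParcelID = b.ParcelID
 AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL
  AND b.PropertyAddress IS NOT NULL;
```

If a parcel has several matching rows with different addresses, which one is used is not guaranteed, so the preview query is worth checking first.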
Breaking Out Property Address into Individual Columns (Address and City)
Solution
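A possible way to do the split is with SUBSTRING and CHARINDEX, as sketched below. It assumes the table is named NashvilleHousing and that PropertyAddress has the form "street, city" with a single comma; the new column names are hypothetical.

```sql
-- Assumption: table name NashvilleHousing; PropertyAddress looks like 'street, city'.
-- PropertySplitAddress and PropertySplitCity are hypothetical new columns.
ALTER TABLE NashvilleHousing
ADD PropertySplitAddress NVARCHAR(255),
    PropertySplitCity    NVARCHAR(255);
GO  -- SSMS batch separator, so the new columns exist before the UPDATE is compiled

UPDATE NashvilleHousing
SET PropertySplitAddress =
        -- everything before the first comma
        LTRIM(RTRIM(SUBSTRING(PropertyAddress, 1,
                              CHARINDEX(',', PropertyAddress) - 1))),
    PropertySplitCity =
        -- everything after the first comma
        LTRIM(RTRIM(SUBSTRING(PropertyAddress,
                              CHARINDEX(',', PropertyAddress) + 1,
                              LEN(PropertyAddress))))
WHERE CHARINDEX(',', PropertyAddress) > 0;  -- skip rows without a comma
```

The WHERE clause avoids a negative length in SUBSTRING for rows that contain no comma.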
Breaking Out Owner Address into Individual Columns (Address, City, and State)
Solution
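For a three-part split, one common trick is to replace the commas with periods and use PARSENAME, as in the sketch below. It assumes the table is named NashvilleHousing, that OwnerAddress has the form "street, city, state", and that the addresses contain no periods of their own; the new column names are hypothetical.

```sql
-- Assumption: table name NashvilleHousing; OwnerAddress looks like 'street, city, state'.
-- OwnerSplitAddress, OwnerSplitCity and OwnerSplitState are hypothetical new columns.
ALTER TABLE NashvilleHousing
ADD OwnerSplitAddress NVARCHAR(255),
    OwnerSplitCity    NVARCHAR(255),
    OwnerSplitState   NVARCHAR(255);
GO  -- SSMS batch separator, so the new columns exist before the UPDATE is compiled

-- PARSENAME splits on periods and counts parts from the right,
-- so part 3 is the street, part 2 the city and part 1 the state.
UPDATE NashvilleHousing
SET OwnerSplitAddress = LTRIM(RTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3))),
    OwnerSplitCity    = LTRIM(RTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2))),
    OwnerSplitState   = LTRIM(RTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1)));
```

PARSENAME returns NULL when a value has more than four parts, so addresses that already contain periods may need a different approach.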
Change Y and N to Yes and No in the "Sold as Vacant" Field
Solution
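The replacement can be sketched with a CASE expression, as below. It assumes the table is named NashvilleHousing and that SoldAsVacant is a character column wide enough to hold "Yes".

```sql
-- Assumption: table name NashvilleHousing; SoldAsVacant is a character column.

-- Check which values currently appear and how often.
SELECT SoldAsVacant, COUNT(*) AS RowsPerValue
FROM NashvilleHousing
GROUP BY SoldAsVacant
ORDER BY RowsPerValue;

-- Map 'Y' to 'Yes' and 'N' to 'No'; leave every other value unchanged.
UPDATE NashvilleHousing
SET SoldAsVacant = CASE
        WHEN SoldAsVacant = 'Y' THEN 'Yes'
        WHEN SoldAsVacant = 'N' THEN 'No'
        ELSE SoldAsVacant
    END;
```

Running the first query again after the update is a quick way to confirm that only "Yes" and "No" remain.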
Delete Unused Columns
Solution
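Dropping the columns could be done in a single statement, as sketched below. It assumes the table is named NashvilleHousing and that the cleaned replacement columns have already been created and checked, since dropped columns cannot be recovered without a backup.

```sql
-- Assumption: table name NashvilleHousing.
-- This permanently removes the original columns, so run it only after
-- the converted and split columns have been verified.
ALTER TABLE NashvilleHousing
DROP COLUMN PropertyAddress, OwnerAddress, SaleDate;
```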
THANK YOU