Cleaning Data in Excel

This document provides an overview of various data cleaning techniques in Excel, including removing duplicates, parsing text to columns, deleting formatting, spell checking, changing case, highlighting errors, using the TRIM function, and finding and replacing text. It explains how to use Excel's built-in tools and functions to clean datasets by eliminating unnecessary or incorrect information to ensure accuracy and quality.

Uploaded by ngugi muchau
Copyright © All Rights Reserved

Table of Contents

Remove Duplicates
Data Parsing from Text to Column
Delete All Formatting
Spell Check
Change Case - Lower/Upper/Proper
Highlight Errors
TRIM Function
Find and Replace
Conclusion

Excel Data Cleaning is an essential skill for all Business and Data Analysts. In the current era
of data analytics, data is expected to meet the highest standards of accuracy and quality. A
major part of Excel Data Cleaning involves eliminating blank spaces and incorrect or outdated
information.

Data Cleaning in Excel can be done in a few simple steps with Excel's built-in tools, such as
Excel Power Query. This tutorial covers some of the fundamental and straightforward Excel Data
Cleaning procedures.

Remove Duplicates
Data is often duplicated unintentionally, without the user's knowledge. In such scenarios, you
can eliminate the duplicate values.

Here, you will consider a simple student dataset that has duplicate values. You will use Excel's
built-in function to remove duplicates, as shown below.

The original dataset has two rows that are duplicates. To eliminate the duplicate data, select
the Data tab in the ribbon, and in the Data Tools group, select the "Remove Duplicates" option.
This opens a new dialogue box, as shown below.

Here, you need to select the columns you want to compare for duplication. Another critical step
is to check the "My data has headers" option, because the data set includes the column names.
Excel usually detects the header row and checks this option by default.

Next, you must compare all columns, so go ahead and check all the columns as shown below.

Select OK, and Excel removes the duplicate rows and provides you with the remaining data set, as
shown below.
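
For readers who script the same clean-up outside Excel, the Remove Duplicates step can be sketched in Python. This is a minimal sketch under stated assumptions, not Excel's implementation: the function name remove_duplicates, the key_columns parameter, and the sample student records are all hypothetical. It keeps the first occurrence of each row and compares either all columns or only the columns you name, which mirrors the column checkboxes in the dialogue box.

```python
def remove_duplicates(rows, key_columns=None):
    """Keep the first occurrence of each row.

    rows is a list of dicts (one dict per worksheet row). key_columns lists
    the columns to compare; None means compare every column.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        columns = key_columns if key_columns is not None else sorted(row)
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical student records; the third row repeats the first.
students = [
    {"name": "Asha", "roll_no": 1, "marks": 72},
    {"name": "Ben", "roll_no": 2, "marks": 58},
    {"name": "Asha", "roll_no": 1, "marks": 72},
]

cleaned = remove_duplicates(students)
for row in cleaned:
    print(row)
```

Passing key_columns=["name"] would treat two rows as duplicates whenever the names match, even if the other columns differ.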

In the next part of Excel Data Cleaning, you will understand data parsing from text to column.

Data Parsing from Text to Column


Sometimes, a single cell holds multiple data elements separated by a delimiter such as a comma.
For example, consider a column that stores address information.

The address column stores the street, district, state, and nation, with commas separating the
data elements. You must now split the street, district, state, and nation from the address column
into separate columns.

Excel's built-in "Text to Columns" feature can achieve this. Now, try an example of the same.

Here, you have the car manufacturer and the car model name separated by a space as the
delimiter. The tabular data is shown below.

Select the data, click on the Data tab in the ribbon, and then select "Text to Columns", as
shown below.

A new window will pop up on the screen, as shown below. Select the "Delimited" option and click
on "Next".

In the next dialogue box, you will see an option to select the type of delimiter your data has.
In this case, select "Space" as the delimiter, as shown below.

In the last dialogue box, select the column data format as "General", and then click on
"Finish", as shown in the following image.

The final resultant data will be available, as shown below.
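
The splitting that Text to Columns performs can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the function name text_to_columns and the sample car data are hypothetical, and the sketch splits on every occurrence of the delimiter, then pads short rows with empty strings so that every row has the same number of columns.

```python
def text_to_columns(values, delimiter=" "):
    """Split each text value on the delimiter into a list of column values."""
    split_rows = [value.split(delimiter) for value in values]
    # Pad shorter rows so that the result is rectangular, like a worksheet.
    width = max((len(parts) for parts in split_rows), default=0)
    return [parts + [""] * (width - len(parts)) for parts in split_rows]


# Hypothetical "manufacturer model" values separated by a space.
cars = ["Toyota Corolla", "Honda Civic", "Ford Mustang"]

for manufacturer, model in text_to_columns(cars):
    print(manufacturer, "|", model)
```

For the comma-separated address example, the same function can be called with delimiter=",".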

After data parsing, the next part of this Excel Data Cleaning tutorial shows how to delete all
formatting.

Delete All Formatting
Formatting can be as simple as coloring your cells and aligning the text in the cells. It can
also be a logical condition applied to your cells using Excel's conditional formatting option
from the Home tab.

When you wish to remove the formatting, you can do it in the following ways. First, try to
eliminate the regular formatting. The previous example used a data table of car manufacturers
and car models, with the heading cells colored in blue and the text centre aligned.

Now, use the clear option to remove the formats. Select the tabular data as shown below. Select
the "Home" tab and go to the "Editing" group in the ribbon. The "Clear" option is available in
the group, as shown below.

Select the "Clear" option and click on the "Clear Formats" option. This will clear all the
formats applied to the table.

The final data table will appear as shown below.

Now, you will learn how to eliminate conditional formatting in Excel. This time, consider a
different sheet: the student details sheet, which includes conditional formatting.

To eliminate conditional formatting in Excel, select the column or table with conditional
formatting, as shown below.

Then navigate to the "Home" tab and select "Conditional Formatting".

Then, in the drop-down menu, select the "Clear Rules" option. Here, you can choose to clear rules
only from the selected cells or from the entire sheet.

After you eliminate all conditions, the resultant table will look as follows.

You can also use a keyboard shortcut to remove the formatting, by pressing the following keys in
sequence.

ALT + E + A + F

Note that this is the key sequence for "Clear Formats", so it removes all formatting in the
selection, including conditional formatting.

Next, in this Excel Data Cleaning tutorial, you will learn about Spell Check.

Spell Check
Spell checking is available in MS Excel as well. To check the spelling of the words used in the
spreadsheet, select the data cell, column, or sheet where you want to perform the spell check.

Now, go to the Review tab and select "Spelling", as shown below.


Microsoft Excel will automatically suggest the correct spelling in the dialogue box, as shown
below. You can replace the words as per the requirement.

The final reviewed data table will look like the one below.

In the next segment of this Excel Data Cleaning tutorial, you will learn about changing the text
case.


Change Case - Lower/Upper/Proper


You can change the character case of the data in the Excel worksheet as per the requirements. To
apply case changes, follow the steps below.

Select the table or columns that need the case to be changed, as shown below.

Select the cell next to the column and apply the formula as per the requirement, as shown below.

=UPPER(cell address) - for Upper case conversion

=LOWER(cell address) - for Lower case conversion

=PROPER(cell address) - for Proper case conversion (the first letter of each word is capitalized)

Now, you can drag the cell's fill handle down to the last row to apply the formula to the whole
column, as shown below.

The final data table will appear as shown below.
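
As a rough illustration of what the three case functions produce, the following Python sketch applies the closest built-in string methods to a few hypothetical sample values. Python's str.title() is used here only as an approximation of PROPER; the two may differ on unusual inputs, so treat this as a sketch rather than an exact equivalent.

```python
# Hypothetical sample values standing in for a worksheet column.
names = ["jOHN smith", "MARY ann", "peter PARKER"]

upper_case = [name.upper() for name in names]   # comparable to =UPPER(cell)
lower_case = [name.lower() for name in names]   # comparable to =LOWER(cell)
proper_case = [name.title() for name in names]  # roughly comparable to =PROPER(cell)

for original, proper in zip(names, proper_case):
    print(f"{original!r} -> {proper!r}")
```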


Now that you have learned how to change the text case, in the upcoming section of Excel Data
Cleaning, you will learn how to Highlight Errors in an Excel spreadsheet.

Highlight Errors
Highlighting errors in an Excel spreadsheet helps you find or sort out erroneous data with ease.
You can highlight errors with the help of conditional formatting in Excel. Here, consider the
student data set as an example.

Imagine that you are interviewing the students and there is an eligibility criterion: you can
shortlist a student only if they have at least 60% aggregate marks. Now, apply conditional
formatting to separate the students who are eligible from those who are not.

First, select the aggregate/percentage column as shown below.

Select the "Home" tab, and in the Styles group, select "Conditional Formatting", as shown below.

In the conditional formatting menu, select the "Highlight Cells Rules" option, and in the next
drop-down, select the "Less Than" option, as shown below.

In the settings window, enter the aggregate threshold "60" in the value field and press OK.
Excel will now highlight the cells with an aggregate of less than 60 percent.

In the next part of Excel Data Cleaning, you will understand the TRIM function.
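
The "Less Than" rule from the Highlight Errors steps amounts to a simple threshold test on each value. The Python sketch below shows that logic under stated assumptions: the function name split_by_threshold, the column name "aggregate", and the student records are hypothetical. It separates the rows that the rule would highlight (below 60) from the eligible ones.

```python
def split_by_threshold(records, column, threshold):
    """Return (flagged, eligible): rows below the threshold, and the rest."""
    flagged = [record for record in records if record[column] < threshold]
    eligible = [record for record in records if record[column] >= threshold]
    return flagged, eligible


# Hypothetical student aggregates, in percent.
students = [
    {"name": "Asha", "aggregate": 72},
    {"name": "Ben", "aggregate": 58},
    {"name": "Chen", "aggregate": 60},
]

flagged, eligible = split_by_threshold(students, "aggregate", 60)
print("Highlighted (below 60):", [s["name"] for s in flagged])
print("Eligible:", [s["name"] for s in eligible])
```

A student with exactly 60 is not flagged, because the rule is strictly "less than".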

TRIM Function
The TRIM function removes leading and trailing spaces from the text in a cell and reduces
repeated spaces between words to a single space. Excess blank spaces make the data hard to
understand, and the "TRIM" function eliminates them. Note that TRIM targets the ordinary space
character; tab characters and other non-printing characters are not removed by TRIM, and the
CLEAN function is the usual tool for those.

Identify the data cells with excess blank spaces. Now, select a new cell adjacent to the first
of them.

Enter the TRIM function there, for example =TRIM(cell address), and drag the cell's fill handle
down to the last row, as shown below.

The final data after the elimination of the excess spaces is shown below.
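
To make the behaviour concrete, here is a small Python sketch that imitates TRIM on the ordinary space character. The function name excel_like_trim and the sample strings are hypothetical, and the sketch makes no claim to match Excel in every edge case: it strips leading and trailing spaces and collapses runs of spaces into one, and it deliberately leaves tab characters alone.

```python
import re


def excel_like_trim(text):
    """Strip outer spaces and collapse inner runs of spaces to one space."""
    without_outer_spaces = text.strip(" ")
    return re.sub(r" {2,}", " ", without_outer_spaces)


# Hypothetical cell values with excess spaces.
samples = ["  Hello   World  ", "Data    Cleaning", "NoExtraSpaces"]

for sample in samples:
    print(repr(sample), "->", repr(excel_like_trim(sample)))
```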

Next, in the Excel Data Cleaning tutorial, you will look at the Find and Replace function.

Find and Replace


Find and Replace helps you locate and replace data across the entire worksheet. Consider the
employee data example.

Here, find the employee named Joe and replace his first name with John.

The "Find and Replace" option is available on the Home tab, in the Editing group under "Find &
Select", as shown below.

Click on the option, and a new window will open, where you can enter the text to find and the
text to replace it with, as shown below.

Click on "Replace All", and it will replace the text. The final dataset will be as shown below.
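
The Replace All step can also be sketched in Python for a table held as a list of rows. The function name replace_all, its whole_cell parameter, and the employee rows are hypothetical. Unlike Excel's dialogue, which ignores case unless "Match case" is checked, this sketch is case-sensitive; the whole_cell flag plays a role similar to Excel's "Match entire cell contents" option.

```python
def replace_all(rows, find_text, replace_text, whole_cell=False):
    """Return a copy of rows with find_text replaced in every text cell."""
    result = []
    for row in rows:
        new_row = []
        for cell in row:
            if not isinstance(cell, str):
                new_row.append(cell)  # leave numbers and other types untouched
            elif whole_cell:
                new_row.append(replace_text if cell == find_text else cell)
            else:
                new_row.append(cell.replace(find_text, replace_text))
        result.append(new_row)
    return result


# Hypothetical employee rows: name, department, employee id.
employees = [["Joe", "Sales", 101], ["Ann", "HR", 102], ["Joel", "IT", 103]]

updated = replace_all(employees, "Joe", "John", whole_cell=True)
for row in updated:
    print(row)
```

With whole_cell=False, the substring match would also change "Joel" to "Johnl", which is why matching the entire cell is the safer choice when renaming a person.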
