
Basic Data Cleaning With Microsoft Excel v1.1


#Issue 1: Incomplete Data

Filter Blank Cells

1. With the Filter feature, you can easily find and remove blank cells in any column.
2. The Filter option is available in the Home tab.
3. Go to the Home tab → Editing group → Sort & Filter → Filter.

Select the column from which to remove blanks.

Go to the Editing group and click on Sort & Filter.

Select Filter.

Check the (Blanks) option and then press OK.

Fill or Delete

4. Once you have filtered for blanks, you can choose to delete the rows containing
blanks or fill the blanks with appropriate values.

Let's say the missing values for the following cells are as follows:

Cell    Missing value
A5      Jibril
A14     Mikail
A15     Firman
A21     Syahir
A22     Firash
A23     Syafiq

Basic Data Cleaning with Microsoft Excel Department of ICT, CFS IIUM 1
You may click on the desired cell and fill in the missing values.

5. If the missing values are unknown, you can right-click on the selected rows and
choose Delete Row to remove the entire rows.
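
The fill-or-delete decision above can also be sketched outside Excel. The Python snippet below is only an illustration under stated assumptions: a column is represented as a list, a blank cell is `None` or an empty string, and the function name `fill_or_delete` and the sample names other than Jibril are hypothetical.

```python
def fill_or_delete(column, known_values):
    """Fill blanks whose value is known; drop rows whose value is unknown.

    column: list of cell values, where index 0 is row 1.
    known_values: dict mapping a 1-based row number to its missing value.
    """
    cleaned = []
    for row_number, value in enumerate(column, start=1):
        is_blank = value is None or value == ""
        if not is_blank:
            cleaned.append(value)
        elif row_number in known_values:
            cleaned.append(known_values[row_number])  # fill the blank
        # a blank with no known value is skipped, i.e. the row is deleted
    return cleaned


# Row 5 is blank but known (A5 = Jibril); row 7 is blank and unknown.
names = ["Name", "Adam", "Idris", "Nuh", None, "Hud", None]
print(fill_or_delete(names, {5: "Jibril"}))
```

Here row 5 is filled and row 7 is removed, which mirrors steps 4 and 5.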

Go To Special

1. With the Go To feature, you can also easily select the blank cells in any column.
2. Select the table, column, or row in which you need to select the blank cells.
3. Press F5 or Ctrl + G to display the Go To dialog box.
4. Click on the Special button, select Blanks, and then press OK.

Select data.

Press F5 or Ctrl + G to open the Go To dialog box.

Click on the Special button.

Select the Blanks option.

Press the OK button.

Now, you can easily see that all blank cells are selected.
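
As a rough analogue of selecting blanks, the following Python sketch lists the addresses of the empty cells in one column. It is an illustration only: the helper name `blank_cells` and the sample data are hypothetical, and a cell that holds only spaces is deliberately not counted as blank here.

```python
def blank_cells(column_letter, values):
    """Return addresses such as 'A5' for the empty cells in one column.

    values: list of cell values, where index 0 is row 1.
    Only None and the empty string count as blank in this sketch.
    """
    return [
        f"{column_letter}{row}"
        for row, value in enumerate(values, start=1)
        if value is None or value == ""
    ]


column_a = ["Name", "Adam", None, "Idris", ""]
print(blank_cells("A", column_a))  # ['A3', 'A5']
```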

#Issue 2: Inaccurate Data
Spelling Checkers

1. Many software applications, including word processors and spreadsheet tools like
Microsoft Word or Excel, have built-in spelling checkers.
2. Use these tools to automatically identify and correct common misspellings.

Select data.

Go to the Review tab, find the Proofing group, and click on Spelling.

Click on the preferred Suggestion and then click on Change.

Repeat this process until all misspellings are corrected.

3. While spell checkers are effective at catching common spelling errors, they have
limitations and may not identify all types of mistakes. This is especially true for
errors related to context, grammar, or specific industry terminology.

Two more misspellings were not detected
by the spell checker.

4. It is generally a good practice to review or manually inspect the text after using a
spell checker.
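
One possible aid to manual inspection, which is not part of Excel's spell checker, is to compare each value against a list of terms known to be valid. The sketch below uses Python's standard `difflib` module; the function name `flag_unknown`, the cutoff of 0.8, and the sample place names are assumptions made for illustration.

```python
import difflib


def flag_unknown(values, valid_terms, cutoff=0.8):
    """Map each value not found in valid_terms to its closest valid term.

    The mapped value is None when no valid term is similar enough.
    """
    flagged = {}
    for value in values:
        if value in valid_terms:
            continue  # exact match, nothing to review
        matches = difflib.get_close_matches(value, valid_terms, n=1, cutoff=cutoff)
        flagged[value] = matches[0] if matches else None
    return flagged


valid = ["Gombak", "Kuantan", "Gambang"]
entries = ["Gombak", "Gombk", "Kuantan", "Pagoh"]
print(flag_unknown(entries, valid))  # {'Gombk': 'Gombak', 'Pagoh': None}
```

A suggestion from this kind of check is only a hint; each flagged value still needs a human decision.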

#Issue 3: Inconsistent Data
1. Ensure that formats, such as date and currency formats, are consistent within each
column.
2. Look out for variations such as different date formats (MM/DD/YYYY,
DD/MM/YYYY) or currency symbols ($, €, £, RM).

Before formatting process

Challenge: Try to recall how to format dates and currency.

After formatting process
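
To illustrate what a consistent date format means, the Python sketch below converts dates from a known source format into DD/MM/YYYY. It is only a sketch: the target format and the function name `normalise_date` are assumptions, and the source format must be known in advance because a value such as 03/04/2023 is ambiguous between MM/DD/YYYY and DD/MM/YYYY.

```python
from datetime import datetime


def normalise_date(text, source_format):
    """Parse text in a known source format and return it as DD/MM/YYYY."""
    parsed = datetime.strptime(text.strip(), source_format)
    return parsed.strftime("%d/%m/%Y")


# A MM/DD/YYYY value becomes DD/MM/YYYY.
print(normalise_date("12/31/2023", "%m/%d/%Y"))  # 31/12/2023
# A value already in DD/MM/YYYY is unchanged.
print(normalise_date("31/12/2023", "%d/%m/%Y"))  # 31/12/2023
```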

#Issue 4: Duplicate Data
Remove Duplicates

1. Microsoft Excel has a built-in feature called Remove Duplicates that removes
duplicate values from tables.
2. This tool is found under the Data tab.
3. To use this tool, first select the row, column, or table from which you need to
remove the duplicate data.
4. Then, go to the Data tab → Data Tools group → Remove Duplicates.

Select the data from which you need to remove duplicate entries.

Go to the Data tab, find the Data Tools group, and click on Remove Duplicates.

In the Remove Duplicates dialog box, check the
box My data has headers if your data includes
headers.

Press OK to proceed.

After removing duplicates in Excel, you will get a message showing how
many duplicates were removed and how many unique values are left.
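
The idea of keeping the first occurrence of each row and reporting the counts can be sketched in Python as follows. This is an illustration, not a description of how Excel works internally: the function name `remove_duplicates` and the sample marks are hypothetical, and rows are compared exactly (including letter case), which may differ from Excel's own matching rules.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row.

    Returns (unique_rows, removed_count).
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # tuples are hashable, so they can go into a set
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique, len(rows) - len(unique)


records = [("Jibril", 80), ("Mikail", 75), ("Jibril", 80)]
unique, removed = remove_duplicates(records)
print(f"{removed} duplicate values removed; {len(unique)} unique values remain.")
```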

#Issue 5: Unstructured Data
Text to Columns

1. With Excel's Text to Columns feature, you can separate text into different columns
based on delimiters or a fixed width.
2. There are two options for using Text to Columns in Excel.
3. The first option, which this tutorial will focus on, is to separate the text using a
delimiter. A delimiter is a specific character that marks the boundaries between
different pieces of data. Common delimiters include tabs, commas, semicolons, and
spaces.
4. The second option uses a predefined fixed width to separate text into adjacent
columns.
5. To access the Text to Columns feature, go to the Data tab → Data Tools group →
Text to Columns.

Select Data.
Go to the Data tab, find the Data Tools
group, and click on Text to Columns.

Select the Delimited option.

Check the Comma option.

Select Destination cell.

6. After the Text to Columns operation is complete, you may need to adjust the column
width to ensure that the data is displayed correctly.

Before modifying process

7. Examine the headers of the newly created columns to ensure they reflect the
content accurately. If needed, modify or add headers to provide clarity.

After modifying process

8. Carefully review the entire worksheet to ensure the split data is accurate and well-
organized.
9. Save your work.
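
Splitting on a delimiter, as described in the steps above, can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `text_to_columns` and the sample records are hypothetical, and quoted fields that contain the delimiter are not handled.

```python
def text_to_columns(lines, delimiter=","):
    """Split each line of text into a list of fields at the delimiter."""
    return [line.split(delimiter) for line in lines]


comma_data = ["Jibril,ICT,2023", "Mikail,ICT,2024"]
print(text_to_columns(comma_data))
# [['Jibril', 'ICT', '2023'], ['Mikail', 'ICT', '2024']]

# The same function handles other delimiters, such as semicolons.
semicolon_data = ["Firman;ICT;2023"]
print(text_to_columns(semicolon_data, delimiter=";"))
# [['Firman', 'ICT', '2023']]
```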

Messy data with various delimiters

A messy dataset with semicolons

A messy dataset with spaces

A messy dataset with tabs

TRIM
1. With the TRIM function in Excel, you can remove all unnecessary spaces from text:
leading spaces, trailing spaces, and extra spaces between words. Single spaces
between words are kept.
2. This makes the data more readable and easier to work with.
3. The syntax of the TRIM function is:

=TRIM(text)
or

=TRIM(cell)

The syntax parameters are as follows:

text: Any text or string from which you need to remove leading, trailing, and double
spaces.

cell: A cell reference (cell address) whose content needs unnecessary spaces removed
at the beginning, at the end, and between words.

TRIM formula with cell reference.

TRIM formula with text reference.
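
The behaviour described for TRIM can be mimicked in Python to make the rule concrete. The helper name `excel_like_trim` is hypothetical, and the sketch only handles the ordinary space character; tabs and other whitespace are left untouched.

```python
def excel_like_trim(text):
    """Remove leading, trailing, and repeated spaces, keeping single spaces."""
    # Splitting on a single space yields empty strings wherever spaces repeat;
    # dropping those and rejoining leaves exactly one space between words.
    return " ".join(part for part in text.split(" ") if part)


print(repr(excel_like_trim("  Basic   Data  Cleaning ")))  # 'Basic Data Cleaning'
```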

CLEAN
1. With the CLEAN function in Excel, you can remove line breaks and all non-printable
characters from text.
2. The CLEAN function returns the text value or string after removing non-printable
characters and line breaks.
3. This makes the data easier to read and work with.
4. The syntax of the CLEAN function is:

=CLEAN(text)
or

=CLEAN(cell)
The syntax parameters are as follows:

text: Any text or string from which you need to remove line breaks and all
non-printable characters.

cell: A cell reference (cell address) whose content needs line breaks and all
non-printable characters removed.

5. Non-printable characters, also known as control characters or unprintable
characters, are characters that do not have a visible representation when printed
or displayed. These characters typically have a specific function in controlling the
formatting or behaviour of a document or text, but they are not intended to be
shown or printed.
6. Common examples of non-printable characters include:

• Line Breaks: Characters that indicate the end of a line and the beginning of
a new one. They are necessary for formatting text but are not visible on the
printed page.
• Tab Characters: Used to create horizontal spacing or indentation in a
document.

• Control Characters: Various characters with ASCII values below 32 (such as ESC
and BEL), which are used for controlling hardware devices or communication
protocols.

CLEAN formula with cell reference.

CLEAN formula with text reference.
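
The rule described for CLEAN, removing characters with ASCII values below 32, can be sketched in Python. The helper name `excel_like_clean` is hypothetical, and the sketch covers only that range of characters.

```python
def excel_like_clean(text):
    """Remove characters with code points 0 to 31, such as line breaks and tabs."""
    return "".join(ch for ch in text if ord(ch) >= 32)


# "\n" (line break), "\t" (tab), and "\x07" (BEL) are all below 32.
print(repr(excel_like_clean("Line1\nLine2\t!\x07")))  # 'Line1Line2!'
```

Ordinary spaces (code 32) are kept, so extra spaces still need TRIM-style handling.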
