
Lab5 - Prep.

The document emphasizes the importance of data quality and manipulation in data projects, highlighting that clean data is essential for accurate analysis. It provides guidance on using Excel functions for data cleaning, such as removing duplicates and formatting text, and outlines data validation rules to ensure correct data entry. Additionally, it instructs on locking formula cells to prevent unauthorized changes.

Uploaded by

nurmeenzahid

Data Quality

So far we have worked with many different types of data files and analyzed them, but we have not yet
dealt with questions of data quality and data manipulation. Data quality is perhaps one of the
most important aspects of a data project. Ensuring your data is in the right format is key to
performing any analysis. If the data isn’t right, the analysis most certainly won’t be.

In our previous labs, we always had clean data: we assumed it had no issues and that it was
already in a good format, ready to be analyzed and used. This may not always be the
case. In real-world scenarios, you may not get a neat, clean workbook to start from.
Instead, you will have to filter out the relevant data, clean it to make sure there are no issues, and
combine it from multiple sources to get the right dataset before you can even start your work.

So let’s spend some time practicing these concepts in Excel within this Lab.

Skills & Functionality


In Excel, open the “Formulas” tab and click the “Text” drop-down button (the yellow book with an A
on it). Hovering the mouse over a function shows a description of it.
Functions that may come in handy are TRIM, UPPER, PROPER, RIGHT, LEFT and ‘&’. You can also use
Google and/or Microsoft’s website (support.microsoft.com) to work out a solution to your problem.
Also check out:
Data (tab) -> Data Tools -> Remove Duplicates, Text to Columns, Data Validation,
Data (tab) -> Sort & Filter -> Advanced.
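
To make clear what Text to Columns and Remove Duplicates do to the data, here is a minimal Python sketch of the same two ideas. It is not how Excel implements them. The function names and the sample rows are made up for illustration, and the duplicate check here is an exact match on every column, which is an assumption rather than a description of Excel's own matching rules.

```python
def text_to_columns(cell, delimiter=","):
    """Split one delimited cell into separate column values."""
    return cell.split(delimiter)


def remove_duplicates(rows):
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data: three delimited cells, one of them repeated.
raw_cells = [
    "1001,Muhammad Ahmed,Karachi",
    "1002,Sara Khan,Lahore",
    "1001,Muhammad Ahmed,Karachi",
]
rows = [text_to_columns(cell) for cell in raw_cells]
clean_rows = remove_duplicates(rows)
```

After this runs, clean_rows holds two rows, because the third cell repeats the first.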

Lock the formulas in a cell so that another user cannot change them. A reference video on how to
achieve this is given here: https://trumpexcel.com/lock-formulas-excel/

Text Functions
Check out the ‘Remove Duplicates’ command on the Data tab of the Excel ribbon.
The ‘Text to Columns’ command can also be found on the Data tab.
Find a text function that capitalizes the first letter of each word in a name, e.g. muhammad
ahmed -> Muhammad Ahmed.
Find a text function that converts all the letters to upper case.
Find a text function that displays only the last 4 letters of a word.
Find a way to concatenate words from different columns into one text value.
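
As a cross-check on what each text function should produce, the sketch below mirrors the behaviour of TRIM, PROPER, UPPER, RIGHT and ‘&’ in plain Python. The Python function names are illustrative only, and the match with Excel is approximate: for example, Python's title() is assumed here to be close enough to PROPER for simple names.

```python
def trim(text):
    """Like TRIM: drop leading/trailing spaces and collapse runs of spaces."""
    return " ".join(part for part in text.split(" ") if part)


def proper(text):
    """Like PROPER: capitalize the first letter of each word."""
    return text.title()


def upper(text):
    """Like UPPER: convert every letter to upper case."""
    return text.upper()


def right(text, count):
    """Like RIGHT: return the last `count` characters of the text."""
    return text[-count:] if count > 0 else ""


def join_columns(*cells, separator=" "):
    """Like chaining cells with '&': combine several values into one."""
    return separator.join(cells)


# The example name from the lab text, with some stray spaces added.
cleaned_name = proper(trim("  muhammad   ahmed "))
```

Here cleaned_name ends up as "Muhammad Ahmed", matching the example in the list above.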

Use the random-number functions (see Excel help for the different types) to generate random numbers.
Check out the Data Validation options on the Data tab.

1. Data validation rules


Implement the following data validation rules on the columns to stop data entry staff from entering
incorrect data.

- Order Date should only accept dates in January 2024. Add an error message saying: “Please
  enter dates for January 2024 only”
- Customer Name should be text only. Add an appropriate error message.

Also find out how to view only the unique values in a column.
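
The logic behind the two validation rules can be sketched in Python as follows. This only illustrates the checks and does not configure Excel. The Order Date message is the one given above; the Customer Name message and the reading of "text only" as "a non-empty string containing no digits" are assumptions made for this sketch.

```python
from datetime import date

ORDER_DATE_ERROR = "Please enter dates for January 2024 only"
# Hypothetical wording: the lab leaves the choice of this message to you.
CUSTOMER_NAME_ERROR = "Please enter text only for Customer Name"


def check_order_date(value):
    """Return None if value is a date in January 2024, else the error message."""
    if isinstance(value, date) and value.year == 2024 and value.month == 1:
        return None
    return ORDER_DATE_ERROR


def check_customer_name(value):
    """Return None if value is a non-empty string without digits (assumed rule)."""
    if (
        isinstance(value, str)
        and value.strip()
        and not any(ch.isdigit() for ch in value)
    ):
        return None
    return CUSTOMER_NAME_ERROR
```

Each function returns None for an accepted entry and the error message for a rejected one, which corresponds to Excel either accepting the entry or showing the error alert.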

2. Locking cells

Try locking the formula cells.


A reference video on how to achieve this is given here:
https://trumpexcel.com/lock-formulas-excel/
The result should be that if you try to type anything in the Product ID column, Excel
gives you an error.
Cell locking can be used for many other purposes. Feel free to experiment on your own.
