Lab - Interpret Visualizations with Respect to Outliers

Objectives
In this lab, charts and functions will be used to detect data outliers.
Part 1: Examine a Dataset for Outliers

Background / Scenario
An outlier is a value or data point that varies significantly from others in the same dataset. An outlier can result from variability in the measurements, experimental errors, or human error in entering the data. To make sure that any data analysis is correct, outliers need to be identified, and then it must be determined how best to treat them.

Required Resources
• Mobile device or PC/laptop with a browser, Microsoft 365 Excel online, and internet access

Note: The precise steps to format and manipulate data in Excel can vary between platforms and versions. The instructions in this lab are based on the free version of Excel available from Office.com and may have to be modified to match the platform or version used to achieve the results shown in this lab.

Instructions

Part 1: Examine a Dataset for Outliers

Step 1: Open the data set.
a. Download the file Bike Sales_Outlier_Lab.xlsx.
b. Upload the file to your OneDrive and open it in MS 365 Excel online.

Step 2: Use a Pivot Table to Select Data for Analysis
a. Click any cell in the Bike Sales worksheet.
b. Insert a pivot table by clicking Insert > PivotTable. Check that New Worksheet is selected in the Create PivotTable dialog box and click OK. This adds a new worksheet for the pivot table.
c. In the PivotTable Fields dialog box, check the Date and Order_Quantity fields. The pivot table is created with two columns: Date and Sum of Order_Quantity.

Step 3: Sorting Data to Find Outliers
One way to identify outliers is simply to sort the data. This method works with small data sets where the data is easily scanned.
a. Sort the Sum of Order_Quantity column from high to low.
   1. Select the data points in the Sum of Order_Quantity column. (Do not select the Grand Total or the column header.)
   2. Click Sort & Filter > Sort Descending. This sorts the Order_Quantity data points from highest to lowest.

Which December date had the largest sales quantity? What was the sales quantity?
Answer Area: Type your answers here.

Review the data in the Bike Sales worksheet for December 19th. Which entry contributes most to the Sum of Order_Quantity in the pivot table? In other words, which order number is most responsible for the outlier?
Answer Area: Type your answers here.

Step 4: Use a Scatter Chart to Find Outliers
A scatter chart can help to identify outliers, especially in larger datasets.
a. Return to the worksheet containing the pivot table (Sheet).
b. Copy and paste the data from the pivot table into two blank columns (D and E). Copy the header row with the data, but do not copy the Grand Total row. Excel will not allow creation of a scatter plot from data in a pivot table, so the data must be moved to other columns.
c. Insert a scatter plot.
   1. Select all cells in the copied data and use Sort & Filter to sort them in ascending order.
   2. Highlight the Sum of Order_Quantity column in the copied data.
   3. Click Insert > Scatter and then select the top left scatter plot in the dropdown list.
      Note that the scatter chart makes the sales for December 19th stand out clearly as an outlier from the other order quantity data points, as shown below.
      [Figure: scatter chart of Sum of Order_Quantity]
   4. Delete the scatter plot.

Step 5: Using the LARGE and SMALL Functions to Find Outliers
If there is a lot of data, the LARGE and SMALL functions can be used to extract the largest and smallest values, which can help to show whether there are any outliers. For this example, the Date column is column D and the Sum of Order_Quantity column is column E. The columns in your worksheet may be different, so adjust the cell references in your functions accordingly.
[Screenshot: Date and Sum of Order_Quantity columns]
a. In an empty cell, enter the function =LARGE($E$4:$E27, 1). This function looks at the entries from cell E4 through E27 and returns the highest value.
What value was returned?
Answer Area: Type your answers here.

b. To get the highest 5 values, modify the function to =LARGE($E$4:$E27, ROW($1:5)). This returns the highest five values. To return more values, change the "5" at the end of the function to the number of values you would like returned.

What function would return the lowest 6 values?
Answer Area: Type your answers here.

Once outliers are identified, the next challenge is what to do with them. Outliers may indicate errors in the data, or they may be valid data that needs to be investigated to find out why it appears to be an anomaly. There are a couple of ways in which a data analyst can deal with outliers.
1. Delete them. In a large dataset, deleting a few outliers will likely not impact the overall analysis. However, it is important to create a copy of the data so you can research what was causing the outliers in the first place. In this example, row 72 in the Bike Sales dataset could be deleted.
2. Normalize them (adjust their value). The value of each outlier is changed to be slightly above the maximum of the remaining, non-outlier values in the dataset. This is a good method if it will not skew the data. There are a number of statistical methods to normalize data. Research the various methods rather than adjusting data values arbitrarily. In the example Bike Sales dataset, the December 19th Order_Quantity could be changed from 43 to 20 so it is just above the maximum value of 19.

Reflection Questions
List the factors that could determine whether data outliers should or should not be considered in the final analysis of a dataset.
Answer Area: Type your answers here.

Challenge Activity
Consider scenarios where data outliers could have significant impact on the final data analysis if excluded from consideration.

© 2017 - 2023 Cisco and/or its affiliates. All rights reserved. Cisco Public
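For readers who want to check the same ideas outside Excel, the sorting, largest/smallest, and outlier-treatment steps described in this lab can be sketched in Python. This is a minimal illustration and not part of the lab itself: the quantities below are made-up values (only 43 and 19 echo the figures mentioned in the text), and the helper names largest, smallest, drop_outliers, and cap_outliers are hypothetical.

```python
import heapq

# Hypothetical order quantities, NOT the lab's Bike Sales data.
# 43 plays the role of the outlier and 19 the next-largest value.
quantities = [4, 7, 19, 3, 43, 12, 8, 15, 6, 10]


def largest(values, k):
    """Return the k largest values (analogous to Excel's LARGE function)."""
    return heapq.nlargest(k, values)


def smallest(values, k):
    """Return the k smallest values (analogous to Excel's SMALL function)."""
    return heapq.nsmallest(k, values)


def drop_outliers(values, outliers):
    """Treatment 1: build a new list without the outliers.

    The input list is left untouched, so the original data stays
    available for researching what caused the outliers.
    """
    return [v for v in values if v not in outliers]


def cap_outliers(values, outliers):
    """Treatment 2: replace each outlier with (max of the other values) + 1."""
    ceiling = max(v for v in values if v not in outliers) + 1
    return [ceiling if v in outliers else v for v in values]


# Sorting high to low, as in Step 3, puts the suspicious value first.
descending = sorted(quantities, reverse=True)

top5 = largest(quantities, 5)
bottom3 = smallest(quantities, 3)
deleted = drop_outliers(quantities, {43})
capped = cap_outliers(quantities, {43})

print("sorted descending:", descending)
print("five largest:     ", top5)
print("three smallest:   ", bottom3)
print("outlier deleted:  ", deleted)
print("outlier capped:   ", capped)
```

In this sketch the outlier is passed in explicitly because the lab identifies it by inspection. Capping at one above the largest remaining value mirrors the lab's example of changing 43 to 20, and it is only one of several possible adjustment methods.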