
Outliers in Machine Learning - Google Search

Outliers in machine learning are data points that significantly deviate from the general pattern, which can arise from errors or rare events. They are important as they can negatively impact model accuracy and lead to misleading insights. Handling outliers involves detection methods like visualization and statistical tests, and strategies such as removal, capping, imputation, or transformation.

Uploaded by viji


AI Overview

In machine learning, outliers are data points that deviate significantly from the general pattern or cluster of other observations, potentially indicating errors or unusual occurrences. Understanding and handling outliers is crucial for building accurate and reliable models.

Here's a breakdown of key aspects:

1. What are Outliers?

Definition:
Outliers are data points that differ significantly from the majority of the data, being either much larger or much smaller than the other values.
Causes:
They can arise for various reasons, including data entry errors, sensor malfunctions, or genuine rare events.
Types:

Global Outliers: Deviate significantly from the entire dataset.
Contextual Outliers: Anomalous only in a specific context (e.g., a high temperature in winter).
Collective Outliers: A group of data points that together behave differently from the rest, even if no single point looks extreme on its own.

2. Why are Outliers Important?

Model Accuracy:
Outliers can degrade the performance of machine learning models, especially those sensitive to extreme values (e.g., linear regression, k-nearest neighbors).
Skewed Results:
Outliers can distort the patterns and relationships that a model learns, leading to inaccurate predictions.
Misleading Insights:
Models trained on data that contains unhandled outliers may produce misleading insights or recommendations.
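
The sensitivity of linear regression mentioned above can be illustrated with a small sketch. The data here is synthetic and chosen only for illustration (a perfect line y = 2x + 1 with one corrupted value), and the helper name fit_slope is hypothetical.

```python
import numpy as np


def fit_slope(x, y):
    """Return the ordinary least-squares slope of y regressed on x."""
    slope, _intercept = np.polyfit(x, y, deg=1)
    return slope


# Synthetic data (assumption): ten points lying exactly on y = 2x + 1.
x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0

# Corrupt a single observation to simulate an outlier.
y_outlier = y_clean.copy()
y_outlier[-1] = 100.0

slope_clean = fit_slope(x, y_clean)
slope_outlier = fit_slope(x, y_outlier)

print(f"slope without outlier: {slope_clean:.2f}")
print(f"slope with one outlier: {slope_outlier:.2f}")
```

On this toy data a single corrupted point moves the fitted slope from 2 to roughly 6.4, because least squares penalizes large residuals quadratically.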

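3. How to Handle Outliers?

Detection:

Visualization: Use plots (e.g., box plots, scatter plots) to identify potential outliers.
Statistical Tests: Apply a rule such as the Z-score or the Interquartile Range (IQR) to flag outliers. Common conventions flag points with |z| > 3, or points lying more than 1.5 × IQR below the first quartile or above the third quartile.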
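
The two statistical rules above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names, the sample data, and the default thresholds (3 for the Z-score, 1.5 for the IQR multiplier) are assumptions based on common convention.

```python
import numpy as np


def zscore_outliers(values, threshold=3.0):
    """Boolean mask of points whose absolute Z-score exceeds threshold."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # All values identical: nothing can be an outlier.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > threshold


def iqr_outliers(values, k=1.5):
    """Boolean mask of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Illustrative data (assumption): 19 ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 13,
        12, 10, 11, 12, 13, 11, 12, 10, 11, 95]

print("Z-score flags indices:", np.flatnonzero(zscore_outliers(data)).tolist())
print("IQR flags indices:", np.flatnonzero(iqr_outliers(data)).tolist())
```

Both rules flag only the last value here. The Z-score rule tends to be less reliable on very small samples, since the outlier itself inflates the mean and standard deviation used to score it; the IQR rule relies on quartiles and is less affected.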

Handling:

Removal: Remove outliers if they are deemed to be errors or irrelevant to the analysis.
Capping: Replace values beyond a chosen threshold or limit with that threshold.
Imputation: Replace outliers with a more appropriate value (e.g., the mean, the median, or a value predicted by a model).
Transformation: Apply a transformation to the data to reduce the impact of outliers (e.g., a logarithmic transformation).
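
The four handling strategies above might look like this in practice. This is a sketch under stated assumptions: the function names are hypothetical, the IQR fences with a 1.5 multiplier are used as the outlier threshold in every case, the median of the remaining points is used for imputation, and the log transform assumes non-negative data.

```python
import numpy as np


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) IQR fences used as thresholds below."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Removal: drop points outside the fences."""
    values = np.asarray(values, dtype=float)
    lower, upper = iqr_bounds(values, k)
    return values[(values >= lower) & (values <= upper)]


def cap_outliers(values, k=1.5):
    """Capping: clip points to the nearest fence."""
    values = np.asarray(values, dtype=float)
    lower, upper = iqr_bounds(values, k)
    return np.clip(values, lower, upper)


def impute_median(values, k=1.5):
    """Imputation: replace outliers with the median of the remaining points."""
    values = np.asarray(values, dtype=float)
    lower, upper = iqr_bounds(values, k)
    mask = (values < lower) | (values > upper)
    result = values.copy()
    result[mask] = np.median(values[~mask])
    return result


def log_transform(values):
    """Transformation: log(1 + x) compresses large values (needs x >= 0)."""
    return np.log1p(np.asarray(values, dtype=float))


# Illustrative data (assumption): eight ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]

removed = remove_outliers(data)
capped = cap_outliers(data)
imputed = impute_median(data)
logged = log_transform(data)

print("removed:", removed)
print("capped: ", capped)
print("imputed:", imputed)
print("logged: ", np.round(logged, 2))
```

Removal shrinks the dataset, whereas capping and imputation keep its size, and the log transform keeps every point while narrowing the gap between the extreme value and the rest. Which one is appropriate depends on whether the outlier is an error or a genuine rare event.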
