Unsupervised Machine Learning - Dealing With Unknown Data

The document discusses unsupervised machine learning and how it can be used to analyze and categorize unlabeled data. Unsupervised learning uses clustering and other algorithms to group unlabeled data based on similarities and discover hidden patterns in the data. Dimension reduction algorithms are also used to reduce the number of attributes for unlabeled data to speed up modeling and improve performance. Semi-supervised learning combines both labeled and unlabeled data for training models.

Uploaded by

suryanarayana


16/04/2021 Unsupervised machine learning: Dealing with unknown data

SearchEnterpriseAI

Unsupervised machine learning: Dealing with unknown data

Learn how machine learning works when dealing with unclassified, unlabeled data sets and how, using certain algorithms and other practices, the system can
learn on its own.

By Arcitura Education Published: 05 Mar 2021

GUEST CONTRIBUTOR

https://searchenterpriseai.techtarget.com/post/Unsupervised-machine-learning-Dealing-with-unknown-data?offer=ML_series 1/8

The following article consists of excerpts from the course "Fundamental Machine Learning," which is part of the Machine Learning Specialist certification
program from Arcitura Education. It is the third part of the 13-part series, "Using machine learning algorithms, practices and patterns."

With unsupervised learning, the algorithm and model are subjected to "unknown" data -- that is, data for
which no previously defined categories or labels exist. When data is unknown, the machine learning
system must teach itself to classify the data. It accomplishes this by processing the unlabeled data with
special algorithms to learn from its inherent structure (Figure 1).

Most of the time, data that is used in unsupervised learning is not historical data. For example,
unsupervised learning can be used in healthcare to create a model that categorizes the results of
different tests so that abnormal situations or results can be identified quickly. The model can learn from
different features of X-ray images or blood test results to categorize future tests or scans.

In unsupervised machine learning, clustering is the most common process used to identify and group similar entities or items. Its aim is to find
similarities among data points and to place similar data points in the same group.

Figure 1. Unknown data is categorized by the system; an analyst then reviews the


For example, the learning model identifies high-risk customers by determining which customers spend more than a certain amount, or spend more than a
certain number of times, in casinos or on gambling websites; it then places them in a group accordingly (Figure 2).

Grouping similar data points helps to create a more accurate profile and attributes for different groups. Clustering can also be used to reduce the
dimensionality of the data when there are significant amounts of data.

Illustration of results of a machine learning clustering process

Figure 2. Clustering is a machine learning process used to sort large groups into sets with shared characteristics.

Categorization can further identify the featured data that is needed, and another process can then extract the featured data. For example, clustering
can be used to group and identify certain data points to represent different social interactions with the profile of a social media influencer,
such as likes, dislikes, shared posts and comments.

The hypothetical toy company, introduced in Part 2, continues to look for ways to gain further insights into its customer base. It sends an online survey to all
of its customers, asking them to fill out a questionnaire about their preferences regarding the types of toys they enjoy buying for their families and how
much they prefer to spend on toys each year. The toy company gets a good response, primarily because it includes the promise that all customers who
complete the survey will be entered into a raffle for a series of high-end prizes.

The company uses a clustering algorithm to mine the database in which survey results are recorded. The algorithm looks for common responses and
compares those against common characteristics of the customer profiles. Doing so results in potentially useful groups or clusters of data.

After the clustering process is completed, the following new data clusters are discovered and characterized by the analyst:

Cluster A: Customers who have historically paid by credit card are more likely to spend more on toys each year than those who usually pay by cash.

Cluster B: Customers who have three or more children are more likely to purchase outdoor toys priced at over $100 than those who have fewer
children.

The toy company adds a new class label to each customer record (based on its cluster membership) as further input for future model building using
classification algorithms.
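The clustering and labeling steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the clustering algorithm the toy company uses, so plain k-means stands in for it, and the customer records, the field names (`annual_spend`, `children`, `cluster`) and the values are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-first seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical survey records: yearly toy spend (hundreds of dollars) and number of children.
customers = [
    {"id": 1, "annual_spend": 1.2, "children": 1},
    {"id": 2, "annual_spend": 1.5, "children": 1},
    {"id": 3, "annual_spend": 1.0, "children": 2},
    {"id": 4, "annual_spend": 8.0, "children": 3},
    {"id": 5, "annual_spend": 8.5, "children": 4},
    {"id": 6, "annual_spend": 9.0, "children": 3},
]

points = [(c["annual_spend"], c["children"]) for c in customers]
labels, centroids = kmeans(points, k=2)

# Add the cluster membership to each record as a new class label.
for customer, label in zip(customers, labels):
    customer["cluster"] = chr(ord("A") + label)

print([(c["id"], c["cluster"]) for c in customers])
```

The algorithm only produces the groups; describing what each group means (as the analyst does for Cluster A and Cluster B) remains a manual step. The new `cluster` field can then serve as a class label for a later classification model.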

Dimension reduction algorithms


Dimension reduction algorithms are used to decrease the number of characteristics or attributes in data sets so that the data generated is more relevant to
the problem being solved, and less difficult to visualize and understand. Reducing dimensions further helps reduce the amount of space required for storing
data sets and can also improve performance, as data sets are trimmed down and optimized, thereby decreasing the time required to perform computations.
Dimension reduction algorithms exist for both supervised and unsupervised learning.

Our hypothetical toy company, when carrying out classification and regression algorithms, has been using a standard set of characteristics about
customers, including:

geographic location

age group

average transaction amount

transaction frequency

frequency of returns

types of toys purchased

In an attempt to reduce the number of factors (features) taken into consideration when each model is trained, the toy company attempts to reduce the
quantity of these characteristics (dimensions) to only those most relevant and valuable to its machine learning analysis goals.

The company deploys a dimension reduction algorithm for this purpose. Running the algorithm shows that the age group and frequency of returns
values add negligible value to the typical analysis results, so they are dropped from further classification and regression processing. The remaining features
are used in subsequent model development because they have higher predictive potential.
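One simple way to picture this kind of feature pruning is a variance threshold filter, which drops features whose values barely change from one record to the next. The article does not say which dimension reduction algorithm the toy company runs, so the sketch below is an assumed stand-in; the feature names, the numeric codes, the values and the threshold are hypothetical, and only numeric features are shown.

```python
from statistics import pvariance


def keep_informative(rows, threshold):
    """Return the names of features whose variance exceeds the threshold."""
    kept = []
    for name in rows[0]:
        values = [row[name] for row in rows]
        if pvariance(values) > threshold:
            kept.append(name)
    return kept


# Hypothetical customer features (age_group is a numeric code).
rows = [
    {"avg_transaction": 20, "transaction_frequency": 2, "age_group": 2, "return_frequency": 0},
    {"avg_transaction": 35, "transaction_frequency": 4, "age_group": 2, "return_frequency": 0},
    {"avg_transaction": 50, "transaction_frequency": 6, "age_group": 2, "return_frequency": 1},
    {"avg_transaction": 65, "transaction_frequency": 8, "age_group": 2, "return_frequency": 0},
    {"avg_transaction": 80, "transaction_frequency": 10, "age_group": 2, "return_frequency": 0},
    {"avg_transaction": 95, "transaction_frequency": 12, "age_group": 3, "return_frequency": 0},
]

kept = keep_informative(rows, threshold=1.0)
print(kept)
```

Raw variance depends on the unit of each feature, so in practice the features would usually be brought to a comparable scale first, or a different technique (for example, principal component analysis) would be used instead.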

Semi-supervised learning
Semi-supervised learning is a hybrid approach that combines aspects of supervised and unsupervised learning. Commonly, semi-supervised learning is
carried out with a smaller volume of labeled historical data that is combined with a quantity of unlabeled (unknown) data. These two types of data are combined
to form the training data used to train a model. Essentially, the labeled data establishes base labels and categories that are used as a starting point for the
algorithm to process related unlabeled data.


This approach is often necessary when it is considered too time-consuming and expensive to collect, pre-process and label large amounts of historical
training data.
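The idea of labeled data acting as a starting point can be illustrated with self-training, one common semi-supervised technique. The sketch below is an assumption rather than the method taught in the course: it uses a nearest-centroid classifier, accepts every pseudo-label in each round (real systems often keep only confident ones), and all points and label names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Mean position of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def self_train(labeled, unlabeled, rounds=5):
    """labeled: list of (point, label) pairs; unlabeled: list of points."""
    training = list(labeled)
    pseudo = []
    for _ in range(rounds):
        # Fit: one centroid per label from everything labeled so far.
        by_label = {}
        for point, label in training:
            by_label.setdefault(label, []).append(point)
        centroids = {label: centroid(pts) for label, pts in by_label.items()}
        # Predict: each unlabeled point takes the label of its nearest centroid.
        pseudo = [
            (p, min(centroids, key=lambda lab: dist(p, centroids[lab])))
            for p in unlabeled
        ]
        # Retrain next round on the true labels plus the pseudo-labels.
        training = list(labeled) + pseudo
    return pseudo


# Two labeled customers establish the base categories (hypothetical data).
labeled = [((1.0, 1.0), "low_spender"), ((9.0, 9.0), "high_spender")]
unlabeled = [(1.5, 2.0), (2.0, 1.0), (8.0, 8.5), (9.5, 8.0), (4.0, 3.5)]

pseudo_labels = self_train(labeled, unlabeled)
print(pseudo_labels)
```

Only two records are labeled by hand here; the remaining five receive their labels from the model, which is the cost saving that motivates the approach.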

Reinforcement learning
Reinforcement learning is a learning method in which an agent interacts with its environment by producing actions and discovering errors or rewards. Trial-and-error
searches and delayed rewards are the most relevant characteristics of reinforcement learning. This method allows machines and software agents to
automatically determine the ideal behavior within a specific context in order to maximize their performance.

In other words, reinforcement learning uses a trial-and-error model to teach the machine the behaviors it needs in order to make the expected
decisions. Reinforcement learning is used in robotics, gaming and self-driving cars.
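Trial and error with a delayed reward can be shown with tabular Q-learning on a tiny problem. The setup below is a hypothetical sketch, not an example from the course: an agent walks along a five-cell corridor, is rewarded only when it reaches the last cell, and the learning rate, discount factor and exploration rate are arbitrary choices.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; the reward is 1.0 only when the goal cell is reached."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            best = max(q[state].values())
            greedy = [a for a in ACTIONS if q[state][a] == best]
            # Trial and error: explore with probability epsilon,
            # otherwise act greedily (ties broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = rng.choice(greedy)
            nxt, reward = step(state, action)
            # Delayed reward: the value of the goal flows backwards
            # through the bootstrapped target, one cell per update.
            target = reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy policy for every non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

No cell other than the goal ever pays a reward, yet the agent ends up preferring "right" in every cell, because each update passes a discounted share of the future reward back to the preceding cell.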

What's next?
The remaining 10 parts of this series focus on proven machine learning techniques in a standard patterns format. (These patterns should not be confused
with computation and data-related patterns resulting from machine learning processing.) The next article focuses on two exploration patterns: central
tendency computation and variability computation.

View the full series

This lesson is one in a 13-part series on using machine learning algorithms, practices and patterns. Click the titles below to read the other available lessons.

Course overview

Lesson 1: Introduction to using machine learning

Lesson 2: The supervised approach to machine learning

Lesson 3

Lesson 4: Common ML patterns: central tendency and variability
