Supervised and Unsupervised Learning in R Programming

Uploaded by

cleopatra.xer003
• Arthur Samuel, a pioneer in the field of artificial intelligence and
computer gaming, coined the term “Machine Learning”. He defined
machine learning as the “Field of study that gives computers the
capability to learn without being explicitly programmed”. In lay
terms, Machine Learning (ML) means automating and improving how
computers learn from experience, without explicitly programming them
for each task. The process starts with feeding in good-quality data
and then training our machines (computers) by building machine
learning models from that data with different algorithms. The choice
of algorithm depends on what type of data we have and what kind of
task we are trying to automate.
Supervised Learning

• As the name indicates, supervised learning involves a supervisor
acting as a teacher. We teach or train the machine using data that is
well labeled, which means each example is already tagged with the
correct answer. The supervised learning algorithm analyses this
training data (the set of training examples) and learns how inputs
relate to labels. The machine is then given a new set of examples
(data) and uses what it learned to predict their outcomes.
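The train-then-predict workflow described above can be sketched in base R. This is only an illustration under assumptions not in the original: it uses the built-in `mtcars` data set, treats `mpg` as the label and `wt` and `hp` as the inputs, and picks an arbitrary 24/8 split between training examples and "new" examples.

```r
# Minimal supervised-learning sketch (assumed example, built-in mtcars data).
set.seed(42)                                  # make the random split reproducible

n         <- nrow(mtcars)                     # 32 labeled examples
train_idx <- sample(n, size = 24)             # 24 rows used for training
train     <- mtcars[train_idx, ]
test      <- mtcars[-train_idx, ]             # 8 held-out "new" examples

# Learn the label (mpg) from two inputs (weight and horsepower).
fit <- lm(mpg ~ wt + hp, data = train)

# Predict the label for the new examples and compare with the known answers.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))       # root mean squared error
print(rmse)
```

A lower `rmse` on the held-out rows suggests the model generalizes better to data it has not seen.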
Supervised learning is classified into two categories of algorithms:
• Classification: A classification problem is when the output variable is
a category, such as “red” or “blue”, or “disease” or “no disease”.
• Regression: A regression problem is when the output variable is a real
(continuous) value, such as a price in dollars or a weight.
• Types
• Regression
• Logistic Regression (a classification method, despite its name)
• Classification
• Naïve Bayes Classifiers
• Decision Trees
• Support Vector Machine
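As a hedged illustration of a classification problem, the sketch below fits a logistic regression with base R's `glm()`. The data set (`mtcars`), the chosen label (`am`, transmission type), the single input (`wt`) and the 0.5 cut-off are all assumptions made for the example, not part of the original slides.

```r
# Classification sketch: predict a category (transmission type) from car weight.
cars    <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# glm() with family = binomial fits a logistic regression model.
clf <- glm(am ~ wt, data = cars, family = binomial)

# type = "response" returns the estimated probability of the second level ("manual").
prob  <- predict(clf, type = "response")
label <- ifelse(prob > 0.5, "manual", "automatic")

# Confusion table of predicted versus actual categories (on the training data).
conf_mat <- table(predicted = label, actual = cars$am)
print(conf_mat)

accuracy <- mean(label == cars$am)            # share of correct predictions
print(accuracy)
```

Because the accuracy here is measured on the same rows used for fitting, it is likely optimistic; a held-out test set would give a fairer estimate.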
Unsupervised Learning

• Unsupervised learning is the training of machines using
information that is neither classified nor labeled, allowing the
algorithm to act on that information without guidance. Here the task
of the machine is to group unsorted information according to
similarities, patterns, and differences without any prior training on
labeled data. Unlike supervised learning, no teacher is provided, so
the machine receives no labeled examples to learn from. The machine
is therefore left to find the hidden structure in unlabeled data by
itself.
• Clustering: A clustering problem is where you want to discover the
inherent groupings in the data, such as grouping customers by
purchasing behavior.
• Association: An association rule learning problem is where you want
to discover rules that describe large portions of your data, such as
people who buy X also tend to buy Y.
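The "buy X, also buy Y" idea is usually quantified with the support and confidence of a rule. The sketch below computes both in base R for a single rule; the baskets are made-up toy data and the helper `has()` is a hypothetical name introduced only for this example.

```r
# Toy transactions (hypothetical data): each element is one customer's basket.
baskets <- list(
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("bread", "jam"),
  c("milk", "butter"),
  c("bread", "butter", "jam")
)

# has(item): for every basket, does it contain the item? (hypothetical helper)
has <- function(item) vapply(baskets, function(b) item %in% b, logical(1))

# Rule under test: bread => butter
support    <- mean(has("bread") & has("butter"))  # share of baskets with both items
confidence <- support / mean(has("bread"))        # share of bread baskets that also hold butter

print(support)     # 0.6  (3 of 5 baskets)
print(confidence)  # 0.75 (3 of 4 bread baskets)
```

Dedicated association-rule packages search over many candidate rules at once; this sketch only evaluates one rule by hand.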
• Types
• Clustering:
• Exclusive (partitioning)
• Agglomerative
• Overlapping
• Probabilistic
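• Common algorithms:
• Hierarchical clustering
• K-means clustering
• K-NN (k nearest neighbors), more often used as a supervised method
• Principal Component Analysis (dimensionality reduction)
• Singular Value Decomposition (dimensionality reduction)
• Independent Component Analysis (signal separation)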
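Two of the techniques listed above, hierarchical clustering and Principal Component Analysis, are available in base R. The sketch below is an assumed example on the four numeric columns of the built-in `iris` data; the choice of three clusters and of complete linkage is arbitrary.

```r
# Unsupervised sketch on the iris measurements (the Species labels are ignored).
x <- scale(iris[, 1:4])                  # standardize so no feature dominates

# Agglomerative hierarchical clustering: repeatedly merge the closest clusters.
hc     <- hclust(dist(x), method = "complete")
groups <- cutree(hc, k = 3)              # cut the tree into 3 clusters
print(table(groups))

# Principal Component Analysis: find directions of maximal variance.
pca           <- prcomp(x)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))           # proportion of variance per component
```

Calling `plot(hc)` draws the dendrogram, which can help when choosing how many clusters to cut.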
Implementing k-means clustering in R
• Let’s implement one of the most popular unsupervised learning
algorithms, K-means clustering, in R programming. K-means clustering
is an unsupervised algorithm that groups data points by similarity.
It seeks to partition the observations into a pre-specified number of
clusters, assigning each training example to exactly one segment,
called a cluster. Unsupervised algorithms rely heavily on raw data,
and checking whether the resulting clusters are relevant can require
considerable manual review. K-means is used in a variety of fields,
such as banking, healthcare, retail, and media.
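A minimal sketch of k-means in R, assuming the built-in `iris` data set and three clusters (neither is specified in the original), uses `kmeans()` from the base `stats` package:

```r
# K-means sketch: cluster the iris measurements without using the Species labels.
set.seed(123)                              # k-means starts from random centers

x  <- iris[, 1:4]                          # numeric features only
km <- kmeans(x, centers = 3, nstart = 20)  # 3 clusters, best of 20 random starts

print(km$size)                             # number of observations per cluster
print(km$centers)                          # mean of each feature within each cluster

# Compare the discovered clusters with the known species, for inspection only.
print(table(cluster = km$cluster, species = iris$Species))
```

The number of clusters has to be chosen in advance; running `kmeans()` for several values of `centers` and comparing `tot.withinss` is one common way to pick it.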
