Chapter 1 ML

This document provides an introduction to machine learning, including definitions and examples. It discusses the difference between supervised, unsupervised, and semi-supervised learning. It also describes common machine learning algorithms like Naive Bayes, k-means clustering, support vector machines, Apriori, linear regression, and decision trees. The key applications and characteristics of each algorithm are outlined.

Uploaded by

Tomas Agonafer

CHAPTER ONE

Introduction to machine learning and its applications
What is machine learning?
• Machine learning, sometimes referred to as statistical learning, is a
subfield of artificial intelligence (AI) whereby algorithms “learn”
patterns in data to perform specific tasks.
• Arthur Samuel, a scientist at IBM, first used the term machine
learning in 1959.
• He used it to describe a form of AI that involved training an algorithm
to learn to play the game of checkers.
• The word learning is what’s important here, as this is what
distinguishes machine learning approaches from traditional AI.
What is machine learning?...
• The ML process can be viewed as two phases:
a. Learning phase: The training data is fed to the learning algorithm, which
builds a model from it.
b. Prediction phase: The trained model is applied to new data to obtain
the predicted results.
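The two phases can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names `learn` and `predict`, the toy data, and the choice of a single threshold as the "model" are all hypothetical and not part of the original slides.

```python
# Learning phase: turn labeled training data into a model (here, one threshold).
def learn(training_data):
    """training_data: list of (value, label) pairs with labels 0 or 1.

    Assumes class 1 tends to have larger values than class 0.
    """
    zeros = [x for x, label in training_data if label == 0]
    ones = [x for x, label in training_data if label == 1]
    mean_zero = sum(zeros) / len(zeros)
    mean_one = sum(ones) / len(ones)
    # The "model" is just the midpoint between the two class means.
    return (mean_zero + mean_one) / 2


# Prediction phase: apply the trained model to data it has not seen before.
def predict(model, x):
    return 1 if x > model else 0


training_data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
model = learn(training_data)                            # learning phase
predictions = [predict(model, x) for x in (2.5, 7.5)]   # prediction phase
print(model, predictions)
```

Note that `learn` is called once, while `predict` can be reused for any number of new observations.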
The difference between a model and an
algorithm
• In practice, we call a set of rules that a machine learning algorithm learns
a model.
• Once the model has been learned, we can give it new observations, and it
will output its predictions for the new data.
• We refer to these as models because they represent real-world phenomena
in a way that is simple enough for us and the computer to interpret and
understand them.
• Just as a model of the Eiffel Tower may be a good representation of the real
thing but isn’t exactly the same, so statistical models are attempted
representations of real-world phenomena but won’t match them perfectly.
Classes of machine learning algorithms
• All machine learning algorithms can be categorized by their learning type and the
task they perform. There are four learning types:
Supervised
Unsupervised
Semi-supervised
Reinforcement learning.
• The type depends on how the algorithms learn. Do they require us to hold their
hand through the learning process? Or do they learn the answers for themselves?
• Supervised and unsupervised algorithms can be further split into two classes each:
Supervised
 Classification
 Regression
Unsupervised
 Dimension reduction
 Clustering
Supervised Learning
• Imagine you are trying to get a toddler to learn about shapes by using
blocks of wood.
• In front of them, they have a ball, a cube, and a star. You ask them to
show you the cube, and if they point to the correct shape, you tell
them they are correct; if they are incorrect, you also tell them.
• You repeat this procedure until the toddler can identify the correct
shape almost all of the time.
• This is called supervised learning, because you, the person who
already knows which shape is which, are supervising the learner by
telling them the answers.
Supervised Learning…
• A machine learning algorithm is said to be supervised if it uses a
ground truth or, in other words, labeled data.
• For example, if we wanted to classify a patient biopsy as healthy or
cancerous based on its gene expression, we would give an algorithm
the gene expression data, labeled with whether that tissue was
healthy or cancerous.
• The algorithm now knows which cases come from each of the two
types, and it tries to learn patterns in the data that discriminate them.
Supervised Learning…
• The aim of supervised machine learning (SML) is to learn the mapping function
‘f’ from the input variable (X) to the output variable (Y).
• When novel input data (X) arrives, the learned mapping function is used to
forecast the corresponding output variable (Y).
• The algorithm learns from the training set and stops learning when an
acceptable level of performance is reached.
• SML is subdivided into classification (the output variable is a
category) and regression (the output is a real value).
• Examples of SML algorithms include linear regression, random
forest, and Support Vector Machine (SVM).
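As a rough sketch of learning a mapping from X to Y out of labeled data, the example below uses a 1-nearest-neighbour rule, which is not named in the slides and is chosen here only because it is short. The two numeric features and all data values are made up.

```python
import math

# Labeled training data: input X = two made-up measurements, output Y = a class label.
training_set = [
    ((1.0, 1.0), "healthy"),
    ((1.5, 2.0), "healthy"),
    ((6.0, 5.0), "cancerous"),
    ((7.0, 6.5), "cancerous"),
]


def f(x):
    """Approximate the mapping X -> Y by copying the label of the nearest training point."""
    _, nearest_label = min(training_set, key=lambda pair: math.dist(pair[0], x))
    return nearest_label


print(f((1.2, 1.4)))
print(f((6.5, 6.0)))
```

The labels in `training_set` play the role of the ground truth: without them, `f` would have nothing to copy.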
Unsupervised Learning
• Now imagine a toddler is given multiple balls, cubes, and stars but this
time is also given three bags.
• The toddler has to put all the balls in one bag, the cubes in another
bag, and the stars in another, but you won’t tell them if they’re
correct—they have to work it out for themselves from nothing but the
information they have in front of them.
• This is called unsupervised learning, because the learner has to
identify patterns themselves with no outside help.
Unsupervised Learning…
• A machine learning algorithm is said to be unsupervised if it does not
use a ground truth and instead looks on its own for patterns in the
data that hint at some underlying structure.
• For example, let’s say we take the gene expression data from lots of
cancerous biopsies and ask an algorithm to tell us if there are clusters
of biopsies.
• A cluster is a group of data points that are similar to each other but
different from data in other clusters.
• This type of analysis can tell us if we have subgroups of cancer types
that we may need to treat differently.
Unsupervised Learning…
• The objective of unsupervised machine learning (USML) is to learn from,
understand, and explore unlabeled input data (X) that has no known outputs.
• USML is subdivided into association (discovering rules that describe
the data) and clustering (discovering inherent groups in the data).
• Examples of USML methods include Apriori for association problems
and k-means for clustering problems.
Semi-supervised Learning
• Most machine learning algorithms will fall into one of these categories,
but there is an additional approach called semi-supervised learning.
• As its name suggests, semi-supervised machine learning is not quite
supervised and not quite unsupervised.
• Semi-supervised learning often describes a machine learning approach
that combines supervised and unsupervised algorithms together, rather
than strictly defining a class of algorithms in and of itself. The premise of
semi-supervised learning is that, often, labeling a dataset requires a
large amount of manual work by an expert observer.
Semi-supervised Learning…
• This process may be very time consuming, expensive, and error prone, and may be
impossible for an entire dataset.
• So instead, we expertly label as many of the cases as is feasibly possible, and then we
build a supervised model using only the labeled data.
• We pass the rest of our data (the unlabeled cases) into the model to get their
predicted labels, called pseudo-labels because we don’t know if all of them are
actually correct.
• Now we combine the data with the manual labels and pseudo-labels, and use the
result to train a new model.
• This approach allows us to train a model that learns from both labeled and unlabeled
data, and it can improve overall predictive performance because we are able to use
all of the data at our disposal.
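The pseudo-labeling procedure described above can be sketched as follows. The nearest-centroid classifier, the function names, and the one-dimensional data are assumptions made for illustration; any supervised model could take its place.

```python
def fit_centroids(labeled):
    """Supervised step: compute the mean value of each class."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


labeled = [(1.0, "A"), (2.0, "A"), (9.0, "B"), (10.0, "B")]   # expert labels
unlabeled = [0.5, 3.0, 8.0, 11.0]                             # no labels

# 1. Build a supervised model using only the labeled data.
first_model = fit_centroids(labeled)
# 2. Predict pseudo-labels for the unlabeled cases.
pseudo_labeled = [(x, predict(first_model, x)) for x in unlabeled]
# 3. Combine manual labels and pseudo-labels, then train a new model.
final_model = fit_centroids(labeled + pseudo_labeled)
print(pseudo_labeled, final_model)
```

The final model is fitted on eight cases instead of four, although half of its labels are only predictions and may be wrong.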
Semi-supervised Learning…
• Semi-supervised machine learning (SSML) is considered a mixture of SML and
USML. It arises when there is a huge amount of unlabeled input data (X) but
only a small amount of labeled data (Y).
• Both SML and USML algorithms can be applied to such data to extract
knowledge and discover structure in it.
Machine Learning Algorithms
• Naïve Bayes Classifier (NBC) Algorithm
The NBC algorithm helps to classify a webpage, a document, a lengthy text note,
or an email, a task that is not easy to do manually.
A popular application is spam filtering.
The classifier assigns each element of a population to one of the
available categories.
Sentiment analysis is another popular application. NBC performs well when the
input variables are categorical.
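A minimal sketch of a naive Bayes spam filter is shown below, assuming a bag-of-words representation with add-one (Laplace) smoothing. The four training messages and the label name "ham" for non-spam are made up for illustration.

```python
import math
from collections import Counter

# Tiny made-up training set of (message, label) pairs.
training = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in training:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocabulary = {word for counts in word_counts.values() for word in counts}


def classify(text):
    """Return the class with the highest log-posterior under naive Bayes."""
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training messages with this label.
        score = math.log(class_counts[label] / len(training))
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing avoids zero probabilities for unseen words.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


print(classify("free money"), classify("lunch meeting at noon"))
```

The "naive" part is the loop over words: each word contributes independently to the score, regardless of the words around it.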
Machine Learning Algorithms…
• K-Means Clustering Algorithm
When a user searches the World Wide Web (WWW) for the keyword “jaguar”,
the results are a mix of pages about the jaguar (the animal), Jaguar cars, and
the Jaguar release of the Mac operating system.
Content with similar characteristics (text, images, videos, etc.)
can be grouped before it is displayed to the user.
This grouping of similar items, also referred to as clustering, can be done by
the k-means clustering algorithm.
Further, this algorithm has been adopted by many search engines (namely, Google
and Yahoo) to group web pages by similarity and to identify the
‘relevance rate’ of the search results.
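The core loop of k-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below assumes one-dimensional points, fixed starting centroids, and a fixed number of iterations; the function name and data are hypothetical.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 1-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are supplied anywhere: the two groups emerge only from the distances between the points.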
Machine Learning Algorithms…
• Support Vector Machine (SVM) Algorithm
SVM, a supervised algorithm, is used for classification problems.
SVM learns a boundary that separates the classes in the input dataset and uses
it to classify new data.
It is categorized as linear and non-linear SVM.
A linear SVM separates the classes with a hyperplane; a non-linear SVM is used
when no single hyperplane in the original feature space can separate them.
This technique does not make any substantial assumptions about the data.
One application is forecasting stock market values.
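A linear SVM can be sketched as sub-gradient descent on the regularized hinge loss. This is one common way to fit a linear SVM, not necessarily the one the slides have in mind; the hyperparameter values, function names, and the two-feature toy data are assumptions.

```python
def train_linear_svm(samples, epochs=100, learning_rate=0.1, reg=0.01):
    """Fit weights w and bias b by sub-gradient descent on the regularized hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in samples:          # y is +1 or -1
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # The regularization term always shrinks w slightly.
            w = [wi * (1 - learning_rate * reg) for wi in w]
            if margin < 1:
                # Point is inside the margin or misclassified: move the hyperplane.
                w[0] += learning_rate * y * x1
                w[1] += learning_rate * y * x2
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Classify by which side of the hyperplane w.x + b = 0 the point falls on."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1


samples = [((2, 2), 1), ((3, 3), 1), ((3, 2), 1),
           ((-2, -2), -1), ((-3, -3), -1), ((-2, -3), -1)]
w, b = train_linear_svm(samples)
print(predict(w, b, (4, 4)), predict(w, b, (-4, -4)))
```

Only points with a margin below 1 trigger an update, which is why the fitted boundary depends mainly on the points closest to it.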
Machine Learning Algorithms…
• Apriori Algorithm
The Apriori algorithm, an unsupervised algorithm, extracts association rules
from the input dataset.
An association rule states that when item ‘X’ occurs, item ‘Y’ is also likely
to occur, similar to an “if-then” rule.
For instance, IF a person purchases a bike, THEN they are likely to purchase
a helmet as well.
Such purchase patterns are examined to support better
decisions.
Its applications include detecting adverse drug reactions and
auto-complete.
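Association rules are usually scored by support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both for the bike and helmet rule on made-up transactions; it does not implement the full Apriori candidate search.

```python
# Each transaction is the set of items bought together (made-up data).
transactions = [
    {"bike", "helmet"},
    {"bike", "helmet", "lock"},
    {"bike", "lock"},
    {"helmet"},
    {"bike", "helmet", "gloves"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Fraction of transactions with the antecedent that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


rule_support = support({"bike", "helmet"})
rule_confidence = confidence({"bike"}, {"helmet"})
print(rule_support, rule_confidence)
```

A confidence below 1 shows that the rule holds for most bike buyers in this data, not for all of them.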
Machine Learning Algorithms…
• Linear Regression (LR) Algorithm
The LR algorithm describes the association between two variables and how one
influences the other.
It shows the impact on the dependent variable (the factor of interest)
when the independent variable (the explanatory variable) is changed.
It is a widespread ML technique that is fast to train.
Its applications include estimating sales and risk assessment.
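For a single explanatory variable, the line can be fitted with the ordinary least squares formulas. The sketch below assumes made-up spend and sales figures and a hypothetical function name `fit_line`.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one explanatory variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: advertising spend (x) and sales (y), generated from y = 2x + 1.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(spend, sales)
forecast = slope * 5.0 + intercept   # estimated sales for a spend of 5.0
print(slope, intercept, forecast)
```

The slope is the estimated change in the dependent variable for a one-unit change in the independent variable.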
Machine Learning Algorithms…
• Decision Tree Algorithm
It uses a branching method to represent the possible consequences of a
decision in a specific scenario.
The decisions in a tree are inversely proportional to the expected precision.
The two sub-categories of decision trees are classification trees and regression trees.
Classification trees are used for categorical target variables and regression
trees for continuous/numerical ones.
Decision tree classification is widely used in the finance and banking sectors
to classify loan applicants based on the probability that they make payments at
specified periodic intervals.
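A single branching decision can be sketched as choosing the threshold that makes the two branches as pure as possible. The example below uses Gini impurity, a common criterion for classification trees, on made-up loan data; a full tree would repeat the same search inside each branch.

```python
def gini(labels):
    """Gini impurity: 0 when all labels agree, larger when classes are mixed."""
    if not labels:
        return 0.0
    return 1.0 - sum((labels.count(c) / len(labels)) ** 2 for c in set(labels))


def best_split(rows):
    """Find the threshold on a single numeric feature with the lowest weighted Gini."""
    values = sorted({x for x, _ in rows})
    best_threshold, best_score = None, float("inf")
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for x, label in rows if x <= threshold]
        right = [label for x, label in rows if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


# Made-up loan applicants: (monthly income in thousands, outcome).
applicants = [(1.0, "default"), (1.5, "default"), (2.0, "default"),
              (3.0, "repay"), (3.5, "repay"), (4.0, "repay")]
threshold = best_split(applicants)
print(threshold)
```

Here the chosen threshold separates the two outcomes perfectly, so both branches have an impurity of zero.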
Machine Learning Algorithms…
• Random Forest Algorithm
The Random Forest algorithm works by creating a collection of decision trees,
each built from a randomly chosen subset of the data.
Prediction performance is improved by repeatedly training a model on
random samples.
The final prediction is made by combining the outputs of the decision trees
through a voting procedure.
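The idea of random samples plus voting can be sketched with very small trees. The example below assumes one-split trees ("stumps"), bootstrap sampling with replacement, and a fixed random seed; real random forests also grow deeper trees and sample features at random, which is omitted here.

```python
import random
from collections import Counter


def fit_stump(rows):
    """A one-split decision tree: choose the threshold with the fewest training errors."""
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    values = sorted({x for x, _ in rows})
    best = None  # (errors, threshold, left_label, right_label)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = Counter(l for x, l in rows if x <= threshold).most_common(1)[0][0]
        right = Counter(l for x, l in rows if x > threshold).most_common(1)[0][0]
        errors = sum(l != (left if x <= threshold else right) for x, l in rows)
        if best is None or errors < best[0]:
            best = (errors, threshold, left, right)
    if best is None:                      # only one distinct value in the sample
        return lambda x: majority
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def fit_forest(rows, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(rows, k=len(rows))) for _ in range(n_trees)]


def predict(forest, x):
    """Final prediction: majority vote over the individual trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


rows = [(1.0, "no"), (2.0, "no"), (3.0, "no"),
        (7.0, "yes"), (8.0, "yes"), (9.0, "yes")]
forest = fit_forest(rows)
print(predict(forest, 0.0), predict(forest, 10.0))
```

Individual trees may differ because each sees a different sample, but the vote smooths out their disagreements.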
Machine Learning Algorithms…
• Logistic Regression Algorithm
This type of classification is mainly applied to predict discrete classes.
For instance, it can detect whether an email is spam or not, whether an online
transaction is genuine or fraudulent, or whether a tumor is benign or malignant.
This type of regression transforms its output with the sigmoid
function to obtain a probability value.
If the classification has two possible outputs (for instance, benign or
malignant), it is referred to as binary logistic regression.
On the other hand, if the output takes one of several categorical values (for
instance, dogs, cats, horses, etc.), it is referred to as multinomial logistic
regression.
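The sigmoid transformation and the resulting binary decision can be sketched as below. The weights and bias are hypothetical, hand-picked values; a real model would learn them from labeled data, and the 0.5 cut-off is a common default rather than a requirement.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def classify(features, weights, bias, threshold=0.5):
    """Binary decision: positive class if the probability reaches the threshold."""
    p = predict_probability(features, weights, bias)
    return "malignant" if p >= threshold else "benign"


# Hypothetical, hand-picked parameters (not learned from any real dataset).
weights = [1.2, 0.8]
bias = -4.0
print(classify([3.0, 2.0], weights, bias), classify([1.0, 1.0], weights, bias))
```

The linear score `z` can be any real number; the sigmoid is what turns it into a value that can be read as a probability.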
Machine Learning Algorithms…
• Spam Detection: Received emails are checked to determine whether they are spam.
If they are, they are not shown with the regular emails in the user’s inbox
but are kept in a separate folder. This is done by
learning the characteristics of spam mail and recognizing them
in newly received messages.
• Face Detection: Faces are identified in a given set of photos and tagged
automatically the next time the same face is detected in a new photo.
Facebook is well known for using this technique: when
we upload a picture, friends who appear in it are detected automatically
and suggested as tags. Similarly, Google
Photos groups pictures according to the people in them.
Machine Learning Algorithms…
• Credit Card Fraud Detection: Based on a customer’s past transactions, if any
suspicious purchases are made, the customer is warned immediately.
• Digit Recognition: A machine’s camera reads handwritten postal codes
and sorts the letters according to the geographical locations they have to
reach. The machine is trained to recognize handwritten numbers and convert them
into digital form.
• Speech Understanding: The machine listens to the user’s speech, works out the
user’s intention, and acts on what it has understood. It is expected to
follow the user’s instructions accurately. “Cortana” by Microsoft, “Siri” by
Apple, and “Okay Google” by Google are popular and successfully implemented
applications of this technique.
Machine Learning Algorithms…
• Product Recommendation: Using the data on a customer’s past
purchases or online interests, the machine recommends
products that the customer is likely to view and perhaps
purchase. Flipkart, Amazon, and many other e-commerce sites
have been implementing this technique, so that we receive
recommended products as advertisements.
• Medical Diagnosis: To detect diseases more accurately, hospitals
these days use machines that can decide whether a person is
affected by a disease given the symptoms he/she has, with the help of
comprehensive data about diseases and symptoms.
Machine Learning Algorithms…
• Stock Trading: Given the current and past price fluctuations of a
stock, the machine decides whether to hold or sell the stock to better serve
the customer(s).
• Customer Segmentation: Using past behavior patterns,
ML algorithms try to predict the number of users who will move from
the trial version to the paid version. Amazon Prime has
implemented this technique.
Machine Learning Algorithms…
• Chat smarter with Allo: Allo is a messenger application by Google
that learns how its users respond to different kinds
of messages and suggests a response that the user might
have thought of typing.
• Financial Service: Companies can draw insights from
financial sector data and reduce the occurrence of financial
fraud. ML is also used to identify opportunities for investment and
trade. Institutions that are prone to financial risk can be protected by
using cyber surveillance and taking the necessary actions to prevent fraud.
Machine Learning Algorithms…
• Transportation: Using travel history and travel patterns across
various routes, machine learning helps transportation companies
predict problems on routes and advise their customers to choose a
different route. Transportation firms use machine learning to carry
out data analysis and data modeling.
• Computational Intelligence: Computational intelligence has been under
active development for many years. Constant
improvements are being made to classical methods such as
machine learning algorithms. These days computational intelligence
is used in many applications, directly or indirectly.
Machine Learning Algorithms…
• Natural language processing (NLP): A field that involves both
computer understanding and manipulation of human language, and it is
good at uncovering new possibilities. It is often applied to large pools of
legislation or other document sets, to discover new patterns or
to root out corruption. It is a better way to analyze, understand, and
find the meaning of human language. Using NLP, a
developer can perform tasks such as speech recognition, entity
recognition, automatic translation, and summarization.
