Machine Learning
Unit 1
Cleaning, Structuring, Verifying, and Quality Control, etc.
Formatting, Training and Testing Data, etc.
• Machine Learning
• Supervised Learning
• Unsupervised Learning
• Deep Learning
• Ensemble Learning
• Reinforcement Learning
Supervised Learning
• Supervised learning is a type of machine learning in which machines are
trained using well-"labelled" training data and, on the basis of that data,
predict the output.
• Labelled data means that some input data is already tagged with the correct
output.
• In supervised learning, the training data provided to the machine works as
the supervisor that teaches the machine to predict the output
correctly. It applies the same concept as a student learning under the supervision of a
teacher.
• Supervised learning is the process of providing input data as well as the correct output data to the
machine learning model.
• The aim of a supervised learning algorithm is to find a mapping function that maps the
input variable (x) to the output variable (y).
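The idea of learning a mapping function from x to y can be sketched with a least-squares straight-line fit. This is only an illustration: the data values, the choice of a linear model, and the names fit_line and predict are assumptions made for the example and do not come from the slides.

```python
# Toy labelled data: each input x is tagged with its correct output y.
# The values are invented for illustration and follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """The learned mapping function from an input x to an output y."""
    return slope * x + intercept


print(predict(6.0))  # prediction for an input that was not in the training data
```

The labelled pairs play the role of the supervisor: the fit adjusts the slope and intercept until the function reproduces the tagged outputs as closely as possible.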
Supervised Learning
Javapoint.com
Supervised Learning - Steps
• First, determine the type/domain of the training dataset and
collect the labelled training data.
• Split the dataset into a training dataset, a test dataset, and a
validation dataset.
• Determine the input features of the training dataset, which should carry enough
information for the model to predict the output accurately.
• Determine a suitable algorithm for the model, such as a support vector machine or a decision
tree.
• Execute the algorithm on the training dataset. Sometimes we need validation sets as the
control parameters, which are a subset of the training dataset.
• Evaluate the accuracy of the model on the test set. If the
model predicts the correct outputs, the model is accurate.
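The steps above can be sketched end to end in a few lines. This is a minimal sketch under stated assumptions: the data is synthetic, the 60/20/20 split is an arbitrary choice, and a k-nearest-neighbours classifier is swapped in for the support vector machine or decision tree named above purely because it is short enough to write by hand. The helper names knn_predict and accuracy are hypothetical.

```python
import random
from collections import Counter

# Step 1: gather labelled data. These points are synthetic stand-ins:
# label 0 clusters near (1, 1) and label 1 clusters near (5, 5).
data = []
for i in range(20):
    data.append(((1 + 0.05 * i, 1 + 0.1 * (i % 4)), 0))
    data.append(((5 + 0.05 * i, 5 + 0.1 * (i % 4)), 1))

# Step 2: shuffle, then split 60% / 20% / 20% into train, validation, test.
random.Random(0).shuffle(data)
train, validation, test = data[:24], data[24:32], data[32:]

# Step 3: the input features are simply the two coordinates of each point.


def knn_predict(train_set, point, k):
    """Step 4 (chosen algorithm): majority label among the k nearest training points."""
    def sq_dist(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], point))
    nearest = sorted(train_set, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_set, eval_set, k):
    """Fraction of examples in eval_set whose label is predicted correctly."""
    correct = sum(knn_predict(train_set, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)


# Step 5: use the validation set to pick the control parameter k.
best_k = max([1, 3, 5], key=lambda k: accuracy(train, validation, k))

# Step 6: report accuracy on the held-out test set.
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Keeping the test set out of both training and parameter selection is what makes the final accuracy figure a fair estimate of performance on new data.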
Supervised Learning – Popular Techniques
• Regression
• Linear Regression
• Regression Trees
• Non-Linear Regression
• Bayesian Linear Regression
• Polynomial Regression
• Classification
• Random Forest
• Decision Trees
• Logistic Regression
• Support Vector Machines
Supervised Learning – Advantages
• With the help of supervised learning, the model can
predict the output on the basis of prior
experience.
• In supervised learning, we can have an exact idea about
the classes of objects.
• Supervised learning models help us to solve various real-world
problems such as fraud detection and spam filtering.
Supervised Learning – Disadvantages
• Supervised learning models are not well suited to handling
very complex tasks.
• Supervised learning cannot predict the correct output if the
test data differs substantially from the training dataset.
• Training requires a lot of computation time.
• In supervised learning, we need enough knowledge about the
classes of objects.
Unsupervised Learning
• Unsupervised learning is a machine learning technique in which models are
not supervised using a labelled training dataset.
• Instead, the models themselves find the hidden patterns and insights
in the given data.
• It can be compared to the learning that takes place in the human brain
while learning new things.
• Commonly used unsupervised learning algorithms include:
• K-means clustering
• KNN (k-nearest neighbors)
• Hierarchical clustering
• Anomaly detection
• Neural Networks
• Principal Component Analysis
• Independent Component Analysis
• Apriori algorithm
• Singular value decomposition
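As an illustration of finding structure in unlabelled data, here is a minimal sketch of K-means clustering, the first algorithm in the list. The data points, the fixed iteration count, and the choice of the first k points as starting centroids are assumptions made to keep the example small and deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    # Assumption: the first k points serve as initial centroids (a simple,
    # deterministic choice; practical implementations usually initialise randomly).
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for j, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
    return centroids, clusters


# Unlabelled toy data: two visually obvious groups, but no labels are given.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

No output labels are supplied anywhere; the grouping emerges only from the distances between the inputs.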
Unsupervised Learning : Advantages
Supervised Learning vs. Unsupervised Learning
• Supervised learning algorithms are trained using labeled data; unsupervised learning
algorithms are trained using unlabeled data.
• A supervised learning model takes direct feedback to check whether it is predicting the
correct output; an unsupervised learning model does not take any feedback.
• A supervised learning model predicts the output; an unsupervised learning model finds the
hidden patterns in data.
• In supervised learning, input data is provided to the model along with the output; in
unsupervised learning, only input data is provided to the model.
• The goal of supervised learning is to train the model so that it can predict the output when
it is given new data; the goal of unsupervised learning is to find the hidden patterns and
useful insights in an unknown dataset.
• Supervised learning needs supervision to train the model; unsupervised learning does not
need any supervision to train the model.
• Supervised learning can be categorized into Classification and Regression problems;
unsupervised learning can be categorized into Clustering and Association problems.
• Supervised learning can be used for cases where we know the inputs as well as the
corresponding outputs; unsupervised learning can be used for cases where we have only input
data and no corresponding output data.
• A supervised learning model generally produces more accurate results; an unsupervised
learning model may give less accurate results by comparison.
• Supervised learning is not close to true Artificial Intelligence, because we first train the
model on the data and only then can it predict the correct output; unsupervised learning is
closer to true Artificial Intelligence, as it learns much as a child learns daily routines
from experience.
• Supervised learning includes algorithms such as Linear Regression, Logistic Regression,
Support Vector Machine, Multi-class Classification, Decision Tree, Bayesian Logic, etc.;
unsupervised learning includes algorithms such as Clustering, KNN, and the Apriori algorithm.
Reinforcement Learning
• Reinforcement learning resembles the way a cat learns "what to do" from positive
experiences.
• At the same time, the cat
also learns what not to do when faced with
negative experiences.
• Your cat is an agent that is exposed to the
environment, which is the house.
• An example of a state could be your cat sitting, and
you use a specific word to tell the cat to walk.
• Our agent reacts by performing an action that moves it
from one "state" to another "state."
• For example, your cat goes from sitting to walking.
• The reaction of an agent is an action, and the policy
is a method of selecting an action given a state in
expectation of better outcomes.
• After the transition, the agent may get a reward or penalty
in return.
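The agent, state, action, policy, and reward vocabulary above can be made concrete with a small sketch. The slides do not name a learning algorithm, so tabular Q-learning with an epsilon-greedy policy is used here as one common choice. The two-state environment, the reward values, and all parameter settings are assumptions invented for the cat example.

```python
import random

# Hypothetical toy environment modelled on the cat example: the agent (cat)
# starts "sitting"; the trainer rewards walking and penalises staying put.
STATES = ["sitting", "walking"]
ACTIONS = ["stay", "walk"]


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    if state == "sitting" and action == "walk":
        return "walking", 1.0, True    # reward for the desired transition
    return "sitting", -1.0, False      # penalty: the cat is still sitting


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}  # action-value table
    for _ in range(episodes):
        state = "sitting"
        for _ in range(10):  # cap the episode length
            # Epsilon-greedy policy: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update toward reward + discounted best future value.
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
            if done:
                break
    return q


q_table = train()
# The learned policy picks, for each state, the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy["sitting"])
```

The table q_table stores the agent's running estimate of how good each action is in each state, and the policy is read off from it.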
Reinforcement Learning - Characteristics
• Gaming
• Robotics for industrial automation.
• Business strategy planning
• It helps you to create training systems that provide custom
instruction and materials according to the requirements
of students.
• Aircraft control and robot motion control
Reinforcement Learning – Why?
• AutoML
• https://cloud.google.com/automl/
Tools and Technologies for Machine Learning
• CoLab
• CoLab is short for Colaboratory.
• It is an environment that executes directly on the cloud without any
additional setup.
• It is a hosted Jupyter Notebook environment. Many commonly used packages and
libraries can be imported directly without installing them manually.
• CoLab is a free tool from Google for running various machine learning
algorithms. According to Google, it requires zero setup.
• https://research.google.com/colaboratory/
• Hadoop and Spark
• Apache Hadoop is a framework based on the Java programming language
for the distributed processing of big data. Typically the Hadoop framework has a
processing layer and a storage layer. The processing layer uses a parallel
programming approach called MapReduce. The storage layer offers a distributed
file system (HDFS) modelled on the Google File System (GFS).
• Apache Spark is considered an extension of Hadoop that handles batch
as well as real-time (streaming) data, with efficient built-in models for big data
handling. It is advertised as a ‘lightning-fast unified analytics engine’.
• https://hadoop.apache.org/
• https://spark.apache.org/
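The MapReduce approach mentioned for Hadoop's processing layer can be illustrated with a word count written in plain Python. This sketch runs in a single process and does not use the Hadoop or Spark APIs. The function names and the sample strings are hypothetical, and in a real cluster the map and reduce phases would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for an input split stored on a different node.
splits = ["big data needs big storage", "spark and hadoop process big data"]
mapped = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle_phase(mapped))
print(counts["big"], counts["data"])
```

Because each split is mapped independently and each key is reduced independently, both phases can be distributed without the workers needing to coordinate.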
• KNIME
• KNIME is another machine learning tool that helps in building data
science solutions/workflows.
• The tool is visual and user-friendly and eliminates the need to write code. Instead,
it uses a drag-and-drop visual interface.
• It allows the practitioner to access and transform data, and offers
modelling, visualization, and deployment of the model.
• https://www.knime.com/
• LibROSA
• https://librosa.org/
• Machine Learning Studio
• Machine Learning Studio by Microsoft Azure is a visual drag-and-drop
tool for experimenting with machine learning models. The trial version of the
tool is free.
• MLflow
• MLflow manages the complete life cycle of a machine learning
application, up to deployment of a model and its use. According to
its developers, the tool can be used in conjunction with any machine learning
library.
• https://azure.microsoft.com/
• https://mlflow.org/
• Neo4j
• Neo4j is a native graph platform that handles big data through the graph of
relationships between the data. That is, it manages not only the data but also
the relationships among them. The relationships between
data entities are stored as connections and presented visually, and these
connections are used whenever a query is run. The tool is useful in various graph-related
problems such as social networking, scheduling, and path-finding applications.
• https://neo4j.com/
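To illustrate why storing relationships as connections helps with path-finding queries, here is a small sketch in plain Python. It does not use Neo4j or its query language. The social graph, the names in it, and the shortest_path helper are all hypothetical, and breadth-first search is chosen as one simple way to follow stored connections.

```python
from collections import deque

# Hypothetical social graph: each key is an entity (node) and each listed
# name is a stored connection (relationship) to another entity.
graph = {
    "Asha": ["Ben", "Chen"],
    "Ben": ["Asha", "Dana"],
    "Chen": ["Asha", "Dana"],
    "Dana": ["Ben", "Chen", "Eli"],
    "Eli": ["Dana"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: follow connections outward until the goal is reached."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of connections links the two entities


print(shortest_path(graph, "Asha", "Eli"))
```

The query only ever touches the connections of the nodes it visits, which is the property that makes relationship-centred storage attractive for this kind of problem.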
• OpenAIGym
• OpenAIGym is a toolkit that helps in the
implementation of reinforcement learning based applications by training
agents. As it supports reinforcement learning, it can be effectively used
for games and robotic systems development. Besides these applications, it
can also be used in domains such as business management, energy,
education, finance, and transportation.
• https://gym.openai.com/
• RapidMiner
• RapidMiner helps in all the phases of machine learning practice, such as
setting up databases, pre-processing data, applying machine learning algorithms,
and visualizing data. It is used by non-programmers and researchers
from various domains for data science and big data related machine learning
projects. RapidMiner’s graphical interface makes it easy for
non-computer professionals and researchers to develop machine learning and
data science based applications.
• https://rapidminer.com/
• SimpleCV
• SimpleCV is an open source Python-based framework. It helps in
building models that deal with computer vision. Input images or videos can
come from a variety of devices and interfaces.
• StanfordNLP
• StanfordNLP is a package that helps in working with natural language.
According to its developers, StanfordNLP supports more than 70
languages.
• http://simplecv.org/
• https://stanfordnlp.github.io/stanfordnlp/
• TensorFlow and TensorFlow Lite
• TensorFlow is an open source library that can be used as an efficient tool for
various machine learning applications, including in a browser. It can
also import readily available trained models for direct use or for retraining with
new data.
• TensorFlow Lite is a collection of utilities that helps deploy and
execute TensorFlow models on mobile devices. Besides mobile
applications, it is also used for embedded systems and IoT/IoE
devices.
• https://www.tensorflow.org/
• https://www.tensorflow.org/lite
• Uber Ludwig
• Uber Ludwig is a code-free toolbox for deep learning applications. It is built on
TensorFlow to build, train, and test deep learning models. One major advantage of the
tool is that it avoids programming and offers code-free development.
• UnityMLAgents
• The Unity Machine Learning Agents toolkit is also used for developing gaming applications
through reinforcement learning and interactive intelligent agents. It is also a Python-based
toolkit. The toolkit helps in converting a Unity scene into a learning
environment to train a model or a character. The learning algorithms are provided through
Python.
• https://eng.uber.com/introducing-ludwig/
• https://docs.unity3d.com/Packages/[email protected]/
• WEKA
• WEKA is an abbreviation of ‘Waikato Environment for Knowledge
Analysis’. It is an open source Java-based tool that supports various
machine learning algorithms for data preparation, classification,
clustering, mining, exploration, and visualization via graphical
interfaces. Weka is also used for deep learning
applications. Weka is extensively used in industry, academia, and research.
• https://www.cs.waikato.ac.nz/ml/weka/
Applications of Machine Learning
• Image Recognition
• Speech Recognition
• Traffic prediction
• Product recommendations
• Self-driving cars
• Email Spam and Malware Filtering
• Virtual Personal Assistant
• Online Fraud Detection
• Stock Market trading
• Medical Diagnosis
• Automatic Language Translation
• Etc.
Application 1: Customized and Dynamic Material Presentation
Application 2: To Issue Credit Card or Not: A Case of Single Perceptron
Application 3: To Check/Approve Loan Status
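Application 2 names a single perceptron as the model for the credit card decision. The slides give no details, so the following is only a sketch of how such a perceptron could be trained: the two input features, the applicant values, the learning rate, and the helper names are all invented for illustration.

```python
# Hypothetical applicants: (income score, debt ratio), both scaled to 0..1,
# with label 1 = issue the card and 0 = decline. All values are invented.
applicants = [
    ((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.7, 0.1), 1),
    ((0.2, 0.8), 0), ((0.3, 0.9), 0), ((0.1, 0.6), 0),
]


def predict(weights, bias, features):
    """Step activation: output 1 when the weighted sum exceeds zero."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, learning_rate=0.1, max_epochs=100):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0, or +1
            if error != 0:
                # Perceptron rule: nudge the weights toward the correct side.
                weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
                bias += learning_rate * error
                errors += 1
        if errors == 0:  # every training example is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(applicants)
print(predict(weights, bias, (0.85, 0.15)))  # a new high-income, low-debt applicant
```

A single perceptron can only draw one straight decision boundary, so this approach works when approved and declined applicants are linearly separable, as they are in this toy data.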
Application 1:
https://www.datasciencecentral.com/profiles/blogs/credit-risk-prediction-using-artificial-neural-network-algorithm
Application 4: Neuro-Fuzzy System for
Credit Card Fraud Detection
Application 5:
• etc.