Introduction To Machine Learning

Uploaded by

vanshika128v
Introduction to

Machine Learning
What is Machine Learning
• Arthur Samuel, a pioneer in the field of artificial intelligence and computer gaming,
coined the term “Machine Learning”. He defined machine learning as – “Field of
study that gives computers the capability to learn without being explicitly
programmed”.
• In simple terms, Machine Learning (ML) automates and improves the way computers learn
from experience, without the behaviour being explicitly programmed by a human.
• The process starts with good-quality data, which is used to train our
machines (computers) by building machine learning models with different
algorithms. The choice of algorithm depends on the type of data we
have and the kind of task we are trying to automate.
• Example: students preparing for an exam. While preparing, students do not simply
cram the subject; they try to learn it with understanding. Before the examination,
they feed their machine (the brain) a good amount of high-quality data (questions and answers from
books, teachers' notes or online video lectures). In effect, they are training the brain with
inputs as well as outputs, i.e. which approach or logic solves which kind of
question. Each time they solve a practice paper, they measure their performance (accuracy or score) by
comparing their answers with the answer key. Gradually the performance improves and they gain
confidence in the adopted approach. Models are built the same way: the machine is trained on
data where both inputs and outputs are given, and is then tested on data where only the inputs
are given. The model's score is obtained by comparing its answers with the actual outputs, which were not
shown during training. Researchers continue to improve algorithms and
techniques so that these models perform even better.

• ML systems can learn and improve with historical data, time and experience.
• Although machine learning is a field within computer science, it differs from traditional
computational approaches. In traditional computing, algorithms are sets of explicitly
programmed instructions used by computers to calculate or solve problems. Machine learning
algorithms instead allow computers to train on data inputs and use statistical analysis
to output values that fall within a specific range.
• Traditional Programming: We feed in DATA (input) + PROGRAM (logic), run it on the machine
and get the output.
• Machine Learning: We feed in DATA (input) + OUTPUT, run it on the machine during training, and
the machine creates its own program (logic), which can be evaluated during testing.
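The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the fever example, the hand-picked 38.0 cutoff, the sample temperatures and the function names are all assumptions made for the sketch, not part of the original slides.

```python
# Traditional programming: a human writes the rule (38.0 is hand-picked).
def has_fever_rule(temp_c: float) -> bool:
    return temp_c >= 38.0


# Machine learning: the rule is derived from (input, output) examples.
def learn_threshold(examples):
    """Return the cutoff that misclassifies the fewest training examples."""
    temps = sorted(t for t, _ in examples)
    # Candidate cutoffs: midpoints between neighbouring training temperatures.
    candidates = [(a + b) / 2 for a, b in zip(temps, temps[1:])]

    def errors(cut):
        return sum((t >= cut) != label for t, label in examples)

    return min(candidates, key=errors)


# Hypothetical labelled data: (temperature in Celsius, has fever?)
training = [(36.5, False), (37.0, False), (37.4, False),
            (38.2, True), (38.9, True), (39.5, True)]

cutoff = learn_threshold(training)  # the "program" created from data


def has_fever_learned(temp_c: float) -> bool:
    return temp_c >= cutoff


print(has_fever_rule(39.0), has_fever_learned(39.0))  # True True
```

In the first function the logic is supplied by the programmer; in the second, only data and expected outputs are supplied and the cutoff is found during training.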
• When Do We Use Machine Learning? ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics).
• Learning isn’t always useful:
• There is no need to “learn” to calculate payroll
• Some more examples of tasks that are best solved by using a learning
algorithm
• Recognizing patterns: – Facial identities or facial expressions – Handwritten or
spoken words – Medical images
• Generating patterns: – Generating images or motion sequences
• Recognizing anomalies: – Unusual credit card transactions – Unusual patterns
of sensor readings in a nuclear power plant
• Prediction: – Future stock prices or currency exchange rates
What exactly does learning
mean for a computer?
• A computer is said to be learning from Experiences with respect to
some class of Tasks, if its performance in a given Task improves with
the Experience.
• A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance
at tasks in T, as measured by P, improves with experience E.
• Example: playing checkers.
• E = the experience of playing many games of checkers
• T = the task of playing checkers
• P = the probability that the program will win the
next game
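The E, T, P definition can be made concrete with a toy learner. The sketch below is an assumption-laden illustration, not a checkers player: the task T is classifying the integers 0 to 19, the hidden rule `x >= 14` is invented for the demo, the learner is a 1-nearest-neighbour rule, P is accuracy, and E is a stream of labelled examples.

```python
def true_label(x: int) -> bool:
    # Hidden concept the learner is trying to capture (assumed for this demo).
    return x >= 14


def predict(experience, x):
    # 1-nearest-neighbour: copy the label of the closest example seen so far.
    _, nearest_label = min(experience, key=lambda ex: abs(ex[0] - x))
    return nearest_label


def performance(experience):
    # P: fraction of the inputs 0..19 that are classified correctly.
    inputs = range(20)
    correct = sum(predict(experience, x) == true_label(x) for x in inputs)
    return correct / len(inputs)


# E: labelled examples arriving one at a time.
stream = [0, 19, 12, 15]
experience = []
scores = []
for x in stream:
    experience.append((x, true_label(x)))
    scores.append(performance(experience))

print(scores)  # [0.7, 0.8, 0.9, 1.0]
```

Performance at the task, as measured by P, rises as experience E grows, which is exactly what the definition calls learning.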
How does ML work?
• Gathering past data in any form suitable for processing. The better the quality of the data, the
more suitable it will be for modeling.
• Data Processing – Sometimes the data collected is in raw form and needs to be pre-processed.
Example: Some tuples may have missing values for certain attributes. In this case, they have
to be filled with suitable values before machine learning or any other form of data
mining can be performed.
• Missing values for numerical attributes, such as the price of a house, may be replaced with
the mean value of the attribute, whereas missing values for categorical attributes may be
replaced with the mode (the most frequent value) of the attribute. The exact choice depends on the
filters we use. If the data is text or images, it must be converted to numerical form,
be it a list, an array or a matrix. In short, the data must be made relevant and consistent, and
converted into a format the machine can understand.
• Divide the input data into training, cross-validation and test sets. A commonly used ratio between the
respective sets is 6:2:2.
• Building models with suitable algorithms and techniques on the training set.
• Testing our conceptualized model with data which was not fed to the model at the time of training.
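The pre-processing and splitting steps above can be sketched with the Python standard library alone. The helper names (`impute_mean`, `impute_mode`, `split_622`) and the sample values are hypothetical, and the rows are assumed to be shuffled already before splitting.

```python
from collections import Counter
from statistics import mean


def impute_mean(values):
    """Replace None in a numerical column with the mean of the known values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_mode(values):
    """Replace None in a categorical column with the most frequent known value."""
    fill = Counter(v for v in values if v is not None).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def split_622(rows):
    """Split (already shuffled) rows into train, cross-validation and test, 6:2:2."""
    n_train = len(rows) * 6 // 10
    n_val = len(rows) * 2 // 10
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


prices = [200.0, None, 300.0, 400.0]      # hypothetical house prices
cities = ["Pune", "Delhi", None, "Pune"]  # hypothetical categorical attribute

print(impute_mean(prices))  # [200.0, 300.0, 300.0, 400.0]
print(impute_mode(cities))  # ['Pune', 'Delhi', 'Pune', 'Pune']

train, val, test = split_622(list(range(10)))
print(len(train), len(val), len(test))  # 6 2 2
```

The model is then fitted on `train`, tuned on `val`, and scored once on `test`, which it has never seen.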
The Machine Learning
Framework
y = f(x)
where y is the output, f is the prediction function and x is the image feature

• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the
prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never before seen test example x and output the predicted
value y = f(x)
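One common way to write "minimizing the prediction error on the training set" is shown below. The loss function L and the family of candidate functions F are assumptions introduced here for illustration; the squared error is only one possible choice of loss.

```latex
\hat{f} \;=\; \arg\min_{f \in \mathcal{F}} \; \frac{1}{N} \sum_{i=1}^{N} L\bigl(f(x_i),\, y_i\bigr),
\qquad \text{for example } L(\hat{y}, y) = (\hat{y} - y)^2 .
```

At test time the estimated function is applied to a new example x to give the prediction y = \hat{f}(x).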

Slide credit: L. Lazebnik


Types of Machine Learning
• As with any method, there are different ways to train machine learning algorithms,
each with their own advantages and disadvantages. To understand the pros and cons
of each type of machine learning, we must first look at what kind of data they ingest.
In ML, there are two kinds of data: labeled data and unlabeled data.
• Labeled data has both the input and output parameters in a completely machine-
readable pattern, but requires a lot of human labor to label the data, to begin with.
• Unlabeled data only has one or none of the parameters in a machine-readable form.
This negates the need for human labor but requires more complex solutions.
• There are also some types of machine learning algorithms that are used in very
specific use-cases, but four main methods are used today.
1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning
Supervised Learning
• As its name suggests, supervised machine learning is based on supervision. In the supervised
learning technique, we train the machine using a "labelled" dataset, and based on that training the
machine predicts the output. Here, labelled data means that the inputs are already mapped
to their outputs. More precisely: first we train the machine with inputs and their corresponding
outputs, and then we ask the machine to predict the output for the test dataset.

• Let's understand supervised learning with an example. Suppose we have an input dataset of cat and dog
images. First, we train the machine to understand the images using features such as the shape
and size of the tail, the shape of the eyes, colour and height (dogs are usually taller, cats are
smaller). After training is complete, we input a picture of a cat and ask the machine to identify
the object and predict the output. The machine is now trained, so it checks the features of the
object, such as height, shape, colour, eyes, ears and tail, and finds that it is a cat. So it puts it in the Cat
category. This is how the machine identifies objects in supervised learning.

• The main goal of the supervised learning technique is to map the input variable (x) to the
output variable (y). Some real-world applications of supervised learning are Risk Assessment, Fraud
Detection, Spam filtering, etc.

• Categories of Supervised Machine Learning

• Supervised machine learning can be classified into two types of problems, which are given below:
• Classification
• Regression
• a) Classification
• Classification algorithms are used to solve classification problems, in which the
output variable is categorical, such as "Yes" or "No", Male or Female, Red or Blue,
etc. The classification algorithms predict the categories present in the dataset. Some
real-world examples of classification are Spam Detection, Email
filtering, etc.
• Some popular classification algorithms are given below:
• Random Forest Algorithm
• Decision Tree Algorithm
• Logistic Regression Algorithm
• Support Vector Machine Algorithm
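As a minimal sketch of classification, the code below fits a decision stump, i.e. a decision tree with a single split, chosen by Gini impurity. It is a deliberately reduced stand-in for the Decision Tree algorithm listed above, and the email features and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    return 1.0 - sum((labels.count(c) / len(labels)) ** 2 for c in set(labels))


def majority(labels):
    return max(set(labels), key=labels.count)


def fit_stump(X, y):
    """Find the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # a split must put examples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    _, j, t, left_label, right_label = best
    return lambda row: left_label if row[j] <= t else right_label


# Hypothetical emails: [number of links, number of exclamation marks]
X = [[0, 1], [1, 0], [1, 3], [6, 2], [7, 0], [9, 4]]
y = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

classify = fit_stump(X, y)
print(classify([8, 1]))  # spam
print(classify([0, 5]))  # not spam
```

The output is a category rather than a number, which is what distinguishes classification from regression.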

• b) Regression
• Regression algorithms are used to solve regression problems, in which the output variable is
continuous and depends on the input variables (the relationship may be linear or non-linear).
They are used to predict continuous output variables, such as market trends, weather prediction, etc.
• Some popular Regression algorithms are given below:
• Simple Linear Regression Algorithm
• Multivariate Regression Algorithm
• Decision Tree Algorithm
• Lasso Regression
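Simple linear regression, the first algorithm in the list above, can be sketched with the closed-form least-squares solution. The house sizes and prices are hypothetical numbers chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size (in 100 sq ft) vs. price (in thousands)
sizes = [5, 10, 15, 20]
prices = [110, 205, 310, 395]

slope, intercept = fit_line(sizes, prices)
print(round(slope, 2), round(intercept, 2))  # 19.2 15.0

# Predict a continuous value for an unseen input.
predicted = slope * 12 + intercept
print(round(predicted, 1))  # 245.4
```

Unlike a classifier, the fitted function returns a continuous value for any input size.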
Advantages and Disadvantages of Supervised Learning

• Advantages:
• Since supervised learning works with a labelled dataset, we can
have an exact idea about the classes of objects.
• These algorithms are helpful in predicting the output on the basis of
prior experience.

• Disadvantages:
• These algorithms may struggle with very complex tasks.
• They may predict the wrong output if the test data differs from the
training data.
• Training the algorithm can require a lot of computation time.
Applications of Supervised Learning
• Some common applications of Supervised Learning are given below:
• Image Segmentation:
• Supervised Learning algorithms are used in image segmentation. In this process, image classification is
performed on different image data with pre-defined labels.
• Medical Diagnosis:
• Supervised algorithms are also used in the medical field for diagnosis. This is done by using
medical images and past data labelled with disease conditions. With such a process, the
machine can identify a disease in new patients.
• Fraud Detection
• Supervised learning classification algorithms are used for identifying fraudulent transactions, fraudulent
customers, etc. This is done by using historic data to identify the patterns that can lead to possible fraud.
• Spam detection
• In spam detection & filtering, classification algorithms are used. These algorithms classify an email as
spam or not spam. The spam emails are sent to the spam folder.
• Speech Recognition
• Supervised learning algorithms are also used in speech recognition. The algorithm is trained with voice
data, and various identifications can be done using the same, such as voice-activated passwords, voice
commands, etc.
Unsupervised Machine Learning
• Unsupervised learning is different from the supervised learning technique; as its name
suggests, there is no need for supervision. In unsupervised machine learning, the
machine is trained using an unlabeled dataset and produces its output
without any supervision.
• In unsupervised learning, the models are trained with data that is neither classified nor
labelled, and the model acts on that data without any supervision.
• The main aim of an unsupervised learning algorithm is to group or categorize
the unsorted dataset according to similarities, patterns and
differences. Machines are instructed to find the hidden patterns in the input dataset.
• Let's take an example to understand it more precisely. Suppose there is a basket of fruit
images, and we input it into the machine learning model. The images are totally unknown to
the model, and the task of the machine is to find the patterns and categories of the objects.
• The machine will discover patterns and differences on its own, such as differences in colour
and shape, and predict the output when it is tested with the test dataset.
• Categories of Unsupervised Machine Learning

• Unsupervised Learning can be further classified into two types, which are given below:
• Clustering
• Association

• 1) Clustering
• The clustering technique is used when we want to find the inherent groups from the data. It is a way to group
the objects into a cluster such that the objects with the most similarities remain in one group and have fewer
or no similarities with the objects of other groups. An example of the clustering algorithm is grouping the
customers by their purchasing behaviour.
• Some of the popular clustering algorithms are given below:
• K-Means Clustering algorithm
• Mean-shift algorithm
• DBSCAN Algorithm
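• Principal Component Analysis (strictly a dimensionality-reduction technique, often applied before clustering)
• Independent Component Analysis (strictly a signal-separation technique rather than a clustering algorithm)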
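The idea of grouping customers by purchasing behaviour can be sketched with a small K-Means implementation on one-dimensional data. The spend figures and the initial centroids are hypothetical; real implementations usually pick the starting centroids at random and work in more dimensions.

```python
def k_means_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given initial centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend of eight customers
spend = [12, 15, 14, 11, 95, 102, 99, 98]

centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
print(centroids)  # [13.0, 98.5]
print(clusters)   # [[12, 15, 14, 11], [95, 102, 99, 98]]
```

No labels were given: the two groups (low spenders and high spenders) emerge from the data alone.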

• 2) Association
• Association rule learning is an unsupervised learning technique that finds interesting relations among
variables within a large dataset. The main aim of this learning algorithm is to find the dependency of one
data item on another and map those variables accordingly, for example so that a business can increase its profit.
This technique is mainly applied in Market Basket analysis, Web usage mining, continuous
production, etc.
• Some popular algorithms for association rule learning are the Apriori algorithm, Eclat and the FP-growth algorithm.
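The dependency of one item on another is usually quantified with support and confidence, the two measures that algorithms such as Apriori build on. The sketch below only computes these measures for one rule; it is not an implementation of Apriori itself, and the baskets are hypothetical.

```python
def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions)


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = count(transactions, set(antecedent) | set(consequent))
    return both / count(transactions, antecedent)


# Hypothetical market baskets
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

print(support(baskets, {"bread", "butter"}))       # 0.6
print(confidence(baskets, {"bread"}, {"butter"}))  # 0.75
```

Here the rule "bread implies butter" holds in 75% of the baskets that contain bread, which is the kind of relation Market Basket analysis looks for.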
Advantages and Disadvantages of Unsupervised
Learning Algorithm

• Advantages:
• These algorithms can be used for more complicated tasks than the
supervised ones because they work on unlabeled datasets.
• Unsupervised algorithms are preferable for various tasks because an
unlabeled dataset is easier to obtain than a labelled one.
• Disadvantages:
• The output of an unsupervised algorithm can be less accurate because the
dataset is not labelled and the algorithm is not trained with the exact
output beforehand.
• Working with unsupervised learning is more difficult because it works with an
unlabelled dataset whose inputs are not mapped to outputs.
Applications of Unsupervised
Learning
• Network Analysis: Unsupervised learning is used for identifying plagiarism and
copyright in document network analysis of text data for scholarly articles.
• Recommendation Systems: Recommendation systems widely use
unsupervised learning techniques for building recommendation applications for
different web applications and e-commerce websites.
• Anomaly Detection: Anomaly detection is a popular application of
unsupervised learning, which can identify unusual data points within the dataset.
It is used to discover fraudulent transactions.
• Singular Value Decomposition: Singular Value Decomposition or SVD is used
to extract particular information from the database. For example, extracting
information of each user located at a particular location.
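Anomaly detection, mentioned in the list above, can be sketched with a simple z-score rule: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The z-score rule, the threshold of 2.0 and the transaction amounts are assumptions made for this sketch; the slides do not name a specific method.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]


# Hypothetical card transaction amounts
amounts = [20, 22, 19, 21, 23, 20, 18, 500]

print(find_anomalies(amounts))  # [500]
```

No transaction was labelled as fraudulent in advance; the unusual point is identified purely from how far it sits from the rest of the data.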
Semi-Supervised Learning
• Semi-Supervised learning is a type of Machine Learning algorithm that lies between
Supervised and Unsupervised machine learning. It represents the intermediate ground between
Supervised (With Labelled training data) and Unsupervised learning (with no labelled training data)
algorithms and uses the combination of labelled and unlabeled datasets during the training period.
• Although semi-supervised learning is the middle ground between supervised and unsupervised learning
and operates on data that includes a few labels, the data is mostly unlabeled. Labels are
costly to obtain, so in practice an organization may have only a few of them. This sets it apart from purely
supervised and purely unsupervised learning, which are defined by the presence or absence of labels.
• The concept of semi-supervised learning was introduced to overcome the drawbacks of supervised
and unsupervised learning algorithms. Its main aim is to
use all the available data effectively, rather than only the labelled data as in supervised learning. Initially,
similar data is clustered using an unsupervised learning algorithm, and the clusters then help to assign labels to the
unlabeled data. This is worthwhile because labelled data is comparatively more expensive
to acquire than unlabeled data.
• We can picture these algorithms with an example. Supervised learning is where a student is under the
supervision of an instructor at home and at college. If the student analyses the same
concept alone, without any help from the instructor, it comes under unsupervised learning. Under
semi-supervised learning, the student revises the concept alone after first analysing it under the
guidance of an instructor at college.
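The idea of using a few labels to label the rest of the data can be sketched with self-training (pseudo-labelling). Note that this is a simpler stand-in for the cluster-then-label procedure described above, and the one-dimensional points and the "low"/"high" labels are hypothetical.

```python
def self_train(labelled, unlabelled):
    """Pseudo-label unlabelled 1-D points, closest-to-known-data first."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        # Pick the unlabelled point that is closest to any labelled point.
        point = min(remaining, key=lambda u: min(abs(u - x) for x, _ in labelled))
        # Copy the label of its nearest labelled neighbour.
        _, label = min(labelled, key=lambda ex: abs(ex[0] - point))
        labelled.append((point, label))  # treat the pseudo-label as a real one
        remaining.remove(point)
    return labelled


labelled = [(1.0, "low"), (9.0, "high")]      # the few expensive labels
unlabelled = [2.0, 3.0, 4.0, 8.0, 7.0, 6.0]   # the cheap unlabeled data

result = dict(self_train(labelled, unlabelled))
print(result[4.0], result[6.0])  # low high
```

Two hand-made labels end up labelling all eight points, which is the economic argument for semi-supervised learning.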
Advantages and disadvantages of Semi-
supervised Learning

• Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It is used to solve drawbacks of Supervised and Unsupervised
Learning algorithms.
• Disadvantages:
• Iteration results may not be stable.
• We cannot apply these algorithms to network-level data.
• Accuracy is low.
Reinforcement Learning
• Reinforcement learning works on a feedback-based process, in which an AI agent (a
software component) automatically explores its surroundings by trial and error: taking
actions, learning from experience, and improving its performance. The agent is rewarded
for each good action and punished for each bad action; hence the goal of a reinforcement
learning agent is to maximize the rewards.
• In reinforcement learning there is no labelled data as in supervised learning, and agents learn
from their experiences only.
• The reinforcement learning process is similar to how a human being learns; for example, a child learns
various things through experience in day-to-day life. An example of reinforcement learning is
playing a game, where the game is the environment, the moves of the agent at each step define the states,
and the goal of the agent is to get a high score. The agent receives feedback in the form of punishments
and rewards.
• Because of the way it works, reinforcement learning is employed in different fields such as game
theory, operations research, information theory and multi-agent systems.
• A reinforcement learning problem can be formalized using Markov Decision Process(MDP). In
MDP, the agent constantly interacts with the environment and performs actions; at each action,
the environment responds and generates a new state.
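The agent, environment, state, action and reward loop can be sketched with tabular Q-learning on a tiny MDP. Q-learning is one common RL algorithm chosen here for illustration; the five-state corridor, the reward of 1.0 at the goal and the values of ALPHA and GAMMA are all assumptions of the sketch.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = [-1, +1]    # move left or move right
ALPHA, GAMMA = 0.5, 0.9


def step(state, action):
    """Environment: return (next_state, reward, done) for the chosen action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore by trial and error
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate towards
            # reward + discounted value of the best next action.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q = train()
# Greedy policy: in each non-goal state, take the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

The agent is never told that "right" is correct; it discovers this from the rewards alone, and the learned policy moves right in every state.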
• Categories of Reinforcement Learning
• Reinforcement learning is categorized mainly into two types of methods/algorithms:
• Positive Reinforcement Learning: Positive reinforcement learning specifies increasing the
tendency that the required behaviour would occur again by adding something. It enhances the
strength of the behaviour of the agent and positively impacts it.
• Negative Reinforcement Learning: Negative reinforcement learning works exactly opposite to
the positive RL. It increases the tendency that the specific behaviour would occur again by avoiding
the negative condition.
• Real-world Use cases of Reinforcement Learning
• Video Games: RL algorithms are very popular in gaming applications, where they are used to reach
superhuman performance. Well-known RL-based systems include AlphaGo and AlphaGo Zero, which play the board game Go.
• Resource Management: The "Resource Management with Deep Reinforcement Learning" paper
showed how RL can be used to automatically learn to allocate and schedule computer resources for waiting
jobs in order to minimize average job slowdown.
• Robotics: RL is widely being used in Robotics applications. Robots are used in the industrial and
manufacturing area, and these robots are made more powerful with reinforcement learning. There
are different industries that have their vision of building intelligent robots using AI and Machine
learning technology.
• Text Mining: Text-mining, one of the great applications of NLP, is now being implemented with the
help of Reinforcement Learning by Salesforce company.
Advantages and Disadvantages of
Reinforcement Learning
• Advantages
• It helps in solving complex real-world problems which are difficult to be
solved by general techniques.
• The learning model of RL is similar to the way human beings learn, which
can lead to very accurate results.
• Helps in achieving long-term results.
• Disadvantages
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge data and computations.
• Too much reinforcement learning can lead to an overload of states which
can weaken the results.
