Unit 1 - Machine Learning

The document discusses different types of machine learning including supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. It provides examples and common algorithms for each type. The document also discusses applications of machine learning such as maps, social media, virtual assistants, translation, and fraud detection.

Uploaded by

jinkhatima

UNIT-I

Machine learning
Machine learning is a branch of science that deals with programming the systems in such a way
that they automatically learn and improve with experience. Here, learning means recognizing
and understanding the input data and making wise decisions based on the supplied data.

It is very difficult to specify in advance the right decision for every possible input. To tackle
this problem, ML algorithms are developed. These are algorithms that can learn from data and
improve from experience, without human intervention. They build knowledge from specific
data and experience using the principles of statistics, probability theory, logic, combinatorial
optimization, search, reinforcement learning, and control theory. The developed algorithms
form the basis of various applications such as:

• Vision processing
• Language processing
• Forecasting (e.g., stock market trends)
• Pattern recognition
• Games
• Data mining
• Expert systems
• Robotics

Learning

Learning tasks may include learning the function that maps the input to the output, learning the
hidden structure in unlabelled data or ‘instance-based learning’, where a class label is produced
for a new instance by comparing the new instance (row) to instances from the training data,
which were stored in memory. ‘Instance-based learning’ does not create an abstraction from
specific instances.

Types of ML

Supervised learning

Supervised learning deals with learning a function from available training data. A supervised
learning algorithm analyses the training data and produces an inferred function, which can be
used for mapping new examples.

• Supervised learning can be viewed as function approximation: we train an algorithm and, at
the end of the process, pick the function that best describes the input data, the one that for
a given X makes the best estimate of y (X -> y). Most of the time we are not able to find the
true function that always makes correct predictions. One reason is that the algorithm relies
on assumptions made by humans about how the computer should learn, and these assumptions
introduce a bias.
• Here the human experts act as the teacher: we feed the computer training data containing
the inputs/predictors, we show it the correct answers (outputs), and from this data the
computer should be able to learn the patterns.
• Supervised learning algorithms try to model relationships and dependencies between the
target prediction output and the input features, so that we can predict the output values for
new data based on the relationships learned from previous data sets.

Common examples of supervised learning include:

• classifying e-mails as spam,

• labelling webpages based on their content, and
• voice recognition.

There are many supervised learning algorithms such as

• Nearest Neighbour
• Naive Bayes
• Decision Trees
• Linear Regression
• Support Vector Machines (SVM)
• Neural Networks

Unsupervised learning
Unsupervised learning makes sense of unlabelled data, without any predefined labelled dataset
for its training. It is an extremely powerful tool for analysing available data and looking for
patterns and trends. It is most commonly used for clustering similar inputs into logical groups.

• The computer is trained with unlabelled data.
• Here there is no teacher at all. The computer might even be able to teach us new things after
it learns patterns in the data. These algorithms are particularly useful in cases where the
human expert does not know what to look for in the data.
• These algorithms are mainly used in pattern detection and descriptive modelling. There are
no output categories or labels on which the algorithm can model relationships. Instead, the
algorithms apply techniques to the input data to mine for rules, detect patterns, and
summarize and group the data points, which helps in deriving meaningful insights and
describing the data better to the users.

List of Common Algorithms

• k-means clustering, Association Rules

Common approaches to unsupervised learning include:

• k-means
• self-organizing maps, and
• hierarchical clustering

Semi-supervised Learning

In the previous two types, labels are present either for none of the observations in the dataset
or for all of them. Semi-supervised learning falls in between these two. In many practical
situations the cost of labelling is quite high, since it requires skilled human experts. So, when
labels are absent for most of the observations but present for a few, semi-supervised
algorithms are the best candidates for model building. These methods exploit the idea that
even though the group memberships of the unlabelled data are unknown, this data carries
important information about the group parameters.

Reinforcement Learning

Reinforcement learning aims at using observations gathered from interaction with the
environment to take actions that would maximize the reward or minimize the risk. The
reinforcement learning algorithm (called the agent) continuously learns from the environment
in an iterative fashion. In the process, the agent learns from its experiences of the
environment until it explores the full range of possible states.

Reinforcement Learning is a type of Machine Learning, and thereby also a branch of
Artificial Intelligence. It allows machines and software agents to automatically determine the
ideal behaviour within a specific context, in order to maximize their performance. Simple
reward feedback is required for the agent to learn its behaviour; this is known as the
reinforcement signal.

There are many different algorithms that tackle this issue. As a matter of fact, Reinforcement
Learning is defined by a specific type of problem, and all its solutions are classed as
Reinforcement Learning algorithms. In the problem, an agent is supposed to decide the best
action to select based on its current state. When this step is repeated, the problem is known
as a Markov Decision Process.

In order to produce intelligent programs (also called agents), reinforcement learning goes
through the following steps:

1. Input state is observed by the agent.

2. Decision making function is used to make the agent perform an action.
3. After the action is performed, the agent receives reward or reinforcement from the
environment.
4. The state-action pair information about the reward is stored.
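
The four steps above can be sketched as a small tabular Q-learning loop. This is only an illustration under stated assumptions: the five-state corridor environment, the reward of 1 at the right-hand end, and the values of `ALPHA`, `GAMMA` and `EPSILON` are all hypothetical and do not come from these notes.

```python
import random

N_STATES = 5                 # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, 1)            # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, rng):
    """Step 2: epsilon-greedy decision function, with ties broken at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    # One stored value per state-action pair (the information kept in step 4).
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False                           # Step 1: observe the state
        while not done:
            action = choose_action(q, state, rng)        # Step 2: pick an action
            next_state, reward, done = step(state, action)   # Step 3: receive the reward
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + GAMMA * best_next
            # Step 4: store the updated estimate for this state-action pair
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = next_state
    return q


q_table = train()
# Greedy action in each non-terminal state after training.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The stored table plays the role of the "state-action pair information" in step 4; the update rule used here is the Q-Learning rule from the algorithm list, chosen as one possible way to fill in that step.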

List of Common Algorithms

• Q-Learning
• Temporal Difference (TD)
• Deep Adversarial Networks

Applications of ML:

Traffic Alerts (Maps)

Google Maps is probably the app we use whenever we go out and need help with directions and
traffic. The other day I was traveling to another city and took the expressway, and Maps
suggested: “Despite the heavy traffic, you are on the fastest route”. This works through a
combination of people currently using the service, historic data of that route collected over
time, and a few tricks acquired from other companies. Everyone using Maps provides their
location, average speed and the route on which they are traveling, which in turn helps Google
collect massive data about the traffic. This lets it predict the upcoming traffic and adjust
your route accordingly.

Social Media (Facebook)

Facebook uses face detection and image recognition to automatically find the face of a
person that matches its database, and then suggests that we tag that person. Facebook’s deep
learning project DeepFace is responsible for recognizing faces and identifying which person
is in the picture. Facebook also provides alt tags (alternative text) for images already
uploaded. For example, if we inspect an image on Facebook, the alt tag contains a description.

Virtual Personal Assistants

Virtual Personal Assistants help in finding useful information when asked via text or
voice. A few of the major applications of Machine Learning here are:

• Speech Recognition
• Speech to Text Conversion
• Natural Language Processing
• Text to Speech Conversion

All you need to do is ask a simple question like “What is my schedule for tomorrow?” or
“Show my upcoming flights”. To answer, your personal assistant searches for information
or recalls your related queries to collect the information.

Google Translate

When we travel to a new place, we may find it difficult to communicate with the locals or to
find local spots where everything is written in a different language.
Google’s GNMT (Google Neural Machine Translation) is a neural machine translation system
that works across a large number of languages and uses Natural Language Processing to
provide the most accurate translation of any sentence or word. Since the tone of the words
also matters, it uses other techniques like POS Tagging, NER (Named Entity Recognition) and
Chunking. It is one of the best and most used applications of Machine Learning.

Fraud Detection

Fraud Detection is one of the most necessary applications of Machine Learning. The number
of transactions has increased due to a plethora of payment channels: credit/debit cards,
smartphones, numerous wallets, UPI and much more. At the same time, criminals have
become adept at finding loopholes. Whenever a customer carries out a transaction, the
Machine Learning model thoroughly examines their profile, searching for suspicious
patterns. In Machine Learning, problems like fraud detection are usually framed as
classification problems.

Introduction to Neural Networks

Neural networks are parallel computing devices, which are basically an attempt to make a
computer model of the brain. The main objective is to develop a system that performs various
computational tasks faster than traditional systems. These tasks include pattern recognition
and classification, approximation, optimization, and data clustering.

An Artificial Neural Network (ANN) is an efficient computing system whose central theme is
borrowed from the analogy of biological neural networks. ANNs are also called “artificial
neural systems,” “parallel distributed processing systems,” or “connectionist systems.” An
ANN comprises a large collection of units that are interconnected in some pattern to allow
communication between the units. These units, also referred to as nodes or neurons, are simple
processors which operate in parallel.

Every neuron is connected to other neurons through connection links. Each connection link
is associated with a weight that carries information about the input signal. This is the most
useful information for neurons to solve a particular problem, because the weight usually
excites or inhibits the signal that is being communicated. Each neuron has an internal state,
which is called an activation signal. Output signals, which are produced after combining the
input signals and the activation rule, may be sent to other units.

Model of Artificial Neural Network

The following diagram represents the general model of ANN followed by its processing.

For the above general model of an artificial neural network, the net input can be calculated as
follows:

$y_{in} = x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_m w_m$

i.e., net input $y_{in} = \sum_{i=1}^{m} x_i w_i$

The output can be calculated by applying the activation function over the net input:

$Y = F(y_{in})$

Output = function (net input calculated)
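
As a minimal sketch of the two formulas above, the code below computes the net input as a weighted sum and then applies an activation function. The notes do not say which activation function F is used, so the sigmoid here is an assumption, and the input and weight values are made up for illustration.

```python
import math


def net_input(inputs, weights):
    """Net input y_in: the sum of x_i * w_i over all m inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def sigmoid(z):
    """One common choice for the activation function F (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, activation=sigmoid):
    """Output Y = F(y_in)."""
    return activation(net_input(inputs, weights))


x = [1.0, 0.5, -1.0]      # hypothetical input signals x1..x3
w = [0.2, 0.4, 0.1]       # hypothetical connection weights w1..w3
y_in = net_input(x, w)    # 1.0*0.2 + 0.5*0.4 + (-1.0)*0.1
y = neuron_output(x, w)
print(y_in, y)
```

Passing a different function as `activation` (for example a step function) changes F without touching the net-input calculation.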

Introduction to linear regression

Regression is a method of modelling a target value based on independent predictors. This
method is mostly used for forecasting and for finding cause-and-effect relationships between
variables. Regression techniques mostly differ based on the number of independent variables
and the type of relationship between the independent and dependent variables.

SSE

In linear regression there is a neat way to measure how accurately the line captures the
relationship between the variables (closely related to what statistics calls correlation). It
has many names: SSE, SSR, RSS. I am going to refer to it as SSE, which stands for Sum of
Squared Errors.

The regression line is the line traced by the fitted function, which in the running example
predicts a tip from a meal price. You can think of it as drawing a pixel for every possible
meal price value, thus creating a line.

An error refers to how far a data point, in this case a tip, is from the regression line. To get
the SSE we calculate the distance of each data point from the regression line, square it, and
add it to the sum.
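
The SSE calculation described above can be sketched as follows. The meal prices, tips, slope and intercept are hypothetical values chosen for illustration; the original data set is not part of these notes.

```python
def predict(meal_price, slope, intercept):
    """Point on the regression line for a given meal price."""
    return slope * meal_price + intercept


def sum_of_squared_errors(meal_prices, tips, slope, intercept):
    """Add up the squared distance of every tip from the regression line."""
    sse = 0.0
    for price, tip in zip(meal_prices, tips):
        error = tip - predict(price, slope, intercept)   # distance from the line
        sse += error ** 2                                # square it, add to the sum
    return sse


# Hypothetical data set and regression line.
meal_prices = [10.0, 20.0, 30.0, 40.0]
tips = [2.0, 3.0, 5.0, 6.0]
sse = sum_of_squared_errors(meal_prices, tips, slope=0.15, intercept=0.5)
print(sse)
```

With these numbers two tips lie on the line and two are 0.5 below it, so the sum is 0.25 + 0.25.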

Why do we square the errors?

You might think squaring the error is somewhat pointless, but there is an important reason for
doing this. If the tip is 2 dollars away from the regression line, the square of 2 is only 4. But if
the tip is 5 dollars away from the regression line, the square of 5 is 25, which is a lot more
even though there is only a 3 dollar difference from 2 to 5. The farther the tip moves away
from the regression line, the more damning it is, which is good.

If a tip moves far away from the regression line, it is a clear indicator that the correlation is
low. Squaring the error makes such points stand out. Simply summing up each error without
squaring it would not effectively show how low the correlation actually is, because errors
above and below the line would cancel each other out.

Gradient Descent

Gradient descent is an optimization algorithm used to minimize some function by iteratively
moving in the direction of steepest descent as defined by the negative of the gradient. In
machine learning, we use gradient descent to update the parameters of our model. Parameters
refer to coefficients in Linear Regression and weights in neural networks.

Gradient Descent is one of the most popular and widely used algorithms for training machine
learning models.

Machine learning models typically have parameters (weights and biases) and a cost function
to evaluate how good a particular set of parameters is. Many machine learning problems
reduce to finding a set of weights for the model which minimizes the cost function.

For example, if the prediction is p, the target is t, and our error metric is squared error, then
the cost function J(W) = (p - t)².

Note that the predicted value p depends on the input X as well as the machine learning model
and (current) values of the parameters W. During training, our aim is to find a set of values
for W such that (p - t)² is small. This means our prediction p will be close to the target t.
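
A minimal sketch of this idea for a one-feature linear model p = w·x + b, using the squared error averaged over the training examples as the cost. The data, the learning rate and the number of steps are assumptions picked so that the example converges; they are not taken from these notes.

```python
def cost(xs, ts, w, b):
    """Mean of (p - t)^2 over the data, with prediction p = w * x + b."""
    return sum(((w * x + b) - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def gradient_descent(xs, ts, learning_rate=0.05, steps=5000):
    """Repeatedly move (w, b) against the gradient of the mean squared error."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - t for x, t in zip(xs, ts)]     # p - t per example
        grad_w = (2.0 / m) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / m) * sum(errors)
        w -= learning_rate * grad_w    # step in the direction of steepest descent
        b -= learning_rate * grad_b
    return w, b


# Hypothetical targets generated by t = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ts = [3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ts)
print(w, b, cost(xs, ts, w, b))
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small needs many more steps; 0.05 is simply a value that works for this toy data.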

Normal Equation

The Normal Equation is an analytical (closed-form) approach to Linear Regression with a
Least Squares cost function. We can directly find the value of θ without using Gradient
Descent. This approach is an effective and time-saving option when we are
working with a dataset with a small number of features.
The Normal Equation is as follows:

$\theta = (X^T X)^{-1} X^T y$

In the above equation,

θ : hypothesis parameters that define it the best.
X : input feature values of each instance.
y : output value of each instance.

Maths behind the equation:

Given the hypothesis function

$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$

where,
n : the no. of features in the data set.
x_0 : 1 (for vector multiplication)
Notice that this is a dot product between the θ and x values. So, for convenience, we can
write it as:

$h_\theta(x) = \theta^T x$

The motive in Linear Regression is to minimize the cost function:

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

where,
x^{(i)} : the input value of the i-th training example.
m : no. of training instances
n : no. of data-set features
y^{(i)} : the expected result of the i-th instance
Let us represent the cost function in vector form:

$\begin{bmatrix} h_\theta(x^{(1)}) \\ \vdots \\ h_\theta(x^{(m)}) \end{bmatrix} - \begin{bmatrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{bmatrix} = \begin{bmatrix} \theta_0 x_0^{(1)} + \dots + \theta_n x_n^{(1)} \\ \vdots \\ \theta_0 x_0^{(m)} + \dots + \theta_n x_n^{(m)} \end{bmatrix} - y$

We have ignored 1/2m here, as it makes no difference to the result. It was
used for mathematical convenience when calculating gradient descent, but it is
no longer needed here.
x_j^{(i)} : value of the j-th feature in the i-th training example.
This can further be reduced to

$X\theta - y$

But each residual value is squared. We cannot simply square the above expression,
as the square of a vector/matrix is not equal to the square of each of its values. So,
to get the sum of the squared values, multiply the transpose of the vector by the vector
itself. The final expression derived is

$(X\theta - y)^T (X\theta - y)$

Therefore, the cost function is

$J(\theta) = (X\theta - y)^T (X\theta - y)$

So, now getting the value of θ using the derivative:

$\frac{\partial J}{\partial \theta} = 2 X^T X \theta - 2 X^T y = 0$

$X^T X \theta = X^T y$

$\theta = (X^T X)^{-1} X^T y$

So, this is the finally derived Normal Equation, with θ giving the minimum cost
value.
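
The closed-form solution can be sketched in plain Python as below. Instead of forming the inverse explicitly, this sketch solves the linear system (XᵀX)θ = Xᵀy by Gauss-Jordan elimination, which gives the same θ whenever XᵀX is invertible. The helper names and the toy data are assumptions for illustration only.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b (b is a column matrix) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i][0]] for i in range(n)]       # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def normal_equation(features, targets):
    """Return theta solving (X^T X) theta = X^T y, with x0 = 1 prepended to each row."""
    X = [[1.0] + list(row) for row in features]
    y = [[t] for t in targets]
    Xt = transpose(X)
    return solve(matmul(Xt, X), matmul(Xt, y))


# Hypothetical data generated by y = 1 + 2x.
theta = normal_equation([[1.0], [2.0], [3.0], [4.0]], [3.0, 5.0, 7.0, 9.0])
print(theta)
```

Here `theta[0]` is the intercept θ₀ and `theta[1]` is the coefficient of the single feature; no learning rate or iteration count is needed, which is the practical appeal of the closed form for small feature counts.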

Features of Linear regression

Linearity
The linear regression model forces the prediction to be a linear combination of
features, which is both its greatest strength and its greatest limitation. Linearity
leads to interpretable models. Linear effects are easy to quantify and describe.
They are additive, so it is easy to separate the effects. If you suspect feature
interactions or a nonlinear association of a feature with the target value, you can
add interaction terms or use regression splines.

Normality
It is assumed that the target outcome given the features follows a normal
distribution. If this assumption is violated, the estimated confidence intervals of the
feature weights are invalid.

Independence
It is assumed that each instance is independent of any other instance. If you
perform repeated measurements, such as multiple blood tests per patient, the data
points are not independent. For dependent data you need special linear regression
models, such as mixed effect models or GEEs. If you use the “normal” linear
regression model, you might draw wrong conclusions from the model.

Fixed features
The input features are considered “fixed”. Fixed means that they are treated as
“given constants” and not as statistical variables. This implies that they are free of
measurement errors. This is a rather unrealistic assumption. Without that
assumption, however, you would have to fit very complex measurement error
models that account for the measurement errors of your input features. And
usually you do not want to do that.

Absence of multicollinearity
You do not want strongly correlated features, because this messes up the
estimation of the weights. In a situation where two features are strongly correlated,
it becomes problematic to estimate the weights because the feature effects are
additive and it becomes indeterminable to which of the correlated features to
attribute the effects.

Overfitting in Machine Learning

Overfitting happens when a model learns the detail and noise in the training data to the
extent that it negatively impacts the performance of the model on new data. This means
that the noise or random fluctuations in the training data are picked up and learned as
concepts by the model. The problem is that these concepts do not apply to new data and
negatively impact the model’s ability to generalize.

Overfitting is more likely with nonparametric and nonlinear models that have more
flexibility when learning a target function. As such, many nonparametric machine
learning algorithms also include parameters or techniques to limit and constrain how
much detail the model learns.

For example, decision trees are a nonparametric machine learning algorithm that is very
flexible and is subject to overfitting training data. This problem can be addressed by
pruning a tree after it has learned in order to remove some of the detail it has picked up.

Underfitting in Machine Learning

Underfitting refers to a model that can neither model the training data nor generalize to
new data.

An underfit machine learning model is not a suitable model and will be obvious as it will
have poor performance on the training data.

Underfitting is often not discussed as it is easy to detect given a good performance
metric. The remedy is to move on and try alternate machine learning algorithms.
Nevertheless, it does provide a good contrast to the problem of overfitting.

Training Set and Test Set

In machine learning, an unknown universal dataset is assumed to exist, which
contains all the possible data pairs as well as their probability distribution of
appearance in the real world. In real applications, however, what we observe is
only a subset of the universal dataset, due to limited memory or other
unavoidable reasons. This acquired dataset is called the training set (training
data) and is used to learn the properties and knowledge of the universal dataset. In
general, vectors in the training set are assumed to be independently and identically
sampled from the universal dataset.

In machine learning, what we desire is that these learned properties can not only
explain the training set, but also be used to predict unseen samples or future
events. In order to examine the performance of learning, another dataset may be
reserved for testing, called the test set or test data.

Validation
In machine learning, we cannot simply fit the model on the training data and assume that
it will work accurately on real data. We must make sure that our model has learned the
correct patterns from the data and is not picking up too much noise.
For this purpose, we use the cross-validation technique.

Cross-Validation
Cross-validation is a technique in which we train our model using a subset of the
data-set and then evaluate it using the complementary subset of the data-set.
The three steps involved in cross-validation are as follows:
1. Reserve some portion of the sample data-set.
2. Train the model using the rest of the data-set.
3. Test the model using the reserved portion of the data-set.

Methods of Cross Validation

Validation
In this method, we perform training on 50% of the given data-set and the remaining 50% is
used for testing. The major drawback of this method is that, since we train on only
50% of the data-set, the remaining 50% may contain important information that we
leave out while training our model, i.e. higher bias.
LOOCV (Leave One Out Cross Validation)
In this method, we perform training on the whole data-set except for one data-point,
which is left out for testing, and we repeat this for each data-point. It has
advantages as well as disadvantages.
An advantage of this method is that we make use of all data points, and hence
it has low bias.
The major drawback of this method is that it leads to higher variation in testing the
model, as we are testing against a single data point. If that data point is an outlier, it can
lead to higher variation. Another drawback is that it takes a lot of execution time, as it
iterates as many times as there are data points.
K-Fold Cross Validation
In this method, we split the data-set into k subsets (known as folds). We then
perform training on k-1 of the subsets and leave the remaining one subset for the
evaluation of the trained model. We iterate k times, with a different
subset reserved for testing each time.
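
The K-fold procedure can be sketched as below. The "model" is a hypothetical stand-in that simply predicts the mean of the training targets, and the folds are taken in order without shuffling; both are simplifying assumptions made to keep the example short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_mean(targets):
    """Stand-in model: always predict the mean of the training targets."""
    return sum(targets) / len(targets)


def mean_squared_error(prediction, targets):
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


def cross_validate(targets, k):
    """Average the test error over k folds, each used exactly once for testing."""
    scores = []
    for train, test in k_fold_indices(len(targets), k):
        model = fit_mean([targets[i] for i in train])                 # train on k-1 folds
        scores.append(mean_squared_error(model, [targets[i] for i in test]))  # test on 1 fold
    return sum(scores) / len(scores)


print(cross_validate([2.0, 4.0, 6.0, 8.0], k=2))
```

Calling `cross_validate(targets, k=len(targets))` leaves out a single data point per iteration, which corresponds to the LOOCV method described earlier.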
Advantages of train/test split:
1. This runs K times faster than K-fold cross-validation, because K-fold
cross-validation repeats the train/test split K times.
2. Simpler to examine the detailed results of the testing process.

Advantages of cross-validation:
1. More accurate estimate of out-of-sample accuracy.
2. More “efficient” use of data as every observation is used for both training and
testing.

Classification
Classification is the process of predicting the class of given data points. Classes are sometimes
called targets, labels or categories. Classification predictive modeling is the task of
approximating a mapping function (f) from input variables (X) to discrete output variables (y).

For example, spam detection in email service providers can be identified as a classification
problem. This is a binary classification, since there are only 2 classes: spam and not spam. A
classifier utilizes some training data to understand how given input variables relate to the class.
In this case, known spam and non-spam emails have to be used as the training data. When the
classifier is trained accurately, it can be used to classify an unknown email.

Classification belongs to the category of supervised learning, where the targets are also provided
with the input data. Classification has applications in many domains, such as
credit approval, medical diagnosis, target marketing etc.

There are two types of learners in classification: lazy learners and eager learners.

1. Lazy learners

Lazy learners simply store the training data and wait until testing data appears. When it does,
classification is conducted based on the most related data in the stored training data. Compared
to eager learners, lazy learners need less training time but more time for predicting.

Ex. k-nearest neighbor, Case-based reasoning

2. Eager learners

Eager learners construct a classification model based on the given training data before receiving
data for classification. An eager learner must be able to commit to a single hypothesis that covers
the entire instance space. Due to the model construction, eager learners take a long time to train
and less time to predict.

Ex. Decision Tree, Naive Bayes, Artificial Neural Networks

Classification Problem + Decision Boundary

The classification problem consists of taking input vectors and deciding which of N classes
they belong to, based on training from exemplars of each class. The most important point
about the classification problem is that it is discrete—each example belongs to precisely one
class, and the set of classes covers the whole possible output space. These two constraints
are not necessarily realistic; sometimes examples might belong partially to two different
classes. There are fuzzy classifiers that try to solve this problem, but we won’t be talking
about them in this book. In addition, there are many places where we might not be able
to categorise every possible input. For example, consider a vending machine, where we use
a neural network to learn to recognise all the different coins. We train the classifier to
recognise all New Zealand coins, but what if a British coin is put into the machine? In that
case, the classifier will identify it as the New Zealand coin that is closest to it in appearance,
but this is not really what is wanted: rather, the classifier should identify that it is not one
of the coins it was trained on. This is called novelty detection. For now we’ll assume that we
will not receive inputs that we cannot classify accurately.

Classification problems involve predicting one particular class among multiple classes.
In other words, the problem can be framed as follows: a particular instance (a data-point in
terms of Feature Space Geometry) needs to be kept within a particular region (signifying its
class) and needs to be separated from other regions (signifying other classes). This separation
from other regions can be visualized by a boundary known as the Decision Boundary. The
Decision Boundary in feature space is visualized on a Scatter Plot, where every point depicts a
data-point of the data-set and the axes depict the features. The Decision Boundary separates the
data-points into regions, which are the classes to which they belong.

Importance/Significance of a Decision Boundary:

After training a Machine Learning Model using a data-set, it is often necessary to visualize the
classification of the data-points in Feature Space. A Decision Boundary on a Scatter Plot serves
this purpose. The Scatter Plot contains the data-points belonging to different classes
(denoted by colour or shape), and the decision boundary can be drawn following several
different strategies:

1. Single-Line Decision Boundary: The basic strategy to draw the Decision Boundary on a
Scatter Plot is to find a single line that separates the data-points into regions signifying
different classes. This single line is found using the parameters of the Machine Learning
Algorithm that are obtained after training the model. The line co-ordinates are found using
the obtained parameters and the intuition behind the Machine Learning Algorithm. This
strategy cannot be used if the intuition and working mechanism of the ML Algorithm are
not known.

2. Contour-Based Decision Boundary: Another strategy involves drawing contours, which
are regions each enclosing data-points of matching or closely matching colours. The point
colours depict the classes to which the data-points belong, and the contours depict the
predicted classes. This is the most commonly followed strategy, as it does not use the
parameters and related calculations of the Machine Learning Algorithm obtained after
Model Training. On the other hand, it does not separate the data-points as precisely as a
single line, which can only be obtained from the trained parameters and the calculation of
the line's co-ordinates.
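
Both strategies can be sketched for a two-feature linear classifier. The parameters `W1`, `W2` and `B` are hypothetical trained values, and plotting is left out so the example needs no extra libraries: the first function returns points on the separating line, the second labels a coarse grid of feature space using predictions only.

```python
W1, W2, B = 1.0, -1.0, 0.0   # hypothetical parameters obtained after training


def predict_class(x1, x2):
    """Linear classifier: class 1 on one side of the line, class 0 on the other."""
    return 1 if W1 * x1 + W2 * x2 + B >= 0 else 0


def boundary_x2(x1):
    """Strategy 1: solve w1*x1 + w2*x2 + b = 0 for x2 (requires w2 != 0)."""
    return -(W1 * x1 + B) / W2


def grid_predictions(x_min, x_max, steps):
    """Strategy 2: predict a class for every cell of a grid over feature space."""
    step = (x_max - x_min) / (steps - 1)
    coords = [x_min + i * step for i in range(steps)]
    # One row per x2 value, one column per x1 value.
    return [[predict_class(x1, x2) for x1 in coords] for x2 in coords]


# Strategy 1: two points are enough to draw the line on a scatter plot.
line_points = [(x1, boundary_x2(x1)) for x1 in (-2.0, 2.0)]
print(line_points)

# Strategy 2: print the grid with the largest x2 at the top, like a plot.
grid = grid_predictions(-2.0, 2.0, 5)
for row in reversed(grid):
    print(" ".join(str(label) for label in row))
```

The grid approach only calls `predict_class`, so it works for any trained model; the line approach needs to know that the model is linear and what its parameters mean.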

Nearest-Neighbours (k-Nearest-Neighbours)

The k-Nearest-Neighbours (kNN) method of classification is one of the simplest methods in
machine learning and is a great way to introduce yourself to machine learning and classification
in general. At its most basic level, it is essentially classification by finding the most similar data
points in the training data, and making an educated guess based on their classifications.
Although very simple to understand and implement, this method has seen wide application in
many domains, such as in recommendation systems, semantic searching, and anomaly
detection.

As in any machine learning problem, we must first find a way to represent
data points as feature vectors. A feature vector is our mathematical representation of data, and
since the desired characteristics of our data may not be inherently numerical, pre-processing and
feature-engineering may be required in order to create these vectors. Given data with N unique
features, the feature vector would be a vector of length N, where entry i of the vector represents
that data point’s value for feature i. Each feature vector can thus be thought of as a point in R^N.

Now, unlike most other methods of classification, kNN falls under lazy learning, which means
that there is no explicit training phase before classification. Instead, any attempts to
generalize or abstract the data are made upon classification. While this does mean that we can
immediately begin classifying once we have our data, there are some inherent problems with
this type of algorithm. We must be able to keep the entire training set in memory unless we
apply some type of reduction to the data-set, and performing classifications can be
computationally expensive, as the algorithm parses through all data points for each
classification. For these reasons, kNN tends to work best on smaller data-sets that do not
have many features.

Once we have formed our training data-set, which is represented as an M x N matrix where M is
the number of data points and N is the number of features, we can begin classifying. The
gist of the kNN method is, for each classification query, to:

1. Compute a distance value between the item to be classified and every item in the training
data-set
2. Pick the k closest data points (the items with the k lowest distances)
3. Conduct a “majority vote” among those data points — the dominating classification in that
pool is decided as the final classification
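
The three steps above can be sketched directly. Euclidean distance is used as the metric here, which is just one of the options discussed in these notes, and the tiny two-class training set is made up for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, training_points, training_labels, k=3):
    # 1. Distance between the query and every item in the training data-set.
    distances = [
        (euclidean(query, point), label)
        for point, label in zip(training_points, training_labels)
    ]
    # 2. Keep the k items with the lowest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 3. Majority vote among those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: an M x N matrix with M = 6 points and N = 2 features.
points = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [6.0, 6.0], [7.0, 6.0], [6.0, 7.0]]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify([1.5, 1.5], points, labels, k=3))
print(knn_classify([6.5, 6.5], points, labels, k=3))
```

Note that every query scans the full training set, which is the computational cost of lazy learning mentioned earlier.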

There are two important decisions that must be made before making classifications. One is the
value of k that will be used; this can either be decided arbitrarily, or you can try
cross-validation to find an optimal value. The other, and the more complex, is the distance
metric that will be used.
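
One way to pick k by cross-validation is leave-one-out: hold out each point in turn, classify it from the remaining points, and compare accuracy across candidate values of k. The sketch below assumes NumPy and Euclidean distance; the helper name `loo_accuracy`, the candidate values, and the toy data are all invented for illustration.

```python
import numpy as np


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy of a Euclidean kNN classifier for a given k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    correct = 0
    for i in range(len(X)):
        # Hold out point i and classify it using all the other points.
        mask = np.arange(len(X)) != i
        distances = np.linalg.norm(X[mask] - X[i], axis=1)
        nearest = np.argsort(distances)[:k]
        labels, counts = np.unique(y[mask][nearest], return_counts=True)
        if labels[np.argmax(counts)] == y[i]:
            correct += 1
    return correct / len(X)


# Toy data (invented): two well-separated clusters of four points each.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Score a few candidate values of k and keep the best-scoring one.
scores = {k: loo_accuracy(X, y, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this toy set the largest candidate performs worst, because with k = 7 the held-out point is outvoted by the four points of the other cluster. An overly large k washes out local structure.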

There are many different ways to compute distance, as it is a fairly ambiguous notion, and the
proper metric to use is always going to be determined by the data-set and the classification task.
Two popular ones, however, are Euclidean distance and Cosine similarity.

Euclidean distance is probably the one that you are most familiar with; it is essentially the
magnitude of the vector obtained by subtracting the training data point from the point to be
classified.

General formula for Euclidean distance
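
The formula itself is not reproduced in this copy of the notes. For reference, the standard definition for two points x and y in R^N (the symbols x, y, and the index i are notation chosen here) is:

```latex
d(\mathbf{x}, \mathbf{y})
  = \lVert \mathbf{x} - \mathbf{y} \rVert_2
  = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}
```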

Another common metric is Cosine similarity. Rather than calculating a magnitude, Cosine
similarity measures the angle between two vectors and ignores their lengths.

General formula for Cosine similarity
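
The formula is likewise missing from this copy. The standard definition for two non-zero vectors x and y in R^N (notation chosen here) is:

```latex
\cos(\theta)
  = \frac{\mathbf{x} \cdot \mathbf{y}}{\lVert \mathbf{x} \rVert \, \lVert \mathbf{y} \rVert}
  = \frac{\sum_{i=1}^{N} x_i y_i}
         {\sqrt{\sum_{i=1}^{N} x_i^2} \, \sqrt{\sum_{i=1}^{N} y_i^2}}
```

Because this is a similarity (larger means closer), a kNN implementation would typically pick the k highest values, or equivalently use 1 - cos(θ) as the distance.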

Choosing a metric can often be tricky, and it may be best to just use cross-validation to decide,
unless you have some prior insight that clearly leads to using one over the other. For example,
for something like word vectors, you may want to use Cosine similarity because the direction
of a word vector is more meaningful than the sizes of its component values. Generally, both of
these metrics take roughly the same time to compute, and both suffer on high-dimensional data.

After doing all of the above and deciding on a metric, the result of the kNN algorithm is a
decision boundary that partitions R^N into sections. Each section (colored distinctly below)
represents a class in the classification problem. The boundaries need not be formed with actual
training examples — they are instead calculated using the distance metric and the available
training points. By taking R^N in (small) chunks, we can calculate the most likely class for a
hypothetical data-point in that region, and we thus color that chunk as being in the region for
that class.
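
The chunk-by-chunk idea can be sketched by classifying the points of a coarse grid. The example below is only an illustration under stated assumptions: NumPy, Euclidean distance, k = 3, and an invented two-class data set in R^2 that is symmetric about the line x = 2. The helper name `knn_predict` is hypothetical.

```python
import numpy as np


def knn_predict(train_X, train_y, query, k):
    """Majority-vote kNN prediction with Euclidean distance."""
    distances = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy training set in R^2 (invented): class 0 on the left, class 1 on the right.
train_X = np.array([[0.0, 0.0], [0.0, 2.0], [1.0, 1.0],
                    [4.0, 0.0], [4.0, 2.0], [3.0, 1.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])

# Cut the region [0, 4] x [0, 2] into a coarse grid and label each grid point.
xs = np.linspace(0.0, 4.0, 8)   # 8 columns, none exactly on the line x = 2
ys = np.linspace(0.0, 2.0, 5)   # 5 rows
regions = np.array([[knn_predict(train_X, train_y, np.array([x, y]), k=3)
                     for x in xs] for y in ys])

# Each row is one y value; the switch from 0 to 1 traces the decision boundary.
print(regions)
```

A finer grid gives a smoother picture of the boundary, at the cost of one kNN query per grid point.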

This information is all that is needed to begin implementing the algorithm, and doing so should
be relatively simple. There are, of course, many ways to improve upon this base algorithm.
Common modifications include weighting (for example, giving closer neighbours a larger say in
the vote) and specific pre-processing to reduce computation and noise, such as various
algorithms for feature extraction and dimension reduction. Additionally, the kNN method has
also been used, although less commonly, for regression tasks, where it operates much like the
classifier but averages the neighbours' target values instead of taking a vote.
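
As a rough sketch of the regression variant and of one possible weighting scheme, the function below averages the targets of the k nearest points, optionally weighting each by the inverse of its distance. Inverse-distance weighting is an assumption chosen for illustration (other schemes exist), and the name `knn_regress`, the `eps` safeguard, and the toy data are invented.

```python
import numpy as np


def knn_regress(train_X, train_y, query, k=3, weighted=False, eps=1e-12):
    """Predict a numeric target as the (optionally distance-weighted) mean
    of the targets of the k nearest training points."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    distances = np.linalg.norm(train_X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    if not weighted:
        return float(train_y[nearest].mean())
    # Inverse-distance weights: closer neighbours count for more.
    # eps avoids division by zero when the query coincides with a training point.
    weights = 1.0 / (distances[nearest] + eps)
    return float(np.sum(weights * train_y[nearest]) / np.sum(weights))


# Toy 1-D data (invented): the target is twice the feature.
train_X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
train_y = [2.0, 4.0, 6.0, 8.0, 10.0]

print(knn_regress(train_X, train_y, [2.5], k=2))                 # plain average
print(knn_regress(train_X, train_y, [2.2], k=2, weighted=True))  # weighted average
```

With weighting, a query at 2.2 is pulled toward the target of its closest neighbour (at 2.0) rather than sitting midway between the two neighbours' targets.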

MATLAB
MATLAB makes machine learning easy. With tools and functions for handling big data, as
well as apps to make machine learning accessible, MATLAB is an ideal environment for
applying machine learning to your data analytics.

With MATLAB, engineers and data scientists have immediate access to prebuilt functions,
extensive toolboxes, and specialized apps for classification, regression, and clustering.

MATLAB lets you:

• Compare approaches such as logistic regression, classification trees, support vector machines,
ensemble methods, and deep learning.
• Use model refinement and reduction techniques to create an accurate model that best captures
the predictive power of your data.
• Integrate machine learning models into enterprise systems, clusters, and clouds, and target
models to real-time embedded hardware.
• Perform automatic code generation for embedded sensor analytics.
• Support integrated workflows from data analytics to deployment.
