MLT Unit 4 and 5 Part 2

The document discusses various neural network architectures, including static and recurrent backpropagation, and emphasizes the significance of deep learning in pattern recognition and prediction. It outlines the structure of deep learning models, including input, hidden, and output layers, and highlights the advantages and limitations of deep learning techniques. Additionally, it covers convolutional neural networks, structured prediction, and learning to rank methodologies, along with model validation techniques in supervised machine learning.

Uploaded by

devendra vikram

DEPARTMENT OF CSE AY:2023-24

 Static backpropagation: Static backpropagation is a network designed to map static
inputs to static outputs. These types of networks are capable of solving static
classification problems such as OCR (Optical Character Recognition).
 Recurrent backpropagation: Recurrent backpropagation is another network used for
fixed-point learning. Activation in recurrent backpropagation is fed forward until a
fixed value is reached. Static backpropagation provides an instant mapping, while
recurrent backpropagation does not provide an instant mapping.

Advantages:

 It is simple, fast, and easy to program.
 It has no parameters to tune apart from the number of inputs.
 It is flexible and efficient.
 No need for users to learn any special functions.

Disadvantages:

 It is sensitive to noisy data and irregularities. Noisy data can lead to inaccurate results.
 Performance is highly dependent on input data.
 Training can take a long time.
 The matrix-based approach is preferred over a mini-batch.

What is Deep Learning?

Deep learning is a computing technique that mimics the network of neurons in a brain. It is a subset of
machine learning and is called deep learning because it makes use of deep neural networks.

Deep learning algorithms are constructed with connected layers.

 The first layer is called the Input Layer
 The last layer is called the Output Layer
 All layers in between are called Hidden Layers. The word deep means the network joins neurons in
more than two layers.

MACHINE LEARNING 50

Each Hidden layer is composed of neurons. The neurons are connected to each other. Each neuron
processes the input signal it receives and then propagates it to the layer above it. The strength of the signal
passed to a neuron in the next layer depends on the weight, bias and activation function.

The network consumes large amounts of input data and passes it through multiple layers; the network
can learn increasingly complex features of the data at each layer.

Why is Deep Learning Important?

Deep learning is a powerful tool for turning predictions into actionable results. Deep learning excels in pattern
discovery (unsupervised learning) and knowledge-based prediction. Big data is the fuel for deep learning.
When both are combined, an organization can reap unprecedented results in terms of productivity, sales,
management, and innovation.

Deep learning can outperform traditional methods. For instance, deep learning algorithms are 41% more
accurate than traditional machine learning algorithms in image classification, 27% more accurate in facial
recognition and 25% more accurate in voice recognition.

Limitations of deep learning

Data labelling

Most current AI models are trained through "supervised learning." It means that humans must label and
categorize the underlying data, which can be a sizable and error-prone chore. For example, companies
developing self-driving-car technologies are hiring hundreds of people to manually annotate hours of video
feeds from prototype vehicles to help train these systems.

Obtain huge training datasets

It has been shown that simple deep learning techniques like CNN can, in some cases, imitate the knowledge
of experts in medicine and other fields. The current wave of machine learning, however, requires training
data sets that are not only labeled but also sufficiently broad and universal.

Deep-learning methods require thousands of observations for models to become relatively good at
classification tasks and, in some cases, millions for them to perform at the level of humans. Unsurprisingly,
deep learning is popular in giant tech companies, which use big data to accumulate petabytes of data. This
allows them to create impressive and highly accurate deep learning models.

Unsupervised

Unsupervised feature learning is learning features from unlabeled data. The goal of unsupervised feature
learning is often to discover low-dimensional features that capture some structure underlying the
high-dimensional input data. When the feature learning is performed in an unsupervised way, it enables a form of
semi-supervised learning where features learned from an unlabeled dataset are then employed to improve
performance in a supervised setting with labeled data. Several approaches are introduced in the following.

Recurrent Neural Network

A Recurrent Neural Network is architected in the same way as a “traditional” Neural Network. We
have some inputs, we have some hidden layers and we have some outputs.

The only difference is that each hidden unit is doing a slightly different function. So, let’s explore how this
hidden unit works.

A recurrent hidden unit computes a function of an input and its own previous output, also known as the cell
state. For textual data, an input could be a vector representing a word x(i) in a sentence of n words (also
known as word embedding).

At the first step the unit computes s1 = tanh(W x0 + U s0), where W and U are weight matrices and tanh is
the hyperbolic tangent function.
Similarly, at the next step, it computes a function of the new input and its previous cell state:
s2 = tanh(W x1 + U s1). This behavior is similar to a hidden unit in a feed-forward network. The difference,
specific to sequences, is that we are adding an additional term to incorporate its own previous state.
A common way of viewing recurrent neural networks is by unfolding them across time. We can notice that
we are using the same weight matrices W and U throughout the sequence. This solves our problem of
parameter sharing. We don’t have new parameters for every point of the sequence. Thus, once we learn
something, it can be applied at any point in the sequence.

The fact of not having new parameters for every point of the sequence also helps us deal with variable-
length sequences. In case of a sequence that has a length of 4, we could unroll this RNN to four timesteps.
In other cases, we can unroll it to ten timesteps since the length of the sequence is not prespecified in the
algorithm. By unrolling we simply mean that we write out the network for the complete sequence. For
example, if the sequence we care about is a sentence of 5 words, the network would be unrolled into a 5-
layer neural network, one layer for each word.
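
The unrolling described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the state update is taken to be s_t = tanh(W x_t + U s_(t-1)) with the same W and U at every step, and the sizes, the random weights and the helper name run_rnn are made up for the example.

```python
import numpy as np

def run_rnn(inputs, W, U, s0):
    """Unroll a simple recurrent unit over a sequence of input vectors."""
    state = s0
    states = []
    for x in inputs:                          # one "layer" per element of the sequence
        state = np.tanh(W @ x + U @ state)    # the same W and U are reused at every step
        states.append(state)
    return states

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 3, 4                  # assumed sizes, for illustration only
W = rng.normal(size=(hidden_dim, embed_dim))
U = rng.normal(size=(hidden_dim, hidden_dim))
s0 = np.zeros(hidden_dim)

# The same parameters handle sequences of different lengths.
short_seq = [rng.normal(size=embed_dim) for _ in range(4)]
long_seq = [rng.normal(size=embed_dim) for _ in range(10)]

short_states = run_rnn(short_seq, W, U, s0)
long_states = run_rnn(long_seq, W, U, s0)
print(len(short_states), len(long_states))    # 4 10
```

Nothing in run_rnn depends on the sequence length, which is why the network can be unrolled to four steps or to ten without adding parameters.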

Introduction to Convolution Neural Network

It is assumed that the reader knows the concept of neural
networks. When it comes to machine learning, artificial neural networks perform really well.
Artificial neural networks are used in various classification tasks involving images, audio, and words.
Different types of neural networks are used for different purposes: for example, for predicting a sequence
of words we use recurrent neural networks (more precisely, an LSTM), and similarly for image
classification we use convolution neural networks. Before building up the basic building block of a CNN,
recall that in a regular neural network there are three types of layers:

1. Input Layer: This is the layer in which we give input to our model. The number of neurons in
this layer is equal to the total number of features in our data (the number of pixels in the case of an
image).
2. Hidden Layer: The input from the input layer is then fed into the hidden layer. There can
be many hidden layers depending upon our model and data size. Each hidden layer can have
a different number of neurons, which is generally greater than the number of features. The output from
each layer is computed by matrix multiplication of the output of the previous layer with the learnable
weights of that layer, then the addition of learnable biases, followed by an activation function which makes
the network nonlinear.
3. Output Layer: The output from the hidden layer is then fed into a logistic function like
sigmoid or softmax, which converts the output of each class into the probability score of each class.

The data is then fed into the model and the output from each layer is obtained; this step is called
feedforward. We then calculate the error using an error function; some common error functions are
cross-entropy, square loss error, etc. After that, we backpropagate into the model by calculating the
derivatives. This step is called backpropagation, and it is used to minimize the
loss. Here’s the basic Python code for a neural network with random inputs and two hidden
layers.

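import numpy as np

activation = lambda x: 1.0 / (1.0 + np.exp(-x))  # sigmoid function
input = np.random.randn(3, 1)  # random input with 3 features

# Randomly initialised parameters; the layer sizes (4, 4, 1) are illustrative assumptions.
W1, b1 = np.random.randn(4, 3), np.zeros((4, 1))
W2, b2 = np.random.randn(4, 4), np.zeros((4, 1))
W3, b3 = np.random.randn(1, 4), np.zeros((1, 1))

hidden_1 = activation(np.dot(W1, input) + b1)
hidden_2 = activation(np.dot(W2, hidden_1) + b2)
output = np.dot(W3, hidden_2) + b3

W1, W2, W3, b1, b2, b3 are the learnable parameters of the model.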

Convolution Neural Network

Convolution Neural Networks or ConvNets are neural networks that share their parameters. Imagine
you have an image. It can be represented as a cuboid having its length, width (dimension of the
image), and height (as images generally have red, green, and blue channels).

Now imagine taking a small patch of this image and running a small neural network on it, with say, k
outputs, and representing them vertically. Now slide that neural network across the whole image; as a
result, we will get another image with different width, height, and depth. Instead of just R, G, and B
channels, we now have more channels but smaller width and height. This operation is called convolution.
If the patch size is the same as that of the image, it will be a regular neural network. Because of this small
patch, we have fewer weights.

Now let’s talk about a bit of the mathematics that is involved in the whole
convolution process.

 Convolution layers consist of a set of learnable filters (a patch in the above image). Every filter has
a small width and height and the same depth as that of the input volume (3 if the input layer is an image input).
 For example, suppose we have to run convolution on an image with dimension 34 x 34 x 3. The possible
size of filters can be a x a x 3, where ‘a’ can be 3, 5, 7, etc., but small compared to the image dimension.
 During the forward pass, we slide each filter across the whole input volume step by step, where the size
of each step is called the stride (which can have a value of 2, 3, or even 4 for high-dimensional images), and
compute the dot product between the weights of the filter and the patch from the input volume.
 As we slide our filters we’ll get a 2-D output for each filter; we’ll stack them together and, as a
result, get an output volume having a depth equal to the number of filters. The network will learn all
the filters.
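
The sliding dot product in the bullets above can be written out as a deliberately naive NumPy loop. This is a sketch under assumptions, not an efficient implementation: there is no padding and no bias term, the count of 12 filters and the random values are illustrative, and conv_forward is a hypothetical helper name.

```python
import numpy as np

def conv_forward(volume, filters, stride=1):
    """Naive convolution: slide each filter over the input volume, no padding."""
    height, width, depth = volume.shape
    num_filters, a, _, f_depth = filters.shape
    assert f_depth == depth, "filter depth must match the input depth"
    out_h = (height - a) // stride + 1
    out_w = (width - a) // stride + 1
    out = np.zeros((out_h, out_w, num_filters))
    for k in range(num_filters):
        for i in range(out_h):
            for j in range(out_w):
                r, c = i * stride, j * stride
                patch = volume[r:r + a, c:c + a, :]
                # Dot product between the filter weights and the input patch.
                out[i, j, k] = np.sum(patch * filters[k])
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(34, 34, 3))        # the 34 x 34 x 3 example from the text
filters = rng.normal(size=(12, 3, 3, 3))    # 12 filters of size 3 x 3 x 3 (assumed count)

out = conv_forward(image, filters, stride=1)
print(out.shape)                            # (32, 32, 12)
```

Each filter contributes one 2-D slice, so the output depth equals the number of filters, and without padding the width and height shrink from 34 to 32.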

Layers used to build ConvNets

A ConvNet is a sequence of layers, and every layer transforms one volume to another through a
differentiable function.

Types of layers:
1. Input Layer: This layer holds the raw input of the image with width 32, height 32, and depth 3.
2. Convolution Layer: This layer computes the output volume by computing the dot product between all filters
and image patches. Suppose we use a total of 12 filters for this layer; we’ll get an output volume of
dimension 32 x 32 x 12 (assuming padding that preserves the width and height).
3. Activation Function Layer: This layer will apply an element-wise activation function to the output of
the convolution layer. Some common activation functions are RELU: max(0, x), Sigmoid: 1/(1+e^-x),
Tanh, Leaky RELU, etc. The volume remains unchanged, hence the output volume will have dimension
32 x 32 x 12.
4. Pool Layer: This layer is periodically inserted in the ConvNet and its main function is to reduce the size
of the volume, which makes the computation fast, reduces memory and also prevents overfitting. Two common
types of pooling layers are max pooling and average pooling. If we use a max pool with 2 x 2 filters and
stride 2, the resultant volume will be of dimension 16 x 16 x 12.
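
The shapes quoted for the activation and pool layers can be checked with a short NumPy sketch. The input here is random data standing in for the output of a convolution layer, and max_pool is a hypothetical helper written for this illustration.

```python
import numpy as np

def relu(x):
    """Element-wise RELU: max(0, x). The volume shape is unchanged."""
    return np.maximum(0, x)

def max_pool(volume, size=2, stride=2):
    """Max pooling applied independently to every channel."""
    height, width, depth = volume.shape
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    out = np.zeros((out_h, out_w, depth))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = volume[r:r + size, c:c + size, :]
            out[i, j, :] = window.max(axis=(0, 1))   # largest value in each channel
    return out

rng = np.random.default_rng(0)
conv_out = rng.normal(size=(32, 32, 12))   # stand-in for a convolution layer output
activated = relu(conv_out)
pooled = max_pool(activated, size=2, stride=2)
print(activated.shape, pooled.shape)       # (32, 32, 12) (16, 16, 12)
```

Pooling halves the width and height but leaves the depth at 12, because each channel is pooled separately.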

Performance Metrics

• Accuracy can be calculated by summing the values lying across the “main diagonal” of the confusion
matrix and dividing by the total, i.e.
Accuracy = (True Positives + True Negatives) / Total Number of Samples

• Precision: It is the number of correct positive results divided by the number of positive results predicted by
the classifier.
Precision = True Positives / (True Positives + False Positives)

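• Recall: It is the number of correct positive results divided by the number of all relevant samples
(all samples that should have been identified as positive).
Recall = True Positives / (True Positives + False Negatives)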
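
As a small worked example, the three metrics can be computed from the counts of a binary confusion matrix. The labels below are made-up toy data and confusion_counts is a hypothetical helper; the formulas are the standard definitions in terms of true/false positives and negatives.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # toy ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]   # toy classifier output

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, tn, fp, fn)                # 3 2 2 1
print(accuracy, precision, recall)   # 0.625 0.6 0.75
```

Here the classifier finds three of the four positives (recall 0.75) but two of its five positive predictions are wrong (precision 0.6).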

Structured prediction or structured (output) learning :-

It is an umbrella term for supervised machine learning techniques that involve predicting structured objects,
rather than scalar discrete or real values.

Similar to commonly used supervised learning techniques, structured prediction models are typically trained
by means of observed data in which the true prediction value is used to adjust model parameters. Due to the
complexity of the model and the interrelations of predicted variables, the process of prediction using a trained
model and of training itself is often computationally infeasible, so approximate inference and learning
methods are used.

For example, the problem of translating a natural language sentence into a syntactic representation such as a
parse tree can be seen as a structured prediction problem in which the structured output domain is the set of
all possible parse trees. Structured prediction is also used in a wide variety of application
domains including bioinformatics, natural language processing, speech recognition, and computer vision.

Example: sequence tagging

Sequence tagging is a class of problems prevalent in natural language processing, where input data are often
sequences (e.g. sentences of text). The sequence tagging problem appears in several guises, e.g. part-of-
speech tagging and named entity recognition. In POS tagging, for example, each word in a sequence must
receive a "tag" (class label) that expresses its "type" of word:

DT - Determiner
VB - Verb
JJ - Adjective
NN - Noun

Ranking :-

Learning to Rank (LTR) is a class of techniques that apply supervised machine learning (ML) to solve
ranking problems. The main difference between LTR and traditional supervised ML is this:

 Traditional ML solves a prediction problem (classification or regression) on a single instance at a
time. E.g. if you are doing spam detection on email, you will look at all the features associated with that
email and classify it as spam or not. The aim of traditional ML is to come up with a class (spam or no-spam)
or a single numerical score for that instance.
 LTR solves a ranking problem on a list of items. The aim of LTR is to come up with an optimal ordering of
those items. As such, LTR doesn't care much about the exact score that each item gets, but cares more about
the relative ordering among all the items.

The most common application of LTR is search engine ranking, but it's useful anywhere you need to produce
a ranked list of items.

The training data for an LTR model consists of a list of items and a "ground truth" score for each of those
items. For search engine ranking, this translates to a list of results for a query and a relevance rating for each
of those results with respect to the query. The most common way used by major search engines to generate
these relevance ratings is to ask human raters to rate results for a set of queries.
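
The point that LTR cares about relative ordering rather than exact scores can be illustrated with a small sketch. The document names, ratings and scores are invented, and pairwise accuracy is used here only as one simple way to compare an ordering against ground-truth ratings; it is an assumption of this example, not a method named in these notes.

```python
from itertools import combinations

def rank_items(items, scores):
    """Return the items sorted by descending model score."""
    ranked = sorted(zip(items, scores), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked]

def pairwise_accuracy(relevance, scores):
    """Fraction of item pairs (with different relevance) that the scores order correctly."""
    correct = total = 0
    for i, j in combinations(range(len(relevance)), 2):
        if relevance[i] == relevance[j]:
            continue
        total += 1
        if (relevance[i] - relevance[j]) * (scores[i] - scores[j]) > 0:
            correct += 1
    return correct / total if total else 1.0

results = ["doc_a", "doc_b", "doc_c", "doc_d"]   # hypothetical results for one query
relevance = [3, 0, 2, 1]                         # hypothetical human relevance ratings

scores_1 = [0.9, 0.1, 0.7, 0.3]
scores_2 = [12.0, -5.0, 3.0, 0.5]                # very different values, same ordering
scores_3 = [0.2, 0.9, 0.7, 0.3]                  # puts the irrelevant doc_b first

print(rank_items(results, scores_1))             # ['doc_a', 'doc_c', 'doc_d', 'doc_b']
print(rank_items(results, scores_2))             # ['doc_a', 'doc_c', 'doc_d', 'doc_b']
print(pairwise_accuracy(relevance, scores_1))    # 1.0
print(pairwise_accuracy(relevance, scores_3))
```

The first two score lists differ wildly in magnitude but produce the same ranking, so from a ranking point of view they are equally good.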

Learning to rank algorithms have been applied in areas other than information retrieval:

 In machine translation for ranking a set of hypothesized translations
 In computational biology for ranking candidate 3-D structures in the protein structure prediction problem
 In recommender systems for identifying a ranked list of related news articles to recommend to a user
after he or she has read a current news article
 In software engineering, learning-to-rank methods have been used for fault localization

UNIT - IV
Model Validation in Classification : Cross Validation - Holdout Method, K-Fold, Stratified K-Fold, Leave-
One-Out Cross Validation. Bias-Variance tradeoff, Regularization , Overfitting, Underfitting. Ensemble
Methods: Boosting, Bagging, Random Forest.

Supervised Machine Learning: Model Validation, a Step by Step Approach

Model validation is the process of evaluating a trained model on a test data set. This gives an estimate of the
generalization ability of the trained model. Here I provide a step by step approach to complete a first iteration
of model validation in minutes.
The basic recipe for applying a supervised machine learning model is:

1. Choose a class of model
2. Choose model hyperparameters
3. Fit the model to the training data
4. Use the model to predict labels for new data

What exactly is Cross-Validation?

CV is a technique used to train and evaluate an ML model using several portions of a dataset. This
implies that rather than splitting the dataset into two parts only, one to train on and another to test on, the
dataset is divided into more slices, or “folds”. The ML model is then trained and tested across these folds
in turn to estimate its predictive capability and hence its accuracy.

Cross-Validation Data Flow Overview

In each round of cross-validation, some portions of the data are gathered to build the training set, while
the remaining ones are reserved for constructing a validation set. Rotating these roles across rounds
ensures that the model is trained and tested on different portions of the data, which gives a better picture
of how it adapts to various scenarios.

One key objective of employing cross-validation is to safeguard the model against overfitting.
Overfitting occurs when a model simply memorizes the samples in the training set, resulting in an
artificially high score on the training data. However, such a model may struggle to generalize to unseen
data, leading to a lack of useful results. By validating the model's performance on a separate validation
set, CV helps identify whether the model has truly learned meaningful patterns and can generalize to new
and unseen scenarios effectively.

The three key steps involved in CV are as follows:

1. Slice the dataset and reserve a portion of it.
2. Using what's left, train the ML model.
3. Test the model using the portion of the dataset reserved in step 1.

The Advantages of CV are as follows:

1. CV assists in realizing the optimal tuning of hyperparameters (or model settings) that increase
the overall efficiency of the ML model's performance.
2. Training data is efficiently utilized as every observation is employed for both testing and
training.

The Disadvantages of CV are as follows:

1. One of the main drawbacks of cross-validation (CV) is the significant increase in testing
and training time it requires for machine learning models. This is because CV involves multiple
iterative training and testing cycles, one per fold, rather than a single pass over one train/test
split. This time commitment should be taken into account when planning to use CV.
2. Additional computation translates to increased resource demands. Cross-validation is known for
its high computational expense, necessitating ample processing power. This compounds the first
drawback of extended time and further inflates the budgetary requirements for an ML model
project.

Two Types of Cross-Validation

Cross validation in machine learning is a crucial technique for evaluating the performance of predictive
models. It involves dividing the available data into multiple subsets, or folds, to train and test the model
iteratively. Non-exhaustive methods, such as k-fold cross-validation, randomly partition the data into k
subsets and train the model on k-1 folds while evaluating it on the remaining fold. On the other hand,
exhaustive methods, like leave-one-out cross-validation, systematically leave out one data point at a
time for testing while training the model on the remaining data points. These methods provide a
comprehensive assessment of the model's performance and help in addressing overfitting or underfitting
issues effectively.

The five key types of CV in ML are:

1. Holdout Method
2. K-Fold CV
3. Stratified K-Fold CV
4. Leave-P-Out CV
5. Leave-One-Out CV

Holdout Method

The holdout method is a basic CV approach in which the original dataset is divided into two discrete
segments:

1. Training Data - As a reminder this set is used to fit and train the model.
2. Test Data - This set is used to evaluate the model.

The Hold-out method splits the dataset into two portions

As a non-exhaustive method, the hold-out method trains the ML model on the training dataset and
evaluates it using the testing dataset.

In the majority of cases, the size of the training dataset is typically much larger than the test dataset.
Therefore, a standard holdout method split ratio is 70:30 or 80:20. Furthermore, the overall dataset is
randomly rearranged before dividing it into the training and test set portions using the predetermined
ratio.
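
A minimal sketch of the shuffle-then-split procedure, using only the Python standard library. The 80:20 ratio follows the text; the seed, the toy data and the helper name holdout_split are assumptions made for the example.

```python
import random

def holdout_split(data, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data, then cut it into training and test portions."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)       # random rearrangement before splitting
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

samples = list(range(100))                      # stand-in for 100 observations
train, test = holdout_split(samples, test_ratio=0.2)
print(len(train), len(test))                    # 80 20
```

Every observation lands in exactly one of the two portions, so the model is never evaluated on data it was trained on.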

There are several disadvantages to the holdout method that need to be considered. One drawback is that
as the model trains on distinct combinations of data points, it can sometimes yield inconsistent results,
which can introduce doubt into the validity of the model and the overall validation process.

Another concern is that there is no certainty that the training dataset selected fully represents the
complete dataset. If the original data sample is not large enough, there is a possibility that the test data
may contain information that the model will fail to recognize because it was not included in the original
training data portion.

However, despite these limitations, the Holdout CV method can be considered ideal in situations where
time is a scarce project resource and there is an urgency to train and test an ML model using a large
dataset.

K-Fold Cross-Validation

The k-fold cross-validation method is considered an improvement over the holdout method due to its
ability to provide additional consistency to the overall testing score of machine learning models. This
improvement is achieved by applying a specific procedure for selecting and dividing the training and
testing datasets.
To implement k-fold cross-validation, the original dataset is divided into k partitions. The
holdout method is then performed k times, each time using a different partition as the
testing set, while the remaining partitions are used for training. This repeated process gives a
more reliable and robust evaluation of the model's performance, because every data point is used for
both testing and training.
Let us look at an example: if the value of k is set to six, there will be six subsets, or folds, of equal
size. In the first iteration, the model trains on five of the folds and validates on the remaining one. In the
second iteration, the model re-trains on a different set of five folds and is tested on the fold that was left
out. And so on for six iterations in total, so that each fold serves as the test set exactly once.

Diagrammatically this is shown as follows:

The k-fold cross-validation randomly splits the original dataset into k number of folds

The test results of each iteration are then averaged out, which is called the CV accuracy. Finally, CV
accuracy is employed as a performance metric to compare the efficiencies of different ML
models. It is important to note that the choice of k is arbitrary; however, the k value is
commonly set to ten within the data science field. The k-fold cross-validation approach is widely
recognized for producing less biased estimates of model performance, because each data point
appears in the testing set exactly once and in the training set k-1 times. Moreover, the k-fold method
proves to be particularly advantageous for data science projects with a finite amount of data. It
maximizes the utilization of available data by reusing different subsets for training and testing.
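
The k-fold procedure and the averaged CV accuracy can be sketched with the standard library alone. The "model" below is a stand-in that always predicts the most common training label, chosen only to keep the example short; the toy labels, k = 6 and the helper names are likewise assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

def majority_class_accuracy(train_labels, test_labels):
    """Stand-in model: predict the most common training label for every test point."""
    majority = max(set(train_labels), key=train_labels.count)
    return sum(1 for label in test_labels if label == majority) / len(test_labels)

labels = [0] * 18 + [1] * 12                    # toy data set of 30 labels
fold_scores = []
for train_idx, test_idx in k_fold_indices(len(labels), k=6):
    train_labels = [labels[i] for i in train_idx]
    test_labels = [labels[i] for i in test_idx]
    fold_scores.append(majority_class_accuracy(train_labels, test_labels))

cv_accuracy = sum(fold_scores) / len(fold_scores)   # average over the six iterations
print(fold_scores, cv_accuracy)
```

Individual fold scores vary with how the shuffle distributes the two classes, while their average is the single figure used to compare models.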

Jake VanderPlas gives the process of model validation in four simple and clear steps. There is also a whole
process needed before we even get to his first step, such as fetching all the information we need from the data
to make a good judgement when choosing a class of model, and providing finishing touches to confirm the
results afterwards. I will get into depth about these steps and break them down further.
 Data cleansing and wrangling.

 Split the data into training and test data sets.

 Define the metrics for which model is getting optimized.

 Get quick initial metrics estimate.

 Feature engineering to optimize the metrics. (Skip this during first pass).

 Data pre-processing.

 Feature selection.

 Model selection.

 Model validation.

 Interpret the results.

 Get the best model and check it against test data set.

Domain knowledge on the problem in hand will be of great use for feature engineering. This is a bigger topic
in itself and requires extensive investment of time and resources.

Data pre-processing.

Data pre-processing converts features into a format that is more suitable for the estimators. In general,
machine learning models prefer standardization of the data set. I will make use of RobustScaler for our
example.
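
To show what this kind of scaling does, here is a plain NumPy sketch of the idea behind robust scaling: center each feature on its median and divide by its interquartile range. As far as I know this matches the default behaviour of scikit-learn's RobustScaler, but the function below is a hand-written illustration, not that library's code, and the data is invented.

```python
import numpy as np

def robust_scale(X):
    """Center each column on its median and divide by its interquartile range."""
    X = np.asarray(X, dtype=float)
    median = np.median(X, axis=0)
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    iqr[iqr == 0] = 1.0              # avoid dividing by zero for constant columns
    return (X - median) / iqr

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0],
              [100.0, 50.0]])        # 100.0 is an outlier in the first column

X_scaled = robust_scale(X)
print(X_scaled)
```

Because the median and the quartiles barely move when a single extreme value is present, the outlier does not distort the scaling of the other rows the way a mean and standard deviation would.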

Feature selection.

Feature selection or dimensionality reduction on data sets helps either:

 to improve models’ accuracy scores, or
 to boost their performance on very high-dimensional data sets.

WHAT ARE ENSEMBLE MODELS?

Ensemble models are a machine learning approach to combine multiple other models in the
prediction process. These models are referred to as base estimators. Ensemble models offer a
solution to overcome the technical challenges of building a single estimator.
The technical challenges of building a single estimator include:

 High variance: The model is very sensitive to the provided inputs for the learned features.
 Low accuracy: One model (or one algorithm) to fit the entire training data might not provide
you with the nuance your project requires.
 Feature noise and bias: The model relies heavily on too few features while making a
prediction.

Ensemble Algorithm

A single algorithm may not make the perfect prediction for a given data set. Machine learning
algorithms have their limitations and producing a model with high accuracy is challenging. If we
build and combine multiple models, we have the chance to boost the overall accuracy. We then
implement the combination of models by aggregating the output from each model with two
objectives:

1. Reducing the model error

2. Maintaining the model’s generalization

TYPES OF ENSEMBLE MODELING TECHNIQUES

1. Bagging
2. Boosting
3. Stacking
4. Blending

BAGGING

The idea of bagging is to train several models independently, each on a slightly different subset of the
training data set drawn at random with replacement, and then to combine their predictions. Bagging
reduces variance and minimizes overfitting. One example of such a
technique is the random forest algorithm.

Bootstrap Aggregation (Bagging)

This technique is based on a bootstrapping sampling technique. Bootstrapping creates multiple sets
of the original training data with replacement. Replacement enables the duplication of sample
instances in a set. Each subset has the same size and can be used to train models in parallel.
The method involves:
 Creating multiple subsets from the original dataset with replacement,
 Building a base model for each of the subsets,
 Running all the models in parallel,
 Combining predictions from all models to obtain final predictions.
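
The four steps above can be sketched with the standard library. The base model here is a toy "nearest class mean" classifier on one feature, chosen only to keep the example short; the data, the number of models (25) and all helper names are assumptions, and the models are trained one after another although they are independent and could run in parallel.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, so duplicates can occur."""
    return [rng.choice(data) for _ in data]

def fit_mean_classifier(sample):
    """Toy base model: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_one(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

def bagging_predict(models, x):
    """Combine the base models' predictions by majority vote."""
    votes = Counter(predict_one(model, x) for model in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
data = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.5, "a"),
        (6.0, "b"), (6.5, "b"), (7.0, "b"), (7.5, "b")]

# One bootstrap sample and one base model per ensemble member.
models = [fit_mean_classifier(bootstrap_sample(data, rng)) for _ in range(25)]
print(bagging_predict(models, 1.8), bagging_predict(models, 6.9))
```

An occasional bootstrap sample may miss one class entirely and produce a poor base model; the vote across all models is what keeps the combined prediction stable.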

Boosting
Boosting is a machine learning ensemble technique that reduces bias and variance by converting weak learners
into strong learners. The weak learners are applied to the dataset in a sequential manner. The first step is building
an initial model and fitting it to the training set.
A second model that tries to fix the errors generated by the first model is then fitted. Here’s what the entire
process looks like:
 Create a subset from the original data,
 Build an initial model with this data,
 Run predictions on the whole data set,
 Calculate the error using the predictions and the actual values,
 Assign more weight to the incorrect predictions,
 Create another model that attempts to fix errors from the last model,
 Run predictions on the entire dataset with the new model,
 Create several models with each model aiming at correcting the errors generated by the previous one,
 Obtain the final model as a weighted mean of all the models.
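
One concrete way to realise these steps is an AdaBoost-style loop. The notes do not name a specific algorithm, so AdaBoost, the one-feature threshold "stumps" used as weak learners, the toy data and the helper names are all assumptions of this sketch. Instead of drawing subsets, it keeps a weight per data point and increases the weight of points the last model got wrong.

```python
import math

def stump_predict(threshold, sign, x):
    """Weak learner: predict `sign` on one side of the threshold, `-sign` on the other."""
    return sign if x <= threshold else -sign

def fit_stump(xs, ys, weights):
    """Pick the threshold and sign with the lowest weighted error (labels are +1/-1)."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(threshold, sign, x) != y)
            if best is None or error < best[0]:
                best = (error, threshold, sign)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n                              # start with equal weights
    ensemble = []
    for _ in range(rounds):
        error, threshold, sign = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)        # keep the logarithm finite
        alpha = 0.5 * math.log((1 - error) / error)      # weight of this model
        ensemble.append((alpha, threshold, sign))
        # Misclassified points get more weight, correctly classified ones less.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, sign, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    """Final model: sign of the weighted sum of the weak learners' votes."""
    score = sum(alpha * stump_predict(t, s, x) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, 1]    # no single threshold separates these labels

single_error, _, _ = fit_stump(xs, ys, [0.1] * 10)
ensemble = adaboost(xs, ys, rounds=3)
predictions = [ensemble_predict(ensemble, x) for x in xs]
print(single_error > 0, predictions == ys)
```

On this toy data a single stump always misclassifies some points, while the weighted combination of three stumps fits all ten, which is the "weak learners into a strong learner" idea in miniature.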

What is Random forest

A Random Forest is like a group decision-making team in machine learning. It combines the
opinions of many “trees” (individual models) to make better predictions, creating a more robust
and accurate overall model.

What is Random Forest Algorithm?

The Random Forest algorithm’s widespread popularity stems from its user-friendly nature and
adaptability, enabling it to tackle both classification and regression problems effectively. The
algorithm’s strength lies in its ability to handle complex datasets and mitigate overfitting, making
it a valuable tool for various predictive tasks in machine learning.

One of the most important features of the Random Forest algorithm is that it can handle data
sets containing continuous variables, as in the case of regression, and categorical variables, as in
the case of classification. It performs well on both classification and regression tasks. In this tutorial,
we will understand the working of random forest and implement random forest on a
classification task.

As mentioned earlier, Random forest works on the Bagging principle. Now let’s dive in and
understand bagging in detail.

Steps Involved in Random Forest Algorithm

 Step 1: In the Random forest model, a subset of data points and a subset of features is selected
for constructing each decision tree. Simply put, n random records and m features are taken from
the data set having k number of records.
 Step 2: Individual decision trees are constructed for each sample.
 Step 3: Each decision tree will generate an output.
 Step 4: Final output is considered based on Majority Voting or Averaging for Classification and
regression, respectively. 
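The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production implementation: each "tree" is a one-split decision stump, and the names `train_stump`, `train_forest`, `predict`, the fruit measurements (weight in grams, length in centimetres) and the parameter defaults are all assumptions made for this example.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature_ids):
    """Fit a one-split decision tree (a stump) on the given feature columns."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda row: left_label if row[f] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1: random records (drawn with replacement) and a random feature subset.
        idx = [rng.randrange(len(rows)) for _ in rows]
        feats = rng.sample(range(len(rows[0])), n_features)
        # Step 2: build one tree per sample.
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    # Steps 3 and 4: every tree produces an output and the majority class wins.
    return Counter(tree(row) for tree in forest).most_common(1)[0][0]

# Hypothetical fruit basket: (weight in g, length in cm).
fruits = [(150, 7.0), (170, 8.0), (160, 7.5), (140, 7.0),
          (120, 18.0), (130, 20.0), (110, 17.0), (125, 19.0)]
names = ["apple"] * 4 + ["banana"] * 4
forest = train_forest(fruits, names)
print(predict(forest, (180, 6.0)), predict(forest, (105, 21.0)))
```

For a regression task, the majority vote in `predict` would be replaced by the average of the tree outputs.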

For example
Consider the fruit basket as the data, as shown in the figure below. n samples are taken from the
fruit basket, and an individual decision tree is constructed for each sample. Each decision tree
generates an output, as shown in the figure. The final output is decided by majority voting. In the
figure below, most of the decision trees output apple rather than banana, so the final output is
taken as apple.


Important Features of Random Forest

 Diversity: Not all attributes/variables/features are considered while making an individual tree;
each tree is different.
 Less affected by the curse of dimensionality: Since each tree does not consider all the features,
the feature space seen by each tree is reduced.
 Parallelization: Each tree is created independently out of different data and attributes. This
means all available CPU cores can be used to build random forests.
 Train-Test split: In a random forest, we don't strictly have to set aside separate test data, because
each tree is built on a bootstrap sample that leaves out roughly one third of the records. These
unseen (out-of-bag) records can be used to estimate the test error.
 Stability: Stability arises because the result is based on majority voting/averaging.
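The share of records that a single tree never sees can be checked with a short calculation. The sketch below assumes each tree draws a bootstrap sample with replacement that is as large as the data set itself; under that assumption the unseen share approaches 1/e, a little over one third. The function name `unseen_fraction` is made up for this example.

```python
import math

def unseen_fraction(n_records):
    """Chance that one given record is never picked in n_records draws with replacement."""
    return (1 - 1 / n_records) ** n_records

for n in (10, 100, 10_000):
    print(n, round(unseen_fraction(n), 3))
print("limit 1/e =", round(math.exp(-1), 3))
```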


UNIT – V
Unsupervised Learning : Clustering-K-means, K-Modes, K-Prototypes, Gaussian Mixture
Models, Expectation-Maximization.
Reinforcement Learning: Exploration and exploitation trade-offs, non-associative learning, Markov decision
processes, Q-learning.

Unsupervised Machine Learning:

Introduction to clustering

As the name suggests, unsupervised learning is a machine learning technique in which models are
not supervised using a labeled training dataset. Instead, the models themselves find the hidden
patterns and insights in the given data. It can be compared to the learning that takes place in the
human brain while learning new things. It can be defined as:

“Unsupervised learning is a type of machine learning in which models are trained using an unlabeled
dataset and are allowed to act on that data without any supervision.”

Unsupervised learning cannot be directly applied to a regression or classification problem because,
unlike supervised learning, we have the input data but no corresponding output data. The goal of
unsupervised learning is to find the underlying structure of the dataset, group the data according
to similarities, and represent the dataset in a compressed format.

Example: Suppose the unsupervised learning algorithm is given an input dataset containing images of
different types of cats and dogs. The algorithm is never given labels for this dataset, which means it
has no prior idea about the features of the dataset. The task of the unsupervised learning algorithm
is to identify the image features on its own. It performs this task by clustering the image dataset
into groups according to the similarities between images.

Why use Unsupervised Learning?

Below are some main reasons which describe the importance of Unsupervised Learning:

o Unsupervised learning is helpful for finding useful insights from the data.

o Unsupervised learning is similar to how a human learns to think from their own experiences,
which makes it closer to real AI.

o Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more
important.

o In the real world, we do not always have input data with the corresponding output, so to solve
such cases we need unsupervised learning.

Working of Unsupervised Learning

Here, we take unlabeled input data, which means it is not categorized and the corresponding outputs
are not given. This unlabeled input data is fed to the machine learning model in order to train it.
First, the model interprets the raw data to find the hidden patterns in it, and then a suitable
algorithm such as k-means clustering or hierarchical clustering is applied.

Once the suitable algorithm is applied, it divides the data objects into groups according to the
similarities and differences between the objects.

Types of Unsupervised Learning Algorithm:
The unsupervised learning algorithm can be further categorized into two types of problems:

o Clustering: Clustering is a method of grouping objects into clusters such that the objects with the
most similarities remain in one group and have few or no similarities with the objects of another
group. Cluster analysis finds the commonalities between the data objects and categorizes them
according to the presence and absence of those commonalities.

o Association: An association rule is an unsupervised learning method used for finding relationships
between variables in a large database. It determines the sets of items that occur together in the
dataset. Association rules make marketing strategy more effective: for example, people who buy
item X (say, bread) also tend to purchase item Y (butter/jam). A typical example of association rule
learning is Market Basket Analysis.

Unsupervised Learning algorithms:

Below is the list of some popular unsupervised learning algorithms:

o K-means clustering

o KNN (k-nearest neighbors)

o Hierarchical clustering

o Anomaly detection

o Neural Networks

o Principal Component Analysis

o Independent Component Analysis

o Apriori algorithm

Advantages of Unsupervised Learning

o Unsupervised learning is used for more complex tasks as compared to supervised learning because,
in unsupervised learning, we don't have labeled input data.

o Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled
data.

Disadvantages of Unsupervised Learning

o Unsupervised learning is intrinsically more difficult than supervised learning as it does not have
corresponding output.

o The result of the unsupervised learning algorithm might be less accurate, as the input data is not
labeled and the algorithms do not know the exact output in advance.

k-means clustering algorithm

One of the most widely used clustering algorithms is k-means. It groups the data into k clusters
according to the existing similarities among them, where k is given as input to the algorithm. I'll
start with a simple example.

Let's imagine we have 5 objects (say 5 people) and for each of them we know two features
(height and weight). We want to group them into k=2 clusters.

Our dataset will look like this:

How to apply k-means?

As you probably already know, I'm using Python libraries to analyze my data. The k-means
algorithm is implemented in the scikit-learn package as the KMeans class of the sklearn.cluster
module, so using it only requires importing that class in your script.

What if our data is… non-numerical?

At this point, you may have noticed something. The basic concept of k-means rests on
mathematical calculations (means, Euclidean distances). But what if our data is non-numerical or, in
other words, categorical? Imagine, for instance, having the ID code and date of birth of the five
people of the previous example, instead of their heights and weights.

We could think of transforming our categorical values into numerical values and then applying
k-means. But beware: k-means uses numerical distances, so it could treat two really distant objects
as close merely because they have been assigned two close numbers.

k-modes is an extension of k-means. Instead of distances it uses dissimilarities (that is,
quantification of the total mismatches between two objects: the smaller this number, the more
similar the two objects). And instead of means, it uses modes. A mode is a vector of elements that
minimizes the dissimilarities between the vector itself and each object of the data. We will have as
many modes as the number of clusters we required, since they act as centroids.
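The two building blocks of k-modes, the mismatch count and the mode, can be sketched as follows. This is a minimal illustration rather than a full clustering routine; the function names `dissimilarity` and `mode_of` and the three categorical records are assumptions made for the example. Taking the most frequent category in each attribute gives a vector with the smallest total mismatch count against the cluster's objects.

```python
from collections import Counter

def dissimilarity(a, b):
    """Number of attributes on which two categorical objects disagree."""
    return sum(x != y for x, y in zip(a, b))

def mode_of(cluster):
    """Most frequent category of each attribute across the cluster's objects."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*cluster))

# Hypothetical categorical records: (colour, pet, drink).
people = [("red", "cat", "tea"), ("red", "dog", "tea"), ("blue", "cat", "tea")]
center = mode_of(people)
print(center, [dissimilarity(p, center) for p in people])
```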

K-means implements the Expectation-Maximization strategy to solve the problem. The
Expectation-step is used to assign data points to the nearest cluster, and the Maximization-step is
used to compute the centroid of each cluster.
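The two alternating steps can be written out directly. The sketch below is a plain-Python illustration, not the scikit-learn implementation; the function name `kmeans`, the height and weight values, and the choice of the first two people as starting centroids are assumptions made for the example, and the data is left unscaled to keep the code short.

```python
import math

def kmeans(points, centroids, n_iter=20):
    for _ in range(n_iter):
        # Expectation step: assign each point to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Maximization step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:     # nothing moved, so the algorithm has converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical (height in cm, weight in kg) for five people.
people = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72)]
labels, centers = kmeans(people, centroids=[people[0], people[1]])
print(labels, centers)
```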

When using the K-means algorithm, we must keep the following points in mind:

 It is suggested to normalize the data when using clustering algorithms such as K-Means, since
such algorithms employ distance-based measurements to identify the similarity between data
points.

 Because of the iterative nature of K-Means and the random initialization of centroids, K-Means
may become stuck in a local optimum and fail to converge to the global optimum. As a result, it is
advised to run the algorithm with several different centroid initializations and keep the best result.
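One simple way to follow the normalization advice is min-max scaling, which maps every feature to the range 0 to 1 so that no single feature dominates the distance. The sketch below is only one possible choice of scaling; the function name `min_max_scale` and the sample values are assumptions made for the example.

```python
def min_max_scale(points):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(p, lows, highs))
            for p in points]

# Hypothetical (height in cm, weight in kg) records.
people = [(185, 72), (170, 56), (168, 60)]
scaled = min_max_scale(people)
print(scaled)
```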


k-Prototype

K-Means is one of the conventional clustering methods, and it is commonly and efficiently used for
large data. However, it is not suitable for data that contains categorical variables, because the cost
function in K-Means is calculated using the Euclidean distance, which is only suitable for numerical
data. K-Mode, in turn, is suitable only for categorical data, not for mixed data types.

To address these problems, Huang proposed an algorithm called K-Prototype, which was created to
handle clustering of mixed data types (numerical and categorical variables). K-Prototype is a
clustering method based on partitioning. It is an improvement on the K-Means and K-Mode
clustering algorithms that handles mixed data types.
The k-prototypes algorithm integrates the k-means and k-modes algorithms to deal with mixed
data types [7]. The k-prototypes algorithm is more useful in practice because data collected in the
real world are often mixed-type objects. Assume a set of n objects, $X = \{X_1, X_2, \dots, X_n\}$.
Each object $X_i = \{x_{i1}, x_{i2}, \dots, x_{im}\}$ consists of $m$ attributes ($m_r$ numerical
attributes and $m_c$ categorical attributes, $m = m_r + m_c$). The goal of clustering is to partition
the n objects into k disjoint clusters $C = \{C_1, C_2, \dots, C_k\}$, where $C_j$ is the j-th cluster
center. The distance $d(X_i, C_j)$ between $X_i$ and $C_j$ can be calculated as follows:

$$d(X_i, C_j) = d_r(X_i, C_j) + \gamma \, d_c(X_i, C_j) \qquad (1)$$

where $d_r(X_i, C_j)$ is the distance between numerical attributes, $d_c(X_i, C_j)$ is the distance
between categorical attributes, and $\gamma$ is a weight for categorical attributes.

$$d_r(X_i, C_j) = \sum_{l=1}^{p} \left| x_{il} - c_{jl} \right|^2 \qquad (2)$$

$$d_c(X_i, C_j) = \sum_{l=p+1}^{m} \delta(x_{il}, c_{jl}) \qquad (3)$$

$$\delta(x_{il}, c_{jl}) = \begin{cases} 0, & \text{when } x_{il} = c_{jl} \\ 1, & \text{when } x_{il} \neq c_{jl} \end{cases} \qquad (4)$$

In Equation (2), $d_r(X_i, C_j)$ is the squared Euclidean distance measure between cluster centers
and an object on the numerical attributes. $d_c(X_i, C_j)$ is the simple matching dissimilarity
measure on the categorical attributes, where $\delta(x_{il}, c_{jl}) = 0$ for $x_{il} = c_{jl}$ and
$\delta(x_{il}, c_{jl}) = 1$ for $x_{il} \neq c_{jl}$. $x_{il}$ and $c_{jl}$, $1 \le l \le p$, are values
of numerical attributes, whereas $x_{il}$ and $c_{jl}$, $p+1 \le l \le m$, are values of categorical
attributes for object i and the cluster center j. $p$ is the number of numerical attributes and
$m - p$ is the number of categorical attributes.
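The mixed distance of Equations (1) to (4) can be sketched in a few lines. This is only an illustration of the formula; the function name `mixed_distance`, the sample object, the sample center and the weight value are assumptions made for the example. The first p entries of each tuple are treated as numerical attributes and the rest as categorical.

```python
def mixed_distance(x, center, p, gamma):
    """d = d_r + gamma * d_c, with the first p attributes numerical."""
    # Equation (2): squared Euclidean distance on the numerical attributes.
    d_r = sum((a - b) ** 2 for a, b in zip(x[:p], center[:p]))
    # Equations (3) and (4): number of mismatching categorical attributes.
    d_c = sum(a != b for a, b in zip(x[p:], center[p:]))
    return d_r + gamma * d_c

obj = (1.0, 2.0, "red", "small")     # hypothetical object X_i
proto = (0.0, 0.0, "red", "large")   # hypothetical cluster center C_j
print(mixed_distance(obj, proto, p=2, gamma=0.5))
```

Here the numerical part contributes 1 + 4 = 5 and the single categorical mismatch contributes 0.5 × 1, so a larger `gamma` gives categorical mismatches more influence on the cluster assignment.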


Reinforcement learning

Reinforcement learning addresses the question of how an autonomous agent that senses and acts in its
environment can learn to choose optimal actions to achieve its goals.

Introduction

 Consider building a learning robot. The robot, or agent, has a set of sensors to observe the state
of its environment, and a set of actions it can perform to alter this state.
 Its task is to learn a control strategy, or policy, for choosing actions that achieve its goals.
 The goals of the agent can be defined by a reward function that assigns a numerical value to each
distinct action the agent may take from each distinct state.
 This reward function may be built into the robot, or known only to an external teacher who
provides the reward value for each action performed by the robot.
 The task of the robot is to perform sequences of actions, observe their consequences, and learn a
control policy.
 The control policy is one that, from any initial state, chooses actions that maximize the reward
accumulated over time by the agent.
Example:
 A mobile robot may have sensors such as a camera and sonars, and actions such as "move
forward" and "turn."
 The robot may have a goal of docking onto its battery charger whenever its battery level is low.
 The goal of docking to the battery charger can be captured by assigning a positive reward
(e.g., +100) to state-action transitions that immediately result in a connection to the charger and a
reward of zero to every other state-action transition.

Reinforcement Learning Problem

 An agent interacts with its environment. The agent exists in an environment described by some
set of possible states S.
 The agent can perform any of a set of possible actions A. Each time it performs an action a_t in
some state s_t, the agent receives a real-valued reward r_t that indicates the immediate value of this
state-action transition. This produces a sequence of states s_i, actions a_i, and immediate rewards
r_i, as shown in the figure.
 The agent's task is to learn a control policy, π: S → A, that maximizes the expected sum of these
rewards, with future rewards discounted exponentially by their delay.


Reinforcement learning problem characteristics

1. Delayed reward: The task of the agent is to learn a target function π that maps from the current
state s to the optimal action a = π(s). In reinforcement learning, training information is not
available in the form (s, π(s)). Instead, the trainer provides only a sequence of immediate reward
values as the agent executes its sequence of actions. The agent, therefore, faces the problem of
temporal credit assignment: determining which of the actions in its sequence are to be credited
with producing the eventual rewards.

2. Exploration: In reinforcement learning, the agent influences the distribution of training examples
by the action sequence it chooses. This raises the question of which experimentation strategy
produces the most effective learning. The learner faces a trade-off in choosing whether to favor
exploration of unknown states and actions, or exploitation of states and actions that it has already
learned will yield high reward.

3. Partially observable states: Although it is convenient to assume that the agent's sensors can
perceive the entire state of the environment at each time step, in many practical situations sensors
provide only partial information. In such cases, the agent needs to consider its previous observations
together with its current sensor data when choosing actions, and the best policy may be one that
chooses actions specifically to improve the observability of the environment.

4. Life-long learning: A robot is often required to learn several related tasks within the same
environment, using the same sensors. For example, a mobile robot may need to learn how to dock
on its battery charger, how to navigate through narrow corridors, and how to pick up output from
laser printers. This setting raises the possibility of using previously obtained experience or
knowledge to reduce sample complexity when learning new tasks.
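The exploration and exploitation trade-off described in item 2 is often handled with an epsilon-greedy rule, which is not named in these notes and is shown here only as one common option. With probability epsilon the agent tries a random action (exploration); otherwise it takes the action with the highest current value estimate (exploitation). The function name `epsilon_greedy`, the action names and the value estimates are assumptions made for the example.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick an action from a dict {action: estimated value}."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))    # explore: any action, chosen at random
    return max(q_values, key=q_values.get)     # exploit: best-known action

rng = random.Random(0)
estimates = {"left": 0.0, "right": 90.0, "up": 72.9}  # hypothetical value estimates
picks = [epsilon_greedy(estimates, 0.1, rng) for _ in range(1000)]
print(picks.count("right") / len(picks))
```

A small epsilon mostly exploits what is already known, while a larger epsilon spends more time trying actions whose value is still uncertain.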

Learning Task

Consider a Markov decision process (MDP) where the agent can perceive a set S of distinct states of
its environment and has a set A of actions that it can perform.

 At each discrete time step t, the agent senses the current state s_t, chooses a current action
a_t, and performs it.
 The environment responds by giving the agent a reward r_t = r(s_t, a_t) and by producing the
succeeding state s_{t+1} = δ(s_t, a_t). Here the functions δ(s_t, a_t) and r(s_t, a_t) depend only on
the current state and action, and not on earlier states or actions.

The task of the agent is to learn a policy, π: S → A, for selecting its next action a_t based on
the current observed state s_t; that is, π(s_t) = a_t.

How shall we specify precisely which policy π we would like the agent to learn?

1. One approach is to require the policy that produces the greatest possible cumulative reward for
the robot over time.

To state this requirement more precisely, define the cumulative value V^π(s_t) achieved by
following an arbitrary policy π from an arbitrary initial state s_t as follows:

$$V^{\pi}(s_t) \equiv r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots \equiv \sum_{i=0}^{\infty} \gamma^i r_{t+i}$$

where the sequence of rewards r_{t+i} is generated by beginning at state s_t and by repeatedly
using the policy π to select actions.
Here 0 ≤ γ < 1 is a constant that determines the relative value of delayed versus immediate
rewards. If we set γ = 0, only the immediate reward is considered. As we set γ closer to 1, future
rewards are given greater emphasis relative to the immediate reward.
The quantity V^π(s_t) is called the discounted cumulative reward achieved by policy π from initial
state s_t. It is reasonable to discount future rewards relative to immediate rewards because, in
many cases, we prefer to obtain the reward sooner rather than later.

2. Another definition of total reward is the finite horizon reward,

$$\sum_{i=0}^{h} r_{t+i}$$

which considers the undiscounted sum of rewards over a finite number h of steps.

3. Another approach is the average reward,

$$\lim_{h \to \infty} \frac{1}{h} \sum_{i=0}^{h} r_{t+i}$$

which considers the average reward per time step over the entire lifetime of the agent.

We require that the agent learn a policy π that maximizes V^π(s) for all states s. Such a policy is
called an optimal policy and is denoted by π*:

$$\pi^* \equiv \arg\max_{\pi} V^{\pi}(s), \quad (\forall s)$$

We refer to the value function V^{π*}(s) of an optimal policy as V*(s). V*(s) gives the maximum
discounted cumulative reward that the agent can obtain starting from state s.

Example:

A simple grid-world environment is depicted in the diagram.

The six grid squares in this diagram represent six possible states, or locations, for the agent.

Each arrow in the diagram represents a possible action the agent can take to move from one state
to another.

The number associated with each arrow represents the immediate reward r(s, a) the agent receives
if it executes the corresponding state-action transition.
The immediate reward in this environment is defined to be zero for all state-action transitions
except for those leading into the state labelled G. The state G is the goal state, and the agent can
receive reward only by entering this state.

Once the states, actions, and immediate rewards are defined, we choose a value for the discount
factor γ and determine the optimal policy π* and its value function V*(s).

Let's choose γ = 0.9. The diagram at the bottom of the figure shows one optimal policy for this
setting.

The values of V*(s) and Q(s, a) follow from r(s, a) and the discount factor γ = 0.9. An optimal
policy, corresponding to actions with maximal Q values, is also shown.

The discounted future reward from the bottom centre state is

0 + γ·100 + γ²·0 + γ³·0 + ... = 90
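The discounted sum for the bottom centre state can be reproduced with a small helper. This is a minimal sketch; the function name `discounted_return` and the choice to truncate the reward sequence after four steps (all later rewards being zero) are assumptions made for the example.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**i * r_i over a sequence of rewards, with i starting at 0."""
    return sum((gamma ** i) * r for i, r in enumerate(rewards))

# Rewards seen from the bottom centre state: 0 now, 100 one step later, then 0.
print(discounted_return([0, 100, 0, 0], gamma=0.9))
```

Setting `gamma=0` keeps only the first reward, which matches the statement that only the immediate reward is considered in that case.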

Non-Associative Learning:

Below is the dictionary definition of non-associative learning:

As applied to animal behavior, non-associative learning refers to instances where behavior toward a
stimulus changes in the absence of any apparent associated stimulus or event (such as a reward or
punishment).

In non-associative learning, the person is being trained on how to respond to a certain situation.
There is a right and a wrong answer.

Supervised learning algorithms use non-associative learning. These algorithms learn from the
training data. Primarily, they are taught based on the assumption there is a right or wrong answer.

The cost function, or loss, associated with the algorithm, is a similar concept to ‘punishment.’

In non-associative machine learning, you use the training data set to teach the machine learning
algorithm how to predict on the data set.

This is instead of letting the algorithm learn for itself on what the outcome should be.

In other words, it represents the process of supervised machine learning.

1. REGRESSION ANALYSIS

The classic example of supervised ML using regression is the prediction of house prices.

The training data contains, for example, the number of rooms a house has (input) and the price of
the house (output).

This training data teaches the machine how the number of rooms and the price are related, allowing
it to predict the output (the price of a house) from the input (the number of rooms).

2. CLASSIFICATION ANALYSIS

If we move onto classification analysis, we begin to use machine learning to determine which group
an object belongs to. One of the classic examples is whether or not a tumor is malignant or benign.
Or you could use it to say yes or no if someone is likely to pass an exam.

Another example is, will this person develop diabetes? Yes or No.

In classification analysis, the labeled training data set will have a sample set of people and their
characteristics alongside whether or not they developed diabetes.

This training data is there to teach the machine how different characteristics of a person’s genetics
or lifestyle contribute to whether or not they would get diabetes.

Q LEARNING

How can an agent learn an optimal policy π* for an arbitrary environment?

The training information available to the learner is the sequence of immediate rewards r(s_i, a_i)
for i = 0, 1, 2, . . . .
Given this kind of training information, it is easier to learn a numerical evaluation function
defined over states and actions, and then implement the optimal policy in terms of this evaluation
function.
What evaluation function should the agent attempt to learn?

One obvious choice is V*. The agent should prefer state s_1 over state s_2 whenever
V*(s_1) > V*(s_2), because the cumulative future reward will be greater from s_1.
The optimal action in state s is the action a that maximizes the sum of the immediate reward
r(s, a) plus the value V* of the immediate successor state, discounted by γ:

$$\pi^*(s) = \arg\max_{a} \left[ r(s, a) + \gamma V^*(\delta(s, a)) \right] \qquad (3)$$

The Q Function
The value of the evaluation function Q(s, a) is the reward received immediately upon executing
action a from state s, plus the value (discounted by γ) of following the optimal policy thereafter:

$$Q(s, a) \equiv r(s, a) + \gamma V^*(\delta(s, a)) \qquad (4)$$

Rewrite Equation (3) in terms of Q(s, a) as

$$\pi^*(s) = \arg\max_{a} Q(s, a) \qquad (5)$$

As Equation (5) makes clear, the agent need only consider each available action a in its current
state s and choose the action that maximizes Q(s, a).
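The relation between Q, V* and the greedy choice of action can be checked on a tiny example. The sketch below uses a made-up three-state chain (s0, s1 and a goal state G that pays 100 on entry) rather than the six-state grid from the figure; the dictionaries `delta`, `reward` and `v_star` and the function names are assumptions made for the example.

```python
GAMMA = 0.9

# Hypothetical deterministic chain s0 - s1 - G; entering G pays 100.
delta = {("s0", "right"): "s1", ("s1", "left"): "s0", ("s1", "right"): "G"}
reward = {("s0", "right"): 0, ("s1", "left"): 0, ("s1", "right"): 100}
v_star = {"s0": 90.0, "s1": 100.0, "G": 0.0}   # optimal values for this chain

def q(s, a):
    """Immediate reward plus the discounted optimal value of the successor state."""
    return reward[(s, a)] + GAMMA * v_star[delta[(s, a)]]

def best_action(s):
    """Greedy policy: the available action with the largest Q value."""
    actions = [a for (state, a) in delta if state == s]
    return max(actions, key=lambda a: q(s, a))

print(q("s1", "left"), q("s1", "right"), best_action("s1"))
```

Note that `best_action` only compares Q values; it never needs to look up `reward` or `delta` itself, which is why learning Q is enough to act optimally.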

An Algorithm for Learning Q

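Learning the Q function corresponds to learning the optimal policy.

The key problem is finding a reliable way to estimate training values for Q, given only a sequence
of immediate rewards r spread out over time. This can be accomplished through iterative
approximation.

Since V*(s) = max_{a'} Q(s, a'), the definition of Q can be rewritten as

$$Q(s, a) = r(s, a) + \gamma \max_{a'} Q(\delta(s, a), a')$$

Q learning algorithm:

For each s, a initialize the table entry Q̂(s, a) to zero.
Observe the current state s.
Do forever:
 Select an action a and execute it.
 Receive the immediate reward r.
 Observe the new state s'.
 Update the table entry for Q̂(s, a): Q̂(s, a) ← r + γ max_{a'} Q̂(s', a')
 s ← s'

Q learning algorithm, assuming deterministic rewards and actions. The discount factor γ may be any
constant such that 0 ≤ γ < 1. Q̂ refers to the learner's estimate, or hypothesis, of the actual Q
function.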
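The table-based procedure can be sketched in Python for a deterministic environment. This is an illustration under stated assumptions, not the grid world from the figure: the environment is a made-up four-state corridor (s0, s1, s2 and a goal G that pays 100 on entry), actions are picked uniformly at random so that every state-action pair keeps being visited, and the names `step`, `q_learning` and `q_hat` are hypothetical.

```python
import random

GAMMA = 0.9
STATES = ["s0", "s1", "s2", "G"]   # hypothetical corridor; G is the goal state
ACTIONS = ["left", "right"]

def step(state, action):
    """Deterministic environment: return (next_state, immediate_reward)."""
    i = STATES.index(state)
    j = min(i + 1, 3) if action == "right" else max(i - 1, 0)
    return STATES[j], (100 if STATES[j] == "G" else 0)

def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    # Initialise every table entry to zero.
    q_hat = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES[:-1])        # start in a random non-goal state
        while s != "G":                    # an episode ends on entering G
            a = rng.choice(ACTIONS)        # random action selection (exploration)
            s_next, r = step(s, a)
            # Training rule: Q_hat(s, a) <- r + gamma * max over a' of Q_hat(s', a')
            q_hat[(s, a)] = r + GAMMA * max(q_hat[(s_next, b)] for b in ACTIONS)
            s = s_next
    return q_hat

q_hat = q_learning()
for s in STATES[:-1]:
    print(s, {a: round(q_hat[(s, a)], 1) for a in ACTIONS})
```

After enough episodes, the action with the largest table entry in each state is the one that moves toward G, so acting greedily on the learned table recovers the optimal policy for this corridor.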

An Illustrative Example

To illustrate the operation of the Q learning algorithm, consider a single action taken by an
agent, and the corresponding refinement to Q̂ shown in the figure below.

The agent moves one cell to the right in its grid world and receives an immediate reward of zero
for this transition.

It then applies the training rule

$$\hat{Q}(s, a) \leftarrow r + \gamma \max_{a'} \hat{Q}(s', a')$$

to refine its estimate Q̂ for the state-action transition it just executed.

According to the training rule, the new Q̂ estimate for this transition is the sum of the received
reward (zero) and the highest Q̂ value associated with the resulting state (100), discounted by
γ (0.9), that is, 0 + 0.9 × 100 = 90.

Convergence
Will the Q learning algorithm converge toward a Q̂ equal to the true Q function?
Yes, under certain conditions.
1. Assume the system is a deterministic MDP.
2. Assume the immediate reward values are bounded; that is, there exists some positive constant c
such that for all states s and actions a, |r(s, a)| < c.
3. Assume the agent selects actions in such a fashion that it visits every possible state-action
pair infinitely often.


Here are four machine learning trends that could become a reality in the near future:

1) Intelligence on the Cloud

Algorithms can help companies unearth insights about their business, but this proposition can be
expensive with no guarantee of a bottom-line increase. Companies often have to collect data, hire
data scientists and train them to deal with changing databases. Now that more data metrics are
becoming available, the cost of storing them is dropping thanks to the cloud. There will no longer
be a need to manage infrastructure, as cloud systems can generate new models as the scale of an
operation increases, while also delivering more accurate results. More open-source ML frameworks
are coming into the fold, offering pre-trained platforms that can tag images, recommend products
and perform natural language processing tasks.

2) Quantum Computing Capabilities

One of the tasks that ML can help companies deal with is the manipulation and classification of
large quantities of vectors in high-dimensional spaces. Current algorithms take a large chunk of time
to solve these problems, costing companies more to complete their business processes. Quantum
computers are expected to attract growing interest because they may be able to manipulate
high-dimensional vectors in a fraction of the time. This would increase the number of vectors and
dimensions that can be processed in a given period of time compared with traditional algorithms.

3) Improved Personalization

Retailers are already making waves in developing recommendation engines that reach their target
audience more accurately. Taking this a step further, ML will be able to improve the personalization
techniques of these engines in more precise ways. The technology will offer more specific data that
they can then use on ads to improve the shopping experience for consumers.

4) Data on Data

As the amount of data available increases, the cost of storing this data decreases at roughly the
same rate. ML has great potential for generating data of the highest quality that will lead to better
models, an improved user experience and more data that helps repeat and improve upon this cycle.
Companies such as Tesla add a million miles of driving data every hour to enhance their self-driving
capabilities. Tesla's Autopilot feature learns from this data and improves the software that propels
these self-driving vehicles forward as the company gathers more data on the possible pitfalls of
autonomous driving technology.

No ratings yet
Agenda: What Is OS?
5 pages
There Is No Such Thing As A Morale or An Immoral Book
No ratings yet
There Is No Such Thing As A Morale or An Immoral Book
3 pages
Calming The Storm in Matthew
No ratings yet
Calming The Storm in Matthew
2 pages
Paradigms Vs Processes
No ratings yet
Paradigms Vs Processes
2 pages
BCD To Excess 3
No ratings yet
BCD To Excess 3
3 pages
Sakib's Resume
No ratings yet
Sakib's Resume
1 page
Prabhakar Mishra Resume
No ratings yet
Prabhakar Mishra Resume
1 page