Uploaded by Jerry Yue

Simulation Notes

Machine Learning
Introduction to ML
Machine learning is a data-driven approach in which computers learn from existing data to make
predictions, often for optimisation purposes.

There are 3 types of machine learning:

- Supervised
- Unsupervised
- Reinforcement

In supervised learning, labelled datasets are provided to algorithms; each dataset is often split into
two subsets for training and testing, respectively (usually an 80/20 or 70/30 split). In each subset,
variables are typically labelled as the target variable and the predictor variables. The target variable
is the one whose data points are to be modelled using the predictor variables. An example would be bank
loan repayment prediction: predictor variables such as initial payment, last payment and credit score
are used to predict the bank's decision as the target variable.
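The supervised setup above can be sketched as follows; the synthetic data, the repayment rule and the exact 80/20 split are illustrative assumptions, not taken from a real loan dataset:

```python
import numpy as np

# Toy labelled dataset: rows are loan applications, columns are the
# predictor variables (initial payment, last payment, credit score);
# y is the target variable (1 = loan approved/repaid, 0 = not).
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 1, size=(100, 3))
y = (X[:, 2] > 0.5).astype(int)  # synthetic rule: credit score drives the decision

# 80/20 train/test split, as mentioned above
n_train = int(0.8 * len(X))
indices = rng.permutation(len(X))
train_idx, test_idx = indices[:n_train], indices[n_train:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(X_train.shape, X_test.shape)  # (80, 3) (20, 3)
```

A model would then be fitted on the training subset only, and its accuracy reported on the held-out test subset.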

In unsupervised learning, datasets are unlabelled; the algorithm must make predictions by
studying the patterns of the existing data without any explicit instructions, so there is no
need to split datasets into training and testing subsets. An example would be weather forecasting
based on past daily temperature, humidity and weather patterns.
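As a minimal sketch of the unsupervised idea, a hand-rolled k-means loop can group made-up temperature/humidity readings with no labels at all; the two-cluster structure of the data is an assumption for illustration:

```python
import numpy as np

# Unlabelled daily weather readings: [temperature, humidity] (made-up values).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal([30, 40], 2, (50, 2)),   # hot, dry days
                  rng.normal([15, 80], 2, (50, 2))])  # cool, humid days

# A bare-bones k-means loop: no labels and no train/test split are needed;
# the algorithm groups the data purely by the patterns it finds.
k = 2
centroids = data[rng.choice(len(data), k, replace=False)]
for _ in range(10):
    # assign each point to its nearest centroid
    labels = np.argmin(np.linalg.norm(data[:, None] - centroids, axis=2), axis=1)
    # move each centroid to the mean of its assigned points (keep it if empty)
    centroids = np.array([data[labels == c].mean(axis=0) if np.any(labels == c)
                          else centroids[c] for c in range(k)])

print(np.round(centroids, 1))
```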

Reinforcement learning is similar to a human trial-and-error process: algorithms are trained to make
optimal decisions in a dynamic environment. Common examples include text prediction, traffic
control and teaching a computer to play games. The performance of a reinforcement learning algorithm is
often monitored by tracking its learning curve.
Artificial Neural Networks
Introduction to ANN
A typical artificial neural network (ANN) begins with an input layer and ends with an output layer
with multiple hidden layers in between.

Each layer contains a number of neurons; how many neurons each layer has depends on the type of
problem involved and how the output(s) should be deduced from the inputs. However, the
number of neurons generally decreases from the first to the last layer.

The process begins with assigning an activation value (often lying between 0 and 1) to each neuron in
the input layer. A set of weight values, which can be positive or negative, corresponds to each
neuron in the following layer.

The magnitudes of the weights reflect the strengths of the inter-neuron connections.
The number of weight values in each set must equal the number of neurons in the first
layer. The weighted sum over all neurons is calculated for each set of weight values, together with a
bias value (one bias for each neuron in the following layer).

The bias used in calculating a neuron's activation value indicates whether that neuron will
be active or inactive in the calculations of the activation values of the neurons in the following
layer (which also emphasises the goal to be achieved across the layers).

The weighted sums are used as inputs to an activation function (e.g. Sigmoid, which produces outputs
between 0 and 1, or ReLU, whose outputs are not bounded above by 1). The outputs then become the
activation values of the corresponding neurons in the following layer. The higher the activation value,
the more relevant the neuron is for determining the neurons of interest in the following layer.

The process repeats itself until the output layer is reached and a final decision is made to produce
the output, i.e. the neuron with the highest activation value in the output layer is selected. Calculations
of this kind can be visualised as matrix multiplication.
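The forward pass described above, visualised as matrix multiplication, might look like the following sketch; the layer sizes (4 inputs, 3 hidden, 2 outputs) and the random weights are arbitrary assumptions:

```python
import numpy as np

def sigmoid(z):
    # squashes the weighted sums into the (0, 1) activation range
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes chosen arbitrarily for illustration: 4 inputs -> 3 hidden -> 2 outputs
rng = np.random.default_rng(42)
sizes = [4, 3, 2]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

a = rng.uniform(0, 1, sizes[0])  # activation values of the input layer
for W, b in zip(weights, biases):
    # one layer of the forward pass: weighted sum plus bias, then activation
    a = sigmoid(W @ a + b)

prediction = int(np.argmax(a))  # the neuron with the highest activation wins
print(a, prediction)
```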

When assigning activation values to neurons in the input layer for initialisation, there should be a
criterion for each input; the magnitude of each neuron's activation value then depends on how well the
input fits that criterion.

However, the initial assignment of weight and bias values is usually totally random. As
expected, the quality of the output in the early stages is most likely unacceptable.

The accuracy of the output vector produced by each trial is measured by the cost of a single
training example. The cost is calculated as the sum of the squared differences between
the actual activation value of each output neuron and its corresponding desired activation value. The
higher the cost, the lower the accuracy.

NOTE: The desired activation value should be 1 only for the correct output and 0 for the rest.
As a result, appropriate neuron weights and biases are defined as the values that result in the lowest
cost for the output, meaning that the task becomes finding the minimum of the function that produces
the cost as its single output and takes all the weights and biases as its inputs.

The process of searching for local minima is known as gradient descent. The cost function uses the
average cost over all training data of interest to compute the gradient (a vector with the same number
of elements as the total number of weights and biases in the neural network).

The computation begins with an initial random set of values for all weights and biases.
The gradient is evaluated at these values, and the weights and biases are then nudged in the
direction opposite to the gradient to reduce the cost. The process is repeated
until a minimum value is reached.

The magnitudes of the gradient components represent how sensitive the cost function is to changes in each
weight and bias: the larger the magnitude, the higher the sensitivity.

Since the average cost is used in this approach, all training examples should benefit from the
adjustments and result in lower training costs.
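The gradient descent loop can be illustrated on a toy one-parameter cost function whose minimum is known exactly; the quadratic here is an assumption standing in for the real averaged cost:

```python
# Gradient descent on a toy cost function C(w) = (w - 3)^2 + 1.
# The real cost averages over all training examples; here a simple
# quadratic stands in so the minimum (w = 3, C = 1) is known exactly.
def cost(w):
    return (w - 3.0) ** 2 + 1.0

def gradient(w):
    return 2.0 * (w - 3.0)

w = 10.0            # stands in for the random initial value
learning_rate = 0.1
for _ in range(200):
    w -= learning_rate * gradient(w)  # step against the gradient

print(round(w, 4), round(cost(w), 4))
```

Each step moves `w` opposite to the gradient, so the cost shrinks toward its minimum value of 1.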

Backpropagation is the core learning algorithm for ANNs. The goal for any training
example is to maximise the activation value of the desired output neuron and minimise those of the
undesired output neurons. Since all output neurons are in the final layer of the network, the algorithm
calculates everything backwards from there.

There are 3 ways to increase the activation value of the desired output neuron:

- Increase the bias
- Increase the weights in proportion to the activation values from the previous layer
- Change the activation values from the previous layer in proportion to the weights

This process is repeated for all neurons in the output layer; opposite changes need to be applied to the
undesired output neurons to minimise their activation values.

The network propagates backwards from the output layer until the input layer is reached, so
that the desired changes for all weights and biases can be calculated. The average desired change for each
weight and bias is computed over all training data; together these form the negative gradient used in the
gradient descent method.

In practice, the computation time would be too long if the weight and bias nudges were to be
calculated using the whole training dataset as the input. The technique for shortening computation
time is known as stochastic gradient descent.

It begins with randomly shuffling the dataset and dividing it into mini-batches. A gradient descent
step is computed for each mini-batch; the steps are noisier, meaning that the descent is
less efficient per step, but it results in a significant computational speedup.
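The shuffle-and-split step can be sketched as follows; the dataset size and batch size are illustrative assumptions:

```python
import numpy as np

# Shuffle-and-split step of stochastic gradient descent: the dataset is
# shuffled, then divided into mini-batches; one (noisier but cheap)
# gradient step would then be taken per batch. Sizes are illustrative.
rng = np.random.default_rng(7)
n_examples, batch_size = 1000, 32

indices = rng.permutation(n_examples)              # random shuffle
batches = [indices[i:i + batch_size]
           for i in range(0, n_examples, batch_size)]

# 1000 examples -> 31 full batches of 32 plus one final batch of 8
print(len(batches), len(batches[0]), len(batches[-1]))
```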

Following the ways proposed to adjust the activation values of output neurons, the computation of the
gradient of the cost function can be shown as the following equations:

$$C=\frac{1}{n}\sum_{i=0}^{n-1} C_i \qquad \nabla C=\begin{pmatrix} \partial C/\partial w_{jk}^{(1)} \\ \partial C/\partial b_j^{(1)} \\ \vdots \\ \partial C/\partial w_{jk}^{(l)} \\ \partial C/\partial b_j^{(l)} \end{pmatrix}$$

Equation 1. Cost function & cost gradient (the superscript (1) refers to the layer pair closest to the input layer; (l) refers to the output layer)

Note that j and k refer to indices of neurons in adjacent layers and are reused from layer pair to layer pair:

$$0 \le j < \text{number of neurons in the outer layer} \qquad 0 \le k < \text{number of neurons in the inner layer}$$

(Inner refers to the layer closer to the input.)
Each element in the cost gradient vector is the average, over all training examples, of the
corresponding partial derivative of the cost of a single training example, which can be computed using
the chain rule; the computation begins from the output layer and works backwards:

$$\frac{\partial C}{\partial w_{jk}^{(l)}}=\frac{1}{n}\sum_{i=0}^{n-1}\frac{\partial C_i}{\partial w_{jk}^{(l)}} \qquad \frac{\partial C_i}{\partial w_{jk}^{(l)}}=\frac{\partial z_j^{(l)}}{\partial w_{jk}^{(l)}}\,\frac{\partial a_j^{(l)}}{\partial z_j^{(l)}}\,\frac{\partial C_i}{\partial a_j^{(l)}}$$

Equation 2. Cost & outer-weight partial derivative (output layer)

$$\frac{\partial C}{\partial b_j^{(l)}}=\frac{1}{n}\sum_{i=0}^{n-1}\frac{\partial C_i}{\partial b_j^{(l)}} \qquad \frac{\partial C_i}{\partial b_j^{(l)}}=\frac{\partial z_j^{(l)}}{\partial b_j^{(l)}}\,\frac{\partial a_j^{(l)}}{\partial z_j^{(l)}}\,\frac{\partial C_i}{\partial a_j^{(l)}}$$

Equation 3. Cost & bias partial derivative (output layer)

Each factor in the chain-rule equations can be calculated using the following equations:

$$C_i=\sum_{j=0}^{n_l-1}\left(a_j^{(l)}-y_j\right)^2 \qquad \frac{\partial C_i}{\partial a_j^{(l)}}=2\left(a_j^{(l)}-y_j\right)$$

Equation 4. Direct calculation of cost

$$n_l=\text{number of outer neurons} \qquad a_j^{(l)}=\text{outer neuron actual activation value} \qquad y_j=\text{outer neuron desired activation value}$$

$$a_j^{(l)}=f_a\!\left(z_j^{(l)}\right) \qquad \frac{\partial a_j^{(l)}}{\partial z_j^{(l)}}=f_a'\!\left(z_j^{(l)}\right)$$

Equation 5. Activation value of output neuron

$$z_j^{(l)}=\text{weighted sum of the corresponding neuron} \qquad f_a=\text{activation function}$$


$$z_j^{(l)}=\sum_{k=0}^{n_{l-1}-1}\left(w_{jk}^{(l)}a_k^{(l-1)}\right)+b_j^{(l)} \qquad \frac{\partial z_j^{(l)}}{\partial w_{jk}^{(l)}}=a_k^{(l-1)} \qquad \frac{\partial z_j^{(l)}}{\partial b_j^{(l)}}=1$$

Equation 6. Weighted sum calculation

$$w_{jk}^{(l)}=\text{weight linking inner and outer neurons} \qquad a_k^{(l-1)}=\text{activation value of inner neuron} \qquad b_j^{(l)}=\text{bias value for computation of outer neuron}$$
By substituting the equations above, the partial derivatives of the cost of a single training example
become:

$$\frac{\partial C_i}{\partial w_{jk}^{(l)}}=a_k^{(l-1)}\,f_a'\!\left(z_j^{(l)}\right)\,2\left(a_j^{(l)}-y_j\right)$$

$$\frac{\partial C_i}{\partial b_j^{(l)}}=f_a'\!\left(z_j^{(l)}\right)\,2\left(a_j^{(l)}-y_j\right)$$

Note that these partial derivatives are only for the computation of the elements associated with the
output layer and its adjacent layer in the gradient vector.

The sensitivity of the cost of a single training example to the activation values of neurons one
layer away from the output layer can also be deduced using the chain rule (summing over every outer
neuron j that the inner neuron k feeds into):

$$\frac{\partial C_i}{\partial a_k^{(l-1)}}=\sum_{j=0}^{n_l-1}\frac{\partial z_j^{(l)}}{\partial a_k^{(l-1)}}\,\frac{\partial a_j^{(l)}}{\partial z_j^{(l)}}\,\frac{\partial C_i}{\partial a_j^{(l)}} \qquad \frac{\partial z_j^{(l)}}{\partial a_k^{(l-1)}}=w_{jk}^{(l)}$$

Equation 7. Cost & activation value partial derivative

To compute the weights linked to neurons closer to the input layer, the chain-rule equations need
to be extended (j and k are reused for the outer and inner neuron of the layer pair one step closer to
the input; the last factor in each equation is given by Equation 7):

$$\frac{\partial C_i}{\partial w_{jk}^{(l-1)}}=\frac{\partial z_j^{(l-1)}}{\partial w_{jk}^{(l-1)}}\,\frac{\partial a_j^{(l-1)}}{\partial z_j^{(l-1)}}\,\frac{\partial C_i}{\partial a_j^{(l-1)}}$$

Equation 8. Cost & inner-weight partial derivative

$$\frac{\partial C_i}{\partial b_j^{(l-1)}}=\frac{\partial z_j^{(l-1)}}{\partial b_j^{(l-1)}}\,\frac{\partial a_j^{(l-1)}}{\partial z_j^{(l-1)}}\,\frac{\partial C_i}{\partial a_j^{(l-1)}}$$

Equation 9. Cost & inner-bias partial derivative

Note that the partial derivatives above need to be calculated for all training examples and the average
is taken for gradient computation.

Eventually, the gradient vector could be fully computed by extending the chain rule equations shown
above in similar sequence for each layer of the network until the input layer is reached.
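Putting the chain-rule steps together, a minimal backpropagation loop for a tiny 2-3-2 network might look like the following sketch; the sigmoid activation, the input values and the target vector are all illustrative assumptions:

```python
import numpy as np

# A 2-3-2 network trained on one made-up example, following the chain-rule
# steps above (sigmoid activation; cost = sum of squared differences).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((3, 2)), rng.standard_normal(3)
W2, b2 = rng.standard_normal((2, 3)), rng.standard_normal(2)

x = np.array([0.5, 0.9])   # input activations (illustrative)
y = np.array([1.0, 0.0])   # desired output: 1 for the correct neuron, 0 for the rest

lr = 0.5
for _ in range(500):
    # forward pass, keeping each layer's values for the backward pass
    z1 = W1 @ x + b1; a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2; a2 = sigmoid(z2)

    # backward pass: dC/da at the output, then chain through f_a'(z)
    dC_da2 = 2.0 * (a2 - y)           # derivative of the squared-difference cost
    delta2 = dC_da2 * a2 * (1 - a2)   # sigmoid'(z) = a * (1 - a)
    dC_da1 = W2.T @ delta2            # sensitivity of cost to hidden activations
    delta1 = dC_da1 * a1 * (1 - a1)

    # gradient descent step on every weight and bias
    W2 -= lr * np.outer(delta2, a1); b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x);  b1 -= lr * delta1

cost = np.sum((a2 - y) ** 2)
print(round(float(cost), 5))
```

With a single training example the cost drops close to zero, since the network only has to memorise one input-output pair.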
Tips for Neural Networks
1. Neurons in the preceding layer can be connected to one or more neurons in the current layer,
depending on the purpose of the current neuron. For example, the areas of the building and land are
critical parameters for property valuation regardless of the circumstances; however, the number of
bedrooms might be more important than the distance to the CBD in a city with a large number of
families with children. In that case, connecting the number of bedrooms (as an input neuron) to a
neuron in the following layer that combines it with the areas of the building and land is more
favourable than connecting the distance to the CBD.

2. Different activation functions could be selected for different neurons in the same layer, depending
on the criteria of "significant impact" of that neuron. For example, the areas of the building and land,
the number of bedrooms along with the age of the building could be combined for property evaluation
and the value depreciates as the building gets older; an appropriate activation function might be
Sigmoid in this case. However, some older buildings with historical significance might be a lot more
expensive than properties with similar attributes in other areas. In that case, a separate neuron
connection should be established just for the age attribute and an appropriate activation function
would be ReLU (i.e. the function is not triggered to boost property value until a threshold age is met).

3. A common problem with gradient descent is that the minimum found for the cost function could be a
local minimum when the function is not convex (i.e. not shaped like an upward-opening parabola), which
is very common for complex optimisation problems. Stochastic gradient descent is therefore employed
when seeking global minima of functions with complicated geometries. This method uses a subset of the
dataset as the input for each step, adjusting the weights and biases after each run until
the algorithm converges.

Data Preprocessing

Categorical Data Encoding


Categorical data encoding is the process of converting categorical (textual) data into numerical data
for the convenience of algorithm operation, as machines work with numbers, not strings. There are 2
types of categorical data: ordinal and nominal.

An ordinal dataset has an inherent order meaning that the data can be ranked from the lowest to the
highest or vice versa; an example would be the level of education attained.

A nominal dataset does not have an inherent order meaning that the data cannot be ranked, e.g.
locations of interest, departments at a tertiary institution.

The choice of encoding method could have a significant impact on the performance of the model
therefore choosing an appropriate method is critical at the beginning stage of the development of a
neural network.

There are several types of categorical data encoding methods:

 One-hot encoding
- The most common method: a binary column is created for each unique category in the variable
- 0 = not present, 1 = present
- Columns are moved to the beginning of the dataset

 Dummy encoding
- Works in the same way as one-hot encoding but uses one fewer column
- N categories require N-1 binary columns

 Label encoding
- Each unique category is assigned a unique integer value
- Assigned integers may be misinterpreted as having an order relationship when they do not

 Ordinal encoding
- Used when the categories have a natural order, i.e. dataset is ordinal
- Each category is assigned a numerical value based on their order

 Binary encoding
- Similar to one-hot encoding, but all categories remain in one column
- Each category is represented by a unique set of binary digits

 Count encoding
- All categories remain in one column
- Each category is represented by the number of times that it appears in the variable

 Target encoding
- The encoded category must have some connections with the target to be predicted
- Target mean of each category is often calculated as the probability of achieving the target
- The corresponding target mean is assigned to each category
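A few of the encoding methods above can be sketched in plain Python; the department names are made-up examples:

```python
# One-hot, label, and count encoding for a small nominal variable
# (the department names are illustrative).
departments = ["Science", "Arts", "Science", "Engineering", "Arts", "Science"]
categories = sorted(set(departments))  # ['Arts', 'Engineering', 'Science']

# One-hot: one binary column per unique category (0 = not present, 1 = present)
one_hot = [[1 if d == c else 0 for c in categories] for d in departments]

# Label encoding: one integer per category (the order carries no real meaning)
label_of = {c: i for i, c in enumerate(categories)}
encoded_labels = [label_of[d] for d in departments]

# Count encoding: each category replaced by how often it appears in the variable
counts = {c: departments.count(c) for c in categories}
count_encoded = [counts[d] for d in departments]

print(one_hot[0], encoded_labels, count_encoded)
```

Dummy encoding would simply drop one of the one-hot columns, since the remaining N-1 columns already determine it.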

Feature Scaling
Feature scaling is the process of transforming a dataset to fit a specific scale, which improves the
performance of machine learning. There are 2 common methods: standardisation and normalisation.

The formula for standardisation is:

$$X'=\frac{X-\mu}{\sigma}$$

The formula for normalisation is:

$$X'=\frac{X-X_{min}}{X_{max}-X_{min}}$$
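Both formulas applied to a made-up feature column:

```python
import numpy as np

# A made-up feature column to scale.
X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

standardised = (X - X.mean()) / X.std()            # zero mean, unit variance
normalised = (X - X.min()) / (X.max() - X.min())   # rescaled into [0, 1]

print(normalised)
```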

Genetic Algorithm
Introduction to Genetic Algorithms
Genetic algorithms (under the class of evolutionary algorithms) are used for solving constrained and
unconstrained optimisation problems based on natural selection:

- The process begins with creating a random selection of solutions as the initial population,
known as Generation 0. The fitness of each solution is calculated by a fitness function.

- Crossover then occurs to produce offspring by swapping parts of the genetic code of any 2 solutions;
solutions with higher fitness scores are more likely to undergo this process.

- A mechanism known as elitism is triggered before the complete generational transition
occurs. It keeps the n best solutions from the previous generation to prevent the production of
poor generations.

- Mutation is then introduced to randomly alter the solutions for the sake of seeking an optimal
solution. The offspring generated are known as Generation 1, and the fitness score of each
solution in the new population is calculated.

The process repeats itself until an optimal solution is acquired or the maximum computable number of
generations is reached.
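The generation loop above can be sketched on the classic OneMax problem (fitness = number of 1-bits in a solution); all parameter values here are illustrative assumptions:

```python
import random

# A bare-bones genetic algorithm on the OneMax problem, following the
# steps above: fitness, crossover, elitism, mutation, repeat.
random.seed(0)
GENES, POP, ELITE, MUT_RATE, GENERATIONS = 20, 30, 2, 0.02, 60

def fitness(solution):
    return sum(solution)  # number of 1-bits

# Generation 0: a random selection of solutions
population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]

for _ in range(GENERATIONS):
    ranked = sorted(population, key=fitness, reverse=True)
    next_gen = ranked[:ELITE]                      # elitism: keep the n best
    while len(next_gen) < POP:
        # fitter solutions are more likely to be chosen as parents
        p1, p2 = random.choices(ranked, weights=[fitness(s) + 1 for s in ranked], k=2)
        cut = random.randint(1, GENES - 1)         # single-point crossover
        child = p1[:cut] + p2[cut:]
        # mutation: flip each bit with a small probability
        child = [1 - g if random.random() < MUT_RATE else g for g in child]
        next_gen.append(child)
    population = next_gen

best = max(population, key=fitness)
print(fitness(best))
```

The loop would normally also stop early once an optimal solution (here, all 1s) is found.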

Training NN Models with GA

General Notes on Modelling


Measures of Predictive Model Accuracy
The R² statistic is the most common approach for measuring the accuracy of a predictive model. When
predicted values are compared with observed values, R² measures how much of the total variation in
the observed values is explained by the model's predictions.

The part of such variation that is NOT explained can be calculated using the fraction shown in the
equation below; therefore R² is calculated by subtracting the fraction from 1:

$$R^2=1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}, \quad \in[0,1]$$

where $\bar{y}$ is the mean of the observed values.

A higher R² value indicates a good fit for the regression model, and such a model is most likely
to accurately predict responses for future observations. Note that R² is also known as the coefficient of
determination; it is the square of the Pearson correlation coefficient for simple linear regression,
but the two are not directly related for complex multivariate regression.

R² can become misleading as the number of independent variables (or types of
observed values) increases, since it will either remain unchanged or increase in most cases, irrespective
of the significance of the variable. Adjusted R² resolves this problem and can be calculated
using the following equation:

$$\text{Adjusted } R^2=1-\frac{\left(1-R^2\right)\left(N-1\right)}{N-k-1}$$

$$N=\text{total sample size (no. of data points)} \qquad k=\text{no. of independent variables}$$
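Both statistics computed directly from the equations above, using made-up observed and predicted values:

```python
import numpy as np

# R^2 and adjusted R^2 from made-up observed and predicted values.
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])      # observed
y_hat = np.array([2.8, 5.3, 6.9, 9.4, 10.6])  # predicted

ss_res = np.sum((y - y_hat) ** 2)             # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)          # total variation
r2 = 1.0 - ss_res / ss_tot

N, k = len(y), 1                              # sample size, no. of predictors
adj_r2 = 1.0 - (1.0 - r2) * (N - 1) / (N - k - 1)

print(round(float(r2), 4), round(float(adj_r2), 4))
```

Note that the adjusted value is always slightly lower than R² whenever k > 0, which is the intended penalty for adding predictors.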

Mean squared error (MSE) is used to measure the deviation of predicted values from observed
values; it emphasises larger errors due to squaring and is more sensitive to outliers:

$$MSE=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2, \quad \in[0,\infty)$$
In practice, the root mean squared error (RMSE) is more commonly used, as shown below:

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}, \quad \in[0,\infty)$$

When the error is small, MSE or RMSE may not be suitable as their values could become too small
for useful comparison due to squaring.

Mean absolute error (MAE) treats all errors equally. Absolute values of the errors are taken so that
they do not cancel each other out, which serves the same purpose as squaring in MSE and RMSE. It is
less sensitive to outliers and provides a more balanced view of the performance of the model:

$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|, \quad \in[0,\infty)$$
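The three metrics side by side on the same made-up values:

```python
import numpy as np

# MSE, RMSE and MAE computed on the same made-up observed/predicted values.
y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.0, 8.0, 9.5])

errors = y - y_hat                 # [0.5, 0.0, -1.0, -0.5]
mse = np.mean(errors ** 2)         # squaring emphasises the larger errors
rmse = np.sqrt(mse)                # back in the same units as y
mae = np.mean(np.abs(errors))      # treats all errors equally

print(float(mse), round(float(rmse), 4), float(mae))
```

Note how the single largest error (1.0) pulls MSE and RMSE up more than MAE, matching the sensitivity discussion above.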
