AIDL03 EvolutionOfAI

"Within a generation, the problem of creating 'artificial intelligence' will be substantially solved."

Marvin Minsky, 1967

Evolution of AI
Shankar Venkatagiri


Computing

Charles Babbage, 1837

Ada Lovelace, 1843:

"The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform… Its province is to assist us in making available what we are already acquainted with."

Source: John Cummings, Wikipedia
Neuron

[Diagram: inputs x1, x2, …, xn feed a summer Σ; when z >= θ, the activation f emits the output y]

❖ 1943: McCulloch & Pitts propose a neuron model: a threshold logic unit (TLU) with Boolean inputs and outputs
❖ A summer Σ emits z = Σi xi = x1 + x2 + … + xn
❖ If the output of Σ exceeds a threshold θ, then the neuron "fires"
❖ f is a threshold activation function
❖ Insight: networks of TLUs can compute arithmetic and logical functions


Compute

[Diagram: TLU with inputs x1, x2, …, xn feeding Σ and f, which emits y]

x1  x2  …  xn   OR
0   0   …  0    0
1   0   …  0    1
1   1   …  0    1
…   …   …  …    …
1   1   …  1    1

Q: What will the summer Σ do?


❖ Let’s construct a TLU neuron that represents the OR function

❖ Q: What will be the threshold θ? What will f do?


❖ Q: Can you modify the solution to represent the AND function?
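A minimal Python sketch of one possible answer (the function names and threshold values are illustrative, not taken from the slides): a summer plus a hard threshold computes OR when θ = 1, and AND when θ equals the number of inputs.

```python
# A McCulloch-Pitts TLU: Boolean inputs, a summer, a hard threshold
def tlu(inputs, theta):
    z = sum(inputs)                  # the summer Σ
    return 1 if z >= theta else 0    # f: fire when z reaches the threshold θ

def OR(*xs):
    return tlu(xs, theta=1)          # a single 1 already reaches the threshold

def AND(*xs):
    return tlu(xs, theta=len(xs))    # only all-1 inputs reach the threshold

print(OR(0, 0), OR(1, 0), OR(1, 1))  # 0 1 1
print(AND(1, 0), AND(1, 1))          # 0 1
```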
Gallery
Filename: ActivationFunctions.ipynb

❖ Hard limit (aka hardlims)
❖ Sigmoid: 1 / (1 + e^(-x))
❖ Gaussian: e^(-x²)
❖ Sinc: sin(x) / x
❖ Tanh: (e^x - e^(-x)) / (e^x + e^(-x))
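As a companion to the gallery, here is a sketch of how these functions could be written with NumPy; the actual contents of ActivationFunctions.ipynb are not reproduced in the slides, so the definitions below are assumptions.

```python
import numpy as np

def hardlims(x):                 # hard limit: -1 below 0, +1 at or above
    return np.where(x >= 0, 1.0, -1.0)

def sigmoid(x):                  # 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def gaussian(x):                 # e^(-x^2)
    return np.exp(-x ** 2)

def sinc(x):                     # sin(x) / x, with sinc(0) = 1
    return np.sinc(x / np.pi)    # np.sinc(t) computes sin(pi*t)/(pi*t)

def tanh(x):                     # (e^x - e^(-x)) / (e^x + e^(-x))
    return np.tanh(x)

xs = np.linspace(-5, 5, 11)
for f in (hardlims, sigmoid, gaussian, sinc, tanh):
    print(f.__name__, np.round(f(xs), 3))
```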
Weights

❖ 1958: Frank Rosenblatt proposes a neuron model with weights


❖ Adding a weight w0 (bias), we can eliminate the threshold θ

[Diagram: weighted neuron with bias input 1 (weight w0) and inputs x1 … xn (weights w1 … wn); the summer Σ produces the logit z, and the activation f(z) outputs y = 1 when z >= 0, else 0]

z = w0 + w1 x1 + w2 x2 + … + wn xn
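A small sketch of the bias trick in NumPy (the weight values are illustrative): prepending a constant input of 1 with weight w0 lets f compare the logit against 0 rather than against a separate threshold θ.

```python
import numpy as np

w = np.array([-1.5, 1.0, 1.0])         # [w0, w1, w2]: illustrative weights
x = np.array([1.0, 1.0])               # inputs x1, x2

z = w @ np.concatenate(([1.0], x))     # z = w0 + w1*x1 + w2*x2
y = 1 if z >= 0 else 0                 # fire when the logit is non-negative
print(z, y)                            # 0.5 1
```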
Application

[Image: linearly separable data points split by a decision boundary; credit: Elizabeth Goodspeed]

❖ Q: What can we achieve with this "naive" model?
❖ Suppose the data points are linearly separable
❖ We can build a Classifier
❖ Learning process: the decision boundary adjusts with the data
Perceptron

[Diagram: a single layer of Σ|f neurons; inputs 1, x1, x2 connect to every neuron, and the neurons emit outputs y1, y2]
❖ Model generalises to a layer of neurons, multiple outputs


❖ Each arrow carries one weight: the collection of weights is the “model”
❖ Insight: Perceptron Rule helps the network learn complicated mappings
❖ A convergence result guarantees the rule terminates in finitely many iterations when the data are linearly separable (see the sketch below)
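A sketch of the Perceptron Rule, w ← w + η(t − y)x, on a toy dataset; the learning rate η, epoch count, and the choice of the OR labels are illustrative, not from the slides.

```python
import numpy as np

def train_perceptron(X, t, epochs=20, eta=0.1):
    X = np.hstack([np.ones((len(X), 1)), X])   # bias trick: prepend input 1
    w = np.zeros(X.shape[1])                   # weights [w0, w1, w2]
    for _ in range(epochs):
        for x_i, t_i in zip(X, t):
            y_i = 1 if w @ x_i >= 0 else 0     # hard-limit activation
            w += eta * (t_i - y_i) * x_i       # the Perceptron Rule update
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 1, 1, 1])                     # OR labels: linearly separable
print(train_perceptron(X, t))                  # learned [w0, w1, w2]
```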
Learning

[Diagram: data plotted in (x, y) coordinates, and again in transformed (u, v) coordinates where it becomes linearly separable]

Rule: If u >= 0, then red, else green

Problem: Given any point, declare it as green or red

Modifying the representation reduces complexity
Learning involves iterating through all the solution candidates (the hypothesis
space) and identifying an "optimal" solution
Need: A mechanism that suitably represents the data and iteratively solves
the problem by minimising loss = deviations from the truth
Challenge

x1  x2  XOR
0   0   0
1   0   1
0   1   1
1   1   0

[Diagram: Multi-Layer Perceptron solution: x1 and x2 feed the summer Σ with weights w = 1 each, and also a product unit ∏ whose output feeds Σ with weight w = -2; Σ emits y]

❖ XOR assumes 1 when exactly one of the inputs is 1


❖ Otherwise it evaluates to 0
❖ Q: Can you “solve” this using a neuron? Or a single layer perceptron?
❖ Short answer: You cannot (Minsky & Papert, 1969)
❖ This led to funding cuts for AI, and an AI Winter
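Under one reading of the diagram above (the product unit ∏ supplies x1·x2, which enters the summer with weight -2), the multi-layer solution computes y = x1 + x2 − 2·x1·x2, which matches the XOR column on Boolean inputs. A quick check:

```python
for x1 in (0, 1):
    for x2 in (0, 1):
        p = x1 * x2                # the product unit ∏
        y = 1*x1 + 1*x2 - 2*p      # the summer Σ with weights 1, 1, -2
        print(x1, x2, y)           # reproduces the XOR column: 0 1 1 0
```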
DNNs

Source: Chollet, F., Deep Learning with Python (2018)

[Diagram: inputs → hidden layers of neurons → outputs]

❖ Perceptrons connected in multiple layers - deep neural nets


❖ Tasks: Classifying images, generating music, …
❖ Learning involves adjusting layers & edge weights to achieve the task
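A minimal Keras sketch of such a net; the layer sizes and the classification task are illustrative, since the slides do not fix an architecture.

```python
from tensorflow import keras

# Inputs -> hidden layers of neurons -> outputs, as in the diagram
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),               # input nodes
    keras.layers.Dense(16, activation="relu"),    # hidden layer
    keras.layers.Dense(16, activation="relu"),    # hidden layer
    keras.layers.Dense(3, activation="softmax"),  # output nodes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()   # lists the layers and the edge weights to be adjusted
```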
Backprop

❖ Q: How to “solve” a deep neural net?


❖ In 1970, Seppo Linnainmaa describes back-propagation
❖ In 1986, Rumelhart, Hinton & Williams apply BP to "solve" neural networks
❖ “… allow an arbitrarily connected neural network to develop an internal
structure that is appropriate for a particular task domain…”
❖ People refuse to take note!
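A hand-worked sketch of what BP does on the smallest possible net: one input, one weight, a sigmoid activation, and a squared-error loss. All numbers are illustrative.

```python
import math

x, t = 1.5, 1.0          # input and target
w = 0.2                  # initial weight
eta = 0.5                # learning rate

for step in range(3):
    z = w * x                           # forward pass: logit
    y = 1 / (1 + math.exp(-z))          # forward pass: sigmoid activation
    loss = 0.5 * (y - t) ** 2
    # backward pass, via the chain rule: dL/dw = dL/dy * dy/dz * dz/dw
    grad = (y - t) * y * (1 - y) * x
    w -= eta * grad                     # gradient descent step
    print(f"step {step}: loss={loss:.4f}, w={w:.4f}")
```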
CNNs

❖ Jeopardy! style hint: He is the Chief AI Scientist at Meta.


❖ In 1989, Yann LeCun cracks handwritten digit recognition (ZIP codes, the precursor of the MNIST benchmark)
❖ Solution exploits Convolutional Neural Nets
RNNs

Image: fdeloche

❖ Some applications involve sequential data, with variable input lengths


❖ Time series data - stock prices, city temperature
❖ Autonomous driving systems - anticipate trajectories and avoid accidents
❖ Natural language processing - sentiment prediction, machine translation
❖ Recurrent neurons constitute a class of nets that can predict the future
❖ A portion of the output is fed back into the next input (memory)
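A sketch of a single recurrent step (the weights and the toy sequence are illustrative): the hidden state h carries a portion of the previous output into the next input.

```python
import numpy as np

Wx, Wh, b = 0.8, 0.5, 0.0            # input weight, recurrent weight, bias

def rnn_step(x_t, h_prev):
    """h_t depends on the current input AND the previous state (memory)."""
    return np.tanh(Wx * x_t + Wh * h_prev + b)

h = 0.0
for x_t in [1.0, 0.0, -1.0, 0.5]:    # a toy sequence of variable length
    h = rnn_step(x_t, h)
    print(round(float(h), 3))
```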
LSTMs

Image: fdeloche
❖ Melanie Mitchell's example:
❖ "My mother said that the cat that flew with her sister to Hawaii the year before you started at that new high school is now living with my cousin."
❖ Q: Who is living with my cousin?
❖ RNNs have trouble processing this - they need longer memory
❖ In 1995, Hochreiter and Schmidhuber propose Long Short-Term Memory (LSTM) as a solution
❖ In 2016, Google Translate used 8 encoder + 8 decoder layers of 1024-unit LSTMs
Classification

❖ playground.tensorflow.org
❖ Play with the controls and train the neural network to classify
❖ No. of input & output nodes, layers
❖ Introduce noise, increase the train-to-test ratio
❖ Change the pattern and observe
Regression

x:  -1   0   1   2   3   4
y:  -3  -1   1   3   5   7

[Diagram: a single neuron with bias input 1 (weight w0) and input x (weight w1); the summer Σ emits the output y]

❖ Q: Can you relate x and y?


❖ Option A: By direct observation, conclude that y = 2 x - 1
❖ Option B: Fit a regression model y = w0 + w1 x using Excel / Python / …
❖ Option C: Figure out approximate (w0, w1) using a neural model
❖ Open LinReg.ipynb on Google Colab
❖ Q: Why is this code difficult to run on a local machine?
Steps

x:  -1   0   1   2   3   4
y:  -3  -1   1   3   5   7

1. Set up a single layer model


❖ One neuron unit, one input (x)
❖ Dense ⟹ all nodes link up
❖ Default activation is linear
2. The optimiser (SGD) will minimise a loss function (MSE)
3. Fit the model to the data - over multiple epochs
4. Predict y for an x value - calibrate the outcome
Image Source: Prof. Rene Vidal, JHU
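A sketch of steps 1-4 in Keras, assumed to resemble LinReg.ipynb (which is not reproduced in the slides):

```python
import numpy as np
from tensorflow import keras

x = np.array([-1, 0, 1, 2, 3, 4], dtype=float)
y = np.array([-3, -1, 1, 3, 5, 7], dtype=float)

# 1. Single-layer model: one Dense unit, one input, linear activation
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

# 2. The SGD optimiser will minimise mean squared error
model.compile(optimizer="sgd", loss="mean_squared_error")

# 3. Fit the model to the data over multiple epochs
model.fit(x, y, epochs=500, verbose=0)

# 4. Predict y for an x value and calibrate the outcome
print(model.predict(np.array([10.0])))
```

With enough epochs, the learned (w0, w1) approach (-1, 2), so the prediction for x = 10 lands near 19 rather than exactly on it.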
Award

❖ Video: And the Turing Award for 2018 goes to… (up to 3:45)
❖ Geoffrey Hinton: “If you have an idea, and it seems to you it has to be right,
don’t let people tell you it is silly. Just ignore them!”
