CACSC18: Lecture 2
Prerequisites
• ALGEBRA
• CALCULUS
• STATISTICS
• PROBABILITY
• MACHINE LEARNING
• PYTHON
Leibniz–Newton calculus controversy
https://en.wikipedia.org/wiki/Leibniz%E2%80%93Newton_calculus_controversy
https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
Turing Test by Alan Turing
Artificial Intelligence -> Machine Learning (ML)
At a high level, ML generally means algorithms or models that follow three steps:
1. Data: get a lot of (cleaned) data, with human-defined features (e.g. “age”, “height”, “FICO score”, “is this email spam?” etc.)
2. Training: use the data to “tune” the relative importance of each feature.
3. Inference: predict something on new data.
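As a minimal, hedged sketch of these three steps (the dataset, the feature names, and the choice of logistic regression are illustrative assumptions, not part of the lecture):

    # Minimal Data -> Training -> Inference sketch; assumes scikit-learn.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # 1. Data: rows of human-defined features, e.g. [age, height_cm, fico_score]
    X = np.array([[25, 170, 680],
                  [40, 160, 720],
                  [33, 180, 590],
                  [51, 175, 810]])
    y = np.array([0, 1, 0, 1])  # hypothetical binary labels, made up for this sketch

    # 2. Training: tune the relative importance (weight) of each feature
    model = LogisticRegression().fit(X, y)
    print(model.coef_)  # learned per-feature weights

    # 3. Inference: predict something on new data
    print(model.predict(np.array([[29, 165, 700]])))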
Machine Learning -> Deep Learning
• DL is a subset of ML. It is based on neural networks, a conceptual model of the brain that has been around since the 1950s but was largely ignored until recently. That’s because they are very computationally expensive, and it’s only recently that
• Processing has become sufficiently cheap and powerful, through GPUs and FPGAs, and
• There’s been enough data to feed the DL algorithms.
AI History
CACSC18: Lecture 3
History of Deep Learning
• Philosophy of mind: Aristotle
• The art of Ramon Llull (c. 1232–1316): from theology to mathematics
• The laws of thought: Boole
• Turing’s thesis:
“L.C.M.s [logical computing machines: Turing’s expression for Turing machines]
can do anything that could be described as ‘rule of thumb’ or ‘purely
mechanical’.” (Turing 1948: 414)
• The ENIAC (Electronic Numerical Integrator and Computer) was invented by J. Presper Eckert and John Mauchly at the University of Pennsylvania; construction began in 1943 and was not completed until 1946. It occupied about 1,800 square feet, used about 18,000 vacuum tubes, and weighed about 30 tons.
Reticular Theory vs. Neuron Theory (1871–1906)
• Reticular theory is an obsolete scientific theory in neurobiology that stated that everything in the nervous system, such as the brain, is a single continuous network.
• The concept was postulated by the German anatomist Joseph von Gerlach in 1871 and was most popularized by the Nobel laureate Italian physician Camillo Golgi.
McCulloch–Pitts neuron model
• The early model of an artificial neuron was introduced by Warren McCulloch, a neuroscientist, and Walter Pitts, a logician, in 1943.
• The McCulloch–Pitts neuron model is also known as a linear threshold gate.
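A minimal sketch of a McCulloch–Pitts unit as a linear threshold gate (the AND/OR threshold values are standard textbook choices, not taken from the slides):

    # McCulloch-Pitts neuron: fire (output 1) iff the sum of the binary
    # inputs reaches a fixed threshold; the 1943 model has no learned
    # weights, so the threshold alone encodes the logic function.
    def mp_neuron(inputs, threshold):
        return 1 if sum(inputs) >= threshold else 0

    # With two binary inputs, threshold 2 realizes AND; threshold 1 realizes OR.
    print(mp_neuron([1, 1], threshold=2))  # AND(1, 1) -> 1
    print(mp_neuron([1, 0], threshold=2))  # AND(1, 0) -> 0
    print(mp_neuron([1, 0], threshold=1))  # OR(1, 0)  -> 1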
Alan Turing -> Artificial Intelligence (AI)
Pre-training (2006–2009)
• How to initialize a network?
• Better optimization algorithm?
• Better regularization algorithm?
Success in handwriting recognition
Success in speech recognition (2010)
New record on MNIST (2010)
Ciresan et al. set a new record on the MNIST dataset using good old backpropagation on a GPU.
• D.C. Ciresan et al. achieved a 0.56% error rate in the IJCNN traffic sign recognition competition.
ImageNet top-5 error rate and network depth:
• AlexNet: 16.0% error, 8 layers
• VGGNet: 7.3% error, 19 layers
• GoogLeNet: 6.7% error, 22 layers
• Hubel and Wiesel experimentally showed that each neuron has a fixed receptive field.
Convolutional neural network (1989)
• LeCun et al.: handwritten text recognition using backpropagation over a convolutional neural network.
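To make “convolutional” concrete, here is a minimal NumPy sketch of a single 2D convolution with one filter (the input size and the vertical-edge kernel are illustrative assumptions, not LeCun’s actual architecture):

    import numpy as np

    def conv2d(image, kernel):
        # Slide the kernel over the image ("valid" positions only) and take
        # a dot product at each location; the shared weights give every
        # output neuron the same small, fixed receptive field.
        kh, kw = kernel.shape
        H, W = image.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    image = np.random.rand(28, 28)       # e.g. an MNIST-sized input
    kernel = np.array([[1, 0, -1],
                       [1, 0, -1],
                       [1, 0, -1]])      # simple vertical-edge detector
    print(conv2d(image, kernel).shape)   # (26, 26)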
Better optimization methods (1983–2018)
• Nesterov (1983)
• Adagrad (2011)
• RMSProp (2012)
• Adam / Batch Normalization (2015)
• Eve (2016)
• Beyond Adam (2018)
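A hedged sketch of what some of these methods actually change, written as per-parameter update steps in NumPy (the hyperparameter defaults are the commonly published values, an assumption here; this is not code from any of the cited papers):

    import numpy as np

    # One gradient step on parameter vector w with gradient g.
    def sgd(w, g, lr=0.01):
        return w - lr * g

    def rmsprop(w, g, s, lr=0.001, beta=0.9, eps=1e-8):
        s = beta * s + (1 - beta) * g**2        # running average of squared grads
        return w - lr * g / (np.sqrt(s) + eps), s

    def adam(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * g         # first moment (mean of grads)
        v = beta2 * v + (1 - beta2) * g**2      # second moment (uncentered variance)
        m_hat = m / (1 - beta1**t)              # bias correction for early steps
        v_hat = v / (1 - beta2**t)
        return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

    w, g = np.zeros(2), np.array([0.1, -0.2])
    print(sgd(w, g))                            # plain SGD step
    print(rmsprop(w, g, s=np.zeros(2))[0])      # RMSProp step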
What Changed? Why Now?
• Data along with GPUs probably explains most of the improvements we’ve seen. Deep learning is a furnace that needs a lot of fuel to keep burning, and we finally have enough fuel.
Backprop-friendly activation functions
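ReLU is the canonical example: the sigmoid’s derivative is at most 0.25 and vanishes for large |x|, so gradients shrink as they are multiplied through many layers, while ReLU’s derivative is exactly 1 wherever the unit is active. A minimal NumPy sketch of the comparison:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        s = sigmoid(x)
        return s * (1 - s)            # peaks at 0.25, ~0 for |x| > 5

    def relu_grad(x):
        return (x > 0).astype(float)  # exactly 1 on the active side

    x = np.array([-5.0, 0.0, 5.0])
    print(sigmoid_grad(x))  # [~0.007, 0.25, ~0.007]
    print(relu_grad(x))     # [0., 0., 1.]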
Improved architectures
• ResNets, inception modules, and highway networks keep the gradients flowing smoothly, and let us increase the depth and flexibility of the network (see the residual-block sketch below).
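A minimal sketch of the residual idea (a toy fully-connected block with random placeholder weights; real ResNets use convolutional layers):

    import numpy as np

    def residual_block(x, W1, W2):
        # y = x + F(x): the identity "skip" path gives gradients a direct
        # route around the block, which is what keeps them flowing in
        # very deep networks.
        h = np.maximum(0, W1 @ x)   # F(x): linear -> ReLU -> linear
        return x + W2 @ h

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)
    W1, W2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
    print(residual_block(x, W1, W2))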
Software platforms