Lesson 1 - History, Definitions and Basic Concepts
1. What is the difference between AI, Machine Learning (ML) and Deep Learning (DL) ?
2. History of AI/ML/DL
3. Definitions & Basic concepts of ML/DL (Loss/Cost Function, Mean Squared Error
(MSE), Regression, Classification, Supervised Learning (labels are given for all
inputs), Unsupervised Learning)
What is the difference between AI, ML and DL?
Last week we saw that AI (and ML/DL) are hyped under the digitalization umbrella to the
extent that people believe that their jobs will be taken over by robots in a few years. The
euphoria/fear comes from a lack of knowledge about the subject.
[Diagram: nested circles showing DL as a subset of ML, which is in turn a subset of AI]
AI: has been around for a while and can be defined as the ‘automation of thought’… for
example, a calculator or an Excel spreadsheet can be considered AI.
ML: addresses the topic of a computer learning the rules of a system by looking only at
inputs and outputs. In 1957, Frank Rosenblatt invented the Perceptron, the simplest form
of a Neural Network (NN). The Perceptron created a lot of excitement… but this changed in
1969.
In 1969, Minsky and Papert published a book highlighting the Perceptron’s limitations and
showing that Frank Rosenblatt’s predictions had been grossly exaggerated. For example, the
Perceptron was not able to learn an XOR (exclusive OR) function. An XOR function is a logical
operator that outputs TRUE if the two inputs differ. Research funding dried up and this led to
the first AI (or NN) winter.
1986: Rumelhart and Hinton solved the XOR problem and showed that backpropagation
(the algorithm used to train NNs) works. This was the basis of all subsequent NN and deep
learning progress.
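To make the XOR point concrete, here is a minimal sketch (not part of the original notes; the layer size and solver are illustrative assumptions) showing that a tiny two-layer network trained with backpropagation can learn the XOR truth table, which a single perceptron cannot represent:

```python
# Minimal sketch: a small two-layer network can learn XOR,
# while a single linear perceptron cannot separate it.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]  # XOR: TRUE only when the two inputs differ

# One hidden layer of 4 tanh units; lbfgs is a reasonable choice on tiny datasets.
clf = MLPClassifier(hidden_layer_sizes=(4,), activation="tanh",
                    solver="lbfgs", max_iter=1000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))  # a successful fit reproduces the XOR table: [0 1 1 0]
```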
In 1989, Yann LeCun (Bell Labs) combined the ideas of convolutional neural networks (CNNs,
also called convnets) and backpropagation and applied them to classifying handwritten digits.
The resulting network, named LeNet (about 60k parameters), was used by the US Postal Service
to automate reading ZIP codes on mail envelopes.
NNs however faced another winter in the 1990s and the early 2000s, mainly because of the
popularity of Kernel Methods. Kernel Methods are a group of classification algorithms, the
best known of which is the Support Vector Machine (SVM). In fact, in the early 2000s, most
NN papers were rejected at AI conferences. NNs were still considered too expensive to train
in terms of computational power.
DL: DL is a subset of ML that puts emphasis on learning successive layers of increasingly
meaningful representations. In 2006, Geoffrey Hinton launched the term ‘Deep Learning’
(referring to neural networks only) to explain new algorithms that let computers ‘see’ and
distinguish objects and text in images and videos. Suddenly all other ML algorithms were
branded as shallow learning.
The rebranding trick and technological advances in GPUs revived research on NNs.
AlexNet (2012): Alex Krizhevsky (advised by Hinton) created a broad and deep CNN (about
60 million parameters) that won the 2012 ImageNet competition (an image classification
competition: 1.4M high-resolution color images to be classified into 1000 categories). 2012
marked the first year where a CNN was used to achieve a top-5 test error rate of 15.4%.
Since then many enhancements were developed to make NNs perform better and solve
new problems. By 2015, the winner reached an error rate of 3.6%. NNs have completely
replaced SVMs and Decision Trees in a wide range of applications. In 2016 and 2017, Kaggle
was dominated by gradient boosting machines (where structured data is available) and DL
(used for perceptual problems such as image classification). The most popular libraries are
XGBoost and Keras.
Most industries do not need or use DL. DL is not always the right tool for the job. Today
however, for image classification, speech recognition or machine translation, nobody would
dream of trying to do it without NNs.
In March 2019, Hinton, LeCun and Bengio won the Turing Award for their contributions to
deep learning: https://www.nytimes.com/2019/03/27/technology/turing-award-ai.html
Definitions & Basic concepts:
Pre-ML: Input Data → Model (programmed by a human) → Output
ML (Training): Input Data + Output Data → Computer ‘learns’ the Model
ML (Predicting): New Input Data → Model → Prediction
Example: an algorithm that predicts the value (k$) of a house in a specific city given its size (m²).
x_i is the size of house i (m²)
y_i is the price of house i (k$)
m: number of training samples (here m = 6)
h(x_i) = y_p^i: the hypothesis or prediction, a function learned during the training phase
allowing us to predict the price of any house given its size.
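As a minimal sketch of such a hypothesis, using the slope and intercept of the fitted line shown further below (y = 3.3x − 182), the learned model is just a line with parameters w and b:

```python
# Parameters taken from the fitted line y = 3.3x - 182
# (w = slope in k$/m², b = intercept in k$).
w, b = 3.3, -182.0

def h(x):
    """Hypothesis: predicted price (k$) of a house of size x (m²)."""
    return w * x + b

print(h(120))  # predicted price for a 120 m² house: 3.3*120 - 182 = 214.0 k$
```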
In order to measure the performance of the computer’s model we introduce a Cost
Function C indicating how far the output is from what we expect. ML aims at using
the loss as a feedback signal to adjust the model parameters (w and b) in order to
minimize the loss (y − y_p). This turns learning into a minimization problem that we solve
by differentiation, so the cost function must be continuous and smooth.
C(w,b) = \frac{1}{6} \sum_{i=1}^{6} \left(y_p^i - y^i\right)^2, with y_p the predicted housing price and y the actual housing price (this is the Mean Squared Error, MSE).
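A minimal sketch of this cost computation in code; the six (size, price) pairs below are hypothetical stand-ins, since the actual course data points are not reproduced in these notes:

```python
# Hypothetical training set: sizes (m²) and actual prices (k$) of m = 6 houses.
sizes  = [ 80, 120, 150, 200, 250, 300]
prices = [ 90, 210, 310, 480, 650, 800]

w, b = 3.3, -182.0  # parameters of the current model h(x) = w*x + b

# Mean squared error over the m training samples.
m = len(sizes)
cost = sum((w * x + b - y) ** 2 for x, y in zip(sizes, prices)) / m
print(cost)
```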
[Figure: scatter plot of house price (k$, y-axis, 0–1000) versus house size (m², x-axis, 0–400) with the
fitted regression line y = 3.3x − 182; the line does not fit the points perfectly, and the cost measures this misfit.]
A Classification Problem is an algorithm that takes as input the (x1, x2) coordinates of a point
and outputs whether the point is to be classified as O or X (the raw output can be converted to
a probability using a sigmoid function). The performance of the algorithm can be defined as the
% of the points that are correctly classified. In reality there are more potential performance
KPIs that we will discuss later.
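A minimal sketch of such a classifier; the points and labels below are made up for illustration. Logistic regression produces a probability via the sigmoid function, which is then thresholded into O or X:

```python
# Minimal sketch: classify 2D points as O (0) or X (1) and report a probability.
from sklearn.linear_model import LogisticRegression

# Hypothetical training points (x1, x2) and their labels.
X_train = [[1, 1], [2, 1], [1, 2], [6, 5], [7, 6], [6, 7]]
y_train = [0, 0, 0, 1, 1, 1]  # 0 = 'O', 1 = 'X'

clf = LogisticRegression()
clf.fit(X_train, y_train)

point = [[4, 4]]
print(clf.predict(point))        # predicted class (0 or 1)
print(clf.predict_proba(point))  # class probabilities obtained via the sigmoid
```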
The examples that we discussed are examples of supervised learning, because we fed the
computer with features and the associated labels. In supervised learning, we teach the
machine ‘by example’. We feed the machine with examples so that the machine can learn
patterns in these examples. Once the patterns are understood by the machine, we can have
the machine apply the patterns learned on new data.
Another set of algorithms, like clustering (K-Means), uses unsupervised learning. The
computer is fed with unlabeled data. Based on a few parameters you ask the computer to
find some structure in the data.
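A minimal sketch of K-Means clustering on made-up, unlabeled 2D points; the main parameter you choose is the number of clusters:

```python
# Minimal sketch: K-Means finds structure in unlabeled data.
from sklearn.cluster import KMeans

# Hypothetical unlabeled 2D points.
X = [[1, 1], [1.5, 2], [1, 0.5], [8, 8], [8.5, 9], [9, 8]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)
print(kmeans.labels_)           # cluster assignment found for each point
print(kmeans.cluster_centers_)  # coordinates of the two cluster centres
```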
In reinforcement learning (RL), we let the machine (an agent) learn from experience as it
observes ‘a world’. In RL, data scientists need to design the algorithm, the world in which
the agent operates and the reward structure. The machine’s goal is to select a sequence of
actions that maximizes the reward.
RL brings its own set of challenges such as:
1. How to design a reward structure that takes into account future rewards and avoids
short-term thinking? One option is to work with ‘discounted’ rewards (a minimal sketch
follows after this list).
2. How can we let the agent learn from experience without compromising safety? It is
unthinkable that we would let an autonomous vehicle ‘gain experience’ from
operating on a public road. In this case, simulation becomes key.
3. Do we allocate rewards along the process of ‘gaining experience’ or do we only
allocate a reward at the end when the mission is completed successfully?
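As a minimal sketch of the ‘discounted’ rewards idea from point 1 (the reward values and discount factor below are arbitrary choices for illustration), future rewards are multiplied by γ^t so that rewards further in the future weigh less:

```python
# Minimal sketch: discounted return G = sum over t of gamma**t * r_t.
gamma = 0.9                # discount factor (hypothetical choice, 0 < gamma < 1)
rewards = [0, 0, 1, 0, 5]  # hypothetical rewards collected over 5 time steps

discounted_return = sum(gamma ** t * r for t, r in enumerate(rewards))
print(discounted_return)   # 0.9**2 * 1 + 0.9**4 * 5 = 0.81 + 3.2805 = 4.0905
```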
[Diagram: the Agent takes Actions in the World and receives Observations & Rewards back from it.]