CS 231N Midterm Review

Midterm Logistics
● Multiple Choice
● True/False
● Short Answer Questions
● More emphasis on topics covered earlier in the course than on those discussed more recently

Focus is more on high-level understanding of concepts


How many layers are in a ResNet?
What problem does ResNet solve, and how?
More Logistics…
○ The midterm exam will take place from 12:00 to 1:20pm PT on Tuesday, May 16, in person at NVIDIA Auditorium, 420-040, and Hewlett 200.
○ If your last name begins with a letter between A and G (inclusive), you will take the exam at NVIDIA Auditorium.
○ If your last name begins with a letter between H and M (inclusive), you will take the exam at 420-040.
○ If your last name begins with a letter between N and Z (inclusive), you will take the exam at Hewlett 200.
○ Closed-book, no internet. One double-sided cheat sheet (written or typed) is allowed.
○ The exam may cover material from Assignments 1 and 2 and all lectures up to and including Lecture 12 (Visualizing and Understanding).
○ You will have 80 minutes to complete the exam. The exam will start promptly at 12:00pm; if you arrive late, you will not be given additional time.
Midterm Review
Plan
1. Transformers & Attention
2. RNNs
3. Backpropagation
4. Optimizers
5. CNNs
6. Normalization Layers
7. Regularization Techniques
General Attention
Self-Attention
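
For reference, a minimal NumPy sketch of scaled dot-product self-attention; the names X, Wq, Wk, Wv and the single-head, unmasked setup are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X:          (T, D) input token embeddings
    Wq, Wk, Wv: (D, D_h) projection matrices (illustrative shapes)
    """
    Q = X @ Wq                                    # queries (T, D_h)
    K = X @ Wk                                    # keys    (T, D_h)
    V = X @ Wv                                    # values  (T, D_h)
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # (T, T), scaled by sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over keys
    return attn @ V                               # (T, D_h) weighted sum of values
```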
Transformer Encoder
Transformer Decoder
2 Layer Transformer Example

Credit: Medium article by Ketan Doshi


RNNs
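
As a refresher, a minimal sketch of the vanilla RNN recurrence h_t = tanh(x_t W_xh + h_{t-1} W_hh + b); shapes and names here are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wxh, Whh, b):
    """One step of a vanilla RNN: h_t = tanh(x_t W_xh + h_{t-1} W_hh + b)."""
    return np.tanh(x_t @ Wxh + h_prev @ Whh + b)

def rnn_forward(X, h0, Wxh, Whh, b):
    """Run the recurrence over a sequence X of shape (T, D); returns all hidden states."""
    h, hs = h0, []
    for x_t in X:
        h = rnn_step(x_t, h, Wxh, Whh, b)
        hs.append(h)
    return np.stack(hs)   # (T, H)
```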
Backpropagation

[Figure not reproduced: a worked example on a small computational graph with nodes a, m, n, p and an intermediate node z feeding the loss L, built up over several slides with the local and upstream gradient values filled in at each node.]
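
Since the graph from the slides is not reproduced above, here is an analogous toy example of backpropagation by hand: multiply each node's local gradient by its upstream gradient, working backward from the loss.

```python
# Toy computational graph: L = (a * m + n) * p
# (an illustrative stand-in for the graph drawn on the slides)
a, m, n, p = 2.0, 3.0, 4.0, 5.0

# Forward pass
z = a * m            # intermediate node
q = z + n
L = q * p            # L = 50.0

# Backward pass: local gradient times upstream gradient at each node
dL_dL = 1.0
dL_dq = p * dL_dL    # d(q*p)/dq = p
dL_dp = q * dL_dL    # d(q*p)/dp = q
dL_dz = 1.0 * dL_dq  # d(z+n)/dz = 1 (add gate passes gradients through)
dL_dn = 1.0 * dL_dq  # d(z+n)/dn = 1
dL_da = m * dL_dz    # d(a*m)/da = m (multiply gate swaps inputs)
dL_dm = a * dL_dz    # d(a*m)/dm = a

print(dL_da, dL_dm, dL_dn, dL_dp)   # 15.0 10.0 5.0 10.0
```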
Optimizers

Optimizer      | Per-Parameter Learning Rate | Momentum
-------------- | --------------------------- | --------
SGD            | No                          | No
SGD + Momentum | No                          | Yes
AdaGrad        | Yes                         | No
RMSProp        | Yes                         | No
Adam           | Yes                         | Yes


SGD
AdaGrad - Per-parameter learning rate scaling (no momentum)
RMSProp - AdaGrad with a decaying average of squared gradients, so the effective learning rate does not keep shrinking
Adam - Momentum + RMSProp (with bias correction)
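
A hedged NumPy sketch of the update rules compared above; the names (w, dw, cache, m, v) are illustrative, state variables are assumed to start as zeros of w's shape, and the momentum formulation shown is one of several equivalent conventions.

```python
import numpy as np

def sgd(w, dw, lr):
    return w - lr * dw

def sgd_momentum(w, dw, v, lr, rho=0.9):
    v = rho * v + dw                                # accumulate a velocity vector
    return w - lr * v, v

def adagrad(w, dw, cache, lr, eps=1e-8):
    cache += dw ** 2                                # sum of squared gradients (never decays)
    return w - lr * dw / (np.sqrt(cache) + eps), cache

def rmsprop(w, dw, cache, lr, decay=0.99, eps=1e-8):
    cache = decay * cache + (1 - decay) * dw ** 2   # leaky running average of squared gradients
    return w - lr * dw / (np.sqrt(cache) + eps), cache

def adam(w, dw, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * dw                # first moment (momentum)
    v = beta2 * v + (1 - beta2) * dw ** 2           # second moment (RMSProp-style)
    m_hat = m / (1 - beta1 ** t)                    # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```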
Visualization

Video credit: Lily Jiang
CNNs

● Each filter has the same number of channels as the input image.
● Each filter outputs a single-channel feature map.
● Therefore, the number of channels in the output is equal to the number of filters.
CNNs

● The learnable parameters are the weights and biases.
● Each filter has one bias, shared across all positions of its output feature map.
● For an input image with C channels and N filters, each of size F×F, the layer has N*(C*F*F + 1) learnable parameters (see the sketch below).
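
A tiny illustrative helper (not from the assignments) as a sanity check on the parameter-count formula above.

```python
def conv_param_count(C, N, F):
    """Learnable parameters in a conv layer: N filters of shape (C, F, F),
    each with one bias term."""
    return N * (C * F * F + 1)

# e.g. 32 filters of size 3x3 over an RGB input:
print(conv_param_count(C=3, N=32, F=3))   # 896
```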
CNNs

● Input shape: (C, H, W)
● User specifies: N filters, each of shape F×F, padding P, and stride S
● Output shape: (N, H', W'), where
  W' = (W − F + 2P)/S + 1
  H' = (H − F + 2P)/S + 1
  (see the sketch below)
● Note: the image on the left uses stride = 2.
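
A small illustrative helper for the output-shape formulas above, assuming (W − F + 2P) and (H − F + 2P) divide evenly by S.

```python
def conv_output_shape(C, H, W, N, F, P, S):
    """Output shape (N, H', W') of a conv layer, per the formulas above."""
    H_out = (H - F + 2 * P) // S + 1
    W_out = (W - F + 2 * P) // S + 1
    return (N, H_out, W_out)

# e.g. a 3x32x32 input with 10 filters of size 5x5, padding 2, stride 1:
print(conv_output_shape(C=3, H=32, W=32, N=10, F=5, P=2, S=1))   # (10, 32, 32)
```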


BatchNorm vs LayerNorm
BatchNorm: Normalize across all data points in the batch

LayerNorm: Normalize across the features of each data point

Input shape: (N, D)

BatchNorm: normalizes across N
LayerNorm: normalizes across D


BatchNorm vs LayerNorm
BatchNorm: Normalize across all data points in the batch

LayerNorm: Normalize across the features of each data point

Input shape: (N, C, H, W)

BatchNorm: normalizes across N*H*W (calculates mean and var for each channel, across all images in the batch)

LayerNorm: normalizes across C*H*W (calculates mean and var for each image, across all pixels in all channels)
BatchNorm vs LayerNorm
Input shape: (N, C, H, W)

BatchNorm: normalizes across N*H*W (calculates mean and var for each channel, across all images in the batch)

LayerNorm: normalizes across C*H*W (calculates mean and var for each image, across all channels)

What is the size of their learnable parameters?


BatchNorm vs LayerNorm
Input shape: (N, C, H, W)

BatchNorm: normalizes across N*H*W (reshape to (N*H*W, C)); calculates mean and var for each channel, across all images in the batch

LayerNorm: normalizes across C*H*W (reshape to (N, C*H*W)); calculates mean and var for each image, across all channels

What is the size of their learnable parameters?

BatchNorm: C (gamma and beta each have shape (C,))
LayerNorm: C*H*W (gamma and beta each have shape (C, H, W))
BatchNorm vs LayerNorm
One important difference:

BatchNorm calculates the mean and variance across the batch and stores running averages, which are used at test time.

LayerNorm behaves identically at train and test time (see the sketch below).
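
A rough NumPy sketch of the two layers over an (N, C, H, W) input, to make the axes, parameter shapes, and running statistics concrete; this is an illustrative simplification, not the assignment implementation.

```python
import numpy as np

def batchnorm_train(x, gamma, beta, running_mean, running_var,
                    momentum=0.9, eps=1e-5):
    """BatchNorm over (N, C, H, W): statistics per channel, across N*H*W.
    gamma and beta each have shape (C,)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)          # (1, C, 1, 1)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    out = gamma[None, :, None, None] * x_hat + beta[None, :, None, None]
    # Running averages are stored at train time and reused at test time.
    running_mean = momentum * running_mean + (1 - momentum) * mean.squeeze()
    running_var = momentum * running_var + (1 - momentum) * var.squeeze()
    return out, running_mean, running_var

def layernorm(x, gamma, beta, eps=1e-5):
    """LayerNorm over (N, C, H, W): statistics per image, across C*H*W.
    gamma and beta each have shape (C, H, W); identical at train and test time."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)          # (N, 1, 1, 1)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma[None] * x_hat + beta[None]
```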


Regularization / Training a Neural Network
● L1 and L2 regularization penalize the magnitude of the weights (L2 encourages small weights; L1 also encourages sparsity).
● Dropout randomly zeroes activations during training, which discourages co-adaptation and adds redundancy to the learned representation (see the sketch below).
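
A brief illustrative sketch of an L2 penalty and inverted dropout; the names reg and p are assumptions, not taken from the slides.

```python
import numpy as np

def l2_regularized_loss(data_loss, W, reg=1e-4):
    """Add an L2 penalty on the weights: L = data_loss + reg * sum(W**2)."""
    return data_loss + reg * np.sum(W ** 2)

def dropout_forward(x, p=0.5, train=True):
    """Inverted dropout: at train time, zero each activation with probability
    1 - p and rescale by 1/p, so no extra scaling is needed at test time."""
    if not train:
        return x
    mask = (np.random.rand(*x.shape) < p) / p
    return x * mask
```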
