Introduction
• Born 1985 in Sofia, Bulgaria
• BA in 2008 from Pomona College, CA (Computer Science & Media Studies)
https://towardsdatascience.com/an-intro-to-deep-learning-for-face-recognition-aa8dfbbc51fb
Example deep learning tasks
• Image captioning
http://openaccess.thecvf.com/content_CVPR_2019/papers/Guo_MSCap_Multi-Style_Image_Captioning_With_Unpaired_Stylized_Text_CVPR_2019_paper.pdf
http://openaccess.thecvf.com/content_CVPR_2019/papers/Kim_Dense_Relational_Captioning_Triple-Stream_Networks_for_Relationship-
Example deep learning tasks
• Image generation
Choi et al., “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, CVPR 2018
Example deep learning tasks
• Fake news generation
https://www.youtube.com/watch?v=-QvIX3cY4lc
Example deep learning tasks
• Machine translation
Slide credit: Andrej Karpathy
Example deep learning tasks
• Fake news generation and detection
https://grover.allenai.org/detect
Example deep learning tasks
• Question answering
https://www.youtube.com/watch?v=wE3fmFTtP9g
Example deep learning tasks
• Artificial general intelligence???
https://www.dailymail.co.uk/sciencetech/article-5287647/Humans-robot-second-self.html
Example deep learning tasks
• Why are these tasks challenging?
• What are some problems from everyday life
that can be helped by deep learning?
• What are some ethical concerns about using
deep learning?
DL in a Nutshell
• Deep learning is a specific group of algorithms
falling in the broader realm of machine learning
• All ML/DL algorithms roughly match schema:
– Learn a mapping from input to output f: x → y
– x: image, text, etc.
– y: {cat, notcat}, {1, 1.5, 2, …}, etc.
– f: this is where the magic happens
ML/DL in a Nutshell
y' = f(x), where x is the input, f is the function, and y' is the output (prediction)
[Figure: two example images compared — is the label 1 or 0?]
f(image of an apple) = “apple”
f(image of a tomato) = “tomato”
f(image of a cow) = “cow”
Slide credit: L. Lazebnik
ML/DL in a Nutshell
• Example:
– x = pixels of the image (concatenated to form
a vector)
– y = integer (1 = apple, 2 = tomato, etc.)
– y’ = f(x) = wᵀx
• w is a vector of the same size as x
• One weight per dimension of x (i.e. one weight per pixel)
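A minimal NumPy sketch of this linear model (the image size and weights are hypothetical, not from the slides):

import numpy as np

# hypothetical 4x4 grayscale image, random weights
image = np.random.rand(4, 4)
x = image.reshape(-1)        # concatenate pixels into a vector (here: 16 values)
w = np.random.rand(x.size)   # one weight per pixel, same size as x
y_pred = w @ x               # y' = w^T x, a single score
print(y_pred)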
DL in a Nutshell
• Input → network → outputs
• Input X is raw (e.g. raw image,
one-hot representation of text)
• Network extracts features: abstraction of input
• Output is the labels Y
• All parameters of the network trained by
checking how well predicted/true Y agree, using
labels in the training set
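A toy sketch of that training signal (squared error and a single gradient step on one linear layer; all names and sizes hypothetical):

import numpy as np

x = np.random.rand(16)            # input features (e.g. flattened image)
y_true = 1.0                      # label from the training set
w = np.zeros(16)                  # trainable parameters of the "network"

y_pred = w @ x                    # predicted Y
loss = (y_pred - y_true) ** 2     # squared error: how well predicted/true Y agree
grad = 2 * (y_pred - y_true) * x  # gradient of the loss w.r.t. w
w -= 0.1 * grad                   # nudge parameters to improve agreement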
Validation strategies
• Ultimately, for our application, what do we want?
– High accuracy on training data?
– No, high accuracy on unseen/new/test data!
– Why is this tricky?
• Training data
– Features (x) and labels (y) used to learn mapping f
• Test data
– Features used to make a prediction
– Labels only used to see how well we’ve learned f!!!
• Validation data
– Held-out set of the training data
– Can use both features and labels to tune model hyperparameters
– Hyperparameters are “knobs” of the algorithm tuned by the designer: number of
iterations for learning, learning rate, etc.
– We train multiple models (one per hyperparameter setting) and choose the one that performs best on the validation set
Validation strategies
Idea #1: Choose hyperparameters that work best on the data.
[Your Dataset: all data used for training]
BAD: Overfitting; e.g. in K-nearest neighbors, K = 1 always works perfectly on training data.
Idea #2: Split data into train and test; choose hyperparameters that work best on test data.
[train | test]
BAD: No idea how algorithm will perform on new data; cheating.
Idea #3: Split data into train, val, and test; choose hyperparameters on val and evaluate on test. Better!
[train | validation | test]
Useful for small datasets, but not used too frequently in deep learning
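A minimal NumPy sketch of Idea #3 (hypothetical 60/20/20 split, a made-up ridge-regression model, and a made-up hyperparameter grid):

import numpy as np

X, y = np.random.rand(100, 5), np.random.rand(100)   # toy dataset
idx = np.random.permutation(len(X))
train, val, test = idx[:60], idx[60:80], idx[80:]    # 60/20/20 split

best_lam, best_err = None, np.inf
for lam in [0.01, 0.1, 1.0]:                         # hyperparameter grid
    # ridge regression: w = (X^T X + lam*I)^-1 X^T y, fit on train only
    A = X[train].T @ X[train] + lam * np.eye(5)
    w = np.linalg.solve(A, X[train].T @ y[train])
    err = np.mean((X[val] @ w - y[val]) ** 2)        # choose on validation
    if err < best_err:
        best_lam, best_err = lam, err

# evaluate the chosen model once on the held-out test set
A = X[train].T @ X[train] + best_lam * np.eye(5)
w = np.linalg.solve(A, X[train].T @ y[train])
test_err = np.mean((X[test] @ w - y[test]) ** 2)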
• Unsupervised learning
– Training data does not include desired outputs
• Reinforcement learning
– Rewards from sequence of actions
• Supervised learning (regression): learn f: x → y where the output y is continuous
Unsupervised Learning
[Figure: unlabeled example images shown as grids of raw pixel-intensity values]
Red dots = training data (all that we see before we ship off our model!)
Green curve = true underlying model; blue curve = our predicted model/fit
[Figures: fits illustrating underfitting vs. overfitting, with the corresponding training, validation, and test error curves]
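A small sketch of that picture (toy sine data, low- vs. high-degree polynomial fits; all details hypothetical):

import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 15))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(15)   # "red dots"

for degree in [1, 3, 9]:               # underfit, reasonable fit, overfit
    coeffs = np.polyfit(x, y, degree)  # "blue curve": our fitted model
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(degree, train_err)           # training error only shrinks with degree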
Vector
• A column vector v ∈ ℝⁿˣ¹ stacks its entries vertically: v = (v₁, v₂, …, vₙ)ᵀ
Matrix
• A matrix A ∈ ℝᵐˣⁿ is an array of numbers with size m by n, i.e. m rows and n columns.
Matrix Operations
• Addition
• Scaling
• X * Y = matrix product
• X .* Y = element-wise product
Inner Product
• Multiply corresponding entries of two vectors and add up the result
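In NumPy (the language of the course's tutorial) these operations look like the sketch below; the matrices and vectors are made up:

import numpy as np

X = np.array([[1., 2.], [3., 4.]])
Y = np.array([[5., 6.], [7., 8.]])

print(X + Y)    # addition
print(2 * X)    # scaling
print(X @ Y)    # matrix product        (MATLAB: X * Y)
print(X * Y)    # element-wise product  (MATLAB: X .* Y)

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])
print(a @ b)    # inner product: 1*4 + 2*5 + 3*6 = 32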
Matrix Multiplication
• Example:
– Each entry of the matrix product is made by taking the dot product of the corresponding row in the left matrix with the corresponding column in the right one.
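A quick check of that rule in NumPy (a sketch; any small matrices work):

import numpy as np

A = np.arange(6.).reshape(2, 3)    # left matrix, 2x3
B = np.arange(12.).reshape(3, 4)   # right matrix, 3x4
C = A @ B                          # product, 2x4

# entry (i, j) = dot product of row i of A with column j of B
i, j = 1, 2
assert np.isclose(C[i, j], A[i, :] @ B[:, j])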
Matrix Operation Properties
• Matrix addition is commutative and associative
– A + B = B + A
– A + (B + C) = (A + B) + C
• Matrix multiplication is associative and distributive but not commutative
– A(BC) = (AB)C
– A(B + C) = AB + AC
– AB ≠ BA in general
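These properties are easy to sanity-check numerically (a sketch with random matrices):

import numpy as np

A, B, C = (np.random.rand(3, 3) for _ in range(3))

assert np.allclose(A + B, B + A)                # commutative addition
assert np.allclose(A @ (B @ C), (A @ B) @ C)    # associative multiplication
assert np.allclose(A @ (B + C), A @ B + A @ C)  # distributive
print(np.allclose(A @ B, B @ A))                # almost always False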
Matrix Operations
• Transpose – flip matrix, so row 1 becomes column 1
• A useful identity: (AB)ᵀ = BᵀAᵀ
Inverse
• Given a matrix A, its inverse A⁻¹ is a matrix such that AA⁻¹ = A⁻¹A = I
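A sketch with NumPy (the 2×2 matrix is a hypothetical example):

import numpy as np

A = np.array([[2., 0.], [1., 1.]])
A_inv = np.linalg.inv(A)

assert np.allclose(A @ A_inv, np.eye(2))   # A A^-1 = I
assert np.allclose(A_inv @ A, np.eye(2))   # A^-1 A = I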
Special Matrices
• Identity matrix I
– Square matrix, 1’s along diagonal, 0’s elsewhere
– I ∙ [another matrix] = [that matrix]
• Diagonal matrix
– Square matrix with numbers along diagonal, 0’s elsewhere
– A diagonal ∙ [another matrix] scales the rows of that matrix
Special Matrices
• Symmetric matrix
– A matrix A with Aᵀ = A
Norms
• L1 norm: ‖x‖₁ = Σᵢ |xᵢ|
• L2 norm: ‖x‖₂ = √(Σᵢ xᵢ²)
Example (MATLAB), solving a linear system Ax = B:
>> x = A\B
x =
1.0000
-0.5000
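The NumPy equivalent of MATLAB’s backslash is np.linalg.solve (a sketch; the A and B below are made up):

import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([0., 1.])
x = np.linalg.solve(A, B)     # solves A x = B, like MATLAB's A\B

print(np.linalg.norm(x, 1))   # L1 norm: sum of absolute values
print(np.linalg.norm(x))      # L2 norm (default): Euclidean length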
Matrix Rank
• Column rank = number of linearly independent columns; row rank = number of linearly independent rows
• Column rank always equals row rank, so we simply call it the rank of the matrix
Linear independence
• Suppose we have a set of vectors v1, …, vn
• If we can express v1 as a linear combination of the
other vectors v2…vn, then v1 is linearly dependent on
the other vectors.
– The direction v1 can be expressed as a combination of the
directions v2…vn. (E.g. v1 = .7 v2 -.5 v4)
• If no vector is linearly dependent on the rest of the set,
the set is linearly independent.
– Common case: a set of vectors v1, …, vn is always linearly
independent if each vector is perpendicular to every other
vector (and non-zero)
Linear independence
[Figure: left, a linearly independent set of vectors; right, a set that is not linearly independent]
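Independence can be checked numerically via the rank (a sketch; the vectors are hypothetical):

import numpy as np

V = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [1., 1., 0.]])       # rows are the vectors v1, v2, v3

# rank < number of vectors => some vector depends on the others (v3 = v1 + v2)
print(np.linalg.matrix_rank(V))    # 2, so the set is not linearly independent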
Singular Value Decomposition (SVD)
• There are several computer algorithms that
can “factor” a matrix, representing it as the
product of some other matrices
• The most useful of these is the Singular Value
Decomposition
• Represents any matrix A as a product of three matrices: UΣVᵀ
Singular Value Decomposition (SVD)
UΣVᵀ = A
Singular Value Decomposition (SVD)
• In general, if A is m × n, then U will be m × m, Σ will be m × n, and Vᵀ will be n × n.
Singular Value Decomposition (SVD)
• U and V are always rotation matrices.
– Geometric rotation may not be an applicable
concept, depending on the matrix. So we call
them “unitary” matrices – each column is a unit
vector.
• Σ is a diagonal matrix
– The number of nonzero entries = rank of A
– The algorithm always sorts the entries high to low
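A sketch of these facts with NumPy’s SVD (random matrix; shapes as stated above):

import numpy as np

A = np.random.rand(5, 3)             # m x n, here 5 x 3
U, s, Vt = np.linalg.svd(A)          # A = U @ diag(s) @ Vt

print(U.shape, s.shape, Vt.shape)    # (5, 5), (3,), (3, 3)
print(np.all(s[:-1] >= s[1:]))       # singular values sorted high to low
print(np.sum(s > 1e-10))             # number of nonzero entries = rank of A

# reconstruct A: embed s in an m x n diagonal matrix
S = np.zeros((5, 3))
np.fill_diagonal(S, s)
assert np.allclose(U @ S @ Vt, A)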
Singular Value Decomposition (SVD)
M = UΣVᵀ
The derivative of the rate function is the
acceleration function.
Image: Wikipedia
Derivative = rate of change
• Linear function y = mx + b
• Slope m is the derivative: dy/dx = m
Image: Wikipedia
Ways to Write the Derivative
Given the function f(x), we can write its
derivative in the following ways:
- f′(x)
- (d/dx) f(x)
The derivative of x is commonly written dx.
Example: (d/dt)(t⁴) = 4t³
More Formulas
- The derivative of u to a constant power: d(uⁿ) = n·uⁿ⁻¹·du
- The derivative of e: d(eᵘ) = eᵘ·du
- The derivative of log: d(log u) = (1/u)·du
More Examples
- The derivative of u to a constant power: (d/dx) 3x³ = 9x²
- The derivative of e: (d/dy) e⁴ʸ = 4e⁴ʸ
- The derivative of log: (d/dx) 3·log(x) = 3/x
Product and Quotient
The product rule and quotient rule are commonly used in differentiation.
- Product rule: (d/du)[f(u)·g(u)] = f(u)·g′(u) + g(u)·f′(u)
- Quotient rule: (d/du)[f(u)/g(u)] = [g(u)·f′(u) − f(u)·g′(u)] / (g(u))²
Chain Rule
The chain rule allows you to combine any of the differentiation rules we have already covered.
(d/du) f(g(u)) = f′(g(u)) · g′(u)
Examples:
g(y) = 4y³ + 2y → g′(y) = 12y² + 2
p(x) = log(x²)/x → p′(x) = (2 − log(x²))/x²
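Those examples are easy to verify with SymPy (an assumption — the course doesn’t mention SymPy, but it’s handy for checking hand derivatives):

import sympy as sp

y, x = sp.symbols('y x')

g = 4 * y**3 + 2 * y
print(sp.diff(g, y))               # 12*y**2 + 2

p = sp.log(x**2) / x
print(sp.simplify(sp.diff(p, x)))  # (2 - log(x**2))/x**2, possibly rearranged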
https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html
• Do first reading
• Go through NumPy tutorial
• Start thinking about your project!