This document provides an overview and summary of Lecture 4 from the CS224d Deep NLP course. The lecture covers window classification and neural networks. Specifically, it discusses updating word vectors for classification tasks using a window around the target word. It derives the cross entropy error and gradient for a basic softmax classifier on word windows. Tips are provided for calculating the gradient to update the word vectors in the window. Finally, it notes that implementing the softmax and gradients using matrix operations can improve efficiency over for loops.


CS224d

Deep NLP

Lecture 4:
Word Window Classification
and Neural Networks

Richard Socher
Overview Today:
• General classification background

• Updating word vectors for classification

• Window classification & cross entropy error derivation tips

• A single layer neural network!

• (Max-Margin loss and backprop)



Classification setup and notation
• Generally we have a training dataset consisting of samples $\{x_i, y_i\}_{i=1}^N$

• $x_i$: inputs, e.g. words (indices or vectors!), context windows, sentences, documents, etc.

• $y_i$: labels we try to predict,
  • e.g. other words
  • a class: sentiment, named entities, buy/sell decision
  • later: multi-word sequences



Classification intuition
• Training data: $\{x_i, y_i\}_{i=1}^N$

• Simple illustration case:
  • Fixed 2d word vectors to classify
  • Using logistic regression
  • → linear decision boundary

• General ML: assume x is fixed and only train the logistic regression weights W, i.e. only modify the decision boundary

  Visualizations with ConvNetJS by Karpathy:
  http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html



Classification notation
• Cross entropy loss function over the dataset $\{x_i, y_i\}_{i=1}^N$

• Where for each data pair $(x_i, y_i)$:

• We can write f in matrix notation and index elements of it based on class:
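The formulas themselves are images in the original slides; a reconstruction of the standard softmax cross-entropy they refer to, assuming $f = Wx$ with one row of W per class:

$$J(\theta) = \frac{1}{N}\sum_{i=1}^{N} -\log\frac{e^{f_{y_i}}}{\sum_{c=1}^{C} e^{f_c}}, \qquad f_c = W_{c\cdot}\, x_i$$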



Classification: Regularization!
• The really full loss function over any dataset includes regularization over all parameters $\theta$:
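Reconstructed in standard form (the slide's formula is an image), assuming an L2 penalty as the regularizer:

$$J(\theta) = \frac{1}{N}\sum_{i=1}^{N} -\log\frac{e^{f_{y_i}}}{\sum_{c} e^{f_c}} \;+\; \lambda\sum_{k}\theta_k^2$$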

• Regularization will prevent overfitting when we have a lot of features (or later a very powerful/deep model)
• x-axis: more powerful model or more training iterations
• Blue: training error, red: test error



Details: General ML optimization
• For general machine learning, $\theta$ usually only consists of the columns of W:

• So we only update the decision boundary (visualizations with ConvNetJS by Karpathy)



Classification difference with word vectors
• Common in deep learning:
  • Learn both W and word vectors x
  • The parameter vector then also contains every word vector: very large! Overfitting danger!



Losing generalization by re-training word vectors
• Setting: Training logistic regression for movie review sentiment, and in the training data we have the words
  • "TV" and "telly"
• In the testing data we have
  • "television"
• Originally they were all similar (from pre-training word vectors)

• What happens when we train the word vectors?

  [Figure: 2D plot in which "telly", "TV", and "television" start out close together]



Losing generalization by re-training word vectors
• What happens when we train the word vectors?
  • Those that are in the training data move around
  • Words from pre-training that do NOT appear in training stay where they were

• Example:
  • In training data: "TV" and "telly"
  • In testing data only: "television"

  [Figure: "TV" and "telly" have moved, while "television" stays where it was :( ]
Losing generalization by re-training word vectors
• Take home message:

  If you only have a small training data set, don't train the word vectors.

  If you have a very large dataset, it may work better to train word vectors to the task.

  [Figure: same 2D plot of "telly", "TV", and "television"]
Side note on word vector notation
• The word vector matrix L is also called the lookup table
• Word vectors = word embeddings = word representations (mostly)
• Mostly from methods like word2vec or GloVe

  L is a d × |V| matrix whose columns are the word vectors:
  L = [ x_aardvark  x_a  …  x_meta  …  x_zebra ]

• These are the word features $x_{word}$ from now on

• Conceptually you get a word's vector by left-multiplying a one-hot vector e by L:
  $x = Le$, where $L \in \mathbb{R}^{d \times |V|}$ and $e \in \mathbb{R}^{|V| \times 1}$
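A minimal numpy sketch of this lookup (the names and sizes here are illustrative, not from the slides); in practice the one-hot multiply is replaced by simply indexing a column:

```python
import numpy as np

d, V = 50, 10000                     # embedding size and vocabulary size (example values)
L = 0.01 * np.random.randn(d, V)     # word vector matrix / lookup table

word_index = 42                      # index of some word in the vocabulary

# Conceptual version: left-multiply a one-hot column vector e by L
e = np.zeros((V, 1))
e[word_index] = 1.0
x_slow = L @ e                       # shape (d, 1)

# What you actually implement: just read off the column
x_fast = L[:, word_index:word_index + 1]

assert np.allclose(x_slow, x_fast)
```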

Window classification
• Classifying single words is rarely done.

• Interesting problems like ambiguity arise in context!

• Example: auto-antonyms:
• "To sanction" can mean "to permit" or "to punish.”
• "To seed" can mean "to place seeds" or "to remove seeds."

• Example: ambiguous named entities:


  • Paris → Paris, France vs. Paris Hilton
  • Hathaway → Berkshire Hathaway vs. Anne Hathaway



Window classification
• Idea: classify a word in its context window of neighboring words.

• For example, named entity recognition into 4 classes:
  • Person, location, organization, none

• Many possibilities exist for classifying one word in context, e.g. averaging all the words in a window, but that loses position information



Window classification
• Train a softmax classifier by assigning a label to a center word and concatenating all word vectors surrounding it

• Example: Classify "Paris" in the context of this sentence with window length 2:

  … museums in Paris are amazing … .

  $x_{window} = [\, x_{museums}\;\; x_{in}\;\; x_{Paris}\;\; x_{are}\;\; x_{amazing}\,]^T$

• Resulting vector $x_{window} = x \in \mathbb{R}^{5d}$, a column vector!
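A minimal numpy sketch of building this window vector (the toy vocabulary and sizes are illustrative, not from the slides):

```python
import numpy as np

d = 50
vocab = {"museums": 0, "in": 1, "Paris": 2, "are": 3, "amazing": 4}   # toy vocabulary
L = 0.01 * np.random.randn(d, len(vocab))                             # word vector matrix

window = ["museums", "in", "Paris", "are", "amazing"]

# Concatenate the five word vectors into one column vector x_window of size 5d
x_window = np.concatenate([L[:, vocab[w]] for w in window]).reshape(-1, 1)
print(x_window.shape)   # (250, 1), i.e. (5d, 1)
```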



Simplest window classifier: Softmax
• With $x = x_{window}$ we can use the same softmax classifier as before to obtain the predicted model output probability

• With cross entropy error as before: same
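The probability itself is shown as an image on the slide; the standard softmax it refers to is:

$$\hat{y}_y = p(y \mid x) = \frac{\exp(W_{y\cdot}\, x)}{\sum_{c=1}^{C}\exp(W_{c\cdot}\, x)}$$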

• But how do you update the word vectors?



Updating concatenated word vectors
• Short answer: Just take derivatives as before

• Long answer: Let’s go over the steps together (you’ll have to fill
in the details in PSet 1!)
• Define:
  • $\hat{y}$: softmax probability output vector (see previous slide)
  • $t$: target probability distribution (all 0's except at the ground-truth index of class y, where it's 1)
  • $f = Wx \in \mathbb{R}^{C}$ and $f_c$ = c'th element of the f vector

• Hard, the first time, hence some tips now :)



Updating concatenated word vectors
• Tip 1: Carefully define your variables and
keep track of their dimensionality!

• Tip 2: Know thy chain rule and don’t forget which variables
depend on what:

• Tip 3: For the softmax part of the derivative: first take the derivative wrt $f_c$ when c = y (the correct class), then take the derivative wrt $f_c$ when c ≠ y (all the incorrect classes)



Updating concatenated word vectors
• Tip 4: When you take the derivative wrt one element of f, try to see if you can create a gradient in the end that includes all partial derivatives:

• Tip 5: To not go insane later (and for an efficient implementation) → write the results in terms of vector operations and define single index-able vectors:
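The vector these tips build toward is not in the extracted text; it is the standard softmax error signal:

$$\frac{\partial J}{\partial f} = \hat{y} - t \;=:\; \delta$$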



Updating concatenated word vectors
• Tip 6: When you start with the chain rule, first use explicit sums and look at partial derivatives of e.g. $x_i$ or $W_{ij}$

• Tip 7: To clean it up for even more complex functions later: know the dimensionality of variables & simplify into matrix notation

• Tip 8: Write this out in full sums if it's not clear!



Updating concatenated word vectors
• What is the dimensionality of the window vector gradient?

• x is the entire window, 5 d-dimensional word vectors, so the derivative wrt x has to have the same dimensionality:
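Reconstructed from the standard derivation (the slide shows this as an image):

$$\nabla_x J = \frac{\partial J}{\partial x} = W^\top(\hat{y} - t) = W^\top \delta \;\in\; \mathbb{R}^{5d}$$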



Updating concatenated word vectors
• The gradient that arrives at and updates the word vectors can simply be split up for each word vector:
• Let $\delta_{window} = \nabla_x J \in \mathbb{R}^{5d}$ be the full window gradient
• With $x_{window} = [\, x_{museums}\;\; x_{in}\;\; x_{Paris}\;\; x_{are}\;\; x_{amazing}\,]$

• We have
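The split (shown as an image on the slide), written out under the notation above:

$$\delta_{window} = \begin{bmatrix} \nabla_{x_{museums}} J \\ \nabla_{x_{in}} J \\ \nabla_{x_{Paris}} J \\ \nabla_{x_{are}} J \\ \nabla_{x_{amazing}} J \end{bmatrix}$$

i.e. the first d entries of $\delta_{window}$ update $x_{museums}$, the next d update $x_{in}$, and so on.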



Updating concatenated word vectors
• This will push word vectors into areas such that they will be helpful in determining named entities.

• For example, the model can learn that seeing $x_{in}$ as the word just before the center word is indicative of the center word being a location



What’s missing for training the window model?
• The gradient of J wrt the softmax weights W!

• Similar steps: write down the partial wrt $W_{ij}$ first!
• Then we have the full gradient $\nabla_W J$
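Reconstructed standard result for the softmax layer with $f = Wx$ (the slide's formula is an image):

$$\frac{\partial J}{\partial W_{ij}} = (\hat{y}_i - t_i)\, x_j, \qquad \nabla_W J = (\hat{y} - t)\, x^\top = \delta\, x^\top$$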



A note on matrix implementations
• There are two expensive operations in the softmax:
  • The matrix multiplication and the exp
• A for loop is never as efficient as a single larger matrix multiplication!

• Example code →
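The example code on the slide is an image; a minimal numpy sketch of the comparison it makes (names and sizes are illustrative):

```python
import numpy as np
import time

C, d, N = 5, 250, 1000                 # classes, window-vector dim, number of windows
W = np.random.randn(C, d)
wordvectors_list = [np.random.randn(d, 1) for _ in range(N)]
wordvectors_matrix = np.hstack(wordvectors_list)   # shape (d, N)

# Version 1: loop over the N window vectors
start = time.time()
scores_loop = [W @ x for x in wordvectors_list]
t_loop = time.time() - start

# Version 2: one large matrix multiplication
start = time.time()
scores_matrix = W @ wordvectors_matrix             # shape (C, N)
t_matrix = time.time() - start

assert np.allclose(np.hstack(scores_loop), scores_matrix)
print(f"loop: {t_loop*1e6:.1f} µs   matrix: {t_matrix*1e6:.1f} µs")
```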



A note on matrix implementations
• Looping over word vectors instead of concatenating them all into one large matrix and then multiplying the softmax weights with that matrix:

  • Loop version: 1000 loops, best of 3: 639 µs per loop
  • Single matrix multiply: 10000 loops, best of 3: 53.8 µs per loop
A note on matrix implementations

• Result of the faster method is a C × N matrix:
  • Each column is an f(x) in our notation (unnormalized class scores)

• Matrices are awesome!
• You should speed-test your code a lot too



Softmax (= logistic regression) is not very powerful

• Softmax only gives linear decision boundaries in the original space.
  • With little data that can be a good regularizer
  • With more data it is very limiting!



Softmax (= logistic regression) is not very powerful

• Softmax gives only linear decision boundaries

• → Lame when the problem is complex

• Wouldn't it be cool to get these correct?



Neural Nets for the Win!
• Neural networks can learn much more complex
functions and nonlinear decision boundaries!



From logistic regression to neural nets

Demystifying neural networks

Neural networks come with their own terminological baggage … just like SVMs

But if you understand how softmax models work, then you already understand the operation of a basic neural network neuron!

A single neuron:
• A computational unit with n (here 3) inputs and 1 output
• and parameters W, b

[Figure: inputs → activation function → output; the bias unit corresponds to the intercept term]


A neuron is essentially a binary logistic regression unit

$h_{w,b}(x) = f(w^T x + b)$

$f(z) = \dfrac{1}{1 + e^{-z}}$

b: We can have an "always on" feature, which gives a class prior, or separate it out as a bias term

w, b are the parameters of this neuron, i.e., this logistic regression model
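A minimal numpy sketch of such a neuron (the toy inputs and weights are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single neuron = binary logistic regression unit: h_{w,b}(x) = f(w^T x + b)."""
    return sigmoid(w @ x + b)

# Toy usage with n = 3 inputs
x = np.array([1.0, 2.0, -1.0])
w = np.array([0.5, -0.3, 0.8])
b = 0.1
print(neuron(x, w, b))   # a probability between 0 and 1
```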
A neural network
= running several logistic regressions at the same time
If we feed a vector of inputs through a bunch of logistic regression
functions, then we get a vector of outputs …

But we don't have to decide ahead of time what variables these logistic regressions are trying to predict!

A neural network
= running several logistic regressions at the same time
… which we can feed into another logistic regression function

It is the loss function that will direct what the intermediate hidden variables should be, so as to do a good job at predicting the targets for the next layer, etc.

A neural network
= running several logistic regressions at the same time
Before we know it, we have a multilayer neural network….

Matrix notation for a layer

We have
  $a_1 = f(W_{11} x_1 + W_{12} x_2 + W_{13} x_3 + b_1)$
  $a_2 = f(W_{21} x_1 + W_{22} x_2 + W_{23} x_3 + b_2)$
  etc.

In matrix notation:
  $z = Wx + b$
  $a = f(z)$
where f is applied element-wise:
  $f([z_1, z_2, z_3]) = [f(z_1), f(z_2), f(z_3)]$
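A minimal numpy sketch of one such layer (sizes and the sigmoid choice are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(x, W, b, f=sigmoid):
    """One layer in matrix notation: z = Wx + b, a = f(z), with f applied element-wise."""
    z = W @ x + b
    return f(z)

# Toy layer: 3 inputs -> 3 hidden units
x = np.array([1.0, -2.0, 0.5])
W = np.random.randn(3, 3)
b = np.zeros(3)
a = layer_forward(x, W, b)
print(a)          # vector of three "logistic regression" outputs
```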
Non-linearities (f): Why they’re needed
• Example: function approximation,
e.g., regression or classification
• Without non-linearities, deep neural networks
can’t do anything more than a linear
transform
• Extra layers could just be compiled down into a single linear transform: $W_1 W_2 x = Wx$
• With more layers, they can approximate more
complex functions!

A more powerful window classifier
• Revisiting

• $x_{window} = [\, x_{museums}\;\; x_{in}\;\; x_{Paris}\;\; x_{are}\;\; x_{amazing}\,]$



A Single Layer Neural Network

• A single layer is a combination of a linear layer


and a nonlinearity:

• The neural activations a can then


be used to compute some function
• For instance, a softmax probability or an
unnormalized score:
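The equations on this slide are images; in the notation used above, a standard reading of the single layer plus an unnormalized score is:

$$z = Wx + b, \qquad a = f(z), \qquad s = U^\top a$$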

Summary: Feed-forward Computation

Computing a window's score with a 3-layer neural net:

  s = score(museums in Paris are amazing)

  $x_{window} = [\, x_{museums}\;\; x_{in}\;\; x_{Paris}\;\; x_{are}\;\; x_{amazing}\,]$
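A minimal numpy sketch of this feed-forward computation, assuming the single-layer-plus-score form $s = U^\top f(Wx + b)$ from the previous slide (sizes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score_window(x_window, W, b, U):
    """Feed-forward window score: s = U^T f(W x + b)."""
    a = sigmoid(W @ x_window + b)   # hidden activations
    return float(U @ a)             # scalar unnormalized score

d, n_words, n_hidden = 50, 5, 8
x_window = np.random.randn(n_words * d)            # concatenated window vector of size 5d
W = 0.01 * np.random.randn(n_hidden, n_words * d)
b = np.zeros(n_hidden)
U = 0.01 * np.random.randn(n_hidden)

print(score_window(x_window, W, b, U))
```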

Next lecture:

Training a window-based neural network.

Taking deeper derivatives → Backprop

Then we have all the basic tools in place to learn about more complex models :)



Probably for next lecture…



Another output layer and loss function combo!

• So far: softmax and cross-entropy error (the exp is slow)

• We don't always need probabilities; often unnormalized scores are enough to classify correctly.

• Also: max-margin!

• More on that in future lectures!

Neural Net model to classify grammatical phrases

• Idea: Train a neural network to produce high scores for grammatical phrases of a specific length and low scores for ungrammatical phrases

• s = score(cat chills on a mat)
• $s_c$ = score(cat chills Menlo a mat)



Another output layer and loss function combo!

• Idea for the training objective:
  • Make the score of the true window larger and the corrupt window's score lower (until they're good enough): minimize
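The objective on the slide is an image; the standard max-margin form it refers to, with s the true window's score and $s_c$ the corrupt window's score:

$$J = \max(0,\; 1 - s + s_c)$$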

• This is continuous, so we can perform SGD


Training with Backpropagation

Assuming cost J is > 0, it is simple to see that we can compute the derivatives of s and $s_c$ wrt all the involved variables: U, W, b, x

Training with Backpropagation
• Let's consider the derivative of a single weight $W_{ij}$

• This only appears inside $a_i$

• For example: $W_{23}$ is only used to compute $a_2$

  [Figure: network with score s, output weights U, hidden units $a_1, a_2$, weight $W_{23}$, inputs $x_1, x_2, x_3$ and bias +1]
Training with Backpropagation

Derivative of weight $W_{ij}$:

  [Figure: same network diagram]
Training with Backpropagation

Derivative of a single weight $W_{ij}$: it factors into a local error signal times a local input signal, where for logistic f, $f'(z) = f(z)\,(1 - f(z))$

  [Figure: same network diagram]
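The derivation itself is shown as images; assuming the score $s = U^\top f(Wx + b)$ with $z = Wx + b$ and $a = f(z)$, the standard result is:

$$\frac{\partial s}{\partial W_{ij}} = \underbrace{U_i\, f'(z_i)}_{\text{local error signal } \delta_i}\;\; \underbrace{x_j}_{\text{local input signal}}$$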
Training with Backpropagation

• From a single weight $W_{ij}$ to the full W:

• We want all combinations of i = 1, 2 and j = 1, 2, 3
• Solution: the outer product, where $\delta$ is the "responsibility" coming from each activation a

  [Figure: same network diagram]
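Written out under the same assumptions as above (the slide's formula is an image):

$$\frac{\partial s}{\partial W} = \delta\, x^\top, \qquad \delta = U \circ f'(z)$$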
Training with Backpropagation

• For the biases b, we get the same error signal: $\dfrac{\partial s}{\partial b} = \delta$

  [Figure: same network diagram]
Training with Backpropagation

That's almost backpropagation: it's simply taking derivatives and using the chain rule!

Remaining trick: we can re-use derivatives computed for higher layers in computing derivatives for lower layers

Example: the last derivatives of the model, the word vectors in x

Training with Backpropagation

• Take the derivative of the score with respect to a single word vector (for simplicity a 1d vector, but the same if it was longer)
• Now, we cannot just take into consideration one $a_i$, because each $x_j$ is connected to all the neurons above; hence $x_j$ influences the overall score through all of them:
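The sum the slide refers to, written out in the same notation:

$$\frac{\partial s}{\partial x_j} = \sum_i \frac{\partial s}{\partial a_i}\,\frac{\partial a_i}{\partial x_j} = \sum_i U_i\, f'(z_i)\, W_{ij} = \sum_i \delta_i\, W_{ij}, \qquad \text{i.e. }\; \frac{\partial s}{\partial x} = W^\top \delta$$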

Here $\delta$ is the re-used part of the previous derivative.


Summary

