Dinosaurus Island -- Character-level Language Model
Luckily you have learned some deep learning and you will use it to save the day. Your assistant has collected a list of all the dinosaur names they
could find, and compiled them into this dataset (dinos.txt). (Feel free to take a look by clicking the previous link.) To create new dinosaur names,
you will build a character level language model to generate new names. Your algorithm will learn the different name patterns, and randomly
generate new names. Hopefully this algorithm will keep you and your team safe from the dinosaurs' wrath!
We will begin by loading in some functions that we have provided for you in rnn_utils. Specifically, you have access to functions such as
rnn_forward and rnn_backward which are equivalent to those you've implemented in the previous assignment.
1 - Problem Statement
The characters are a-z (26 characters) plus the "\n" (or newline character), which in this assignment plays a role similar to the <EOS> (or "End of
sentence") token we had discussed in lecture, only here it indicates the end of the dinosaur name rather than the end of a sentence. In the cell
below, we create a python dictionary (i.e., a hash table) to map each character to an index from 0-26. We also create a second python dictionary
that maps each index back to the corresponding character. This will help you figure out what index corresponds to what character in the
probability distribution output of the softmax layer. Below, char_to_ix and ix_to_char are the python dictionaries.
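One way these dictionaries can be built (a minimal sketch; the provided cell may differ slightly, and it assumes dinos.txt is in the working directory):

data = open('dinos.txt', 'r').read().lower()
chars = sorted(list(set(data)))                 # the 26 letters plus '\n'
vocab_size = len(chars)                         # 27 for this dataset
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}
print(ix_to_char)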
Initialize parameters
Run the optimization loop
Forward propagation to compute the loss function
Backward propagation to compute the gradients with respect to the loss function
Clip the gradients to avoid exploding gradients
Using the gradients, update your parameters with the gradient descent update rule.
Return the learned parameters
At each time-step, the RNN tries to predict the next character given the previous characters. The dataset $X = (x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, \ldots, x^{\langle T_x \rangle})$ is a list
of characters in the training set, while $Y = (y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \ldots, y^{\langle T_x \rangle})$ is such that at every time-step $t$, we have $y^{\langle t \rangle} = x^{\langle t+1 \rangle}$.
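For instance, here is a sketch of one training pair for a hypothetical three-character name "abc", assuming the sorted mapping above where "\n" is index 0 and "a"-"z" are indices 1-26:

name = "abc"                                               # hypothetical short name, not from dinos.txt
X = [None] + [char_to_ix[ch] for ch in name]               # [None, 1, 2, 3]; None stands for the initial zero-vector input
Y = [char_to_ix[ch] for ch in name] + [char_to_ix["\n"]]   # [1, 2, 3, 0]; each y<t> equals x<t+1>, ending with "\n"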
You will then apply these two functions to build the model.
In the exercise below, you will implement a function clip that takes in a dictionary of gradients and returns a clipped version of gradients if
needed. There are different ways to clip gradients; we will use a simple element-wise clipping procedure, in which every element of the gradient
vector is clipped to lie between some range [-N, N]. More generally, you will provide a maxValue (say 10). In this example, if any component of the
gradient vector is greater than 10, it would be set to 10; and if any component of the gradient vector is less than -10, it would be set to -10. If it is
between -10 and 10, it is left alone.
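As a quick, standalone illustration of this element-wise rule with maxValue = 10 (using numpy's np.clip, which the hint below points to):

import numpy as np
g = np.array([-15., 2., 11.])
print(np.clip(g, -10, 10))   # [-10.   2.  10.] -- values outside [-10, 10] are saturated, the rest are unchanged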
Exercise: Implement the function below to return the clipped gradients of your dictionary gradients. Your function takes in a maximum threshold
and returns the clipped versions of your gradients. You can check out this hint (https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.clip.html).
def clip(gradients, maxValue):
    '''
    Clips the gradients' values between -maxValue and maxValue.

    Arguments:
    gradients -- a dictionary containing the gradients "dWaa", "dWax", "dWya", "db", "dby"
    maxValue -- everything above this number is set to this number, and everything less than -maxValue is set to -maxValue

    Returns:
    gradients -- a dictionary with the clipped gradients.
    '''
    dWaa, dWax, dWya, db, dby = gradients['dWaa'], gradients['dWax'], gradients['dWya'], gradients['db'], gradients['dby']
    # Clip each gradient in place to the range [-maxValue, maxValue] to mitigate exploding gradients
    for gradient in [dWax, dWaa, dWya, db, dby]:
        np.clip(gradient, -maxValue, maxValue, out=gradient)
    gradients = {"dWaa": dWaa, "dWax": dWax, "dWya": dWya, "db": db, "dby": dby}
    return gradients
In [ ]: np.random.seed(3)
dWax = np.random.randn(5,3)*10
dWaa = np.random.randn(5,5)*10
dWya = np.random.randn(2,5)*10
db = np.random.randn(5,1)*10
dby = np.random.randn(2,1)*10
gradients = {"dWax": dWax, "dWaa": dWaa, "dWya": dWya, "db": db, "dby": dby}
gradients = clip(gradients, 10)
print("gradients[\"dWaa\"][1][2] =", gradients["dWaa"][1][2])
print("gradients[\"dWax\"][3][1] =", gradients["dWax"][3][1])
print("gradients[\"dWya\"][1][2] =", gradients["dWya"][1][2])
print("gradients[\"db\"][4] =", gradients["db"][4])
print("gradients[\"dby\"][1] =", gradients["dby"][1])
Expected output:
**gradients["dWaa"][1][2] ** 10.0
**gradients["dWax"][3][1]** -10.0
**gradients["dWya"][1][2]** 0.29713815361
**gradients["db"][4]** [ 10.]
**gradients["dby"][1]** [ 8.45833407]
2.2 - Sampling
Now assume that your model is trained. You would like to generate new text (characters). The process of generation is explained in the picture
below:
Exercise: Implement the sample function below to sample characters. You need to carry out 4 steps:
Step 1: Pass the network the first "dummy" input $x^{\langle 1 \rangle} = \vec{0}$ (the vector of zeros). This is the default input before we've generated any characters. We also set $a^{\langle 0 \rangle} = \vec{0}$.
Step 2: Run one step of forward propagation to get $a^{\langle 1 \rangle}$ and $\hat{y}^{\langle 1 \rangle}$. Here are the equations:
$$a^{\langle t+1 \rangle} = \tanh(W_{ax} x^{\langle t+1 \rangle} + W_{aa} a^{\langle t \rangle} + b) \tag{1}$$
$$z^{\langle t+1 \rangle} = W_{ya} a^{\langle t+1 \rangle} + b_y \tag{2}$$
$$\hat{y}^{\langle t+1 \rangle} = \mathrm{softmax}(z^{\langle t+1 \rangle}) \tag{3}$$
Note that $\hat{y}^{\langle t+1 \rangle}$ is a (softmax) probability vector (its entries are between 0 and 1 and sum to 1). $\hat{y}^{\langle t+1 \rangle}_i$ represents the probability that the
character indexed by "i" is the next character. We have provided a softmax() function that you can use.
Step 3: Carry out sampling: Pick the next character's index according to the probability distribution specified by $\hat{y}^{\langle t+1 \rangle}$. This means that if
$\hat{y}^{\langle t+1 \rangle}_i = 0.16$, you will pick the index "i" with 16% probability. To implement it, you can use np.random.choice (https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.choice.html).
np.random.seed(0)
p = np.array([0.1, 0.0, 0.7, 0.2])
index = np.random.choice([0, 1, 2, 3], p = p.ravel())
This means that you will pick the index according to the distribution:
$P(\text{index}=0) = 0.1,\ P(\text{index}=1) = 0.0,\ P(\text{index}=2) = 0.7,\ P(\text{index}=3) = 0.2$.
Step 4: The last step to implement in sample() is to overwrite the variable x, which currently stores $x^{\langle t \rangle}$, with the value of $x^{\langle t+1 \rangle}$. You
will represent $x^{\langle t+1 \rangle}$ by creating a one-hot vector corresponding to the character you've chosen as your prediction. You will then forward
propagate $x^{\langle t+1 \rangle}$ and keep repeating Steps 2-4 until you get a "\n" character, indicating you've reached the end of the dinosaur
name.
def sample(parameters, char_to_ix, seed):
    """
    Sample a sequence of characters according to the RNN's output probability distributions.

    Arguments:
    parameters -- python dictionary containing the parameters Waa, Wax, Wya, by, and b.
    char_to_ix -- python dictionary mapping each character to an index.
    seed -- used for grading purposes. Do not worry about it.

    Returns:
    indices -- a list of length n containing the indices of the sampled characters.
    """
    # Retrieve parameters and relevant shapes
    Waa, Wax, Wya, by, b = parameters['Waa'], parameters['Wax'], parameters['Wya'], parameters['by'], parameters['b']
    vocab_size = by.shape[0]
    n_a = Waa.shape[1]

    # Step 1: Create the zero ("dummy") input x and initialize the hidden state a_prev as zeros
    x = np.zeros((vocab_size, 1))
    a_prev = np.zeros((n_a, 1))

    # Create an empty list of indices; this will contain the indices of the characters to generate
    indices = []
    # idx is a flag to detect a newline character; initialize it to -1
    idx = -1

    # Loop over time-steps t. At each time-step, sample a character from a probability distribution and append
    # its index to "indices". We'll stop if we reach 50 characters (which should be very unlikely with a well
    # trained model), which helps debugging and prevents entering an infinite loop.
    counter = 0
    newline_character = char_to_ix['\n']

    while (idx != newline_character and counter != 50):
        # Step 2: Forward propagate x using the equations (1), (2) and (3)
        a = np.tanh(np.dot(Wax, x) + np.dot(Waa, a_prev) + b)
        z = np.dot(Wya, a) + by
        y = softmax(z)

        # For grading purposes: make the sampling reproducible
        np.random.seed(counter + seed)

        # Step 3: Sample the index of a character within the vocabulary from the probability distribution y
        idx = np.random.choice(np.arange(vocab_size), p=y.ravel())
        indices.append(idx)

        # Step 4: Overwrite the input character as the one corresponding to the sampled index
        x = np.zeros((vocab_size, 1))
        x[idx] = 1

        # Update "a_prev" to be "a" and the counters
        a_prev = a
        seed += 1
        counter += 1

    if (counter == 50):
        indices.append(char_to_ix['\n'])

    return indices
In [ ]: np.random.seed(2)
n, n_a = 20, 100
a0 = np.random.randn(n_a, 1)
i0 = 1 # first character is ix_to_char[i0]
Wax, Waa, Wya = np.random.randn(n_a, vocab_size), np.random.randn(n_a, n_a), np.random.randn(vocab_size, n_a)
b, by = np.random.randn(n_a, 1), np.random.randn(vocab_size, 1)
parameters = {"Wax": Wax, "Waa": Waa, "Wya": Wya, "b": b, "by": by}
Expected output:
Exercise: Implement this optimization process (one step of stochastic gradient descent).
def optimize(X, Y, a_prev, parameters, learning_rate = 0.01):
    """
    Execute one step of the optimization to train the model.

    Arguments:
    X -- list of integers, where each integer is a number that maps to a character in the vocabulary.
    Y -- list of integers, exactly the same as X but shifted one index to the left.
    a_prev -- previous hidden state.
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        b -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    learning_rate -- learning rate for the model.

    Returns:
    loss -- value of the loss function (cross-entropy)
    gradients -- python dictionary containing:
                        dWax -- Gradients of input-to-hidden weights, of shape (n_a, n_x)
                        dWaa -- Gradients of hidden-to-hidden weights, of shape (n_a, n_a)
                        dWya -- Gradients of hidden-to-output weights, of shape (n_y, n_a)
                        db -- Gradients of bias vector, of shape (n_a, 1)
                        dby -- Gradients of output bias vector, of shape (n_y, 1)
    a[len(X)-1] -- the last hidden state, of shape (n_a, 1)
    """
    ### START CODE HERE ###
    # Forward propagate through time to compute the loss (rnn_forward is provided in rnn_utils)
    loss, cache = rnn_forward(X, Y, a_prev, parameters)

    # Backpropagate through time to compute the gradients (rnn_backward is provided in rnn_utils)
    gradients, a = rnn_backward(X, Y, parameters, cache)

    # Clip the gradients to the range [-5, 5] to avoid exploding gradients
    gradients = clip(gradients, 5)

    # Update the parameters with one gradient descent step
    # (update_parameters is assumed to be provided alongside rnn_forward/rnn_backward)
    parameters = update_parameters(parameters, gradients, learning_rate)
    ### END CODE HERE ###

    return loss, gradients, a[len(X)-1]
In [ ]: np.random.seed(1)
vocab_size, n_a = 27, 100
a_prev = np.random.randn(n_a, 1)
Wax, Waa, Wya = np.random.randn(n_a, vocab_size), np.random.randn(n_a, n_a), np.random.randn(vocab_size, n_a)
b, by = np.random.randn(n_a, 1), np.random.randn(vocab_size, 1)
parameters = {"Wax": Wax, "Waa": Waa, "Wya": Wya, "b": b, "by": by}
X = [12,3,5,11,22,3]
Y = [4,14,11,22,25, 26]
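# Run one optimization step and print a few entries (a sketch matching the expected output below):
loss, gradients, a_last = optimize(X, Y, a_prev, parameters, learning_rate = 0.01)
print("Loss =", loss)
print("gradients[\"dWaa\"][1][2] =", gradients["dWaa"][1][2])
print("np.argmax(gradients[\"dWax\"]) =", np.argmax(gradients["dWax"]))
print("gradients[\"dWya\"][1][2] =", gradients["dWya"][1][2])
print("gradients[\"db\"][4] =", gradients["db"][4])
print("gradients[\"dby\"][1] =", gradients["dby"][1])
print("a_last[4] =", a_last[4])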
Expected output:
**Loss** 126.503975722
**gradients["dWaa"][1][2]** 0.194709315347
**np.argmax(gradients["dWax"])** 93
**gradients["dWya"][1][2]** -0.007773876032
**gradients["db"][4]** [-0.06809825]
**gradients["dby"][1]** [ 0.01538192]
**a_last[4]** [-1.]
Given the dataset of dinosaur names, we use each line of the dataset (one name) as one training example. Every 100 steps of stochastic gradient
descent, you will sample 10 randomly chosen names to see how the algorithm is doing. Remember to shuffle the dataset, so that stochastic
gradient descent visits the examples in random order.
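A minimal sketch of that preprocessing (assuming dinos.txt sits in the working directory):

data = open("dinos.txt").read().lower()
examples = [x.strip() for x in data.split('\n') if x.strip()]   # one name per line
np.random.seed(0)        # fix the seed so the shuffled order is reproducible
np.random.shuffle(examples)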
Exercise: Follow the instructions and implement model(). When examples[index] contains one dinosaur name (string), to create an example
(X, Y), you can use this:
index = j % len(examples)
X = [None] + [char_to_ix[ch] for ch in examples[index]]
Y = X[1:] + [char_to_ix["\n"]]
Note that we use index = j % len(examples), where j = 1, ..., num_iterations, to make sure that examples[index] is always a valid
index (index is smaller than len(examples)). The first entry of X being None will be interpreted by rnn_forward() as setting $x^{\langle 0 \rangle} = \vec{0}$.
Further, this ensures that Y is equal to X but shifted one step to the left, with an additional "\n" appended to signify the end of the dinosaur
name.
def model(data, ix_to_char, char_to_ix, num_iterations = 35000, n_a = 50, dino_names = 7, vocab_size = 27):
    """
    Trains the model and generates dinosaur names.

    Arguments:
    data -- text corpus
    ix_to_char -- dictionary that maps the index to a character
    char_to_ix -- dictionary that maps a character to an index
    num_iterations -- number of iterations to train the model for
    n_a -- number of units of the RNN cell
    dino_names -- number of dinosaur names you want to sample at each iteration.
    vocab_size -- number of unique characters found in the text, size of the vocabulary

    Returns:
    parameters -- learned parameters
    """
    # Retrieve n_x and n_y from vocab_size
    n_x, n_y = vocab_size, vocab_size

    # Initialize parameters
    parameters = initialize_parameters(n_a, n_x, n_y)

    # Initialize loss (this is required because we want to smooth our loss, don't worry about it)
    loss = get_initial_loss(vocab_size, dino_names)

    # Build the list of training examples and shuffle it
    # (data is assumed to hold the raw contents of dinos.txt, one name per line)
    examples = [x.strip().lower() for x in data.split('\n') if x.strip()]
    np.random.seed(0)
    np.random.shuffle(examples)

    # Initialize the hidden state
    a_prev = np.zeros((n_a, 1))

    # Optimization loop
    for j in range(num_iterations):
        # Use the hint above to define one training example (X, Y)
        index = j % len(examples)
        X = [None] + [char_to_ix[ch] for ch in examples[index]]
        Y = X[1:] + [char_to_ix["\n"]]

        # Perform one optimization step: Forward-prop -> Backward-prop -> Clip -> Update parameters
        # Choose a learning rate of 0.01
        curr_loss, gradients, a_prev = optimize(X, Y, a_prev, parameters, learning_rate = 0.01)

        # Use a latency trick to keep the loss smooth. It happens here to accelerate the training.
        loss = smooth(loss, curr_loss)

        # Every 2000 iterations, generate dino_names names with sample() to check if the model is learning properly
        if j % 2000 == 0:
            print('Iteration: %d, Loss: %f' % (j, loss) + '\n')

            seed = 0
            for name in range(dino_names):
                # Sample indices and print the corresponding characters
                sampled_indices = sample(parameters, char_to_ix, seed)
                print(''.join(ix_to_char[ix] for ix in sampled_indices), end='')
                seed += 1  # To get the same result for grading purposes, increment the seed by one.

            print('\n')

    return parameters
Run the following cell; you should observe your model outputting random-looking characters at the first iteration. After a few thousand iterations,
your model should learn to generate reasonable-looking names.
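A sketch of the training call (assuming data holds the raw text of dinos.txt as loaded earlier):

parameters = model(data, ix_to_char, char_to_ix)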
Conclusion
You can see that your algorithm has started to generate plausible dinosaur names towards the end of the training. At first, it was generating random
characters, but towards the end you could see dinosaur names with cool endings. Feel free to run the algorithm even longer and play with
hyperparameters to see if you can get even better results. Our implementation generated some really cool names like maconucon, marloralus and
macingsersaurus. Your model hopefully also learned that dinosaur names tend to end in saurus, don, aura, tor, etc.
If your model generates some non-cool names, don't blame the model entirely--not all actual dinosaur names sound cool. (For example,
dromaeosauroides is an actual dinosaur name and is in the training set.) But this model should give you a set of candidates from which you can
pick the coolest!
This assignment used a relatively small dataset, so that you could train an RNN quickly on a CPU. Training a model of the English language
requires a much bigger dataset, usually needs much more computation, and could run for many hours on GPUs. We ran our dinosaur name model for
quite some time, and so far our favorite name is the great, undefeatable, and fierce: Mangosaurus!
A similar (but more complicated) task is to generate Shakespeare poems. Instead of learning from a dataset of dinosaur names, you can use a
collection of Shakespearian poems. Using LSTM cells, you can learn longer-term dependencies that span many characters in the text--e.g., where a
character appearing somewhere in a sequence can influence what should be a different character much later in the sequence. These long-term
dependencies were less important with dinosaur names, since the names were quite short.
We have implemented a Shakespeare poem generator with Keras. Run the following cell to load the required packages and models. This may take a
few minutes.
To save you some time, we have already trained a model for ~1000 epochs on a collection of Shakespearian poems called "The Sonnets"
(shakespeare.txt).
Let's train the model for one more epoch. When it finishes training for an epoch---this will also take a few minutes---you can run
generate_output, which will prompt you for an input (<40 characters). The poem will start with your sentence, and our RNN-Shakespeare
will complete the rest of the poem for you! For example, try "Forsooth this maketh no sense " (don't enter the quotation marks). Depending on
whether you include the space at the end, your results might also differ--try it both ways, and try other inputs as well.
In [ ]: # Run this cell to try with different inputs without having to re-train the model
generate_output()
The RNN-Shakespeare model is very similar to the one you have built for dinosaur names. The only major differences are:
It uses LSTM cells instead of the basic RNN to capture longer-range dependencies.
The model is a deeper, stacked LSTM model (2 layers).
It uses Keras instead of raw Python/numpy to simplify the code.
If you want to learn more, you can also check out the Keras Team's text generation implementation on GitHub: https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py