Matlab Tutorial: Anthony S. Maida October 8, 2001 Revised September 20, 2004 March 23, 2006
Draft
Anthony S. Maida
Contents
1 Introduction
1.1 Is MATLAB appropriate for your problem?
1.2 Interpreted language
1.2.1 Command-line shell
1.2.2 Loading scripts from files
2 Vectorizing code
2.1 Matrices
2.1.1 Statement terminators and output suppression
2.1.2 Some matrix operators
2.1.3 Flexible matrix access
2.1.4 Loading data via a script
2.2 Vector operations
2.2.1 Examining the workspace
2.3 Applying functions to matrix elements
2.3.1 Vectorizing a feedforward network for one epoch
2.4 Defining functions
Appendix
A Appendix: Matrix notation
A.1 Matrix multiplication
A.2 Definition of transpose
1 Introduction
MATLAB stands for matrix laboratory and the name is a trademark of The MathWorks
Incorporated. The purpose of this document is to give you enough background so that you
can effectively use the MATLAB built-in help system to solve your programming problems.
>> 2 + 2
ans = 4
The command-line prompt is >>. The system reads the expression 2 + 2, evaluates it,
stores the result in a default variable ans, and prints the value of this variable. The variable ans
always holds the result of the most recent computation that was not assigned to a
variable. For instance, if we continue the session as shown below, the value of ans doesn’t
change.
Footnote 1: If you cannot afford MATLAB, then GNU Octave (www.octave.org) is an alternative for interactive
matrix computations and Gnuplot (www.gnuplot.org) is an alternative for plotting and visualizing
data.
>> x = 2 + 3
x = 5
>> ans
ans = 4
At this point the workspace has two variables, x and ans. The development environment
has a workspace editor, which allows you to examine the variables in the workspace and how
much space they consume. You can clear the workspace to its original state by typing the
command clear.
You can use the MATLAB built-in editor to write scripts, save them to files, and then have
them loaded and evaluated as if you typed the commands directly into the command line.
Also, while in the command-line shell, you can type the name of a file containing a MATLAB
script (without the “.m” extension) to achieve the same effect. Text files containing MATLAB code are called m-files and
have the file extension “.m”.
As you develop a script, you will follow a cycle of editing and loading. When you
reload your script, the workspace from the previous cycle will still be in effect unless you
clear it. You probably want to clear it to avoid subtle reinitialization errors. The best way
to do this is to put the clear command as the first line of the script, so that the script runs
in a virgin workspace.
2 Vectorizing code
Interpreted MATLAB scripts achieve efficiency in speed of execution by using large instructions
to reduce decoding overhead and by using space to decrease time. For a given problem
size, MATLAB scripts use a lot of memory. Thus MATLAB trades memory to increase speed.
2.1 Matrices
The primary data structure in MATLAB is a matrix. The appendix in this tutorial has more
information about matrices. You can create and initialize a matrix by typing the matrix
values in an assignment statement. The example below creates a 2 × 3 matrix of double-precision
floating-point values and then sets the variable w to reference the matrix (see footnote 2). Notice
that the variable w was not previously declared. The rows of the matrix are terminated with
semicolons; the semicolon after the last row is optional. All the commands below create the same matrix.
>> w = [1 2 3; 4 5 6]
>> w = [1 2 3; 4 5 6;]
>> w = [1, 2, 3; 4, 5, 6]
>> w = [1, 2, 3; 4, 5, 6;]
Footnote 2: Numeric values in MATLAB are double-precision floating-point by default.
MATLAB will respond by echoing the value of w. That is, it will print w and the matrix
of values. In this example, terminating a line with a semicolon character will suppress the
output.
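As a quick check of the claims above, the following sketch confirms that the four bracket forms build the same matrix and reports its dimensions. The variable names wa through wd, same, and dims are introduced here only for illustration.

```matlab
% Four equivalent ways to type the same matrix (names wa..wd are illustrative).
wa = [1 2 3; 4 5 6];
wb = [1 2 3; 4 5 6;];
wc = [1, 2, 3; 4, 5, 6];
wd = [1, 2, 3; 4, 5, 6;];

% isequal returns true when all arguments have the same size and contents.
same = isequal(wa, wb, wc, wd);

% size returns [rows, columns]; the trailing semicolons suppress the echo.
dims = size(wa);
```

Removing the semicolon from any of these lines makes the shell print the value that was just assigned.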
Of course, you can also add two compatible matrices using the “+” operator. There are
other, more subtle operators. Suppose that you want to square each element in a matrix.
There is an operator, “.*”, called array multiply, that multiplies corresponding
elements of two m × n matrices to yield a new m × n matrix. In the example below, we
use the operator to square the elements of w.
>> w = [1 2 3; 4 5 6];
>> w.*w
ans = 1 4 9
16 25 36
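To make the distinction between array multiply and the ordinary matrix product concrete, here is a small sketch. It assumes the same w as above; the names sq1, sq2, and g are hypothetical.

```matlab
w = [1 2 3; 4 5 6];

sq1 = w .* w;    % elementwise product: each entry times itself
sq2 = w .^ 2;    % elementwise power gives the same result

% The matrix product w*w is undefined for a 2-by-3 matrix because the inner
% dimensions (3 and 2) disagree, but w*w' (w times its transpose) is a valid
% 2-by-2 matrix product.
g = w * w';
```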
You can treat the matrix w as a six-element vector by referencing it using the expression “w(:)”. Type this
into the command shell to see what happens.
You also have full access to rows and columns in the matrix. For instance, the expression
“w(2,:)” gives you the second row of the matrix and the expression “w(:,2)”
gives you the second column. Type these into the command shell to see what happens. You
can delete the second column of the matrix w by typing
>> w(:,2) = []
w = 1 3
4 6
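The access expressions above can be tried in one short script. This is only a sketch; the variable names on the left-hand sides (v, row2, col2) are made up for illustration.

```matlab
w = [1 2 3; 4 5 6];

v    = w(:);      % all elements as one column, taken column by column
row2 = w(2, :);   % second row
col2 = w(:, 2);   % second column

w(:, 2) = [];     % delete the second column; w is now 2-by-2
```

Note that w(:) walks down the first column before moving to the second, so the elements come out in the order 1, 4, 2, 5, 3, 6.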
In most of the examples that follow, I will leave out the command-line prompt.
data(:, :, 2) = [
...
Notice that the Java program embedded the output within a MATLAB script where the MATLAB
variable data is a three-dimensional array. The third dimension holds the cycle number
and the first two dimensions hold the state of the two-dimensional array of cellular automata
for that cycle. Suppose several cycles of this output are sent to a file named data.m.
Then from within MATLAB this data can be loaded simply by typing the one-line command
data, which is an instruction to evaluate the script named data.m. Once loaded into
MATLAB, any number of tools can be used to manipulate, examine, and visualize the data.
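For concreteness, a data script of this kind might look like the sketch below. The 2-by-3 grids and their values are invented for illustration, as are the names dims and frame2; the actual output of the Java program is not reproduced here.

```matlab
% Hypothetical contents of a generated script such as data.m.
% Each assignment fills one "page" of the three-dimensional array.
data(:, :, 1) = [
  0 1 0
  1 1 0
];
data(:, :, 2) = [
  0 0 1
  0 1 1
];

% After the script runs, data is rows-by-columns-by-cycles.
dims = size(data);        % [2 3 2]
frame2 = data(:, :, 2);   % state of the grid at cycle 2
```

Because the file is ordinary MATLAB code, the generating program only has to print text in this shape; no special file format is needed.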
2.2 Vector operations
Let us interpret the matrix w as the weight matrix for the first layer of a two-unit neural
network with three input features to each unit. The matrix has two rows and each row codes
the three weight values for one unit. Then if we have an input pattern vector p = [1, 0, 0]
(represented as a column vector), we can compute the net input for this layer with one
matrix multiplication, as shown below in the last line.
p = [1; 0; 0];
w = [1 2 3; 4 5 6];
n = w*p;
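Each component n_i of the vector n represents the net input for one neuron in the input layer. Each n_i
is equivalent to the sum below, where N is the number of input features and w_{i,j} is the entry in row i and column j of w.

$$ n_i = \sum_{j=1}^{N} w_{i,j} \, p_j $$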
If we were to compute these sums using for-loops, the interpreter would have to decode
and execute six different assignment statements. In this example, the operation of matrix
multiplication vectorizes a double-nested for-loop, so that only one statement is needed to
calculate it.
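The comparison can be made explicit with the following sketch, which computes the same net input twice: once with nested for-loops and once with the single matrix product. The loop version and the names nLoop, numUnits, and numFeatures are written here for illustration and are not part of the original example.

```matlab
p = [1; 0; 0];
w = [1 2 3; 4 5 6];

% Loop version: one accumulation per (unit, feature) pair, six in total.
[numUnits, numFeatures] = size(w);
nLoop = zeros(numUnits, 1);
for i = 1:numUnits
    for j = 1:numFeatures
        nLoop(i) = nLoop(i) + w(i, j) * p(j);
    end
end

% Vectorized version: a single statement.
n = w * p;
```

Both versions produce the same two-element column vector; the vectorized form simply hands the whole computation to one built-in operation.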
MATLAB has excellent run-time debugging facilities. In an interpreted language, you get
run-time errors that are inconceivable in a compiled language, so excellent run-time debugging
is a necessity in an interpreted language.
The current workspace has three variables which reference matrices. In the MATLAB
development environment, you have direct access to inspect and edit the objects in this
workspace. Depending on your platform, the workspace inspector is probably under the
window menu. When you access this inspector, you will see a list of variable names and
the amount of space their associated objects consume. If you click on one of these names,
you will be able to inspect the array contents using a spread-sheet-like interface. You can
even change the values of entries in a matrix.
If you are debugging a neural network program, you can watch the weights evolve using
this editor.
p = [1; 0; 0]
w = [1 2 3; 4 5 6]
n = w *p
a = 1 ./ (1+exp(-4*n))
This example is the same as the previous one except that one more line was added. Let us
explain what that line does, working outward from the innermost expression, n.
First, we multiplied the matrix n by the scalar −4. Next we applied the exponential
function, exp, to the result. This has the effect of applying the function to each element
of the matrix. Notice that whereas matrices are signaled by square brackets, function
application is signaled by parentheses. The next step was to add one to the matrix of results.
Notice that adding the scalar 1 to a matrix has the effect of adding one to each element of the
matrix. How does this happen? MATLAB converts the scalar 1 to a matrix of ones whose
dimensions match the argument on the other side of the operator (in this case +). After
this, MATLAB applies the matrix operation of addition. The symbol “./” stands for array
divide and is the division equivalent of “.*”. The 1 in the numerator again gets converted
to a matrix of ones whose dimensions match those on the other side of the operator. Then
the matrix of reciprocals is computed.
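The scalar expansion described above can be spelled out by hand. The sketch below builds the matrix of ones explicitly with ones(size(n)) and checks that the result matches the compact expression; the names u and aExplicit are hypothetical.

```matlab
p = [1; 0; 0];
w = [1 2 3; 4 5 6];
n = w * p;

% Compact form: scalars are expanded automatically.
a = 1 ./ (1 + exp(-4 * n));

% Explicit form: replace each scalar 1 with a matrix of ones shaped like n.
u = ones(size(n));
aExplicit = u ./ (u + exp(-4 * n));
```

The explicit form is shown only to illustrate the rule; in practice the compact form is what you would write.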
Notice that wts is a 2 × 2 matrix and that inputs is a 2 × 4 matrix. Multiplying wts
with inputs yields a 2 × 4 matrix. The matrix biasWts is 2 × 1 and onesVec is 1 × 4.
Multiplying these together yields a 2 × 4 matrix. The matrix netInput is therefore 2 × 4
and should be interpreted as follows. Each column of the matrix codes the output values
of the two units for one input pattern. Since there are four input patterns, there are four
columns.
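The dimension bookkeeping can be checked with a sketch like the one below. The variable names wts, inputs, biasWts, onesVec, and netInput come from the text, but every numeric value is an assumption chosen for illustration (four two-bit input patterns and arbitrary weights).

```matlab
% Assumed values; only the shapes are taken from the text.
wts     = [0.5 -0.5; 1.0 1.0];   % 2-by-2: one row of weights per unit
inputs  = [0 0 1 1; 0 1 0 1];    % 2-by-4: one column per input pattern
biasWts = [0.1; -0.2];           % 2-by-1: one bias weight per unit
onesVec = ones(1, 4);            % 1-by-4: replicates the bias across patterns

% Net input of both units for all four patterns in one statement.
netInput = wts * inputs + biasWts * onesVec;   % 2-by-4
```

The product biasWts * onesVec copies the bias column once per pattern, so the addition is between two matrices of the same shape.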
When you define a function in MATLAB, it should work with vectorized code. The
purpose of this section is to show how to define functions that work with vectorized code.
Let’s start with a simple example. We shall write a function to compute the logistic
sigmoid function, as defined below.
$$ \mathrm{logsig}(x) = \frac{1}{1 + e^{-x}} $$
MATLAB does not have a built-in logistic sigmoid function. Here is how to implement it.
function f = logsig(x)
f = 1 ./ (1 + exp(-x));
Normally, this function is placed in a file named logsig.m. Notice that the function does
not have the return statement characteristic of C, C++, or Java. The function returns
when it reaches the end of its body. The return value is the value of the variable f, which
was declared in the first line of the function. It is also customary not to indent the body of the
function. The function is also designed to work either with scalars or with arrays. That is,
you should be able to issue the function invocation logsig(1.5) to apply the function to
1.5, or the invocation logsig([1 2]) to apply the function to each element of the matrix
[1 2].
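To try the function without creating a file, the same body can be written as an anonymous function. This is a sketch for experimentation only; the handle name logsigDemo is hypothetical and is not how the function is packaged in logsig.m.

```matlab
% Same formula as logsig.m, bound to a function handle for quick testing.
logsigDemo = @(x) 1 ./ (1 + exp(-x));

s0 = logsigDemo(0);         % scalar argument: 1/(1 + 1) = 0.5
sv = logsigDemo([1 2]);     % array argument: applied to each element
sm = logsigDemo(zeros(2));  % works for matrices too; result is 2-by-2
```

The use of "./" rather than "/" is what lets one definition serve scalars, vectors, and matrices alike.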
The next example implements a piecewise linear function and is a bit more tricky to
vectorize. MATLAB does not have a built-in symmetric hard-limit function hardlims,
which is defined below.

$$ \mathrm{hardlims}(x) = \begin{cases} -1 & x < 0 \\ +1 & x \ge 0 \end{cases} $$
MATLAB does have the built-in function sign, which returns −1, 0, or 1. An example of its
use is given below.
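A minimal illustration of sign, using input values chosen here for the purpose:

```matlab
% sign is applied to each element of its argument.
s = sign([-2.5 0 3]);   % gives [-1 0 1]
```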
This function is similar to hardlims with one difference: sign returns 0 when
its argument is zero, whereas hardlims returns 1 when its argument is zero.
Here is how to define the hardlims function. It needs to be put in its own file called
hardlims.m. In this example, I have included a comment line between the function
declaration and the function body. The comment line begins with the % symbol.
function f = hardlims(x)
% 1 if x >= 0, -1 otherwise
f = 2 * (x >= 0) - 1;
For this function to work on array arguments, the comparison must be applied to every element of x.
In the expression (x >= 0), MATLAB expands the scalar 0 to an array of zeros whose dimensions are
the same as x and then applies the relational operator element by element. The result is an array of
ones and zeros, which the rest of the expression maps to +1 and −1.
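The behavior at zero and on arrays can be checked with a sketch like this one. An anonymous function (the hypothetical handle hardlimsDemo) stands in for the file hardlims.m so that the lines can be pasted into the command shell; the input values are chosen for illustration.

```matlab
% Same expression as in hardlims.m, bound to a handle for quick testing.
hardlimsDemo = @(x) 2 * (x >= 0) - 1;

x = [-3 0 2];
mask = (x >= 0);        % logical array: [0 1 1]
h = hardlimsDemo(x);    % 2*[0 1 1] - 1 = [-1 1 1]
s = sign(x);            % built-in sign gives [-1 0 1]; differs only at zero
```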
3 Plotting and visualization
The language has convenient and powerful visualization facilities. You can generate data
within MATLAB, or from an external program as was illustrated in Section 2.1.4.
>> y = 0:.1:10;
>> plot(y)
The first line creates a vector whose values range from 0 through 10 in increments of 0.1.
More accurately, y is a 1 × 101 matrix. The second line plots this vector as a function of an
implicit x ranging from 1 through 101 in increments of 1. That is, the y-values are plotted
as a function of their array indices.
The plot command operates on vectors and plots a y against an x. This was implicit
in the previous example and is made explicit below.
>> y = 0:.1:10;
>> [rows, cols] = size(y);
>> x = rows:cols;
>> plot(x,y)
In the above, size returns the dimensions of the matrix y, and the componentwise assignment
statement gives rows the value 1 and cols the value 101. The next line creates a
vector x with a default increment of 1. Compare this with the creation of y with an explicit
increment of 0.1. Finally, the plot command explicitly plots y as a function of x.
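The colon expressions used above can be inspected without plotting. The sketch below is only illustrative; the name sameLength is hypothetical.

```matlab
y = 0:.1:10;               % start:increment:stop
[rows, cols] = size(y);    % rows = 1, cols = 101
x = rows:cols;             % default increment of 1, i.e. 1:101

% x and y have the same number of elements, as plot(x, y) requires.
sameLength = (numel(x) == numel(y));
```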
for epoch=1:1000
. . .
Footnote 6: This example illustrates the syntax of for statements and if statements. Notice that both statements
terminate with an end. Also, the for statement allows a break. Finally, the scope of the iteration variable
epoch continues beyond the end of the for statement.
The variable error is assumed to hold a vector of error values for the n training patterns
in one epoch of training. If we square those error values and add them up, then we have
the SSE (sum of squared errors) for that particular training epoch. We save these values in the dynamically growing
array SSE (see footnote 7). In MATLAB, when storing a value in an array, if you use an array index that is
larger than the number of elements in the array, the array grows so that it is large enough to
handle the index. In Java, you would get an array index out-of-bounds exception. In C or
C++, your program behavior would be undefined. Once it is computed, plotting the SSE is
so easy it is mind boggling. Of course, we should put labels on the graph axes and give it a
title, as shown below.
plot(SSE);
xlabel('Epoch');
ylabel('SSE');
title('SS error for backprop');
If you want to print the value of a variable in a title, then use the more complex variant
below.
title(['SS error for ', num2str(epoch), ' epochs of bp']);
In this variant, the square brackets concatenate the three strings into a single string, which
is passed to title. Notice that the value of the variable epoch is converted to a string by num2str.
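Two points from this subsection can be tried in isolation: array growth on assignment and building a title string. The sketch below uses made-up error values and the hypothetical names err, sse, and titleText; it does not train a network.

```matlab
% Array growth: assigning past the end extends the array.
SSE = [];
SSE(1) = 0.9;
SSE(2) = 0.4;
SSE(4) = 0.1;              % element 3 was never assigned, so it is filled with 0

% Sum of squared errors for one made-up error vector.
err = [0.5; -0.5; 1];
sse = sum(err .* err);     % 0.25 + 0.25 + 1 = 1.5

% Square brackets join character strings; num2str converts the number.
epoch = 4;
titleText = ['SS error for ', num2str(epoch), ' epochs of bp'];
```

The zero that fills the skipped element is a reminder that growth by indexing is convenient but can hide indexing mistakes.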
3.2 3D Plots
The command plot3 allows you to plot data in three dimensions. The script below loads
the data illustrated in Section 2.1.4 and then plots it in three dimensions using the plot3
command.
data;
hold on
for cyc=1:50,
[x,y] = find(data(:,:,cyc));
z=zeros(length(y),1)+cyc;
plot3(z,x,y,'.')
axis([1 50 1 6 1 12])
end;
In the above, we assume that 50 frames or cycles of data have been generated. The 3D plot
is generated in a loop of 50 iterations where each iteration plots one frame of data on the
graph using the command plot3. The hold on command says to superimpose the data
from successive plots, rather than to erase the graph for each new plot. For a given cycle
of the life simulation, the command find obtains the x and y coordinates of the non-zero
elements of the two-dimensional matrix data(:, :, cyc). The list of x coordinates
goes into the x vector, and similarly for the y coordinates. Since x and y are vectors of the
same length, we also need a z vector of the same length to give to the plot3 command. To
do this, we create a vector of zeros whose length matches y. Then we add the scalar cyc to
this vector. In MATLAB, this adds the scalar to each element of the vector, yielding a vector
whose length is the same as y and whose components are all equal to cyc. Next, we issue
the plot3 command. We plot the z dimension on the x axis of plot3 because we want
the progress of time to be depicted on the x axis of the plot. Finally, we use the axis
command to say that we want the x, y, and z axes to range from 1 to 50,
1 to 6, and 1 to 12, respectively. These values match the dimensions of the plotted data.

Footnote 7: If you are in a debugging cycle and you reload this file, then you should include a clear statement at the
beginning of the file. You need to erase the old SSE array from the system.
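To see what find and the zeros-plus-scalar idiom return, consider one small frame. The 2-by-3 frame and the cycle number below are invented for illustration; note that find lists the non-zero entries column by column.

```matlab
frame = [0 1 0; 1 0 1];     % hypothetical frame with three non-zero cells
cyc = 7;                    % hypothetical cycle number

[x, y] = find(frame);       % x holds row indices, y holds column indices
z = zeros(length(y), 1) + cyc;   % one copy of cyc per non-zero cell
```

The three vectors x, y, and z have the same length, which is what plot3 needs in order to draw one point per non-zero cell.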
history(:,epoch)=outputsL2(:);
SSE(epoch) = sum(error .* error);
if((SSE(epoch)<.02)) break; end
end
plot(SSE);
figure;
surf(history);
view([45,45]);
The command surf(history) creates a surface plot of the history matrix. This
surface plot is very useful because it shows you how the output units change their response
to the input patterns as a function of training. It vividly displays the network’s change in
behavior as a result of the learning process. The command figure tells MATLAB to plot
the results in a new figure rather than overwrite the results in the SSE figure. The command
view([45,45]) sets the viewing angle. You can play with these parameters to get a
good viewing angle. Of course, you will want to annotate the plot as illustrated below.
surf(history);
view([45,45]);
xlabel('Epoch');
ylabel('Pattern');
zlabel('Activation');
title('Activation to patterns as a function of training');
A Appendix: Matrix notation
A 2 by 3 matrix has two rows and three columns. An m × n matrix has m rows and n
columns. With these conventions, the elements of an m × n matrix, A, would be written as
shown below.
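$$
A \equiv
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
$$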
If m equals n, then we have a square matrix. A square matrix has the same number of rows
and columns. Further, a square matrix has a diagonal. This is the set of matrix locations
a_{i,i}. A very important square matrix is the identity matrix, I, which has ones along the
diagonal and zeros everywhere else, as shown below.
$$
I \equiv
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
$$
$$ AI = IA = A \qquad (1) $$
The transpose of an m × n matrix A, written A^T, is an n × m matrix. When deriving results, there are a number of useful facts about the
transpose of a matrix.
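1. The identity matrix is equal to its own transpose.
$$ I = I^T $$
2. The transpose of the transpose of a matrix is the original matrix.
$$ A = (A^T)^T $$
3. The transpose of the product of two matrices is the transpose of the second matrix
multiplied with the transpose of the first matrix.
$$ (AB)^T = B^T A^T $$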
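These identities, together with equation (1), can be spot-checked numerically. The sketch below uses small integer matrices, chosen arbitrarily, so that the comparisons are exact; in MATLAB the transpose of a real matrix A is written A' and eye(n) builds the n-by-n identity. The result names identityOk, fact1, fact2, and fact3 are hypothetical.

```matlab
A = [1 2; 3 4];
B = [0 1; 5 6];
I = eye(2);

identityOk = isequal(A * I, A) && isequal(I * A, A);   % equation (1)
fact1 = isequal(I, I');                 % I equals its transpose
fact2 = isequal(A, (A')');              % transposing twice restores A
fact3 = isequal((A * B)', B' * A');     % transpose of a product
```

A numerical check on one pair of matrices is not a proof, but it is a quick way to catch a misremembered identity such as reversing the order in the product rule.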