MATLAB Function Reference
optimset
Create or edit optimization options parameter structure
Syntax
• options = optimset('param1',value1,'param2',value2,...)
• optimset
• options = optimset
• options = optimset(optimfun)
• options = optimset(oldopts,'param1',value1,...)
• options = optimset(oldopts,newopts)
Description
optimset('param1',value1,...) creates an options structure in which the named parameters have the specified values; any unspecified parameters are set to [], which tells the optimization function to use its default value for that parameter.
Parameters
Examples
This statement creates an optimization options structure called options in which the
Display parameter is set to 'iter' and the TolFun parameter is set to 1e-8.
• options = optimset('Display','iter','TolFun',1e-8)
This statement makes a copy of the options structure called options, changing the
value of the TolX parameter and storing new values in optnew.
• optnew = optimset(options,'TolX',1e-4);
This statement returns an optimization options structure that contains all the
parameter names and default values relevant to the function fminbnd.
• optimset('fminbnd')
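The optimset(oldopts,newopts) form combines two options structures; as a sketch of its behavior, any parameters in newopts with nonempty values overwrite the corresponding parameters in oldopts.
• oldopts = optimset('Display','iter','TolFun',1e-8);
• newopts = optimset('TolFun',1e-4);
• options = optimset(oldopts,newopts) % TolFun becomes 1e-4, Display stays 'iter'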
Optimization Toolbox
fminunc
Find a minimum of an unconstrained multivariable function
Syntax
• x = fminunc(fun,x0)
• x = fminunc(fun,x0,options)
• x = fminunc(fun,x0,options,P1,P2,...)
• [x,fval] = fminunc(...)
• [x,fval,exitflag] = fminunc(...)
• [x,fval,exitflag,output] = fminunc(...)
• [x,fval,exitflag,output,grad] = fminunc(...)
• [x,fval,exitflag,output,grad,hessian] = fminunc(...)
Description
[x,fval] = fminunc(...) returns in fval the value of the objective function fun
at the solution x.
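The remaining output arguments follow the same pattern. The sketch below, which assumes an objective myfun and a starting point x0 like those in the Examples section, shows how the extra outputs might be inspected; a positive exitflag indicates that fminunc converged to a solution:
• [x,fval,exitflag,output] = fminunc(@myfun,x0);
• if exitflag > 0 % converged to a solution
• disp(output.iterations) % number of iterations taken
• end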
Input Arguments
If the gradient of fun can also be computed and the GradObj parameter is
'on', as set by
• options = optimset('GradObj','on')
then the function fun must return, in the second output argument, the
gradient value g, a vector, at x. Note that by checking the value of
nargout the function can avoid computing g when fun is called with only
one output argument (in the case where the optimization algorithm only
needs the value of f but not g).
• function [f,g] = myfun(x)
• f = ... % Compute the function value at x
• if nargout > 1 % fun called with two output arguments
• g = ... % Compute the gradient evaluated at x
• end
The gradient is the vector of partial derivatives of f at the point x; that is, the ith component of g is the partial derivative of f with respect to the ith component of x.
If the Hessian matrix can also be computed and the Hessian parameter is
'on', i.e., options = optimset('Hessian','on'), then the function
fun must return the Hessian value H, a symmetric matrix, at x in a third
output argument. Note that by checking the value of nargout we can
avoid computing H when fun is called with only one or two output
arguments (in the case where the optimization algorithm only needs the
values of f and g but not H).
• function [f,g,H] = myfun(x)
• f = ... % Compute the objective function value at x
• if nargout > 1 % fun called with two output arguments
• g = ... % Gradient of the function evaluated at x
• if nargout > 2
• H = ... % Hessian evaluated at x
• end
• end
Output Arguments
Options
We start by describing the LargeScale option, since it states a preference for which algorithm to use. It is only a preference because certain conditions must be met to use the large-scale algorithm. For fminunc, the gradient must be provided (see the description of fun above to see how) or else the medium-scale algorithm is used:
LargeScale Use the large-scale algorithm if possible when set to 'on'. Use the medium-scale algorithm when set to 'off'.
Large-Scale Algorithm Only. These parameters are used only by the large-scale algorithm:
Medium-Scale Algorithm Only. These parameters are used only by the medium-scale algorithm:
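For example, to force the medium-scale algorithm regardless of whether a gradient is supplied, the option can be set as in the sketch below (myfun and x0 are as in the Examples section):
• options = optimset('LargeScale','off'); % select the medium-scale algorithm
• x = fminunc(@myfun,x0,options)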
Examples
Minimize the function f(x) = 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2. First write an M-file that computes the objective:
• function f = myfun(x)
• f = 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2; % Cost function
Then call fminunc to find a minimum of myfun near the starting point x0 = [1,1]:
• x0 = [1,1];
• [x,fval] = fminunc(@myfun,x0)
After a couple of iterations, the solution, x, and the value of the function at x, fval,
are returned.
• x =
• 1.0e-008 *
• -0.7512 0.2479
• fval =
• 1.3818e-016
To minimize this function with the gradient provided, modify the M-file myfun.m so that the gradient is returned as the second output argument.
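A minimal sketch of the modified M-file, using the analytic gradient g = [6*x(1)+2*x(2); 2*x(1)+2*x(2)] obtained by differentiating the cost function above:
• function [f,g] = myfun(x)
• f = 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2; % Cost function
• if nargout > 1 % gradient required
• g = [6*x(1)+2*x(2); 2*x(1)+2*x(2)]; % Gradient of the cost function
• end
Then indicate that the gradient is available by setting GradObj to 'on' and call fminunc: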
• options = optimset('GradObj','on');
• x0 = [1,1];
• [x,fval] = fminunc(@myfun,x0,options)
After several iterations the solution x and fval, the value of the function at x, are
returned.
• x =
• 1.0e-015 *
• 0.1110 -0.8882
• fval =
• 6.2862e-031
fminunc can also minimize a function defined as an inline object. For example, to minimize f(x) = sin(x) + 3 starting near x = 4:
• f = inline('sin(x)+3');
• x = fminunc(f,4)
which returns the solution
• x =
• 4.7124
Notes
fminunc is not the preferred choice for solving problems that are sums of squares, that is, problems of the form
minimize over x: f1(x)^2 + f2(x)^2 + ... + fn(x)^2
Instead use the lsqnonlin function, which has been optimized for problems of this form.
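As a sketch, assuming a hypothetical M-file myres.m that returns the vector of residuals [f1(x); ...; fn(x)] rather than the scalar sum of squares, the call would look like:
• x0 = [1,1];
• x = lsqnonlin(@myres,x0) % lsqnonlin minimizes the sum of squares of the residuals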
To use the large-scale method, the gradient must be provided in fun (and the GradObj
parameter set to 'on' using optimset). A warning is given if no gradient is provided
and the LargeScale parameter is not 'off'.
Algorithms
The default line search algorithm, i.e., when the LineSearchType parameter is set to
'quadcubic', is a safeguarded mixed quadratic and cubic polynomial interpolation
and extrapolation method. A safeguarded cubic polynomial method can be selected by
setting the LineSearchType parameter to 'cubicpoly'. This second method
generally requires fewer function evaluations but more gradient evaluations. Thus, if
gradients are being supplied and can be calculated inexpensively, the cubic
polynomial line search method is preferable.
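For example, when a gradient is supplied, the cubic polynomial line search can be selected roughly as follows (using the parameter names described on this page):
• options = optimset('GradObj','on','LineSearchType','cubicpoly');
• x = fminunc(@myfun,x0,options)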
Limitations
fminunc minimizes only over the real numbers; that is, x must consist of real numbers and f(x) must return a real number. When x has complex variables, they must be split into real and imaginary parts.
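A minimal sketch of this splitting, for a hypothetical objective of a complex scalar z packed into v = [real(z); imag(z)]:
• function f = splitfun(v) % hypothetical wrapper; v = [real(z); imag(z)]
• z = v(1) + 1i*v(2);
• f = abs(z - (1+2i))^2; % example real-valued cost of a complex variable
The wrapper is then minimized over the real vector v:
• v = fminunc(@splitfun,[0;0])
• z = v(1) + 1i*v(2)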
Large-Scale Optimization. To use the large-scale algorithm, the user must supply the gradient in fun (and GradObj must be set to 'on' in options).
fmincg
fmincg is more accurate than fminunc, and the time taken by the two is almost the same. For a neural network, or in general for any model with a large number of weights, fminunc can give an out-of-memory error, so fmincg is more memory efficient.
That said, the fmincg function is used not to get a more accurate result (your cost function and hypothesis are the same in either case) but because it is more efficient at carrying out gradient descent for especially complex hypotheses. Use fminunc where the hypothesis has only a few features, but fmincg where it has hundreds.
• all_theta is a matrix with one row for each of the trained theta vectors.
• The statement y == c creates a vector of 0's and 1's for each value of c as you iterate from 1 to num_labels. These are the effective 'y' values used to train the classifier that detects each label, as sketched below.
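A minimal one-vs-all training sketch under these assumptions: fmincg is the course-provided minimizer (it is not a built-in MATLAB function), and lrCostFunction is a hypothetical regularized logistic-regression cost that returns both the cost and its gradient:
• options = optimset('GradObj','on','MaxIter',50);
• initial_theta = zeros(n + 1, 1); % n = number of features
• all_theta = zeros(num_labels, n + 1); % one row per trained theta
• for c = 1:num_labels
• costFun = @(t) lrCostFunction(t, X, (y == c), lambda); % y == c gives the 0/1 labels for class c
• all_theta(c, :) = fmincg(costFun, initial_theta, options)'; % assumed fmincg signature: fmincg(f, theta, options)
• end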