SGDLibrary: A MATLAB Library for Stochastic Optimization Algorithms
Abstract
We consider the problem of finding the minimizer of a function $f : \mathbb{R}^d \rightarrow \mathbb{R}$ of the finite-sum
form $\min f(w) = \frac{1}{n}\sum_{i=1}^{n} f_i(w)$. This problem has been studied intensively in recent years
in the field of machine learning (ML). One promising approach for large-scale data is to use
a stochastic optimization algorithm to solve the problem. SGDLibrary is a readable, flexible
and extensible pure-MATLAB library of a collection of stochastic optimization algorithms.
The purpose of the library is to provide researchers and implementers a comprehensive
evaluation environment for the use of these algorithms on various ML problems.
Keywords: Stochastic optimization, stochastic gradient, finite-sum minimization prob-
lem, large-scale optimization problem
1. Introduction
This work aims to facilitate research on stochastic optimization for large-scale data. We
particularly address a regularized finite-sum minimization problem defined as
$$\min_{w \in \mathbb{R}^d} f(w) := \frac{1}{n}\sum_{i=1}^{n} f_i(w) = \frac{1}{n}\sum_{i=1}^{n} L(w, x_i, y_i) + \lambda R(w), \qquad (1)$$
where $w \in \mathbb{R}^d$ represents the model parameter and $n$ denotes the number of samples $(x_i, y_i)$.
$L(w, x_i, y_i)$ is the loss function and $R(w)$ is the regularizer with the regularization parameter
$\lambda \geq 0$. A wide variety of machine learning (ML) models fall into this form. Considering
$L(w, x_i, y_i) = (w^T x_i - y_i)^2$, $x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}$ and $R(w) = \|w\|_2^2$, this results in an $\ell_2$-norm
regularized linear regression problem (a.k.a. ridge regression) for $n$ training samples
$(x_1, y_1), \cdots, (x_n, y_n)$. In the case of binary classification with the desired class label $y_i \in
\{+1, -1\}$ and $R(w) = \|w\|_1$, an $\ell_1$-norm regularized logistic regression (LR) problem is
obtained as $f_i(w) = \log(1 + \exp(-y_i w^T x_i)) + \lambda \|w\|_1$, which encourages sparsity of the
solution $w$. Other problems covered include matrix completion, support vector machines
(SVM), and sparse principal component analysis, to name but a few.
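To make the finite-sum structure concrete, here is a minimal MATLAB sketch (not code from SGDLibrary; the function name ridge_component is purely illustrative) of the per-sample cost $f_i(w)$ and its gradient for the ridge regression example, with $R(w) = \|w\|_2^2$:

    % Per-sample cost and gradient for ridge regression:
    % f_i(w) = (w'*x_i - y_i)^2 + lambda*||w||_2^2.
    % x: d-by-n sample matrix, y: 1-by-n target vector, i: sample index.
    function [f, g] = ridge_component(w, x, y, i, lambda)
        r = w' * x(:, i) - y(i);                 % residual of sample i
        f = r^2 + lambda * (w' * w);             % per-sample cost f_i(w)
        g = 2 * r * x(:, i) + 2 * lambda * w;    % gradient of f_i(w)
    end

Averaging these per-sample costs over $i = 1, \ldots, n$ recovers the full objective $f(w)$ in (1).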
Full gradient descent (a.k.a. steepest descent) with a step-size $\eta$ is the most straight-
forward approach for (1), which updates as $w^{k+1} \leftarrow w^k - \eta \nabla f(w^k)$ at the $k$-th iteration.
However, this is expensive when $n$ is extremely large. In fact, one needs $n$ calculations
of the inner product $w^T x_i$ in the regression problems above, leading to $O(nd)$ cost
overall per iteration. To address this issue, a popular and effective alternative is stochastic gradient
descent (SGD), which updates as $w^{k+1} \leftarrow w^k - \eta \nabla f_i(w^k)$ for the $i$-th sample chosen uniformly at
random (Robbins and Monro, 1951; Bottou, 1998). SGD assumes an unbiased estimator
of the full gradient, i.e., $\mathbb{E}_i[\nabla f_i(w^k)] = \nabla f(w^k)$. As the update rule shows, the calculation
cost is independent of $n$, resulting in $O(d)$ per iteration. Furthermore, mini-batch
SGD (Bottou, 1998) calculates $\frac{1}{|S_k|}\sum_{i \in S_k} \nabla f_i(w^k)$, where $S_k$ is the set of samples of size
$|S_k|$. SGD requires a diminishing step-size to guarantee convergence, which causes
a slow convergence rate (Bottou, 1998). To accelerate this rate, two active
research directions have emerged in ML. The first is variance reduction (VR) techniques (Johnson and Zhang, 2013;
Roux et al., 2012; Shalev-Shwartz and Zhang, 2013; Defazio et al., 2014; Nguyen et al., 2017),
which explicitly or implicitly exploit a full gradient estimate to reduce the variance of the
noisy stochastic gradient, leading to superior convergence properties. The second
direction is to modify deterministic second-order algorithms into stochastic settings, thereby
addressing the poor performance of first-order algorithms on ill-conditioned problems. A direct
extension of quasi-Newton (QN) is known as online BFGS (Schraudolph et al., 2007). Its
variants include a regularized version (RES) (Mokhtari and Ribeiro, 2014), a limited mem-
ory version (oLBFGS) (Schraudolph et al., 2007; Mokhtari and Ribeiro, 2015), a stochastic
QN (SQN) (Byrd et al., 2016), an incremental QN (Mokhtari et al., 2017), and a non-convex
version. Lastly, hybrid algorithms combining SQN with VR have been proposed (Moritz et al., 2016;
Kolte et al., 2015). Other approaches include (Duchi et al., 2011; Bordes et al., 2009).
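For reference, the generic mini-batch SGD update described above can be sketched in a few lines of MATLAB; this is a plain illustration of the algorithm under a fixed step-size, not SGDLibrary code, and the function name minibatch_sgd and its arguments are assumptions:

    % Generic mini-batch SGD sketch with a fixed step-size (for simplicity;
    % convergence analyses typically require a diminishing step-size).
    % grad_fn(w, idx) must return the averaged stochastic gradient over idx.
    function w = minibatch_sgd(grad_fn, w0, n, batch_size, step_size, max_iter)
        w = w0;
        for k = 1:max_iter
            idx = randperm(n, batch_size);        % draw a mini-batch S_k uniformly
            w = w - step_size * grad_fn(w, idx);  % w <- w - eta * (1/|S_k|) * sum_i grad f_i(w)
        end
    end

With batch_size = 1 this reduces to vanilla SGD; grad_fn could, for instance, average the per-sample gradients of the illustrative ridge_component function above.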
The performance of stochastic optimization algorithms is strongly influenced not only
by the distribution of the data but also by the step-size algorithm (Bottou, 1998). Therefore,
experimental results often differ completely from those reported in the original
papers. Consequently, an evaluation framework to test and compare the algorithms at hand
is crucially important for fair and comprehensive experiments. One existing tool is Light-
ning (Blondel and Pedregosa, 2016), which is a Python library for large-scale ML problems.
However, its supported algorithms are limited, and the solvers are tightly coupled with the problems, such as
the classifiers. Moreover, the implementations rely on Cython, a C-extension for Python,
for efficiency. Consequently, the code is harder for users to read,
and evaluations and extensions become more complicated. SGDLibrary
is a readable, flexible and extensible pure-MATLAB library of a collection of stochastic
optimization algorithms. The library is also operable on GNU Octave. The purpose of the
library is to provide researchers and implementers a collection of state-of-the-art stochas-
tic optimization algorithms that solve a variety of large-scale optimization problems such
as linear/non-linear regression problems and classification problems. This also allows re-
searchers and implementers to easily extend or add solvers and problems for further evalu-
ation. To the best of my knowledge, no existing report or library provides a
comprehensive experimental environment specialized for stochastic optimization algorithms.
The code is available at https://github.com/hiroyuki-kasai/SGDLibrary.
2. Software architecture
Problem descriptor: The problem descriptor, denoted as problem, specifies the problem
of interest with respect to the model parameter $w$, represented as w in the library. This is implemented using the MATLAB
classdef mechanism. The user does nothing other than call a problem definition function, for
instance, logistic_regression() for the $\ell_2$-norm regularized LR problem. Each problem
definition includes the functions necessary for the solvers: (i) the (full) cost function $f(w)$, (ii) the
mini-batch stochastic gradient $\frac{1}{|S|}\sum_{i \in S} \nabla f_i(w)$ for a set of samples $S$, (iii) the stochastic
Hessian (Bordes et al., 2009), and (iv) the stochastic Hessian-vector product for a vector $v$.
The built-in problems include, for example, $\ell_2$-norm regularized multidimensional linear
regression, $\ell_2$-norm regularized linear SVM, $\ell_2$-norm regularized LR, $\ell_2$-norm regularized
softmax classification (multinomial LR), $\ell_1$-norm regularized multidimensional linear regression, and
$\ell_1$-norm regularized LR. The problem descriptor also provides additional problem-specific functions. For example,
the LR problem includes prediction and classification-accuracy calculation functions.
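For intuition, the following simplified sketch shows what a classdef-based problem descriptor for ridge regression might expose; the class name, properties, and method signatures are illustrative assumptions and do not reproduce SGDLibrary's actual interface:

    % Hypothetical, simplified problem descriptor (save as ridge_regression_demo.m).
    classdef ridge_regression_demo
        properties
            x       % d-by-n sample matrix
            y       % 1-by-n target vector
            lambda  % regularization parameter
        end
        methods
            function obj = ridge_regression_demo(x, y, lambda)
                obj.x = x; obj.y = y; obj.lambda = lambda;
            end
            function f = cost(obj, w)        % full cost f(w)
                r = w' * obj.x - obj.y;
                f = mean(r.^2) + obj.lambda * (w' * w);
            end
            function g = grad(obj, w, idx)   % mini-batch stochastic gradient
                r = w' * obj.x(:, idx) - obj.y(idx);
                g = 2 * obj.x(:, idx) * r' / numel(idx) + 2 * obj.lambda * w;
            end
        end
    end

A solver only needs handles to functions such as cost and grad; as described above, the actual library additionally exposes stochastic Hessian and Hessian-vector product functions.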
Optimization solver: The optimization solver implements the main routine of the stochas-
tic optimization algorithm. Once a solver function is called with one selected problem de-
scriptor problem as the first argument, it solves the optimization problem by calling some
corresponding functions via problem such as the cost function and the stochastic gradient
calculation function. Examples of the supported optimization solvers in the library are listed
in categorized groups as follows: (i) SGD methods: Vanilla SGD (Robbins and Monro, 1951), SGD
with classical momentum, SGD with classical momentum with Nesterov’s accelerated gra-
dient (Sutskever et al., 2013), AdaGrad (Duchi et al., 2011), RMSProp, AdaDelta, Adam,
and AdaMax, (ii) Variance reduction (VR) methods: SVRG (Johnson and Zhang,
2013), SAG (Roux et al., 2012), SAGA (Defazio et al., 2014), and SARAH (Nguyen et al.,
2017), (iii) Second-order methods: SQN (Bordes et al., 2009), oBFGS-Inf (Schraudolph
et al., 2007; Mokhtari and Ribeiro, 2015), oBFGS-Lim (oLBFGS) (Schraudolph et al., 2007;
Mokhtari and Ribeiro, 2015), Reg-oBFGS-Inf (RES) (Mokhtari and Ribeiro, 2014), and
Damp-oBFGS-Inf, (iv) Second-order methods with VR: SVRG-LBFGS (Kolte et al.,
2015), SS-SVRG (Kolte et al., 2015), and SVRG-SQN (Moritz et al., 2016), and (v) Else:
BB-SGD and SVRG-BB. The solver function also receives optional parameters as the second
argument, which forms a struct, designated as options in the library. It contains elements
such as the maximum number of epochs, the batch size, and the step-size algorithm with
an initial step-size. Finally, the solver function returns to the caller the final solution w and
rich statistical information, such as a record of the cost function values, the optimality gap,
the processing time, and the number of gradient calculations.
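To illustrate this calling convention, a typical workflow might look like the sketch below; the solver name sgd, the arguments passed to logistic_regression, and the option field names (max_epoch, batch_size, step_init) are plausible assumptions based on the description above, not a verified listing of the library's API:

    % Hypothetical end-to-end usage sketch (names and fields are assumptions).
    % x_train, y_train, x_test, y_test: user-provided data; lambda: regularizer weight.
    problem = logistic_regression(x_train, y_train, x_test, y_test, lambda);

    options.max_epoch  = 100;    % maximum number of epochs
    options.batch_size = 10;     % mini-batch size
    options.step_init  = 0.01;   % initial step-size

    [w, info] = sgd(problem, options);  % solver(problem descriptor, options struct)
    % info collects statistics such as cost values, optimality gap, and elapsed time.

Swapping sgd for another supported solver leaves the rest of the script unchanged, since every solver takes the problem descriptor and the options struct in the same way, which makes side-by-side comparisons straightforward.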
[Figure: (a) cost function value, (b) optimality gap, (c) classification result]
References
M. Blondel and F. Pedregosa. Lightning: large-scale linear classification, regression and
ranking in Python, 2016. URL https://doi.org/10.5281/zenodo.200504.
A. Bordes, L. Bottou, and P. Gallinari. SGD-QN: Careful quasi-Newton stochastic gradient
descent. JMLR, 10:1737–1754, 2009.
L. Bottou. Online learning and stochastic approximations. In David Saad, editor, On-Line
Learning in Neural Networks. Cambridge University Press, 1998.
R. H. Byrd, S. L. Hansen, J. Nocedal, and Y. Singer. A stochastic quasi-Newton method
for large-scale optimization. SIAM J. Optim., 26(2), 2016.
A. Defazio, F. Bach, and S. Lacoste-Julien. SAGA: A fast incremental gradient method
with support for non-strongly convex composite objectives. In NIPS, 2014.
J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and
stochastic optimization. JMLR, 12:2121–2159, 2011.
R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance
reduction. In NIPS, 2013.
R. Kolte, M. Erdogdu, and A. Ozgur. Accelerating SVRG via second-order information.
In OPT2015, 2015.
A. Mokhtari and A. Ribeiro. RES: Regularized stochastic BFGS algorithm. IEEE Trans.
on Signal Process., 62(23):6089–6104, 2014.
A. Mokhtari and A. Ribeiro. Global convergence of online limited memory BFGS. JMLR,
16:3151–3181, 2015.
A. Mokhtari, M. Eisen, and A. Ribeiro. An incremental quasi-Newton method with a local
superlinear convergence rate. In ICASSP, 2017.
P. Moritz, R. Nishihara, and M. I. Jordan. A linearly-convergent stochastic L-BFGS algo-
rithm. In AISTATS, 2016.
L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takac. SARAH: A novel method for machine
learning problems using stochastic recursive gradient. In ICML, 2017.
H. Robbins and S. Monro. A stochastic approximation method. Ann. Math. Statistics, 22
(3):400–407, 1951.
N. L. Roux, M. Schmidt, and F. R. Bach. A stochastic gradient method with an exponential
convergence rate for finite training sets. In NIPS, 2012.
N. N. Schraudolph, J. Yu, and S. Gunter. A stochastic quasi-Newton method for online
convex optimization. In AISTATS, 2007.
S. Shalev-Shwartz and T. Zhang. Stochastic dual coordinate ascent methods for regularized
loss minimization. JMLR, 14:567–599, 2013.
I. Sutskever, J. Martens, G. Dahl, and G. Hinton. On the importance of initialization and
momentum in deep learning. In ICML, 2013.