Model-Driven Deep Learning
INFORMATION SCIENCE
Figure 2. Topology of ADMM-Net [7]: given under-sampled k-space data, it outputs the reconstructed MRI image after T stages of processing.
prior knowledge). The model family is a family of functions with a large set of unknown parameters, amounting to the hypothesis space in machine learning. Differently from the accurate model in the model-driven approach, this model family only provides a very rough and broad definition of the solution space. It has the advantage of a model-driven approach but greatly reduces the pressure of accurate modeling.
(2) An algorithm family is then designed for solving the model family, and the convergence theory of the algorithm family is established. The algorithm family refers to the algorithm with unknown parameters for minimizing the model family in the function space. The convergence theory should include the convergence-rate estimation and the constraints on the parameters that assure the convergence of the algorithm family.
(3) The algorithm family is unfolded into a deep network with which parameter learning is performed, as in the deep-learning approach (see the sketch after this list). The depth of the network is determined by the convergence-rate estimation of the algorithm family. The parameter space of the deep network is determined by the parameter constraints. All the parameters of the algorithm family are learnable. In this way, the topology of the deep network is determined by the algorithm family, and the deep network can be trained through back-propagation.
Taking [7] as an example, we apply the above model-driven deep-learning approach to compressive sensing magnetic resonance imaging (CS-MRI), i.e. recovering a high-quality MR image using k-space data sub-sampled below the Nyquist rate. The model family is defined as

\hat{x} = \arg\min_x \frac{1}{2}\,\|Ax - y\|_2^2 + \sum_{l=1}^{L} \lambda_l\, g(D_l x),    (1)

where A = PF is the measurement matrix, P is the sampling matrix, F is the Fourier transform matrix, D_l is the linear transform for convolution, g(·) is the regularization function, λ_l is the regularization parameter and L is the number of linear transforms. All the parameters (D_l, g, λ_l, L) are unknown and reflect the uncertainty in modeling (notice that these parameters are known and fixed in traditional CS-MRI models).
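For concreteness, the objective in Eq. (1) can be evaluated as in the following sketch. This is illustrative only: PyTorch is assumed, A, D and lam are generic stand-ins, and g(·) is taken as the L1 norm purely for illustration.

```python
# Illustrative evaluation of the model-family objective in Eq. (1):
# 0.5 * ||A x - y||_2^2 + sum_l lambda_l * g(D_l x), with g = L1 norm here.
import torch

def model_family_objective(x, y, A, D, lam):
    data_term = 0.5 * torch.sum((A @ x - y) ** 2)
    reg_term = sum(lam[l] * torch.sum(torch.abs(D[l] @ x))
                   for l in range(len(D)))
    return data_term + reg_term

A = torch.randn(32, 64)                      # measurement matrix A = P F
D = [torch.randn(64, 64) for _ in range(3)]  # linear transforms D_l
lam = torch.tensor([0.1, 0.1, 0.1])          # regularization parameters
x, y = torch.randn(64), torch.randn(32)
print(model_family_objective(x, y, A, D, lam))
```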
According to the ADMM (Alternating Direction Method of Multipliers) method, the algorithm family for solving the model family can be designated as

\begin{cases}
x^{(n)} = F^T \Big( P^T P + \sum_l \rho_l F D_l^T D_l F^T \Big)^{-1} \Big[ P^T y + \sum_l \rho_l F D_l^T \big( z_l^{(n-1)} - \beta_l^{(n-1)} \big) \Big], \\
z_l^{(n)} = S\big( D_l x^{(n)} + \beta_l^{(n-1)};\ \lambda_l / \rho_l \big), \\
\beta_l^{(n)} = \beta_l^{(n-1)} + \eta_l \big( D_l x^{(n)} - z_l^{(n)} \big),
\end{cases}    (2)
where S(·) is a non-linear transform relating to g(·). According to the ADMM convergence theory, this algorithm is linearly convergent. By unfolding the algorithm family into a deep network, we design an ADMM-Net composed of T successive stages, as shown in Fig. 2. Each stage consists of a reconstruction layer (R), a convolution layer (C), a non-linear transform layer (Z) and a multiplier update layer (M). We learn the parameters (S, D_l, λ_l, ρ_l, η_l) using a back-propagation algorithm. In [7], we reported the state-of-the-art CS-MRI results using this model-driven deep-learning method.
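As a concrete illustration of Eq. (2), here is a minimal sketch of one such stage. This is not the released code of [7]: plain real matrices P and F stand in for the sampling and Fourier operators, a learnable soft threshold stands in for S(·), and all class and variable names are hypothetical; PyTorch is assumed.

```python
# Minimal sketch of one ADMM-Net stage implementing Eq. (2) (not the code of [7]).
import torch
import torch.nn as nn

class ADMMNetStage(nn.Module):
    def __init__(self, dim, n_filters):
        super().__init__()
        # Learnable parameters of the algorithm family: D_l, lambda_l, rho_l, eta_l.
        self.D = nn.Parameter(0.01 * torch.randn(n_filters, dim, dim))
        self.lam = nn.Parameter(torch.full((n_filters,), 0.01))
        self.rho = nn.Parameter(torch.full((n_filters,), 0.1))
        self.eta = nn.Parameter(torch.full((n_filters,), 1.0))

    def forward(self, y, z, beta, P, F):
        # Reconstruction layer (R): the x-update of Eq. (2) as one linear solve.
        lhs = P.T @ P
        rhs = P.T @ y
        for l in range(self.D.shape[0]):
            FDt = F @ self.D[l].T                      # F D_l^T
            lhs = lhs + self.rho[l] * FDt @ FDt.T      # + rho_l F D_l^T D_l F^T
            rhs = rhs + self.rho[l] * FDt @ (z[l] - beta[l])
        x = F.T @ torch.linalg.solve(lhs, rhs)
        # Convolution (C) and non-linear transform (Z) layers: the z-update,
        # with S(.) taken as soft thresholding at lambda_l / rho_l.
        z_new, beta_new = [], []
        for l in range(self.D.shape[0]):
            u = self.D[l] @ x + beta[l]
            t = self.lam[l] / self.rho[l]
            z_l = torch.sign(u) * torch.clamp(u.abs() - t, min=0.0)
            z_new.append(z_l)
            # Multiplier update layer (M): the beta-update of Eq. (2).
            beta_new.append(beta[l] + self.eta[l] * (self.D[l] @ x - z_l))
        return x, torch.stack(z_new), torch.stack(beta_new)
```

Stacking T such stages (with z and β initialized to zero) and training all per-stage parameters end-to-end by back-propagation yields the structure shown in Fig. 2.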
The above model-driven deep-learning approach obviously retains the advantages (i.e. determinacy and theoretical soundness) of the model-driven approach, and avoids the requirement for accurate modeling. It also retains the powerful learning ability of the deep-learning approach, and overcomes the difficulty of network-topology selection. This makes the deep-learning approach designable and predictable, and it balances versatility and pertinence well in real applications.

We point out that the model-driven approach and the data-driven approach are not opposed to each other. If the model is accurate, it provides the essential description of the problem solutions, from which infinitely many ideal samples can be generated; and vice versa, when sufficient samples are provided, the model of the problem is fully (though in discretized form) represented. This is the essential reason for the effectiveness of the model-driven deep-learning approach.

Please refer to [2,8] for previous investigations of the model-driven deep-learning approach; recent advances can be found in [7,9-11]. Most of these successful applications lie in inverse problems in the imaging sciences, for which there exists domain knowledge that can be well modeled in the model family. We believe that this model-driven deep-learning approach can be widely applied wherever the model family can be designed by incorporating domain knowledge; the deep architecture can then be correspondingly designed following the above procedures.
Zongben Xu* and Jian Sun*
Xi'an International Academy for Mathematics & Mathematical Technology, Xi'an Jiaotong University, China
*Corresponding authors.
E-mails: [email protected]; [email protected]

National Science Review 5: 22-24, 2018
doi: 10.1093/nsr/nwx099
Advance access publication 25 August 2017

REFERENCES
1. LeCun Y, Bengio Y and Hinton G. Nature 2015; 521: 436-44.
2. Gregor K and LeCun Y. ICML 2010.
3. Schroff F, Kalenichenko D and Philbin J. CVPR 2015.
4. Wu Y, Schuster M and Chen Z et al. arXiv:1609.08144, 2016.
5. Silver D, Huang A and Maddison CJ et al. Nature 2016; 529: 484-9.
6. Gulshan V, Peng L and Coram M et al. JAMA 2016; 316: 2402-10.
7. Yang Y, Sun J and Li H et al. NIPS 2016.
8. Sun J and Tappen M. CVPR 2011.
9. Sun J and Tappen M. IEEE T Image Process 2013; 22: 402-8.
10. Sun J, Sun J and Xu Z. IEEE T Image Process 2015; 24: 4148-59.
11. Sprechmann P, Bronstein AM and Sapiro G. IEEE TPAMI 2015; 37: 1821-33.
COMPUTER SCIENCE
INTRODUCTION

Deep learning refers to machine-learning technologies for learning and utilizing 'deep' artificial neural networks, such as deep neural networks (DNN), convolutional neural networks (CNN) and recurrent neural networks (RNN). Recently, deep learning has been successfully applied to natural language processing and significant progress has been made. This paper summarizes the recent advances in deep learning for natural language processing and discusses its advantages and challenges.

We think that there are five major tasks in natural language processing: classification, matching, translation, structured prediction and the sequential decision process. For the first four tasks, it is found that the deep-learning approach has outperformed or significantly outperformed the traditional approaches.

End-to-end training and representation learning are the key features of deep learning that make it a powerful tool for natural language processing. Deep learning is not almighty, however. It might not be sufficient for inference and decision making, which are essential for complex problems like multi-turn dialogue. Furthermore, how to combine symbolic processing and neural processing, and how to deal with the long-tail phenomenon, are also challenges of deep learning for natural language processing.

PROGRESS IN NATURAL LANGUAGE PROCESSING

In our view, there are five major tasks in natural language processing, namely classification, matching, translation, structured prediction and the sequential decision process. Most of the problems in natural language processing can be formalized as these five tasks, as summarized in Table 1. In these tasks, words, phrases, sentences, paragraphs and even documents are usually viewed as sequences of tokens (strings) and treated similarly, although they have different complexities. In fact, sentences are the most widely used processing units.

It has been observed recently that deep learning can enhance performance in the first four tasks and has become the state-of-the-art technology for them (e.g. [1-8]).

Table 2 shows the performance on example problems for which deep learning has surpassed the traditional approaches. Among all the NLP problems, progress in machine translation is particularly remarkable. Neural machine translation, i.e. machine translation using deep learning, has significantly outperformed traditional statistical machine translation. The state-of-the-art neural translation systems employ sequence-to-sequence learning models comprising RNNs [4-6].
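As a rough illustration of this architecture, the following is a bare-bones sequence-to-sequence sketch, not any of the systems in [4-6]: an encoder RNN compresses the source sentence into a state from which a decoder RNN generates the target sentence. PyTorch is assumed, and all sizes and names are illustrative.

```python
# Minimal sequence-to-sequence sketch in the spirit of RNN-based NMT.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hidden=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_tokens, tgt_tokens):
        # Encoder reads the source sentence into a fixed-size state.
        _, state = self.encoder(self.src_emb(src_tokens))
        # Decoder generates the target conditioned on that state
        # (teacher forcing: the gold prefix is fed at training time).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_tokens), state)
        return self.out(dec_out)   # per-position scores over the target vocab

model = Seq2Seq(src_vocab=32000, tgt_vocab=32000)
src = torch.randint(0, 32000, (8, 20))   # batch of source token ids
tgt = torch.randint(0, 32000, (8, 22))   # shifted target token ids
logits = model(src, tgt)                 # shape (8, 22, 32000)
```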
Deep learning has also, for the first time, made certain applications possible. For example, deep learning has been successfully applied to image retrieval (also known as text to image), in which the query and the image are first transformed into vector representations with CNNs, the representations are matched with a DNN, and the relevance of the image to the query is calculated [3].
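A hedged sketch of that matching scheme follows; it is not the actual model of [3]. CNNs produce the two vector representations and a DNN (an MLP here) computes the relevance score; PyTorch is assumed and every name and size is hypothetical.

```python
# Illustrative text-image matcher: CNN encoders + MLP relevance scorer.
import torch
import torch.nn as nn

class TextImageMatcher(nn.Module):
    def __init__(self, vocab=10000, emb=128, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        # 1-D CNN over the query's word embeddings -> query vector
        self.text_cnn = nn.Sequential(
            nn.Conv1d(emb, dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1))
        # 2-D CNN over the image -> image vector
        self.img_cnn = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        # DNN matching the two representations -> relevance score
        self.match = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query_tokens, image):
        q = self.text_cnn(self.embed(query_tokens).transpose(1, 2)).squeeze(-1)
        v = self.img_cnn(image).flatten(1)
        return self.match(torch.cat([q, v], dim=-1))  # higher = more relevant

score = TextImageMatcher()(torch.randint(0, 10000, (4, 12)),
                           torch.randn(4, 3, 64, 64))  # shape (4, 1)
```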
Deep learning is also employed in generation-based natural language dialogue, in which, given an utterance, the system automatically generates a response and the model