Model-Driven Deep Learning
INFORMATION SCIENCE
Figure 2. Topology of ADMM-Net [7]: given under-sampled k-space data, it outputs the reconstructed MRI image after T stages of processing.
prior knowledge). The model family is a family of functions with a large set of unknown parameters, amounting to the hypothesis space in machine learning. Differently from the accurate model in the model-driven approach, this model family only provides a very rough and broad definition of the solution space. It has the advantage of a model-driven approach but greatly reduces the pressure of accurate modeling.
(2) An algorithm family is then designed for solving the model family, and the convergence theory of the algorithm family is established. The algorithm family refers to the algorithm with unknown parameters for minimizing the model family in the function space. The convergence theory should include the convergence-rate estimation and the constraints on the parameters that assure the convergence of the algorithm family.
(3) The algorithm family is unfolded into a deep network with which parameter learning is performed, as in the deep-learning approach (see the sketch after this list). The depth of the network is determined by the convergence-rate estimation of the algorithm family. The parameter space of the deep network is determined by the parameter constraints. All the parameters of the algorithm family are learnable. In this way, the topology of the deep network is determined by the algorithm family, and the deep network can be trained through back-propagation.
Taking [7] as an example, we apply the above model-driven deep-learning approach to compressive sensing magnetic resonance imaging (CS-MRI), i.e. recovering a high-quality MR image using k-space data sub-sampled below the Nyquist rate. The model family is defined as

\hat{x} = \arg\min_x \frac{1}{2}\,\|Ax - y\|_2^2 + \sum_{l=1}^{L} \lambda_l\, g(D_l x),    (1)

where A = PF is the measurement matrix, P is the sampling matrix, F is the Fourier transform matrix, D_l is the linear transform for convolution, g(·) is the regularization function, λ_l is the regularization parameter and L is the number of linear transforms. All the parameters (D_l, g, λ_l, L) are unknown and reflect the uncertainty in modeling (notice that these parameters are known and fixed in traditional CS-MRI models).
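For concreteness, the objective in Eq. (1) can be evaluated as in the following sketch. This is illustrative only: PyTorch is assumed, A, D and lam are generic stand-ins, and g(·) is taken as the L1 norm purely for illustration.

```python
# Illustrative evaluation of the model-family objective in Eq. (1):
# 0.5 * ||A x - y||_2^2 + sum_l lambda_l * g(D_l x), with g = L1 norm here.
import torch

def model_family_objective(x, y, A, D, lam):
    data_term = 0.5 * torch.sum((A @ x - y) ** 2)
    reg_term = sum(lam[l] * torch.sum(torch.abs(D[l] @ x))
                   for l in range(len(D)))
    return data_term + reg_term

A = torch.randn(32, 64)                      # measurement matrix A = P F
D = [torch.randn(64, 64) for _ in range(3)]  # linear transforms D_l
lam = torch.tensor([0.1, 0.1, 0.1])          # regularization parameters
x, y = torch.randn(64), torch.randn(32)
print(model_family_objective(x, y, A, D, lam))
```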
According to the ADMM (Alternating Direction Method of Multipliers) method, the algorithm family for solving the model family can be designated as

\begin{cases}
x^{(n)} = F^T \Big( P^T P + \sum_l \rho_l F D_l^T D_l F^T \Big)^{-1} \Big[ P^T y + \sum_l \rho_l F D_l^T \big( z_l^{(n-1)} - \beta_l^{(n-1)} \big) \Big], \\
z_l^{(n)} = S\big( D_l x^{(n)} + \beta_l^{(n-1)};\ \lambda_l / \rho_l \big), \\
\beta_l^{(n)} = \beta_l^{(n-1)} + \eta_l \big( D_l x^{(n)} - z_l^{(n)} \big),
\end{cases}    (2)
where S(·) is a non-linear transform relating to g(·). According to the ADMM convergence theory, this algorithm is linearly convergent. By unfolding the algorithm family into a deep network, we design an ADMM-Net composed of T successive stages, as shown in Fig. 2. Each stage consists of a reconstruction layer (R), a convolution layer (C), a non-linear transform layer (Z) and a multiplier update layer (M). We learn the parameters (S, D_l, λ_l, ρ_l, η_l) using a back-propagation algorithm. In [7], we reported the state-of-the-art CS-MRI results using this model-driven deep-learning method.
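As a concrete illustration of Eq. (2), here is a minimal sketch of one such stage. This is not the released code of [7]: plain real matrices P and F stand in for the sampling and Fourier operators, a learnable soft threshold stands in for S(·), and all class and variable names are hypothetical; PyTorch is assumed.

```python
# Minimal sketch of one ADMM-Net stage implementing Eq. (2) (not the code of [7]).
import torch
import torch.nn as nn

class ADMMNetStage(nn.Module):
    def __init__(self, dim, n_filters):
        super().__init__()
        # Learnable parameters of the algorithm family: D_l, lambda_l, rho_l, eta_l.
        self.D = nn.Parameter(0.01 * torch.randn(n_filters, dim, dim))
        self.lam = nn.Parameter(torch.full((n_filters,), 0.01))
        self.rho = nn.Parameter(torch.full((n_filters,), 0.1))
        self.eta = nn.Parameter(torch.full((n_filters,), 1.0))

    def forward(self, y, z, beta, P, F):
        # Reconstruction layer (R): the x-update of Eq. (2) as one linear solve.
        lhs = P.T @ P
        rhs = P.T @ y
        for l in range(self.D.shape[0]):
            FDt = F @ self.D[l].T                      # F D_l^T
            lhs = lhs + self.rho[l] * FDt @ FDt.T      # + rho_l F D_l^T D_l F^T
            rhs = rhs + self.rho[l] * FDt @ (z[l] - beta[l])
        x = F.T @ torch.linalg.solve(lhs, rhs)
        # Convolution (C) and non-linear transform (Z) layers: the z-update,
        # with S(.) taken as soft thresholding at lambda_l / rho_l.
        z_new, beta_new = [], []
        for l in range(self.D.shape[0]):
            u = self.D[l] @ x + beta[l]
            t = self.lam[l] / self.rho[l]
            z_l = torch.sign(u) * torch.clamp(u.abs() - t, min=0.0)
            z_new.append(z_l)
            # Multiplier update layer (M): the beta-update of Eq. (2).
            beta_new.append(beta[l] + self.eta[l] * (self.D[l] @ x - z_l))
        return x, torch.stack(z_new), torch.stack(beta_new)
```

Stacking T such stages (with z and β initialized to zero) and training all per-stage parameters end-to-end by back-propagation yields the structure shown in Fig. 2.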
The above model-driven deep-learning approach obviously retains the advantages (i.e. determinacy and theoretical soundness) of the model-driven approach, and avoids the requirement for accurate modeling. It also retains the powerful learning ability of the deep-learning approach, and overcomes the difficulty of network-topology selection. This makes the deep-learning approach designable and predictable, and it balances versatility and pertinence well in real applications.

We point out that the model-driven approach and the data-driven approach are not opposed to each other. If the model is accurate, it provides the essential description of the problem solutions, from which infinitely many ideal samples can be generated; and vice versa, when sufficient samples are provided, the model of the problem is fully (though in discretized form) represented. This is the essential reason for the effectiveness of the model-driven deep-learning approach.

Please refer to [2,8] for previous investigations of the model-driven deep-learning approach; recent advances can be found in [7,9-11]. Most of these successful applications lie in inverse problems in the imaging sciences, for which there exists domain knowledge that can be well modeled in the model family. We believe that this model-driven deep-learning approach can be widely applied wherever the model family can be designed by incorporating domain knowledge; the deep architecture can then be correspondingly designed following the above procedures.
Zongben Xu* and Jian Sun*
Xi'an International Academy for Mathematics & Mathematical Technology, Xi'an Jiaotong University, China
*Corresponding authors.
E-mails: [email protected]; [email protected]

National Science Review 5: 22-24, 2018
doi: 10.1093/nsr/nwx099
Advance access publication 25 August 2017

REFERENCES
1. LeCun Y, Bengio Y and Hinton G. Nature 2015; 521: 436-44.
2. Gregor K and LeCun Y. ICML 2010.
3. Schroff F, Kalenichenko D and Philbin J. CVPR 2015.
4. Wu Y, Schuster M and Chen Z et al. arXiv:1609.08144, 2016.
5. Silver D, Huang A and Maddison CJ et al. Nature 2016; 529: 484-9.
6. Gulshan V, Peng L and Coram M et al. JAMA 2016; 316: 2402-10.
7. Yang Y, Sun J and Li H et al. NIPS 2016.
8. Sun J and Tappen M. CVPR 2011.
9. Sun J and Tappen M. IEEE T Image Process 2013; 22: 402-8.
10. Sun J, Sun J and Xu Z. IEEE T Image Process 2015; 24: 4148-59.
11. Sprechmann P, Bronstein AM and Sapiro G. IEEE TPAMI 2015; 37: 1821-33.
COMPUTER SCIENCE
INTRODUCTION

Deep learning refers to machine-learning technologies for learning and utilizing 'deep' artificial neural networks, such as deep neural networks (DNN), convolutional neural networks (CNN) and recurrent neural networks (RNN). Recently, deep learning has been successfully applied to natural language processing and significant progress has been made. This paper summarizes the recent advances in deep learning for natural language processing and discusses its advantages and challenges.

We think that there are five major tasks in natural language processing: classification, matching, translation, structured prediction and the sequential decision process. For the first four tasks, it is found that the deep-learning approach has outperformed or significantly outperformed the traditional approaches.

End-to-end training and representation learning are the key features of deep learning that make it a powerful tool for natural language processing. Deep learning is not almighty, however. It might not be sufficient for inference and decision making, which are essential for complex problems like multi-turn dialogue. Furthermore, how to combine symbolic processing and neural processing, and how to deal with the long-tail phenomenon, are also challenges of deep learning for natural language processing.

PROGRESS IN NATURAL LANGUAGE PROCESSING

In our view, there are five major tasks in natural language processing, namely classification, matching, translation, structured prediction and the sequential decision process. Most of the problems in natural language processing can be formalized as these five tasks, as summarized in Table 1. In these tasks, words, phrases, sentences, paragraphs and even documents are usually viewed as sequences of tokens (strings) and treated similarly, although they have different complexities. In fact, sentences are the most widely used processing units.

It has been observed recently that deep learning can enhance performance in the first four tasks and has become the state-of-the-art technology for them (e.g. [1-8]).

Table 2 shows the performance on example problems for which deep learning has surpassed the traditional approaches. Among all the NLP problems, progress in machine translation is particularly remarkable. Neural machine translation, i.e. machine translation using deep learning, has significantly outperformed traditional statistical machine translation. The state-of-the-art neural translation systems employ sequence-to-sequence learning models comprising RNNs [4-6].
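As a rough illustration of this architecture, the following is a bare-bones sequence-to-sequence sketch, not any of the systems in [4-6]: an encoder RNN compresses the source sentence into a state from which a decoder RNN generates the target sentence. PyTorch is assumed, and all sizes and names are illustrative.

```python
# Minimal sequence-to-sequence sketch in the spirit of RNN-based NMT.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hidden=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_tokens, tgt_tokens):
        # Encoder reads the source sentence into a fixed-size state.
        _, state = self.encoder(self.src_emb(src_tokens))
        # Decoder generates the target conditioned on that state
        # (teacher forcing: the gold prefix is fed at training time).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_tokens), state)
        return self.out(dec_out)   # per-position scores over the target vocab

model = Seq2Seq(src_vocab=32000, tgt_vocab=32000)
src = torch.randint(0, 32000, (8, 20))   # batch of source token ids
tgt = torch.randint(0, 32000, (8, 22))   # shifted target token ids
logits = model(src, tgt)                 # shape (8, 22, 32000)
```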
Deep learning has also, for the first time, made certain applications possible. For example, deep learning has been successfully applied to image retrieval (also known as text to image), in which the query and the image are first transformed into vector representations with CNNs, the representations are matched with a DNN, and the relevance of the image to the query is calculated [3].
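A hedged sketch of that matching scheme follows; it is not the actual model of [3]. CNNs produce the two vector representations and a DNN (an MLP here) computes the relevance score; PyTorch is assumed and every name and size is hypothetical.

```python
# Illustrative text-image matcher: CNN encoders + MLP relevance scorer.
import torch
import torch.nn as nn

class TextImageMatcher(nn.Module):
    def __init__(self, vocab=10000, emb=128, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        # 1-D CNN over the query's word embeddings -> query vector
        self.text_cnn = nn.Sequential(
            nn.Conv1d(emb, dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1))
        # 2-D CNN over the image -> image vector
        self.img_cnn = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        # DNN matching the two representations -> relevance score
        self.match = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query_tokens, image):
        q = self.text_cnn(self.embed(query_tokens).transpose(1, 2)).squeeze(-1)
        v = self.img_cnn(image).flatten(1)
        return self.match(torch.cat([q, v], dim=-1))  # higher = more relevant

score = TextImageMatcher()(torch.randint(0, 10000, (4, 12)),
                           torch.randn(4, 3, 64, 64))  # shape (4, 1)
```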
Deep learning is also employed in generation-based natural language dialogue, in which, given an utterance, the system automatically generates a response and the model