Deep Learning
Ian Goodfellow, Yoshua Bengio, Aaron Courville

Contents

Website
Acknowledgments
Notation

1 Introduction
   1.1 Who Should Read This Book?
   1.2 Historical Trends in Deep Learning

I Applied Math and Machine Learning Basics

2 Linear Algebra
   2.1 Scalars, Vectors, Matrices and Tensors
   2.2 Multiplying Matrices and Vectors
   2.3 Identity and Inverse Matrices
   2.4 Linear Dependence and Span
   2.5 Norms
   2.6 Special Kinds of Matrices and Vectors
   2.7 Eigendecomposition
   2.8 Singular Value Decomposition
   2.9 The Moore-Penrose Pseudoinverse
   2.10 The Trace Operator
   2.11 The Determinant
   2.12 Example: Principal Components Analysis

3 Probability and Information Theory
   3.1 Why Probability?
   3.2 Random Variables
   3.3 Probability Distributions
   3.4 Marginal Probability
   3.5 Conditional Probability
   3.6 The Chain Rule of Conditional Probabilities
   3.7 Independence and Conditional Independence
   3.8 Expectation, Variance and Covariance
   3.9 Common Probability Distributions
   3.10 Useful Properties of Common Functions
   3.11 Bayes' Rule
   3.12 Technical Details of Continuous Variables
   3.13 Information Theory
   3.14 Structured Probabilistic Models

4 Numerical Computation
   4.1 Overflow and Underflow
   4.2 Poor Conditioning
   4.3 Gradient-Based Optimization
   4.4 Constrained Optimization
   4.5 Example: Linear Least Squares

5 Machine Learning Basics
   5.1 Learning Algorithms
   5.2 Capacity, Overfitting and Underfitting
   5.3 Hyperparameters and Validation Sets
   5.4 Estimators, Bias and Variance
   5.5 Maximum Likelihood Estimation
   5.6 Bayesian Statistics
   5.7 Supervised Learning Algorithms
   5.8 Unsupervised Learning Algorithms
   5.9 Stochastic Gradient Descent
   5.10 Building a Machine Learning Algorithm
   5.11 Challenges Motivating Deep Learning

II Deep Networks: Modern Practices

6 Deep Feedforward Networks
   6.1 Example: Learning XOR
   6.2 Gradient-Based Learning
   6.3 Hidden Units
   6.4 Architecture Design
   6.5 Back-Propagation and Other Differentiation Algorithms
   6.6 Historical Notes

7 Regularization for Deep Learning
   7.1 Parameter Norm Penalties
   7.2 Norm Penalties as Constrained Optimization
   7.3 Regularization and Under-Constrained Problems
   7.4 Dataset Augmentation
   7.5 Noise Robustness
   7.6 Semi-Supervised Learning
   7.7 Multitask Learning
   7.8 Early Stopping
   7.9 Parameter Tying and Parameter Sharing
   7.10 Sparse Representations
   7.11 Bagging and Other Ensemble Methods
   7.12 Dropout
   7.13 Adversarial Training
   7.14 Tangent Distance, Tangent Prop and Manifold Tangent Classifier

8 Optimization for Training Deep Models
   8.1 How Learning Differs from Pure Optimization
   8.2 Challenges in Neural Network Optimization
   8.3 Basic Algorithms
   8.4 Parameter Initialization Strategies
   8.5 Algorithms with Adaptive Learning Rates
   8.6 Approximate Second-Order Methods
   8.7 Optimization Strategies and Meta-Algorithms

9 Convolutional Networks
   9.1 The Convolution Operation
   9.2 Motivation
   9.3 Pooling
   9.4 Convolution and Pooling as an Infinitely Strong Prior
   9.5 Variants of the Basic Convolution Function
   9.6 Structured Outputs
   9.7 Data Types
   9.8 Efficient Convolution Algorithms
   9.9 Random or Unsupervised Features
   9.10 The Neuroscientific Basis for Convolutional Networks
   9.11 Convolutional Networks and the History of Deep Learning

10 Sequence Modeling: Recurrent and Recursive Nets
   10.1 Unfolding Computational Graphs
   10.2 Recurrent Neural Networks
   10.3 Bidirectional RNNs
   10.4 Encoder-Decoder Sequence-to-Sequence Architectures
   10.5 Deep Recurrent Networks
   10.6 Recursive Neural Networks
   10.7 The Challenge of Long-Term Dependencies
   10.8 Echo State Networks
   10.9 Leaky Units and Other Strategies for Multiple Time Scales
   10.10 The Long Short-Term Memory and Other Gated RNNs
   10.11 Optimization for Long-Term Dependencies
   10.12 Explicit Memory

11 Practical Methodology
   11.1 Performance Metrics
   11.2 Default Baseline Models
   11.3 Determining Whether to Gather More Data
   11.4 Selecting Hyperparameters
   11.5 Debugging Strategies
   11.6 Example: Multi-Digit Number Recognition

12 Applications
   12.1 Large-Scale Deep Learning
   12.2 Computer Vision
   12.3 Speech Recognition
   12.4 Natural Language Processing
   12.5 Other Applications

III Deep Learning Research

13 Linear Factor Models
   13.1 Probabilistic PCA and Factor Analysis
   13.2 Independent Component Analysis (ICA)
   13.3 Slow Feature Analysis
   13.4 Sparse Coding
   13.5 Manifold Interpretation of PCA

14 Autoencoders
   14.1 Undercomplete Autoencoders
   14.2 Regularized Autoencoders
   14.3 Representational Power, Layer Size and Depth
   14.4 Stochastic Encoders and Decoders
   14.5 Denoising Autoencoders
   14.6 Learning Manifolds with Autoencoders
   14.7 Contractive Autoencoders
   14.8 Predictive Sparse Decomposition
   14.9 Applications of Autoencoders

15 Representation Learning
   15.1 Greedy Layer-Wise Unsupervised Pretraining
   15.2 Transfer Learning and Domain Adaptation
   15.3 Semi-Supervised Disentangling of Causal Factors
   15.4 Distributed Representation
   15.5 Exponential Gains from Depth
   15.6 Providing Clues to Discover Underlying Causes

16 Structured Probabilistic Models for Deep Learning
   16.1 The Challenge of Unstructured Modeling
   16.2 Using Graphs to Describe Model Structure
   16.3 Sampling from Graphical Models
   16.4 Advantages of Structured Modeling
   16.5 Learning about Dependencies
   16.6 Inference and Approximate Inference
   16.7 The Deep Learning Approach to Structured Probabilistic Models

17 Monte Carlo Methods
   17.1 Sampling and Monte Carlo Methods
   17.2 Importance Sampling
   17.3 Markov Chain Monte Carlo Methods
   17.4 Gibbs Sampling
   17.5 The Challenge of Mixing between Separated Modes

18 Confronting the Partition Function
   18.1 The Log-Likelihood Gradient
   18.2 Stochastic Maximum Likelihood and Contrastive Divergence
   18.3 Pseudolikelihood
   18.4 Score Matching and Ratio Matching
   18.5 Denoising Score Matching
   18.6 Noise-Contrastive Estimation
   18.7 Estimating the Partition Function

19 Approximate Inference
   19.1 Inference as Optimization
   19.2 Expectation Maximization
   19.3 MAP Inference and Sparse Coding
   19.4 Variational Inference and Learning
   19.5 Learned Approximate Inference

20 Deep Generative Models
   20.1 Boltzmann Machines
   20.2 Restricted Boltzmann Machines
   20.3 Deep Belief Networks
   20.4 Deep Boltzmann Machines
   20.5 Boltzmann Machines for Real-Valued Data
   20.6 Convolutional Boltzmann Machines
   20.7 Boltzmann Machines for Structured or Sequential Outputs
   20.8 Other Boltzmann Machines
   20.9 Back-Propagation through Random Operations
   20.10 Directed Generative Nets
   20.11 Drawing Samples from Autoencoders
   20.12 Generative Stochastic Networks
   20.13 Other Generation Schemes
   20.14 Evaluating Generative Models
   20.15 Conclusion

Bibliography
Index

Website
www.deeplearningbook.org
This book is accompanied by the above website. The website provides a variety of supplementary material, including exercises, lecture slides, corrections of mistakes, and other resources that should be useful to both readers and instructors.
