WHAT IS DEEP LEARNING?
Deep learning is a subset of machine learning, which is a subset of artificial
intelligence. Deep learning algorithms attempt to draw similar conclusions
as humans would by continually analyzing data with a given logical
structure. To achieve this, deep learning uses a multi-layered structure of
algorithms called neural networks.
What Is Deep Learning?
Deep learning is a subset of machine learning, which is a subset of artificial
intelligence. Artificial intelligence is a general term that refers to techniques that
enable computers to mimic human behavior. Machine learning represents a set of
algorithms trained on data that make all of this possible. Deep learning is just a type of
machine learning, inspired by the structure of the human brain.
Artificial intelligence vs. machine learning vs. deep learning
Deep learning algorithms attempt to draw similar conclusions as humans would by
continually analyzing data with a given logical structure. To achieve this, deep learning
uses a multi-layered structure of algorithms called neural networks.
A typical neural network
The design of the neural network is based on the structure of the human brain. Just as
we use our brains to identify patterns and classify different types of information, we
can teach neural networks to perform the same tasks on data.
Neural networks enable us to perform many tasks, such as clustering, classification or regression.
With neural networks, we can group or sort unlabeled data according to similarities among samples in the data. Or, in the case of classification, we can train the network on a labeled data set in order to classify the samples in the data set into different categories.
In general, neural networks can perform the same tasks as classical machine learning algorithms (but classical machine learning algorithms cannot perform the same tasks as neural networks). In other words, artificial neural networks have unique capabilities that enable deep learning models to solve tasks that machine learning models can never solve.
All the recent advances in artificial intelligence are due to deep learning. Without deep learning, we would not have self-driving cars, chatbots or personal assistants like Alexa and Siri. Google Translate would still be as primitive as it was 10 years ago, before Google switched to neural networks, and Netflix would have no idea which movies to suggest. Neural networks are behind all these technologies.
A new industrial revolution is taking place, driven by artificial neural networks and deep learning. At the end of the day, deep learning is the best and most obvious approach to real machine intelligence we've ever had.
Why Is Deep Learning Popular?
Long before we began using deep learning, we relied on traditional machine learning
methods including decision trees, SVM, naive Bayes classifier and logistic
regression. These algorithms are also called flat algorithms. “Flat” here refers to the
fact that these algorithms cannot normally be applied directly to the raw data (such as .csv,
images, text, etc.). We need a preprocessing step called feature extraction.
The result of feature extraction is a representation of the given raw data that these
classic machine learning algorithms can use to perform a task. For example, we can
now classify the data into several categories or classes. Feature extraction is usually
quite complex and requires detailed knowledge of the problem domain. This
preprocessing layer must be adapted, tested and refined over several iterations for
optimal results.
Deep learning's artificial neural networks don't need the feature extraction step. The
layers are able to learn an implicit representation of the raw data directly and on their
own.
Here's how it works: A more and more abstract and compressed representation of the
raw data is produced over several layers of an artificial neural net. We then use this
compressed representation of the input data to produce the result. The result can be,
for example, the classification of the input data into different classes.
Machine learning: Input → Feature extraction → Classification → Output
Deep learning: Input → (feature extraction and classification learned inside the network) → Output
During the training process, the neural network optimizes this step to obtain the best
possible abstract representation of the input data. This means that deep learning
models require little to no manual effort to perform and optimize the feature
extraction process.
Let's look at a concrete example. If you want to use a machine learning model to determine whether a particular image is showing a car or not, we humans first need to identify the unique features of a car (shape, size, windows, wheels, etc.), then extract those features and give them to the algorithm as input data. In this way, the algorithm would
perform a classification of the images. That is, in machine learning, a programmer must
intervene directly in the action for the model to come to a conclusion.
In the case of a deep learning model, the feature extraction step is completely
unnecessary. The model would recognize these unique characteristics of a car and
make correct predictions without human intervention.
In fact, refraining from extracting the characteristics of data applies to every other task
you'll ever do with neural networks. Simply give the raw data to the neural network and
the model will do the rest.
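To make the contrast concrete, here is a minimal Python sketch. The feature extractor, the data shapes and the classifier choices (scikit-learn's SVC and MLPClassifier) are illustrative assumptions, not the article's own pipeline; the point is only that the classical model needs a hand-crafted feature step, while the neural network consumes the raw pixels directly.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Placeholder "raw" data: 100 tiny grayscale images and car / not-car labels.
rng = np.random.default_rng(0)
images = rng.random((100, 32, 32))
labels = rng.integers(0, 2, 100)

# --- Classical machine learning: manual feature extraction comes first ---
def extract_features(imgs):
    # Hand-crafted (and here purely illustrative) features: mean brightness
    # and average horizontal edge strength per image.
    brightness = imgs.mean(axis=(1, 2))
    edges = np.abs(np.diff(imgs, axis=2)).mean(axis=(1, 2))
    return np.stack([brightness, edges], axis=1)

svm = SVC()
svm.fit(extract_features(images), labels)   # the SVM never sees raw pixels

# --- Deep learning: the network is fed the raw pixels directly ---
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=200)
mlp.fit(images.reshape(100, -1), labels)    # feature learning happens inside the layers
```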
THE ERA OF BIG DATA
The second huge advantage of deep learning, and a key part of understanding why it's becoming so popular, is that it's powered by massive amounts of data. The era of big data will provide huge opportunities for new innovations in deep learning. But don't take my word for it. Andrew Ng, the former chief scientist of China's major search engine Baidu, co-founder of Coursera and one of the leaders of the Google Brain Project, puts it this way:
"The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms."
Performance vs. amount of data for large neural networks, small neural networks and traditional machine learning: deep learning algorithms improve with increasing amounts of data
Deep learning models tend to increase their accuracy with the increasing amount of
training data, whereas traditional machine learning models such as SVM and naive
Bayes classifier stop improving after a saturation point.
How Do Neural Networks Work?
BIOLOGICAL NEURAL NETWORKS
Artificial neural networks are inspired by the biological neurons found in our brains. In fact, artificial neural networks simulate some basic functionalities of biological neural networks.
A model of a biological neural network
A typical neuron consists of a cell body, dendrites and an axon. Dendrites are thin
structures that emerge from the cell body. An axon is a cellular extension that emerges
from this cell body. Most neurons receive signals through the dendrites and send out
signals along the axon.
At the majority of synapses, signals cross from the axon of one neuron to the dendrite
of another. All neurons are electrically excitable due to the maintenance of voltage
gradients in their membranes. If the voltage changes by a large enough amount over a
short interval, the neuron generates an electrochemical pulse called an action
potential. This potential travels rapidly along the axon and activates synaptic
connections.
ARTIFICIAL NEURAL NETWORKS
Now that we have a basic understanding of how biological neural networks function, let's take a look at the architecture of an artificial neural network.
An artificial feedforward neural network
A neuron is simply a graphical representation of a numeric value (e.g. 1.2, 5.0, 42.6,
0.25, etc.). Any connection between two artificial neurons can be considered an axon
in a biological brain. The connections between the neurons are realized by so-called
weights, which are also nothing more than numerical values.
When an artificial neural network learns, the weights between neurons change, as does
the strength of the connection. Well, what does that mean? Given training data and a particular task, such as the classification of numbers, we are looking for a certain set of weights
that allow the neural network to perform the classification.
The set of weights is different for every task and every data set. We cannot predict the values of these weights in advance, so the neural network has to learn them. The process of learning is what we call training.
Typical Neural Network Architecture
The typical neural network architecture consists of several layers; we call the first one
the input layer.
The input layer receives the input x (i.e. the data from which the neural network learns). In
our previous example of classifying handwritten numbers, these inputs x would
represent the images of these numbers (x is basically an entire vector where each
entry is a pixel).
The input layer has the same number of neurons as there are entries in the vector x. In
other words, each input neuron represents one element in the vector.
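As a concrete illustration, assuming 28 x 28 pixel images of handwritten numbers (a size the article does not specify), the input vector would have 784 entries and the input layer 784 neurons. A tiny NumPy sketch:

```python
import numpy as np

# Assumed image size; the article does not fix one.
image = np.random.rand(28, 28)   # one handwritten-number image as raw pixels

x = image.flatten()              # the input vector x
input_layer_size = x.shape[0]    # one input neuron per entry of x
print(input_layer_size)          # 784
```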
Structure of a feedforward neural network
The last layer is called the output layer, which outputs a vector y representing the
neural network's result. The entries in this vector represent the values of the neurons
in the output layer. In our classification, each neuron in the last layer represents a
different class.
In this case, the value of an output neuron gives the probability that the handwritten number in the input image belongs to that class. Now let's take a look at what the connections between the layers look like.
Layer Connections in a Neural Network
Please consider a smaller neural network that consists of only two layers. The input
layer has two input neurons, while the output layer consists of three neurons.
Layer connections
As mentioned earlier, each connection between two neurons is represented by a
numerical value, which we call the weight.
All weights between two neural network layers can be represented by a matrix called
the weight matrix.
W = \begin{pmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{pmatrix}

A weight matrix
A weight matrix has the same number of entries as there are connections between
neurons. The dimensions of a weight matrix result from the sizes of the two layers that
are connected by this weight matrix.
The number of rows corresponds to the number of neurons in the layer from which the
connections originate and the number of columns corresponds to the number of
neurons in the layer to which the connections lead.
In this particular example, the number of rows of the weight matrix corresponds to the
size of the input layer, which is two, and the number of columns to the size of the
output layer, which is three.
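A quick NumPy sketch of this sizing rule, assuming the same two-neuron input layer and three-neuron output layer (the random values stand in for learned weights):

```python
import numpy as np

input_size, output_size = 2, 3

# Rows follow the layer the connections come from, columns the layer they lead to.
W = np.random.rand(input_size, output_size)
print(W.shape)   # (2, 3): one entry per connection between the two layers
```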
A Neural Network's Learning Process
FORWARD PROPAGATION
We call this step forward propagation. With the input vector x and the weight
matrix W connecting the two neuron layers, we compute the dot product between the
vector x and the matrix W.
The result of this dot product is another vector, which we call z:
x = (x_1, x_2), \qquad W = \begin{pmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \end{pmatrix}

z = x \cdot W = (x_1 W_{11} + x_2 W_{21},\; x_1 W_{12} + x_2 W_{22},\; x_1 W_{13} + x_2 W_{23})

h = \sigma(z)

Equations for forward propagation
We obtain the final prediction vector h by applying a so-called activation function to
the vector z. In this case, the activation function is represented by the letter sigma.
An activation function is simply a nonlinear function that performs a mapping
from z to h.
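Here is a minimal NumPy sketch of this forward-propagation step, using the sigmoid as the activation function sigma (the input values and weights are made up):

```python
import numpy as np

def sigmoid(z):
    # One common choice for the nonlinear activation function sigma.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])              # input vector with two entries
W = np.random.rand(2, 3)               # weight matrix between the two layers

z = x @ W                              # dot product of x and W
h = sigmoid(z)                         # prediction vector h = sigma(z)
print(z.shape, h.shape)                # (3,) (3,)
```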
There are three activation functions we commonly use in deep learning: tanh, sigmoid and ReLU.
These numerical values are the weights that tell us how strongly these neurons are connected with each other.
During training, these weights adjust; some neurons become more connected while
some neurons become less connected. As in a biological neural network, learning
means weight alteration. Accordingly, the values of z, h and the final output vector y
are changing with the weights. Some weights make the predictions of a neural network
closer to the actual ground truth vector y_hat ; other weights increase the distance to
the ground truth vector.
Now that we know what the mathematical calculations between two neural network
layers look like, we can extend our knowledge to a deeper architecture that consists of
five layers.
As before, we calculate the dot product between the input x and the first weight matrix W1, and apply an activation function to the resulting vector to obtain the first hidden vector h1. We now consider h1 the input for the upcoming third layer. We repeat the whole procedure from before until we obtain the final output y:

h_2 = \sigma(h_1 \cdot W_2)

h_3 = \sigma(h_2 \cdot W_3)

y = \sigma(h_3 \cdot W_4)

Equations for forward propagation
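A sketch of this repeated procedure for a deeper architecture, chaining one forward-propagation step per weight matrix (the layer sizes and the choice of sigmoid for sigma are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Four weight matrices connect the five layers (sizes are illustrative).
layer_sizes = [4, 8, 8, 8, 3]
weights = [np.random.rand(m, n) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

h = np.random.rand(layer_sizes[0])   # the input vector x
for W in weights:
    h = sigmoid(h @ W)               # h1 = sigma(x . W1), h2 = sigma(h1 . W2), ...

y = h                                # final output vector of the network
print(y.shape)                       # (3,)
```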
Loss Functions
After we get the prediction of the neural network, we must compare this prediction
vector to the actual ground truth label. We call the ground truth label vector y_hat.
While the vector y contains predictions that the neural network has computed during
the forward propagation (which may, in fact, be very different from the actual values),
the vector y_hat contains the actual values.
Mathematically, we can measure the difference between y and y_hat by defining a loss
function, whose value depends on this difference.
An example of a general loss function is the quadratic loss:
L(y) = \frac{1}{2}\,(\hat{y} - y)^2

Quadratic loss
The value of this loss function depends on the difference between y_hat and y.
Minimizing the loss function automatically causes the neural network model to make
better predictions regardless of the exact characteristics of the task at hand. You only
have to select the right loss function for the task.
Fortunately, there are only two loss functions that you should know about to solve
almost any problem that you encounter in practice: the cross-entropy loss and the
mean squared error (MSE) loss.
THE CROSS-ENTROPY LOSS
L(y) = -\sum_{i} \hat{y}_i \,\log(y_i)

where the y_i are the entries in the prediction vector y and the \hat{y}_i are the entries in the ground truth label vector \hat{y}.

Cross-entropy loss function
MEAN SQUARED ERROR LOSS
L(y) = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2

where the y_i are the entries in the prediction vector y and the \hat{y}_i are the entries in the ground truth label vector \hat{y}.

Mean squared error loss function
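Both losses are short one-liners in NumPy. A minimal sketch, keeping the article's convention that y is the prediction vector and y_hat is the ground truth label (the example vectors are made up):

```python
import numpy as np

def cross_entropy(y, y_hat):
    # L(y) = -sum_i y_hat_i * log(y_i)
    return -np.sum(y_hat * np.log(y))

def mean_squared_error(y, y_hat):
    # L(y) = (1/n) * sum_i (y_hat_i - y_i)^2
    return np.mean((y_hat - y) ** 2)

y = np.array([0.7, 0.2, 0.1])      # predicted class probabilities
y_hat = np.array([1.0, 0.0, 0.0])  # one-hot ground truth label

print(cross_entropy(y, y_hat))       # ~0.357
print(mean_squared_error(y, y_hat))  # ~0.047
```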
Since the loss depends on the weight, we must find a certain set of weights for which
the value of the loss function is as small as possible. Minimizing the loss function is achieved mathematically by a method called gradient descent.
Gradient Descent
During training, we use gradient descent to improve the weights of a neural network.
To understand the basic concept of the gradient descent process, let's consider a basic
example of a neural network consisting of only one input and one output neuron
connected by a weight value w.
A simple neural network: one input neuron connected to one output neuron by the weight w1
This neural network receives an input x and outputs a prediction y. Let's say the initial weight value of this neural network is 5 and the input x is 2. Therefore, the prediction y of this network has a value of 10, while the label y_hat might have a value of 6.
w_1 = 5, \quad x_1 = 2 \;\Rightarrow\; y = x_1 \cdot w_1 = 10, \qquad \hat{y} = 6

Parameters and predictions
This means that the prediction is not accurate and we must use the gradient descent
method to find a new weight value that causes the neural network to make the correct
prediction. In the first step, we must choose a loss function for the task.
Let's take the quadratic loss that I defined above and plot this function, which is
basically just a quadratic function:
Quadratic loss function
The y-axis is the loss value, which depends on the difference between the label and the prediction, and thus the network parameters, in this case the one weight w. The x-axis represents the values for this weight.
As you can see, there is a certain weight w for which the loss function reaches a global
minimum. This value is the optimal weight parameter that would cause the neural
network to make the correct prediction (which is 6). In this case, the value for the
optimal weight is 3.
Initial weight value
On the other hand, our initial weight is 5, which leads to a fairly high loss. The goal
now is to repeatedly update the weight parameter until we reach the optimal value for
that particular weight. This is the time when we need to use the gradient of the loss
function.
Fortunately, in this case, the loss function is a function of one single variable, the
weight w:
L(w_1) = \frac{1}{2}\,(\hat{y} - y)^2 = \frac{1}{2}\,(6 - w_1 \cdot x_1)^2 = \frac{1}{2}\,(6 - 2 w_1)^2

Loss function
In the next step, we calculate the derivative of the loss function with respect to this
parameter:
\frac{dL(w_1)}{dw_1} = -x_1\,(\hat{y} - x_1 \cdot w_1) = -2\,(6 - 2 w_1)

Derivative of the loss function

At the initial weight w_1 = 5, this derivative evaluates to -2(6 - 10) = 8, and we can draw the corresponding tangent to the loss curve. This tangent points toward the highest rate of increase of the loss function and the
corresponding weight parameters on the x-axis.
This means that we have just used the gradient of the loss function to find out which
weight parameters would result in an even higher loss value. What we really want to
know is the exact opposite. We can get what we want if we multiply the gradient by -1
and, in this way, obtain the opposite direction of the gradient.
This is how we get the direction of the loss function's highest rate of decrease and the corresponding parameters on the x-axis that cause this decrease:
Plot of the loss function L(w_1) with the negative gradient pointing toward the minimum
Finally, we perform one gradient descent step as an attempt to improve our weights.
We use this negative gradient to update our current weight in the direction of the weights for which the value of the loss function decreases, according to the negative gradient:

w_{1,\text{new}} = w_{1,\text{old}} - 0.1 \cdot 8 = 4.2

Gradient descent step
The factor epsilon in this equation is a hyper-parameter called the learning rate. The
learning rate determines how quickly or how slowly you want to update the
parameters.
Please keep in mind that the learning rate is the factor with which we have to multiply
the negative gradient and that the learning rate is usually quite small. In our case, the learning rate is 0.1.
As you can see, our weight w after the gradient descent is now 4.2 and closer to the
optimal weight than it was before the gradient step.
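The whole worked example fits in a few lines of Python. This minimal sketch just reproduces the numbers above: the quadratic loss, its derivative with respect to w1, and one update with a learning rate of 0.1:

```python
x1, y_hat = 2.0, 6.0          # input and ground truth label
w1 = 5.0                      # initial weight
learning_rate = 0.1

y = x1 * w1                                 # prediction: 10.0
loss = 0.5 * (y_hat - y) ** 2               # quadratic loss: 8.0
gradient = -x1 * (y_hat - x1 * w1)          # dL/dw1 at w1 = 5: 8.0
w1 = w1 - learning_rate * gradient          # one gradient descent step
print(w1)                                   # 4.2
```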
Each time we update the weights, we move down the negative gradient towards the optimal weights.
After each gradient descent step or weight update, the current weights of the network
get closer and closer to the optimal weights until we eventually reach them. At that
point, the neural network will be capable of making the predictions we want it to make.
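Continuing the same sketch with the article's example numbers (the choice of 20 update steps is arbitrary), repeating the update shows the weight settling at the optimal value of 3:

```python
x1, y_hat, w1, learning_rate = 2.0, 6.0, 5.0, 0.1

for step in range(20):
    gradient = -x1 * (y_hat - x1 * w1)   # dL/dw1 for the quadratic loss
    w1 -= learning_rate * gradient       # move down the negative gradient

print(w1)   # ~3.0001, converging to the optimal weight of 3
```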