Ai Ga1
Supervised learning is when the data you feed your algorithm is "tagged" or
"labeled", so that the labels can guide the decisions the model learns to make.
Example: Bayesian spam filtering, where you have to flag items as spam to refine the
results.
Unsupervised learning covers algorithms that try to find structure or correlations without
any external input other than the raw data.
Example: clustering algorithms used in data mining.
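The difference can be sketched on toy data. This is a minimal illustration, not a real spam
filter: the numbers, the "ham"/"spam" labels, and the helper names fit_centroids, predict and
two_means are all invented here. The supervised part uses the labels to compute one mean per
class; the unsupervised part only sees the raw values and groups them.

```python
# Toy 1-D data; values and labels are invented for illustration.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

def fit_centroids(xs, ys):
    """Supervised: use the labels to compute one mean per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}

def predict(centroids, x):
    """Assign x to the class whose mean is closest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def two_means(xs, steps=10):
    """Unsupervised: 2-means clustering; no labels are used."""
    centers = [min(xs), max(xs)]  # simple deterministic initialisation
    for _ in range(steps):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs]

centroids = fit_centroids(points, labels)
prediction = predict(centroids, 4.5)   # "spam"
cluster_ids = two_means(points)        # [0, 0, 0, 1, 1, 1]
```

Note that the clustering returns anonymous group numbers: it can separate the two groups, but
it cannot know that one of them is called "spam".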
This particular example of face detection is supervised, which means that your examples
must be labeled, or explicitly say which ones are faces and which ones aren't.
In an unsupervised algorithm your examples are not labeled, i.e. you don't say
anything. Of course, in such a case the algorithm itself cannot "invent" what a face is, but it
can try to cluster the data into different groups, e.g. it can distinguish that faces are very
different from landscapes, which are very different from horses.
There are also "intermediate" forms of supervision, i.e. semi-supervised and active
learning. Technically, these are supervised methods that use some "smart" way to avoid
needing a large number of labeled examples. In active learning, the algorithm itself
decides which examples you should label (e.g. it can be pretty sure about a landscape and
a horse, but it might ask you to confirm whether a picture of a gorilla is indeed a face).
In semi-supervised learning, a small set of labeled examples is combined with a large
amount of unlabeled data. In co-training, for example, two different algorithms start with
the labeled examples and then "tell" each other what they think about a large number of
unlabeled examples.
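One common way for an active learner to pick what to ask about is uncertainty sampling:
query the example whose predicted probability is closest to 0.5. The text above does not
name a specific strategy, so this choice is an assumption, and the probabilities below are
made-up model outputs used only to mirror the landscape/horse/gorilla example.

```python
def most_uncertain(probabilities):
    """Return the index of the prediction closest to 0.5 (least confident)."""
    return min(range(len(probabilities)),
               key=lambda i: abs(probabilities[i] - 0.5))

# Hypothetical model outputs: P(picture is a face) for three unlabeled pictures.
names = ["landscape", "horse", "gorilla"]
p_face = [0.02, 0.10, 0.55]

query = names[most_uncertain(p_face)]  # "gorilla"
```

The learner is confident that the landscape and the horse are not faces, so it spends the
labeling request on the gorilla.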
Architecture of a CNN
Definition
The name “convolutional neural network” indicates that the network employs a mathematical
operation called convolution. Convolution is a specialized kind of linear operation. Convolutional
networks are simply neural networks that use convolution in place of general matrix multiplication in
at least one of their layers.
Design
A convolutional neural network consists of an input layer, an output layer, and multiple hidden
layers between them. The hidden layers of a CNN typically include a series of convolutional
layers, each of which computes a dot product between its kernel and a region of its input. The
activation function is commonly ReLU, and the convolutional layers are followed by further layers
such as pooling layers, fully connected layers and normalization layers. These are referred to as
hidden layers because their inputs and outputs are not directly visible from outside the network.
During training, backpropagation is used to adjust the weights of these layers so that the final
output becomes more accurate.
Though the layers are colloquially referred to as convolutions, this is only by convention.
Mathematically, the operation is a sliding dot product, or cross-correlation, because the kernel
is not flipped before it is applied. This matters for the indices in the matrix, in that it
determines which weight is applied at a specific index point.
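The distinction is easy to see in one dimension. The following is a minimal sketch with
hand-written helpers (cross_correlate and convolve are names chosen here, not library calls):
cross-correlation applies the kernel as-is, while true convolution applies it reversed.

```python
def cross_correlate(signal, kernel):
    """Sliding dot product over fully overlapping positions, kernel used as-is."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def convolve(signal, kernel):
    """True convolution: the same sliding dot product with the kernel reversed."""
    return cross_correlate(signal, kernel[::-1])

signal = [1, 2, 3, 4]
kernel = [1, 0, -1]

corr = cross_correlate(signal, kernel)  # [-2, -2]
conv = convolve(signal, kernel)         # [2, 2]
```

For a symmetric kernel the two results coincide. Since a CNN learns its kernel weights anyway,
skipping the flip changes only where each learned weight ends up, not what the layer can
represent.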
Convolutional
When programming a CNN, each convolutional layer within a neural network should have the
following attributes:
Input: a tensor with shape (number of images) x (image width) x (image height) x (image
depth).
Kernels: convolutional kernels whose width and height are hyper-parameters, and whose depth
must be equal to that of the image.
Convolutional layers convolve the input and pass the result to the next layer. This is similar
to the response of a neuron in the visual cortex to a specific stimulus.[11]
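As a rough sketch of what such a layer computes, the function below slides a single kernel
over a single image using plain nested lists. Several simplifications are assumptions made
here: one image instead of a batch, one kernel, stride 1, no padding, no bias, and rows
(height) indexed before columns (width). The name conv2d_single is hypothetical.

```python
def conv2d_single(image, kernel):
    """Slide one kernel over one image (stride 1, no padding, no flip).

    image:  height x width x depth nested lists
    kernel: k_height x k_width x depth nested lists (depth must match the image)
    """
    h, w, d = len(image), len(image[0]), len(image[0][0])
    kh, kw, kd = len(kernel), len(kernel[0]), len(kernel[0][0])
    if kd != d:
        raise ValueError("kernel depth must equal image depth")
    out = []
    for r in range(h - kh + 1):
        row = []
        for c in range(w - kw + 1):
            total = 0.0
            # Dot product between the kernel and the patch under it.
            for i in range(kh):
                for j in range(kw):
                    for ch in range(d):
                        total += image[r + i][c + j][ch] * kernel[i][j][ch]
            row.append(total)
        out.append(row)
    return out

# 4 x 4 image of depth 1 whose pixel value equals its column index.
image = [[[float(c)] for c in range(4)] for _ in range(4)]
# 2 x 2 x 1 kernel that computes right-minus-left differences.
kernel = [[[-1.0], [1.0]], [[-1.0], [1.0]]]

feature_map = conv2d_single(image, kernel)  # 3 x 3 map, every entry 2.0
```

The output is smaller than the input (3 x 3 from 4 x 4) because the kernel is only placed
where it fits entirely inside the image.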
Each convolutional neuron processes data only for its receptive field. Although fully connected
feedforward neural networks can be used to learn features as well as classify data, it is not practical
to apply this architecture to images. A very high number of neurons would be necessary, even in a
shallow (as opposed to deep) architecture, because of the very large input sizes associated with
images, where each pixel is a relevant variable. For instance, a fully connected layer for a (small)
image of size 100 x 100 has 10,000 weights for each neuron in the second layer. The convolution
operation addresses this problem by reducing the number of free parameters, which allows the network
to be deeper with fewer parameters. For instance, regardless of image size, tiling regions of size
5 x 5, each with the same shared weights, requires only 25 learnable parameters (for a
single-channel input, not counting a bias term). Having far fewer parameters also makes networks
with many layers easier to train with backpropagation, although it does not by itself eliminate the
vanishing or exploding gradient problems seen in traditional multi-layer neural networks.
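The arithmetic behind the two figures can be written out directly. This is only a
back-of-the-envelope sketch; the function names are invented here, and bias terms are
left out of both counts.

```python
def dense_weights_per_neuron(width, height, depth=1):
    """A fully connected neuron has one weight per input value."""
    return width * height * depth

def conv_weights_per_filter(k_width, k_height, depth=1):
    """A shared convolutional filter has one weight per kernel entry."""
    return k_width * k_height * depth

dense = dense_weights_per_neuron(100, 100)  # 10000
conv = conv_weights_per_filter(5, 5)        # 25

# The filter count does not depend on the image size at all.
conv_large_image = conv_weights_per_filter(5, 5)  # still 25 for a 1000 x 1000 image
```

For a colour image with depth 3 the same 5 x 5 filter would have 75 weights, which is still
independent of the image width and height.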
Pooling
Convolutional networks may include local or global pooling layers to streamline the underlying
computation. Pooling layers reduce the dimensions of the data by combining the outputs of neuron
clusters at one layer into a single neuron in the next layer. Local pooling combines small clusters,
typically 2 x 2. Global pooling acts on all the neurons of the convolutional layer.[13][14] In addition,
pooling may compute a max or an average. Max pooling uses the maximum value from each
cluster of neurons in the prior layer. Average pooling uses the average value from each cluster
of neurons in the prior layer.
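A small sketch of local 2 x 2 pooling and of global pooling on a single 4 x 4 feature map.
The helper pool2d and the sample values are made up for illustration, and non-overlapping
windows are assumed.

```python
def pool2d(grid, size=2, mode="max"):
    """Non-overlapping size x size pooling over a 2-D list of numbers."""
    reduce_fn = max if mode == "max" else (lambda v: sum(v) / len(v))
    out = []
    for r in range(0, len(grid) - size + 1, size):
        row = []
        for c in range(0, len(grid[0]) - size + 1, size):
            window = [grid[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        out.append(row)
    return out

grid = [[1, 2, 5, 6],
        [3, 4, 7, 8],
        [9, 10, 13, 14],
        [11, 12, 15, 16]]

max_pooled = pool2d(grid, mode="max")  # [[4, 8], [12, 16]]
avg_pooled = pool2d(grid, mode="avg")  # [[2.5, 6.5], [10.5, 14.5]]

# Global pooling reduces the whole map to a single value.
global_max = max(v for row in grid for v in row)  # 16
```

Local 2 x 2 pooling halves each spatial dimension, so the 4 x 4 map becomes 2 x 2.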
Fully connected
Fully connected layers connect every neuron in one layer to every neuron in another layer. This is
in principle the same as the traditional multi-layer perceptron neural network (MLP). The flattened
matrix goes through a fully connected layer to classify the images.
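The flatten-then-classify step might look like the sketch below. The weights, biases and
feature-map values are invented, and picking the class with the highest score is an
assumption about how the output is read; real networks usually apply a softmax first.

```python
def flatten(feature_map):
    """Turn a 2-D feature map into a flat list, row by row."""
    return [v for row in feature_map for v in row]

def fully_connected(inputs, weights, biases):
    """weights[k][i] links input i to output neuron k."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

feature_map = [[1.0, 2.0], [3.0, 4.0]]
# Two output neurons (classes), each connected to all four inputs.
weights = [[1.0, 0.0, 0.0, 1.0],
           [0.0, 1.0, 1.0, 0.0]]
biases = [0.5, -0.5]

scores = fully_connected(flatten(feature_map), weights, biases)  # [5.5, 4.5]
predicted_class = scores.index(max(scores))                      # 0
```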
Receptive field
In neural networks, each neuron receives input from some number of locations in the previous layer.
In a fully connected layer, each neuron receives input from every element of the previous layer. In a
convolutional layer, neurons receive input from only a restricted subarea of the previous layer.
Typically the subarea is square (e.g., size 5 by 5). The input area of a neuron is called
its receptive field. So, in a fully connected layer, the receptive field is the entire previous layer. In a
convolutional layer, the receptive field is smaller than the entire previous layer.
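To make the contrast concrete, the sketch below lists the input coordinates that a single
neuron reads in each case. The function names are invented here, and stride 1 without
padding is assumed for the convolutional case.

```python
def conv_receptive_field(row, col, size=5):
    """Input coordinates read by the convolutional neuron at (row, col)."""
    return [(row + i, col + j) for i in range(size) for j in range(size)]

def dense_receptive_field(height, width):
    """A fully connected neuron reads every position of the previous layer."""
    return [(r, c) for r in range(height) for c in range(width)]

conv_field = conv_receptive_field(0, 0)         # 25 positions
dense_field = dense_receptive_field(100, 100)   # 10000 positions
```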
Data science continues to evolve as one of the most promising and in-demand
career paths for skilled professionals. Today, successful data professionals
understand that they must advance past the traditional skills of analyzing large
amounts of data, data mining, and programming. In order to uncover useful
intelligence for their organizations, data scientists must master the full spectrum of
the data science life cycle and possess a level of flexibility and understanding to
maximize returns at each phase of the process.