Computer Vision
Convolution
• Before Yann LeCun's 1998 work on digit classification popularized convolutional networks, people classified images with other methods such as support vector machines (SVM), k-nearest neighbours (kNN), and logistic regression.
• In those algorithms, pixel values were treated as features, i.e. a 28x28 image yields 784 features.
• People would either create features from images and feed them into a classification algorithm such as an SVM, or use the pixel-level values of the image directly as the feature vector.
• A convolution layer uses information from adjacent pixels to down-sample the image into features; prediction layers then use those features to predict the target values.
How does a Convolution layer work?
• We use multiple convolution filters, or kernels, that run over the image and compute a dot product.
• Let's consider a filter of size 3x3 and an image of size 5x5.
• At each position, we multiply the kernel element-wise with the image patch it covers (a patch the same size as the kernel) and sum the products.
• This gives a single value for one cell of the feature map. Sliding the 3x3 filter over the 5x5 image one pixel at a time, with no padding, gives a 3x3 feature map.
• Each filter extracts different features from the image.
Video - Working of Convolution network
• https://www.youtube.com/watch?v=KiftWz544_8&t=11s
• https://www.youtube.com/watch?v=f0t-OCG79-U
Max Pooling Layer
• The max pooling layer reduces the spatial size of the convolved features and also helps reduce over-fitting by providing an abstracted representation of them.
• It is similar to the convolution layer, but instead of taking a dot product between the input and the kernel, we take the maximum of the input region overlapped by the kernel.
• [Figure: max pool layer operation with a kernel of size 2 and a stride of 1]
Convolution Layer
• It is the first layer of a CNN.
• The objective of the convolution operation is to extract low-level features, such as edges, from the input image.
• It applies the convolution operation to the image. A convolution layer has several kernels, each producing its own feature.
• The output of this layer is called the feature map.
• A feature map is also called the activation map.
• The feature map serves several purposes:
▫ It reduces the image size so that it can be processed more efficiently.
▫ It keeps only the features of the image that help in processing the image further.
Rectified Linear Unit Function
• The Rectified Linear Unit function is applied in what is called the ReLU layer.
• After we get the feature map, it is passed on to the ReLU layer.
• This layer replaces every negative number in the feature map with zero and leaves the positive numbers as they are.
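The convolution, ReLU, and max pooling steps described so far can be sketched in plain Python. This is only an illustrative sketch: the 5x5 image, the vertical-edge kernel, and the function names `convolve2d`, `relu`, and `max_pool` are assumptions made for the example and do not come from the slides. As in most CNN libraries, the kernel is applied without flipping it.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image (stride 1, no padding)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise multiply the k x k patch with the kernel, then sum.
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Replace negative values with zero; keep positive values unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=1):
    """Take the maximum of each size x size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 image: a bright vertical line on a dark background.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hypothetical 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)  # [[3, 0, -3], [3, 0, -3], [3, 0, -3]]
activated = relu(feature_map)            # [[3, 0, 0], [3, 0, 0], [3, 0, 0]]
pooled = max_pool(activated, size=2, stride=1)  # [[3, 0], [3, 0]]
print(feature_map, activated, pooled)
```

The 3x3 kernel on the 5x5 image produces a 3x3 feature map, and pooling with a kernel of size 2 and stride 1 shrinks it to 2x2.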
• Passing the feature map through the ReLU layer introduces non-linearity in the feature map.
• Why do we pass the feature map to the ReLU layer? It is to make the colour change more obvious and more abrupt.
Pooling Layer
• Similar to the convolutional layer, the pooling layer slides a kernel over its input. It is responsible for reducing the spatial size of the convolved feature while still retaining the important features.
• There are two types of pooling which can be performed on an image:
▫ Max Pooling: returns the maximum value from the portion of the image covered by the kernel.
▫ Average Pooling: returns the average of all the values from the portion of the image covered by the kernel.
• The pooling layer is an important layer in the CNN because it:
▫ Makes the image smaller and more manageable.
▫ Makes the network more resistant to small transformations, distortions and translations in the input image.
Fully Connected Layer
• The final layer in the CNN is the Fully Connected (FC) layer.
• The objective of a fully connected layer is to take the results of the convolution/pooling process and use them to classify the image into a label.
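The two pooling types and the final classification step can be sketched as follows. This is a minimal sketch under stated assumptions: the 4x4 feature map, the weights and biases, and the function names `pool`, `average`, `fully_connected`, and `softmax` are all hypothetical. The slides do not say how scores become a label; softmax followed by picking the largest probability is used here as one common choice.

```python
import math


def pool(feature_map, size, stride, reduce_fn):
    """Apply reduce_fn (e.g. max or average) to each size x size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def average(values):
    return sum(values) / len(values)


def fully_connected(features, weights, biases):
    """One score per class: dot(weights[c], features) + biases[c]."""
    return [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn scores into probabilities (assumed here, not from the slides)."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 feature map coming out of the convolution/ReLU stage.
feature_map = [
    [1, 3, 2, 0],
    [5, 7, 1, 2],
    [0, 2, 4, 8],
    [1, 1, 6, 2],
]

max_pooled = pool(feature_map, size=2, stride=2, reduce_fn=max)
avg_pooled = pool(feature_map, size=2, stride=2, reduce_fn=average)
# max_pooled is [[7, 2], [2, 8]]; avg_pooled is [[4.0, 1.25], [1.0, 5.0]]

# Flatten the pooled map into a feature vector for the fully connected layer.
features = [v for row in max_pooled for v in row]

# Hypothetical weights and biases for a two-class problem.
weights = [
    [0.1, 0.0, 0.0, 0.1],  # class 0
    [0.0, 0.2, 0.2, 0.0],  # class 1
]
biases = [0.0, 0.0]

probabilities = softmax(fully_connected(features, weights, biases))
predicted_label = probabilities.index(max(probabilities))
print(max_pooled, avg_pooled, predicted_label)
```

In a trained network the weights and biases are learned from data rather than written by hand; the values above exist only to make the arithmetic easy to follow.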