Basic Concepts

This document summarizes key concepts in traditional computer vision and deep learning. In traditional vision, it discusses image processing tasks like classification and segmentation. It also covers color spaces, image correction through intensity transforms and filters, and edge detection methods. In deep learning, it defines the difference between GOFAI and machine learning, the types of machine learning, and fundamentals of algorithms like k-NN, perceptron, SVM, and regularization. It also provides overviews of loss functions, the gradient method, and techniques to improve efficiency.

TRADITIONAL VISION

1. Basic Concepts: Computer Vision, Image Processing. Fundamental tasks: classification, detection, segmentation (types). Difficulties

Computer vision: Computer vision, as a scientific field, aims to acquire, process, and analyze images in order to extract relevant numerical or symbolic information, with the help of algorithmic tools.

Computer vision tasks:

• Localization
• Segmentation
• Classification
• Detection

Image Processing: Image processing aims to produce an image that is more advantageous for our purposes. It is often used to prepare images for further analysis or to help human users recognize crucial details more easily.

Traditional vision steps:

• Image Acquisition
• Image correction
• Feature detection
• Decision

2. Imaging, how photodiodes work, CCD and its variations, CMOS, (dis)advantages. Image structure, properties, and errors. Camera types

A photodiode is a semiconductor device that converts light into an electrical current. The current is
generated when photons are absorbed in the photodiode.

CCD: CCDs are analog devices that store the charge generated when photons are absorbed.
CMOS: an image sensor in which each pixel has its own photodetector; CMOS sensors have smaller dimensions, lower energy consumption, and lower cost.
Cameras: there are three types of cameras:
• Stereo cameras
• Depth cameras
• LIDAR

3. Producing color images. Three important color spaces, interpretation of color components, basic advantages of certain color spaces and their use. Variations of grayscale conversion.

Three important color spaces: RGB (red, green, and blue), YCbCr, and HSV/HSI/HSL.
Basic advantages of certain color spaces and their use:
• YCbCr: Y represents the lightness (luma) of the given color; the Cb and Cr channels describe its hue (chroma).
• HSV/HSI/HSL: common to these representations is that the chrominance is coded with a hue and a saturation value.
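The question above also asks about variations of grayscale conversion; a minimal NumPy sketch (not from the original notes) contrasts the naive channel average with the standard luminance-weighted sum, which matches human brightness perception:

```python
import numpy as np

def to_gray_average(rgb):
    # Naive variant: unweighted mean of the R, G, B channels.
    return rgb.mean(axis=-1)

def to_gray_luma(rgb):
    # Perceptual variant (ITU-R BT.601 weights): green contributes most,
    # since the eye is most sensitive to it.
    return rgb @ np.array([0.299, 0.587, 0.114])

pixel = np.array([[[100.0, 200.0, 50.0]]])  # a single RGB pixel
print(to_gray_average(pixel)[0, 0])  # plain mean of the three channels
print(to_gray_luma(pixel)[0, 0])     # 0.299*100 + 0.587*200 + 0.114*50
```

The green-heavy pixel comes out noticeably brighter under the luma weighting than under the plain average.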

Image Correction

4. Intensity transforms (what does each do), histogram, its use, histogram operations.

The histogram describes the frequency of the intensity values in an image. It can help us detect and correct defects of image acquisition.
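One common histogram operation is equalization, which corrects a poorly exposed image by spreading the cumulative intensity distribution over the full range. A minimal sketch (assumed 8-bit grayscale input, not from the original notes):

```python
import numpy as np

def equalize(img):
    # img: uint8 grayscale image. Build the histogram, form the cumulative
    # distribution, and map it linearly onto [0, 255] via a lookup table.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                       # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[img]

dark = np.array([[10, 10, 20], [20, 30, 30]], dtype=np.uint8)
print(equalize(dark))  # intensities stretched toward the full 0..255 range
```

After equalization the brightest input intensity maps to 255, so the contrast of the dark toy image is maximized.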

5. Image noise types, convolutional filters. Smoothing, sharpening, edge detection filters,
recognize them from the weight matrix. Linear vs rank filters, which is good for what,
advantages.
Types of noise:
• Gaussian noise, which is the consequence of the noisy nature of the imaging sensor and the surrounding electronics.
• Salt-and-pepper noise, which occurs sparsely but changes the value of the affected pixels in a significant manner.
Convolutional filters: a small filtering window is slid over the image, and the value of each pixel is set to the result of the convolution of the window with the pixel itself and its neighborhood.
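The sliding-window operation described above can be sketched directly in NumPy (a 'valid'-region toy implementation, not from the original notes), here applied with a 3×3 box smoothing kernel:

```python
import numpy as np

def convolve2d(img, kernel):
    # Slide the (flipped) kernel over the image; each output pixel is the
    # sum of elementwise products over the covered neighborhood.
    kh, kw = kernel.shape
    k = np.flipud(np.fliplr(kernel))  # true convolution flips the kernel
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

box = np.ones((3, 3)) / 9.0            # smoothing: non-negative, sums to 1
img = np.zeros((5, 5)); img[2, 2] = 9.0
print(convolve2d(img, box))            # the single bright pixel is spread out
```

Because the kernel sums to one, the total brightness of the image is preserved; the lone bright pixel is distributed evenly over its neighborhood.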

Smoothing: each element of the kernel is non-negative, and the elements sum to one. If the sum differs from one, a brightening/darkening step also occurs besides the smoothing.

Sharpening: there are filters that are similar to edge-detector filters in the structure of their elements (i.e., negative and positive elements on different sides) but whose elements sum up to 1; these sharpen the image.
Edge detection filter: in each position, two differences are calculated (x- and y-direction): one between the pixel and its right neighbor, and one between the pixel and the neighbor below it. The sum of the squares of both gives a metric that characterizes how edge-like the pixel is.
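The two-difference edge metric above can be sketched in a few lines of NumPy (toy values, not from the original notes):

```python
import numpy as np

def edge_strength(img):
    # For each pixel: difference with the right neighbor (x-direction) and
    # with the neighbor below (y-direction); the squared sum measures how
    # edge-like the pixel is.
    dx = img[:-1, 1:] - img[:-1, :-1]
    dy = img[1:, :-1] - img[:-1, :-1]
    return dx**2 + dy**2

step = np.array([[0, 0, 5, 5],
                 [0, 0, 5, 5],
                 [0, 0, 5, 5]], dtype=float)
print(edge_strength(step))  # large values only along the vertical edge
```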

Linear vs rank filters, which is good for what, advantages: linear filters (e.g., Gaussian smoothing) compute a weighted sum of the neighborhood; they suppress Gaussian noise well but blur edges and spread outlier values. Rank filters (e.g., the median filter) sort the neighborhood and pick a rank statistic; they remove salt-and-pepper noise effectively while preserving edges better.
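A rank-filter sketch illustrates the advantage on salt-and-pepper noise: a 3×3 median filter (toy 'valid'-region implementation, not from the original notes) removes an outlier pixel completely, where a linear filter would only smear it.

```python
import numpy as np

def median_filter(img, size=3):
    # Rank filter: each output pixel is the median of its size x size
    # neighborhood ('valid' region only, no border padding, for brevity).
    h = img.shape[0] - size + 1
    w = img.shape[1] - size + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(img[i:i + size, j:j + size])
    return out

noisy = np.full((5, 5), 100.0)
noisy[2, 2] = 255.0            # a single "salt" pixel
print(median_filter(noisy))    # the outlier vanishes without blurring
```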

6. Edge detection, using first and second order derivatives, determine derivative order from
the filter matrix. Sobel and Prewitt operators, directionality. Idea of the Canny algorithm
and its steps.

Sobel and Prewitt operators: direction-dependent first-order edge detectors. Both combine differentiation in one direction with smoothing in the perpendicular direction; Sobel's smoothing weights (1, 2, 1) approximate a Gaussian, while Prewitt's are uniform.
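The kernels and their directionality can be shown concretely (toy patch, not from the original notes); a vertical step edge produces a strong x-response and no y-response:

```python
import numpy as np

# Sobel kernels: derivative across columns (x) or rows (y), combined with
# (1, 2, 1) smoothing in the perpendicular direction. Prewitt would use
# uniform (1, 1, 1) smoothing weights instead.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

def sobel_response(patch):
    # Correlation of one 3x3 patch with both kernels; returns (gx, gy).
    return (patch * sobel_x).sum(), (patch * sobel_y).sum()

# Vertical step edge: intensity jumps from 0 to 8 along the x-direction.
patch = np.array([[0, 0, 8],
                  [0, 0, 8],
                  [0, 0, 8]], dtype=float)
gx, gy = sobel_response(patch)
print(gx, gy)  # strong x-gradient, zero y-gradient
```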

Canny algorithm: the first step is the calculation, with simple derivative filters, of the vertical and horizontal derivatives, from which the norm and the direction of the image gradient are computed. This is followed by non-maximum suppression along the gradient direction, and finally by hysteresis thresholding with a low and a high threshold.

7. Basic operations of image math and their goal. Principle and algorithms of interpolation,
properties, (dis)advantages.
DEEP LEARNING
What is the difference good old-fashioned artificial intelligence (GOFAI) and machine learning?
What is the structure of machine learning algorithms, what types does it have, and what important
things do we always have to keep in mind when we use them?
The input is denoted by x, the output by y, and the parameters by θ:
ŷ = f(x; θ)
Types of machine learning:
• Supervised learning: we need labeled databases, which costs time and money; moreover, the quality of the labels determines the quality of the result.
• Unsupervised learning: the goal of the algorithm is to explain the input with the help of a compact model.
• Reinforcement learning: the algorithm makes a sequence of decisions, but there is generally no feedback after each individual decision; instead, the feedback describes the quality of the whole sequence of steps taken.

How does the kNN algorithm work? What problems does it have? How does the Perceptron model
work and how can we interpret its outputs? What does the decision function of the Perceptron
look like?

kNN: a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space; the output depends on whether k-NN is used for classification (majority vote of the neighbors) or regression (average of the neighbors' values).
Problem: the neighbors decide the label of the image, and the image distance in kNN is mostly defined as the absolute pixel-wise difference, which is a poor measure of semantic similarity; the method also has to store and compare against all training data, so it is slow at test time.
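A minimal kNN classification sketch in NumPy (toy 2-D data and Euclidean distance, assumed values, not from the original notes):

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    # Non-parametric classification: find the k training samples closest to
    # the query and take a majority vote over their labels.
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.bincount(train_y[nearest]).argmax()

train_x = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
train_y = np.array([0, 0, 1, 1])
print(knn_predict(train_x, train_y, np.array([0.2, 0.1])))  # near class 0
print(knn_predict(train_x, train_y, np.array([5.0, 4.8])))  # near class 1
```

Note that every prediction scans the entire training set, which is exactly the test-time cost problem mentioned above.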
The Perceptron model: the working principle is that the pixels are ordered in a vector, which is multiplied with a weight matrix; the result has as many elements as the number of classes. Each element can be interpreted as an indicator of the degree of belonging to the corresponding class:
s = Wx
where x is the input, s the vector of class scores (degrees of belonging), and W the matrix of parameters.
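The score computation s = Wx can be sketched on a toy 3-class problem (random weights standing in for trained parameters, not from the original notes):

```python
import numpy as np

# Toy perceptron scores: flatten the image into a vector x and multiply it
# with the weight matrix W (one row of weights per class). A bias term can
# be folded in by appending a constant 1 to x.
rng = np.random.default_rng(0)
num_classes, num_pixels = 3, 4
W = rng.normal(size=(num_classes, num_pixels))  # stand-in for trained weights
x = rng.normal(size=num_pixels)                 # stand-in for a flattened image

s = W @ x                      # one score per class: degree of belonging
predicted = int(np.argmax(s))  # the class with the highest score wins
print(s, predicted)
```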

SVM fundamentals, operation. Explanation of the method of determining the output for a new input.
What is the kernel function and how does it affect the SVM decision function?

SVM: the SVM algorithm, quite popular in computer vision, determines the separating hyperplane that has the largest margin. For a new input, the output is determined by the sign of a weighted sum of kernel evaluations between the input and the training (support) samples.
Kernel function: a similarity measure between the input to be classified and the training samples; the output of each training sample influences the decision in proportion to how similar the two data points are.
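A kernelized decision-function sketch with an RBF kernel (the support vectors, dual coefficients alpha, and bias below are made-up illustration values, not trained ones):

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Similarity measure: near-identical points give values close to 1,
    # distant points give values close to 0.
    return np.exp(-gamma * np.linalg.norm(a - b) ** 2)

# Assumed "trained" quantities, chosen only to illustrate the formula
# f(x) = sign( sum_i alpha_i * y_i * K(x_i, x) + b ).
support_vectors = np.array([[0.0, 0.0], [2.0, 2.0]])
labels = np.array([-1.0, 1.0])
alphas = np.array([1.0, 1.0])
bias = 0.0

def decide(x):
    k = np.array([rbf_kernel(sv, x) for sv in support_vectors])
    return np.sign(np.sum(alphas * labels * k) + bias)

print(decide(np.array([1.9, 2.1])))  # near the +1 support vector
print(decide(np.array([0.1, -0.1]))) # near the -1 support vector
```

The kernel evaluations show the point made in the notes: each training sample pulls the decision toward its own label with a strength proportional to its similarity to the query.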
What types of loss functions can we use to train the Perceptron? What is the idea behind the hinge
loss and what does it look like? Fundamentals of the cross-entropy loss, how can we get probabilities
on the output of the Perceptron, and how can we define the “real” class probabilities? What are the
(dis)advantages of the two?

What types of loss functions can we use to train the Perceptron?


• the cross-entropy loss
• the hinge loss
What is the idea behind the hinge loss and what does it look like? The main principle is to define a margin (gap): if the score of the correct class is bigger than each of the other scores by at least the margin, then the error value is 0; otherwise, the error increases linearly.
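Both losses can be sketched for a single sample (toy scores, not from the original notes); the hinge loss is exactly zero once the margin is satisfied, while the cross-entropy loss only approaches zero:

```python
import numpy as np

def hinge_loss(scores, correct, delta=1.0):
    # Multiclass hinge: zero if the correct score beats every other score
    # by at least the margin delta; otherwise the error grows linearly.
    margins = np.maximum(0.0, scores - scores[correct] + delta)
    margins[correct] = 0.0
    return margins.sum()

def cross_entropy_loss(scores, correct):
    # Softmax turns scores into probabilities; the loss is the negative
    # log-probability of the correct class.
    exp = np.exp(scores - scores.max())   # shift for numeric stability
    probs = exp / exp.sum()
    return -np.log(probs[correct])

scores = np.array([3.0, 1.0, 0.5])
print(hinge_loss(scores, correct=0))          # margin satisfied -> 0.0
print(cross_entropy_loss(scores, correct=0))  # positive, approaching 0
```

This is also the practical difference between the two: hinge stops caring once the margin is met, while cross-entropy keeps pushing the correct probability toward 1.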

What is regularization and what types does it have? Why do we need to do it, and how is it
connected to overfitting?
Regularization: a penalty on the size of the weights, added to the loss. Without it, the data loss can often be decreased simply by scaling the weight matrices up by a constant, so the weight values tend toward infinity, causing numeric problems and an over-confident model; regularization counteracts this and reduces overfitting.
Types of regularization:
• L2
• L1
• Elastic Net (L1+L2)
• Dropout, Batch Normalization
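The L2 variant, the most common of the list above, can be written in two lines (the weight values and the regularization strength lam are assumed toy numbers):

```python
import numpy as np

# L2 regularization: the penalty lam * sum(W**2) is added to the data loss,
# discouraging large weights. Its gradient contribution is 2 * lam * W,
# which is why it is also called "weight decay".
def l2_penalty(W, lam):
    return lam * np.sum(W ** 2)

def l2_grad(W, lam):
    return 2.0 * lam * W

W = np.array([[1.0, -2.0], [0.5, 0.0]])
print(l2_penalty(W, lam=0.1))  # 0.1 * (1 + 4 + 0.25 + 0)
```

L1 regularization would use lam * sum(|W|) instead, pushing small weights exactly to zero; Elastic Net simply sums the two penalties.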

How does the gradient method work? Why do we use mini-batches, and how does the size of the batch affect the learning? How can we improve the efficiency of the gradient method (momentum, scaling), and how do these help?
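The notes give no worked answer here; as an illustration (toy one-dimensional loss, assumed hyperparameter values), gradient descent with momentum can be sketched as follows:

```python
# Gradient descent with momentum on the toy loss L(w) = (w - 3)^2, whose
# gradient is 2 * (w - 3). Momentum accumulates a running average of past
# gradients, damping oscillation and speeding up progress along directions
# where the gradient is consistent.
def grad(w):
    return 2.0 * (w - 3.0)

w, velocity = 0.0, 0.0
lr, mu = 0.1, 0.9          # learning rate and momentum coefficient (assumed)
for _ in range(200):
    velocity = mu * velocity - lr * grad(w)   # accumulate the gradient history
    w += velocity                             # step along the velocity
print(w)  # converges toward the minimum at w = 3
```

In real training the gradient would be estimated on a mini-batch rather than computed exactly: larger batches give less noisy but more expensive gradient estimates per step.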
