Basic Concepts
Computer vision: Computer vision, as a scientific domain, aims to acquire, process, analyze and extract relevant information from images or image sequences in order to produce numerical or symbolic information, with the help of algorithmic tools. Typical tasks include:
• Localization
• Segmentation
• Classification
• Detection
Image Processing: Image processing aims to produce an image that is more advantageous for our purposes. It is often used to prepare images for further analysis or to help human users recognize crucial details more easily. The main steps of a typical pipeline:
• Image Acquisition
• Image correction
• Feature detection
• Decision
2. Imaging, how photo-diodes work, CCD and its variations, CMOS, (dis)advantages. Image structure, properties, and errors. Camera types.
A photodiode is a semiconductor device that converts light into an electrical current. The current is
generated when photons are absorbed in the photodiode.
CCD: A CCD (charge-coupled device) is an analog device; each cell accumulates an electric charge when photons are absorbed, and the charges are then shifted across the chip and read out.
CMOS: an image sensor in which every pixel has its own photodetector and readout circuitry; CMOS sensors have smaller dimensions, lower energy consumption, and are cheaper.
Cameras: There are three important types:
• Stereo cameras
• Depth cameras
• LIDAR
3. Important color spaces, basic advantages of certain color spaces and their use.
The most common color space stores the red, green and blue channels (abbreviated as RGB). Other widely used color spaces (a conversion sketch follows the list):
• YCbCr: Y represents the lightness (luma) of the given color, while the Cr and Cb channels describe its hue (chrominance).
• HSV/HSI/HSL: common to these representations is that the chrominance is coded with a hue and a saturation value.
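As an illustration, a minimal OpenCV sketch of converting an image between these color spaces (the file name is a placeholder; note that OpenCV loads images in BGR channel order):

import cv2

img = cv2.imread("photo.png")                     # placeholder file; loaded in BGR order
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)    # luma (Y) + chroma (Cr, Cb) channels
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)        # hue, saturation, value channels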
Image Correction
4. Intensity transforms (what does each do), histogram, its use, histogram operations.
The histogram describes the frequency of the intensity values in an image. The histogram can help us detect and correct defects of the image acquisition (e.g., under- or overexposed images).
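A minimal sketch of computing and using the histogram with OpenCV (the file name is a placeholder); histogram equalization is one standard histogram operation for correcting poorly exposed images:

import cv2

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)      # placeholder grayscale image
hist = cv2.calcHist([img], [0], None, [256], [0, 256])   # 256-bin intensity histogram
equalized = cv2.equalizeHist(img)                        # spread intensities over 0..255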
5. Image noise types, convolutional filters. Smoothing, sharpening, edge detection filters,
recognize them from the weight matrix. Linear vs rank filters, which is good for what,
advantages.
Types of noise:
• Gaussian noise, which is a consequence of the noisy nature of the imaging sensor and the surrounding electronics.
• Salt-and-pepper noise, which occurs sparsely but changes the value of the affected pixels in a significant manner.
Convolutional filters: a small filtering window (the kernel) is slid over the image, and the value of each pixel is set to the result of the convolution of the kernel with the pixel and its neighborhood.
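A naive pure-NumPy sketch of this sliding-window computation (a "valid" convolution kept deliberately simple; real libraries use much faster implementations):

import numpy as np

def convolve2d(image, kernel):
    # Convolution flips the kernel before sliding it over the image.
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Weighted sum of the pixel and its neighborhood.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out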
Smoothing: each element of the kernel is non-negative, and the elements sum up to one. If the sum differs from one, then a brightening/darkening step also occurs besides the smoothing.
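For example, a 3x3 box (averaging) kernel, whose non-negative elements sum to one, applied with OpenCV (the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)   # placeholder input
box = np.ones((3, 3), np.float32) / 9.0               # non-negative weights summing to 1
smoothed = cv2.filter2D(img, -1, box)                 # -1 keeps the input bit depth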
Sharpening: there are filters whose structure is similar to that of edge-detection filters (negative and positive elements), but whose elements sum up to 1, so the overall brightness is preserved while the edges are emphasized.
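A common example is the following sharpening kernel; its elements sum to 1, so brightness is kept while differences from the neighbors are amplified (the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)   # placeholder input
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], np.float32)        # negative ring, positive center, sum = 1
sharpened = cv2.filter2D(img, -1, sharpen)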
Edge detection filter: in each position two differences are calculated (in the x and y directions), one between the pixel and its right neighbor and one between the pixel and its lower neighbor. The squared sum of the two gives a metric that characterizes how edge-like the pixel is.
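A minimal NumPy/OpenCV sketch of this difference-based edge metric (the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
dx_kernel = np.array([[1.0, -1.0]], np.float32)        # difference with the right neighbor
dy_kernel = np.array([[1.0], [-1.0]], np.float32)      # difference with the neighbor below
dx = cv2.filter2D(img, -1, dx_kernel)
dy = cv2.filter2D(img, -1, dy_kernel)
edge_strength = dx ** 2 + dy ** 2                      # squared sum: how edge-like the pixel is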
6. Edge detection, using first and second order derivatives, determine derivative order from
the filter matrix. Sobel and Prewitt operators, directionality. Idea of the Canny algorithm
and its steps.
Sobel and Prewitt operators: direction-dependent edge detectors; they compute a derivative in one direction while smoothing in the perpendicular direction (Sobel with Gaussian-like 1-2-1 weights, Prewitt with uniform weights).
Canny algorithm: it consists of several steps. First, the vertical and horizontal derivatives are calculated with simple derivative filters, and from them the norm (magnitude) and the direction of the image gradient are obtained. This is followed by non-maximum suppression along the gradient direction and by edge tracking with hysteresis (double) thresholding.
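A minimal OpenCV sketch of the Sobel gradients and of the complete Canny pipeline (the file name and the two thresholds are placeholder choices):

import cv2
import numpy as np

img = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)

gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)    # horizontal derivative (vertical edges)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)    # vertical derivative (horizontal edges)
magnitude = np.sqrt(gx ** 2 + gy ** 2)            # gradient norm
direction = np.arctan2(gy, gx)                    # gradient direction

edges = cv2.Canny(img, 100, 200)                  # gradient + non-max suppression + hysteresis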
7. Basic operations of image math and their goal. Principle and algorithms of interpolation,
properties, (dis)advantages.
DEEP LEARNING
What is the difference between good old-fashioned artificial intelligence (GOFAI) and machine learning?
What is the structure of machine learning algorithms, what types does it have, and what important
things do we always have to keep in mind when we use them?
The input is denoted by x, the output by y, and the parameters by θ:
ŷ = f(x; θ)
Types of machine learning:
• Supervised learning: we need to build labeled databases, which costs time and money; moreover, the quality of the labels determines the quality of the result.
• Unsupervised learning: the goal of the algorithm is to explain the input with the help of a compact model.
• Reinforcement learning: the algorithm makes a sequence of decisions, but there is generally no feedback after each individual decision; instead, the feedback describes the quality of the whole sequence of steps taken.
How does the kNN algorithm work? What problems does it have? How does the Perceptron model
work and how can we interpret its outputs? What does the decision function of the Perceptron
look like?
kNN: a non-parametric method used for classification and regression. In both cases the input consists of the k closest training examples in the feature space; the output depends on whether kNN is used for classification (the neighbors vote on the label) or regression (the neighbors' values are averaged).
Problem: the neighbors decide the label of the image, and the image distance is, in the case of kNN, mostly defined as the absolute (pixel-wise) difference of intensities; such a distance reflects semantic similarity poorly, so visually unrelated images can end up as close neighbors.
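A minimal scikit-learn sketch of kNN classification; the data here are random placeholders standing in for flattened image vectors:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.random.rand(100, 64)                 # placeholder "images" as feature vectors
y_train = np.random.randint(0, 3, size=100)       # placeholder labels for 3 classes

knn = KNeighborsClassifier(n_neighbors=5)         # k = 5 nearest neighbors
knn.fit(X_train, y_train)                         # "training" only stores the samples
pred = knn.predict(np.random.rand(1, 64))         # label chosen by majority vote of neighbors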
The Perceptron model: the working principle is that the pixels are arranged in a vector, which is multiplied by a weight matrix; the result has as many elements as the number of classes. Each element can be interpreted as an indicator of the degree of belonging to that class:
s = Wx
where x is the input vector, s the vector of class scores (degrees of belonging), and W the matrix of parameters.
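A minimal NumPy sketch of this scoring step (the dimensions and the random initialization are placeholders, e.g. a 32x32 RGB image flattened into 3072 values):

import numpy as np

num_classes, num_pixels = 10, 3072                # placeholder sizes (e.g. 32x32x3 image)
W = 0.01 * np.random.randn(num_classes, num_pixels)   # placeholder parameter matrix
x = np.random.rand(num_pixels)                    # input image ordered into a vector

s = W @ x                                         # one score per class
predicted_class = int(np.argmax(s))               # the class with the highest score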
SVM fundamentals, operation. Explanation of the method of determining the output for a new input.
What is the kernel function and how does it affect the SVM decision function?
SVM: the SVM algorithm, quite popular in computer vision, determines the separating hyperplane with the largest margin. For a new input x the decision function has the standard form f(x) = sign( Σ_i α_i y_i K(x_i, x) + b ), where the sum runs over the training (support) samples x_i with labels y_i and learned weights α_i.
Kernel function: the kernel function K is a similarity measure between the input to be classified and the training samples; this means that each training sample influences the decision in proportion to how similar the two data points are.
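A minimal scikit-learn sketch of training an SVM with an RBF kernel on placeholder data; prediction for a new input is the sign of the kernel-weighted sum over the support vectors:

import numpy as np
from sklearn.svm import SVC

X_train = np.random.rand(100, 64)                 # placeholder feature vectors
y_train = np.random.randint(0, 2, size=100)       # placeholder binary labels

svm = SVC(kernel="rbf", C=1.0)                    # RBF kernel as the similarity measure
svm.fit(X_train, y_train)
pred = svm.predict(np.random.rand(1, 64))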
What types of loss functions can we use to train the Perceptron? What is the idea behind the hinge
loss and what does it look like? Fundamentals of the cross-entropy loss, how can we get probabilities
on the output of the Perceptron, and how can we define the “real” class probabilities? What are the
(dis)advantages of the two?
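As a reference point for these questions, a minimal NumPy sketch of the multiclass hinge loss and the softmax cross-entropy loss on one sample (the score vector and the true-class index are placeholders):

import numpy as np

scores = np.array([2.0, 5.0, -1.0])               # placeholder Perceptron scores s = Wx
correct = 1                                       # placeholder index of the true class

# Hinge loss: penalize every wrong class that comes within a margin of 1 of the true score.
margins = np.maximum(0.0, scores - scores[correct] + 1.0)
margins[correct] = 0.0
hinge_loss = margins.sum()

# Cross-entropy: softmax turns scores into probabilities, then take -log of the true class.
probs = np.exp(scores - scores.max())
probs /= probs.sum()
cross_entropy_loss = -np.log(probs[correct])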
What is regularization and what types does it have? Why do we need to do it, and how is it
connected to overfitting?
Regularization: without regularization, multiplying the weight matrix by a constant larger than one can keep decreasing the value of the loss, so the weights tend towards infinity during training, causing numeric problems and resulting in an overconfident model. Regularization penalizes large weights and is our main tool against overfitting; a small sketch of the penalties follows the list below.
Types of regularization:
• L2
• L1
• Elastic Net (L1+L2)
• Dropout, Batch Normalization
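A minimal NumPy sketch of how the L1/L2 penalties are added to the data loss (the weight matrix, the data-loss value and the strength lambda are placeholders):

import numpy as np

W = np.random.randn(10, 3072)                     # placeholder weight matrix
lam = 1e-4                                        # placeholder regularization strength
data_loss = 0.5                                   # placeholder data loss (e.g. hinge or CE)

l2_penalty = lam * np.sum(W * W)                  # L2: penalizes large squared weights
l1_penalty = lam * np.sum(np.abs(W))              # L1: encourages sparse weights
total_loss = data_loss + l2_penalty               # or + l1_penalty, or both (Elastic Net)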
How does the gradient method work? Why do we use mini-batches, and how does the size of the batch affect the learning? How can we improve the efficiency of the gradient method (momentum, scaling), and how do these help?
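As a reference for this question, a minimal NumPy sketch of a mini-batch gradient step with momentum on a placeholder quadratic loss (the learning rate and momentum coefficient are arbitrary choices):

import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    # The velocity accumulates a decaying sum of past gradients.
    velocity = momentum * velocity - lr * grad
    w = w + velocity                              # move the parameters along the velocity
    return w, velocity

w = np.zeros(5)
v = np.zeros(5)
for step in range(100):                           # each step would use one mini-batch
    grad = 2.0 * (w - 1.0)                        # gradient of the placeholder loss ||w - 1||^2
    w, v = sgd_momentum_step(w, grad, v)          # w converges towards 1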