Computer Vision
The aim of computer vision is to extract useful information and insights from scenes, image sequences,
and the objects in them, as captured by video cameras, and to analyze the patterns they contain.
An image is a 2D optical projection, but the world we wish to analyze and make sense of is 3D, so we
have to do inverse optics: inverting the 3D-to-2D projection to recover properties of the scene from
the image. Strictly speaking, this 2D-to-3D inversion is mathematically impossible, because many
different 3D scenes produce the same 2D image.
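The ambiguity can be seen in a minimal sketch, assuming an ideal pinhole camera (a model chosen here for illustration; the function name and focal length are hypothetical). Two 3D points on the same ray from the camera land on the same image point, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the 2D image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

print(project(near))
print(project(far))  # same image coordinates as `near`
```

Depth is lost in the division by Z, which is why extra assumptions or extra views are needed to recover it.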
Computer vision is also inverse graphics: graphics begins with a 3D world description and renders a 2D
image, whereas vision starts from the image and works back to the description. Face recognition
presents huge challenges in computer vision. Humans can process and identify faces easily, but
machines are often handicapped by textured backgrounds, colors, and lighting.
There needs to be a mathematical model which can:
Perform the figure-ground segmentation of objects and background.
Infer the 3D arrangements of the objects.
Infer surface properties from 2D image statistics.
Infer volumetric properties from 2D image properties.
And all of this computation has to be done in real time.
Charge Accumulation: The freed electrons collect in a potential well (a charge "bucket") inside the pixel. The amount of
charge in each pixel corresponds to the intensity of light that fell on it. Bright areas of the scene
generate more electrons, and dark areas generate fewer electrons.
Shift Registers: The CCD sensor is designed with a series of shift registers, which are like conveyor
belts for electric charge. These registers move the accumulated charge from pixel to pixel in a
controlled sequence.
Analog-to-Digital Conversion: At the sensor output, the analog charge signal is converted into a
digital signal, which can be stored, processed, and displayed by electronic devices like computers or
screens.
Signal Processing: Once the digital signal is obtained, it can undergo various forms of processing, such
as noise reduction, color interpolation (for color imaging), and other adjustments to enhance the final
image quality.
Readout and Reset: After the charge from all the pixels has been read out, the CCD sensor needs to be
reset. This involves clearing the accumulated charge from each pixel, preparing the sensor for the next
exposure.
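The CCD steps above can be pictured with a toy simulation of a single pixel row. This is only a sketch under stated assumptions: the gain, full-well capacity, and 8-bit converter are hypothetical values, and a real sensor involves noise and analog electronics that are ignored here.

```python
def expose(light, exposure_time=1.0, gain=100.0):
    """Charge accumulation: brighter light leaves more electrons in each pixel."""
    return [gain * intensity * exposure_time for intensity in light]

def read_out_row(charges, full_well=1000.0, bits=8):
    """Shift charge packets out one at a time and digitize each of them."""
    register = list(charges)      # copy the pixel charges into the shift register
    levels = 2 ** bits - 1
    digital = []
    while register:
        packet = register.pop()   # the packet nearest the output leaves first
        packet = min(max(packet, 0.0), full_well)  # a pixel saturates at full well
        digital.append(round(packet * levels / full_well))  # analog-to-digital step
    digital.reverse()             # restore left-to-right pixel order
    return digital

row = read_out_row(expose([0.0, 2.0, 4.0, 10.0, 12.0]))
print(row)  # the last two pixels both saturate at the top code
```

Once the register is empty, every packet has been read out, which corresponds to the sensor being ready for reset and the next exposure.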
Formats
JPEG: Ideal for variable compression of continuous-tone images. It uses lossy DCT-based compression,
with ratios from about 10:1 up to 100:1 depending on the quality required.
MPEG: Stream oriented, used mainly for video. Individual frames are compressed with a JPEG-like DCT
scheme.
GIF: Ideal for sparse, binarized images and for low-bandwidth browsers; provides high compression on
such images.
BMP: A bitmapped format that is typically uncompressed, so individual pixel values can be easily
extracted.
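To show why DCT-based compression works on continuous images, here is a minimal 1D sketch. It is an illustration only: JPEG itself applies a 2D DCT to 8x8 blocks and then quantizes and entropy-codes the coefficients, whereas this example simply drops the high-frequency coefficients. The sample row and helper names are assumptions.

```python
import math

def _scale(k, n):
    """Normalization that makes the transform orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(signal):
    """Orthonormal DCT-II of a 1D signal."""
    n = len(signal)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]

row = [50.0, 54.0, 58.0, 62.0, 66.0, 70.0, 74.0, 78.0]  # a smooth intensity ramp
coeffs = dct(row)
kept = coeffs[:2] + [0.0] * (len(coeffs) - 2)  # keep only 2 of the 8 coefficients
approx = idct(kept)
max_error = max(abs(a - b) for a, b in zip(row, approx))

print([round(c, 2) for c in coeffs])
print(round(max_error, 2))
```

For a smooth row like this one, most of the signal sits in the first few coefficients, so discarding the rest changes each pixel only slightly.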
The common strategy is to take ill-posed, otherwise insoluble problems of inference from raw data and
convert them into well-posed problems in which we can compute object properties.
Texture information:
Texture helps in object and scene identification. It also helps to identify surface shape and supports
image segmentation. Image segmentation converts an image into a collection of labelled regions, which
makes the image easier to identify and process. Texture can be characterized by the correlations that
exist between pixel values across the image.
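One simple way to measure such correlation is the autocorrelation of pixel values at a given offset. The sketch below works on a 1D row of pixels for brevity; the function name and the striped test pattern are assumptions, and practical texture descriptors are usually 2D and more elaborate.

```python
def autocorrelation(values, lag):
    """Normalized correlation between a pixel row and itself shifted by `lag`."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    covariance = sum((values[i] - mean) * (values[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance

stripes = [0.0, 1.0] * 8  # a fine striped texture with a period of 2 pixels

print(autocorrelation(stripes, 1))  # negative: neighbouring pixels alternate
print(autocorrelation(stripes, 2))  # positive: the pattern repeats every 2 pixels
```

The lags at which the correlation peaks reveal the spacing of the repeating pattern, which is one cue for telling textures apart.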
Color information:
Color helps in object and scene identification just like texture, but the true surface color is hard to
recover because the wavelengths reaching the camera depend on the light source as well as on the
object. For example, when a colored light source illuminates an object, the human eye can still
perceive the object's color, but a computer vision system has trouble recovering the natural color.
This is the color constancy problem, which the Retinex algorithm helps to solve.
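As a rough illustration of color constancy, the sketch below uses the white-patch rule, a simplified idea often associated with Retinex theory rather than the full Retinex algorithm. It assumes the brightest value in each channel comes from a white surface; the pixel values and function name are hypothetical.

```python
def white_patch(pixels, target=255.0):
    """Rescale each RGB channel so that its brightest value maps to `target`."""
    maxima = [max(p[c] for p in pixels) or 1.0 for c in range(3)]  # avoid dividing by 0
    return [tuple(target * p[c] / maxima[c] for c in range(3)) for p in pixels]

# A white, a gray, and a green surface seen under a reddish light source.
scene = [(250.0, 200.0, 150.0), (125.0, 100.0, 75.0), (50.0, 160.0, 30.0)]
corrected = white_patch(scene)

for before, after in zip(scene, corrected):
    print(before, "->", after)
```

After rescaling, the white and gray surfaces have equal red, green, and blue values again, so the color cast from the light source is removed. The rule fails when the scene has no white surface, which is one reason fuller methods exist.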
Stereo information:
Information about depth can be obtained by using two or more cameras. Increasing the distance between
the eyes/cameras (the baseline) increases the disparity for a given depth, and so extends the range
over which depth can be resolved.
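For two parallel, rectified cameras, depth follows from the standard triangulation relation Z = f * B / d, where f is the focal length in pixels, B the distance between the cameras, and d the disparity in pixels. The sketch below assumes that idealized geometry; the numeric values are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by two parallel, rectified cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

FOCAL_LENGTH_PX = 800.0

for baseline_m, disparity_px in [(0.1, 20.0), (0.2, 40.0)]:
    exact = depth_from_disparity(FOCAL_LENGTH_PX, baseline_m, disparity_px)
    off_by_one = depth_from_disparity(FOCAL_LENGTH_PX, baseline_m, disparity_px - 1.0)
    # How much the depth estimate moves if the disparity is wrong by one pixel.
    print(baseline_m, round(exact, 3), round(off_by_one - exact, 3))
```

Both camera pairs see the same point at the same depth, but the wider pair measures a larger disparity, so a one-pixel matching error disturbs its depth estimate less.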