Computer Vision
CLASS X
CODE: 417
CHAPTER : COMPUTER VISION
COMPUTER VISION
Computer Vision (CV) is a domain of AI that trains
computers to interpret and understand images
from the visual world. It is the field concerned
with building artificial systems that obtain
information from images, video or any other
visual data. This involves methods of acquiring,
processing, analyzing and understanding digital
images, and extracting data from the real, visual
world to produce information.
For example:
Self-Driving Cars/ Automatic Cars
Face Lock in Smartphones
APPLICATIONS OF COMPUTER VISION
1. SELF DRIVING CAR AND TRAFFIC CONTROLLING
2. EDUCATION AND TRAINING
3. E-COMMERCE
4. FACIAL RECOGNITION
5. HEALTH CARE
6. SEARCH BY USING IMAGES
7. MEDICAL IMAGES
8. BANKING
9. TRACKING CUSTOMERS IN RETAIL MARKET
10. COMPUTER VISION THROUGH SMARTPHONES
11. INSURANCE
12. MANUFACTURING
13. SPORTS
14. AGRICULTURE
15. FACE FILTERS
16. INTERNET USES COMPUTER VISION IN MANY APPLICATIONS
17. GOOGLE TRANSLATE
18. TRANSPORT
19. RETAIL
20. MARKETING
COMPUTER VISION TASKS
1. CLASSIFICATION: Classification is the simplest and most common task related to
images. It is the task of identifying the object in an image as a whole, which separates
the image from other similar-looking images. For example, distinguishing one face from
another, or one animal from another. The information gathered from the pixels of the
images is compared and used to sort the images into separate groups (two, in the simplest
case). Because of its simplicity, classification has a large variety of practical applications.
2. CLASSIFICATION + LOCALISATION: This task involves identifying what object is present
in an image and, at the same time, where in the image that object is located. It applies
only to images containing a single object. For example, the AI system must be able to
identify a particular player in the image and where that player is in the image.
3. OBJECT DETECTION: Object detection is the process of finding instances of real-world
objects such as faces, buildings, cars and animals in images or videos. Object detection
algorithms typically use extracted features and learning algorithms to recognize each
instance of an object and act according to its category. It is commonly used in
applications such as image retrieval and automated vehicle parking systems. Object
detection is a complex task, especially when the system has to discover which of the
objects in the image are relevant to the project.
4. INSTANCE SEGMENTATION: It is the process of detecting the instances of objects, assigning
them a category and then giving each pixel a label on that basis. A segmentation algorithm
takes an image as input and outputs a collection of segments. Here, segmentation means
dividing an image into several segments so as to identify the objects in the image.
5. SEMANTIC SEGMENTATION: It is the process of classifying each pixel as belonging to a
particular label. It does not differentiate between different instances of the same object.
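The difference between semantic segmentation, instance segmentation and localisation can be sketched on a toy example. This is only an illustration, not how a real CV system works: the tiny mask, the function names label_instances and bounding_box, and the use of 4-connected grouping to separate instances are all assumptions made for this sketch. The semantic mask only says "this pixel belongs to the object class"; the instance labels additionally tell the two objects apart; the bounding box gives the location of one object.

```python
from collections import deque

# Hypothetical 5x6 semantic mask: 0 = background, 1 = "object" class.
# Two separate objects of the same class are present, but the semantic
# mask alone does not tell them apart.
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]


def label_instances(mask):
    """Give each 4-connected group of class-1 pixels its own instance id."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                # Found an unlabeled object pixel: start a new instance
                # and spread its id to all connected object pixels.
                next_id += 1
                labels[r][c] = next_id
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id


def bounding_box(labels, instance_id):
    """Return (top, left, bottom, right) of one instance, i.e. its location."""
    coords = [(r, c) for r, row in enumerate(labels)
              for c, value in enumerate(row) if value == instance_id]
    ys = [r for r, _ in coords]
    xs = [c for _, c in coords]
    return min(ys), min(xs), max(ys), max(xs)


instance_labels, instance_count = label_instances(semantic_mask)
print(instance_count)                     # 2
print(bounding_box(instance_labels, 1))   # (0, 1, 1, 2)
print(bounding_box(instance_labels, 2))   # (2, 4, 3, 5)
```

Real systems learn these labels from data rather than from pixel connectivity; the sketch only shows what each task outputs.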
HOW DO COMPUTERS SEE IMAGES
1. PIXELS (PICTURE ELEMENTS): A pixel is the smallest element of an image on a computer display, such as a CRT, TFT or
LCD screen. A screen is made up of a matrix of thousands or millions of pixels. A pixel is represented by a dot or a
square on a computer screen. Each pixel has a value and a unique logical address.
Total Number of Pixels = Number of Rows x Number of Columns
2. RESOLUTION: The resolution of a computer screen depends on the graphics card and the display monitor, and on the
quantity, size and color combination of pixels. Pixels are usually round or square and are typically arranged in a
2-dimensional grid. The more pixels an image has, the more closely it resembles the original.
3. PIXEL VALUE: Each pixel has a value that describes its brightness. The value 0 means the absence of light, so 0 is
used to denote dark.
4. GRAYSCALE IMAGES: Grayscale images have a range of shades of gray without apparent color. The darkest possible
shade is black, which is the total absence of color, or a pixel value of 0. The lightest possible shade is white, which
is the total presence of color, or a pixel value of 255. Intermediate shades of gray are represented by equal
brightness levels of the three primary colors. In a grayscale image each pixel is 1 byte in size, and the image is a
single plane, that is, one 2D array of pixels. The size of a grayscale image is defined as the height x width of the image.
5. RGB IMAGES: Most of the images that we see around us are colored images. These images are made up of three primary
colors: Red, Green and Blue. All the colors that are present can be made by combining different intensities of red,
green and blue. Every RGB image is stored in the form of three different channels called the R channel, the G channel
and the B channel. Each channel is a separate plane of pixels, with each pixel value varying from 0 to 255. The three
planes combined together form a color image. This means that in an RGB image, each pixel has a set of three
values which together give color to that particular pixel.
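The points above can be tried out with plain Python lists. This is a minimal sketch with made-up pixel values: the variable names and the two tiny images are hypothetical, and the byte counts assume 1 byte (0 to 255) per value, as stated above for grayscale images. It shows the rows x columns formula, a grayscale image as a single 2D plane, and an RGB image split into its R, G and B planes.

```python
# Hypothetical 2x3 grayscale image: one value (0-255) per pixel.
# 0 is black (no light) and 255 is white.
gray_image = [
    [0, 128, 255],
    [64, 192, 32],
]

rows = len(gray_image)
cols = len(gray_image[0])
total_pixels = rows * cols      # Number of Rows x Number of Columns
gray_bytes = total_pixels * 1   # one byte per grayscale pixel

# Hypothetical RGB image of the same size: three values per pixel.
rgb_image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (255, 255, 255), (128, 128, 128)],
]

# Split the image into its three planes (channels).
r_channel = [[pixel[0] for pixel in row] for row in rgb_image]
g_channel = [[pixel[1] for pixel in row] for row in rgb_image]
b_channel = [[pixel[2] for pixel in row] for row in rgb_image]
rgb_bytes = total_pixels * 3    # assumes one byte per channel value


def is_shade_of_gray(pixel):
    """A pixel is a shade of gray when R, G and B are equally bright."""
    return pixel[0] == pixel[1] == pixel[2]


print(total_pixels)                         # 6
print(gray_bytes, rgb_bytes)                # 6 18
print(r_channel)                            # [[255, 0, 0], [0, 255, 128]]
print(is_shade_of_gray(rgb_image[1][2]))    # True
print(is_shade_of_gray(rgb_image[0][0]))    # False
```

The last two lines illustrate that (128, 128, 128) is a gray pixel because its three values are equal, while (255, 0, 0) is pure red.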
IMAGE FEATURES
Image features can be classified into three major types: color, texture and shape features.
Each type is associated with similarity metrics used to measure the similarity or distance
between the features of two images or image objects. A feature may also be a specific
structure in the image, such as points, edges or objects.
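As an illustration of a color feature and a similarity metric, the sketch below uses the average color of an image as its feature and the Euclidean distance between two such features as the metric. Both choices are assumptions made for this example (many other features and metrics exist), and the function names and the three tiny images are hypothetical. A smaller distance means the two images are more similar in color.

```python
import math


def mean_color(image):
    """Color feature: the average (R, G, B) value over all pixels."""
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)
    return tuple(sum(pixel[i] for pixel in pixels) / count for i in range(3))


def euclidean_distance(feature_a, feature_b):
    """Similarity metric: straight-line distance between two features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(feature_a, feature_b)))


# Three hypothetical 1x2 RGB images.
reddish = [[(200, 10, 10), (220, 30, 30)]]
also_reddish = [[(190, 20, 20), (210, 40, 20)]]
bluish = [[(10, 10, 200), (30, 30, 220)]]

red_vs_red = euclidean_distance(mean_color(reddish), mean_color(also_reddish))
red_vs_blue = euclidean_distance(mean_color(reddish), mean_color(bluish))

# The two reddish images are much closer to each other than to the bluish one.
print(red_vs_red < red_vs_blue)   # True
```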