Computer Vision

Computer Vision is a field of Artificial Intelligence that enables machines to interpret and analyze visual data through various applications such as facial recognition, self-driving cars, and medical imaging. Key tasks in Computer Vision include image classification, object detection, and instance segmentation, which help extract meaningful information from images. The document also discusses the basics of images, including pixels, resolution, and RGB images, as well as the use of Optical Character Recognition in applications like Google Translate.

Uploaded by

Abigail Aji George

COMPUTER VISION

Computer Vision:

The Computer Vision domain of Artificial Intelligence enables machines to see
through images or visual data, and to process and analyse them using algorithms
and methods in order to extract meaningful information from visual inputs.
Applications of Computer Vision:
1. Facial Recognition: With the arrival of smart cities and smart homes, Computer
Vision plays a vital role in making homes smarter. Security is its most important
application here: facial recognition enhances home security by recognising visitors
and maintaining visitor logs. It is also used in schools for automated attendance
systems based on facial recognition of students.

2. Face Filters: Modern apps like Instagram and Snapchat have many features
based on Computer Vision, and face filters are one of them. Through the camera,
the algorithm detects facial features and applies the fun or creative filter
selected by the user.

3. Google’s Search by Image: Most searches on Google’s search engine use text,
but it also offers an interesting feature: getting search results from an image.
This uses Computer Vision to analyse the features of the input image, compare
them with an extensive database of images and return the matches as the search
result.
4. Computer Vision in Retail: Retail is one of the fastest-growing fields and
uses Computer Vision to improve the customer experience. Retailers can use
Computer Vision techniques to track customers’ movements through stores, analyse
navigational routes and detect walking and shopping patterns. Inventory
management is another application: by analysing security camera images, a CV
algorithm can generate a very accurate estimate of the items available in the
store. It can also analyse the use of shelf space to identify poor arrangements
and suggest better item placement.

5. Self-Driving Cars: Computer Vision is the fundamental technology behind
autonomous vehicles. Most leading car manufacturers in the world are investing
in AI to develop on-road versions of hands-free driving technology. This
involves identifying objects, working out navigational routes and monitoring
the environment, all at the same time.

6. Medical Imaging: For decades, computer-supported medical imaging has been a
trustworthy aid for physicians. It not only creates and analyses images but
also acts as an assistant, helping doctors with their interpretation. The
application is used to read 2D scan images and convert them into interactive 3D
models that give medical professionals a detailed understanding of a patient’s
health condition.
7. Google Translate App: To read signs in a foreign language, simply point your
phone’s camera at the words and the Google Translate app will instantly tell
you what they mean in your preferred language.
This tool uses optical character recognition to interpret the text and augmented
reality to show the translation, making it a convenient use of Computer Vision.

Optical Character Recognition (OCR) is a technology that converts text within images,
such as printed or handwritten text, into machine-readable text. For example, it allows a
computer to "read" words from a scanned document, photo, or sign and process them as
editable or searchable text.
LO: Understand CV Tasks and basics of images.
Basics of Images:
• Basics of Pixels: The word “pixel” means picture element. Every photograph in digital form is made up of pixels,
the smallest units of information that make up a picture. Usually round or square, they are typically arranged
in a 2-dimensional grid (rows and columns forming a square or rectangle). In the image below, one portion has been
magnified many times over so that you can see its individual composition (structure) in pixels. As you can see, the
pixels approximate the actual image. The more pixels you have, the more closely the image resembles the original.

• Resolution: The number of pixels in an image is sometimes called the resolution. When the term describes pixel
count, one convention is to express resolution as width by height, for example a monitor resolution of
1280×1024. This means there are 1280 pixels from one side to the other and 1024 from top to bottom. Another
convention (protocol) is to express the number of pixels as a single number, like a 5-megapixel camera (a
megapixel is a million pixels). This means the pixels along the width multiplied by the pixels along the height of the
image taken by the camera equal about 5 million pixels. Our 1280×1024 monitor could also be expressed as
1280 × 1024 = 1,310,720 pixels, or 1.31 megapixels.
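
The resolution arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the function name `megapixels` is a hypothetical helper chosen for illustration and is not part of any library.

```python
def megapixels(width, height):
    """Return the pixel count of a width x height image in megapixels."""
    total_pixels = width * height        # pixels along the width times pixels along the height
    return total_pixels / 1_000_000      # one megapixel = one million pixels

total = 1280 * 1024
print(total)                             # 1310720
print(round(megapixels(1280, 1024), 2))  # 1.31
```
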
• Pixel value: Each pixel of an image stored inside a computer has a pixel value
which describes how bright that pixel is and/or what colour it should be. The most common pixel format is
the byte image, where this number is stored as an 8-bit integer, giving a range of possible values from 0
to 255. Typically, zero is taken to be no colour (black) and 255 is taken to be full colour (white).
Why is the maximum value 255? Computer data is stored in binary, which means everything is made up of
1s and 0s. Each bit in a computer system can be either a zero or a one. Since each pixel uses 1 byte,
which is equivalent to 8 bits, and each bit has two possible values, the 8 bits together can take 256
different values, which start at 0 and end at 255.

(A byte is a unit of data that has 8 bits, and each bit can either be 0 or 1. With 8 bits, you can represent 2⁸
= 256 values. These values range from 0 to 255 because we start counting from 0.)
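
The counting argument above can be reproduced directly. The sketch below assumes the 8-bit byte image described in the text; the variable names are illustrative only.

```python
BITS_PER_PIXEL = 8                   # one byte per pixel in a byte image

num_values = 2 ** BITS_PER_PIXEL     # each extra bit doubles the number of bit patterns
min_value = 0                        # all bits 0 -> 00000000
max_value = num_values - 1           # all bits 1 -> 11111111

print(num_values)                    # 256
print(min_value, max_value)          # 0 255
print(format(max_value, "08b"))      # 11111111
print(int("11111111", 2))            # 255
```

There are 256 values but the largest is 255, because counting starts at 0.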
• Grayscale Images
Grayscale images have a range of shades of gray without apparent colour. The darkest possible shade is
black, which is the total absence of colour, or a pixel value of 0. The lightest possible shade is white,
which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented
by equal brightness levels of the three primary colours. In a grayscale image each pixel is 1 byte in size,
and the image is a single plane: a 2D array of pixels. The size of a grayscale image is defined as the
Height x Width of that image. Let us look at an image to understand grayscale images.

Here is an example of a grayscale image. As you can check, the pixel values are within the range 0-255.
Computers store the images we see in the form of these numbers. (In digital imaging, computers store
images as grids of pixels, and each pixel is assigned a numerical value that defines its colour.)
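
To make the "grid of numbers" idea concrete, here is a small sketch of a grayscale image held as a 2D list in plain Python. The 3 x 4 image and its pixel values are made up for illustration; real images are usually loaded with a library rather than typed by hand.

```python
# A hypothetical 3 x 4 grayscale image: one number (0-255) per pixel.
gray_image = [
    [0,   64, 128, 255],   # row 0: black on the left, white on the right
    [32,  96, 160, 224],   # row 1: intermediate shades of gray
    [255, 128, 64,   0],   # row 2: white on the left, black on the right
]

height = len(gray_image)       # number of rows
width = len(gray_image[0])     # number of columns
size = height * width          # Height x Width, one byte per pixel

darkest = min(min(row) for row in gray_image)
brightest = max(max(row) for row in gray_image)

print(height, width, size)     # 3 4 12
print(darkest, brightest)      # 0 255
```
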
Computer Vision Tasks:
The various applications of Computer Vision are based on a number of tasks, which are performed
to get certain information from the input image. This information can be used directly for prediction
or form the base for further analysis. The tasks used in a computer vision application are:
• Classification: Image classification is the task of assigning an input image one label from a fixed set of
categories. This is one of the core problems in CV and, despite its simplicity, has a large variety of
practical applications.

• Classification + Localisation: This task involves both identifying what object is present in the image and
identifying where in the image that object is located. It is used only for single objects.

• Object Detection: Object detection is the process of finding instances of real-world objects such as faces,
bicycles, and buildings in images or videos. Object detection algorithms typically use extracted features and
learning algorithms to recognise instances of an object category. It is commonly used in applications such as
image retrieval and automated vehicle parking systems.

• Instance Segmentation: Instance segmentation is the process of detecting instances of objects, giving each
a category and then giving each pixel a label on that basis. A segmentation algorithm takes an
image as input and outputs a collection of regions (or segments). Ex: If there are three dogs in a picture,
instance segmentation can identify each dog separately and mark its exact shape.
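
One way to see how the four tasks differ is to compare the shape of what each one returns. The sketch below uses hand-written, hypothetical results: the labels, box coordinates and mask are invented for illustration, and no real model is run. Boxes are assumed to be (x, y, width, height) in pixels, and `pixels_per_instance` is a hypothetical helper.

```python
# Hypothetical outputs; every value below is invented for illustration.

# Classification: one label for the whole image.
classification = "dog"

# Classification + localisation: one label plus one bounding box,
# for an image that contains a single object.
# Box format assumed here: (x, y, width, height) in pixels.
localisation = {"label": "dog", "box": (0, 1, 2, 3)}

# Object detection: any number of (label, box) pairs, here three dogs.
detections = [
    {"label": "dog", "box": (0, 1, 2, 3)},
    {"label": "dog", "box": (3, 0, 2, 2)},
    {"label": "dog", "box": (5, 1, 2, 3)},
]

# Instance segmentation: a per-pixel map for a 4 x 7 image.
# 0 = background, 1/2/3 = first/second/third dog, so each dog keeps its own shape.
instance_mask = [
    [0, 0, 0, 2, 2, 0, 0],
    [1, 1, 0, 2, 2, 0, 3],
    [1, 1, 0, 0, 0, 3, 3],
    [0, 1, 0, 0, 0, 3, 3],
]

def pixels_per_instance(mask):
    """Count how many pixels belong to each instance id (background excluded)."""
    counts = {}
    for row in mask:
        for instance_id in row:
            if instance_id != 0:
                counts[instance_id] = counts.get(instance_id, 0) + 1
    return counts

print(len(detections))                                     # 3
print(sorted(pixels_per_instance(instance_mask).items()))  # [(1, 5), (2, 4), (3, 5)]
```
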
• RGB Images: Most of the images that we see around us are coloured images. These images are made up of
three primary colours: Red, Green and Blue. All the colours that are present can be made by combining
different intensities of red, green and blue.
How do computers store RGB images?

Every RGB image is stored in the form of three different channels (planes) called the R
channel, the G channel and the B channel. Each plane has the same number of pixels, with
each pixel value varying from 0 to 255. The three planes, when combined, form a colour
image. This means that in an RGB image each pixel has a set of three different values
which together give colour to that particular pixel.
As you can see, each colour image is stored in the form of three different channels,
each having a different intensity. The three channels combine to form the colour
we see. In the image given above, if we split the image into three different channels,
namely Red (R), Green (G) and Blue (B), the individual layers will show the intensity
of that colour at each individual pixel. When stored in memory, these individual layers
look like the image on the extreme right. The layers look like grayscale images because
each pixel has an intensity value from 0 to 255 and, as studied earlier, 0 is considered
black or no presence of colour and 255 means white or full presence of colour. These
three individual RGB values, when combined, form the colour of each pixel. Therefore,
each pixel in the RGB image has three values to form the complete colour.
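
The channel description above can be sketched in plain Python. The 2 x 2 image is made up for illustration, and `split_channels` and `merge_channels` are hypothetical helper names, not functions from any library.

```python
# A hypothetical 2 x 2 colour image: each pixel is an (R, G, B) triple, 0-255 each.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],       # red,  green
    [(0, 0, 255), (255, 255, 255)],   # blue, white
]

def split_channels(image):
    """Return three 2D planes (R, G, B), each holding one value per pixel."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

def merge_channels(red, green, blue):
    """Recombine three planes into one image of (R, G, B) pixels."""
    return [
        [(r, g, b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]

r_plane, g_plane, b_plane = split_channels(rgb_image)
print(r_plane)   # [[255, 0], [0, 255]]
print(g_plane)   # [[0, 255], [0, 255]]
print(b_plane)   # [[0, 0], [255, 255]]
print(merge_channels(r_plane, g_plane, b_plane) == rgb_image)  # True
```

Each plane on its own is just a grid of numbers from 0 to 255, which is why a single channel looks like a grayscale image when displayed.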
HW: Research and find the Python library that supports CV.

A: The most popular Python library for computer vision is OpenCV. It provides
many functions for image and video processing, object detection, and more.
