
Ai CV Notes

Computer vision is a branch of artificial intelligence that enables systems to extract and analyze information from visual inputs like images and videos. Its applications range from facial recognition and self-driving cars to medical imaging and augmented reality translation. Key tasks in computer vision include image classification, object detection, and understanding pixel-based image data.

Uploaded by

Sky

Chapter 4

Computer Vision

Computer Vision

Computer vision is a branch of artificial intelligence (AI) that enables computers
and systems to extract useful information from digital photos, videos, and other visual
inputs and to execute actions or make recommendations based on that information.

Computer vision is a branch of the AI domain that enables computers to
analyze meaningful information from images, videos, and other visual inputs.
Its role for a computer is similar to that of the human eye and visual system: it lets
a machine take in images or other visual data, then process and analyze them with
algorithms and methods in order to understand real-world phenomena.

Applications of Computer Vision


Computer vision as a concept was first introduced in the 1970s, and its potential
new uses generated a lot of excitement. In recent years, considerable technological
advances have moved computer vision to the top of many companies’
priority lists. Let’s examine a few of its applications:

Facial Recognition
Facial recognition is most frequently encountered on smartphones. It is a
technology that identifies and verifies a person by comparing visual input against
pre-defined data. It is often used for security and safety purposes.
For example, face unlock on devices and traffic cameras both use facial
recognition.
Face Filters
Modern social media apps such as Snapchat and Instagram use this kind of
technology: they extract facial landmarks and process them using AI so that the
filter fits the face.

Google’s Search by Image

To search by image, Google uses computer vision to extract and analyze different
features of the input image, compares them against its database of images, and then
returns the search results.

Self-Driving Cars


Computer Vision is the fundamental technology behind developing autonomous
vehicles. Most leading car manufacturers in the world are reaping the benefits of
investing in artificial intelligence for developing on-road versions of hands-free
technology.

For example, companies like Tesla are developing self-driving cars.

Medical Imaging
For decades, computer vision applications in medical imaging have been a
trusted aid for physicians. They create and analyze images and help
doctors with their interpretation. One use is reading 2D scan
images and converting them into interactive 3D models.
Google Translate App
To read signs written in a foreign language, all you have to do is point the
camera on your phone at the text, and the Google Translate software will almost
immediately translate them into the language of your choice. This is a useful
application of computer vision: it uses optical character recognition
to read the text in the image and augmented reality to overlay the translation.
Computer Vision Tasks
Computer vision applications are built from certain tasks performed on the data or
input provided by the user, so that the system can process and analyze the scene and
predict an outcome.

I. Single object
1) Image Classification
Image classification is the task of identifying the object in the input
image and assigning it a label from a set of predefined categories.
2) Classification + Localization
This task involves both identifying what object is present in the image and
identifying where in the image that object is located. It is used only for
single objects.
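
The difference between the two single-object tasks is easiest to see in what each one returns. The sketch below is only an illustration: the `BoundingBox` and `LocalizedObject` types, their field names, and the example values are assumptions made for this example and do not come from any particular library.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Hypothetical box in pixel coordinates: top-left corner plus size."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class LocalizedObject:
    """Hypothetical result of classification + localization."""
    label: str        # what the object is
    box: BoundingBox  # where it is in the image


# Image classification: a single label for the whole image.
classification_result = "cat"

# Classification + localization: the same label plus the object's location.
localization_result = LocalizedObject(
    label="cat",
    box=BoundingBox(x=40, y=25, width=120, height=90),
)

print(classification_result)
print(localization_result.label, localization_result.box)
```

Both results describe one object; localization simply adds the box that says where it is.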

II. Multiple object


1) Object detection
Object detection is the process of finding instances of real-world objects such
as faces, bicycles, and buildings in images or videos. Object detection algorithms
typically use extracted features and learning algorithms to recognize instances of an
object category. It is commonly used in applications such as image retrieval and
automated vehicle parking systems.
2) Instance segmentation

Instance segmentation is the process of detecting instances of objects,
assigning each one a category, and then labeling every pixel according to the
instance it belongs to. It is used for tasks such as counting the number of objects.
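
As a rough illustration of the counting use case, the sketch below assumes the segmentation result is available as a 2D grid of instance IDs, with 0 meaning background. That representation, the `count_instances` helper, and the sample grid are assumptions for this example; real systems may store masks differently.

```python
# Hypothetical instance mask: 0 is background, and every other number
# is the ID of one detected object instance.
instance_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
    [3, 3, 0, 0, 2],
]


def count_instances(mask):
    """Count distinct non-background instance IDs in a 2D mask."""
    ids = {value for row in mask for value in row if value != 0}
    return len(ids)


print(count_instances(instance_mask))  # 3
```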

Basics of Images
Pixels
The word "pixel" means a picture element.
● Pixels are the fundamental element of a photograph.
● They are the smallest unit of information that make up a picture.
● They are typically arranged in a 2-dimensional grid.
● In general, the more pixels an image has, the more closely it
resembles the original.
Resolution

● The number of pixels in an image is sometimes called the resolution.
● The area covered by the pixels is conventionally known as the resolution and is
written as width by height. For example, 1080 x 720 pixels is a resolution giving the
number of pixels across the width and the height of the picture.
● A megapixel is a million pixels.
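
The arithmetic behind these terms can be checked with a few lines of Python. This is a minimal sketch that uses the 1080 x 720 example from the notes; the variable names are chosen only for illustration.

```python
# Resolution from the notes' example: width x height in pixels.
width, height = 1080, 720

total_pixels = width * height          # pixels in the whole image
megapixels = total_pixels / 1_000_000  # a megapixel is a million pixels

print(total_pixels)          # 777600
print(round(megapixels, 2))  # 0.78
```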

Pixel value

• Pixel value represents the brightness of the pixel.

• The range of a pixel value is 0 to 255 (2^8 - 1), since a pixel value is commonly
stored in 8 bits, where 0 is taken as black (no color) and 255 is taken as white.
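
The 0 to 255 range follows from storing each value in 8 bits. The short sketch below shows the calculation and a hypothetical `clamp_pixel` helper that keeps a brightness value inside the valid range. The 8-bit assumption matches the notes but is not the only bit depth in use.

```python
BITS_PER_PIXEL = 8  # assumption: one byte per pixel value

levels = 2 ** BITS_PER_PIXEL  # number of distinct brightness levels
max_value = levels - 1        # largest value that fits in 8 bits


def clamp_pixel(value):
    """Force a brightness value into the valid 0..max_value range."""
    return max(0, min(max_value, value))


print(levels, max_value)  # 256 255
print(clamp_pixel(300))   # 255
print(clamp_pixel(-20))   # 0
```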

Grayscale Images
● Grayscale images are images which have a range of shades of gray without
apparent color.
● The lightest shade is white (full presence of color, value 255) and the darkest
shade is black (value 0).
● Intermediate shades of gray have equal brightness levels of the three primary
colors (RGB).
● Computers store the images we see in the form of these numbers.
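
To make the idea of storing an image as numbers concrete, the sketch below represents a tiny grayscale image as a 2D grid of brightness values. The 4 x 3 size and the values are made up for illustration, and a plain list of lists stands in for whatever array type a real program would use.

```python
# Hypothetical grayscale image, 4 pixels wide and 3 pixels high:
# each number is one pixel's brightness (0 = black, 255 = white).
gray_image = [
    [0,   64,  128, 255],
    [32,  96,  160, 224],
    [255, 192, 128, 0],
]

height = len(gray_image)
width = len(gray_image[0])
brightest = max(max(row) for row in gray_image)
darkest = min(min(row) for row in gray_image)

print(width, height)       # 4 3
print(darkest, brightest)  # 0 255
```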

RGB Images
● All colored images are made up of three primary colors: red, green, and
blue.
● All other colors are formed by combining these primary colors in different
proportions.
● A computer stores an RGB image in three separate channels, called the R channel,
the G channel, and the B channel.
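
The channel idea can be sketched by splitting a tiny color image into its three grids. Representing each pixel as an (R, G, B) tuple, the `split_channels` helper, and the sample pixel values are all assumptions for this illustration.

```python
# Hypothetical 2 x 2 color image; each pixel is an (R, G, B) triple.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]


def split_channels(image):
    """Separate an image of (R, G, B) pixels into three 2D channels."""
    r = [[pixel[0] for pixel in row] for row in image]
    g = [[pixel[1] for pixel in row] for row in image]
    b = [[pixel[2] for pixel in row] for row in image]
    return r, g, b


r_channel, g_channel, b_channel = split_channels(rgb_image)
print(r_channel)  # [[255, 0], [0, 128]]
print(g_channel)  # [[0, 255], [0, 128]]
print(b_channel)  # [[0, 0], [255, 128]]
```

The last pixel, (128, 128, 128), has equal values in all three channels, which is why it appears as a shade of gray.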
