
Adobe Scan 26 Nov 2024

This document provides an overview of computer vision, detailing its applications across various industries such as facial recognition, document verification, and autonomous vehicles. It explains the fundamental concepts of computer vision, including how computers process images and colors, and highlights the integration of machine learning to enhance these capabilities. Additionally, it discusses the impact of computer vision on sectors like retail, education, and digital marketing, emphasizing its role in improving efficiency and user experience.

Computer Vision

UNIT-5

Introduction to
Computer Vision
Learning Milestones
o Applications of Computer Vision
o Fundamentals of Computer Vision
o Computer Vision Tasks
o Image Features

At this stage, we have learnt and consolidated many concepts related to AI and Data Science. We know that for a machine, every piece of data is a numeric value. Be it text, strings, images, audio or video, for a machine it is all some sort of set or sequence of numbers. Machines do not and cannot perceive the visuals like we do. But with CV algorithms, machines can be trained with a lot of visual data to process lots of visuals at incredible speed and accuracy. Let us explore some major applications of Computer Vision aided with Machine Learning and Deep Learning.

Applications of Computer Vision


Now that we have a basic understanding of computer vision and the OpenCV library, we can easily figure out the possible applications of CV, especially when it is equipped with the techniques of Machine Learning. Let us have a look at some major applications of CV:
Facial Recognition: This is one of the most common and popular applications of CV. Facial recognition is one of the security features in handheld digital communication devices and computers to unlock the device. Facial recognition is also useful in law enforcement in identifying faces of suspects, missing persons and known criminals. Other uses of facial recognition are the ability of a smart home camera to distinguish between a friendly guest and a stranger, and tracking various types of visitors/employees/passengers arriving and leaving and then logging the details in the system as official digital records for later checking or audit.
Document Verification: All the industries involving extensive documentation need the accurate identification and validation of the documents as quickly as possible. Industries such as banking and finance, travel (passport, visa, verification documents), property and real estate, law and legal services (courts and panchayats) and vehicle registration are major areas where CV capabilities can be applied. This helps in fraud detection, authenticating individuals, verifying ownership of property, identifying counterfeit currency and documents etc. CV with ML can speed up the process of verification with enhanced accuracy and minimise frauds with quick application processing, leading to a happy lot of people.
Diagnostic Services: Devices and sensors equipped with embedded CV with ML or cloud-based CV-ML algorithms can help in faster, real-time and accurate diagnosis of the diseases and health conditions. Real-time monitoring even during operations, locating internal infections, tumours, cysts, brain haemorrhage, internal blood loss in victims, and body conditions like cardiovascular ailments etc. all involve scanned visuals. A trained CV algorithm can save a lot of time, bringing diagnosis down from days to a few hours and thus minimising the death rates.

Artificial Intelligence-X
Retail industry: Think of a smart cart in a retail super store. The cart's CV system keeps updating your invoice as you keep the items in your shopping cart. Its sensor can easily perform object identification of the items in the cart and match their details in the database. When you checkout, you just need to pay instead of waiting in a queue for billing of individual items.
CV can help in taking inventory of items in stock and items for reorders, and in tracking the FSN status of items (Fast-selling, Slow-selling, Not-selling items). Optimising storage in shelves, stacks and godowns can be aided by CV for better use of empty spaces. A deep learning algorithm can learn the buying habits of the regular buyers and, in their next visit, suggest to them where the new arrivals are located, thus minimising their movement and effort around a crowded and large area. This is helpful for elderly people and also optimises crowd flow in a shopping mall.
Quick checkout at the mall exit, locating misplaced items, checking shop-lifting, minimised billing errors, identifying regular buyers, and classifying buyers on their brand and item type preferences, age (parent accompanied by kids, bachelors, females: age-wise, males: age-wise) and day-specific habits (weekend buyers, monthly bulk buyers, Monday buyers for new arrivals, weekday buyers of saving schemes items etc.) are some major benefits to be harvested by the help of CV-ML algorithms.
Image-based search: E-Commerce stores rely on presenting the images and audio-visual demos of their products. Language-based search has barriers of variety of languages and vocabulary for items. But people identify items through their visuals irrespective of their location and language. Clicking a picture of an item and locating it on the internet is a common practice today. Associating it with face recognition can also help in finding people. Big players in E-Commerce like Amazon, Flipkart and the likes are already using it to enhance customer experience and also in their backend operations such as warehouse maintenance. Finding items, people and places can be highly efficient with CV-ML algorithms. They also help in product comparison for variety, finding similar products based on looks and make, classifying items on physical features and people for age-specific or gender-specific services, consolidating product catalogues etc.
Digital Audio-Visual Marketing: This is the new trending innovation in digital marketing in which computer vision helps users to create realistic visual content for intended audiences. Creating 3D avatars of models, cartoon caricatures of celebrities and public figures, using augmented reality for product demos, game-based marketing for youth and kids, and movie and book promos are some potential areas where giants like Amazon and Netflix have already begun the use of CV-ML.
Augmented Reality
The term augmented reality refers to the interactive environment created by a computer application by adding digital elements to a live video feed or image. Augmenting means to "add to". It is different from virtual reality, in which an entire interactive 3D world is created that shuts out the user from the real world. Augmented reality is an interaction between the live video view and the elements added to it using digital technology.

Education and training: Learning based on augmented reality and virtual reality needs CV algorithms to convert all the static content into interactive immersive content for learning instead of developing content from scratch. Generating a 3D visual from a 2D image, assignments and answer sheets verification, online class attendance, virtual tours and excursions can be enriched by CV applications.
CV helps those learners who learn by watching. Learning through images and visuals can be made more intuitive and effective with CV applications.
Autonomous vehicles, routing and parking assistance: Self-driven cars, driverless trucks, pilotless planes and hands-free driving assistance are some examples of CV application. Vehicle control and driving involves audio-visual data and its processing. An autonomous vehicle can be a small carriage or cart in a warehouse or a car on the road. They involve Computer Vision to process real-time, rapid visuals. Imagine a car moving at 80 kmph with no driver! The algorithms controlling the car must be flawless, robust (crashproof) and reasonably fast in processing speed so that instant control actions could be performed, such as a sudden brake and pullover on a steep curve to avoid collision with an approaching vehicle from the opposite side. CV helps in adding all intelligence to the autonomous vehicles. Such vehicles are able to quickly identify still and moving obstacles, road signs, gestures, signals from other vehicles, correct route, road conditions, local traffic rules, distance from other objects and vehicles, weather conditions, light conditions, barriers and diversions. CV assists in locating a vacant parking slot and parking the vehicle in it. Google's driverless car Waymo is one such autonomous vehicle. The technology has its applications in freight trucks, industrial in-house vehicles, hands-free assistance, auto-pilot planes and drones, combat vehicles and space shuttles etc.
Street View
A pioneer innovation by Google, street view is generated by stitching the street images together in the form of an interactive panorama. It is a part of the Google Maps and Google Earth applications. Google did it painstakingly by car and boats with sophisticated digital photography.
Character, Sign and Symbol Recognitions: ML enabled CV applications can recognise various signs, icons and optical codes. This helps in visual communication like understanding traffic and road signs by a self-driven vehicle. Translation applications can translate scanned text in one language to many other languages. Google Translate is one such application. Extracting text from images by pixel scanning is helpful in digital documentation and correction. Associating augmented reality with such applications can help in making efficient educational and information systems for all classes and especially for physically challenged and elderly people.
Smart Homes and Cities: As we learnt earlier too, the dream of sustainable smart homes and cities can be achieved by the great help from CV-ML applications. Public services, domestic services, home security and upkeep, hospitals, offices, schools, traffic, energy consumption etc. can use CV applications to enhance and enrich the quality of living without risking the environment and future generations. Efficient and disciplined use of technology can bring the society together to serve the nation in a better way while enjoying a value-added, happy and healthy life with no inequality among masses.

Fundamentals of Computer Vision

Computer Vision is the domain of artificial intelligence that primarily deals with the processing of images for a variety of purposes. Videos are nothing but a series of hundreds of still image frames. Computers have been processing images for a long time, even before the idea of AI had caught up. So, the basic techniques involved in processing an image are already in place. The efforts are in tweaking these techniques and bringing them together with machine learning models to achieve smart algorithms for computer vision. So, the primary goal of the computer vision domain is to interpret the visual captured in the form of an image or a single frame of a video by a digital device. In the real world, images can be as ordinary as a still captured by a camera, an X-ray shot, an image captured by a radar or satellite, a thermal image generated by an object's temperature or an image rendered by the help of a magnetic field (Magnetic Resonance Imaging or MRI scan).
How Computers Understand Images?
As we discussed earlier, for computers, every form of data is an arrangement of numbers. Computers are least bothered if you look good in your closeup photo or if you should have a different stance while posing for the picture. That is for your friends to suggest. For a computer it is just an array of pixels.
The term pixel is the combination of two terms - picture element. A pixel is the smallest unit that constitutes an image. An image is a dense array (arrangement) of hundreds or thousands of pixels. They are microscopic in size and together form the smooth, final visual of the scene which our brain perceives as a single image. If we zoom into an image to magnify it several hundred times, it explodes into the set of pixels that constitute it.
If a computer is able to store the information of each pixel of an image then that computer is able to "see" that image.

But seeing is not processing. That is a later part. Let us understand what a pixel is!
To understand pixel, you need to understand the following two concepts:
o Basics of digital colours
o How computers store numbers?
Digital Colour Basics: The 3 primary digital colours are Red, Green and Blue (RGB). Digital devices like computers, digital cameras etc. use the RGB colour model. A variety of colours is produced by the combination and intensity of these three basic colours. The maximum intensity of these colours combined together generates white colour and minimum or zero intensity of all three colours combined gives out black colour (absence of all the colours). If varying intensities of these basic colours are mixed then other colours appear. For example, half the intensity of all three will generate a shade of grey.

Combining red and green will generate yellow colour, red and blue will give magenta and green and blue will give cyan.
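The mixing rules above can be tried out with a few lines of Python. This is only a minimal sketch: the helper name `mix` and the choice to cap each channel at 255 are assumptions made for illustration, not something the chapter prescribes.

```python
# Each colour is an (R, G, B) tuple; every intensity runs from 0 to 255.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)


def mix(*colours):
    """Additively mix colours channel by channel, capping each channel at 255.

    'mix' is a hypothetical helper written for this illustration.
    """
    return tuple(min(255, sum(colour[i] for colour in colours)) for i in range(3))


yellow = mix(RED, GREEN)        # red + green
magenta = mix(RED, BLUE)        # red + blue
cyan = mix(GREEN, BLUE)         # green + blue
white = mix(RED, GREEN, BLUE)   # all three at maximum intensity
black = (0, 0, 0)               # all three at zero intensity
grey = tuple(value // 2 for value in white)  # half intensity of all three

print(yellow, magenta, cyan, white, black, grey)
```

Changing the input tuples to intermediate values such as (200, 100, 0) shows how other colours appear when the intensities are varied.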
ACTIVITY: Colour Basics
Download Just Color Picker from https://annystudio.com/software/colorpicker/ and try various intensities of basic colours between 0 and 255 and observe how colours are generated by mixing RGB colours.
Alternatively, you can go to leonardocolor.io and try mixing the colours in the Key Colors drop-down.
[Screenshots: Just Color Picker and Leonardo Color, showing a colour with R: 255, G: 255, B: 0]
Note
There are other colour models also such as CMYK (Cyan, Magenta, Yellow, blacK) which is used by printers to mix printable
colours, Hue-Saturation-Value (HSV) etc. but basic concept of each model is to mix primary colours in varying intensity to
generate secondary colours.

How computers store numbers? You have seen that colour representation is just a combination of numbers. But why is 255 the maximum intensity in the RGB model? That is because in the RGB model, the computer uses 1 byte to represent each of the three colours and if you turn on all the 8 bits in a byte it will represent the value 255. All 8 bits on will be represented by 11111111. The decimal equivalent of 11111111 is 255.
In a byte, the position of bits begins (right to left) from 0. The rightmost bit is at position 0 and the 8th bit (leftmost) is at position 7. Take these positions as the powers of 2 and multiply them with the values of each bit i.e. 1. Add up all the products and you will get 255 as shown here.

1 X 2^0 = 1
1 X 2^1 = 2
1 X 2^2 = 4
1 X 2^3 = 8
1 X 2^4 = 16
1 X 2^5 = 32
1 X 2^6 = 64
1 X 2^7 = 128
Total = 255

So, if all the bits in the byte meant for red colour are set to one then it is the brightest red colour. Similarly, each basic colour is represented by 1 byte each. Thus, for a coloured image, each pixel has 3 basic colours - R, G and B - and each pixel has 3 bytes: one for R, one for G and one for B. For example, if you see an image of a red rectangle on the screen, that means each pixel in the image has the byte for red colour as 11111111 and the bytes for green and blue as 00000000. If you see a yellow oval then all the pixels that make the oval have the bytes for red colour and green colour as 11111111 and that of blue as 00000000. If the bytes have some bits on and some bits off then their combination will give out other colours.
It's a simple workout: If a coloured image is made of an array of 50 X 50 pixels then how many bytes would it occupy?
50 X 50 X 3 = 7500 bytes i.e. 7500 / 1024 = approximately 7.3 Kilo Bytes. As the density of the image is increased, the image gets clearer and the image file size gets larger.
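Both calculations, the value of a byte with all bits on and the size of a 50 X 50 coloured image, can be checked with a short Python sketch. The function name `image_size_bytes` is a hypothetical helper introduced here, and bit positions are counted from the rightmost bit (position 0), which is the usual convention.

```python
bits = "11111111"  # one byte with all 8 bits turned on

# Multiply each bit by 2 raised to its position (rightmost bit = position 0)
# and add up the products.
value = sum(int(bit) * 2 ** position
            for position, bit in enumerate(reversed(bits)))

# Python's built-in int() with base 2 gives the same answer.
same_value = int(bits, 2)


def image_size_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of an image: 3 bytes per pixel for RGB (hypothetical helper)."""
    return width * height * bytes_per_pixel


size = image_size_bytes(50, 50)   # 50 X 50 X 3
size_kb = size / 1024             # bytes to kilobytes

print(value, same_value, size, round(size_kb, 2))
```

Note that this is the raw, uncompressed size; saved image files are usually compressed, so their size on disk can differ.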
Resolution
We just learnt that the higher the number of pixels that constitute the image, the sharper and clearer the picture would be. This is called image resolution. Image resolution is the pixel count in the image. A pixel is approximately 1/96th part of an inch i.e. 0.2646 mm. Image resolution is generally specified as an array of pixels across width and height, for example 1561 X 747 means 1561 pixels across width and 747 pixels across height. If a digital device comes with the display specification of, for example, 1920 X 1080 then it means the device can display graphics of up to this resolution clearly. A digital device may have different resolutions for images and videos. A 1561 X 747 resolution equals 11,66,067 pixels. In terms of megapixels, 1 megapixel is approximately 10 lakh or 1 million pixels (10,48,576 when counted in powers of 2). So, 11,66,067 pixels is approximately 1.17 megapixels.
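The resolution arithmetic can be verified with a small Python sketch. The helper name `pixel_count` is hypothetical, and the sketch assumes 1 megapixel is taken as 1 million pixels and 1 inch as 25.4 mm.

```python
def pixel_count(width, height):
    """Total pixels in an image of the given resolution (hypothetical helper)."""
    return width * height


pixels = pixel_count(1561, 747)      # pixels across width X pixels across height
megapixels = pixels / 1_000_000      # assuming 1 megapixel = 1 million pixels

# Size of one pixel when 96 pixels fit in one inch (1 inch = 25.4 mm).
pixel_mm = 25.4 / 96

print(pixels, round(megapixels, 2), round(pixel_mm, 4))
```

A display specified as 1920 X 1080 works out the same way: `pixel_count(1920, 1080)` gives 2,073,600 pixels, or about 2 megapixels.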
Coloured Images
We have already learnt how the RGB details of each pixel together make a coloured image. How is this detail stored in the computer's memory when an image is opened on a computer? To understand this, think of three transparent separate planes - one for each of the primary colours. These planes are called channels. So, there is a red channel, a green channel and a blue channel. At memory level, a coloured image is represented by a 3-Dimensional array - one dimension is the 3 channels and, in each channel, there is height and width. The image here shows a conceptual representation of 3 channels with bytes.

[Figure: conceptual representation of the three colour channels as grids of byte values]

Here, image of Rose is shown with its 3 channels.
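The idea of three channels can be sketched in plain Python using nested lists. This is an illustration under stated assumptions: the helpers `make_image` and `split_channels` are hypothetical, and the image is laid out as height X width X 3 values, from which the three planes are then separated. Real libraries may arrange the dimensions differently.

```python
def make_image(height, width, colour):
    """Build a height X width image where every pixel is the same (R, G, B) colour."""
    return [[list(colour) for _ in range(width)] for _ in range(height)]


def split_channels(image):
    """Separate an image into its red, green and blue planes (each height X width)."""
    return tuple(
        [[pixel[channel] for pixel in row] for row in image]
        for channel in range(3)
    )


image = make_image(2, 3, (255, 255, 0))   # a tiny 2 X 3 yellow image
red, green, blue = split_channels(image)

print(red)    # every value is 255
print(green)  # every value is 255
print(blue)   # every value is 0
```

Each of the three planes has the same height and width as the image, which is why a coloured image needs three bytes per pixel.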

Grayscale images
Earlier we learnt that varying intensity levels of R, G and B together make other colours. If equal intermediate intensities of R, G and B are mixed together then some shade of grey colour is generated. The lightest shade is white and the darkest shade is black. Images with such shades are called grayscale images. In a grayscale image, for each pixel, only 1 byte is enough. That means a grayscale image does not have separate colour channels. There is only a 2D array which stores pixel information across height and width. So, grayscale images are smaller in size than their coloured versions. X-rays are grayscale images of a scan. Here are shown two images - coloured and grayscale - with similar resolution, but the grayscale image is smaller in file size.
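A rough way to see why a grayscale image is smaller is to convert a tiny coloured image by hand. The sketch below assumes the simplest possible rule, averaging the R, G and B values of each pixel; image tools often use weighted formulas instead, so treat the helper `to_grayscale` as a hypothetical illustration rather than the method used by any particular tool.

```python
def to_grayscale(image):
    """Turn a height X width grid of (R, G, B) pixels into a 2D grid of grey values.

    Uses a plain average of the three channels (an assumption for illustration).
    """
    return [[sum(pixel) // 3 for pixel in row] for row in image]


colour_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
gray_image = to_grayscale(colour_image)

height, width = len(colour_image), len(colour_image[0])
colour_bytes = height * width * 3   # 3 bytes per pixel
gray_bytes = height * width * 1     # 1 byte per pixel

print(gray_image, colour_bytes, gray_bytes)
```

The grayscale version needs one third of the raw storage because each pixel keeps a single byte instead of three.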



The pixelated magnification of one tiny part of the grayscale image Rose.png is shown here. Notice the different shades of grey colour which together form the image.
ACTIVITY: Experience Grayscale Images
Go to onlinepngtools.com and try converting any coloured PNG image into grayscale. You can also convert any image into PNG by using the Save As... command in any simple image tool such as Paint. Then, download the grayscale image and compare the sizes of both the images. Discuss your observation with the teacher.
[Screenshot: onlinepngtools.com PNG to grayscale converter showing the original PNG and the grayscaled PNG]

Computer Vision Tasks


While understanding the basics of image processing, we learnt how an image is understood by computers and what properties of images they utilise to process the images. We also learnt that graphics data is translated pixel by pixel into its binary equivalent and stored in the computer's memory. Image processing applications such as Photoshop, GIMP etc. apply several computational methods to modify, enhance, transform and change the images but they are not equipped with machine learning intelligence to extract image features in a way that is applicable in the field of computer vision. But the concepts behind image processing make the primary foundation of image handling. Utilising the information from the image is then the task of machine learning. Let us now understand the standard tasks which a machine or computer can be used to perform on an image in the Computer Vision domain.
Classification
This is the simplest and most common task related to images. Classification means identifying the object in an image as a whole and distinguishing it from other similar looking images. For example, identifying one make of a car from others or identifying one face from another.
The information gathered from the pixels of the images is used to correlate the values to compare and classify the images.
Classification and Localisation
This involves identifying a single object in an image and also figuring out where it is. Like in the image shown here, the machine should identify where a particular player is in the image. This process involves defining the boundary of that single object.
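Defining the boundary of an object is often expressed as a bounding rectangle. The sketch below assumes that an earlier step has already marked the object's pixels with 1 in a grid (a "mask"); the helper `bounding_box` and the sample mask are hypothetical and only show how the rectangle follows from the marked pixels.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of the cells marked 1, or None if there are none."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


# 1 marks pixels that belong to the object, 0 marks the background.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
print(box)
```

Here the rectangle spans rows 1 to 2 and columns 1 to 3, the smallest box that contains every marked pixel.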



Object Detection
When all the objects are localised in an image by an ML algorithm then it is called object detection. In the example of two football players, the ML algorithm should distinctively mark the boundaries of both the players and, if the ball is visible, then the boundary of the ball too. Object detection is a complex task especially when the machine has to discover which of the objects in the image are relevant to the project.
Instance Segmentation
This is an extension of object detection. Instance is one or more occurrences of an object of a class in an image. For example, in the image shown, there are 12 instances of class egg.
But, if the algorithm tries to distinguish the eggs on the basis of their colour shades also then the algorithm will first classify all 12 eggs as egg and then it will label them according to the colour - pale red, pale orange, yellow, pale green, pale blue and dark blue (See the coloured pairs of bounding rectangles).

Image Features
A specific component or structure in an image which is uniquely specific in itself is called an image feature. The feature that makes the component specific may be colour or shape or both. Finding image features is an important part of the Computer Vision domain. It gives important information useful in processing the image through various computations in the desired way. Some common image features are points, edges, corners, blobs and ridges.
Three major goals of extracting image features are:
o Overlook unnecessary regions in the image.
o Interpret and describe the scene captured in the image.
o Identify the relationship between different objects in the image.
Common image features are discussed here:
Edges
An edge generally denotes the abrupt or sharp change in the intensity of a
range of pixels and helps in identifying the boundary of the two objects or
regions. Edges hold the information that helps in determining the shape of
an object in the image. Edges can be as simple as horizontal, vertical or
diagonal or as complex as the outline of a tree, cloud or a river traversing
through a landscape. Edges are spread along a line or boundary where
pixels are similar in property and beyond that boundary, there is distinct,
sharp change in the pixel property (colour, intensity). Determination of
exact location of edge is an example of localisation. Think of a satellite image of a river. The rough edges along either side of the river would be the same but there will be a sharp difference in the property of pixels representing the edge of the river bank and the water. This sharp difference will help interpret the width and stretch of the river and distinguish it from the rest of the image. This way, edges help object identification and segmentation.
The ML algorithm tries to classify other similar pixels along the pixels identified and, by following their similar feature, maps the shape of the edge. In the image shown here, the dark waterway is easy to identify and whatever is left out is the part of land.
Edge detection: To detect the edge, when you move the inspection window, the change in the intensity of the pixels will occur only in one direction, not the other. Notice in the figure, if you move the window up or down, no change in pixels would occur until it reaches one of the edges up or down. But, if you move the window left-ward, the change in pixel intensity will give a hint of an edge (boundary between gray and white pixels).
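The idea of moving an inspection window and watching for an intensity change in one direction only can be sketched in Python. The function `jumps`, the threshold of 50 and the sample grid are assumptions chosen for illustration; practical edge detectors use more refined filters.

```python
def jumps(gray, d_row, d_col, threshold=50):
    """List (row, col) cells whose neighbour at (row + d_row, col + d_col)
    differs in intensity by at least the threshold (hypothetical helper)."""
    found = []
    for r in range(len(gray)):
        for c in range(len(gray[0])):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < len(gray) and 0 <= nc < len(gray[0]):
                if abs(gray[nr][nc] - gray[r][c]) >= threshold:
                    found.append((r, c))
    return found


# White pixels (255) on the left, gray pixels (128) on the right:
# a vertical boundary between columns 1 and 2.
gray = [
    [255, 255, 128, 128],
    [255, 255, 128, 128],
    [255, 255, 128, 128],
]

sideways = jumps(gray, 0, 1)   # compare each pixel with its right-hand neighbour
up_down = jumps(gray, 1, 0)    # compare each pixel with the one below it

print(sideways)  # the boundary shows up in every row
print(up_down)   # no change when moving up or down
```

A change that appears when moving sideways but not when moving up or down is the hint of an edge described above.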

Corners
Corners are another unique part of an image. As compared to edges, corners are easier to detect since the maximum change in the intensity of pixels occurs at the corners. The reason is that a corner joins at least two edges and hence the change in the intensity of the pixels is higher in corners as compared to edges.
Corner detection: If you move the pixel window and if the region is a corner then there will be two adjacent edges and the intensity change would be more as compared to one edge.
Blobs
A blob is a flat region in an image with zero or negligible intensity change in the pixels. Blobs usually make the major part of an image. As long as a blob is detected the inspection of the pixels can continue until a considerable change in the intensity is detected, which could mean either a corner or an edge.
Blob detection: In the example shown here, moving the inspection window a little distance up or down will detect no intensity change since the majority or all pixels are similar gray.
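The three features can be told apart in a toy way by checking in how many directions the intensity changes inside a small window: none for a blob, one for an edge, two for a corner. The sketch below is a simplification under stated assumptions; `classify_window` and the threshold of 50 are hypothetical, and a diagonal edge would also change in both directions, so real corner detectors use more careful measures.

```python
def classify_window(window, threshold=50):
    """Label a small grayscale window as 'blob', 'edge' or 'corner' (toy rule)."""
    rows, cols = len(window), len(window[0])
    # Does the intensity jump between side-by-side pixels anywhere?
    horizontal = any(
        abs(window[r][c + 1] - window[r][c]) >= threshold
        for r in range(rows) for c in range(cols - 1)
    )
    # Does the intensity jump between a pixel and the one below it anywhere?
    vertical = any(
        abs(window[r + 1][c] - window[r][c]) >= threshold
        for r in range(rows - 1) for c in range(cols)
    )
    if horizontal and vertical:
        return "corner"
    if horizontal or vertical:
        return "edge"
    return "blob"


flat = [[128, 128, 128], [128, 128, 128], [128, 128, 128]]
edge = [[255, 128, 128], [255, 128, 128], [255, 128, 128]]
corner = [[255, 255, 255], [255, 128, 128], [255, 128, 128]]

print(classify_window(flat), classify_window(edge), classify_window(corner))
```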
ACTIVITY: Detecting fish in the ocean
The image here shows a fish in the water. Its magnification is given, which is superimposed by a grid of numbers. Let us assume that these are the pixel locations of the fish in the entire image. Can you list or shade the pixels that represent the edges (outline of the fish)?

[Figure: magnified image of a fish overlaid with a grid of pixel numbers from 1 to 528, arranged in 16 rows and numbered down each column]
