Adobe Scan 26 Nov 2024
UNIT-5
Introduction to
Computer Vision
Learning Milestones
o Computer Vision Tasks
o Applications of Computer Vision
o Image Features
o Fundamentals of Computer Vision
At this stage, we have learnt and consolidated many concepts related to AI and Data Science. We know that for a
machine, every piece of data is a numeric value. Be it text, strings, images, audio or video, for a machine it is all
some sort of set or sequence of lots of numbers. Machines do not and cannot perceive the visuals like we do. But
algorithms can be trained with a lot of visual data to process visuals at incredible speed and accuracy. Let us explore
some major applications of Computer Vision aided with Machine Learning and Deep Learning.
Artificial Intelligence-X
CV can help in faster and accurate diagnosis of diseases and health conditions, in real-time monitoring, and even in
locating victims in real time during operations. Diagnosis of tumours, cysts, brain haemorrhage, internal blood loss,
infections and internal body conditions like cardiovascular ailments all involve scanned visuals. A trained CV
algorithm can save a lot of time, from days to a few hours, thus minimising the death rates.
Retail industry: Think of a smart cart in a retail super store. The cart's CV system keeps updating your invoice as you
keep the items in your shopping cart. Its sensor can easily perform object identification of the items in the cart and
match their details in the database. When you check out, you just need to pay instead of waiting in a queue for billing
of individual items.
CV can help in taking inventory of items in stock and items for reorders, and in tracking the FSN status of items
(Fast-selling, Slow-selling, Not-selling items). Optimising storage in shelves, stacks and godowns can be aided by CV
for better use of empty spaces. A deep learning algorithm can learn the buying habits of regular buyers and, on their
next visit, suggest where the new arrivals are located, thus minimising their movement and effort around a crowded and
large area. This is helpful for elderly people and also optimises crowd flow in a shopping mall.
Quick checkout at the mall exit, minimised billing errors, locating misplaced items, checking shop-lifting, identifying
regular buyers, and classifying buyers on their brand and item type preferences, age (parent accompanied by kids,
bachelors, females: age-wise, males: age-wise) and day-specific habits (weekend buyers, monthly bulk buyers, Monday
buyers for new arrivals, weekday buyers of saving schemes items etc.) are some major benefits to be harvested with the
help of CV-ML algorithms.
Image-based search: E-Commerce stores rely on presenting the images and audio-visual demos of their products.
Language-based search has barriers of variety of languages and vocabulary for items. But people identify items through
their visuals irrespective of their location and language. Clicking a picture of an item and locating it on the
internet is a common practice today. Associating it with face recognition can also help in finding people. Big players
in E-Commerce like Amazon, Flipkart and the likes are already using it to enhance customer experience and also in their
backend operations such as warehouse maintenance. Finding items, people and places can be highly efficient with CV-ML
algorithms. They also help in product comparison for variety, finding similar products based on looks and make,
classifying items on physical features and people for age-specific or gender-specific services, consolidating product
catalogues etc.
Digital Audio-Visual Marketing: This is the new trending innovation in digital marketing in which computer vision
helps users to create realistic visual content for intended audiences. Creating 3D avatars of models, cartoon caricatures of
celebrities and public figures, using augmented reality for product demos, game-based marketing for youth and kids,
and movie and book promos are some potential areas where giants like Amazon and Netflix have already begun the use of CV-ML.
Augmented Reality
The term augmented reality refers to the interactive environment created by a computer
application by adding digital elements to a live video feed or image. Augmenting means to "add
to". It is different from virtual reality, in which an entire interactive 3D world is created that shuts out
the user from the real world. Augmented reality is an interaction between live video views using
digital technology.
Education and training: Learning based on augmented reality and virtual reality needs CV algorithms to convert all the
static content into interactive immersive content for learning instead of developing content from scratch. Generating 3D
visuals from a 2D image, verification of assignments and answer sheets, online class attendance, virtual tours and
excursions can be enriched by CV applications.
CV helps those learners who learn by watching. Learning through images and visuals can be made more intuitive
and effective with CV applications.
Autonomous vehicles, routing and parking assistance: Self-driven cars, driverless trucks, pilotless planes and
hands-free driving assistance are some examples of CV application. Vehicle control and driving involves audio-visual data
and its processing. An autonomous vehicle can be a small carriage or cart in a warehouse or a car on the road. They involve
Computer Vision to process real-time, rapid visuals. Imagine a car moving at 80 kmph with no driver! The algorithms
But seeing is not processing. That is a later part. Let us understand what a pixel is!
To understand a pixel, you need to understand the following two concepts:
o Basics of digital colours
o How computers store numbers?
Digital Colour Basics: The 3 primary digital colours are Red, Green and Blue (RGB). Digital devices like computers,
digital cameras etc. use the RGB colour model. A variety of colours are produced by the combination and intensity of
these three basic colours. The maximum intensity of these colours combined together generates white colour, and minimum
or zero intensity of all three combined gives out black colour (absence of all the colours). If varying intensities of
these basic colours are mixed, then other colours appear. For example, half the intensity of all three will generate a
shade of grey. Combining red and green will generate yellow colour, red and blue will give magenta, and green and blue
will give cyan.
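The mixing rules described here can be tried out in a few lines of Python. This is only a minimal sketch: the helper names `mix` and `scale` are made up for illustration, and each colour is assumed to be a simple (R, G, B) tuple with values from 0 to 255.

```python
# Each colour is an (R, G, B) tuple; every channel ranges from 0 to 255.
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)


def mix(*colours):
    """Additively mix colours channel by channel, capping each channel at 255.

    Hypothetical helper written for this example only.
    """
    return tuple(min(255, sum(channel)) for channel in zip(*colours))


def scale(colour, fraction):
    """Reduce a colour's intensity to the given fraction (0.0 to 1.0)."""
    return tuple(int(value * fraction) for value in colour)


white = mix(RED, GREEN, BLUE)   # maximum intensity of all three colours
yellow = mix(RED, GREEN)
magenta = mix(RED, BLUE)
cyan = mix(GREEN, BLUE)
grey = scale(white, 0.5)        # half the intensity of all three colours
black = scale(white, 0.0)       # zero intensity of all three colours

for name, colour in [("white", white), ("yellow", yellow), ("magenta", magenta),
                     ("cyan", cyan), ("grey", grey), ("black", black)]:
    print(name, colour)
```

Note that half of 255 is 127.5, so the sketch rounds the grey channels down to 127.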
ACTIVITY: Colour Basics
Download Just Color Picker from https://annystudio.com/software/colorpicker/ and try various intensities of
basic colours between 0 and 255 and observe how colours
are generated by mixing RGB colours.
Alternatively, you can go to leonardocolor.io and try mixing
the colours in the Key Colors drop-down.
Note
There are other colour models also, such as CMYK (Cyan, Magenta, Yellow, blacK), which is used by printers to mix printable
colours, and Hue-Saturation-Value (HSV) etc., but the basic concept of each model is to describe a colour with a small
set of numbers whose varying values generate all the other colours.
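To see how the same colour looks in different models, here is a small, hedged sketch. The RGB to HSV step uses `colorsys.rgb_to_hsv` from Python's standard library. The RGB to CMYK step uses a simple arithmetic formula that ignores real printer colour profiles, so treat it as an approximation only. Both function names are made up for this example.

```python
import colorsys


def rgb_to_hsv_readable(r, g, b):
    """Convert 0-255 RGB to HSV as (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)


def rgb_to_cmyk_naive(r, g, b):
    """Approximate 0-255 RGB as CMYK fractions (0.0 to 1.0).

    Assumption: a naive textbook formula, not what a real printer driver does.
    """
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0          # pure black uses only the K ink
    r_, g_, b_ = r / 255, g / 255, b / 255
    k = 1 - max(r_, g_, b_)
    c = (1 - r_ - k) / (1 - k)
    m = (1 - g_ - k) / (1 - k)
    y = (1 - b_ - k) / (1 - k)
    return c, m, y, k


yellow_hsv = rgb_to_hsv_readable(255, 255, 0)
yellow_cmyk = rgb_to_cmyk_naive(255, 255, 0)
print("Yellow in HSV :", yellow_hsv)
print("Yellow in CMYK:", yellow_cmyk)
```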
How computers store numbers? You have seen that colour representation is just a combination of numbers. But why is 255
the maximum intensity in the RGB model? That is because in the RGB model, the computer uses 1 byte to represent each of
the three colours, and if you turn on all the 8 bits in a byte it will represent the value 255. All 8 bits on will be
represented by 11111111. The decimal equivalent of 11111111 is 255.
In a byte, the position of bits begins (right to left) from 0. The rightmost bit is at position 0 and the 8th bit is at
position 7. Take these positions as the powers of 2 and multiply them with the value of each bit, i.e. 1. Add up all the
products and you will get 255 as shown here.
1 x 2^0 =   1
1 x 2^1 =   2
1 x 2^2 =   4
1 x 2^3 =   8
1 x 2^4 =  16
1 x 2^5 =  32
1 x 2^6 =  64
1 x 2^7 = 128
Total   = 255
So, if all the bits in the byte meant for red colour are set to one then it is the brightest red colour. Similarly, each
basic colour is represented by 1 byte each. Thus, for a coloured image, each pixel has 3 basic colours - R, G and B - and
each pixel takes 3 bytes.
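The addition of powers of 2 can be checked with a short Python sketch. It assumes the usual convention that bit positions are counted from the rightmost bit, starting at 0. The function name `bits_to_decimal` is made up for this example.

```python
def bits_to_decimal(bits):
    """Convert a string of bits such as '11111111' to its decimal value.

    Positions are counted from the rightmost bit, starting at 0, and each
    bit is multiplied by 2 raised to its position.
    """
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


# The eight products for a byte with every bit switched on.
products = [1 * 2 ** position for position in range(8)]
print(products)
print("Sum of products:", sum(products))
print("All 8 bits on  :", bits_to_decimal("11111111"))
print("Only 8th bit on:", bits_to_decimal("10000000"))
```

Python's built-in `int("11111111", 2)` gives the same answer and can be used to cross-check the function.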
Coloured Images
We learnt how the RGB details of each pixel together make a coloured image.
Grayscale images
Earlier we learnt that varying intensity levels of R, G and B together make other colours. If some equal, intermediate
intensities of R, G and B are mixed together then a shade of grey colour is generated. The lightest shade is white and
the darkest shade is black. Images with such shades are called grayscale images. In a grayscale image, for each pixel,
only 1 byte is enough. That means a grayscale image does not have separate colour channels. There is only a 2D array
which stores pixel information across height and width. So, grayscale images are smaller in size than their coloured
versions. X-rays are grayscale images of a scan. Here are shown two images, coloured and grayscale, with similar
resolution, but the grayscale image is smaller in file size.
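The size difference can be illustrated with a tiny image stored as Python lists. This is a rough sketch: it counts raw bytes only (real image files are usually compressed), and it converts to grayscale by simply averaging R, G and B, which is one of several possible methods and is chosen here only for simplicity.

```python
# A 2 x 2 coloured image: each pixel holds three values (R, G, B).
colour_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def to_grayscale(image):
    """Turn each (R, G, B) pixel into a single value by averaging the channels.

    Assumption: a plain average is used; other weightings exist.
    """
    return [[sum(pixel) // 3 for pixel in row] for row in image]


def raw_size_in_bytes(image):
    """Count 1 byte per stored value: 3 per colour pixel, 1 per gray pixel."""
    total = 0
    for row in image:
        for pixel in row:
            total += len(pixel) if isinstance(pixel, tuple) else 1
    return total


gray_image = to_grayscale(colour_image)
print("Grayscale image:", gray_image)
print("Coloured size  :", raw_size_in_bytes(colour_image), "bytes")
print("Grayscale size :", raw_size_in_bytes(gray_image), "bytes")
```

The coloured version needs a height x width x 3 arrangement of numbers, while the grayscale version is just a height x width 2D array.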
Image Features
A specific component or structure in an image which is uniquely specific in itself is called an image feature. The
feature that makes the component specific may be colour or shape or both. Finding image features is an important part of
the Computer Vision domain. It gives important information useful in processing the image through various computations
in the desired way. Some common image features are points, edges, corners, blobs and ridges.
The major goals of extracting image features are:
o Overlook unnecessary regions in the image.
o Interpret and describe the scene captured in the image.
o Identify the relationship between different objects in the image.
Common image features are discussed here:
Edges
An edge generally denotes the abrupt or sharp change in the intensity of a
range of pixels and helps in identifying the boundary of the two objects or
regions. Edges hold the information that helps in determining the shape of
an object in the image. Edges can be as simple as horizontal, vertical or
diagonal or as complex as the outline of a tree, cloud or a river traversing
through a landscape. Edges are spread along a line or boundary where
pixels are similar in property and beyond that boundary, there is distinct,
sharp change in the pixel property (colour, intensity). Determination of
exact location of an edge is an example of localisation. Think of a satellite
image of a river. The rough edges along either side of the river would be the
same but there will be a sharp difference in the property of pixels
representing the edge of the river bank and the water. This sharp difference will help
interpret the width and stretch of the river and distinguish it from the rest of the image.
This way, edges help object identification and segmentation.
The ML algorithm tries to classify other similar pixels along the pixels identified and, by following their similar
feature, maps the shape of the edge. In the image shown here, the dark waterway is easy to identify and whatever is left
out is the part of land.
Edge detection: To detect the edge, when you move the inspection window, the change in the intensity of the pixels
will occur only in one direction, not the other. Notice in the figure, if you move the window up or down, no change in
pixels would occur until it reaches one of the edges up or down. But, if you move the window left-ward, the change in
pixel intensity will give a hint of an edge (boundary between gray and white pixels).
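The idea that an edge shows up as an intensity change in only one direction can be sketched on a tiny grayscale image. Everything here is an illustrative assumption: the pixel values (100 for gray, 255 for white), the threshold of 50 and the function names are made up for the example and do not come from any library.

```python
# A small grayscale image: gray (100) on the left, white (255) on the right.
# The boundary between column 2 and column 3 is a vertical edge.
image = [
    [100, 100, 100, 255, 255],
    [100, 100, 100, 255, 255],
    [100, 100, 100, 255, 255],
    [100, 100, 100, 255, 255],
]

THRESHOLD = 50  # assumed cut-off for calling a change "sharp"


def horizontal_change(image, row, col):
    """Intensity change when moving one pixel to the right."""
    return abs(image[row][col + 1] - image[row][col])


def vertical_change(image, row, col):
    """Intensity change when moving one pixel down."""
    return abs(image[row + 1][col] - image[row][col])


def find_vertical_edge_columns(image):
    """Return the columns after which a sharp left-to-right change occurs."""
    edge_columns = set()
    for row in range(len(image)):
        for col in range(len(image[0]) - 1):
            if horizontal_change(image, row, col) > THRESHOLD:
                edge_columns.add(col)
    return sorted(edge_columns)


print("Change moving sideways at the boundary:", horizontal_change(image, 0, 2))
print("Change moving down at the boundary    :", vertical_change(image, 0, 2))
print("Edge found after column(s)            :", find_vertical_edge_columns(image))
```

Moving down along the boundary gives no change, while moving sideways across it gives a large change, which is the hint of an edge.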
Corners
Corners are another unique part of an image. As compared to edges, corners are easier to detect
since the maximum change in the intensity of pixels occurs at the corners. The reason is that a corner
involves at least two edges and hence the change in the intensity of the pixels is higher in corners as
compared to edges.
Corner detection: If you move the pixel window and if the region is a corner then there will be
two adjacent edges and the intensity change would be more as compared to one edge.
Blobs
A blob is that region in an image with zero or negligible intensity change in the pixels. Blobs
usually make the major part of an image. As long as a blob is detected, the inspection of the pixels
can continue until a considerable change in the intensity is detected, which could mean either a
corner or an edge.
Blob detection: In the example shown here, moving the inspection window a little distance up
or down will detect no intensity change since the majority or all pixels are similar gray.
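The moving-window idea behind blob, edge and corner detection can be put together in one simplified sketch. It is not a production detector: the 8 x 8 image, the window size of 2, the threshold of 100 and the function names `window_change` and `classify` are all assumptions made for illustration. The window is shifted by one pixel up, down, left and right, and the region is labelled by whether the intensity changes in no direction, along one axis only, or along both axes.

```python
GRAY, WHITE = 100, 255

# 8 x 8 image: gray background with a white block in the lower-right part.
# The block's top-left corner sits at row 3, column 3.
image = [[WHITE if r >= 3 and c >= 3 else GRAY for c in range(8)]
         for r in range(8)]


def window_change(image, top, left, size, d_row, d_col):
    """Total intensity difference between a window and its shifted copy."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += abs(image[r + d_row][c + d_col] - image[r][c])
    return total


def classify(image, top, left, size=2, threshold=100):
    """Label the window at (top, left) as 'blob', 'edge' or 'corner'."""
    rows, cols = len(image), len(image[0])
    # The window must stay inside the image even after a one-pixel shift.
    if top < 1 or left < 1 or top + size > rows - 1 or left + size > cols - 1:
        raise ValueError("window too close to the image border")
    vertical = max(window_change(image, top, left, size, -1, 0),
                   window_change(image, top, left, size, 1, 0))
    horizontal = max(window_change(image, top, left, size, 0, -1),
                     window_change(image, top, left, size, 0, 1))
    if vertical > threshold and horizontal > threshold:
        return "corner"      # change along both axes: two edges meet
    if vertical > threshold or horizontal > threshold:
        return "edge"        # change along one axis only
    return "blob"            # no noticeable change in any direction


print("Window at (1, 1):", classify(image, 1, 1))
print("Window at (5, 2):", classify(image, 5, 2))
print("Window at (2, 2):", classify(image, 2, 2))
```

The three windows sit inside the gray background, across the left side of the white block, and over its top-left corner respectively.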
ACTIVITY: Detecting fish in the ocean
The image here shows a fish in the water. Its magnified view is given, superimposed with a grid of numbers.
Let us assume that these are the pixel locations of the fish in the entire
image. Can you list or shade the pixels that represent the edges (outline of the fish)?
(Figure: magnified view of the fish overlaid with a grid of pixel locations, numbered column-wise from 1 to 528 with 16 locations in each column.)