Unit 1

The document discusses computer vision, which enables machines to understand and interpret visual information like images and videos. It describes the typical computer vision process of capturing images, processing them using algorithms, and analyzing the data to take appropriate actions. Some common computer vision tasks are object classification, detection, verification, landmark detection, image segmentation, and recognition. Computer vision has various applications in areas like facial recognition, healthcare, self-driving vehicles, OCR, machine inspection, retail automation, 3D modeling, medical imaging, automotive safety, and surveillance. Some challenges of computer vision include issues with reasoning, privacy and security concerns, and dealing with duplicate or false content.


UNIT-I

COMPUTER VISION:

Computer vision is a sub-field of AI and machine learning that enables machines to
see, understand, and interpret visual inputs such as images and videos, and to extract
useful information from them that can support decision-making in AI applications.

It can be considered the eyes of an AI application. Computer vision technology makes
possible tasks that would otherwise be impossible, such as self-driving cars.
COMPUTER VISION PROCESS:

A typical computer vision process mainly performs three steps, which are:

1. Capturing an image

A computer vision application typically uses a digital camera or CCTV to capture the
image. So, it first captures the image and stores it as a digital file that consists of
zeros and ones.

2. Processing the image

In the next step, different computer vision algorithms process the digital data stored
in the file. These algorithms identify the basic geometric elements and reconstruct the
image from the stored digital data.

3. Analyzing and taking required action

Finally, the system analyzes the data and, based on this analysis, takes the required
action for which it is designed.
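The three steps above can be sketched in Python. This is a toy illustration, not a real CV library: the 4x4 array stands in for a camera frame, and all function names are made up for the example.

```python
# A minimal sketch of the three-step computer vision pipeline described above.
# All names here are illustrative; a real system would use a camera driver and
# a proper vision library.

# Step 1: Capturing an image -- in practice a camera fills this buffer; here we
# hard-code a 4x4 grayscale frame (0-255) whose right half is bright.
def capture_image():
    return [
        [10, 12, 200, 210],
        [11, 14, 205, 215],
        [ 9, 13, 198, 220],
        [12, 10, 202, 212],
    ]

# Step 2: Processing -- a simple algorithm: threshold pixels into background (0)
# and foreground (1).
def process(image, threshold=128):
    return [[1 if px > threshold else 0 for px in row] for row in image]

# Step 3: Analyzing and taking action -- if enough foreground pixels are lit,
# the system takes the action it was designed for (here, raising an alert).
def analyze_and_act(binary):
    lit = sum(sum(row) for row in binary)
    total = sum(len(row) for row in binary)
    return "ALERT: object detected" if lit / total > 0.25 else "no action"

frame = capture_image()
mask = process(frame)
print(analyze_and_act(mask))  # half the pixels are bright -> "ALERT: object detected"
```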

TASK ASSOCIATED WITH COMPUTER VISION:

Although computer vision is utilized in many fields, there are a few common
tasks for computer vision systems. These tasks are given below:
o Object classification: Object classification is a computer vision technique/task
used to classify an image, such as whether an image contains a dog, a person's
face, or a banana. It analyzes the visual content (videos & images) and classifies
the object into the defined category. It means that we can accurately predict the
class of an object present in an image with image classification.
o Object Identification/detection: Object identification or detection uses image
classification to identify and locate the objects in an image or video. With such
detection and identification technique, the system can count objects in a given
image or scene and determine their accurate location and labelling. For example,
in a given image, one dog, one cat, and one duck can be easily detected and
classified using the object detection technique.
o Object Verification: The system verifies that a particular object is present, for
example by processing videos, finding objects based on search criteria, and
tracking their movement.
o Object Landmark Detection: The system identifies the key points for the given
object in the image data.
o Image Segmentation: Image segmentation goes beyond detecting the classes in an
image, as image classification does; it classifies each pixel of the image to
specify which object it belongs to, in effect determining the role of each pixel
in the image.
o Object Recognition: The system recognizes the object and its location with
respect to the image.

APPLICATIONS OF COMPUTER VISION:

o Facial recognition: Computer vision has enabled machines to detect people's face
images to verify their identity. Initially, the machines are given input data
images in which computer vision algorithms detect facial features and compare
them with databases of stored face profiles. Popular social media platforms like
Facebook also use facial recognition to detect and tag users. Further, various
government security agencies employ this feature to identify criminals in
video feeds.
o Healthcare and Medicine: Computer vision has played an important role in the
healthcare and medicine industry. Traditional approaches for evaluating
cancerous tumours are time-consuming and have less accurate predictions,
whereas computer vision technology provides faster and more accurate
chemotherapy response assessments; doctors can identify cancer patients who
need faster surgery with life-saving precision.
o Self-driving vehicles: Computer vision technology also plays a role in
self-driving vehicles, helping them make sense of their surroundings by capturing
video from different angles around the car and feeding it into the software. This
helps the vehicle detect other cars and objects, read traffic signals, recognize
pedestrian paths, etc., and safely drive its passengers to their destination.
o Optical character recognition (OCR)
Optical character recognition helps us extract printed or handwritten text from
visual data such as images. Further, it also enables us to extract text from
documents like invoices, bills, articles, etc.
o Machine inspection: Computer vision is vital in providing image-based
automatic inspection. It detects defects, missing features, functional flaws, and
other irregularities in manufactured products, and informs inspection decisions
such as the choice of lighting and material-handling techniques.
o Retail (e.g., automated checkouts): Computer vision is also being implemented
in the retail industry to track products and shelves, record product movements
into the store, and so on. AI-based computer vision techniques can automatically
charge customers for the marked products upon checkout from the retail store.
o 3D model building: 3D model building, or 3D modeling, is a technique to
generate a 3D digital representation of any object or surface using software.
Here too, computer vision plays a role in constructing 3D computer models from
existing objects. Furthermore, 3D modeling has a variety of applications in areas
such as robotics, autonomous driving, 3D tracking, 3D scene reconstruction, and
AR/VR.
o Medical imaging: Computer vision helps medical professionals make better
decisions regarding treating patients by developing visualizations of specific
body parts such as organs and tissues. It helps them get more accurate diagnoses
and a better patient care system. For example, Computed Tomography (CT) or
Magnetic Resonance Imaging (MRI) scans are used to diagnose pathologies, to guide
medical interventions such as surgical planning, or for research purposes.
o Automotive safety: Computer vision has added an important safety feature in
automotive industries. E.g., if a vehicle is taught to detect objects and dangers, it
could prevent an accident and save thousands of lives and property.
o Surveillance: This is one of computer vision technology's most important and
beneficial use cases. Nowadays, CCTV cameras are fitted almost everywhere, such
as streets, roads, highways, shops, and stores, to spot suspicious or criminal
activities. Computer vision helps analyze live footage of public places to
identify suspicious behaviour and dangerous objects, and to prevent crimes by
maintaining law and order.
o Fingerprint recognition and biometrics: Computer vision technology detects
fingerprints and biometrics to validate a user's identity. Biometrics deals with
recognizing persons based on physiological characteristics, such as the face,
fingerprint, vascular pattern, or iris, and behavioural traits, such as gait or
speech. It combines Computer Vision with knowledge of human physiology and
behavior.

COMPUTER VISION CHALLENGES

Computer vision has emerged as one of the fastest-growing domains of artificial
intelligence, but it still faces a few challenges on the way to becoming a leading
technology. A few challenges observed while working with computer vision technology
are:

o Reasoning and analytical issues: All programming languages and technologies
require basic logic behind any task. To become a computer vision expert, you
must have strong reasoning and analytical skills; without them, defining any
attribute in visual content can be a big problem.
o Privacy and security: Privacy and security are among the most important
concerns for any country. Vision-powered surveillance raises various serious
privacy issues in many countries, so access to such systems must be restricted
to authorized users. Further, various countries avoid face recognition and
detection techniques altogether for privacy and security reasons.
o Duplicate and false content: Cyber security is always a big concern for all
organizations, and they always try to protect their data from hackers and cyber
fraud. A data breach can lead to serious problems, such as creating duplicate
images and videos over the internet.

OCR (Optical Character Recognition):

Optical Character Recognition (OCR) is the process that converts an image of text into
a machine-readable text format.

Working Process of OCR:

Image acquisition: A scanner reads documents and converts them to binary data. The
OCR software analyzes the scanned image and classifies the light areas as background
and the dark areas as text.
Preprocessing: The OCR software first cleans the image and removes errors to prepare
it for reading. These are some of its cleaning techniques:

1. Deskewing: slightly rotating the scanned document to fix alignment issues
introduced during the scan.
2. Despeckling: removing digital image spots or smoothing the edges of text
images.
3. Cleaning up boxes and lines in the image.
4. Script recognition, for multi-language OCR technology.
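As an illustration of the despeckling step, a simple 3x3 median filter removes isolated specks while keeping edges reasonably sharp. This is a toy sketch, not the algorithm any particular OCR engine actually uses.

```python
# Despeckle a grayscale image: replace each interior pixel by the median of its
# 3x3 neighborhood, which suppresses single-pixel noise (specks).

def despeckle(image):
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]          # borders are left unchanged
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            neighborhood = sorted(image[y + dy][x + dx]
                                  for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = neighborhood[4]      # median of the 9 values
    return out

# A white page (255) with one black speck (0): the speck disappears.
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
print(despeckle(page)[2][2])  # 255
```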
Text recognition

The two main types of algorithms that OCR software uses for text recognition are
called pattern matching and feature extraction.

1. Pattern matching: Pattern matching works by isolating a character image,
called a glyph, and comparing it with a similarly stored glyph. A glyph
(pronounced GLIHF) is a graphic symbol that provides the appearance or form of a
character. Pattern matching works only if the stored glyph has a similar font and
scale to the input glyph. This method works well with scanned images of documents
that have been typed in a known font.

2. Feature extraction: Feature extraction breaks down, or decomposes, the glyphs
into features such as lines, closed loops, line direction, and line intersections.
It then uses these features to find the best match, or the nearest neighbor, among
its various stored glyphs.
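The pattern-matching idea can be sketched with toy 5x3 glyph bitmaps: classify an input glyph as the stored template with the highest pixel agreement. The templates and matching rule below are illustrative only; real OCR engines are far more sophisticated.

```python
# Toy pattern matching: each stored glyph is a 5x3 bitmap template, and an input
# glyph is classified as the template it agrees with on the most pixels.

TEMPLATES = {
    "I": ["111",
          "010",
          "010",
          "010",
          "111"],
    "L": ["100",
          "100",
          "100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010",
          "010",
          "010"],
}

def match_glyph(glyph):
    """Return the character whose template agrees with the glyph on most pixels."""
    def score(template):
        return sum(g == t for grow, trow in zip(glyph, template)
                          for g, t in zip(grow, trow))
    return max(TEMPLATES, key=lambda ch: score(TEMPLATES[ch]))

# A slightly noisy 'L' (one flipped pixel) still matches the stored 'L' glyph.
noisy_L = ["100",
           "100",
           "110",   # noise here
           "100",
           "111"]
print(match_glyph(noisy_L))  # L
```

As the notes say, this only works when font and scale match the templates, which is why feature extraction is more robust.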

Postprocessing

After analysis, the system converts the extracted text data into a computerized file.
Some OCR systems can create annotated PDF files that include both the before and after
versions of the scanned document.

Types of OCR:

Data scientists classify different types of OCR technologies based on their use and
application. The following are a few examples:

1. Simple optical character recognition software:

A simple OCR engine works by storing many different font and text image patterns as
templates. The OCR software uses pattern-matching algorithms to compare text images,
character by character, to its internal database. If the system matches the text word
by word, it is called optical word recognition. This solution has limitations because
there are virtually unlimited font and handwriting styles, and not every single style
can be captured and stored in the database.

2. Intelligent character recognition software:

Modern OCR systems use intelligent character recognition (ICR) technology to read text
in the same way humans do. They use advanced methods that train machines to behave
like humans by using machine learning software. A machine learning system called a
neural network analyzes the text over many levels, processing the image repeatedly. It
looks for different image attributes, such as curves, lines, intersections, and loops,
and combines the results of all these different levels of analysis to get the final
result. Even though ICR typically processes the images one character at a time, the
process is fast, with results obtained in seconds.

3. Intelligent word recognition

Intelligent word recognition systems work on the same principles as ICR, but process
whole word images instead of preprocessing the images into characters.

4. Optical mark recognition

Optical mark recognition identifies logos, watermarks, and other text symbols in a
document.

Applications of OCR

 Automatic license/number plate recognition (ALPR/ANPR)

 Traffic sign recognition

 Analyzing and defeating CAPTCHAs (Completely Automated Public Turing tests to
tell Computers and Humans Apart) on websites

 Extracting information from business cards

 Automatically reading the machine-readable zone (MRZ) and other relevant parts of
a passport

 Parsing the routing number, account number, and currency amount from a bank
check

 Understanding text in natural scenes such as photos captured from your
smartphone

OBJECT RECOGNITION:
Object recognition is the technique of identifying objects present in images
and videos. It is one of the most important applications of machine learning and deep
learning. The goal of this field is to teach machines to understand (recognize) the
content of an image just as humans do.

Object Recognition Using Machine Learning


 HOG (Histogram of Oriented Gradients) feature extractor and SVM (Support
Vector Machine) model: Before the era of deep learning, this was a state-of-the-art
method for object detection. It takes histogram descriptors of both positive
samples (images that contain the object) and negative samples (images that do not
contain the object) and trains an SVM model on them.
 Bag of features model: Just as bag of words treats a document as an orderless
collection of words, this approach represents an image as an orderless
collection of image features. Examples include SIFT, MSER, etc.
 Viola-Jones algorithm: This algorithm is widely used for face detection, in still
images or in real time. It performs Haar-like feature extraction from the image,
which generates a large number of features. These features are then passed into a
boosting classifier, producing a cascade of boosted classifiers that performs the
detection. An image region must pass each of the classifiers in the cascade to
generate a positive (face found) result. The advantage of Viola-Jones is that it
can run at around 2 frames per second, which makes it usable in real-time face
recognition systems.
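The HOG idea above can be sketched by accumulating gradient orientations over a small patch. This is a heavily simplified illustration; a real HOG descriptor also performs cell and block normalization, which is omitted here.

```python
# Simplified histogram of oriented gradients: compute per-pixel gradients on a
# tiny grayscale patch and vote their orientations into a histogram, weighted
# by gradient magnitude.

import math

def hog_histogram(patch, bins=4):
    """Histogram of gradient orientations (0-180 degrees) over a 2D patch."""
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180   # unsigned orientation
            hist[min(int(angle / (180 / bins)), bins - 1)] += mag  # vote by magnitude
    return hist

# A patch with a vertical edge: all gradients point horizontally (angle ~ 0),
# so the first orientation bin receives all the votes.
patch = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
print(hog_histogram(patch))  # [400.0, 0.0, 0.0, 0.0]
```

In the HOG+SVM pipeline, such histograms (computed per cell and normalized per block) are concatenated into the descriptor fed to the SVM.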

Object Recognition Using Deep Learning:


Convolutional Neural Networks (CNNs) are one of the most popular ways of doing
object recognition. They are widely used, and most state-of-the-art neural networks
use this approach for object recognition related tasks such as image classification.
A CNN takes an image as input and outputs the probabilities of the different classes.
If an object is present in the image, the output probability of its class is high,
while the output probabilities of the remaining classes are negligible or low. The
advantage of deep learning is that we do not need to perform manual feature extraction
from the data, as we do in classical machine learning.
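The core operation inside a CNN is convolution: sliding a small filter over the image. The pure-Python sketch below applies a hand-set vertical-edge (Sobel-like) filter; in a trained CNN the filter weights are learned from data rather than written by hand.

```python
# 'Valid' 2D convolution (technically cross-correlation, as in most CNN
# libraries): slide a small kernel over the image and sum elementwise products.

def convolve2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out

# A Sobel-like filter that responds to vertical edges.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
# The vertical edge between the dark and bright columns produces strong responses.
print(convolve2d(image, sobel_x))  # [[40, 40], [40, 40]]
```

A CNN stacks many such (learned) filters with nonlinearities and pooling, ending in fully connected layers that output the class probabilities.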

Challenges of Object Recognition:


 The output generated by the last (fully connected) layer of a CNN model is a
single class label, so a simple CNN approach will not work if more than one
class label is present in the image.
 If we want to localize an object with a bounding box, we need a different
approach that outputs not only the class label but also the bounding box
coordinates.
AUGMENTED REALITY V/S VIRTUAL REALITY:

Augmented Reality:

Augmented Reality is defined as the technology and methods that allow overlaying
real-world objects and environments with 3D virtual objects using an AR device, and
allow the virtual objects to interact with real-world objects to create intended
meanings.

Types of Augmented Reality


Augmented reality is of four types: Marker-less, Marker-based, Projection-based,
and Superimposition-based AR. Let us see them one by one in detail.

1) Marker-based AR: A marker, which is a special visual object such as a special
sign, and a camera are used to initiate the 3D digital animations. The system
calculates the orientation and position of the marker to position the content
effectively. A marker-based mobile AR furnishing app is one example.

2) Marker-less AR
It is used in events, business, and navigation apps. For instance, the technology uses
location-based information to determine what content the user gets or finds in a
certain area. It may use GPS, compasses, gyroscopes, and accelerometers, as available
on mobile phones. Marker-less AR does not need any physical markers to place objects
in a real-world space.
3) Projection-based AR
This kind uses synthetic light projected onto physical surfaces to detect the user's
interaction with those surfaces. It is used for holograms, as seen in Star Wars and
other sci-fi movies; an example is a sword projected by a projection-based AR headset.

4) Superimposition-based AR
In this case, the original item is replaced with an augmentation, fully or partially.
For example, the IKEA Catalog app allows users to place a virtual, correctly scaled
furniture item over a room image, making it a well-known example of
superimposition-based AR.


Components of AR

Augmented reality creates an immersive experience for all its users. Though the most
common AR forms are through glasses or a camera lens, interest in AR is growing, and
businesses are showcasing more types of lenses and hardware through the marketplace. There
are five significant components of AR:

1. Artificial intelligence. Most augmented reality solutions need artificial intelligence (AI) to work,
allowing users to complete actions using voice prompts. AI can also help process information for
your AR application.

2. AR software. These are the tools and applications used to access AR. Some businesses can create
their own form of AR software.

3. Processing. You’ll need processing power for your AR technology to work, generally by leveraging
your device’s internal operating system.

4. Lenses. You’ll need a lens or image platform to view your content or images. The better quality your
screen is, the more realistic your image will appear.

5. Sensors. AR systems need to digest data about their environment to align the real and digital worlds.
When your camera captures information, it sends it through software for processing.

Augmented reality merges the physical world with computer-generated virtual elements.

Examples of augmented reality are the Pokemon Go game and Snapchat lenses.

Advantages of Augmented Reality

The advantages of Augmented Reality are listed as follows -

o It increases accuracy.
o It offers innovation, continuous improvement, and individualized learning.
o It helps developers to build games that offer real experiences.
o It enhances the knowledge and information of the user.

Disadvantages of Augmented Reality

The limitations of Augmented Reality are listed as follows -

o Projects based on AR technology are expensive to implement and develop.
o Excessive use of augmented reality technology can lead to eye problems, obesity,
etc.
o It can cause mental health issues.
Virtual Reality (VR)

Virtual Reality (VR) is a computer-generated simulation of an alternate world or
reality. It is used in 3D movies and video games. It helps to create simulations
similar to the real world and "immerse" the viewer using computers and sensory
devices like headsets and gloves.
Advantages of Virtual Reality

The benefits of virtual reality are listed as follows -

o It creates an interactive environment.
o It helps us to explore the world by creating a realistic world using computer
technology.
o It makes education comfortable and easy.
o It allows users to experiment in an artificial environment.
o It increases work capabilities.
o Virtual reality helps medical students practice safely. It is helpful for
patients, too, as it offers a safe environment in which a patient can come into
contact with the things they fear.
o Virtual reality helps to measure the performance of sportspersons and analyze
their techniques.

Disadvantages of Virtual Reality

The limitations of virtual reality are listed as follows -

o Using VR, people start ignoring the real world; they start living in the virtual
world instead of dealing with the issues of the real world.
o Training in the virtual environment does not have the same result as training in
the actual world.
o It is not guaranteed that a person can perform a task well in the real world
just because he/she has performed that task well in the virtual world.

Augmented Reality v/s Virtual Reality

Now, let's see the comparison chart between Augmented reality and Virtual reality.
Here, we are showing the comparison between both terms on the basis of some
characteristics.
Involvement: In AR, the user is partially immersed, i.e., the user experiences a mix
of the real world and the virtual world. In VR, the user is completely immersed in a
virtual world.

Distinction: In augmented reality, it is easy to distinguish between the real world
and the virtual world. In virtual reality, it is hard to distinguish between the
virtual world and the real world.

Devices used: AR uses tablets, smartphones, or other mobile devices. VR uses a
head-mounted display or glasses.

Reality and virtuality: Augmented reality is roughly 75% real and 25% virtual.
Virtual reality is roughly 75% virtual and 25% real.

Network data: Augmented reality requires upwards of 100 Mbps of bandwidth. A 720p
virtual reality video requires a connection of at least 50 Mbps.

Revenue: The projected revenue share for augmented reality in 2020 is $120 million,
while that for virtual reality is $30 million.

Visual senses: In augmented reality, the user always has a sense of presence in the
real world, whereas in virtual reality the visual senses are under the control of
the system.
CONTENT-BASED IMAGE RETRIEVAL (CBIR)

Content-Based Image Retrieval (CBIR) is a way of retrieving images from a database.
In CBIR, a user specifies a query image and gets back the images in the database that
are similar to it. To find the most similar images, CBIR compares the content of the
input image to the database images.
More specifically, CBIR compares visual features such as shapes, colors, texture, and
spatial information, and measures the similarity between the query image and the
images in the database with respect to those features.

Given an example query image that illustrates the user's information need and a very
large dataset of images, the CBIR system's task is to rank all the images in the
dataset according to how likely they are to fulfill the user's information need.
Feature Extraction Methods in CBIR

CBIR systems need to perform feature extraction, which plays a significant role in
representing an image’s semantic content.
There are two main categories of visual features: global and local.

Global Features

Global features are those that describe an entire image. They contain information on
the whole image. For example, several descriptors characterize color spaces, such as
color moments, color histograms, and so on. Other global features are concerned with
visual elements such as shapes and texture.
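A minimal sketch of a global feature and a similarity measure: a grayscale intensity histogram compared with histogram intersection (higher means more similar). Real CBIR systems use per-channel color histograms or color moments, but the idea is the same.

```python
# Global intensity histogram + histogram intersection, a toy CBIR similarity.

def histogram(image, bins=4, max_val=256):
    """Normalized intensity histogram over a 2D grayscale image (values 0-255)."""
    hist = [0] * bins
    for row in image:
        for px in row:
            hist[px * bins // max_val] += 1
    total = sum(hist)
    return [h / total for h in hist]   # normalize so images of any size compare

def intersection(h1, h2):
    """Histogram intersection: 1.0 means identical intensity distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

dark_query = [[10, 20], [30, 40]]
dark_db    = [[15, 25], [35, 45]]
bright_db  = [[200, 210], [220, 230]]

q = histogram(dark_query)
print(intersection(q, histogram(dark_db)))    # 1.0 -> ranked first
print(intersection(q, histogram(bright_db))) # 0.0 -> ranked last
```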

Local Features

While global features have many advantages, they change under scaling and rotation.
For this reason, local features are more reliable in various conditions.
Local features describe visual patterns or structures identifiable in small groups of
pixels. For example, edges, points, and various image patches.
The descriptors used to extract local features consider the regions centered around
the detected visual structures. Those descriptors transform a local pixel
neighborhood into a vector representation.
One of the most used local descriptors is SIFT, which stands for Scale-Invariant
Feature Transform. It consists of a detector and a descriptor for key points, and it
does not change when we rotate the image we are working on. However, it has some
drawbacks, such as needing a fixed vector for encoding and a huge amount of memory.
Deep Neural Networks
Recently, state-of-the-art CBIR systems have started using machine-learning methods
such as deep-learning algorithms. They can perform feature extraction far better than
traditional methods.
Usually, a Deep Convolutional Neural Network (DCNN) is trained using available data.
Its job is to extract features from images. So, when a user sends the query image to
the database system, DCNN extracts its features. Then, the query-image features are
compared to those of the database images. In that step, the database system finds the
most similar images using similarity measures and returns them to the user:

Since there are various pre-trained convolutional networks as well as Computer Vision
Datasets, some people prefer ready-to-use models such as AlexNet, GoogLeNet, and
ResNet50 over training their networks from scratch.
So, deep-learning models such as DCNN extract features automatically. In contrast, in
traditional models, we pre-define the features to extract.
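Once a DCNN has turned each image into a feature vector, the comparison step above reduces to a similarity measure over vectors; cosine similarity is a common choice. The 4-dimensional vectors and file names below are made-up stand-ins for real (much longer) DCNN features.

```python
# Rank database images by cosine similarity of their feature vectors to the
# query's feature vector, best match first.

import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, database):
    """database: list of (name, feature_vector); returns it sorted, best first."""
    return sorted(database,
                  key=lambda item: cosine_similarity(query_vec, item[1]),
                  reverse=True)

query = [0.9, 0.1, 0.0, 0.4]
database = [
    ("beach.jpg",  [0.8, 0.2, 0.1, 0.5]),   # close to the query
    ("forest.jpg", [0.1, 0.9, 0.8, 0.0]),   # far from the query
]
print([name for name, _ in retrieve(query, database)])  # ['beach.jpg', 'forest.jpg']
```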
Computer Vision Retail Industry:

Computer vision enables retailers to build customer loyalty through an improved
in-store experience. It can speed up the buying process by analyzing the buying
habits of customers. The data gathered by computer vision can be used to optimize the
layout of store shelves in order to streamline purchases. It is also a solution of
choice to improve self-service in stores and can help prevent fraud and theft:
automated visual inspection installed in the aisles and at checkout can detect
shoplifters faster than current devices. Computer vision can also be used to track
movement in a store and help retailers set out routes around their stores that feel
natural for the shopper while maintaining safe distances between individuals.

Applications of computer vision in retail

1. Automated payment

With the use of computer vision in retail, you don't have to wait in a long queue to
pay. Products can be monitored using a combination of sensors and computer vision. In
addition, such systems can recognize the customer and automatically charge them after
they leave the store.

2. In-store advertisement

Computer vision in retail can also be used to identify certain customers when they enter
the store and send them special discounts. They can also get recommendations on what
to buy, depending on their purchase history.

3. Stock management

Computer vision can be used to detect empty shelves and reduce the replenishment
period, which increases product availability on the shelf. This solution can also
verify shelf prices, which is often a time-consuming manual operation, minimizing
pricing anomalies.

4. Customer advisory

In the near future, CV algorithms will be so advanced that they will help you find the
perfect product or an accessory matching your new dress. They have the ability to
become fully-operational customer advisors.

5. Virtual Mirrors

Virtual mirrors may become the central focus of personalization and customer
experience enhancement in retail. A virtual mirror is basically a mirror with a display
behind the glass. It is powered by computer vision cameras and AR and can display a
wide range of contextual information which helps buyers connect with the brand better.

6. Crowd Analysis

Computer vision can correctly count retail shoppers and study customer behavior in
total. Retailers will be able to track the customer journeys throughout the physical
store, calculate the total time spent with each product, and guarantee that the store
follows all standardized protocols.

7. Self-checkout

Self-checkout has already solidified its importance for brick-and-mortar stores.
Customer service automation is becoming a priority these days, so companies need to
update their workflows to make them more efficient.

8. Cashierless Stores

As revolutionary as it may sound, cashier-less stores are paving the way for a more
streamlined, AI-assisted shopping experience in stores across the world. Computer
Vision is being tested out in various retail stores to completely replace the need for
human staff.

9. Inventory Management
By automating inventory cycle counts with computer vision, retail businesses can
update their inventory system in real-time to develop an omni-channel retail
experience.

10. Optimize marketing campaigns using behavioral analytics

Systems can monitor facial expressions and identify how a customer feels, giving
marketers a way to know how people respond to specific goods.

ADVANTAGES OF COMPUTER VISION HEALTH CARE:

1. More accurate diagnoses. Computer vision in healthcare applications results in
faster and more accurate diagnoses. And the more data the system receives for
algorithm training, the higher the accuracy rates.

2. Early disease recognition. There are a number of diseases that respond to medical
treatment only in the early stages. Computer vision technology allows symptoms to be
recognized when they are not yet apparent and enables doctors to intervene early.
This makes a huge difference in treating patients who would not otherwise get the
help they need. By recognizing early-onset illnesses, doctors can prescribe drugs to
help fight those diseases or even perform surgeries earlier and save lives. The aim
here is to accelerate the diagnosis process through the use of computer vision and
make treatment more successful.

3. Enhanced efficiency of medical procedures. Computer vision is known not only for
diagnostic accuracy but also for being generally efficient for patients and
healthcare professionals alike. In particular, computer-aided diagnoses minimize
doctor and patient interaction. This reduction is especially beneficial in light of
physician shortage projections.

4. Automatic generation of medical reports. The progress in computer vision has
enabled extensive use of medical imaging data for more accurate diagnosis,
treatment, and prediction of diseases. By using computer vision techniques,
diagnosis, treatment, and prediction of diseases. By using computer vision techniques,
healthcare professionals can acquire enhanced medical information that is not only
interpreted to establish a diagnosis and prescribe medication but can also be used for
disease prediction and analysis report generation. Healthcare specialists can leverage
the power of computer vision to automate the medical report generation. Feeding data
from X-rays, ultrasound, CT scans, and MRI to computer vision algorithms, clinicians
will be able to gain in-depth insight into an individual's physical condition, predict when
a disease will develop, and when appropriate treatment will be needed.
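The final step of such a pipeline, turning model outputs into a draft report, can be illustrated with a small sketch. It assumes a hypothetical imaging classifier that returns per-finding probabilities; the finding names, threshold, and patient ID are all invented for illustration, and any real system would require clinical validation.

```python
# Hypothetical sketch: converting imaging-classifier probabilities into a
# draft textual report for clinician review. Names/threshold are illustrative.

def draft_report(patient_id, findings, threshold=0.5):
    """Render findings at or above `threshold` as report lines, most confident first."""
    lines = [f"Automated imaging report for patient {patient_id}"]
    positives = {name: p for name, p in findings.items() if p >= threshold}
    if not positives:
        lines.append("No findings above the reporting threshold.")
    for name, prob in sorted(positives.items(), key=lambda kv: -kv[1]):
        lines.append(f"- {name}: model confidence {prob:.0%} (needs clinician review)")
    return "\n".join(lines)

print(draft_report("P-104", {"nodule": 0.82, "fracture": 0.10, "effusion": 0.61}))
```

Low-confidence findings are suppressed rather than reported, and every emitted line is explicitly marked for clinician review; automation here drafts the report, it does not replace the radiologist.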

5. Interactive medical imaging. Computer vision enables interactive and detailed 3D
visualization of medical images. Medical image analysis has benefited significantly
from the application of deep learning techniques in recent years, and these
developments paved the way for computer vision to become more effective in healthcare
image processing. Deep learning and computer vision can now be used to perform visual
analysis of interactive 3D models and make more accurate medical diagnoses. 3D models
are a more informative format than traditional 2D images; for example, 3D breast
imaging driven by advanced computer vision systems proves more effective for detecting
cancer in the early stages of disease progression.

Use cases of computer vision in healthcare

Intelligent computer vision algorithms are capable of learning to identify intricate
patterns through training on previously diagnosed cases. Today, computer vision is
deployed in a growing number of medical fields and continues making positive changes
to healthcare.

1. Radiology and oncology. Computer vision has broad application in healthcare, but
especially in radiology and oncology. Potential use cases include monitoring tumor
progression, detecting bone fractures, and searching for metastases in tissue. Breast
cancer, lung cancer, leukemia, prostate cancer, and other malignancies can all be
detected through computer-aided diagnosis. In particular, AI-powered solutions like
IBM Watson Imaging Clinical Review are designed to augment radiologists and make
medical image interpretation cheaper, faster, and more accurate. They help improve
overall radiology department quality and provide patients with better, more reliable
medical care.

2. Cardiology. Although deep learning is still developing and its applications to
computer vision in cardiology are limited, there are several ways in which CV can
benefit the field. The rapid adoption of automated computer vision algorithms in
radiology suggests the same will happen in other fields, too. Notably, the
incorporation of AI into cardiology is taking the form of:


 Vascular imaging
 Artery highlighting
 AI-aided echocardiographic views analysis
 Automated cardiac pathology and anomaly detection
 Automated analysis, diagnostics, and prognosis in cardiac CT
 Electronic segmentation and calculation of variables in cardiac MRI

As a result, patient groups with cardiovascular risk will be able to get improved care as
physicians will be able to interpret more data in greater depth than ever before.
Computer vision algorithms will unobtrusively assist physicians and enable broader
characterization of patients’ disorders. As a result, they can potentially help plan early
intervention in patients at high risk and lead to better treatment selection and improved
outcomes.

3. Dermatology. Computer vision algorithms are being created to spot patterns in
images and identify visual signs of pathology that are crucial for diagnosis in the
abovementioned fields of radiology and cardiology, but they have broad application in
dermatology as well. Dermatology is largely about visual inspection of the patient's
skin, and artificial intelligence has the power to enhance that care.

The high accuracy of computer-aided diagnosis systems in dermatology can facilitate
expert decision-making, leading to better treatment decisions. Specifically,
computerized skin image analysis is leveraged to deliver personalized skincare
(including skin treatments, makeup, facial creams and gels, humidity control, etc.)
based on people's photos. It also has the potential to be used for early detection of
skin conditions, such as the diagnosis of skin cancer, and computer vision techniques
that allow doing so already exist. These methods are continually advancing, and
computer vision may well become part of routine dermatology practice in the future.

4. Lab test automation. Computer vision technology is also used for blood counts,
tissue cell analysis, change tracking, and other lab tests. Computer-vision-powered
blood analyzers either take images of blood samples directly or receive an image of an
already prepared slide containing a blood film. As a rule, trained professionals
capture such images with a custom-designed camera attached to an ordinary microscope.
Based on image processing and computer vision technologies, the system then processes
the input and automatically detects specific abnormalities in the blood samples.
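The core detection step in such an analyzer can be illustrated with a toy sketch: threshold the grayscale smear image into foreground blobs (candidate cells), then flag blobs whose size falls outside an expected range. This is a minimal, stdlib-only illustration; the intensity grid, threshold, and size limits are invented for the example, and a production analyzer would use calibrated imagery and far more sophisticated morphology.

```python
# Hypothetical sketch: threshold + connected-component labeling on a tiny
# "blood smear" intensity grid, flagging unusually sized blobs.

def find_blobs(image, threshold=128):
    """Return sizes of 4-connected foreground blobs in a 2D intensity grid."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold and not seen[r][c]:
                stack, size = [(r, c)], 0           # flood-fill one blob
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and image[ny][nx] >= threshold and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes

def flag_abnormal(sizes, lo=2, hi=6):
    """Blob sizes outside the expected cell-size range [lo, hi]."""
    return [s for s in sizes if s < lo or s > hi]

toy = [
    [0, 200, 200, 0,   0, 0],
    [0, 200, 200, 0, 255, 0],
    [0,   0,   0, 0,   0, 0],
    [180, 180, 180, 180, 180, 180],
]
sizes = find_blobs(toy)
print(sizes, flag_abnormal(sizes))  # → [4, 1, 6] [1]
```

Here the isolated single pixel (size 1) is flagged as abnormal, standing in for the kind of out-of-range object a real analyzer would surface for a technician to inspect.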
