Unit-1 AI....
Artificial Intelligence is one of the booming technologies of computer science, and it is ready to create a new revolution in the world by making intelligent machines. AI is now all around us, working across a variety of subfields that range from the general to the specific, such as self-driving cars, playing chess, proving theorems, composing music, painting, and more.
AI is one of the most fascinating and universal fields of computer science, and it has great scope in the future. AI aims to make a machine work like a human.
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made" and Intelligence means "thinking power"; hence AI means "man-made thinking power."
"It is a branch of computer science by which we can create intelligent machines which can
behave like humans, think like humans, and make decisions."
Artificial Intelligence exists when a machine has human-like skills such as learning, reasoning, and problem solving.
With Artificial Intelligence, you do not need to preprogram a machine for every task; instead, you can create a machine with algorithms that let it work with its own intelligence, and that is the power of AI.
AI is not an entirely new idea: according to Greek myth, there were mechanical men in early days which could work and behave like humans.
o With the help of AI, you can create software or devices which can solve real-world problems easily and with accuracy, in areas such as health, marketing, and traffic.
o With the help of AI, you can create a personal virtual assistant, such as Cortana, Google Assistant, or Siri.
o With the help of AI, you can build robots which can work in environments where human survival would be at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.
To achieve the above for a machine or software, Artificial Intelligence requires the following disciplines:
o Mathematics
o Biology
o Psychology
o Sociology
o Computer Science
o Neuroscience
o Statistics
Advantages of Artificial Intelligence
o High accuracy with fewer errors: AI machines or systems are less prone to errors and give high accuracy, as they make decisions based on prior experience or information.
o High speed: AI systems can make decisions at very high speed; because of this, an AI system can beat a chess champion at chess.
o High reliability: AI machines are highly reliable and can perform the same action
multiple times with high accuracy.
o Useful in risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human would be risky.
o Digital assistance: AI can be very useful in providing digital assistance to users; for example, AI is currently used by various e-commerce websites to show products matching customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars which can make journeys safer and hassle-free, facial recognition for security purposes, and natural language processing for communicating with humans in human language.
Disadvantages of Artificial Intelligence
o High cost: The hardware and software requirements of AI are very costly, as AI requires a lot of maintenance to meet current world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot work outside their training; a robot will only do the work for which it was trained or programmed.
o No feelings and emotions: An AI machine can be an outstanding performer, but it has no feelings, so it cannot form any kind of emotional attachment with humans and may sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: With the advancement of technology, people are getting more dependent on devices and hence losing some of their mental capabilities.
o No original creativity: Humans are creative and can imagine new ideas, but AI machines cannot match this power of human intelligence; they cannot be truly creative and imaginative.
Application of AI
Artificial Intelligence has various applications in today's society. It is becoming essential for our time because it can solve complex problems in an efficient way across multiple industries, such as healthcare, entertainment, finance, and education. AI is making our daily life more comfortable and fast.
Following are some sectors which have the application of Artificial Intelligence:
1. AI in Astronomy
o Artificial Intelligence can be very useful in solving complex problems of the universe. AI technology can be helpful for understanding the universe, such as how it works, its origin, etc.
2. AI in Healthcare
o In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on it.
o Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can warn when a patient's condition is worsening so that medical help can reach the patient before hospitalization.
3. AI in Gaming
o AI can be used for gaming purposes. AI machines can play strategic games like chess, where the machine needs to think about a large number of possible moves.
4. AI in Finance
o AI and finance industries are the best matches for each other. The finance industry is
implementing automation, chatbot, adaptive intelligence, algorithm trading, and machine
learning into financial processes.
5. AI in Data Security
o The security of data is crucial for every company, and cyber-attacks are growing very rapidly in the digital world. AI can be used to make data more safe and secure. For example, the AEG bot and the AI2 Platform are used to detect software bugs and cyber-attacks more effectively.
6. AI in Social Media
o Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to be stored and managed very efficiently. AI can organize and manage massive amounts of data, and it can analyze lots of data to identify the latest trends, hashtags, and the requirements of different users.
8. AI in Automotive Industry
o Some automotive companies are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
o Various companies are currently working on developing self-driving cars which can make your journey safer and more secure.
9. AI in Robotics:
o Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed to perform some repetitive task, but with the help of AI, we can create intelligent robots which can perform tasks from their own experience without being pre-programmed.
o Humanoid robots are the best examples of AI in robotics; recently, the intelligent humanoid robots named Erica and Sophia have been developed, which can talk and behave like humans.
10. AI in Entertainment
o We are currently using some AI-based applications in our daily life through entertainment services such as Netflix and Amazon. With the help of ML/AI algorithms, these services show recommendations for programs and shows.
11. AI in Agriculture
o Agriculture is an area which requires various resources, labor, money, and time for the best result. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI through agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.
12. AI in E-commerce
o AI is providing a competitive edge to the e-commerce industry, and demand for it is growing in the e-commerce business. AI helps shoppers discover associated products in their recommended size, color, or even brand.
13. AI in education:
o AI can automate grading so that tutors have more time to teach. An AI chatbot can communicate with students as a teaching assistant.
o In the future, AI could work as a personal virtual tutor for students, easily accessible at any time and any place.
At that time, high-level computer languages such as FORTRAN, LISP, and COBOL were invented, and the enthusiasm for AI was very high.
2. General AI:
o General AI is a type of intelligence which could perform any intellectual task with efficiency like a human.
o The idea behind general AI is to make a system which could be smarter and think like a human on its own.
o Currently, no system exists which comes under general AI and can perform any task as perfectly as a human.
o Researchers worldwide are now focused on developing machines with general AI.
o Since systems with general AI are still under research, it will take a lot of effort and time to develop them.
3. Super AI:
o Super AI is a level of system intelligence at which machines could surpass human intelligence and can perform any task better than humans, with cognitive properties. It is an outcome of general AI.
o Some key characteristics of strong AI include the ability to think, reason, solve puzzles, make judgments, plan, learn, and communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. Developing such systems in reality is still a world-changing task.
2. Limited Memory
o Limited memory machines can store past experiences or some data for a short period of time.
o These machines can use stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These cars can
store recent speed of nearby cars, the distance of other cars, speed limit, and other information
to navigate the road.
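The "short period of time" idea can be sketched with Python's `collections.deque`, which discards the oldest entries automatically; the speed values are illustrative.

```python
from collections import deque

# Limited-memory sketch: keep only the last few observations of a
# nearby car's speed, as a self-driving car might (values illustrative).
recent_speeds = deque(maxlen=3)   # older entries are discarded automatically
for speed in [52, 55, 58, 61, 64]:
    recent_speeds.append(speed)

print(list(recent_speeds))        # [58, 61, 64] -- only the short-term past
avg = sum(recent_speeds) / len(recent_speeds)
print(round(avg, 1))              # 61.0
```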
3. Theory of Mind
o Theory of Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
o This type of AI machine has not yet been developed, but researchers are making many efforts and improvements toward developing such machines.
4. Self-Awareness
o Self-aware AI is the future of Artificial Intelligence. These machines will be super intelligent and will have their own consciousness, sentiments, and self-awareness.
o These machines will be smarter than the human mind.
o Self-aware AI does not yet exist in reality; it is a hypothetical concept.
What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting. An agent can be:
o Human agent: A human agent has eyes, ears, and other organs which work as sensors, while hands, legs, and the vocal tract work as actuators.
o Robotic agent: A robotic agent can have cameras, infrared range finders, and natural language input as sensors, and various motors as actuators.
o Software agent: A software agent can take keystrokes and file contents as sensory input, act on those inputs, and display output on the screen.
Hence the world around us is full of agents, such as thermostats, cellphones, and cameras; even we ourselves are agents.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: A sensor is a device which detects changes in the environment and sends the information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the component of machines that converts energy into motion. The
actuators are only responsible for moving and controlling a system. An actuator can be an
electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs,
wheels, arms, fingers, wings, fins, and display screen.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators to achieve goals. An intelligent agent may learn from the environment to achieve its goals. A thermostat is an example of an intelligent agent.
Rational Agent:
A rational agent is an agent which has clear preference, models uncertainty, and acts in a way
to maximize its performance measure with all possible actions.
A rational agent is said to perform the right things. AI is about creating rational agents, drawing on game theory and decision theory, for various real-world scenarios.
For an AI agent, rational action is most important because, in AI reinforcement learning algorithms, the agent gets a positive reward for each best possible action and a negative reward for each wrong action.
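This reward scheme can be sketched as a toy episode on a one-dimensional track, where moving toward the goal earns +1 and moving away earns -1; the environment and numbers are invented for illustration.

```python
# Sketch of the reward signal described above: a tiny walk on a line
# where each step toward the goal earns +1 and each step away earns -1
# (the environment and rewards are illustrative).
def run_episode(actions, start=0, goal=3):
    position, total_reward = start, 0
    for a in actions:
        nxt = position + (1 if a == "right" else -1)
        reward = 1 if abs(goal - nxt) < abs(goal - position) else -1
        position, total_reward = nxt, total_reward + reward
    return position, total_reward

print(run_episode(["right", "right", "left", "right", "right"]))  # (3, 3)
```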
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on the basis of the following points:
o The performance measure which defines the success criterion.
o The agent's prior knowledge of its environment.
o The best possible actions that the agent can perform.
o The sequence of percepts received so far.
Note: Rationality differs from omniscience, because an omniscient agent knows the actual outcome of its actions and acts accordingly, which is not possible in reality.
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure of an intelligent agent is a combination of architecture and agent program. It can be viewed as:
Agent = Architecture + Agent Program
Following are the main three terms involved in the structure of an AI agent:
o Architecture: The machinery that the AI agent executes on.
o Agent function: A map from the percept sequence to an action, f: P* → A.
o Agent program: An implementation of the agent function, which executes on the physical architecture to produce the function f.
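As a toy illustration of the agent function f: P* → A, here is a minimal sketch in Python for the classic two-location vacuum-cleaner world; the location names, percept format, and actions are illustrative assumptions, not part of any specific library.

```python
# A minimal sketch of the agent function f: P* -> A for a hypothetical
# two-location vacuum-cleaner world (names are illustrative).
def vacuum_agent(percept_sequence):
    """Map the percept sequence to an action (a simple reflex rule)."""
    location, status = percept_sequence[-1]  # only the latest percept is used here
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

print(vacuum_agent([("A", "Dirty")]))  # Suck
print(vacuum_agent([("A", "Clean")]))  # Right
```

Note that although the agent function is defined over the whole percept sequence P*, this simple reflex rule only looks at the most recent percept.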
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an AI agent or rational agent, we can group its properties under the PEAS representation model. It is made up of four terms:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here, the performance measure is the objective for the success of an agent's behavior.
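To make the PEAS model concrete, the sketch below groups the properties of a hypothetical self-driving-car agent under the four PEAS headings; all the listed entries are illustrative examples, not an exhaustive specification.

```python
# PEAS description of a hypothetical self-driving-car agent,
# expressed as a plain dictionary (entries are illustrative).
peas_self_driving_car = {
    "Performance": ["safety", "time", "legal drive", "comfort"],
    "Environment": ["roads", "other vehicles", "pedestrians", "road signs"],
    "Actuators":   ["steering", "accelerator", "brake", "signal", "horn"],
    "Sensors":     ["camera", "GPS", "speedometer", "odometer", "sonar"],
}

for component, examples in peas_self_driving_car.items():
    print(f"{component}: {', '.join(examples)}")
```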
Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better actions over time. These are given below:
3. Goal-based agents
o The knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
o The agent needs to know its goal which describes desirable situations.
o Goal-based agents expand the capabilities of the model-based agent by having the "goal"
information.
o They choose an action, so that they can achieve the goal.
o These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved. Such consideration of different scenarios is called searching and planning, which makes an agent proactive.
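The searching mentioned above can be sketched with a tiny breadth-first search over a toy state graph; the graph, state names, and goal are illustrative assumptions invented for this example.

```python
from collections import deque

# A goal-based agent considers sequences of actions. A minimal sketch:
# breadth-first search over a toy state graph (states are illustrative).
def bfs_plan(graph, start, goal):
    """Return the shortest sequence of states from start to goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # goal unreachable

rooms = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_plan(rooms, "A", "D"))  # ['A', 'B', 'D']
```

Breadth-first search is only one of many search strategies a goal-based agent might use; it finds the shortest plan when every step has equal cost.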
4. Utility-based agents
o These agents are similar to goal-based agents but add an extra component of utility measurement, which provides a measure of success in a given state.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The utility-based agent is useful when there are multiple possible alternatives and the agent has to choose the best action to perform.
o The utility function maps each state to a real number to check how efficiently each action achieves the goals.
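A minimal sketch of a utility-based choice, assuming a toy utility function that trades travel time against risk; the outcomes, weights, and action names are invented for illustration.

```python
# Sketch of a utility-based choice: the utility function maps each
# resulting state to a real number, and the agent picks the action
# whose outcome has the highest utility (all values are illustrative).
outcomes = {
    "highway":  {"time": 30, "risk": 0.30},
    "backroad": {"time": 45, "risk": 0.05},
}

def utility(state):
    # Trade off travel time against risk (weights are invented).
    return -state["time"] - 100 * state["risk"]

def choose_action(actions, result, utility_fn):
    """Pick the action whose resulting state maximizes utility."""
    return max(actions, key=lambda a: utility_fn(result(a)))

best = choose_action(outcomes, lambda a: outcomes[a], utility)
print(best)  # backroad: utility -50 beats highway's -60
```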
5. Learning Agents
o A learning agent in AI is a type of agent which can learn from its past experiences; that is, it has learning capabilities.
o It starts acting with basic knowledge and is then able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components, which are:
a. Learning element: It is responsible for making improvements by learning from the environment.
b. Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
c. Performance element: It is responsible for selecting external actions.
d. Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.
o Hence, learning agents are able to learn, analyze their performance, and look for new ways to improve it.
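The four components above can be sketched together as a tiny bandit-style learner; the class layout, learning rate, and reward values are illustrative assumptions, not a standard API.

```python
import random

# The four conceptual components of a learning agent, sketched as a
# tiny two-action learner (all names and numbers are illustrative).
class LearningAgent:
    def __init__(self, actions):
        self.values = {a: 0.0 for a in actions}      # learned knowledge

    def performance_element(self):                    # selects external action
        return max(self.values, key=self.values.get)

    def critic(self, action, reward):                 # feedback vs. a standard
        return reward - self.values[action]

    def learning_element(self, action, reward):       # makes improvements
        self.values[action] += 0.5 * self.critic(action, reward)

    def problem_generator(self):                      # suggests exploratory actions
        return random.choice(list(self.values))

random.seed(0)  # fixed seed so the run is reproducible
agent = LearningAgent(["left", "right"])
for _ in range(20):
    a = agent.problem_generator()
    reward = 1.0 if a == "right" else 0.0  # this toy environment prefers "right"
    agent.learning_element(a, reward)
print(agent.performance_element())  # very likely "right" after training
```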
Agent Environment in AI
An environment is everything in the world which surrounds the agent, but it is not a part of the agent itself. An environment can be described as the situation in which an agent is present. The environment is where the agent lives and operates, and it provides the agent with something to sense and act upon. An environment is mostly said to be non-deterministic.
Features of Environment
As per Russell and Norvig, an environment can have various features from the point of view
of an agent:
1. Fully observable vs Partially observable:
o If an agent's sensors can sense or access the complete state of the environment at each point in time, it is a fully observable environment; otherwise, it is partially observable.
o A fully observable environment is easy to handle, as there is no need to maintain internal state to keep track of the history of the world.
o If an agent has no sensors at all, then the environment is called unobservable.
2. Deterministic vs Stochastic:
o If an agent's current state and selected action completely determine the next state of the environment, then the environment is called deterministic.
o A stochastic environment is random in nature and cannot be determined completely by an agent.
o In a deterministic, fully observable environment, an agent does not need to worry about uncertainty.
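The difference can be sketched with two toy transition functions, one fully determined by state and action, and one that "slips" with some probability; the dynamics are illustrative.

```python
import random

# Deterministic vs. stochastic transitions, sketched side by side
# (the toy dynamics are illustrative).
def deterministic_step(state, action):
    return state + action            # next state is fully determined

def stochastic_step(state, action, slip=0.2):
    if random.random() < slip:       # with some probability the action "slips"
        return state                 # ...and the state does not change
    return state + action

random.seed(42)
print(deterministic_step(5, 1))      # always 6
print(stochastic_step(5, 1))         # usually 6, sometimes 5
```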
3. Episodic vs Sequential:
o In an episodic environment, there is a series of one-shot actions, and only the current percept is required to choose an action.
o However, in a sequential environment, an agent requires memory of past actions to determine the next best action.
4. Single-agent vs Multi-agent
o If only one agent is involved in an environment and operates by itself, then it is called a single-agent environment.
o However, if multiple agents are operating in an environment, then it is called a multi-agent environment.
o The agent design problems in a multi-agent environment are different from those in a single-agent environment.
5. Static vs Dynamic:
o If the environment can change while an agent is deliberating, then it is called a dynamic environment; otherwise, it is static.
o Static environments are easy to deal with because an agent does not need to keep looking at the world while deciding on an action.
o However, in a dynamic environment, agents need to keep looking at the world before each action.
o Taxi driving is an example of a dynamic environment, whereas a crossword puzzle is an example of a static environment.
6. Discrete vs Continuous:
o If there are a finite number of percepts and actions that can be performed within an environment, then it is called a discrete environment; otherwise, it is a continuous environment.
o A chess game comes under a discrete environment, as there is a finite number of moves that can be performed.
o A self-driving car is an example of a continuous environment.
7. Known vs Unknown
o Known and unknown are not actually features of an environment but describe an agent's state of knowledge for performing an action.
o In a known environment, the results of all actions are known to the agent, while in an unknown environment, the agent needs to learn how it works in order to act.
o It is quite possible for a known environment to be partially observable and for an unknown environment to be fully observable.
8. Accessible vs Inaccessible
o If an agent can obtain complete and accurate information about the environment's state, then it is called an accessible environment; otherwise, it is inaccessible.
o An empty room whose state can be defined by its temperature is an example of an accessible environment.
o Information about an event occurring elsewhere on Earth is an example of an inaccessible environment.
Turing Test in AI
In 1950, Alan Turing introduced a test to check whether a machine can think like a human; this test is known as the Turing Test. In this test, Turing proposed that a computer can be said to be intelligent if it can mimic human responses under specific conditions.
The Turing Test was introduced in Turing's 1950 paper, "Computing Machinery and Intelligence," which considered the question, "Can machines think?"
The Turing test is based on a party game, the "imitation game," with some modifications. This game involves three players: one player is a computer, another is a human responder, and the third is a human interrogator, who is isolated from the other two players and whose job is to find out which of the two is the machine.
The test result does not depend on each correct answer, but only on how closely the responses resemble human answers. The computer is permitted to do everything possible to force a wrong identification by the interrogator.
In this game, if the interrogator is not able to identify which player is the machine and which is the human, then the computer passes the test successfully, and the machine is said to be intelligent and able to think like a human.
"In 1991, the New York businessman Hugh Loebner announced the prize competition, offering a $100,000 prize for the first computer to pass the Turing test. However, no AI program to date has come close to passing an undiluted Turing test."
Parry: Parry was a chatterbot created by Kenneth Colby in 1972. Parry was designed to simulate a person with paranoid schizophrenia (a common chronic mental disorder). Parry was described as "ELIZA with attitude" and was tested using a variation of the Turing test in the early 1970s.
Eugene Goostman: Eugene Goostman was a chatbot developed in Saint Petersburg in 2001. The bot competed in a number of Turing tests. In June 2012, at an event promoted as the largest-ever Turing test contest, Goostman won the competition by convincing 29% of the judges that it was human. Goostman was portrayed as a 13-year-old virtual boy.
In 1980, John Searle presented the "Chinese Room" thought experiment in his paper "Minds, Brains, and Programs," which argued against the validity of the Turing test. According to his argument, "Programming a computer may make it appear to understand a language, but it will not produce a real understanding of language or consciousness in a computer."
He argued that machines such as ELIZA and Parry could easily pass the Turing test by manipulating keywords and symbols, but they had no real understanding of language, so this cannot be described as a "thinking" capability like a human's.
Computer Vision Introduction
Computer vision helps us understand the complexity of the human vision system and trains computer systems to interpret and gain a high-level understanding of digital images or videos.
In the early days, developing a machine system with human-like intelligence was just a dream, but with the advancement of artificial intelligence and machine learning, it has become possible. Intelligent systems have been developed that can "see" and interpret the world around them, similar to human eyes. The fiction of yesterday has become the fact of today. In this tutorial, "Computer Vision Introduction," we will discuss a few important concepts of computer vision.
Further, artificial intelligence is the branch of computer science that primarily deals with creating smart and intelligent systems that can behave and think like the human brain. So we can say that if artificial intelligence enables computer systems to think intelligently, computer vision makes them capable of seeing, analyzing, and understanding.
o 1959: The first experiment in computer vision was initiated in 1959, when researchers showed a cat an array of images. They found that its visual system responded first to hard edges or lines; scientifically, this means that image processing begins with simple shapes such as straight edges.
o 1960: In 1960, artificial intelligence was added as a field of academic study to solve human
vision problems.
o 1963: This was another great achievement for scientists when they developed computers that
could transform 2D images into 3-D images.
o 1974: This year, optical character recognition (OCR) and intelligent character recognition (ICR) technologies were successfully developed. OCR solved the problem of recognizing text printed in any font or typeface, whereas ICR can decipher handwritten text. These inventions are among the greatest achievements in document and invoice processing, vehicle number plate recognition, mobile payments, machine translation, etc.
o 1982: In this year, algorithms were developed to detect edges, corners, curves, and other shapes. Further, scientists also developed a network of cells that could recognize patterns.
o 2000: In this year, scientists worked on a study of object recognition.
o 2001: The first real-time face recognition application was developed.
o 2010: The ImageNet data set became available to use with millions of tagged images, which
can be considered the foundation for recent Convolutional Neural Network (CNN) and deep
learning models.
o 2012: A CNN was used as an image recognition technology with a greatly reduced error rate.
o 2014: The COCO dataset was also developed to offer a dataset for object detection and to support future research.
At a certain level, computer vision is all about pattern recognition, which includes training machine systems to understand visual data such as images and videos.
First, a vast amount of labeled visual data is provided to machines to train them. This labeled data enables the machine to analyze different patterns in the data points and relate them to the labels. For example, suppose we provide visual data of millions of dog images. The computer learns from this data, analyzes each photo's shapes, the distances between shapes, colors, and so on, and hence identifies patterns common to dogs and generates a model. As a result, this computer vision model can accurately detect whether an input image contains a dog or not.
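The training-then-prediction loop described above can be shrunk to a toy nearest-centroid classifier, where each "image" is just a two-number feature vector; the data, labels, and method here are illustrative, not how a production vision model works.

```python
# The training loop described above, shrunk to a toy: each "image" is a
# tiny feature vector, and the model is a nearest-centroid classifier
# (data and labels are illustrative).
def train(examples):
    """examples: list of (features, label). Returns label -> centroid."""
    sums, counts = {}, {}
    for feats, label in examples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, v in enumerate(feats):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}

def predict(centroids, feats):
    """Return the label whose centroid is closest to the features."""
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(feats, c))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

data = [([0.9, 0.8], "dog"), ([0.8, 0.9], "dog"),
        ([0.1, 0.2], "not_dog"), ([0.2, 0.1], "not_dog")]
model = train(data)
print(predict(model, [0.85, 0.75]))  # dog
```

Real computer vision systems replace these hand-made feature vectors with pixels and the centroid rule with learned models such as CNNs, but the train-then-predict structure is the same.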
These are a few important prerequisites that are essentially required to start your career in
computer vision technology. Once you are prepared with the above prerequisites, you can
easily start learning and make a career in Computer vision.
o Facial recognition: Computer vision has enabled machines to detect face images of people to verify their identity. The machine is given input images, in which computer vision algorithms detect facial features and compare them with a database of face profiles. Popular social media platforms like Facebook also use facial recognition to detect and tag users. Further, various government agencies employ this feature to identify criminals in video feeds.
o Healthcare and medicine: Computer vision has played an important role in the healthcare and medicine industry. Traditional approaches for evaluating cancerous tumors are time-consuming and give less accurate predictions, whereas computer vision technology provides faster and more accurate chemotherapy response assessments; doctors can identify cancer patients who need faster surgery with life-saving precision.
o Self-driving vehicles: Computer vision also plays a role in self-driving vehicles, helping them make sense of their surroundings by capturing video from different angles around the car and feeding it into the software. This helps detect other cars and objects, read traffic signals and pedestrian paths, etc., and safely drive passengers to their destination.
o Optical character recognition (OCR)
Optical character recognition helps us extract printed or handwritten text from visual data
such as images. Further, it also enables us to extract text from documents like invoices, bills,
articles, etc.
o Machine inspection: Computer vision is vital in providing image-based automatic inspection. It detects defects, features, functional flaws, and other irregularities in manufactured products, determines inspection goals, and chooses lighting and material-handling techniques.
o Retail (e.g., automated checkouts): Computer vision is also being implemented in the retail industry to track products and shelves and to record product movements in the store. This AI-based computer vision technique automatically charges the customer for the selected products upon checkout from the retail store.
o 3D model building: 3D model building or 3D modeling is a technique to generate a 3D
digital representation of any object or surface using the software. In this field also, computer
vision plays its role in constructing 3D computer models from existing objects. Furthermore,
3D modeling has a variety of applications in various places, such as Robotics, Autonomous
driving, 3D tracking, 3D scene reconstruction, and AR/VR.
o Medical imaging: Computer vision helps medical professionals make better decisions about treating patients by building visualizations of specific body parts such as organs and tissues. It helps them obtain more accurate diagnoses and a better patient care system; for example, Computed Tomography (CT) or Magnetic Resonance Imaging (MRI) scans are used to diagnose pathologies, guide medical interventions such as surgical planning, or support research.
o Automotive safety: Computer vision has added an important safety feature in automotive
industries. E.g., if a vehicle is taught to detect objects and dangers, it could prevent an
accident and save thousands of lives and property.
o Surveillance: This is one of the most important and beneficial use cases of computer vision technology. Nowadays, CCTV cameras are fitted in almost every place, such as streets, roads, highways, shops, and stores, to spot various doubtful or criminal activities. Computer vision helps provide live footage of public places to identify suspicious behavior, identify dangerous objects, and prevent crimes by maintaining law and order.
o Fingerprint recognition and biometrics: Computer vision technology detects fingerprints
and biometrics to validate a user's identity. Biometrics deals with recognizing persons based
on physiological characteristics, such as the face, fingerprint, vascular pattern, or iris, and
behavioral traits, such as gait or speech. It combines Computer Vision with knowledge of
human physiology and behavior.
o To create and implement a vision algorithm for working with image and video content pixels
o To develop a data-based approach for better problem solutions.
o Whenever required, you have to work on various AI and ML tasks required for computer
vision, such as image processing.
o Experience in working on various real-time project scenarios for problem-solving.
o Hierarchical problem decomposition, implementation of solutions, and integration with other
sub-systems.
o Should be capable of understanding business objectives and connecting them to technical solutions through effective system design and architecture.
o The candidate must have cumulative work experience in visual data processing and analysis
using machine learning and deep learning.
o Hands-on experience with languages and frameworks used for AI/ML, such as Python, C++, TensorFlow, PyTorch, and Keras.
o Candidates must have good experience in implementing AI techniques.
o Must have good written and verbal communication skills.
o Candidates should be aware of object detection techniques and models such as YOLO,
RCNN, etc.
OpenCV with Python could be the most preferred choice for beginners due to its flexibility, simple syntax, and versatility. Several reasons make Python the best programming language for computer vision, as follows:
o Easy-to-use: Python is very famous as it is easy to learn for entry-level persons and
professionals. Further, Python is also easily adaptable and covers all business needs.
o Most used programming language: Python is one of the most popular programming
languages as it contains complete learning environments to get started with machine learning,
artificial intelligence, deep learning, and computer vision.
o Debugging and visualization: Python has an in-built facility for debugging via 'PDB' and
visualization through Matplotlib.
o Reasoning and analytical skills: All programming languages and technologies require
basic logic behind any task. To become a computer vision expert, you must have strong
reasoning and analytical skills; without such skills, defining any attribute in visual content
may be a big problem.
o Privacy and security: Privacy and security are among the most important concerns for any
country. Vision-powered surveillance raises serious privacy issues in many countries, and
such systems must restrict users from accessing unauthorized content. For this reason, various
countries also avoid face recognition and detection techniques for privacy and security
reasons.
o Duplicate and false content: Cyber security is always a big concern for all organizations,
and they always try to protect their data from hackers and cyber fraud. A data breach can lead
to serious problems, such as duplicate images and videos being spread over the internet.
o Object Tracking: Object tracking is a computer vision technique used to follow a particular
object or multiple objects. Generally, object tracking has applications in videos and real-world
interactions, where objects are first detected and then tracked across frames. Object tracking
is used in applications such as autonomous vehicles, where apart from object classification
and detection (pedestrians, other vehicles, etc.), real-time motion tracking is also required to
avoid accidents and follow the traffic rules.
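A core step in object tracking is associating each existing track with the nearest new detection from frame to frame. Below is a minimal, hypothetical sketch of centroid-based matching; the function name and distance threshold are illustrative, not from any particular library:

```python
import math

def match_detections(tracks, detections, max_dist=50.0):
    """Greedily match each track centroid to the nearest new detection
    centroid within max_dist; unmatched detections start new tracks."""
    assignments = {}
    unused = list(range(len(detections)))
    for track_id, (tx, ty) in tracks.items():
        best, best_d = None, max_dist
        for i in unused:
            dx, dy = detections[i]
            d = math.hypot(tx - dx, ty - dy)
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            assignments[track_id] = best
            unused.remove(best)
    return assignments, unused  # matched pairs, indices for new tracks

tracks = {0: (10.0, 10.0), 1: (100.0, 100.0)}
detections = [(12.0, 11.0), (200.0, 5.0)]
assignments, new_tracks = match_detections(tracks, detections)
```

Here track 0 is matched to the nearby detection, while the far-away detection spawns a new track; production trackers replace this greedy step with, for example, Hungarian assignment and motion models.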
o Semantic Segmentation: Image segmentation is not only about detecting the classes in an
image as image classification. Instead, it classifies each pixel of an image to specify what
objects it has. It tries to determine the role of each pixel in the image.
Above are some of the most common applications of computer vision. Now let us discuss
applications of computer vision across different sectors such as retail, healthcare, etc.
o X-Ray Analysis
Computer vision can be successfully applied for medical X-ray imaging. Although most
doctors still prefer manual analysis of X-ray images to diagnose and treat diseases, with
computer vision, X-ray analysis can be automated with enhanced efficiency and
accuracy. State-of-the-art image recognition algorithms can be used to detect patterns in an X-
ray image that are too subtle for the human eye.
o Cancer Detection
Computer vision is being successfully applied for breast and skin cancer detection. With
image recognition, doctors can identify anomalies by comparing cancerous and non-
cancerous cells in images. With automated cancer detection, doctors can diagnose cancer
faster from an MRI scan.
o CT Scan and MRI
Computer vision has now been widely applied in CT scan and MRI analysis. AI with
computer vision enables systems that analyse radiology images with a high level of accuracy,
similar to a human doctor, and also reduce the time for disease detection, enhancing the
chances of saving a patient's life. It also includes deep learning algorithms that enhance the
resolution of MRI images and hence improve patient outcomes.
o Self-driving cars
Computer vision is widely used in self-driving cars. It is used to detect and classify objects
(e.g., road signs or traffic lights) and to create 3D maps and motion estimates, and it plays a
key role in making autonomous vehicles a reality.
o Pedestrian detection
Computer vision has great application and research value in pedestrian detection due to its
high impact on the design of pedestrian systems in various smart cities. With the help of
cameras, pedestrian detection automatically identifies and locates pedestrians in images or
video. Moreover, it also considers the variations among pedestrians related to attire, body
position, and illumination in different scenarios. Pedestrian detection is very helpful in
different fields such as traffic management, autonomous driving, transit safety, etc.
o Road Condition Monitoring & Defect detection
Computer vision has also been applied to monitoring the road infrastructure condition by
assessing the variations in concrete and tar. A computer vision-enabled system automatically
senses pavement degradation, which increases the efficiency of road maintenance allocation
and decreases safety risks related to road accidents.
To perform road condition monitoring, CV algorithms collect the image data and then process
it to create an automatic crack detection and classification system.
o Defect Detection
This is perhaps the most common application of computer vision. Until now, the detection of
defects has been carried out by trained people in selected batches, and total production control
is usually impossible. With computer vision, we can detect defects such as cracks in metals,
paint defects, bad prints, etc., in sizes smaller than 0.05mm.
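The basic thresholding idea behind simple defect detection can be sketched with plain NumPy: pixels noticeably darker than the surrounding surface are flagged as defect candidates. The function name and threshold below are illustrative assumptions, not a production inspection pipeline:

```python
import numpy as np

def find_defect_pixels(gray, threshold=60):
    """Flag pixels darker than the threshold as defect candidates
    (cracks and bad prints typically appear darker than the surface)."""
    mask = gray < threshold
    return mask, int(mask.sum())

# Synthetic 5x5 grayscale "surface" with a dark crack down one column
surface = np.full((5, 5), 200, dtype=np.uint8)
surface[:, 2] = 30
mask, count = find_defect_pixels(surface)
```

Real systems add calibration, morphological filtering, and a classifier on top of such masks, but per-pixel anomaly flagging is the starting point.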
o Analyzing text and barcodes (OCR)
Nowadays, each product contains a barcode on its packaging, which can be analyzed or read
with the help of the computer vision technique OCR. Optical character recognition or OCR
helps us detect and extract printed or handwritten text from visual data such as images.
Further, it enables us to extract text from documents like invoices, bills, articles, etc., and
verify it against databases.
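As a small, self-contained illustration of verifying a scanned code, here is the standard EAN-13 check-digit rule (alternating weights 1 and 3 over the first 12 digits); the function names are my own, not from a barcode library:

```python
def ean13_check_digit(digits12: str) -> int:
    """Check digit for the first 12 digits of an EAN-13 barcode:
    odd positions weigh 1, even positions weigh 3 (1-indexed)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3)
                for i, d in enumerate(digits12))
    return (10 - total % 10) % 10

def is_valid_ean13(code: str) -> bool:
    """True if the 13-digit code's last digit matches its check digit."""
    return (len(code) == 13 and code.isdigit()
            and ean13_check_digit(code[:12]) == int(code[12]))
```

After OCR or barcode decoding extracts the digit string from the image, a check like this catches misreads before the code is looked up in a database.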
o Fingerprint recognition and Biometrics
Computer vision technology is used to detect fingerprints and biometrics to validate a user's
identity.
Biometrics is the measurement or analysis of the physiological characteristics that make a
person unique, such as the face, fingerprints, or iris patterns. It makes use of computer
vision along with knowledge of human physiology and behaviour.
o 3D Model building
3D model building, or 3D modelling, is a technique to generate a 3D digital representation of
any object or surface using software. Computer vision plays its role here also in
constructing 3D computer models from existing objects. Furthermore, 3D modelling has a
variety of applications in various places, such as Robotics, Autonomous driving, 3D tracking,
3D scene reconstruction, and AR/VR.
o Crop Monitoring
In the agriculture sector, crop and yield monitoring are the most important tasks for better
agriculture. Traditionally, it depends on subjective human judgment, which is not always
accurate. With computer vision systems, crops can be monitored in real time and any
variation due to disease or nutritional deficiency can be identified.
o Automatic Weeding
An automatic weeding machine is an intelligent project enabled with AI and computer vision
that removes unwanted plants or weeds around the crops. Traditional weeding methods
require human labour, which is costly and inefficient compared to automatic weeding
systems.
Computer vision enables the intelligent detection and removal of weeds using robots, which
reduces costs and ensures higher yields.
o Plant Disease Detection
Computer vision is also used in automated plant disease detection, which is important at an
early stage of plant development. Various deep-learning-based algorithms use computer
vision to identify plant diseases, estimate their severity and predict their impact on yields.
o Self-checkout
Self-checkout enables the customers to complete their transactions from a retailer without the
need for human staff, and this becomes possible with computer vision. Self-checkouts are
now helping retailers avoid long queues and manage customers better.
o Automatic replenishment
Automated stock replenishment is a leading technology innovation in retail sectors.
Traditionally, stock replenishment is performed by store staff, who check shelves to track
items for inventory management. But now, automatic replenishment with computer vision
systems captures image data and performs a complete inventory scan to track shelf items
at regular intervals.
o People Counting
Nowadays, various situations occur where we may need the count of people or customers
entering and leaving the stores. This foot count or people counting can be done by computer
vision systems that analyze the image or video data captured by the in-store cameras. People
counting is helpful in managing crowds and limiting occupancy in situations such as Covid
social distancing.
Computer Vision Techniques
As human beings, we can see, process, understand, and act on any visual input; in other
words, we have the ability to see and understand any visual data. But how can we implement
the same thing in machines? This is where Computer Vision comes into the picture. Although
machines still have various limitations in visualising the way humans do, they are very close
to analysing, understanding, and extracting meaningful information from any visual input.
Nowadays, computer vision is one of the trending research areas within deep learning.
In this topic, we will have a deep understanding of different computer vision techniques that
are currently being used in several applications. However, before starting, let's first
understand the basic introduction of computer vision.
A typical process of Computer vision is illustrated in the above image. It mainly performs
three steps, which are:
1. Capturing an Image
First, the system captures the visual input and stores it as digital data in a file.
2. Processing the Image
In the next step, different CV algorithms are used to process the digital data stored in the file.
These algorithms determine the basic geometric elements and generate the image using the
stored digital data.
3. Analysing the Image
Finally, CV analyses the data, and according to this analysis, the system takes the
required action for which it is designed.
1. Image Classification
Image classification is the simplest technique of Computer Vision. The main aim of image
classification is to classify the image into one or more categories. An image classifier
basically takes an image as input and tells which objects are present in that image, such
as a person, dog, tree, etc. However, it does not give you further information about the
image, such as how many persons there are, the tree colour, item positions, etc.; for that,
we need to go for another CV technique.
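To make the idea concrete, here is a toy, hypothetical nearest-centroid classifier over synthetic "images" (NumPy arrays). Real classifiers use convolutional networks, but the principle of mapping an image to the closest-matching class is the same; all names here are illustrative:

```python
import numpy as np

def train_centroids(images, labels):
    """Average the flattened training images per class label."""
    centroids = {}
    for label in set(labels):
        group = [img.ravel() for img, l in zip(images, labels) if l == label]
        centroids[label] = np.mean(group, axis=0)
    return centroids

def classify(image, centroids):
    """Assign the class whose mean image is closest in pixel space."""
    flat = image.ravel()
    return min(centroids, key=lambda l: np.linalg.norm(flat - centroids[l]))

# Synthetic 4x4 "images": a bright class vs a dark class
bright = [np.full((4, 4), v, float) for v in (200, 210)]
dark = [np.full((4, 4), v, float) for v in (20, 30)]
centroids = train_centroids(bright + dark, ["bright"] * 2 + ["dark"] * 2)
print(classify(np.full((4, 4), 190.0), centroids))  # bright
```

Note that, exactly as the text says, the classifier outputs only a category label; it says nothing about where objects are in the image.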
2. Object Detection
Object detection is another popular technique of computer vision that can be performed after
image classification, or which uses image classification to detect the objects in visual data. It
is basically used to recognize objects within bounding boxes and find the class of the
objects in the image. Object detection makes use of deep learning and machine learning
technology to generate useful results.
Object detection has several applications, including object tracking, retrieval, video
surveillance, image captioning, etc.
A variety of techniques can be used to perform object detection, which includes R-CNN,
YOLO v2, etc.
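Detectors such as R-CNN and YOLO evaluate predicted bounding boxes against ground truth using Intersection-over-Union (IoU), a standard metric. A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping region
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

IoU ranges from 0 (no overlap) to 1 (identical boxes); a detection is typically counted as correct when its IoU with the ground-truth box exceeds a threshold such as 0.5.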
3. Semantic Segmentation
Semantic segmentation is not only about detecting the classes in an image, as image
classification does. Instead, it classifies each pixel of an image to specify what objects it
contains, trying to determine the role of each pixel in the image. It basically classifies pixels
into a particular category without differentiating the object instances; in other words, it
classifies similar objects as a single class at the pixel level. For example, if an image contains
two dogs, semantic segmentation will put both dogs under the same label.
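In a typical deep-learning pipeline, the final segmentation step reduces to picking, for every pixel, the class with the highest score. A small NumPy sketch; the (num_classes, H, W) score layout is an assumption, not a fixed convention:

```python
import numpy as np

def label_pixels(class_scores):
    """class_scores: (num_classes, H, W) array of per-pixel class scores.
    Semantic segmentation assigns every pixel its highest-scoring class."""
    return np.argmax(class_scores, axis=0)

# Two classes over a 2x3 image: class 0 wins in the left column,
# class 1 wins everywhere else
scores = np.zeros((2, 2, 3))
scores[0, :, :1] = 1.0
scores[1, :, 1:] = 1.0
mask = label_pixels(scores)
```

The resulting mask labels every pixel but, as the text notes, both dogs in an image would share one label; separating them requires instance segmentation.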
4. Instance Segmentation
Instance segmentation can classify the objects in an image at pixel level as similar to
semantic segmentation but with a more advanced level. It means Instance Segmentation can
classify similar types of objects into different categories. For example, if a visual consists of
several cars, then with semantic segmentation we can only tell that there are multiple cars, but
with instance segmentation, we can label each car separately according to its colour, shape, etc.
Using the below image, we can analyse the difference between semantic segmentation and
instance segmentation, where semantic segmentation classifies all the persons as a single
entity, whereas instance segmentation distinguishes each person, considering colours as
well.
5. Panoptic Segmentation
Panoptic segmentation combines semantic and instance segmentation: it assigns every pixel
of an image a class label and, for countable objects, an instance identity as well.
6. Keypoint Detection
Keypoint detection tries to detect some key points in an image to give more details about a
class of objects. It basically detects people and localizes their key points. There are mainly
two keypoint detection areas, which are Body Keypoint Detection and Facial Keypoint
Detection.
For example, Facial keypoint detection includes detecting key parts of the human face such
as the nose, eyes, corners, eyebrows, etc. Keypoint detection mainly has applications,
including face detection, pose detection, etc.
With Pose estimation, we can detect what pose people have in a given image, which usually
includes where the head, eyes, nose, arms, shoulders, hands, and legs are in an image. This
can be done for a single person or multiple people as per the need.
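Once keypoints are localized, simple geometry over their coordinates yields pose information. For instance, a joint angle can be computed from three keypoints; the function below is an illustrative sketch, not a library API:

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b formed by keypoints a and c,
    e.g. the elbow angle from shoulder, elbow, and wrist positions."""
    ang1 = math.atan2(a[1] - b[1], a[0] - b[0])
    ang2 = math.atan2(c[1] - b[1], c[0] - b[0])
    deg = abs(math.degrees(ang1 - ang2))
    return 360 - deg if deg > 180 else deg

# A fully extended arm: shoulder, elbow, wrist on one line -> ~180 degrees
print(joint_angle((0, 0), (1, 0), (2, 0)))
```

Measures like this, computed over detected body keypoints, are what pose-estimation applications (fitness coaching, gesture recognition) actually consume.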
7. Person Segmentation
Person segmentation is a type of image segmentation technique which is used to separate the
person from the background within an image. It can be used after the pose estimation, as with
this, we can closely identify the exact location of the person in the image as well as the pose
of that person.
8. Depth Perception
Depth perception is a computer vision technique that provides the visual ability to machines
to estimate the 3D depth/distance of an object from the source. Depth Perception has wide
applications, including the Reconstruction of objects in Augmented Reality, Robotics, self-
driving cars, etc. LiDAR (Light Detection and Ranging) is one of the popular techniques used
for depth perception. With the help of laser beams, it measures the relative distance
of an object by illuminating it with laser light and then measuring the reflections using
sensors.
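For stereo cameras, a common alternative to LiDAR, depth follows from the classic relation depth = focal_length × baseline / disparity: the larger the pixel disparity between the left and right views, the closer the object. A minimal sketch with illustrative parameter values:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth (metres) from focal length (pixels), camera baseline (metres),
    and the pixel disparity of a point between the two views."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# e.g. 700 px focal length, 12 cm baseline, 42 px disparity -> ~2 m away
print(stereo_depth(700.0, 0.12, 42.0))
```

Real pipelines first rectify the image pair and estimate disparity per pixel (e.g. with block matching), then apply this relation to build a depth map.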
9. Image Captioning
Image captioning, as the name suggests, is about giving a suitable caption to the image that
can describe the image. It makes use of neural networks: when we input an image, they
generate a caption that can easily describe the image. It is not only a task of Computer
vision but also an NLP task.
10. 3D Object Reconstruction
As the name suggests, 3D object reconstruction is a technique that can extract a 3D object
from a 2D image. It is currently a fast-developing field of computer vision, and it can be
done in different ways for different objects. One of the most successful papers on this
technique is PiFuHD, which addresses 3D human digitization.