
Unit-1

What is Artificial Intelligence (AI)?


In today's world, technology is growing very fast, and we come into contact with new technologies every day.

Here, one of the booming technologies of computer science is Artificial Intelligence, which is ready to create a new revolution in the world by making intelligent machines. Artificial Intelligence is now all around us, at work in a variety of subfields ranging from general to specific, such as self-driving cars, playing chess, proving theorems, composing music, painting, etc.

AI is one of the most fascinating and universal fields of computer science, and it has great scope in the future. AI aims to make a machine work like a human.

Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made" and Intelligence means "thinking power"; hence AI means "man-made thinking power."

So, we can define AI as:

"It is a branch of computer science by which we can create intelligent machines which can
behave like a human, think like humans, and able to make decisions."

Artificial Intelligence exists when a machine has human-like skills such as learning, reasoning, and problem-solving.

With Artificial Intelligence, you do not need to preprogram a machine for every task; instead, you can create a machine with programmed algorithms which can work with its own intelligence, and that is the power of AI.

AI is believed not to be a new idea; some people even say that, according to Greek myth, there were mechanical men in early days which could work and behave like humans.

Why Artificial Intelligence?


Before learning about Artificial Intelligence, we should know the importance of AI and why we should learn it. Following are some main reasons to learn about AI:

o With the help of AI, you can create software or devices which can solve real-world problems easily and accurately, in areas such as health, marketing, traffic, etc.
o With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google Assistant, or Siri.
o With the help of AI, you can build robots which can work in environments where human survival is at risk.
o AI opens a path to other new technologies, new devices, and new opportunities.

Goals of Artificial Intelligence


Following are the main goals of Artificial Intelligence:

1. Replicate human intelligence


2. Solve Knowledge-intensive tasks
3. An intelligent connection of perception and action
4. Building a machine which can perform tasks that require human intelligence, such as:
o Proving a theorem
o Playing chess
o Planning a surgical operation
o Driving a car in traffic
5. Creating a system which can exhibit intelligent behavior, learn new things by itself, demonstrate, explain, and advise its user.


What Comprises Artificial Intelligence?


Artificial Intelligence is not just a part of computer science; it is vast and draws on many other fields that contribute to it. To create AI, we should first know how intelligence is composed: intelligence is an intangible capacity of our brain which is a combination of reasoning, learning, problem-solving, perception, language understanding, etc.

To achieve these capabilities in a machine or software, Artificial Intelligence draws on the following disciplines:

o Mathematics
o Biology
o Psychology
o Sociology
o Computer Science
o Neuroscience
o Statistics

Advantages of Artificial Intelligence


Following are some main advantages of Artificial Intelligence:

o High accuracy with fewer errors: AI machines or systems are less prone to errors and achieve high accuracy, as they take decisions based on prior experience or information.
o High speed: AI systems can be very fast at decision-making; because of this, AI systems can beat a chess champion at chess.
o High reliability: AI machines are highly reliable and can perform the same action many times with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human would be risky.
o Digital assistance: AI can be very useful in providing digital assistance to users; for example, AI technology is currently used by various e-commerce websites to show products according to customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars which can make our journeys safer and hassle-free, facial recognition for security purposes, natural language processing to communicate with humans in human language, etc.

Disadvantages of Artificial Intelligence


Every technology has some disadvantages, and the same goes for Artificial Intelligence. However advantageous it is, the technology still has some disadvantages which we need to keep in mind while creating an AI system. Following are the disadvantages of AI:

o High cost: The hardware and software requirements of AI are very costly, as AI requires a lot of maintenance to meet current-world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot think outside the box; a robot will only do the work for which it is trained or programmed.
o No feelings and emotions: AI machines can be outstanding performers, but they have no feelings, so they cannot form any kind of emotional attachment with humans, and may sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: With the advance of technology, people are becoming more dependent on devices, and hence are losing some of their mental capabilities.
o No original creativity: Humans are creative and can imagine new ideas, but AI machines cannot match this power of human intelligence and cannot be creative and imaginative.

Application of AI
Artificial Intelligence has various applications in today's society. It is becoming essential in our time because it can solve complex problems efficiently across multiple industries, such as healthcare, entertainment, finance, education, etc. AI is making our daily life more comfortable and fast.

Following are some sectors which have the application of Artificial Intelligence:
1. AI in Astronomy
o Artificial Intelligence can be very useful in solving complex problems about the universe. AI technology can be helpful for understanding the universe, such as how it works, its origin, etc.

2. AI in Healthcare
o In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on it.
o The healthcare industry is applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can warn them when patients are worsening so that medical help can reach the patient before hospitalization.


3. AI in Gaming
o AI can be used for gaming. AI machines can play strategic games like chess, where the machine needs to think about a large number of possible positions.

4. AI in Finance
o AI and the finance industry are an excellent match for each other. The finance industry is implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine learning into financial processes.
5. AI in Data Security
o Data security is crucial for every company, and cyber-attacks are growing rapidly in the digital world. AI can be used to make your data safer and more secure. Examples such as the AEG bot and the AI2 platform are used to detect software bugs and cyber-attacks more effectively.

6. AI in Social Media
o Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to be stored and managed very efficiently. AI can organize and manage massive amounts of data, and can analyze that data to identify the latest trends, hashtags, and the requirements of different users.

7. AI in Travel & Transport


o AI is in high demand in the travel industry. AI is capable of doing various travel-related tasks, from making travel arrangements to suggesting hotels, flights, and the best routes to customers. Travel companies are using AI-powered chatbots which can interact with customers in a human-like way for better and faster responses.

8. AI in Automotive Industry
o Some automotive companies are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
o Various companies are currently working on developing self-driving cars which can make your journey safer and more secure.

9. AI in Robotics:
o Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed to perform some repetitive task, but with the help of AI we can create intelligent robots which can perform tasks based on their own experience, without being pre-programmed.
o Humanoid robots are the best examples of AI in robotics; recently, the intelligent humanoid robots named Erica and Sophia have been developed, which can talk and behave like humans.

10. AI in Entertainment
o We currently use AI-based applications in our daily life through entertainment services such as Netflix or Amazon. With the help of ML/AI algorithms, these services show recommendations for programs or shows.
11. AI in Agriculture
o Agriculture is an area which requires various resources such as labor, money, and time for the best results. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI in the form of agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.

12. AI in E-commerce
o AI is providing a competitive edge to the e-commerce industry, where it is in growing demand. AI helps shoppers discover associated products in their preferred size, color, or even brand.

13. AI in education:
o AI can automate grading so that tutors have more time to teach. An AI chatbot can communicate with students as a teaching assistant.
o In the future, AI could work as a personal virtual tutor for students, easily accessible at any time and any place.

History of Artificial Intelligence


Artificial Intelligence is not a new word and not a new technology for researchers. This technology is much older than you might imagine; there are even myths of mechanical men in ancient Greek and Egyptian mythology. Following are some milestones in the history of AI which trace the journey from AI's beginnings to its development today.
Maturation of Artificial Intelligence (1943-1952)
o Year 1943: The first work which is now recognized as AI was done by Warren McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950. Turing published "Computing Machinery and Intelligence," in which he proposed a test to check a machine's ability to exhibit intelligent behavior equivalent to human intelligence, called the Turing test.

The birth of Artificial Intelligence (1952-1956)


o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program," which was named the "Logic Theorist." This program proved 38 of 52 mathematics theorems, and found new and more elegant proofs for some of them.
o Year 1956: The term "Artificial Intelligence" was first adopted by American computer scientist John McCarthy at the Dartmouth Conference. For the first time, AI was coined as an academic field.

Around that time, high-level computer languages such as FORTRAN, LISP, and COBOL were invented, and enthusiasm for AI was very high.

The golden years-Early enthusiasm (1956-1974)


o Year 1966: Researchers emphasized developing algorithms which could solve mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, named ELIZA.
o Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in Japan.

The first AI winter (1974-1980)


o The period from 1974 to 1980 was the first AI winter. An AI winter refers to a period in which computer scientists faced a severe shortage of government funding for AI research.
o During AI winters, public interest in artificial intelligence decreased.
A boom of AI (1980-1987)
o Year 1980: After the AI winter, AI came back with "expert systems." Expert systems were programs that emulated the decision-making ability of a human expert.
o In 1980, the first national conference of the American Association for Artificial Intelligence was held at Stanford University.

The second AI winter (1987-1993)


o The period from 1987 to 1993 was the second AI winter.
o Again, investors and governments stopped funding AI research due to high costs and inefficient results. Expert systems such as XCON proved very costly to maintain.

The emergence of intelligent agents (1993-2011)


o Year 1997: In 1997, IBM's Deep Blue beat world chess champion Garry Kasparov, becoming the first computer to beat a reigning world chess champion.
o Year 2002: For the first time, AI entered the home, in the form of Roomba, a robot vacuum cleaner.
o Year 2006: By 2006, AI had entered the business world. Companies like Facebook, Twitter, and Netflix started using AI.

Deep learning, big data and artificial general intelligence


(2011-present)
o Year 2011: In 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to solve complex questions as well as riddles. Watson proved that it could understand natural language and solve tricky questions quickly.
o Year 2012: Google launched the Android app feature "Google Now," which could provide information to the user as a prediction.
o Year 2014: In 2014, the chatbot "Eugene Goostman" won a competition based on the famous "Turing test."
o Year 2018: IBM's "Project Debater" debated complex topics with two master debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex," a virtual assistant which booked a hairdresser appointment over the phone, and the person on the other side didn't notice that she was talking with a machine.
Types of Artificial Intelligence:
Artificial Intelligence can be divided into various types; there are mainly two categorizations, one based on the capabilities of AI and one based on its functionality.

AI type-1: Based on Capabilities


1. Weak AI or Narrow AI:
o Narrow AI is a type of AI which can perform a dedicated task with intelligence. Narrow AI is the most common and currently available form of AI in the world of Artificial Intelligence.
o Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific task; hence it is also termed weak AI. Narrow AI can fail in unpredictable ways if it goes beyond its limits.
o Apple's Siri is a good example of Narrow AI; it operates within a limited, pre-defined range of functions.
o IBM's Watson supercomputer also comes under Narrow AI, as it uses an expert-system approach combined with machine learning and natural language processing.
o Some examples of Narrow AI are playing chess, purchase suggestions on e-commerce sites, self-driving cars, speech recognition, and image recognition.

2. General AI:
o General AI is a type of intelligence which could perform any intellectual task as efficiently as a human.
o The idea behind general AI is to make a system which could be smart and think like a human on its own.
o Currently, no system exists which comes under general AI and can perform any task as perfectly as a human.
o Researchers worldwide are now focused on developing machines with general AI.
o Systems with general AI are still under research, and it will take a lot of effort and time to develop them.

3. Super AI:
o Super AI is a level of system intelligence at which machines could surpass human intelligence and perform any task better than a human, with cognitive properties. It is an outcome of general AI.
o Some key characteristics of super AI include the ability to think, reason, solve puzzles, make judgments, plan, learn, and communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. Developing such systems in reality is still a world-changing task.

Artificial Intelligence type-2: Based on functionality


1. Reactive Machines
o Purely reactive machines are the most basic type of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines focus only on the current scenario and react to it with the best possible action.
o IBM's Deep Blue system is an example of a reactive machine.
o Google's AlphaGo is also an example of a reactive machine.

2. Limited Memory
o Limited-memory machines can store past experiences or some data for a short period of time.
o These machines can use stored data for a limited time only.
o Self-driving cars are one of the best examples of limited-memory systems. These cars can store the recent speed of nearby cars, the distance to other cars, the speed limit, and other information needed to navigate the road.

3. Theory of Mind
o Theory-of-Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
o This type of AI machine has not yet been developed, but researchers are making a lot of effort and progress toward developing such machines.

4. Self-Awareness
o Self-aware AI is the future of Artificial Intelligence. These machines will be super-intelligent and will have their own consciousness, sentiments, and self-awareness.
o These machines will be smarter than the human mind.
o Self-aware AI does not yet exist in reality; it is a hypothetical concept.

Agents in Artificial Intelligence


An AI system can be defined through the study of a rational agent and its environment. Agents sense the environment through sensors and act on the environment through actuators. An AI agent can have mental properties such as knowledge, beliefs, intentions, etc.

What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting. An agent can be:

o Human agent: A human agent has eyes, ears, and other organs which work as sensors, and hands, legs, and a vocal tract which work as actuators.
o Robotic agent: A robotic agent can have cameras and infrared range finders as sensors, and various motors as actuators.
o Software agent: A software agent can take keystrokes and file contents as sensory input, act on those inputs, and display output on the screen.

Hence, the world around us is full of agents, such as thermostats, cellphones, and cameras; even we ourselves are agents.

Before moving forward, we should first know about sensors, effectors, and actuators.

Sensor: A sensor is a device which detects changes in the environment and sends the information to other electronic devices. An agent observes its environment through sensors.

Actuators: Actuators are the components of a machine that convert energy into motion. Actuators are responsible for moving and controlling a system. An actuator can be an electric motor, gears, rails, etc.

Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels, arms, fingers, wings, fins, or a display screen.

Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators to achieve goals. An intelligent agent may learn from the environment to achieve its goals. A thermostat is an example of an intelligent agent.

Following are the main four rules for an AI agent:

o Rule 1: An AI agent must have the ability to perceive the environment.


o Rule 2: The observation must be used to make decisions.
o Rule 3: Decision should result in an action.
o Rule 4: The action taken by an AI agent must be a rational action.


Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way that maximizes its performance measure over all possible actions.

A rational agent is said to do the right thing. AI is about creating rational agents for use with game theory and decision theory in various real-world scenarios.

For an AI agent, rational action is most important because, in AI's reinforcement learning algorithms, the agent gets a positive reward for each best possible action and a negative reward for each wrong action.

Note: Rational agents in AI are very similar to intelligent agents.

Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on the basis of the following points:

o The performance measure, which defines the success criterion.
o The agent's prior knowledge of its environment.
o The best possible actions the agent can perform.
o The sequence of percepts.

Note: Rationality differs from omniscience, because an omniscient agent knows the actual outcome of its actions and acts accordingly, which is not possible in reality.

Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure of an intelligent agent is a combination of architecture and an agent program. It can be viewed as:

1. Agent = Architecture + Agent program

Following are the three main terms involved in the structure of an AI agent:

Architecture: The architecture is the machinery that the AI agent executes on.

Agent function: The agent function maps a percept sequence to an action:

f : P* → A

Agent program: The agent program is an implementation of the agent function. The agent program executes on the physical architecture to produce the function f.
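The mapping f : P* → A and the split between architecture and agent program can be sketched in Python. This is a minimal illustrative sketch, not a standard API; the class, function, percept, and action names are all assumptions:

```python
def agent_function(percept_sequence):
    """Map the full history of percepts to an action (f: P* -> A)."""
    # A trivial agent program: act only on the most recent percept.
    latest = percept_sequence[-1]
    return "clean" if latest == "dirty" else "move"

class Agent:
    """Agent = architecture + agent program."""
    def __init__(self, program):
        self.program = program   # the agent program (implements f)
        self.percepts = []       # percept history, maintained by the architecture

    def step(self, percept):
        # The architecture records the percept and runs the program on P*.
        self.percepts.append(percept)
        return self.program(self.percepts)

agent = Agent(agent_function)
print(agent.step("dirty"))   # clean
print(agent.step("clean"))   # move
```

Here the `Agent` class plays the role of the architecture (it stores the percept history and invokes the program), while `agent_function` is the agent program implementing f.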

PEAS Representation
PEAS is a model on which an AI agent works. When we define an AI agent or rational agent, we can group its properties under the PEAS representation model. It is made up of four terms:

o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors

Here, the performance measure is the objective for the success of an agent's behavior.

PEAS for self-driving cars:

Let's consider a self-driving car; its PEAS representation will be:

Performance: Safety, time, legal drive, comfort


Environment: Roads, other vehicles, road signs, pedestrian

Actuators: Steering, accelerator, brake, signal, horn

Sensors: Camera, GPS, speedometer, odometer, accelerometer, sonar.
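The PEAS grouping above can be written down as a small data structure. This is just an illustrative sketch (the `PEAS` class is not a standard representation), populated with the self-driving-car example:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """Groups an agent's properties under the four PEAS components."""
    performance: list   # P: performance measure
    environment: list   # E: environment
    actuators: list     # A: actuators
    sensors: list       # S: sensors

self_driving_car = PEAS(
    performance=["safety", "time", "legal drive", "comfort"],
    environment=["roads", "other vehicles", "road signs", "pedestrians"],
    actuators=["steering", "accelerator", "brake", "signal", "horn"],
    sensors=["camera", "GPS", "speedometer", "odometer", "accelerometer", "sonar"],
)
```

Writing PEAS this way makes it easy to compare agents side by side, as in the table of examples below.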

Examples of Agents with their PEAS representation

1. Medical Diagnosis
o Performance measure: Healthy patient, minimized cost
o Environment: Patient, hospital, staff
o Actuators: Tests, treatments
o Sensors: Keyboard (entry of symptoms)

2. Vacuum Cleaner
o Performance measure: Cleanness, efficiency, battery life, security
o Environment: Room, table, wood floor, carpet, various obstacles
o Actuators: Wheels, brushes, vacuum extractor
o Sensors: Camera, dirt detection sensor, cliff sensor, bump sensor, infrared wall sensor

3. Part-picking Robot
o Performance measure: Percentage of parts in correct bins
o Environment: Conveyor belt with parts, bins
o Actuators: Jointed arms, hand
o Sensors: Camera, joint angle sensors

Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All of these agents can improve their performance and generate better actions over time. They are given below:

o Simple Reflex Agent


o Model-based reflex agent
o Goal-based agents
o Utility-based agent
o Learning agent

1. Simple Reflex agent:


o Simple reflex agents are the simplest agents. They take decisions on the basis of the current percept and ignore the rest of the percept history.
o These agents only succeed in fully observable environments.
o The simple reflex agent does not consider any part of the percept history in its decision and action process.
o The simple reflex agent works on the condition-action rule, which maps the current state to an action. For example, a room-cleaner agent acts only if there is dirt in the room.
o Problems with the simple reflex agent design approach:
o They have very limited intelligence.
o They have no knowledge of non-perceptual parts of the current state.
o The rule set is usually too big to generate and store.
o They are not adaptive to changes in the environment.
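The condition-action rule can be sketched for the two-room vacuum world often used to illustrate simple reflex agents. The percept format (a location and a dirt status) and the action names are assumptions for illustration only:

```python
def simple_reflex_vacuum(percept):
    """A condition-action rule: decide from the CURRENT percept only.

    percept is a (location, status) pair, e.g. ("A", "dirty").
    No percept history is stored -- that is what makes the agent 'simple reflex'.
    """
    location, status = percept
    if status == "dirty":        # condition: dirt here -> action: suck
        return "suck"
    # Otherwise move to the other room.
    return "move_right" if location == "A" else "move_left"

print(simple_reflex_vacuum(("A", "dirty")))   # suck
print(simple_reflex_vacuum(("B", "clean")))   # move_left
```

Note that the function is stateless: given the same percept it always returns the same action, which is exactly why such an agent fails in partially observable environments.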

2. Model-based reflex agent


o The model-based agent can work in a partially observable environment and track the situation.
o A model-based agent has two important factors:
o Model: knowledge about "how things happen in the world"; this is why it is called a model-based agent.
o Internal state: a representation of the current state based on percept history.
o These agents have a model, "which is knowledge of the world," and they perform actions based on that model.
o Updating the agent's state requires information about:
a. how the world evolves, and
b. how the agent's actions affect the world.
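A minimal sketch of a model-based reflex agent for the same two-room vacuum world: it keeps an internal state (its best guess of each room's status) and updates that state both from percepts and from a model of how its own actions change the world. All names are illustrative assumptions:

```python
class ModelBasedVacuum:
    """Tracks an internal state so it can act under partial observability."""

    def __init__(self):
        # Internal state: last known status of each room (None = unknown).
        self.world = {"A": None, "B": None}

    def step(self, percept):
        location, status = percept
        self.world[location] = status            # update state from the percept
        if status == "dirty":
            # Model of the agent's own action: sucking leaves the room clean.
            self.world[location] = "clean"
            return "suck"
        # Head for a room the internal state does not yet know to be clean.
        target = next((r for r, s in self.world.items() if s != "clean"), None)
        return "idle" if target is None else f"move_to_{target}"

agent = ModelBasedVacuum()
print(agent.step(("A", "dirty")))   # suck
print(agent.step(("A", "clean")))   # move_to_B
```

Unlike the simple reflex agent, this one behaves differently on identical percepts depending on what it has seen before, because its internal state summarizes the percept history.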

3. Goal-based agents
o Knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
o The agent needs to know its goal, which describes desirable situations.
o Goal-based agents expand the capabilities of model-based agents by adding "goal" information.
o They choose actions so as to achieve the goal.
o These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved. Such consideration of different scenarios is called searching and planning, and it makes the agent proactive.
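The searching described above can be sketched as a breadth-first search for a sequence of actions that reaches a goal state. The tiny two-room world and the action names are invented purely for illustration:

```python
from collections import deque

def plan(start, goal, successors):
    """Return a list of actions leading from start to goal, or None.

    successors(state) yields (action, next_state) pairs -- the agent's model
    of how actions change the world. Breadth-first search finds a shortest plan.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:               # goal test: desirable situation reached
            return actions
        for action, next_state in successors(state):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None

# Illustrative world: the agent in room "A" wants to reach room "B".
graph = {"A": [("move_right", "B")], "B": [("move_left", "A")]}
print(plan("A", "B", lambda s: graph[s]))   # ['move_right']
```

The agent then executes the returned action sequence; this look-ahead over possible futures is what distinguishes it from a purely reflex agent.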
4. Utility-based agents
o These agents are similar to goal-based agents, but they add an extra component of utility measurement, which distinguishes them by providing a measure of success in a given state.
o A utility-based agent acts based not only on goals but also on the best way to achieve the goal.
o The utility-based agent is useful when there are multiple possible alternatives and the agent has to choose the best action among them.
o The utility function maps each state to a real number, which measures how efficiently each action achieves the goals.
5. Learning Agents
o A learning agent in AI is a type of agent which can learn from its past experiences; it has learning capabilities.
o It starts by acting with basic knowledge and is then able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components:
a. Learning element: responsible for making improvements by learning from the environment.
b. Critic: the learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
c. Performance element: responsible for selecting external actions.
d. Problem generator: responsible for suggesting actions that will lead to new and informative experiences.
o Hence, learning agents are able to learn, analyze their performance, and look for new ways to improve it.
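The four conceptual components can be sketched as methods of one class. This is a conceptual sketch under assumed names (the rule table, reward signal, and action strings are all illustrative), not a standard implementation:

```python
class LearningAgent:
    """Sketch of the four components of a learning agent."""

    def __init__(self):
        self.rules = {}   # knowledge used by the performance element

    def performance_element(self, percept):
        """Select an external action from current knowledge."""
        return self.rules.get(percept, "explore")

    def critic(self, reward):
        """Judge an outcome against a fixed performance standard."""
        return reward > 0

    def learning_element(self, percept, action, was_good):
        """Improve the rules using the critic's feedback."""
        if was_good:
            self.rules[percept] = action

    def problem_generator(self):
        """Suggest an exploratory action that may yield new experience."""
        return "try_something_new"

agent = LearningAgent()
# One learning cycle: the critic scores an experience, the learning element
# updates the rules, and the performance element now exploits what was learned.
agent.learning_element("dirty", "suck", agent.critic(reward=1))
print(agent.performance_element("dirty"))   # suck
```

The loop shown at the bottom is the essence of the design: the critic's feedback flows into the learning element, which rewrites the knowledge that the performance element uses on the next step.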

Agent Environment in AI
An environment is everything in the world which surrounds the agent but is not part of the agent itself. An environment can be described as the situation in which an agent is present.

The environment is where the agent lives and operates; it provides the agent with something to sense and act upon. An environment is mostly said to be non-deterministic.
Features of Environment
As per Russell and Norvig, an environment can have various features from the point of view
of an agent:

1. Fully observable vs Partially Observable


2. Static vs Dynamic
3. Discrete vs Continuous
4. Deterministic vs Stochastic
5. Single-agent vs Multi-agent
6. Episodic vs sequential
7. Known vs Unknown
8. Accessible vs Inaccessible

1. Fully observable vs Partially Observable:

o If an agent's sensors can sense or access the complete state of the environment at each point in time, then it is a fully observable environment; otherwise it is partially observable.
o A fully observable environment is easy, as there is no need to maintain internal state to keep track of the history of the world.
o If an agent has no sensors in an environment, then that environment is called unobservable.

2. Deterministic vs Stochastic:

o If an agent's current state and selected action completely determine the next state of the environment, then the environment is called deterministic.
o A stochastic environment is random in nature and cannot be completely determined by the agent.
o In a deterministic, fully observable environment, an agent does not need to worry about uncertainty.

3. Episodic vs Sequential:

o In an episodic environment, there is a series of one-shot actions, and only the current percept is required for the action.
o However, in a sequential environment, an agent requires memory of past actions to determine the next best action.

4. Single-agent vs Multi-agent

o If only one agent is involved in an environment and operates by itself, then it is called a single-agent environment.
o However, if multiple agents operate in an environment, then it is called a multi-agent environment.
o The agent design problems in a multi-agent environment are different from those in a single-agent environment.

5. Static vs Dynamic:

o If the environment can change while an agent is deliberating, then it is called a dynamic environment; otherwise it is called static.
o Static environments are easy to deal with, because the agent does not need to keep looking at the world while deciding on an action.
o However, in a dynamic environment, agents need to keep looking at the world at each action.
o Taxi driving is an example of a dynamic environment, whereas a crossword puzzle is an example of a static environment.

6. Discrete vs Continuous:

o If there are a finite number of percepts and actions that can be performed in an environment, then it is called a discrete environment; otherwise it is called continuous.
o A chess game comes under a discrete environment, as there is a finite number of moves that can be performed.
o A self-driving car is an example of a continuous environment.

7. Known vs Unknown

o Known and unknown are not actually features of an environment; rather, they describe the
agent's state of knowledge about how to perform actions.
o In a known environment, the results of all actions are known to the agent, while in an
unknown environment, the agent needs to learn how the environment works in order to act.
o It is quite possible for a known environment to be partially observable and for an unknown
environment to be fully observable.

8. Accessible vs Inaccessible

o If an agent can obtain complete and accurate information about the environment's state, then
it is called an accessible environment; otherwise, it is inaccessible.
o An empty room whose state can be defined by its temperature is an example of an accessible
environment.
o Information about an event elsewhere on Earth is an example of an inaccessible environment.

Turing Test in AI
In 1950, Alan Turing introduced a test to check whether a machine can think like a human or
not; this test is known as the Turing Test. In this test, Turing proposed that a computer can
be said to be intelligent if it can mimic human responses under specific conditions.

The Turing Test was introduced by Turing in his 1950 paper, "Computing Machinery and
Intelligence," which considered the question, "Can machines think?"

The Turing test is based on a party game, the "imitation game," with some modifications. This
game involves three players: one player is a computer, another is a human responder, and the
third is a human interrogator, who is isolated from the other two players and whose job is to
determine which of the two is the machine.

Consider, Player A is a computer, Player B is human, and Player C is an interrogator.


The interrogator is aware that one of them is a machine but needs to identify which one on the
basis of questions and their responses.
The conversation between all players takes place via keyboard and screen, so the result does
not depend on the machine's ability to render words as speech.

The test result does not depend on each answer being correct, but only on how closely the
responses resemble human answers. The computer is permitted to do everything possible to force
a wrong identification by the interrogator.

The questions and answers can be like:

Interrogator: Are you a computer?

Player A (Computer): No

Interrogator: Multiply two large numbers such as (256896489*456725896)

Player A: Pauses for a long time and gives a wrong answer.

In this game, if the interrogator cannot reliably identify which is the machine and which is
the human, then the computer passes the test successfully, and the machine is said to be
intelligent and able to think like a human.

"In 1991, the New York businessman Hugh Loebner announced a prize competition, offering a
$100,000 prize for the first computer to pass the Turing test. However, to date, no AI
program has come close to passing an undiluted Turing test."

Chatbots to attempt the Turing test:


ELIZA: ELIZA was a natural language processing computer program created by Joseph
Weizenbaum. It was created to demonstrate communication between machines and humans, and it
was one of the first chatterbots to attempt the Turing test.

Parry: Parry was a chatterbot created by Kenneth Colby in 1972. Parry was designed to
simulate a person with paranoid schizophrenia (a common chronic mental disorder).
Parry was described as "ELIZA with attitude." Parry was tested using a variation of the
Turing test in the early 1970s.

Eugene Goostman: Eugene Goostman was a chatbot developed in Saint Petersburg in 2001.
This bot has competed in a number of Turing tests. In June 2012, at an event marking the
centenary of Turing's birth, Goostman won a competition promoted as the largest-ever Turing
test contest, in which it convinced 29% of the judges that it was human. Goostman was
presented as a 13-year-old virtual boy.

The Chinese Room Argument:


Many philosophers disagreed with the whole concept of artificial intelligence. The most
famous argument among them is the "Chinese Room."

In 1980, John Searle presented the "Chinese Room" thought experiment in his paper
"Minds, Brains, and Programs," which argued against the validity of the Turing test. According
to his argument, "Programming a computer may make it appear to understand a language, but
it will not produce a real understanding of language or consciousness in a computer."

He argued that machines such as ELIZA and Parry could easily pass the Turing test by
manipulating keywords and symbols, but they had no real understanding of language. So passing
the test cannot be described as a machine having a human-like ability to "think."

Features required for a machine to pass the Turing test:


o Natural language processing: NLP is required to communicate with the interrogator in a
general human language such as English.
o Knowledge representation: To store and retrieve information during the test.
o Automated reasoning: To use the previously stored information to answer questions.
o Machine learning: To adapt to new changes and detect generalized patterns.
o Vision (for the total Turing test): To recognize the interrogator's actions and other objects
during the test.
o Motor control (for the total Turing test): To act upon objects if requested.

Computer Vision Introduction


Computer vision is a subfield of artificial intelligence that deals with acquiring, processing,
analyzing, and making sense of visual data such as digital images and videos. It is one of
the most compelling types of artificial intelligence, and one that we encounter regularly in
our daily routines.

Computer vision helps us understand the complexity of the human vision system and trains
computer systems to interpret and gain a high-level understanding of digital images or videos.
In the early days, developing a machine with human-like vision was just a dream, but with the
advancement of artificial intelligence and machine learning, it became possible. Intelligent
systems have now been developed that can "see" and interpret the world around them, similar to
human eyes. The fiction of yesterday has become the fact of today. In this tutorial,
"Computer Vision Introduction", we will discuss a few important concepts of computer vision,
such as:

o What is Computer Vision?


o How does Computer Vision Work?
o The evolution of computer vision
o Applications of computer vision
o Challenges of computer vision

What is Computer Vision?


Computer vision is one of the most important fields of artificial intelligence (AI) and
computer science engineering that makes computer systems capable of extracting
meaningful information from visual data like videos and images. Further, it also helps to
take appropriate actions and make recommendations based on the extracted information.

Further, artificial intelligence is the branch of computer science that primarily deals with
creating smart and intelligent systems that can behave and think like the human brain. So, we
can say that if artificial intelligence enables computer systems to think intelligently,
computer vision makes them capable of seeing, analyzing, and understanding.

History of Computer Vision


Computer vision is not a new technology because scientists and experts have been trying to
develop machines that can see and understand visual data for almost six decades. The
evolution of computer vision is classified as follows:

o 1959: The first experiments relevant to computer vision began in 1959, when neurophysiologists
showed a cat an array of images. They found that the visual system responds first to hard edges
or lines; scientifically, this means that image processing begins with simple shapes such as
straight edges.
o 1960: In 1960, artificial intelligence was added as a field of academic study to solve human
vision problems.
o 1963: This was another great achievement for scientists when they developed computers that
could transform 2D images into 3-D images.
o 1974: This year, optical character recognition (OCR) and intelligent character recognition
(ICR) technologies were introduced. OCR solved the problem of recognizing text printed in any
font or typeface, whereas ICR can decipher handwritten text. These inventions are among the
greatest achievements in document and invoice processing, vehicle number plate recognition,
mobile payments, machine translation, etc.
o 1982: In this year, the algorithm was developed to detect edges, corners, curves, and other
shapes. Further, scientists also developed a network of cells that could recognize patterns.
o 2000: In this year, scientists worked on a study of object recognition.
o 2001: The first real-time face recognition application was developed.
o 2010: The ImageNet data set became available to use with millions of tagged images, which
can be considered the foundation for recent Convolutional Neural Network (CNN) and deep
learning models.
o 2012: A convolutional neural network was used as an image recognition technology, achieving a greatly reduced error rate.
o 2014: The COCO dataset was developed to support object detection and future research.

How does Computer Vision Work?


Computer vision is a technique that extracts information from visual data, such as images and
videos. Although computer vision works similarly to the human eye and brain, exactly how the
human brain operates and solves visual object recognition is probably still one of the biggest
open questions for IT professionals.

On a certain level, computer vision is all about pattern recognition: it involves training
machine systems to understand visual data such as images and videos.
Firstly, a vast amount of labeled visual data is provided to the machine to train it. This
labeled data enables the machine to analyze the different patterns in the data points and
relate them to the labels. E.g., suppose we provide visual data of millions of dog images. The
computer learns from this data, analyzes each photo's shapes, the distances between shapes,
colors, etc., and hence identifies patterns common to dogs and generates a model. As a result,
this computer vision model can accurately detect, for each input image, whether it contains a
dog or not.
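The "learn patterns from labeled data, then classify new inputs" loop described above can be sketched with a toy nearest-centroid classifier. Here the "images" are tiny hand-made feature vectors rather than real pixels, and all names and numbers are illustrative assumptions, not a real computer vision pipeline.

```python
# Toy sketch: learn a class "pattern" (centroid) from labeled examples,
# then classify new inputs by which pattern they are closest to.
def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Labeled "training images" as made-up features: [fur_texture, ear_shape, snout_length]
dog_images = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.8], [0.85, 0.7, 0.9]]
not_dog_images = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.3], [0.15, 0.25, 0.2]]

# "Training": summarize each class by its average pattern.
dog_center = centroid(dog_images)
not_dog_center = centroid(not_dog_images)

def contains_dog(image):
    """Classify a new feature vector by the nearest class centroid."""
    return distance(image, dog_center) < distance(image, not_dog_center)

print(contains_dog([0.8, 0.75, 0.8]))  # close to the dog pattern -> True
print(contains_dog([0.2, 0.1, 0.2]))   # close to the not-dog pattern -> False
```

Real systems learn far richer features from raw pixels (e.g., with convolutional networks), but the train-then-predict structure is the same.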

Task Associated with Computer Vision


Although computer vision has been utilized in so many fields, there are a few common tasks
for computer vision systems. These tasks are given below:

o Object classification: Object classification is a computer vision technique/task used to
classify an image, such as whether an image contains a dog, a person's face, or a banana. It
analyzes the visual content (videos & images) and classifies the object into the defined
category. It means that we can accurately predict the class of an object present in an image
with image classification.
o Object Identification/detection: Object identification or detection uses image classification
to identify and locate the objects in an image or video. With such detection and identification
technique, the system can count objects in a given image or scene and determine their
accurate location and labeling. For example, in a given image, one dog, one cat, and one duck
can be easily detected and classified using the object detection technique.
o Object Verification: The system confirms whether a specified object is actually present in an
image or video.
o Object Landmark Detection: The system defines the key points for the given object in the
image data.
o Image Segmentation: Image segmentation does not just detect the classes in an image, as image
classification does; instead, it classifies each pixel of an image to specify which object it
belongs to. It tries to determine the role of each pixel in the image.
o Object Recognition: In this, the system recognizes an object and its location with respect to
the image.
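As a tiny illustration of the detection-and-counting idea above, the sketch below counts distinct objects in a binary "image" (a grid of 0s and 1s) using flood fill over connected foreground pixels. This is a deliberately simplified stand-in for real object detection, and the example grid is invented.

```python
from collections import deque

def count_objects(grid):
    """Count connected regions of 1s (4-connectivity) in a binary image.

    A toy stand-in for object detection/counting: each connected blob of
    foreground pixels is treated as one detected object.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and not seen[r][c]:
                count += 1
                # Flood fill to mark the whole blob as visited.
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_objects(image))  # three separate blobs -> 3
```

Real detectors additionally report a class label and bounding box per object, but "find and count distinct regions" is the core counting step.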

How to learn computer Vision?


Computer vision builds on the basic concepts of machine learning, deep learning, and
artificial intelligence. If you are eager to learn computer vision, then you should follow the
steps below:

1. Build your foundation:


o Before entering this field, you must have strong knowledge of advanced
mathematical concepts such as Probability, statistics, linear algebra, calculus, etc.
o The knowledge of programming languages like Python would be an extra advantage
to getting started with this domain.
2. Digital Image Processing:
You should understand image editing tools and their functions, such as histogram
equalization, median filtering, etc. Further, you should also know about compressing images
and videos using formats such as JPEG and MPEG. Once you know the basics of image processing
and restoration, you can kick-start your journey into this domain.
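One of the image-processing basics mentioned above, median filtering, can be sketched in a few lines: each pixel is replaced by the median of its 3x3 neighborhood, which suppresses isolated "salt-and-pepper" noise. The tiny grayscale grid below is an illustrative assumption, not real image data.

```python
from statistics import median

def median_filter(img):
    """Apply a 3x3 median filter to a 2D grayscale image (list of lists).

    Border pixels use only the neighbors that exist (a simple border
    policy; real libraries offer several alternatives).
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the pixel's neighborhood, clipped at the borders.
            window = [img[y][x]
                      for y in range(max(0, r - 1), min(rows, r + 2))
                      for x in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = median(window)
    return out

# A flat gray image with one bright noise pixel in the middle.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
print(median_filter(noisy)[1][1])  # the 255 noise pixel is replaced by 10
```

Unlike a mean filter, the median ignores extreme outliers entirely, which is why it is the standard tool against salt-and-pepper noise.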
3. Machine learning understanding
To enter this domain, you must deeply understand basic machine learning concepts such
as neural networks, CNNs, SVMs, recurrent neural networks, generative adversarial
networks, etc.
4. Basic computer vision: This is the step where you need to understand the mathematical models
used to formulate visual data.

These are a few important prerequisites that are essentially required to start your career in
computer vision technology. Once you are prepared with the above prerequisites, you can
easily start learning and make a career in Computer vision.

Applications of computer vision


Computer vision is one of the most advanced innovations of artificial intelligence and
machine learning. With the increasing demand for AI and machine learning technologies,
computer vision has also become a center of attraction across different sectors. It greatly
impacts different industries, including retail, security, healthcare, automotive, agriculture, etc.
Below are some of the most popular applications of computer vision:

o Facial recognition: Computer vision has enabled machines to detect images of people's faces
to verify their identity. The machines are given input images in which computer vision
algorithms detect facial features and compare them with databases of face profiles.
Popular social media platforms like Facebook also use facial recognition to detect and tag
users. Further, various government agencies employ this feature to identify criminals in
video feeds.
o Healthcare and Medicine: Computer vision has played an important role in the healthcare
and medicine industry. Traditional approaches for evaluating cancerous tumors are time-
consuming and less accurate, whereas computer vision technology provides faster and more
accurate chemotherapy response assessments; with it, doctors can identify cancer patients who
need surgery sooner, with life-saving precision.
o Self-driving vehicles: Computer vision technology has also contributed to its role in self-
driving vehicles to make sense of their surroundings by capturing video from different angles
around the car and then introducing it into the software. This helps to detect other cars and
objects, read traffic signals, pedestrian paths, etc., and safely drive its passengers to their
destination.
o Optical character recognition (OCR)
Optical character recognition helps us extract printed or handwritten text from visual data
such as images. Further, it also enables us to extract text from documents like invoices, bills,
articles, etc.
o Machine inspection: Computer vision is vital in providing image-based automatic
inspection. It detects defects, missing features, functional flaws, and other irregularities in
manufactured products, and informs decisions about inspection goals, lighting, and
material-handling techniques.
o Retail (e.g., automated checkouts): Computer vision is also being implemented in the retail
industry to track products, shelves, and inventory, and to record product movements in the
store. These AI-based computer vision techniques automatically charge the customer for the
marked products upon checkout from the retail store.
o 3D model building: 3D model building or 3D modeling is a technique to generate a 3D
digital representation of any object or surface using the software. In this field also, computer
vision plays its role in constructing 3D computer models from existing objects. Furthermore,
3D modeling has a variety of applications in various places, such as Robotics, Autonomous
driving, 3D tracking, 3D scene reconstruction, and AR/VR.
o Medical imaging: Computer vision helps medical professionals make better decisions
regarding treating patients by developing visualization of specific body parts such as organs
and tissues. It helps them get more accurate diagnoses and a better patient care system. E.g.,
computed tomography (CT) or magnetic resonance imaging (MRI) scans are used to diagnose
pathologies, to guide medical interventions such as surgical planning, or for research purposes.
o Automotive safety: Computer vision has added an important safety feature in automotive
industries. E.g., if a vehicle is taught to detect objects and dangers, it could prevent an
accident and save thousands of lives and property.
o Surveillance: It is one of computer vision technology's most important and beneficial use
cases. Nowadays, CCTV cameras are almost fitted in every place, such as streets, roads,
highways, shops, stores, etc., to spot various doubtful or criminal activities. It helps provide
live footage of public places to identify suspicious behavior, identify dangerous objects, and
prevent crimes by maintaining law and order.
o Fingerprint recognition and biometrics: Computer vision technology detects fingerprints
and biometrics to validate a user's identity. Biometrics deals with recognizing persons based
on physiological characteristics, such as the face, fingerprint, vascular pattern, or iris, and
behavioral traits, such as gait or speech. It combines Computer Vision with knowledge of
human physiology and behavior.

How to become a computer vision engineer?


Computer vision is one of the world's most popular & high-demand technologies. Although
starting your career in this domain is not easy, if you have a good command of machine
learning basics, advanced mathematics concepts, and the basics of computer vision, you can
easily start your career as a computer vision engineer.
There are some roles and responsibilities required of a computer vision engineer,
which are as follows:

o To create and implement a vision algorithm for working with image and video content pixels
o To develop a data-based approach for better problem solutions.
o Whenever required, you have to work on various AI and ML tasks required for computer
vision, such as image processing.
o Experience in working on various real-time project scenarios for problem-solving.
o Hierarchical problem decomposition, implementation of solutions, and integration with other
sub-systems.
o Should be capable of understanding business objectives and can connect to technical solutions
through effective system design and architecture.

Job description (JD) for Computer vision engineer

o The candidate must have cumulative work experience in visual data processing and analysis
using machine learning and deep learning.
o Hands-on experience with languages and AI/ML frameworks such as Python, C++, TensorFlow,
PyTorch, Keras, etc.
o Candidates must have good experience in implementing AI techniques.
o Must have good written and verbal communication skills.
o Candidates should be aware of object detection techniques and models such as YOLO,
RCNN, etc.

Which programming language is best for computer vision?


Computer vision engineers require in-depth knowledge of machine learning and deep
learning concepts with strong command over at least one programming language. There are
so many programming languages that can be used in this domain, but Python is among the
most popular. However, one can also choose OpenCV with Python, OpenCV with C++, or
MATLAB to learn and implement computer vision applications.


OpenCV with Python may be the most preferred choice for beginners due to its flexibility,
simple syntax, and versatility. Several reasons make Python the best programming language
for computer vision, which are as follows:
o Easy-to-use: Python is very famous as it is easy to learn for entry-level persons and
professionals. Further, Python is also easily adaptable and covers all business needs.
o Most used programming language: Python is one of the most popular programming
languages as it contains complete learning environments to get started with machine learning,
artificial intelligence, deep learning, and computer vision.
o Debugging and visualization: Python has built-in debugging support via the pdb module and
visualization through Matplotlib.

Computer Vision Challenges


Computer vision has emerged as one of the most growing domains of artificial intelligence,
but it still has a few challenges to becoming a leading technology. There are a few challenges
observed while working with computer vision technology.

o Reasoning and analytical issues: All programming languages and technologies require basic
logic behind any task. To become a computer vision expert, you must have strong
reasoning and analytical skills. Without such skills, defining any attribute in
visual content may be a big problem.
o Privacy and security: Privacy and security are among the most important concerns for any
country. Vision-powered surveillance raises serious privacy issues in many countries, and
various countries therefore restrict or avoid face recognition and detection techniques for
privacy and security reasons.
o Duplicate and false content: Cyber security is always a big concern for all organizations,
and they always try to protect their data from hackers and cyber fraud. A data breach can lead
to serious problems, such as creating duplicate images and videos over the internet.

Computer Vision Applications


Computer vision is a subfield of AI (Artificial Intelligence), which enables machines to
derive some meaningful information from any image, video, or other visual input and
perform the required action on that information. Computer vision is like eyes for an AI
system, which means if AI enables the machine to think, computer vision enables the
machines to see and observe the visual inputs. Computer vision technology is based on the
concept of teaching computers to process an image or visual input at the pixel level and derive
meaningful information from it. Nowadays, Computer vision is in great demand and used in
different areas, including robotics, manufacturing, healthcare, etc. In this topic, we will
discuss some popular applications of Computer Vision, but before that, let's first understand
some common tasks that are performed by computer vision.
Below are some common tasks for which computer vision can be
used:
o Image Classification: Image classification is a computer vision technique used to classify
an image, such as whether an image contains a dog, a person's face, or a banana. It means that
with image classification, we can accurately predict the class of an object present in an
image.
o Object Detection: Object detection uses image classification to identify and locate the
objects in an image or video. With such detection and identification technique, the system can
count objects in a given image or scene and determine their accurate location, along with their
labelling. For example, in a given image, there is one person and one cat, which can be easily
detected and classified using the object detection technique.

o Object Tracking: Object tracking is a computer vision technique used to follow a particular
object or multiple items. Generally, object tracking has applications in videos and real-world
interactions, where objects are first detected and then tracked through subsequent frames.
Object tracking is used in applications such as autonomous vehicles, where, apart from the
classification and detection of objects such as pedestrians and other vehicles, real-time
motion tracking is also required to avoid accidents and follow the traffic rules.
o Semantic Segmentation: Image segmentation is not only about detecting the classes in an
image as image classification. Instead, it classifies each pixel of an image to specify what
objects it has. It tries to determine the role of each pixel in the image.
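The object-tracking task above can be sketched as frame-to-frame nearest-neighbor matching of detected centroids. Real trackers (e.g., Kalman-filter or deep-learning based) are far more robust; treat this as an illustrative toy with made-up coordinates and a hypothetical `match_detections` helper.

```python
def match_detections(prev_objects, detections, max_dist=20.0):
    """Assign each new detection to the nearest previously tracked object.

    prev_objects: dict of id -> (x, y) centroid from the last frame.
    detections: list of (x, y) centroids in the current frame.
    Returns a dict of id -> (x, y); unmatched detections get new ids.
    """
    updated = {}
    next_id = max(prev_objects, default=0) + 1
    remaining = dict(prev_objects)
    for det in detections:
        best_id, best_d = None, max_dist
        for oid, pos in remaining.items():
            d = ((det[0] - pos[0]) ** 2 + (det[1] - pos[1]) ** 2) ** 0.5
            if d < best_d:
                best_id, best_d = oid, d
        if best_id is None:          # no nearby track: start a new one
            best_id = next_id
            next_id += 1
        else:
            del remaining[best_id]   # each track matches one detection
        updated[best_id] = det
    return updated

# Two tracked objects in frame 1; frame 2 has them slightly moved, plus a newcomer.
frame1 = {1: (10, 10), 2: (50, 50)}
frame2_detections = [(12, 11), (48, 53), (90, 90)]
print(match_detections(frame1, frame2_detections))
```

Keeping a stable id per object across frames is what turns per-frame detection into a motion trajectory that downstream logic (e.g., collision avoidance) can use.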

Computer Vision Applications


As per the increasing demand for AI and Machine Learning technologies, computer vision
also has a great demand among different sectors. It has a massive impact on different
industries, including retail, security, healthcare, automotive, agriculture, etc. Below are some
most popular applications of computer vision:

o Defect detection using Computer Vision


o OCR using Computer vision
o Crop Monitoring
o Analysis of X-rays, MRI, and CT scans using Computer Vision
o Road Condition Monitoring
o 3D model Building using Computer vision
o Cancer Detection using Computer Vision
o Plant Disease Detection using Computer Vision
o Traffic Flow Analysis


Above are some of the most common applications of computer vision. Now let us discuss
applications of computer vision across different sectors such as retail, healthcare, etc.

1. Computer Vision in Healthcare


The Healthcare industry is rapidly adopting new technologies and automation solutions, one
of which is computer vision. In the healthcare industry, computer vision has the following
applications:

o X-Ray Analysis
Computer vision can be successfully applied to medical X-ray imaging. Although most
doctors still prefer manual analysis of X-ray images to diagnose and treat diseases, with
computer vision, X-ray analysis can be automated with enhanced efficiency and
accuracy. State-of-the-art image recognition algorithms can detect patterns in an X-ray
image that are too subtle for the human eye.
o Cancer Detection
Computer vision is being successfully applied for breast and skin cancer detection. With
image recognition, doctors can identify anomalies by comparing cancerous and non-
cancerous cells in images. With automated cancer detection, doctors can diagnose cancer
faster from an MRI scan.
o CT Scan and MRI
Computer vision has now been greatly applied in CT scans and MRI analysis. AI with
computer vision designs such a system that analyses the radiology images with a high level of
accuracy, similar to a human doctor, and also reduces the time for disease detection,
enhancing the chances of saving a patient's life. It also includes deep learning algorithms that
enhance the resolution of MRI images and hence improve patient outcomes.

2. Computer Vision in Transportation


With growing demand in the transportation sector, various technological developments have
occurred in this industry, and one such technology is computer vision.
Below are some popular applications of computer vision in the transportation industry:

o Self-driving cars
Computer vision is widely used in self-driving cars. It is used to detect and classify objects
(e.g., road signs or traffic lights), create 3D maps or motion estimation, and plays a key role
in making autonomous vehicles a reality.

o Pedestrian detection
Computer vision sees great application and research in pedestrian detection due to its high
impact on the design of pedestrian systems in various smart cities. With the help of
cameras, pedestrian detection automatically identifies and locates pedestrians in images or
video. Moreover, it also handles variations among pedestrians in attire, body position, and
illumination across different scenarios. Pedestrian detection is very helpful in fields such
as traffic management, autonomous driving, transit safety, etc.
o Road Condition Monitoring & Defect detection
Computer vision has also been applied to monitoring road infrastructure condition by
assessing variations in concrete and tar. A computer vision-enabled system automatically
senses pavement degradation, which increases the efficiency of road maintenance allocation
and decreases safety risks related to road accidents.
To perform road condition monitoring, CV algorithms collect image data and then process
it to create an automatic crack detection and classification system.

3. Computer Vision in Manufacturing


In the manufacturing industry, the demand for automation is at its peak. Many tasks have
already been automated, and new technology innovations are in trend. Computer vision is
widely used in providing these automated solutions. Below are some of the most popular
applications:

o Defect Detection
This is perhaps the most common application of computer vision. Until now, the detection of
defects has been carried out by trained people on selected batches, and total production control
is usually impossible. With computer vision, we can detect defects such as cracks in metals,
paint defects, bad prints, etc., at sizes smaller than 0.05 mm.
o Analyzing text and barcodes (OCR)
Nowadays, each product carries a barcode on its packaging, which can be analyzed or read
with the help of the computer vision technique OCR. Optical character recognition (OCR)
helps us detect and extract printed or handwritten text from visual data such as images.
Further, it enables us to extract text from documents like invoices, bills, articles, etc.,
and verify it against databases.
o Fingerprint recognition and Biometrics
Computer vision technology is used to detect fingerprints and biometrics to validate a user's
identity.
Biometrics is the measurement or analysis of the physiological characteristics that make a
person unique, such as the face, fingerprints, or iris patterns. It makes use of computer
vision along with knowledge of human physiology and behaviour.
o 3D Model building
3D model building or 3D modelling is a technique to generate a 3D digital representation of
any object or surface using the software. Computer vision plays its role here also in
constructing 3D computer models from existing objects. Furthermore, 3D modelling has a
variety of applications in various places, such as Robotics, Autonomous driving, 3D tracking,
3D scene reconstruction, and AR/VR.

4. Computer Vision in Agriculture


In the agriculture sector, Machine Learning has made a great contribution with its models,
including Computer vision. It can be used in areas such as crop monitoring, weather analysis,
etc. Below are some popular cases of computer vision applications in Agriculture:

o Crop Monitoring
In the agriculture sector, crop and yield monitoring are among the most important tasks for
better agriculture. Traditionally, monitoring depends on subjective human judgment, which is
not always accurate. With computer vision systems, real-time crop monitoring and
identification of any crop variation due to disease or nutritional deficiency can be performed.
o Automatic Weeding
An automatic weeding machine is an intelligent project enabled with AI and computer vision
that removes unwanted plants or weeds around the crops. Traditionally weeding methods
require human labour, which is costly and inefficient compared to automatic weeding
systems.
Computer vision enables the intelligent detection and removal of weeds using robots, which
reduces costs and ensures higher yields.
o Plant Disease Detection
Computer vision is also used in automated plant disease detection, which is important at an
early stage of plant development. Various deep-learning-based algorithms use computer
vision to identify plant diseases, estimate their severity and predict their impact on yields.

5. Computer Vision in Retail


In the retail sector, computer vision systems enable retailers to collect huge volumes of
visual data through cameras installed in stores and use it to design better customer
experiences. Some popular applications of computer vision in the retail industry are given below:

o Self-checkout
Self-checkout enables customers to complete their transactions with a retailer without the
need for human staff, and this becomes possible with computer vision. Self-checkouts now
help retailers avoid long queues and manage customers.
o Automatic replenishment
Automated stock replenishment is a leading technology innovation in the retail sector.
Traditionally, stock replenishment is performed by store staff, who check shelves to track
items for inventory management. Now, automatic replenishment with computer vision
systems captures image data and performs a complete inventory scan of shelf items at
regular intervals.

o People Counting
Nowadays, various situations occur where we may need a count of the people or customers
entering and leaving a store. This foot count, or people counting, can be done by computer
vision systems that analyze the image or video data captured by in-store cameras. People
counting is helpful for managing crowds and limiting the number of people inside, for
example to enforce COVID-19 social distancing.
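A common way such systems count people is to track each person's centroid across frames and count crossings of a virtual line at the entrance. The sketch below shows only that counting logic; in a real system the per-frame centroids would come from a person detector and tracker, whereas here the trajectory is hard-coded illustrative data.

```python
# Sketch: people counting by virtual-line crossing.
# track is a list of (x, y) centroid positions of one person over successive frames.

def count_crossings(track, line_y):
    """Count how often a centroid crosses the horizontal line y = line_y.
    Returns (entries, exits): entries move downward past the line, exits upward."""
    entries = exits = 0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        if y0 < line_y <= y1:      # moved from above the line to on/below it
            entries += 1
        elif y1 < line_y <= y0:    # moved from on/below the line back above it
            exits += 1
    return entries, exits

# A person walks down past the door line at y=100, then back up.
track = [(50, 80), (52, 95), (55, 110), (53, 120), (50, 90)]
entries, exits = count_crossings(track, line_y=100)  # -> (1, 1)
```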
Computer Vision Techniques
As human beings, we can see, process, understand, and act on any visual input; in other
words, we have the ability to see and understand visual data. But how can we implement the
same capability in machines? This is where computer vision comes into the picture. Although
machines still have various limitations and cannot visualise quite as humans do, they are very
close to analysing, understanding, and extracting meaningful information from any visual
input. Nowadays, computer vision is one of the trending research areas within deep learning.

In this topic, we will develop a deep understanding of the different computer vision
techniques currently being used in several applications. Before starting, however, let's first
go over a basic introduction to computer vision.

What is Computer Vision?


Computer vision is a sub-field of AI and machine learning that enables machines to see,
understand, and interpret visuals such as images and videos, and to extract useful
information from them that can help in the decision-making of AI applications. It can be
considered the eyes of an AI application. With the help of computer vision technology, tasks
become possible that would otherwise be impossible, such as self-driving cars.

Computer Vision Process

A typical computer vision process is illustrated in the image above. It mainly performs three
steps, which are:
1. Capturing an Image

A computer vision application always includes a digital camera or CCTV to capture the
image. So, firstly, it captures the image and stores it as a digital file consisting of zeros and
ones.

2. Processing the image

In the next step, different CV algorithms are used to process the digital data stored in a file.
These algorithms determine the basic geometric elements and generate the image using the
stored digital data.

3. Analyzing and taking the required action

Finally, the computer vision system analyses the data and, according to this analysis, takes
the required action for which it is designed.
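The three steps above can be sketched as a tiny pipeline. Everything here is a stub standing in for real hardware and algorithms (the "camera" returns a hard-coded frame, and the "CV algorithm" is a plain threshold), purely to show how the stages connect.

```python
# Sketch of the three-step computer vision pipeline: capture -> process -> act.

def capture_image():
    """Step 1: capture a frame as digital data (stub: a 2x2 grayscale image)."""
    return [[0, 255], [255, 0]]

def process_image(image):
    """Step 2: run a CV algorithm on the digital data (stub: threshold to a binary map)."""
    return [[1 if px > 127 else 0 for px in row] for row in image]

def analyze_and_act(binary):
    """Step 3: analyze the result and decide the action (stub: alert on any bright pixel)."""
    return "alert" if any(any(row) for row in binary) else "idle"

frame = capture_image()
processed = process_image(frame)   # -> [[0, 1], [1, 0]]
action = analyze_and_act(processed)  # -> "alert"
```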

Top Computer Vision Techniques


1. Image Classification

Image classification is the simplest technique of computer vision. Its main aim is to classify
an image into one or more categories. An image classifier basically takes an image as input
and reports the objects present in it, such as a person, a dog, or a tree. However, it does not
give you further information about the image, such as how many persons there are, the colour
of the tree, or the positions of items; for that, we need another CV technique.

Image classification is basically of two types: binary classification and multi-class
classification. As the name suggests, binary image classification looks for a single class in the
given image and reports whether the image contains that object or not. For example, by
training an AI system on both images that show skin cancer and images that do not, we can
achieve superhuman performance in detecting skin cancer in humans.
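The binary decision can be illustrated with a deliberately tiny classifier. Real classifiers are deep networks trained on raw pixels; the sketch below uses a single hand-picked feature (mean brightness) and nearest-centroid assignment on made-up training images, only to show the "one class vs. the rest" mechanics.

```python
# Toy binary image classifier: nearest-centroid on mean brightness.

def mean_brightness(image):
    pixels = [px for row in image for px in row]
    return sum(pixels) / len(pixels)

def train(positives, negatives):
    """Return the centroid of the brightness feature for each class."""
    pos = sum(map(mean_brightness, positives)) / len(positives)
    neg = sum(map(mean_brightness, negatives)) / len(negatives)
    return pos, neg

def classify(image, centroids):
    """Assign the image to whichever class centroid its brightness is closer to."""
    pos, neg = centroids
    b = mean_brightness(image)
    return "positive" if abs(b - pos) < abs(b - neg) else "negative"

bright = [[[200, 210], [190, 220]], [[230, 240], [210, 205]]]  # "positive" examples
dark = [[[10, 20], [30, 15]], [[25, 5], [40, 35]]]             # "negative" examples
centroids = train(bright, dark)
label = classify([[180, 200], [195, 210]], centroids)  # -> "positive"
```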

2. Object Detection

Object detection is another popular computer vision technique. It can be performed after
image classification, or use image classification, to detect the objects in visual data. It is
basically used to recognize objects within bounding boxes and find the class of each object in
the image. Object detection makes use of deep learning and machine learning technology to
generate useful results.

As human beings, whenever we look at an image or video, we can immediately recognize
and even locate the objects within a moment. The aim of object detection is to replicate the
same human intelligence in machines, to identify and locate objects.

Object detection has several applications, including object tracking, retrieval, video
surveillance, image captioning, etc.
A variety of techniques can be used to perform object detection, which includes R-CNN,
YOLO v2, etc.
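Detectors such as R-CNN and YOLO are commonly scored by how well a predicted bounding box overlaps the ground-truth box, measured with Intersection over Union (IoU). A minimal implementation, assuming boxes are given as (x1, y1, x2, y2) corner coordinates:

```python
# Intersection over Union (IoU) of two axis-aligned bounding boxes.

def iou(box_a, box_b):
    """IoU of boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted box vs. a ground-truth box; detections with IoU >= 0.5
# are often counted as correct in evaluation.
score = iou((0, 0, 10, 10), (5, 5, 15, 15))  # -> 25 / 175 ≈ 0.143
```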

3. Semantic Segmentation

Semantic segmentation is not only about detecting the classes in an image, as image
classification is. Instead, it classifies each pixel of an image to specify which object it belongs
to, trying to determine the role of each pixel in the image. It basically assigns pixels to a
particular category without differentiating object instances; or, we can say, it groups similar
objects into a single class at the pixel level. For example, if an image contains two dogs,
semantic segmentation will put both dogs under the same label.
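Mechanically, a segmentation network outputs one score map per class, and the label map is the per-pixel argmax over those scores. The sketch below shows only that final step; the class names and score values are made-up assumptions, not a trained model's output.

```python
# Sketch: semantic segmentation as a per-pixel argmax over class score maps.

CLASSES = ["background", "dog", "tree"]

def segment(score_maps):
    """score_maps[c][i][j] = score of class c at pixel (i, j).
    Returns a label map with the highest-scoring class name per pixel."""
    h, w = len(score_maps[0]), len(score_maps[0][0])
    return [
        [CLASSES[max(range(len(CLASSES)), key=lambda c: score_maps[c][i][j])]
         for j in range(w)]
        for i in range(h)
    ]

scores = [
    [[0.9, 0.2], [0.1, 0.8]],  # background scores
    [[0.0, 0.7], [0.6, 0.1]],  # dog scores (every dog pixel gets the same "dog" label)
    [[0.1, 0.1], [0.3, 0.1]],  # tree scores
]
labels = segment(scores)
# -> [["background", "dog"], ["dog", "background"]]
```

Note that both "dog" pixels get the identical label: the semantic map cannot say whether they belong to one dog or two, which is exactly the limitation instance segmentation addresses.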

4. Instance Segmentation

Instance segmentation can classify the objects in an image at the pixel level, similar to
semantic segmentation, but at a more advanced level: it can separate similar types of objects
into different instances. For example, if a visual contains several cars, semantic segmentation
can only tell us that there are cars, but with instance segmentation we can label each car
separately, distinguishing them by attributes such as colour and shape.

Instance segmentation is a challenging computer vision task compared to other techniques,
as it needs to analyse visual data containing multiple overlapping objects and different
backgrounds.

In instance segmentation, CNNs, or convolutional neural networks, can be used effectively
to locate objects at the pixel level instead of just drawing bounding boxes. A well-known
example of CNN-based instance segmentation comes from Facebook AI: an application that
can detect and differentiate separate instances of the same object class, even of the same
colour. The CNN architecture used in it is known as Mask R-CNN, or Mask Region-Based
Convolutional Neural Network.

Using the image below, we can see the difference between semantic segmentation and
instance segmentation: semantic segmentation classifies all the persons as a single entity,
whereas instance segmentation labels each person as a distinct instance, taking attributes such
as colour into account.
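One simple way to see how instances can be separated at the pixel level is connected-component labelling: given the binary mask of one semantic class (say, "person"), each connected blob of pixels becomes its own instance. This is a simplified illustration of the idea, not how Mask R-CNN works internally.

```python
# Sketch: separating instances of one class via 4-connected component labelling.
from collections import deque

def label_instances(mask):
    """Assign a distinct instance id (1, 2, ...) to each 4-connected blob of True pixels."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and labels[i][j] == 0:
                next_id += 1                      # found a new, unlabelled blob
                labels[i][j] = next_id
                queue = deque([(i, j)])
                while queue:                      # BFS flood-fill of this blob
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels

# Two separate "person" blobs in one semantic mask -> two instance ids.
mask = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, True, True],
]
instances = label_instances(mask)
# -> [[1, 1, 0, 0], [0, 0, 0, 2], [0, 0, 2, 2]]
```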
5. Panoptic Segmentation

Panoptic segmentation is one of the most powerful computer vision techniques, as it
combines the instance and semantic segmentation techniques. With panoptic segmentation,
you can classify image objects at the pixel level and also identify separate instances of each
class.
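The combination can be pictured as merging the two outputs: every pixel gets both a semantic class and an instance id, commonly with "stuff" classes such as sky keeping instance id 0 and "thing" classes getting distinct ids. The class names and maps below are illustrative assumptions.

```python
# Sketch: panoptic output as a per-pixel (class, instance_id) pair, merging a
# semantic label map with an instance-id map.

def panoptic(semantic, instances):
    """Combine a semantic map and an instance-id map of the same shape."""
    h, w = len(semantic), len(semantic[0])
    return [[(semantic[i][j], instances[i][j]) for j in range(w)] for i in range(h)]

semantic = [["sky", "person"], ["person", "person"]]
instances = [[0, 1], [1, 2]]  # two separate people; sky is "stuff", id 0
result = panoptic(semantic, instances)
# -> [[("sky", 0), ("person", 1)], [("person", 1), ("person", 2)]]
```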

6. Keypoint Detection

Keypoint detection tries to detect some key points in an image to give more details about a
class of objects. It typically detects people and localizes their key points. There are mainly
two keypoint detection areas: body keypoint detection and facial keypoint detection.

For example, facial keypoint detection involves detecting key parts of the human face, such
as the nose, eyes, mouth corners, and eyebrows. Keypoint detection has applications
including face detection, pose detection, etc.

With pose estimation, we can detect what pose a person has in a given image, which usually
includes locating where the head, eyes, nose, arms, shoulders, hands, and legs are. This can
be done for a single person or multiple people, as needed.
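Once keypoints are detected, the pose itself is often analyzed geometrically, for example by computing the angle at a joint from three keypoint coordinates. The keypoint positions below are made-up values, not detector output.

```python
# Sketch: computing a joint angle from detected keypoints.
import math

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by the segments b->a and b->c."""
    ang1 = math.atan2(a[1] - b[1], a[0] - b[0])
    ang2 = math.atan2(c[1] - b[1], c[0] - b[0])
    deg = abs(math.degrees(ang1 - ang2))
    return 360 - deg if deg > 180 else deg

# Illustrative keypoints for one arm, as (x, y) image coordinates.
keypoints = {"shoulder": (0, 0), "elbow": (10, 0), "wrist": (10, 10)}
elbow_angle = joint_angle(keypoints["shoulder"], keypoints["elbow"], keypoints["wrist"])
# -> 90.0 degrees (the arm is bent at a right angle)
```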

7. Person Segmentation

Person segmentation is a type of image segmentation used to separate a person from the
background in an image. It can be used after pose estimation, since with both together we
can closely identify the exact location of the person in the image as well as the pose of that
person.

8. Depth Perception

Depth perception is a computer vision technique that gives machines the visual ability to
estimate the 3D depth or distance of an object from the source. Depth perception has wide
applications, including the reconstruction of objects in augmented reality, robotics, self-
driving cars, etc. LiDAR (Light Detection and Ranging) is one of the popular techniques
used for depth perception: it illuminates an object with laser light and measures the
reflections with sensors to determine the object's relative distance.
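The core of LiDAR ranging is a time-of-flight computation: the pulse travels to the object and back, so distance = speed of light × round-trip time / 2. The timing value below is an illustrative number, not real sensor data.

```python
# Sketch: LiDAR time-of-flight distance estimation.

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def lidar_distance(round_trip_seconds):
    """Distance to the reflecting object in metres; divide by 2 because the
    measured time covers the pulse's trip out and back."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2

# A pulse that returns after about 66.7 nanoseconds hit something ~10 m away.
d = lidar_distance(66.71e-9)  # -> approximately 10.0 metres
```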

9. Image Captioning

Image captioning, as the name suggests, is about giving an image a suitable caption that
describes it. It makes use of neural networks: when we input an image, a caption is generated
that describes the image. It is not only a computer vision task but also an NLP task.

10. 3D Object Reconstruction

As the name suggests, 3D object reconstruction is a technique for extracting a 3D object
from a 2D image. It is currently a fast-developing field of computer vision, and it can be done
in different ways for different objects. One of the most successful papers on this technique is
PiFuHD, which describes 3D human digitization.
