Unit 2
Neural Networks
Using the human brain as a source of inspiration, artificial neural networks (NNs) are massively
parallel distributed networks that have the ability to learn and generalize from examples. This
area of research includes feedforward NNs, recurrent NNs, self-organizing NNs, deep learning,
convolutional neural networks and so on.
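To make the idea concrete, here is a minimal sketch (not a definitive implementation) of a feedforward network's forward pass in Python with NumPy; the layer sizes, random weights, and input vector are illustrative assumptions, and in practice the weights would be learned from examples.

```python
import numpy as np

def sigmoid(x):
    # Squashing activation: maps any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 3 inputs, 4 hidden units, 1 output.
# Real networks learn these weights from training examples.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # hidden -> output

def forward(x):
    h = sigmoid(W1 @ x + b1)       # hidden-layer activations
    return sigmoid(W2 @ h + b2)    # network output

print(forward(np.array([0.5, -1.0, 2.0])))
```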
Fuzzy Systems
Using the human language as a source of inspiration, fuzzy systems (FS) model linguistic
imprecision and solve uncertain problems based on a generalization of traditional logic, which
enables us to perform approximate reasoning. This area of research includes fuzzy sets and
systems, fuzzy clustering and classification, fuzzy controllers, linguistic summarization, fuzzy
neural networks, type-2 fuzzy sets and systems, and so on.
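As a small illustration of how fuzzy systems model linguistic imprecision, the sketch below assigns a degree of membership (between 0 and 1) to the linguistic term "warm"; the triangular shape and the temperature breakpoints are illustrative assumptions.

```python
def triangular(x, a, b, c):
    # Degree of membership in a triangular fuzzy set:
    # 0 outside [a, c], rising linearly to 1 at the peak b.
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Linguistic term "warm" for temperature, peaking at 25 degrees
# (breakpoints are illustrative).
for t in (15, 22, 25, 30):
    print(t, "is warm to degree", round(triangular(t, 18, 25, 32), 2))
```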
Evolutionary Computation
Using biological evolution as a source of inspiration, evolutionary computation (EC) solves
optimization problems by generating, evaluating and modifying a population of possible
solutions. EC includes genetic algorithms, evolutionary programming, evolution strategies,
genetic programming, swarm intelligence, differential evolution, evolvable hardware, multi-
objective optimization and so on.
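The generate-evaluate-modify loop can be sketched in a few lines of Python. This is a toy evolutionary algorithm, not any specific EC method from the literature; the objective function, population size, and mutation scheme are illustrative assumptions.

```python
import random

def fitness(x):
    # Toy objective to maximize: f(x) = -(x - 3)^2, optimum at x = 3.
    return -(x - 3) ** 2

# Generate an initial population of candidate solutions.
population = [random.uniform(-10, 10) for _ in range(20)]

for generation in range(50):
    # Evaluate: rank candidates by fitness and keep the best half.
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    # Modify: produce offspring by randomly mutating the survivors.
    offspring = [p + random.gauss(0, 0.5) for p in parents]
    population = parents + offspring

print("best solution found:", max(population, key=fitness))  # close to 3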
Swarm Intelligence
Swarm intelligence is based on the collective behavior of decentralized, self-organizing
systems, typically consisting of a population of simple agents that interact locally with each
other and with their environment.
Probabilistic Methods
Probabilistic methods, first introduced by Paul Erdős and Joel Spencer in 1974 and regarded as one of the main elements of fuzzy logic, aim to evaluate the outcomes of a computational intelligence system whose behaviour is largely defined by randomness. Probabilistic methods therefore bring out the possible solutions to a problem based on prior knowledge.
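A common way to evaluate outcomes defined by randomness is Monte Carlo sampling. The sketch below estimates a probability by repeated random trials; the dice example is an illustrative assumption, not taken from the text above.

```python
import random

# Estimate the probability that the sum of two dice exceeds 9
# by sampling, rather than enumerating all outcomes exactly.
trials = 100_000
hits = sum(1 for _ in range(trials)
           if random.randint(1, 6) + random.randint(1, 6) > 9)
print("estimated P(sum > 9):", hits / trials)   # exact value is 6/36 = 0.1667
```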
Artificial Intelligence (AI) is an area of computer science that focuses on the development of intelligent machines, while Computational Intelligence (CI) is a subfield of AI that focuses on creating systems capable of performing complex tasks.
Artificial Intelligence (AI) refers to the ability of machines to perform tasks that
typically require human intelligence, such as learning, reasoning, problem-solving, and
decision-making.
Key features of AI:
Mimicking Human Intelligence:
AI aims to create systems that can mimic human cognitive abilities, allowing machines to
learn from data, understand language, recognize patterns, and make decisions.
Algorithms and Data:
AI relies on algorithms, which are sets of instructions, and vast amounts of data to train and
improve the performance of AI systems.
Machine Learning:
A key component of AI is machine learning, where systems learn from data without explicit
programming, allowing them to adapt and improve over time.
Applications:
AI is used in a wide range of applications, including:
Natural Language Processing (NLP): Enabling machines to understand and generate
human language (e.g., chatbots, translation software).
Computer Vision: Allowing machines to "see" and interpret images and videos (e.g., self-
driving cars, object recognition).
Robotics: Developing robots capable of performing tasks autonomously.
Data Analysis: Identifying patterns and insights from large datasets.
Recommendation Systems: Suggesting products, content, or services based on user
preferences.
Types of AI:
Narrow or Weak AI: Designed for a specific task (e.g., a spam filter).
General or Strong AI: Capable of performing any intellectual task that a human being can
(currently theoretical).
Super AI: Hypothetical AI that surpasses human intelligence in all aspects (currently
theoretical).
AI in the Workplace:
AI is increasingly being used to automate tasks, improve efficiency, and enhance decision-
making in various industries.
Ethical Considerations:
As AI becomes more powerful, it's important to address ethical concerns related to bias,
privacy, and job displacement.
AI programming focuses on three cognitive skills: learning, reasoning and self-correction.
Learning processes: This aspect of AI programming focuses on acquiring data and creating
rules for how to turn the data into actionable information. The rules, which are
called algorithms, provide computing devices with step-by-step instructions for how to
complete a specific task.
Reasoning processes: This aspect of AI programming focuses on choosing the right algorithm
to reach a desired outcome.
Self-correction processes: This aspect of AI programming is designed to continually fine-tune
algorithms and ensure they provide the most accurate results possible.
AI itself generates the algorithm to produce end results and decisions, whereas CI generates only the information and allows the end result to be interpreted by humans.
Machines and Cognition
• Cognitive AI, also called cognitive artificial intelligence, is software that tries to think
and learn by imitating the way human brains work.
• It uses natural language processing (NLP) and machine learning (ML) to attempt to
learn, reason, and understand the human intention behind queries to deliver more
relevant responses.
What is Machine Cognition?
It's a type of AI that aims to develop machines that can think and learn like humans.
It involves using techniques like natural language processing (NLP) and machine learning
(ML) to enable machines to understand human intentions and deliver relevant responses.
It goes beyond simple data processing and aims to create systems that can reason, learn, and
adapt to new situations.
Key Concepts in Machine Cognition:
Learning: Machines learn from data and experience, similar to how humans learn.
Reasoning: Machines can draw conclusions and make inferences based on the information
they have.
Understanding Language: Machines can understand and respond to human language,
enabling natural interactions.
Cognitive Computing: This term is often used interchangeably with machine cognition and
refers to systems that learn at scale, reason with purpose, and interact with humans naturally.
Examples of Machine Cognition Applications:
Healthcare: Diagnosing diseases, managing patient records, and personalizing treatments.
Finance: Detecting fraud, analyzing market trends, and providing financial advice.
Manufacturing: Optimizing production processes, predicting equipment failures, and
improving quality control.
Customer Service: Providing personalized support, answering customer questions, and
resolving issues.
Machine Learning vs. Machine Cognition:
Machine Learning (ML): Focuses on developing algorithms that enable machines to learn
from data.
Machine Cognition (MC): Builds upon ML and other AI technologies to create systems that
can simulate human thought processes.
The Future of Machine Cognition:
Researchers are exploring ways to create machines with higher levels of intelligence
and cognitive abilities.
Cognitive AI systems are expected to play an increasingly important role in various
aspects of society.
The goal is to develop machines that can understand language, integrate knowledge,
and adapt to new situations, similar to humans.
Cognitive Architecture
• A cognitive architecture is both a theory about the structure of the human mind and a computational instantiation of such a theory, used in the fields of artificial intelligence (AI) and computational cognitive science.
• "Symbolic memory" refers to a memory system that stores information in discrete, meaningful symbols, allowing for logical reasoning and manipulation of concepts.
• "Emergent memory" refers to a type of memory that arises spontaneously from the interactions within a system, without being explicitly programmed.
• "Hybrid memory" refers to combining both short-term and long-term memory, or integrating episodic and semantic memory.
Cognitive architecture is a theory about the structures that create a mind in natural or artificial
systems. It focuses on how these structures work with each other and use the knowledge and
skills that are incorporated into the architecture to create and manage intelligent behavior in
various complex environments.
The Elementary Perceiver and Memorizer (EPAM), created in 1960 by Ed Feigenbaum, was one of the first cognitive architecture models. Feigenbaum intended EPAM to glean insights into the inner workings of the human brain.
Generically, cognitive architectures cover both creating artificial intelligence and modeling natural intelligence at appropriate levels of abstraction. A grand unified architecture integrates higher-level thought processes with the aspects that are essential for successful intelligent behavior in human-like environments, such as emotions, motor control, and perception. Functionally elegant architectures produce a broad expanse of capabilities from interactions among a tiny set of mechanisms; these can be considered a set of cognitive Newton's laws.
What is the purpose of cognitive architecture?
Cognitive architecture seeks to employ the research that is carried out in the domain of
cognitive psychology to build a complete computer model of cognition. Cognitive architecture
acts as a blueprint for creating and implementing intelligent agents.
It concentrates on merging cognitive science and artificial intelligence and seeks to create
artificial computational system processes which behave like natural cognitive systems.
The ultimate purpose of cognitive architecture is to model the human brain and eventually
empower us to build artificial intelligence that is on par with humans (Artificial General
Intelligence).
Knowledge-Based System
• A knowledge-based system (KBS) is a computer program that uses a knowledge base
to solve problems. KBSs are a type of artificial intelligence (AI) that use reasoning to
make decisions.
A knowledge-based system in AI is thus a computer program that uses a centralized knowledge base, a structured repository of information, to support decision-making and problem-solving. It mimics human expertise and reasoning, relying on inference to derive new knowledge.
Key Components of a KBS:
Knowledge Base:
This is a structured collection of facts, rules, and other knowledge relevant to a specific
domain.
Inference Engine:
This component uses the knowledge base to draw conclusions and make decisions.
Reasoning System:
This system allows the KBS to derive new knowledge from the existing knowledge base.
How KBS Works:
1. Knowledge Acquisition:
The system needs to acquire knowledge from various sources, such as human experts,
documents, or databases.
2. Knowledge Representation:
The acquired knowledge is then organized and represented in a structured format within the
knowledge base.
3. Inference and Reasoning:
The inference engine uses the knowledge base and reasoning techniques (e.g., rules, logic, or
constraint handling) to solve problems or make decisions.
4. Output and Action:
The KBS provides solutions, recommendations, or takes actions based on the results of the
inference and reasoning process.
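Putting these components together, here is a minimal sketch of a KBS: a knowledge base of facts and if-then rules plus a forward-chaining inference engine. The medical-style facts and rules are illustrative assumptions, not a real diagnostic system.

```python
# Knowledge base: facts plus if-then rules (all illustrative).
facts = {"has_fever", "has_cough"}
rules = [
    ({"has_fever", "has_cough"}, "suspect_flu"),
    ({"suspect_flu"}, "recommend_rest"),
]

def forward_chain(facts, rules):
    # Inference engine: keep applying rules whose conditions are
    # satisfied until no new facts can be derived.
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)   # derive new knowledge
                changed = True
    return derived

print(forward_chain(facts, rules))
# {'has_fever', 'has_cough', 'suspect_flu', 'recommend_rest'}
```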
Examples of KBS Applications:
Expert Systems: These systems are designed to emulate the decision-making abilities of
human experts in a specific domain, such as medical diagnosis or financial analysis.
Intelligent Tutoring Systems: These systems use knowledge bases to provide personalized
learning experiences to students.
Robotics: KBS can be used to control robots and make decisions in real-time.
Cybersecurity: KBS can be used to detect and prevent cyberattacks.
Logical Representation
It represents a conclusion based on various conditions and lays down some important communication rules. It also consists of precisely defined syntax and semantics that support sound inference. Each sentence can be translated into logic using this syntax and semantics.
Knowledge Representation
Knowledge representation involves converting information from the real world into a form a
computer can understand and manipulate. It aims to capture the world’s complexities by
modelling entities, their attributes, and the relationships between them in a structured format.
Various forms of representation are used in KRR, including:
Symbolic Representation: Using symbols, such as logic or mathematical notations, to
represent knowledge.
Semantic Networks: Representing knowledge through nodes (concepts) and edges
(relationships) in a graphical structure.
Frames and Scripts: Organising knowledge into structures that capture typical attributes and
behaviours of entities.
Ontologies: Formal representations that define concepts, relationships, and constraints within
a particular domain.
Reasoning
Reasoning in AI refers to the cognitive processes by which knowledge is leveraged to draw conclusions, make predictions, or solve problems. It involves the application of logical rules, inference mechanisms, and decision-making processes based on the represented knowledge.
Types of Reasoning:
Deductive Reasoning: Deriving specific conclusions from general principles or rules.
Inductive Reasoning: Generalising from specific observations or cases to make broader
conclusions.
Abductive Reasoning: Inferring the most likely explanation for a set of observations, even if it is not necessarily the only possible one.
Defining Reasoning:
Cognitive Process: Reasoning in artificial intelligence refers to the mental process of applying knowledge to draw conclusions, make predictions, or solve problems. It mimics the human ability to think logically and make informed decisions based on available information.
Decision-Making Mechanism: At its essence, reasoning is the decision-making mechanism within intelligent systems. It involves leveraging knowledge representations to infer new information, deduce logical consequences, and navigate complex problem spaces.
Types of Reasoning:
1. Deductive Reasoning:
Principle: Deriving specific conclusions from general principles or rules.
Application: Often employed in rule-based systems where logical rules are applied to reach
specific conclusions.
2. Inductive Reasoning:
Principle: Generalising from specific observations or cases to make broader conclusions.
Application: Commonly used in machine learning, where patterns in data are identified to
make predictions about unseen instances.
3. Abductive Reasoning:
Principle: Inferring the most likely explanation for a set of observations, even if it is not necessarily the only possible one.
Application: Applied in diagnostic systems and problem-solving scenarios where multiple
plausible explanations may exist.
Interdependence of Representation and Reasoning:
Mutual Enhancement: Knowledge representation and reasoning are symbiotic components
in AI. The quality of representation significantly influences the efficacy of reasoning, and
practical reasoning, in turn, refines the representation.
Dynamic Iteration: The iterative process of refining representation and reasoning enhances
the overall cognitive capabilities of AI systems, enabling them to adapt and learn from new
information.
Applications of Reasoning in AI:
Expert Systems: Employing deductive reasoning to emulate human expertise in specific
domains.
Decision Support Systems: Leveraging reasoning mechanisms to assist users in making
informed decisions.
Natural Language Processing: Utilising various forms of reasoning to understand and generate human-like language.
What is Reasoning in artificial intelligence?
Reasoning is the mental process of deriving logical conclusions and making predictions from available knowledge, facts, and beliefs. Or we can say, "Reasoning is a way to infer facts from existing data." It is a general process of thinking rationally to find valid conclusions. In artificial intelligence, reasoning is essential so that the machine can think rationally like a human brain and perform like a human.
Types of Reasoning
Deductive reasoning
Inductive reasoning
Abductive reasoning
Common Sense Reasoning
Monotonic Reasoning
Non-monotonic Reasoning
Note: Inductive and deductive reasoning are forms of propositional logic.
1. Deductive reasoning:
Deductive reasoning is the process of deducing new information from previously known, logically related information. It is a form of valid reasoning in which the conclusion of an argument must be true if the premises are true.
In AI, deductive reasoning is a form of propositional logic that requires a set of rules and facts. It is also known as top-down reasoning, and it is the opposite of inductive reasoning.
Example:
Premise-1: All humans eat veggies.
Premise-2: Suresh is human.
Conclusion: Suresh eats veggies.
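The syllogism above can be sketched in Python as a single rule applied to known facts; the extra name in the set is an illustrative assumption.

```python
# Premise-2 style facts (the second name is illustrative).
humans = {"Suresh", "Anita"}
eats_veggies = set()

# Premise-1: all humans eat veggies, applied as a universal rule.
for person in humans:
    eats_veggies.add(person)

# The conclusion follows deductively from the premises.
print("Suresh eats veggies:", "Suresh" in eats_veggies)   # True
```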
2. Inductive Reasoning:
In deductive reasoning, the truth of the premises guarantees the truth of the conclusion, and the argument typically moves from general premises to a specific conclusion. Inductive reasoning works in the opposite direction.
Inductive reasoning is a type of reasoning that uses the process of generalization to arrive at a
conclusion with a limited collection of information. It begins with a set of precise facts or
data and ends with a broad assertion or conclusion.
Inductive reasoning, often known as cause-and-effect reasoning or bottom-up reasoning, is a kind of propositional logic. In inductive reasoning, we use historical evidence or a set of premises to come up with a general rule, where the premises support but do not entail the conclusion. The truth of the premises therefore does not ensure the truth of the conclusion; the premises only provide likely grounds for it.
Example:
Premise: All of the pigeons we have seen in the zoo are white.
Conclusion: Therefore, we can expect all the pigeons to be white.
3. Abductive reasoning:
Abductive reasoning is a type of logical reasoning that begins with a single or several
observations and then searches for the most plausible explanation or conclusion for the
observation.The premises do not guarantee the conclusion in abductive reasoning, which is an
extension of deductive reasoning.
Example:
Implication: Cricket ground is wet if it is raining.
Axiom: Cricket ground is wet.
Conclusion: It is raining.
4. Common Sense Reasoning:
Common sense reasoning is a type of informal reasoning that can be learned through personal experience. It mimics the human ability to make educated guesses about events that occur on a daily basis. It runs on heuristic knowledge and heuristic rules and depends on good judgment rather than exact reasoning.
Example:
One person can be at one place at a time.
If I put my hand in a fire, then it will burn.
The preceding two statements are instances of common sense reasoning that everyone can understand and assume to be true.
5. Monotonic Reasoning:
In monotonic reasoning, once a conclusion is reached, it remains the same even if new information is added to the existing knowledge base. Adding knowledge to a monotonic reasoning system does not reduce the set of propositions that can be deduced. We can derive a valid conclusion from the relevant information alone, and it will not be influenced by other factors. Monotonic reasoning is ineffective for real-time systems, because facts change in real time. It is applied in typical reasoning systems, and any logic-based system is monotonic. Monotonic reasoning can be used to prove theorems.
Example:
Earth revolves around the Sun.
This fact cannot be changed, even if we add other sentences to our knowledge base, such as "The moon revolves around the earth" or "The Earth is not round."
If we deduce some facts from existing facts, the deduction will always remain valid.
Hypothetical knowledge cannot be expressed with monotonic reasoning, so the facts must be correct.
New knowledge from the real world cannot be added, because we can only draw inferences from existing proofs.
6. Non-monotonic Reasoning
Some findings in non-monotonic reasoning may be refuted if we add more information to our
knowledge base.
If certain conclusions can be disproved by adding new knowledge to our knowledge base,
logic is said to be non-monotonic.
Non-monotonic reasoning deals with models that are partial or uncertain.
"Human perceptions for various things in daily life, " is a basic example of non-monotonic
reasoning.
Example: Suppose the knowledge base contains the following knowledge:
Birds can fly
Penguins cannot fly
Pitty is a bird
From this we can conclude that "Pitty can fly."
However, if we add another statement to the knowledge base, such as "Pitty is a penguin," the conclusion "Pitty can fly" is invalidated and replaced by "Pitty cannot fly."
Advantages of Non-monotonic Reasoning:
When using non-monotonic reasoning, old truths can be negated by adding new statements.
Artificial intelligence reasoning can be divided into two broad types: inductive reasoning and deductive reasoning. Both modes of thinking contain premises and conclusions, but they differ in how the conclusion follows from the premises. The following is a list of comparisons between inductive and deductive reasoning:
Inductive reasoning involves making a generalization from specific facts and observations,
whereas deductive reasoning employs accessible facts, information, or knowledge to draw a
correct conclusion.
Deductive reasoning is done from the top down, whereas inductive reasoning is done from
the bottom up.
Deductive reasoning leads to a correct conclusion from a generalized assertion, but inductive
reasoning leads to a generalization from a specific observation.
The findings in deductive reasoning are certain, whereas the conclusions in inductive
reasoning are probabilistic.
Deductive arguments can be valid or invalid, implying that if the premises are true, the
conclusion must be true, but inductive arguments can be strong or weak, implying that even if
the premises are correct, the conclusion may be untrue.
On the basis of arguments, the distinctions between inductive and deductive reasoning can be summarized as follows:
Validity: Premises are the starting point for deductive reasoning, whereas the conclusion is where inductive reasoning begins.
Usage: Deductive reasoning is difficult to use, since we need facts that must be true; inductive reasoning can be used quickly and easily, because we need evidence rather than genuine facts, and it is frequently used in our daily lives.
Structure: Deductive reasoning progresses from broad to specific information, whereas inductive reasoning moves from specific data to general facts.
Logical Decision-Making
In AI, logical decision-making involves AI systems using reasoning and knowledge
representation to make decisions, often based on rules and inferences, aiming for explainability
and reliability.
Logical decision-making refers to using logic to make choices. A logical decision-maker uses
evidence and develops arguments and reasons to draw conclusions and make decisions.
Practicing logical decision-making seems more limited in scope and a more realistic goal than
"rational" decision-making.
Natural Language Processing
Natural language processing (NLP) is a field of computer science and a subfield of artificial intelligence that aims to make computers understand human language. NLP uses computational linguistics, which is the study of how language works, and various models based on statistics, machine learning, and deep learning. These technologies allow computers to analyze and process text or voice data and to grasp their full meaning, including the speaker's or writer's intentions and emotions.
NLP powers many applications that use language, such as text translation, voice recognition,
text summarization, and chatbots. You may have used some of these applications yourself,
such as voice-operated GPS systems, digital assistants, speech-to-text software, and
customer service bots. NLP also helps businesses improve their efficiency, productivity, and
performance by simplifying complex tasks that involve language.
NLP Techniques
NLP encompasses a wide array of techniques aimed at enabling computers to process and understand human language. These tasks can be categorized into several broad areas, each addressing a different aspect of language processing.
1. Text Preprocessing
Stopword Removal: Removing common words (like “and”, “the”, “is”) that may not carry significant meaning.
2. Syntactic Analysis
Part-of-Speech (POS) Tagging: Assigning parts of speech to each word in a sentence (e.g., noun, verb, adjective).
Constituency Parsing: Breaking down a sentence into its constituent parts or phrases (e.g.,
noun phrases, verb phrases).
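The preprocessing and tagging steps above can be sketched with the NLTK library, assuming it is installed (pip install nltk) and its standard resources have been downloaded; the sentence is an illustrative example.

```python
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# One-time downloads of the tokenizer, stopword list and POS tagger.
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("averaged_perceptron_tagger")

text = "The quick brown fox is jumping over the lazy dog"
tokens = word_tokenize(text)

# Stopword removal: drop common words that carry little meaning.
content = [w for w in tokens if w.lower() not in stopwords.words("english")]
print(content)   # ['quick', 'brown', 'fox', 'jumping', 'lazy', 'dog']

# Part-of-speech tagging: assign a POS tag to each remaining word.
print(nltk.pos_tag(content))   # e.g. [('quick', 'JJ'), ('brown', 'NN'), ...]
```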
3. Semantic Analysis
Named Entity Recognition (NER): Identifying and classifying entities in text, such as names
of people, organizations, locations, dates, etc.
Coreference Resolution: Identifying when different words refer to the same entity in a text
(e.g., “he” refers to “John”).
4. Information Extraction
Entity Extraction: Identifying specific entities and their relationships within the text.
Relation Extraction: Identifying and categorizing the relationships between entities in a text.
5. Text Classification
Sentiment Analysis: Determining the sentiment or emotional tone expressed in a text (e.g., positive, negative, neutral).
6. Language Generation
8. Question Answering
Retrieval-Based QA: Finding and returning the most relevant text passage in response to a
query.
Generative QA: Generating an answer based on the information available in a text corpus.
9. Dialogue Systems
Chatbots and Virtual Assistants: Enabling systems to engage in conversations with users,
providing responses and performing tasks based on user input.
Computer Vision
Computer vision needs lots of data. It runs analyses of the data over and over until it discerns distinctions and ultimately recognizes images. For example, to train a computer to recognize automobile tires, it needs to be fed vast quantities of tire images and tire-related items to learn the differences and recognize a tire, especially one with no defects.
Two essential technologies are used to accomplish this: a type of machine learning
called deep learning and a convolutional neural network (CNN).
Machine learning uses algorithmic models that enable a computer to teach itself about the
context of visual data. If enough data is fed through the model, the computer will “look” at
the data and teach itself to tell one image from another. Algorithms enable the machine to
learn by itself, rather than someone programming it to recognize an image.
A CNN helps a machine learning or deep learning model “look” by breaking images down
into pixels that are given tags or labels. It uses the labels to perform convolutions (a
mathematical operation on two functions to produce a third function) and makes predictions
about what it is “seeing.” The neural network runs convolutions and checks the accuracy of
its predictions in a series of iterations until the predictions start to come true. It is then
recognizing or seeing images in a way similar to humans.
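The convolution-pool-predict pipeline described above can be sketched with PyTorch, assuming it is installed; the layer sizes and the 28x28 single-channel input are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    # Convolutional layers learn local visual features, pooling shrinks
    # the image, and a final linear layer maps features to class scores.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16 * 7 * 7, 10)   # 10 illustrative classes

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 14x14 -> 7x7
        return self.fc(x.flatten(1))                 # class scores

model = TinyCNN()
scores = model(torch.randn(1, 1, 28, 28))   # one fake grayscale image
print(scores.shape)                          # torch.Size([1, 10])
```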
Scientists and engineers have been trying to develop ways for machines to see and understand
visual data for about 60 years. Experimentation began in 1959 when neurophysiologists
showed a cat an array of images, attempting to correlate a response in its brain. They
discovered that it responded first to hard edges or lines; scientifically, this meant that image processing starts with simple shapes like straight edges. [2]
At about the same time, the first computer image scanning technology was developed,
enabling computers to digitize and acquire images. Another milestone was reached in 1963
when computers were able to transform two-dimensional images into three-dimensional
forms. In the 1960s, AI emerged as an academic field of study and it also marked the
beginning of the AI quest to solve the human vision problem.
1974 saw the introduction of optical character recognition (OCR) technology, which could recognize text printed in any font or typeface. [3] Similarly, intelligent character recognition (ICR) could decipher handwritten text using neural networks. [4] Since then, OCR and ICR have found their way into document and invoice processing, vehicle plate recognition, mobile payments, machine translation and other common applications.
In 1982, neuroscientist David Marr established that vision works hierarchically and
introduced algorithms for machines to detect edges, corners, curves and similar basic shapes.
Concurrently, computer scientist Kunihiko Fukushima developed a network of cells that
could recognize patterns. The network, called the Neocognitron, included convolutional
layers in a neural network.
By 2000, the focus of study was on object recognition; and by 2001, the first real-time face
recognition applications appeared. Standardization of how visual data sets are tagged and
annotated emerged through the 2000s. In 2010, the ImageNet data set became available. It
contained millions of tagged images across a thousand object classes and provides a
foundation for CNNs and deep learning models used today. In 2012, a team from the
University of Toronto entered a CNN into an image recognition contest. The model, called
AlexNet, significantly reduced the error rate for image recognition. After this breakthrough, error rates have fallen to just a few percent. [5]
Computer vision applications
There is a lot of research being done in the computer vision field, but it doesn't stop there.
Real-world applications demonstrate how important computer vision is to endeavors in
business, entertainment, transportation, healthcare and everyday life. A key driver for the
growth of these applications is the flood of visual information flowing from smartphones,
security systems, traffic cameras and other visually instrumented devices. This data could
play a major role in operations across industries, but today goes unused. The information
creates a test bed to train computer vision applications and a launchpad for them to become
part of a range of human activities:
IBM used computer vision to create My Moments for the 2018 Masters golf tournament. IBM
Watson® watched hundreds of hours of Masters footage and could identify the sights (and
sounds) of significant shots. It curated these key moments and delivered them to fans as
personalized highlight reels.
Google Translate lets users point a smartphone camera at a sign in another language and
almost immediately obtain a translation of the sign in their preferred language. [6]
The development of self-driving vehicles relies on computer vision to make sense of the
visual input from a car’s cameras and other sensors. It’s essential to identify other cars, traffic
signs, lane markers, pedestrians, bicycles and all of the other visual information encountered
on the road.
IBM is applying computer vision technology with partners like Verizon to bring intelligent AI
to the edge and to help automotive manufacturers identify quality defects before a vehicle
leaves the factory.
Computer vision examples
Many organizations don’t have the resources to fund computer vision labs and create deep
learning models and neural networks. They may also lack the computing power that is
required to process huge sets of visual data. Companies such as IBM are helping by offering
computer vision software development services. These services deliver pre-built learning
models available from the cloud—and also ease demand on computing resources. Users
connect to the services through an application programming interface (API) and use them to
develop computer vision applications.
IBM has also introduced a computer vision platform that addresses both developmental and
computing resource concerns. IBM Maximo® Visual Inspection includes tools that enable
subject matter experts to label, train and deploy deep learning vision models—without coding
or deep learning expertise. The vision models can be deployed in local data centers, the cloud
and edge devices.
While it’s getting easier to obtain resources to develop computer vision applications, an
important question to answer early on is: What exactly will these applications do?
Understanding and defining specific computer vision tasks can focus and validate projects
and applications and make it easier to get started.
Image classification sees an image and can classify it (a dog, an apple, a person’s face).
More precisely, it is able to accurately predict that a given image belongs to a certain class.
For example, a social media company might want to use it to automatically identify and
segregate objectionable images uploaded by users.
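In practice, image classification is often done with a pretrained network. The sketch below uses torchvision's pretrained ResNet-18, assuming torchvision (version 0.13 or later) is installed and that a local file named photo.jpg exists; both are assumptions, not part of the text above.

```python
import torch
from PIL import Image
from torchvision import models
from torchvision.models import ResNet18_Weights

weights = ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()   # resizing/normalization the model expects

# "photo.jpg" is a hypothetical local image file.
img = preprocess(Image.open("photo.jpg")).unsqueeze(0)   # batch of one

with torch.no_grad():
    probs = model(img).softmax(dim=1)
label = weights.meta["categories"][probs.argmax(dim=1).item()]
print("predicted class:", label)
```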
Object detection can use image classification to identify a certain class of image and then
detect and tabulate their appearance in an image or video. Examples include detecting
damages on an assembly line or identifying machinery that requires maintenance.
Object tracking follows or tracks an object once it is detected. This task is often executed
with images captured in sequence or real-time video feeds. Autonomous vehicles, for
example, need to not only classify and detect objects such as pedestrians, other cars and road
infrastructure, they need to track them in motion to avoid collisions and obey traffic laws. [7]
Content-based image retrieval uses computer vision to browse, search and retrieve images
from large data stores, based on the content of the images rather than metadata tags associated
with them. This task can incorporate automatic image annotation that replaces manual image
tagging. These tasks can be used for digital asset management systems and can increase the
accuracy of search and retrieval.