AI Unit 5
Image formation is the process by which a physical scene is captured as an image, typically
through an optical system like the human eye or a camera. The process involves the interaction
of light with objects in the scene, the optics of the imaging system, and the recording of the
resulting image.
4. Image Recording:
o In the human eye:
The retina records the light pattern and converts it to electrical signals via
photoreceptors (rods and cones).
o In a digital camera:
The image sensor (CCD or CMOS) captures light intensity and color,
converting it to a digital image.
5. Image Perception:
o The human brain or a computational system interprets the captured image to
extract meaningful information.
Real-World Examples
1. Human Vision:
o Light enters through the cornea and is focused by the lens onto the retina.
o The brain interprets signals from the retina to form the perception of the scene.
2. Photography:
o Light is focused by the camera lens onto a digital sensor.
o The camera adjusts exposure, aperture, and focus to capture a high-quality image.
3. Medical Imaging:
o Techniques like X-rays and MRIs use similar principles to form images of the
internal body structures.
Early Image Processing Operations
Early image processing operations in AI and computer vision laid the foundation for many
modern applications. These operations focus on enhancing, transforming, and extracting
information from images. Here are some key operations and their purposes:
2. Edge Detection
Gradient-Based Methods: Algorithms like Sobel, Prewitt, and Roberts calculate the
gradient to identify edges.
Canny Edge Detection: A multi-step process involving noise reduction, gradient
calculation, non-maximum suppression, and edge tracking by hysteresis.
Laplacian of Gaussian (LoG): Combines Gaussian smoothing with the Laplacian operator
for edge detection.
3. Thresholding
4. Morphological Operations
5. Image Filtering
6. Feature Extraction
Corner Detection (Harris, FAST): Identifies key points in images for object recognition
or motion tracking.
Texture Analysis (Gabor Filters, LBP): Captures surface patterns in an image for
classification.
Hough Transform: Detects simple shapes like lines, circles, and ellipses.
7. Image Transformations
8. Color Processing
Color Space Conversion: Transforms images between RGB, HSV, and YCbCr spaces
for better analysis.
Color Histogram: Represents the distribution of colors in an image for comparison or
classification.
9. Segmentation
Over time, these foundational methods evolved into more sophisticated approaches driven by
deep learning and AI, enabling advanced tasks like object detection, semantic segmentation, and
3D image reconstruction.
Edge Detection
Edge detection is a fundamental operation in image processing that identifies significant
transitions in pixel intensity, often corresponding to object boundaries. Algorithms such as
Sobel, Prewitt, and Canny are widely used for this purpose.
Here’s an example demonstrating the Sobel operator with a simple numerical problem.
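A minimal sketch in Python (the 5x5 test patch, with a dark left half and a bright right half, is invented for illustration): it applies the 3x3 Sobel kernels with OpenCV and prints the gradient magnitude, which is large only along the vertical intensity transition.

import numpy as np
import cv2

# Small synthetic grayscale patch: dark left half, bright right half,
# so a strong vertical edge (large horizontal gradient Gx) is expected.
patch = np.array([[10, 10, 10, 200, 200]] * 5, dtype=np.float32)

# Horizontal and vertical Sobel gradients (3x3 kernels:
# Gx = [[-1,0,1],[-2,0,2],[-1,0,1]], Gy = Gx transposed)
gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0, ksize=3)
gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1, ksize=3)

# Gradient magnitude: high values mark edge pixels
magnitude = np.sqrt(gx ** 2 + gy ** 2)
print(magnitude)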
Image Segmentation
Image segmentation is a crucial task in computer vision that involves partitioning an image into
meaningful regions or segments. Each segment typically corresponds to objects or regions of
interest in the image, allowing finer analysis and understanding of the scene.
Types of Image Segmentation
1. Semantic Segmentation
o Assigns a class label to each pixel.
o Example: All pixels corresponding to "sky" are labeled as one class, "road" as another.
2. Instance Segmentation
o Extends semantic segmentation by distinguishing between different instances of the
same object class.
o Example: Each car in an image gets a unique label, even if all are of the same type.
3. Panoptic Segmentation
o Combines semantic and instance segmentation.
o Ensures every pixel is assigned to either a known object (instance) or background
(semantic).
Methods and Techniques
1. Thresholding
o Simplest method, based on pixel intensity.
o Example: Otsu’s method to find the optimal threshold value for binary segmentation (a short Otsu sketch appears after this list).
2. Region-Based Segmentation
o Region Growing: Starts from a seed pixel and expands by adding neighboring pixels of
similar intensity.
o Watershed Segmentation: Treats the image as a topographic surface and finds "basins"
corresponding to segments.
3. Edge-Based Segmentation
o Detects object boundaries using edge-detection techniques (e.g., Sobel, Canny).
o Regions are defined by closed contours.
4. Clustering-Based Methods
o k-Means Clustering: Groups pixels based on color or intensity similarity.
o Mean-Shift Clustering: Groups pixels based on density in the feature space.
5. Graph-Based Segmentation
o Models the image as a graph with pixels as nodes and edges representing similarity.
o Techniques like Minimum Cut or Normalized Cut divide the graph into segments.
6. Deep Learning-Based Segmentation
o Fully Convolutional Networks (FCNs): Extend CNNs for pixel-wise prediction.
o U-Net: A popular architecture for biomedical segmentation tasks.
o Mask R-CNN: Combines object detection with instance segmentation.
o DeepLab: Employs atrous convolutions for semantic segmentation.
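As an illustration of the thresholding method above (item 1), here is a minimal OpenCV sketch; the input file name is chosen purely for illustration.

import cv2

# Load an image in grayscale (file name is illustrative)
gray = cv2.imread("coins.png", cv2.IMREAD_GRAYSCALE)

# Otsu's method selects the threshold automatically; the 0 passed here is ignored
otsu_value, binary = cv2.threshold(gray, 0, 255,
                                   cv2.THRESH_BINARY + cv2.THRESH_OTSU)

print("Otsu threshold:", otsu_value)
cv2.imwrite("coins_binary.png", binary)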
Applications
1. Medical Imaging
o Tumor detection, organ delineation (e.g., MRI, CT scans).
2. Autonomous Vehicles
o Road, pedestrian, and obstacle segmentation for navigation.
3. Satellite Imaging
o Land use analysis, vegetation detection.
4. Augmented Reality
o Accurate object and scene segmentation.
5. Image Editing
o Separating foreground and background for manipulation.
Worked Example: Semantic Segmentation of a Street Scene
Input Image: a street scene containing sky, road, and cars.
1. Preprocessing: the image is resized and normalized before being passed to the segmentation model.
2. Prediction: pixels are labeled as sky, road, or car based on their predicted class.
3. Visualization: each class is rendered in a distinct color and overlaid on the original image; a minimal sketch follows.
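A minimal visualization sketch, assuming the model has already produced a 2-D array of class indices (0 = sky, 1 = road, 2 = car); the tiny label map and the colors are invented for illustration.

import numpy as np

# Predicted class index per pixel (tiny illustrative label map)
label_map = np.array([[0, 0, 0, 0],
                      [1, 1, 1, 1],
                      [1, 2, 2, 1]])

# One RGB color per class: sky, road, car
palette = np.array([[135, 206, 235],   # sky  -> light blue
                    [128, 128, 128],   # road -> grey
                    [255,   0,   0]],  # car  -> red
                   dtype=np.uint8)

# Index the palette with the label map to obtain an H x W x 3 color mask
color_mask = palette[label_map]
print(color_mask.shape)   # (3, 4, 3)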
Vision for Manipulation and Navigation
Vision plays a crucial role in enabling AI systems, especially robots, to interact with their
environment for tasks like object manipulation and navigation. Both uses are discussed below.
1. Vision for Manipulation
Manipulation involves using robotic arms or hands to interact with objects in the environment,
requiring precise perception and control.
1. Perception:
o Object Detection: Identify objects in the scene using vision-based techniques (e.g.,
YOLO, Faster R-CNN).
o Pose Estimation: Determine the position and orientation of the object in 3D space.
o Depth Sensing: Use depth cameras (e.g., Intel RealSense) or stereo vision to measure
object distance.
2. Planning:
o Plan a path for the manipulator to reach and interact with the object.
o Avoid collisions using algorithms like RRT (Rapidly-exploring Random Tree).
3. Control:
o Execute the planned path using actuators and continuously adjust based on visual
feedback.
2. Vision for Navigation
Navigation involves a robot moving autonomously within an environment, requiring vision for
understanding and interacting with its surroundings.
1. Perception:
o Environment Mapping: Create a map using cameras or LIDAR (e.g., SLAM).
o Obstacle Detection: Identify objects or barriers in the robot's path.
o Lane or Path Detection: Recognize paths in structured environments like roads.
2. Localization:
o Determine the robot's position using visual cues (Visual Odometry).
o Combine with other sensors like GPS or IMU for accurate localization.
3. Path Planning:
o Use algorithms like A* or Dijkstra to compute the shortest path to the goal (a minimal A* sketch appears after this list).
o Dynamic re-planning to avoid moving obstacles.
4. Control:
o Follow the planned path and adjust based on real-time visual feedback.
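A minimal A* sketch on a 4-connected occupancy grid with the Manhattan distance as the heuristic; the grid, start, and goal are invented for illustration.

import heapq

def astar(grid, start, goal):
    """A* on a 2-D grid (0 = free, 1 = obstacle); returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda cell: abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), start)]
    came_from = {start: None}
    g_cost = {start: 0}
    while open_set:
        _, current = heapq.heappop(open_set)
        if current == goal:                      # reconstruct the path
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g_cost[current] + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_set, (new_g + h(nxt), nxt))
    return None

# Tiny occupancy grid and query (illustrative)
grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))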
3. Tools and Technologies
1. Hardware:
o Cameras: RGB, stereo, depth (e.g., Kinect, RealSense).
o Sensors: LIDAR, ultrasonic, IMU.
o Processors: GPUs for real-time vision processing.
2. Software:
o OpenCV for image processing.
o ROS (Robot Operating System) for integrating vision and navigation.
o Deep learning frameworks (TensorFlow, PyTorch) for perception.
5. Challenges
1. Lighting Variations: Vision systems can struggle in low-light or overly bright environments.
2. Dynamic Environments: Objects or obstacles that move unpredictably.
3. Sensor Noise: Errors in depth or RGB data can affect accuracy.
6. Applications
1. Autonomous Vehicles:
o Use vision to detect lanes, traffic signs, and pedestrians.
2. Warehouse Robots:
o Navigate shelves and retrieve products.
3. Medical Robotics:
o Perform surgeries using visual guidance.
# Display obstacles (assumes `image` was loaded with cv2.imread and `contours`
# obtained via cv2.findContours on an edge map, e.g. from cv2.Canny)
for contour in contours:
    cv2.drawContours(image, [contour], -1, (0, 255, 0), 2)
This provides a simple visualization of obstacles using edge detection. Pair this with a path-
planning algorithm for full navigation functionality.
Introduction to Robotics
Robotics is a multidisciplinary field where Artificial Intelligence (AI) plays a key role in
enabling robots to perceive, reason, and act in dynamic environments. In the context of AI,
robotics focuses on building intelligent systems that can perform tasks autonomously or semi-
autonomously.
1. What is Robotics?
Robotics is the design, construction, operation, and use of robots. It integrates principles from
mechanical engineering, electrical engineering, and computer science, with AI supplying the
perception, planning, and learning capabilities.
2. What is a Robot?
A robot is a programmable machine that can sense its environment, process information, and act
in the physical world. AI-enabled robots are expected to:
1. Perceive: Understand their environment using computer vision, depth sensing, and object
recognition.
2. Plan: Make decisions about how to achieve goals (e.g., path planning in navigation).
3. Act: Execute tasks efficiently using control systems.
4. Learn: Improve performance over time through machine learning and reinforcement
learning.
4. Categories of Robots
1. Industrial Robots:
o Used in manufacturing for tasks like assembly, welding, and painting.
o Example: Robotic arms on car assembly lines.
2. Service Robots:
o Perform tasks for humans, like cleaning, delivery, or assistance.
o Example: Autonomous vacuum cleaners (e.g., Roomba).
3. Mobile Robots:
o Navigate environments using wheels, legs, or other means.
o Example: Delivery robots, drones.
4. Humanoid Robots:
o Mimic human appearance and actions.
o Example: Honda’s ASIMO.
5. Medical Robots:
o Assist in surgeries, rehabilitation, or diagnostics.
o Example: Da Vinci Surgical System.
5. Key Components of a Robotic System
1. Perception:
o Gathering information about the environment using sensors like cameras, LIDAR,
and IMUs.
o Example: A robot detecting obstacles using vision.
2. Computation:
o Processing sensor data and making decisions using algorithms or AI.
o Example: Path planning algorithms like A* or Dijkstra.
3. Control:
o Executing planned actions via motors and actuators.
o Example: Controlling a robotic arm to pick up an object.
6. Applications of Robotics
1. Autonomous Vehicles:
o Self-driving cars that perceive roads, plan routes, and avoid obstacles.
o Example: Tesla's Autopilot.
2. Industrial Automation:
o Robots assembling parts with precision and efficiency.
o Example: ABB’s robotic arms.
3. Search and Rescue:
o Robots exploring hazardous areas to locate survivors.
o Example: Drones with thermal cameras.
4. Healthcare:
o Robotic assistants for surgeries or elderly care.
o Example: Rehabilitation robots.
5. Agriculture:
o Robots for precision farming, like planting, watering, and harvesting.
o Example: Autonomous tractors.
Robot Hardware in AI
Robot hardware forms the physical structure of a robot, enabling it to sense, process, and act
upon its environment. In the context of AI, robot hardware works in tandem with software and
algorithms to achieve autonomous or semi-autonomous behavior.
1.1 Sensors
Types of Sensors:
1. Visual Sensors:
Cameras (RGB, depth, stereo): Used for object detection, localization, and
navigation.
Example: Kinect sensor for depth mapping.
2. Proximity Sensors:
Ultrasonic, infrared: Detect obstacles and measure distance.
3. Motion Sensors:
IMU (Inertial Measurement Unit): Tracks acceleration and orientation.
Encoders: Measure rotational movement of wheels or joints.
4. Environmental Sensors:
LIDAR: Creates detailed 3D maps of the environment.
GPS: For outdoor localization.
5. Tactile Sensors:
Pressure or touch sensors for physical interactions.
1.2 Actuators
Types of Actuators:
1. Motors:
DC motors: For wheels and lightweight movement.
Servo motors: Precise control of angular motion (e.g., robotic arms).
Stepper motors: For incremental movement and precision.
2. Linear Actuators:
Convert rotational motion into linear movement for sliding actions.
3. Pneumatic and Hydraulic Actuators:
Use compressed air or fluids for heavy-duty tasks (e.g., robotic cranes).
1.3 Power Sources
o Batteries: Lithium-ion or nickel-metal hydride (common for mobile robots).
o Solar Panels: For sustainable outdoor robots.
o Wired Power: For stationary robots in industrial setups.
1.4 Processors
The brain of the robot processes data from sensors and executes AI algorithms.
Types of Processors:
1. Microcontrollers: Simple tasks (e.g., Arduino).
2. Microprocessors: For advanced processing (e.g., Raspberry Pi).
3. Dedicated AI Chips:
GPUs: High-speed parallel computation for deep learning tasks (e.g., NVIDIA
Jetson).
TPUs: Specialized for neural network inference.
4. Edge Devices: Real-time processing on robots without relying on cloud computing.
1.5 Body and Structure
Materials:
o Lightweight: Aluminum, carbon fiber (for mobile robots).
o Heavy-duty: Steel (for industrial robots).
o Flexible: Plastic, rubber (for soft robotics).
Mechanisms:
o Wheels: For rolling motion.
o Tracks: For rough terrain.
o Legs: For walking or climbing.
1. Sensor Fusion:
o Combining data from multiple sensors (e.g., vision and LIDAR) for better decision-
making.
o Example: Autonomous cars use cameras and LIDAR for obstacle detection. (A simple sensor-fusion sketch appears after this list.)
2. Motion Control:
o AI algorithms optimize actuator usage for efficient movement.
o Example: Reinforcement learning for robotic arm manipulation.
3. Hardware Adaptation:
o AI can adapt robot behavior to hardware constraints (e.g., low battery or limited range).
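A toy illustration of sensor fusion (item 1 above), using a different but classic pairing: a complementary filter that blends a gyroscope's integrated rate with an accelerometer's tilt estimate. The readings, time step, and blending factor are invented for illustration.

# Complementary filter: fuse gyro and accelerometer estimates of a tilt angle.
gyro_rates = [0.0, 1.0, 1.2, 0.8, 0.0]     # angular velocity readings (deg/s), illustrative
accel_angles = [0.0, 0.9, 2.3, 3.1, 3.0]   # noisy absolute tilt readings (deg), illustrative
dt, alpha = 0.1, 0.98                      # time step (s) and gyro trust factor

angle = 0.0
for rate, accel_angle in zip(gyro_rates, accel_angles):
    gyro_angle = angle + rate * dt                            # integrate gyro (drifts over time)
    angle = alpha * gyro_angle + (1 - alpha) * accel_angle    # correct drift with the accelerometer
    print(round(angle, 3))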
4. Example: A Mobile Robot Platform
4.1 Components:
Sensors: LIDAR for mapping, IMU for motion tracking, and RGB camera for object detection.
Actuators: DC motors for wheels, servo motors for steering.
Power: Lithium-ion battery.
Processor: NVIDIA Jetson Nano for real-time AI inference.
4.2 Application:
A mobile robot with these components can map its environment with LIDAR, track its own
motion with the IMU, detect objects with the camera, and navigate autonomously to goal
locations using on-board AI inference.
Robotic Perception
Robotic perception in AI refers to the ability of robots to interpret and understand the world
around them through sensory inputs, typically using cameras, LIDAR, infrared sensors, and other
types of sensors. It involves processing raw data from the environment, such as visual, auditory,
tactile, or depth data, to build an understanding of the world that can then inform decision-
making, navigation, manipulation, and interaction with humans or objects.
1. Computer Vision: Robots use computer vision to analyze visual data (images or video)
to identify objects, people, or features in their environment. This involves tasks such as
object detection, segmentation, depth estimation, and scene understanding.
2. Sensor Fusion: Robotic systems often combine data from multiple sensors (e.g., visual,
auditory, force sensors) to create a more complete and accurate representation of the
environment. This is crucial for improving the robot's understanding of complex and
dynamic surroundings.
3. Simultaneous Localization and Mapping (SLAM): SLAM is a technique that allows
robots to build a map of an unknown environment while simultaneously keeping track of
their location within it. This is essential for autonomous navigation in unfamiliar settings.
4. Depth Perception: Robotic perception often relies on depth sensors (such as LIDAR or
stereo cameras) to understand the distance to objects, which is key for tasks like obstacle
avoidance and object manipulation. (A disparity-to-depth sketch appears after this list.)
5. Speech and Sound Processing: Some robots incorporate auditory perception to
recognize sounds or spoken commands, enhancing human-robot interaction capabilities.
6. Object and Gesture Recognition: Perception systems can be trained to recognize human
gestures, faces, or specific objects. This is often used in service robots and human-robot
collaborative environments.
7. Machine Learning for Perception: Deep learning and other machine learning
techniques are frequently used to improve the accuracy and efficiency of robotic
perception systems, enabling robots to learn from experience and adapt to new
environments.
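As a concrete illustration of depth perception (point 4 above), here is a minimal stereo sketch using OpenCV block matching; the file names and calibration values (focal length, baseline) are invented for illustration, and depth follows Z = f * B / d.

import cv2
import numpy as np

# Rectified stereo pair (file names are illustrative)
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block-matching disparity; OpenCV returns values scaled by 16
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Depth from disparity: Z = f * B / d
focal_px, baseline_m = 700.0, 0.12          # illustrative calibration values
depth = np.where(disparity > 0, focal_px * baseline_m / disparity, 0)
print("Median depth (m):", np.median(depth[depth > 0]))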
Planning to Move
Robot control in AI focuses on developing systems that enable robots to perform tasks
autonomously or with minimal human intervention.
Robot control integrates perception, decision-making, and actuation, enabling a robot to execute
planned actions based on its understanding of the environment and goals.
1. Motion Planning: This involves calculating a sequence of movements that the robot
must make to achieve a goal while avoiding obstacles. Algorithms like A*, RRT
(Rapidly-exploring Random Tree), and PRM (Probabilistic Roadmaps) are often used.
2. Control Systems: In robotics, control theory is used to manage the robot's movement and
maintain stability. Common control systems include PID (Proportional-Integral-
Derivative) controllers, adaptive control, and model predictive control (MPC), which is
used to plan and control robot trajectories. (A minimal PID sketch appears after this list.)
3. Inverse Kinematics: This is used to calculate the joint parameters (angles, positions)
needed for a robot’s end effector (like a hand or tool) to reach a desired position. Inverse
kinematics solutions are often used for robot arms or manipulators.
4. Autonomous Navigation: Robots need to plan and execute movements in dynamic
environments, often using SLAM (Simultaneous Localization and Mapping) for
localization and path planning. This involves not only navigating through static obstacles
but also reacting to moving objects.
5. Reinforcement Learning (RL): RL is increasingly used in robotics for learning optimal
policies based on rewards. Robots can learn to perform tasks through trial and error,
improving their control over time. In particular, RL can be applied to fine-tuning robot
movement and decision-making in uncertain environments.
6. Trajectory Optimization: This focuses on finding the best possible trajectory for the
robot to follow, optimizing for factors like energy efficiency, speed, or smoothness. It’s
commonly applied in scenarios such as robotic arms performing intricate tasks.
7. Human-Robot Interaction (HRI): For robots that collaborate with humans,
understanding and predicting human behavior is important. Control systems are designed
to enable robots to interact safely and effectively with people, adapting to human actions
in real-time.
8. Feedback Systems: Feedback loops allow the robot to continuously monitor its state and
adjust its actions. These systems are essential for maintaining control and stability,
especially in complex or uncertain environments.
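A minimal discrete PID controller sketch (topic 2 above); the gains, time step, setpoint, and the one-line plant model are invented for illustration.

class PID:
    """Discrete PID controller: u = Kp*e + Ki*sum(e*dt) + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a toy joint velocity toward a setpoint of 1.0
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.05)
velocity = 0.0
for _ in range(100):
    control = pid.update(setpoint=1.0, measurement=velocity)
    velocity += 0.05 * (control - velocity)   # crude first-order plant model, illustrative
print(round(velocity, 3))                     # should settle near the setpoint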
As you move into this area, developing expertise in these core topics will be crucial for
designing and controlling robots that can operate autonomously, adapt to changing
environments, and execute complex tasks efficiently.
Robotic Software Architectures
In robotics, the software architecture is the framework that governs how different components of
a robotic system interact with each other to perform tasks. Effective robotic software
architectures are essential for managing the complexity of robots and their interactions with the
environment. Several common software architectures in AI-driven robotics are used to design
modular, flexible, and scalable systems.
1. Robot Operating System (ROS)
Overview: ROS is one of the most popular frameworks for building robotic systems. It is
an open-source middleware that provides services such as hardware abstraction, device
drivers, communication, and process management. ROS helps integrate various software
components (e.g., perception, planning, and control) into a single platform.
Key Features:
o Modular Design: ROS encourages modularity, with nodes (processes)
performing specific functions like sensing or planning, which can communicate
with each other via messages.
o Communication: It uses a publisher-subscriber model for message passing, along
with a service-client model for request-response interactions.
o Tools: ROS provides tools like RViz for visualization, Gazebo for simulation, and
RQT for diagnostics and control.
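A minimal ROS (ROS 1) node in Python illustrating the publisher-subscriber model described above, assuming a working ROS installation with rospy; the node and topic names are chosen for illustration.

#!/usr/bin/env python
import rospy
from std_msgs.msg import String

# Publish a message on the "chatter" topic once per second
rospy.init_node("talker")
pub = rospy.Publisher("chatter", String, queue_size=10)
rate = rospy.Rate(1)   # 1 Hz

while not rospy.is_shutdown():
    pub.publish(String(data="hello from the talker node"))
    rate.sleep()

Any other node can subscribe to the same topic with rospy.Subscriber("chatter", String, callback), which is how perception, planning, and control nodes exchange data.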
2. Behavior-Based Architecture
4. Hybrid Architectures
5. Component-Based Architecture
6. ROS 2 (Robot Operating System 2)
Overview: ROS 2 is an evolution of ROS designed to meet the needs of more complex,
real-time robotic systems. It incorporates improvements like real-time communication,
better security, and enhanced support for multi-robot systems.
Key Features:
o Real-Time Performance: ROS 2 is designed to support real-time computing for
time-sensitive tasks (e.g., robotics in industrial applications).
o DDS (Data Distribution Service): It uses DDS as a communication layer, which
supports better scalability and real-time communication.
o Improved Security: It includes security features, ensuring safer deployment in
sensitive environments.
7. Actor-Based Architecture
8. Autonomous Vehicle Architectures
Overview: Software architecture for autonomous vehicles (self-driving cars, drones, etc.)
includes components for perception, decision-making, planning, and control, all
integrated into a robust system.
Key Features:
o Perception: Includes computer vision, LIDAR, and radar for environment
sensing.
o Localization: Uses GPS, SLAM, or other localization methods to track the
vehicle's position.
o Planning: Path planning algorithms ensure the vehicle reaches its destination
while avoiding obstacles and adhering to traffic rules.
o Control: Low-level control of the vehicle’s steering, throttle, and braking.
Application Domains
Robotics is a diverse and rapidly evolving field with numerous application domains across
industries. Below are some key areas where robotics plays a crucial role:
1. Industrial Automation
Manufacturing: Robots are used for tasks like assembly, welding, painting, packaging,
and material handling. Industrial robots, such as articulated arms, can work at high
precision, improving productivity and safety.
Inspection and Maintenance: Robots can autonomously inspect and maintain
equipment, such as in pipeline inspections, power plants, and hazardous environments.
2. Healthcare and Medical Robotics
Surgical Robotics: Robots like the da Vinci Surgical System assist surgeons in
performing precise and minimally invasive surgeries.
Rehabilitation Robotics: Robots help in physical therapy by aiding in recovery through
controlled movements for patients with injuries or disabilities.
Assistive Robots: Robots designed to help elderly or disabled people with daily tasks,
such as mobility aids, robotic prosthetics, or exoskeletons.
Medical Robotics for Diagnostics: Robots are also used in diagnostics, such as robotic
microscopes or systems for automated analysis of medical images.
3. Autonomous Vehicles
4. Service Robotics
Domestic Robots: Robots for household tasks such as vacuuming (e.g., Roomba), lawn
mowing, and window cleaning.
Hospitality Robots: Robots used in hotels or restaurants for tasks like room service,
guiding guests, or preparing food.
Retail Robots: Robots for tasks like shelf scanning, customer service, and inventory
management in stores.
5. Agricultural Robotics
Precision Farming: Robots are used to monitor crops, automate planting, and optimize
resource use (water, fertilizers, pesticides). Drones, autonomous tractors, and harvesters
are examples.
Weed Control: Robots can perform targeted herbicide spraying or use mechanical
methods to remove weeds, minimizing pesticide use and improving sustainability.
6. Defense and Security
Military Robots: Unmanned aerial vehicles (UAVs), underwater drones, and ground
robots are used in surveillance, bomb disposal, and reconnaissance missions.
Search and Rescue: Robots are deployed in disaster zones to search for survivors,
navigate through rubble, and carry out tasks that are too dangerous for humans.
7. Space Exploration
Mars Rovers: Robots like NASA’s Perseverance rover explore the surface of Mars,
conducting scientific experiments, analyzing soil, and sending back valuable data.
Satellite Maintenance: Robotic systems are used for satellite repairs, orbiting missions,
or asteroid exploration.
8. Logistics and Warehousing
Automated Guided Vehicles (AGVs): Robots are used to transport goods in warehouses
and factories, improving speed and reducing labor costs.
Sorting and Packaging: Robots are employed in sorting and packaging tasks, often in e-
commerce warehouses, to optimize efficiency and throughput.
9. Education and Research
Robotics in Education: Robots are used in STEM education to teach students about
engineering, coding, and problem-solving through hands-on learning experiences.
Robotic Research Platforms: Research labs utilize robots to explore new technologies,
such as swarm robotics, AI, and multi-agent systems.
Each of these domains leverages robotics and AI in unique ways, enhancing efficiency, safety,
and the ability to perform tasks beyond human capability or in environments that are unsafe or
inaccessible to humans.
Expert Systems in AI
Expert Systems in AI are computer programs that simulate the decision-making ability of a
human expert in a specific domain. These systems use knowledge and inference mechanisms to
solve complex problems that would typically require human expertise. Expert systems are one of
the earliest and most successful applications of AI and are used in various fields such as
medicine, engineering, finance, and troubleshooting.
Components of an Expert System
1. Knowledge Base: This is the core of an expert system, containing all the facts, rules, and
heuristics relevant to the problem domain. The knowledge is typically encoded in the
form of "if-then" rules (also called production rules) that represent the expertise of human
specialists.
2. Inference Engine: The inference engine is responsible for processing the knowledge
base and drawing conclusions or making decisions. It applies logical rules to the
knowledge base to infer new facts or solve problems. There are two primary types of
inference:
o Forward Chaining: Data-driven reasoning, where the system starts with known
facts and applies rules to derive new facts until a goal is reached.
o Backward Chaining: Goal-driven reasoning, where the system starts with a goal
and works backward through the rules to determine the facts that support that
goal. (A small forward-chaining sketch appears after this list.)
3. User Interface: The user interface is the part of the expert system that allows users to
interact with the system. It enables users to input data, ask questions, and receive answers
or suggestions from the system.
4. Explanation Facility: This component provides reasoning and justification for the
conclusions or recommendations made by the expert system. It helps users understand
how the system arrived at its decisions, making the system more transparent and
trustworthy.
5. Knowledge Acquisition Module: This module helps to build and maintain the
knowledge base. It can involve the process of extracting knowledge from human experts,
literature, or other sources, and converting that knowledge into a format usable by the
system.
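A toy forward-chaining sketch of the inference engine described above (component 2); the rules and facts are invented for illustration.

# Each rule: if all condition facts hold, conclude the "then" fact
rules = [
    ({"has_fever", "has_rash"}, "possible_measles"),
    ({"possible_measles", "not_vaccinated"}, "recommend_lab_test"),
]
facts = {"has_fever", "has_rash", "not_vaccinated"}

# Forward chaining: keep firing rules until no new fact can be derived
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)   # now also contains "possible_measles" and "recommend_lab_test"

Backward chaining would instead start from a goal such as "recommend_lab_test" and search for rules and facts that establish it.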
Types of Expert Systems
1. Rule-Based Expert Systems: These use "if-then" rules to make inferences. An example
is MYCIN, a medical expert system developed in the 1970s to diagnose bacterial
infections.
2. Frame-Based Expert Systems: These use frames, which are data structures that
represent knowledge about a particular concept. They are used to store information about
objects, events, or situations and can handle more complex relationships than rule-based
systems.
3. Case-Based Reasoning (CBR): Instead of using predefined rules, CBR systems solve
new problems by retrieving and adapting solutions from similar past cases stored in a
database.
4. Fuzzy Expert Systems: These expert systems work with fuzzy logic rather than binary
logic. They are used for problems that involve vague or imprecise data and can provide
more nuanced conclusions.
Applications of Expert Systems
1. Medical Diagnosis: Systems like MYCIN and its modern counterparts help doctors
diagnose diseases and recommend treatments based on patient symptoms and medical
history.
2. Troubleshooting: Expert systems are used in technical support to diagnose problems in
equipment or software. They guide users through a series of questions and suggest
solutions based on the answers.
3. Financial Analysis: Expert systems assist in evaluating financial data, providing
recommendations for investment or risk management.
4. Engineering Design: In industries such as aerospace and automotive, expert systems
help engineers design components or systems based on a set of constraints and goals.
5. Legal Advice: Some expert systems help provide legal advice by evaluating cases based
on legal rules and precedents.
6. Process Control: In industrial settings, expert systems can monitor and control complex
processes such as manufacturing or chemical processing, ensuring optimal performance.
Limitations of Expert Systems
Lack of Flexibility: Expert systems are often domain-specific and lack the flexibility to
adapt to problems outside their knowledge base.
Knowledge Acquisition: Building and maintaining the knowledge base is challenging
and time-consuming, as it requires the expertise of human specialists.
Dependence on Experts: The quality of an expert system depends on the quality of the
knowledge and rules it is provided. If the knowledge base is incomplete or outdated, the
system’s performance will degrade.
Limited Learning Ability: Traditional expert systems do not learn or adapt over time,
although newer systems may incorporate machine learning techniques to improve their
performance.
Conversational AI
Conversational AI refers to systems that interact with users in natural language, through text or
speech, using techniques such as natural language processing (NLP), speech recognition, and
dialogue management. Common types of conversational systems include:
1. Chatbots:
o Simple text-based AI that can perform tasks like answering customer service
questions, making reservations, or providing product recommendations.
o Rule-Based Chatbots: Follow predefined scripts or decision trees. They provide
fixed responses based on specific input. (A toy example appears after this list.)
o AI-Based Chatbots: Use NLP and machine learning to understand and respond
to a wider range of queries more dynamically.
2. Virtual Assistants:
o More advanced than chatbots, virtual assistants (e.g., Siri, Alexa, Google
Assistant) can carry out a variety of tasks like setting reminders, answering
questions, playing music, and controlling smart devices.
o They often use dialogue management to handle multi-turn conversations and
integrate with various services and applications.
3. Voice Assistants:
o Focus primarily on voice interactions and are often used in smart devices such as
smartphones, smart speakers, or in-car systems.
o Voice assistants are built on top of automatic speech recognition (ASR) and
text-to-speech (TTS) technologies.
4. Conversational Agents for Customer Support:
o These systems help businesses automate customer support, solving problems
through self-service.
o They are integrated into websites, mobile apps, and social media platforms.
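A toy rule-based chatbot of the kind described in item 1, matching keywords to canned responses; the keywords and replies are invented for illustration.

# Keyword -> canned response, as in a simple scripted chatbot
responses = {
    "hours": "We are open 9 am to 6 pm, Monday to Saturday.",
    "refund": "Refunds are processed within 5-7 business days.",
    "hello": "Hi! How can I help you today?",
}

def reply(message):
    text = message.lower()
    for keyword, response in responses.items():
        if keyword in text:
            return response
    return "Sorry, I did not understand that. Could you rephrase?"

print(reply("What are your opening hours?"))

AI-based chatbots replace this keyword lookup with NLP models that classify the user's intent and generate or select a response dynamically.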
Key Advances in Conversational AI
1. Transformers:
o The introduction of transformer-based models, such as GPT (Generative
Pretrained Transformer), BERT (Bidirectional Encoder Representations
from Transformers), and T5 (Text-to-Text Transfer Transformer), has
revolutionized conversational AI by providing more powerful, flexible models
capable of handling complex language tasks with high accuracy.
2. Pre-trained Language Models:
o Models like GPT-4 or BERT have been trained on massive amounts of data and
can perform various language tasks out-of-the-box, significantly reducing the
need for task-specific training. (A short usage sketch appears after this list.)
3. Multi-modal AI:
o Some conversational systems can process multiple types of input, such as voice,
text, and images. These systems can have applications in customer service, virtual
shopping assistants, or even medical consultations.
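A minimal sketch of using a pre-trained language model (item 2 above), assuming the Hugging Face transformers library is installed; GPT-2 is used here only as a small, freely available example model.

from transformers import pipeline

# Load a small pre-trained language model for text generation
generator = pipeline("text-generation", model="gpt2")

# Generate a continuation of a prompt
result = generator("A conversational AI system is", max_length=40, num_return_sequences=1)
print(result[0]["generated_text"])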
Generative AI
Generative AI refers to models that learn patterns from existing data and use them to create new
content such as text, images, audio, and video. Key concepts include:
1. Generative Models:
o These are machine learning models that learn the underlying distribution of a
dataset and use that to generate new samples from the same distribution.
o The goal of generative models is not just to predict an output but to create entirely
new, realistic data points that resemble the original data.
2. Types of Generative Models:
o Generative Adversarial Networks (GANs): GANs consist of two neural
networks – a generator and a discriminator – that work together in a competitive
process. The generator creates fake data, while the discriminator tries to
distinguish between real and generated data. Over time, the generator improves,
producing increasingly realistic data.
Example: GANs can generate realistic images of people or objects. (A compact training sketch appears at the end of this list.)
o Variational Autoencoders (VAEs): VAEs are probabilistic models that encode
input data into a lower-dimensional representation and then decode it back to its
original form. This process helps in generating new data points by sampling from
the latent space.
Example: VAEs are used in generating new images or reconstructing data.
o Autoregressive Models: These models generate data one element at a time,
conditioned on previously generated elements. Examples include language
models like GPT (Generative Pretrained Transformer) for text generation and
PixelCNN for image generation.
o Normalizing Flows: A method that transforms simple distributions (like
Gaussian) into complex ones through a series of invertible functions, allowing for
the generation of data samples.
3. Training Generative AI Models:
o Generative AI models are typically trained on large datasets containing examples
of the type of data the model is expected to generate (e.g., images, text).
o The models learn the probability distribution of the training data and then sample
from that distribution to generate new data points.
o Unsupervised Learning: Generative models often rely on unsupervised learning,
where the model learns from data without requiring labeled output. For example,
in image generation, the model learns the structure and content of the images but
doesn’t need explicit labels.
4. Latent Space:
o In many generative models, particularly VAEs and GANs, data is compressed
into a lower-dimensional latent space. This space captures the essential features
of the data, which can then be sampled to generate new data points.
o Manipulating points in the latent space can lead to new, often creative, variations
of the generated data.
5. Generative AI Architectures:
o Transformer Models: These models, especially those like GPT-3 and GPT-4,
have revolutionized text generation. By using self-attention mechanisms,
transformers are able to understand and generate complex, coherent text.
o Convolutional Neural Networks (CNNs) for Image Generation: CNNs are
often used as part of generative models, especially in GANs, to generate high-
quality images by learning spatial hierarchies of data.
o Recurrent Neural Networks (RNNs): RNNs, and particularly Long Short-Term
Memory (LSTM) networks, are sometimes used in generative models for tasks
like text or music generation due to their ability to handle sequential data.
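A compact PyTorch sketch of the GAN idea described above: a generator and a discriminator trained adversarially on a toy 2-D distribution. The network sizes, data, and hyperparameters are invented for illustration.

import torch
import torch.nn as nn

# Generator: maps 16-D random noise to fake 2-D "data" samples
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
# Discriminator: scores samples as real (close to 1) or fake (close to 0)
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(200):
    real = torch.randn(64, 2) * 0.5 + 2.0        # toy "real" distribution
    noise = torch.randn(64, 16)

    # 1) Train the discriminator to separate real from generated samples
    fake = G(noise).detach()
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + loss_fn(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator
    fake = G(noise)
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("final generator loss:", round(g_loss.item(), 3))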
Applications of Generative AI
1. Content Creation:
o Text Generation: Models like GPT-3 and GPT-4 are capable of generating
highly coherent, human-like text. These models can be used for writing articles,
generating poetry, creating code, composing emails, or answering questions.
o Image Generation: GANs, such as StyleGAN, can generate realistic images of
people, objects, or scenes. These models are widely used in entertainment,
fashion, and advertising.
o Music Generation: AI models can generate original music compositions, often
by learning from patterns in large datasets of existing music. These are used for
creating soundtracks, background music, or even helping musicians with
composition.
o Video Generation: Generative AI is starting to be used for video generation and
manipulation, such as creating realistic deepfake videos or generating animated
scenes.
2. Healthcare:
o Drug Discovery: Generative models can design novel molecules with specific
properties for use in drug development, speeding up the discovery process.
o Medical Imaging: AI can generate synthetic medical images, helping to augment
datasets for training other AI models or to generate images for research purposes
where privacy concerns exist.
3. Gaming and Virtual Worlds:
o Procedural Content Generation: In gaming, generative models are used to
create vast virtual worlds, landscapes, characters, and storylines automatically,
which makes the development process faster and more scalable.
o Character Design: Generative models can be used to create lifelike avatars or
characters for games and virtual environments.
4. Personalized Content:
o Recommendations: Generative models can be used in recommendation systems,
where the AI generates personalized suggestions based on user behavior and
preferences.
o Ad Generation: Generative models are also used to create personalized
advertisements, tailoring the content to the interests of individual users.
5. Art and Design:
o Digital Art: Artists and designers use generative AI to create unique artwork,
explore creative concepts, and design products or graphics that would be
challenging to conceptualize manually.
o Fashion: AI can generate new fashion designs, propose color schemes, and even
predict fashion trends by learning from large datasets of past designs.