PDF Unit-5 (Full Unit)
Fundamentals
Intelligent Agents
• The stepping-stone crossing challenge described earlier illustrates the intelligent agent
approach that underpins reinforcement learning. We can consider the scout to be an
intelligent agent (or simply agent) attempting to complete a task within an
environment. The goal of the agent is to complete the task as successfully as possible.
Each attempt at the task is referred to as an episode. At any point in time, t, the agent
observes the current state of its environment, o_t; considers these observations to select
an action, a_t; and takes this action, receiving immediate feedback, r_t, from the
environment about whether this was a good or bad action to take. We use r_t to refer to
feedback because in reinforcement learning feedback is more commonly referred to as
reward (where reward can be either positive or negative). This gives a series of
discrete steps that make up an episode:

o_1, a_1, r_1, o_2, a_2, r_2, ..., o_e, a_e, r_e    (11.1)
• where the episode proceeds through time-steps t = 1, ..., e. At each time-step the
agent makes an observation, o_t, of the environment, takes an action, a_t, and receives a
reward, r_t, based on that action. This cycle is illustrated in Figure 11.1; a minimal code
sketch of the same loop is given below.
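To make the cycle in Figure 11.1 concrete, here is a minimal Python sketch of the observe-act-reward loop. The ToyEnvironment and the random action-selection function are illustrative stand-ins, not part of the original example.

```python
import random

# Minimal sketch of the observe-act-reward cycle described above.
# The toy environment and the random action selector are illustrative only.

class ToyEnvironment:
    """An environment whose episodes end after a fixed number of steps."""
    def __init__(self, length=10):
        self.length = length
        self.t = 0

    def observe(self):
        return self.t                      # o_t: here simply the step index

    def step(self, action):
        self.t += 1
        return random.choice([-1.0, 1.0])  # r_t: immediate reward (positive or negative)

    def done(self):
        return self.t >= self.length


def run_episode(env, choose_action):
    history = []                           # H_t: the (o_t, a_t, r_t) triples so far
    while not env.done():
        o_t = env.observe()                # observation at time-step t
        a_t = choose_action(o_t)           # action selected from the observation
        r_t = env.step(a_t)                # reward returned by the environment
        history.append((o_t, a_t, r_t))
    return history


history = run_episode(ToyEnvironment(), lambda o: random.choice(["left", "right"]))
```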
The sequence of observations, actions, and rewards that precede any time-step, t, is
referred to as a history, H_t. The job of the agent in the environment is to make decisions
at each time-step, t, about what action to take next on the basis of its current
observations of the environment, o_t, and the history, H_t. Maintaining long histories of
actions, rewards, and observations (which are possibly only very slightly different from
one iteration to the next) is not a very efficient way to reason about the world,
particularly as episodes might cover hundreds or thousands of time-steps.
Instead, we collapse this information into a single representation, referred to as a state.
• The state at time-step t, s_t, should contain all the important information about the
environment at that time-step, any important information about what has been
happening in the environment at preceding time-steps, and any important
information about the internal composition of the agent. For example, for a robot
deployed within a hospital to deliver equipment to operating theaters, the state
might include the robot’s position in the environment, the positions of people
nearby, whether the robot is on the way to collect items or to deliver them, and the
current levels of the robot’s batteries. In Figure 11.1 we show how the observations
made about the environment at time-step t are converted into a state, s_t, using a state
generation function. In many cases, if the environment is fully observable, this
function is a simple identity function because the observation fully defines the state.
It is also possible, however, for this function to be more elaborate when the
observations over multiple time-steps are accumulated into a state. Using states
instead of observations, Equation (11.1) can be restated as

s_1, a_1, r_1, s_2, a_2, r_2, ..., s_e, a_e, r_e

A sketch of both kinds of state generation function is given below.
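The two cases just described can be sketched as follows; the function and class names are illustrative, not from the text. In the fully observable case the state generation function is the identity, while the more elaborate case folds the last k observations into the state.

```python
from collections import deque

def identity_state(observation):
    """Fully observable case: the observation itself is the state."""
    return observation

class AccumulatingState:
    """Partially observable case: fold the last k observations into one state."""
    def __init__(self, k=4):
        self.window = deque(maxlen=k)

    def __call__(self, observation):
        self.window.append(observation)
        return tuple(self.window)   # s_t built from the k most recent observations

state_fn = AccumulatingState(k=3)
print(state_fn("o1"), state_fn("o2"), state_fn("o3"), state_fn("o4"))
# ('o1',) ('o1', 'o2') ('o1', 'o2', 'o3') ('o2', 'o3', 'o4')
```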
• That intelligent behavior can be driven by the singular goal of maximizing return is
a bold statement; it is often argued that it is very ambitious to expect sophisticated,
long-term behavior to emerge from the simple accumulation of instantaneous rewards.
Reward is often delayed, and the real value of an action is not reflected immediately
but rather by the fact that the action takes us toward a later state that will ultimately
allow the agent to earn a reward. For example, early moves in a game of chess do not
lead to large positive rewards but set the ground for later high-reward moves.
Rewards can also often be somewhat contradictory, and an action that gives an
immediate positive reward may turn out to be a bad one in the longer term. For
example, eating cake almost always seems like a good idea in the moment, but in
terms of long-term health it is probably not always a strong choice. It has been shown
repeatedly, however, that it is in fact possible to learn sophisticated, long-term
behaviors using the maximization of cumulative reward alone (a simple accumulation
of rewards is sketched below). This introduces the
second art of reinforcement learning: the design of effective reward functions.
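As a tiny illustration of "accumulation of instantaneous rewards", the return for an episode is simply the sum of the rewards received at each time-step; the numbers below are made up.

```python
# Sketch: the return for an episode is the accumulation of the instantaneous
# rewards received at each time-step (rewards may be negative).
def episode_return(rewards):
    return sum(rewards)

# An action with a low or negative immediate reward can still belong to the
# best episode overall, e.g. a sacrifice early in a chess game:
print(episode_return([-1.0, -0.5, 0.0, +10.0]))   # 8.5
```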
• The policy can be thought of as a simple lookup table that records the action that should
be taken in every state, and reinforcement learning problems can be framed as an effort to
learn this table directly. Policies can also be encoded as a rule used to choose an action
from those available in a particular state, and this is the approach we focus on in this
chapter. For example, we might use a greedy action selection policy that says the agent
should always take the action that will give it the highest immediate reward. This would,
however, ignore the fact that sometimes reward is delayed and that taking an action that
gives a low immediate reward can be a good idea if it leads the agent to a state that could
give it large positive rewards later on. This suggests the need for a more sophisticated measure
of the value of taking an action in a given state and leads to the final fundamental
component of a reinforcement learning agent: a value function. A sketch of both views of a
policy is given below.
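The two views of a policy described above can be sketched as follows. The stepping-stone state names and the immediate_reward function are hypothetical placeholders, not from the text.

```python
# (a) Policy as a lookup table: state -> action. All entries are made up.
policy_table = {
    "start_bank": "step_to_stone_1",
    "stone_1":    "step_to_stone_2",
    "stone_2":    "step_to_far_bank",
}

def table_policy(state):
    return policy_table[state]

# (b) Policy as a rule: greedy selection over the estimated immediate reward
#     of each available action (which, as noted above, ignores delayed reward).
def greedy_policy(state, available_actions, immediate_reward):
    return max(available_actions, key=lambda a: immediate_reward(state, a))
```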
Markov Decision Processes
• Markov decision processes (MDPs) are an attractive mathematical framework
within which to reason about decision-making scenarios in which outcomes are
partly under the control of a decision maker but also partly random. This has made
them an attractive framework for applications ranging from financial modeling to
robot control to modeling the flow of human conversation. This also makes them
ideal for reasoning about reinforcement learning.
• A Markov process, a more basic framework than an MDP that does not include
decision making, can be used to model a discrete random process that transitions
through a finite set of states, S. For example, we could use a Markov process to
model how infection progresses in an individual when a disease epidemic breaks
out. Individuals can belong to one of three states: SUSCEPTIBLE, INFECTED, or
RECOVERED (these are often referred to as S-I-R models). An individual can
belong to only one of these states at a time and moves between them according to a
Markov process. Figure 11.2(a) shows these states and how an individual can move
between them
• Markov processes are built on the Markov assumption that the probability of
transitioning to a particular state at the next time-step relies only on the current
state, and does not require any knowledge of the history of states that came before
it, or

P(S_{t+1} | S_t, S_{t-1}, ..., S_1) = P(S_{t+1} | S_t)

• where S_t and S_{t+1} are random variables to which the states at times t and t+1 are
assigned.
• The full dynamics of a Markov process can be captured in a transition matrix, in which
each row gives the probability of moving from one state to every other state. A sketch of
such a matrix for the S-I-R example is given below.
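The following sketch uses the S-I-R states named above; the transition probabilities are illustrative assumptions, as the text does not give actual values. Each row gives P(next state | current state) and sums to 1.

```python
import random

states = ["SUSCEPTIBLE", "INFECTED", "RECOVERED"]

# Illustrative transition matrix (made-up probabilities).
transition_matrix = {
    "SUSCEPTIBLE": {"SUSCEPTIBLE": 0.90, "INFECTED": 0.10, "RECOVERED": 0.00},
    "INFECTED":    {"SUSCEPTIBLE": 0.00, "INFECTED": 0.70, "RECOVERED": 0.30},
    "RECOVERED":   {"SUSCEPTIBLE": 0.00, "INFECTED": 0.00, "RECOVERED": 1.00},
}

def next_state(current):
    # The Markov assumption: the next state depends only on the current state.
    row = transition_matrix[current]
    return random.choices(list(row.keys()), weights=list(row.values()))[0]

# Simulate one individual for 20 time-steps.
s = "SUSCEPTIBLE"
trajectory = [s]
for _ in range(20):
    s = next_state(s)
    trajectory.append(s)
print(trajectory)
```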
• At each step the agent’s experience, made up of the current state, the action taken, the
reward received, and the resulting next state, is added to a replay memory, D. After taking
the action, instead of performing a single step of stochastic gradient descent, the agent
selects a random sample of b instances from the replay memory and performs an iteration
of mini-batch gradient descent using this sample as the mini-batch. The target feature
values for the instances in the mini-batch are generated as described in the naive neural
Q-learning algorithm. This means that the training process uses its experience of the
environment much more efficiently, because each step is used in network training multiple
times. Furthermore, the correlations between consecutive instances are broken because
mini-batches are randomly selected from the replay memory. The replay memory is given a
maximum size, N (usually greater than 10,000), and when it reaches this size the oldest
instances are dropped as new ones are added. Figure 11.9 illustrates this process; a sketch
of a simple replay memory follows.
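A sketch of a replay memory along the lines described above; the class and method names are illustrative, not taken from Algorithm 16.

```python
import random
from collections import deque

class ReplayMemory:
    def __init__(self, max_size=10_000):      # N: the maximum size
        self.memory = deque(maxlen=max_size)  # oldest instances dropped automatically

    def add(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, b):
        # Random sampling breaks the correlations between consecutive instances.
        return random.sample(self.memory, b)

    def __len__(self):
        return len(self.memory)

# During training, each step's experience is added to the memory and a random
# mini-batch of b instances is drawn for one iteration of mini-batch gradient descent:
#     memory.add(s_t, a_t, r, s_next, done)
#     if len(memory) >= b:
#         batch = memory.sample(b)
#         ...one gradient-descent step on the batch...
```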
• In the naive approach described, the network being trained is also being used to
generate target feature values. This can cause the network training process to
become unstable, as small changes in the outputs of the action-value network can
lead to sudden changes in the policy when a different action is suddenly favored in a
particular type of state. Target network freezing is used to address this. Two different
networks are used in the training process: an action-value behavior network that is
used to predict the values of actions for making decisions, and an action-value target
network that is used to predict the value of taking subsequent actions in subsequent
states when generating target feature values. The action-value target network is
frozen and not updated at each iteration of the algorithm. It does, however, need to
be updated occasionally, because otherwise the estimated values used in the loss
function will be inaccurate. Therefore, after every C steps the current action-value
target network is replaced with a copy of the action-value behavior network. This is
also illustrated in Figure 11.9. Target network freezing makes the training process
more stable and leads to faster convergence. A pseudocode description of the deep
Q network algorithm is given in Algorithm 16; a short sketch of the freezing
mechanism follows.
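A sketch of target network freezing, assuming PyTorch and an arbitrary small network (the 8-input, 4-output sizes and the value of C are placeholders). Every C steps the target network is overwritten with a copy of the behavior network; it is never trained directly.

```python
import copy
import torch.nn as nn

# Behavior network: used to predict action values when making decisions.
q_behavior = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))

# Target network: a frozen copy used only to generate target feature values.
q_target = copy.deepcopy(q_behavior)
for p in q_target.parameters():
    p.requires_grad = False

C = 10_000  # steps between target-network updates (placeholder value)

def maybe_update_target(step):
    # Every C steps, replace the target network with a copy of the behavior network.
    if step > 0 and step % C == 0:
        q_target.load_state_dict(q_behavior.state_dict())
```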
• The deep Q network algorithm can be used with any state representation that can be
input into a neural network, and can use different neural network architectures. The
simplest version of this would be a numeric state vector input into a multi-layer
perceptron feedforward network. The algorithm was first proposed, however, as an
approach to playing video games in which the only inputs were screenshots of the
game. To best handle image inputs, a convolutional neural network was used. A
single screenshot of a game does not contain sufficient information about the state of
an environment and an agent for the environment to be considered fully observable,
and so the Markov assumption does not hold. For example, in the single screenshot
of the Lunar Lander environment in Figure 11.7, it is not possible to tell at what
velocity the spaceship is moving. To overcome this, sequences of the last k
screenshots stacked together can be used as the state representation.
• This is an example of using a state generation function. Usually small stacks of
screenshots (e.g., k = 4) provide enough information to capture the state. It is difficult
to provide a detailed worked example of the DQN algorithm because the number of
weights to be learned and the number of steps required for anything interesting are too
large for clear presentation. Instead, to illustrate the DQN algorithm, we will examine at
a higher level how an automated player of the Lunar Lander game can be trained. As
mentioned before, this game has four actions available to the agent: None, Up, Left,
and Right. The state can be represented as a stack of the last 4 frames in the game. This
is illustrated in Figure 11.9; a sketch of such a frame-stacking state generation function
follows.
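A sketch of a frame-stacking state generation function. The FrameStack class is illustrative; k = 4 and the 84 × 84 frame size are taken from the description in this section.

```python
import numpy as np
from collections import deque

class FrameStack:
    """State generation function: the state is the stack of the last k screenshots."""
    def __init__(self, k=4, frame_shape=(84, 84)):
        self.k = k
        self.frames = deque(maxlen=k)
        self.frame_shape = frame_shape

    def reset(self, first_frame):
        # At the start of an episode the stack is filled with the first frame.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.state()

    def add(self, frame):
        self.frames.append(frame)
        return self.state()

    def state(self):
        return np.stack(self.frames, axis=0)   # shape (k, 84, 84)

# Stacking consecutive frames recovers information, such as velocity,
# that a single screenshot cannot provide.
stack = FrameStack()
s = stack.reset(np.zeros((84, 84), dtype=np.float32))
s = stack.add(np.ones((84, 84), dtype=np.float32))
print(s.shape)   # (4, 84, 84)
```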
There are two ways that an episode can end: an agent can either land successfully
or crash. The agent earns a reward of +100 for landing successfully and a reward of -100
for crashing. During landing the agent receives a reward of +10 each time one of its legs
touches the ground gently. For every step that the agent is firing one of its thrusters it
receives a reward of -0.3. A sketch of this reward structure is given below.
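A sketch of this reward structure as a function; the boolean event flags are hypothetical names for the events described above, not an actual Lunar Lander API.

```python
def reward(landed, crashed, legs_touching_gently, thruster_firing):
    r = 0.0
    if landed:
        r += 100.0                       # successful landing
    if crashed:
        r -= 100.0                       # crash ends the episode with a large penalty
    r += 10.0 * legs_touching_gently     # +10 per leg touching the ground gently
    if thruster_firing:
        r -= 0.3                         # small cost for every step a thruster fires
    return r
```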
A convolutional neural network was used as the action-value network. Input images
were scaled to 84 × 84, and the network contained hidden convolutional layers with 32, 64,
and 64 units. Filter sizes were 8 × 8 (stride 4), 4 × 4 (stride 3), and 3 × 3 (stride 1).
Rectified linear activation functions were used in all hidden layer units. A final hidden
layer flattened the outputs of the previous convolutional layer and contained 512 fully
connected units with rectified linear activations. The output layer was a fully connected
layer with 4 outputs (one per action) using linear activations. Figure 11.9 illustrates
this architecture; a sketch of it in code is given below. The behavior policy used was
ε-greedy, but linear annealing was also used. Linear annealing allows the value of ε used
in the ε-greedy policy to change over time. At the beginning, a large value (ε = 0.9) is
used, and this slowly moves down toward a small value (ε = 0.05). During DQN training
the size of the replay memory was 50,000 and the target action-value network was
replaced every 10,000 steps.
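A sketch of the described action-value network and the linear ε-annealing schedule, assuming PyTorch. The layer sizes, filter sizes, and strides follow the text above; the number of annealing steps is an illustrative assumption, as the text does not specify it.

```python
import torch.nn as nn

# Input: a stack of four 84 x 84 frames; output: one value per action.
q_network = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=3), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
    nn.Flatten(),
    nn.LazyLinear(512), nn.ReLU(),   # flattened, fully connected hidden layer
    nn.Linear(512, 4),               # linear outputs: one per action
)

# Linear annealing of epsilon for the epsilon-greedy behavior policy:
# epsilon starts at 0.9 and decays linearly toward 0.05.
def epsilon(step, start=0.9, end=0.05, anneal_steps=100_000):
    # anneal_steps is an assumed value; the text does not give one.
    fraction = min(step / anneal_steps, 1.0)
    return start + fraction * (end - start)
```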