Applications of Game Theory in Deep Learning

SpringerBriefs in Computer Science
Tanmoy Hazra · Kushal Anjaria · Aditi Bajpai · Akshara Kumari
SpringerBriefs present concise summaries of cutting-edge research and practical
applications across a wide spectrum of fields. Featuring compact volumes of 50 to
125 pages, the series covers a range of content from professional to academic.
Typical topics might include:
• A timely report of state-of-the-art analytical techniques
• A bridge between new research results, as published in journal articles, and a
contextual literature review
• A snapshot of a hot or emerging topic
• An in-depth case study or clinical example
• A presentation of core concepts that students must understand in order to make
independent contributions
Briefs allow authors to present their ideas and readers to absorb them with minimal
time investment. Briefs will be published as part of Springer’s eBook collection,
with millions of users worldwide. In addition, Briefs will be available for individual
print and electronic purchase. Briefs are characterized by fast, global electronic
dissemination, standard publishing contracts, easy-to-use manuscript preparation
and formatting guidelines, and expedited production schedules. We aim for
publication 8–12 weeks after acceptance. Both solicited and unsolicited manuscripts
are considered for publication in this series.
**Indexing: This series is indexed in Scopus, Ei-Compendex, and zbMATH**

Tanmoy Hazra
Department of Artificial Intelligence
Sardar Vallabhbhai National Institute of Technology (SVNIT)
Surat, Gujarat, India
Kushal Anjaria
Department of IT and Systems
Institute of Rural Management Anand
Anand, Gujarat, India
Aditi Bajpai
Department of Computer Science and Engineering
National Institute of Technology (NIT)
Raipur, Chhattisgarh, India
Akshara Kumari
Department of Electronics and Communication Engineering
Indian Institute of Information Technology (IIIT)
Pune, Maharashtra, India
ISSN 2191-5768 ISSN 2191-5776 (electronic)

ISBN 978-3-031-54652-5 ISBN 978-3-031-54653-2 (eBook)
https://doi.org/10.1007/978-3-031-54653-2
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Paper in this product is recyclable.

Foreword
In an age of ever-increasing complexity and innovation, the intersection of game
theory and deep learning stands as a beacon of intellectual exploration. This book is
a testament to the strong bond of two profound disciplines, a fusion that promises to
reshape our understanding of decision-making, optimization, and the dynamics of
artificial intelligence.
Game theory, originally developed as a framework for understanding strategic
interactions, has found new life in the context of deep learning. It provides the tools
to model and analyze the complex interactions that occur among agents, whether
they are humans, autonomous systems, or algorithms. Deep learning, on the other
hand, has ignited a revolution in machine intelligence, enabling computers to learn
from data, make predictions, and adapt to changing environments. When these two
worlds collide, the result is a profound synergy that opens up a wealth of
possibilities.
This volume takes you on a journey through this intriguing landscape, revealing
the myriad applications and implications of merging game theory and deep learning.
From multi-agent reinforcement learning to the design of robust AI systems,
the chapters herein illuminate the transformative potential of this synergy. Whether
you are a seasoned researcher, an aspiring practitioner, or simply an intellectual
adventurer, this book offers a roadmap to unravel the mysteries of strategic AI and
its real-world consequences.
As you delve into the following pages, you will witness how game theory
empowers deep learning to tackle the challenges of decision-making in adversarial
settings, resource allocation, and strategic cooperation. You will see how it unveils
new dimensions of fairness and ethics in AI, and how it is poised to shape the future
of industries, from finance to healthcare.
The authors, each a luminary in their respective fields, guide you through these
fascinating terrains, sharing their insights, discoveries, and visions. Their work is a
testament to the power of interdisciplinary collaboration, forging a path toward a
deeper understanding of the potential, as well as the ethical and societal
considerations, of AI in a world where strategic interactions abound.
As you embark on this intellectual adventure, keep in mind that the pages you are
about to turn represent not just a compendium of knowledge but a gateway to a
future where game theory and deep learning will undoubtedly leave an indelible
mark. May this journey ignite your curiosity, inspire your own explorations, and
lead to a richer comprehension of the applications of game theory in deep learning.
Anupam Shukla
Surat, Gujarat, India
Preface
Welcome to a fascinating journey that explores the symbiotic relationship between
two of the most significant pillars of artificial intelligence: game theory and deep
learning. The pages that follow will introduce you to a world where strategic
thinking and machine learning converge, paving a path toward a future in which
intelligent systems can make cognitive decisions.
The ability of game theory to model the strategic interactions of rational agents
has long intrigued scholars and policymakers. From economics to biology, game
theory has proven to be an invaluable tool for comprehending and predicting
outcomes in situations where individuals must balance their own interests with the
interests of others. However, in our digital age, where algorithms and autonomous
agents are becoming more common, the combination of game theory and deep
learning has opened up a new frontier of exploration.
Deep learning, fueled by neural networks, has revolutionized the way computers
perceive, evaluate, and acquire knowledge from data. It has driven advances in
image and speech recognition, natural language processing, and autonomous
control systems. The combination of these two disciplines opens up new and
exciting avenues. We observe how artificial agents can think strategically, adapt to
ever-shifting environments, and make decisions that are consistent with their goals
and the dynamics of their surroundings.
You will embark on a journey of discovery in the following chapters. We have
assembled a panel of leading experts and researchers to share their perspectives and
findings on the numerous applications of game theory in deep learning. You will
learn how this combination can be utilized to improve the robustness and fairness of
AI systems, facilitate collaborative behavior among intelligent agents, and address
security, privacy, and ethical decision-making challenges.
We invite you to immerse yourself in this book’s case studies, methodologies,
and real-world applications. We hope that this book will serve as a valuable resource
for your understanding and exploration of this flourishing field, whether you are a
seasoned professional, a curious student, or an industry practitioner.
We encourage you to think critically about the societal and ethical implications
of these technologies as you delve deeper into the intersection of game theory and
deep learning. The ability to model, predict, and influence strategic interactions
entails a great deal of responsibility. We can collectively shape the future of AI for
the betterment of all by understanding its potentials and pitfalls.
We extend our gratitude for joining us on this intellectual journey. This book’s
contents are a testament to the extraordinary possibilities that emerge when human
ingenuity, mathematics, and computational power come together. We hope you find
inspiration and insight in these pages, and that this book piques your interest in
further exploring the applications of game theory in deep learning.
Surat, Gujarat, India Tanmoy Hazra

Anand, Gujarat, India Kushal Anjaria
Raipur, Chhattisgarh, India Aditi Bajpai
Pune, Maharashtra, India Akshara Kumari
Acknowledgment
Writing a book is a labor of love that often involves the contributions, support, and
encouragement of many individuals. We are deeply grateful to those who have been
instrumental in the creation of this work, and we would like to express our sincere
appreciation.
First and foremost, we want to extend our heartfelt gratitude to our families for
their unwavering support and understanding throughout this endeavor. Your
encouragement and patience have been our pillars of strength.
We wish to acknowledge the invaluable guidance and motivation provided by
Prof. Anupam Shukla, whose wisdom and expertise have been instrumental in
shaping the ideas presented in this book. Your insights have been a guiding light on
this intellectual journey.
We are also indebted to the numerous researchers and experts who have
generously shared their knowledge and perspectives, which have enriched the content
of this book. Your contributions are a testament to the collaborative spirit of the
academic and professional communities.
We would like to express our appreciation to the team at Springer for their
dedication and professionalism in bringing this book to fruition. Your support
throughout the publication process has been exemplary.
We are grateful to our colleagues and friends who have provided feedback,
encouragement, and a sense of camaraderie during the writing process. Your insights
and companionship have been a source of inspiration.
Last but not least, we want to thank the readers of this book. Your interest in the
subject matter and willingness to explore these ideas are what make the effort of
writing a book truly worthwhile.
This book would not have been possible without the collective efforts of all those
mentioned and the countless others who have contributed in various ways. Thank
you for being a part of this journey.
Contents
1 Introduction
1.1 Basics of Game Theory
1.2 Introduction to Deep Learning
1.3 Game Theory in Deep Learning
1.4 Chapter Overview
1.5 Cooperative Game Theory
1.6 Noncooperative Game Theory
1.7 Application of Game Theory in Deep Learning
1.8 Case Studies and Different Applications
1.9 Conclusion and Future Research Directions
References
2 Cooperative Game Theory
2.1 Introduction
2.2 Cooperative Game Theory
2.2.1 Coalitional Games
2.2.2 Stability
2.2.3 Core
2.2.4 Epsilon Core
2.2.5 Fairness
2.2.6 Nontransferable Utility
2.2.7 Shapley Value
References
3 Noncooperative Game Theory
3.1 Comparing Cooperative and Noncooperative Theory and Their Strategies
3.2 Nash Equilibrium
3.3 Mixed Strategies
3.4 Sequential Game
3.5 Decision Trees
3.6 Game with Imperfect Information
3.7 Games with Perfect Information
3.8 Extensive Form Games
3.9 Game Tree
3.10 Search Strategies for Game Trees
3.10.1 Breadth-First Search (BFS)
3.10.2 Depth-First Search (DFS)
3.11 Min-Max Strategy
3.11.1 Steps in Game Tree
3.12 Strategic Form Games
3.13 Extensive Form Games
3.14 Dominant Strategies
3.14.1 Strict Dominance
3.14.2 Weak Dominance
3.15 No Dominant Strategy
3.16 Bayesian Games
3.17 Matrix Games
3.18 Repeated Games
3.18.1 Finitely Repeated Games
3.18.2 Infinitely Repeated Games
3.19 Incentives
References
4 Applications of Game Theory in Deep Neural Networks
4.1 Introduction
4.2 Relation of Neural Network to Game Theory
4.3 Applications
References
5 Case Studies and Different Applications
5.1 Auctions
5.2 Game Theory in GAN
5.3 Game Theory in CNN
5.4 Game Theory and Reinforcement Learning
5.5 Other Applications
References
6 Conclusion and Future Research Directions
6.1 A Summary of Key Insights
6.2 Open Questions, Challenges, and Cross-Disciplinary Opportunities
6.3 Future Research Direction
References
Chapter 1
Introduction
This book on game theory in deep learning examines how models of strategic
decision-making connect with modern deep learning techniques. Our objective is to
cover both the theoretical foundations and the real-world applications of this
intersection. The introduction lays the groundwork for understanding game theory
and deep learning, highlighting their individual significance and the role of game
theory in enhancing and shaping deep learning algorithms. It is organized as a
progression. Initially, we cover the essentials of game theory: its core principles
and the way strategic interactions are modeled and analyzed. This sets the stage for
understanding complex scenarios and decision-making processes in various fields.
Subsequently, we turn to deep learning and outline the fundamental concepts and
algorithms that form its backbone, giving a clear and accessible overview of this
rapidly evolving area of technology. This section is meant to give context to readers
who are new to the field while offering fresh perspectives to those already familiar
with deep learning.
The third part of the introduction bridges these two worlds and explains the
necessity and impact of applying game theory within deep learning frameworks. We
explore how game-theoretic concepts can enhance the functionality and efficiency of
deep learning models and provide a new lens through which to interpret and improve
them. Finally, the introduction gives an overview of the subsequent chapters, each
dedicated to a different facet of the synergy between game theory and deep learning.
This roadmap orients readers, sets expectations for what follows, and ties the
chapters together. We conclude the introduction with a reflection on the overarching
themes and the transformative potential of this multidisciplinary fusion, setting the
stage for the exploration that follows throughout the book.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
T. Hazra et al., Applications of Game Theory in Deep Learning, SpringerBriefs
in Computer Science, https://doi.org/10.1007/978-3-031-54653-2_1
1.1 Basics of Game Theory
Game theory is a branch of mathematics that studies strategic decision-making. It
analyzes situations in which multiple decision-makers, or “players,” interact with
one another and make choices that affect each other’s outcomes. It is used to model
a wide variety of situations, including economic markets, political elections, and
evolutionary biology (Ho et al., 2022), and to understand how people behave in
strategic situations. Businesses use game theory, for example, to analyze whether a
new product should be positioned as a luxury or a bargain (Song et al., 2022). A
company might use it when making strategic decisions about its own products and
services in light of those of its competitors. It can also be used to predict the
outcome of events involving multiple players.
Games can be classified into two broad categories: cooperative games and
noncooperative games (Mirzaei-Nodoushan et al., 2022). In cooperative games,
players can form coalitions and commit to joint strategies in pursuit of a common
goal, and the analysis focuses on what each coalition can achieve and how the
resulting gains are divided. Voting blocs are a typical example, and teammates in
football cooperate in this sense even though the match between the two teams is
competitive. In noncooperative games, each player chooses a strategy independently
and in their own interest, and players compete rather than cooperate. Examples of
noncooperative games include chess, checkers, poker, and rock-paper-scissors. The
central solution concept for noncooperative games is the Nash equilibrium: a
combination of strategies, one per player, in which no player can improve their own
payoff by changing strategy unilaterally while the others keep theirs fixed. A Nash
equilibrium is stable in this sense, but it is not necessarily the best outcome for the
players as a group, and a game may have more than one. Players in a cooperative
game, by contrast, have an incentive to coordinate because their combined efforts
can achieve an outcome that none of them could secure alone. Such outcomes are
studied with cooperative solution concepts such as the core and the Shapley value
rather than with the Nash equilibrium.
In game theory, players are often assumed to be rational and to act in their self-
interest. They make decisions based on their preferences and the choices available
to them. The outcomes of these decisions depend on the choices made by other
players (Gimpel et al., 2020). For example, in a game of chess, each player tries to
maximize their chances of winning by making the best possible moves given the
position of the pieces on the board.
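
To make the idea of choosing the best move given the other player's behavior concrete, the following is a minimal sketch in Python. It assumes the standard win/lose/draw payoffs of rock-paper-scissors (+1, -1, 0); the function names `expected_payoffs` and `best_response` are illustrative and do not come from the book.

```python
# Payoffs for the row player in rock-paper-scissors (assumed +1 win, -1 loss,
# 0 draw). The game is zero-sum: the column player receives the negative.
MOVES = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock     against rock, paper, scissors
    [1, 0, -1],   # paper    against rock, paper, scissors
    [-1, 1, 0],   # scissors against rock, paper, scissors
]


def expected_payoffs(opponent_mix):
    """Expected payoff of each pure move against a mixed opponent strategy."""
    return [sum(p * u for p, u in zip(opponent_mix, row)) for row in PAYOFF]


def best_response(opponent_mix):
    """Return the move with the highest expected payoff."""
    values = expected_payoffs(opponent_mix)
    return MOVES[values.index(max(values))]


# An opponent who favors rock is best answered with paper.
print(best_response([0.5, 0.3, 0.2]))
# Against a uniformly random opponent every move has the same expected payoff.
print(expected_payoffs([1 / 3, 1 / 3, 1 / 3]))
```

A rational player in this sketch simply picks the move with the largest expected payoff. When the opponent mixes uniformly, no move is better than any other, which is why uniform play is hard to exploit in this game.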
Game theory helps us understand how different players will act in different
situations and how these actions will affect the outcome of the game. It is a useful
tool for predicting behavior and for finding strategies that maximize a player’s
chances of success. It is also used to analyze situations in which the players have
conflicting interests, such as negotiations or the design of economic systems, and to
model elections, business interactions, and wars. Game theory remains an active
area of research today. In the next section, we turn to deep learning.
1.2 Introduction to Deep Learning
Deep learning is a type of machine learning that involves training artificial neural
networks on large datasets (Kelleher, 2019). It is called “deep” learning because the
neural networks used in this type of machine learning are composed of many layers
of interconnected nodes. These nodes are called neurons because they are loosely
modeled on the neurons of the human brain. Each neuron receives input from
neurons in the previous layer, computes a weighted sum of those inputs, and passes
the result through an algorithm known as an activation function. The output of the
activation function becomes an input to the neurons in the next layer. The network
repeats this process layer by layer, so that the final output is a representation of the
input data shaped by what the neural network has learned.
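
The layer-by-layer computation described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the sigmoid activation, the layer sizes, and the weight values are arbitrary choices made here, not taken from the book.

```python
import math


def sigmoid(z):
    """A common activation function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs, adds its bias,
    and applies the activation function."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# A tiny network with arbitrary weights: 2 inputs -> 3 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = network_forward([1.0, 2.0], layers)
print(output)
```

Each tuple in `layers` holds one weight list per neuron together with that layer's biases, so adding depth amounts to appending another tuple.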
In deep learning, the neural network is trained to recognize patterns in the data
by adjusting the weights and biases of the connections between the nodes. The
network is fed large amounts of labeled data and uses this data to learn how to make
predictions or perform a specific task. The trained model can then be used to make
predictions about new data (Kelleher, 2019). Deep learning is a field of machine
learning concerned with algorithms that attempt to model high-level abstractions in
data through the use of many layers of nonlinear transformations (Pang et al., 2020).
It is part of a broader family of machine learning methods based on learning
representations from data, and it can be applied in supervised, semi-supervised, and
unsupervised settings. Many modern advances in AI, including machine translation,
speech recognition, computer vision, and robotics, are based on neural networks in
one form or another. However, most existing deep learning models are complex and
require specialized computer hardware to run efficiently, which limits their use in
resource-constrained applications. Training a deep learning model also typically
requires a large amount of labeled training data. Since this data is very expensive to
obtain, researchers have developed several techniques to reduce the cost of training
a deep learning model (Hassan et al., 2020).
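
As a small illustration of what adjusting weights and biases from labeled data means, the sketch below trains a single sigmoid neuron by gradient descent on the logical OR function. The dataset, learning rate, and number of epochs are assumptions chosen for this example; a real deep learning model stacks many such neurons and is trained with a dedicated library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Output of one neuron: sigmoid of the weighted sum plus bias."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(data, epochs=2000, learning_rate=0.5):
    """Fit weights and a bias to labeled (inputs, label) pairs."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in data:
            # For cross-entropy loss with a sigmoid output, the gradient with
            # respect to the weighted sum is simply (prediction - label).
            error = predict(weights, bias, inputs) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Labeled training data for logical OR (illustrative only).
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_neuron(OR_DATA)
for inputs, label in OR_DATA:
    print(inputs, label, round(predict(weights, bias, inputs), 3))
```

Every update moves the weights a small step in the direction that reduces the prediction error on the current example, which is the same principle that backpropagation applies across many layers.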
Deep learning has been successful in a wide range of applications, including
image and speech recognition, natural language processing, and even playing games
such as chess and Go. It has been used to improve the performance of many different
types of systems, from self-driving cars to language translation services. Most
modern AI systems are powered by deep learning methods. One of its limitations is
that it is best suited to problems that can be modeled as a sequence of differentiable
operations mapping a collection of inputs (e.g., a set of pixels from a digital image)
to a collection of outputs (e.g., a list of words) (Alber et al., 2019). It is therefore
most readily applied to problems that are easy to describe mathematically, such as
classification problems where the outcome can be represented as a vector with one
dimension for each type of object. Problems that fit this mold less directly include
optimization problems, which require models that can learn how to achieve some
objective value rather than simply predict which output will yield it. Another
limitation of deep learning is that its performance tends to degrade as inputs move
further away from the original training set (the set of examples that was used to train
the model). This makes it less suitable for real-world applications where only limited
amounts of data are available for training purposes.
One of the key advantages of deep learning is its ability to learn from
unstructured data, such as images and audio recordings (Neu et al., 2022). This
makes it a powerful tool for tasks that require understanding complex patterns or
extracting meaning from large amounts of data. One area where deep learning has
made a major impact is image recognition. Advances in computer vision have led to
technologies such as Google’s Deep Dream system (Toğaçar et al., 2021), which
uses deep neural networks to process images and generate dreamlike imagery. The
system was originally developed for academic purposes and was not intended for
production use, but it has since been deployed in applications as diverse as medical
research and computer games. Although this technology is still in its early stages,
the potential for widespread adoption seems strong. Deep learning is also being used
to develop self-driving cars. These vehicles are equipped with LiDAR sensors (Li
et al., 2022) that can detect objects in the surrounding environment. The sensor data
is then analyzed by algorithms that learn to recognize different types of objects and
produce a map of the vehicle’s surroundings. The main advantage of this approach
is that it allows the car’s computer to make its own decisions in real time rather than
relying solely on input from the driver. Research in this area has yielded promising
results, but many challenges remain before self-driving vehicles can be made widely
available for commercial sale. The next section discusses how combining game
theory and deep learning can yield interesting results.
1.3 Game Theory in Deep Learning
Game theory can be used in deep learning in several ways. One is to model the
interactions between multiple agents in a multiagent system (Zhou et al., 2022),
such as a group of autonomous vehicles or robots. The agents in the system can be
modeled as rational players in a game, and game theory can be used to predict how
they will interact with each other and make decisions. One application is predicting
how the agents in a group will respond to certain behaviors of the other agents.
Another is predicting how the behavior of an agent will affect the behavior of its
neighbors in the network (Monti et al., 2021). Game theory can also be used in the
design of neural networks for computer vision tasks (Li et al., 2021). One way to do
this is to train the network on data that mimics a multiplayer game environment,
where each object is represented by one of the players and the other players’ actions
are used to train the network to recognize objects correctly.
Another way to use game theory in deep learning is to optimize the performance
of a deep learning model (Rajeswaran et al., 2020). For example, in a reinforcement
learning setting, game theory can be used to model the interactions between the
agent and the environment and to find the optimal policy for the agent to follow.
This can be done using the Nash equilibrium, a solution concept in game theory that
describes a stable state in which no player can improve their payoff by unilaterally
changing their strategy. At such a state, no agent has an incentive to deviate on its
own, even if a different joint choice of strategies would leave every agent better off.
An equilibrium can therefore correspond to suboptimal decision-making by the
agents and reduce the overall performance of the algorithm. Techniques such as
dynamic programming and value iteration, applied within a game-theoretic model,
can help address this problem.
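
The Nash equilibrium condition mentioned above can be checked mechanically for small games. The following sketch enumerates the pure-strategy profiles of a two-player prisoner's dilemma and keeps those from which neither player can gain by deviating alone. The payoff numbers are illustrative assumptions, and `is_nash` is a name introduced here for the example.

```python
from itertools import product

# Payoffs as (row player, column player). Action 0 = cooperate, 1 = defect.
# The numbers are an assumed, textbook-style prisoner's dilemma.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = [0, 1]


def is_nash(row_action, col_action):
    """True if neither player can raise their own payoff by deviating alone."""
    row_payoff, col_payoff = PAYOFFS[(row_action, col_action)]
    row_ok = all(PAYOFFS[(a, col_action)][0] <= row_payoff for a in ACTIONS)
    col_ok = all(PAYOFFS[(row_action, a)][1] <= col_payoff for a in ACTIONS)
    return row_ok and col_ok


equilibria = [p for p in product(ACTIONS, ACTIONS) if is_nash(*p)]
print(equilibria)                      # [(1, 1)]: both players defect
print(PAYOFFS[(1, 1)], PAYOFFS[(0, 0)])  # (1, 1) versus (3, 3)
```

The only equilibrium is mutual defection, which pays each player 1, although mutual cooperation would pay each player 3. This is a standard illustration of an equilibrium that is stable yet not the best joint outcome.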
Additionally, game theory can be used to design and evaluate the security and
robustness of deep learning models in adversarial settings (Kamhoua et al., 2021),
where the model is under attack from malicious actors. Because of the emergent
complexity of neural networks and the black-box nature of machine learning,
analyzing the security properties of deep networks is a daunting task. There are
several approaches to studying the security of deep learning networks using game
theory. The first approach is to apply game theory to the problem of adversarial
training (Dasgupta & Collins, 2019). Adversarial training aims to produce a model
that is robust against adversarial examples, which are inputs carefully engineered by
a malicious attacker to cause the model to make mistakes when used in real-world
applications. It is argued that game-theoretic approaches can be used to identify
potential weaknesses in the structure and training procedure of a deep neural
network that could make it susceptible to adversarial attacks. In the second
approach, game theory is used to analyze the security properties of classifiers and
regression models used in adversarial settings (Chivukula, 2020). A classifier or
regression model is trained to classify or predict target variables that an adversary
wishes to exploit for malicious purposes. The adversarial examples are defined as
pairs of input data and classification outputs. The adversary's goal is to identify
regions of the input space that the classifier labels differently than it should and to
exploit those regions to trick the classifier into assigning the data to the adversary's
target class. Game-theoretic approaches can then be used to analyze the
vulnerability of the classifier to adversarial attacks (Pal & Vidal, 2020).
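
To illustrate what an adversarial example looks like in the simplest case, the sketch below treats the attacker as a player who may shift each input feature by at most `epsilon` and wants to flip the decision of a linear classifier. The classifier weights, the input, and the budget are assumptions made for this example; deep networks need iterative or gradient-based attacks, but the attacker-versus-model structure is the same.

```python
def sign(value):
    """Return +1, 0, or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def score(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    return 1 if score(weights, bias, x) >= 0 else -1


def worst_case_perturbation(weights, label, epsilon):
    """Attacker's best move when each feature may change by at most epsilon.

    For a linear score, pushing every feature against the true label lowers
    label * score by epsilon * sum(|w|), the largest possible drop.
    """
    return [-epsilon * label * sign(w) for w in weights]


# Hypothetical classifier and a correctly classified input with label +1.
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1

delta = worst_case_perturbation(weights, label, epsilon=0.3)
x_adv = [xi + d for xi, d in zip(x, delta)]
print(classify(weights, bias, x), classify(weights, bias, x_adv))
```

The original input has a score of 0.8 and is classified as +1. The perturbed input has a score of about -0.1 and is classified as -1, even though no feature moved by more than 0.3. A defender who anticipates this move can train on such perturbed inputs, which is the idea behind adversarial training.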
A third approach to studying the security of deep learning networks involves
combining game theory and deep learning techniques to analyze adversarial attacks
and defenses (Hossain et al., 2022). This approach uses reinforcement learning to
train deep neural networks to automatically defend a system from attack, which
improves the speed and accuracy of defense systems while reducing the complexity
of the underlying algorithms.
In summary, game theory can be used in deep learning to model the interactions
between agents in a multiagent system, optimize the performance of deep learning
models, and design and evaluate the security and robustness of deep learning models
in adversarial settings. Games are one of the fundamental tools for modeling social
interactions and decision-making in a wide range of disciplines, including
economics, psychology, computer science, philosophy, and biology. A common
feature of games is that players engage in strategic interactions with each other, each
seeking to maximize their own performance, sometimes at the expense of their
opponents. The game-theoretic approach to machine learning aims to enable
machines to learn strategies from observed data and use the learned strategies to
solve tasks in a fully autonomous manner. Thus, game theory can be used to design
algorithms for learning and decision-making in a wide variety of domains such as
robotics, artificial intelligence, data mining, econometrics, and finance. In the
upcoming chapters, we explain cooperative game theory, noncooperative game
theory, deep learning algorithms, applications of game theory in deep learning, and
case studies. The following section provides an overview of these chapters.
1.4 Chapter Overview
After this introductory chapter, Chap. 2 presents cooperative game theory: the
dynamics of how individuals or groups work together to achieve common goals.
That chapter lays out the foundational concepts and models and illustrates how
cooperation can lead to mutually beneficial outcomes. We summarize it in the
following section.
1.5 Cooperative Game Theory
Chapter 2 of “game theory in deep learning” is dedicated to exploring cooperative
game theory, an integral aspect of game theory that involves players forming
coalitions to achieve mutually beneficial outcomes. This chapter explains how
cooperative game theory operates, its principles, and its diverse applications.
Initially, the chapter introduces the concept of game theory, its historical develop-
ment, and its broad application across various domains such as economics, political
science, biology, cybersecurity, and healthcare. It then distinguishes between differ-
ent types of games, including cooperative and noncooperative games, and discusses
the min-max theorem and the concept of zero-sum and nonzero-sum games.
The core focus of the chapter is on defining cooperative game theory. It describes
how players in cooperative games form groups or coalitions, aiming for solutions
that benefit the group as a whole. This is contrasted with noncooperative games,
where players act independently. The chapter uses relatable examples, such as ice
cream sharing and voting scenarios, to illustrate how cooperative game theory
applies to real-world situations. Important concepts like “coalitional games,” “trans-
ferable utility,” and “stability” in cooperative games are thoroughly explained. The
chapter underscores the significance of stability in coalitions, highlighting that a
stable coalition indicates members are content with the agreement, reducing uncer-
tainty and conflict.
The “core,” a fundamental solution concept in cooperative game theory, is dis-
cussed in detail. The core represents a set of stable and feasible outcomes in a coali-
tion game, ensuring no subgroup or individual can gain more by forming their own
coalition. This section delves into the mathematical underpinnings of the core and
its implications for resource allocation and fairness. Further, the chapter examines
“nontransferable utility” and its relevance in cooperative games where direct
exchange or utility transfer among players is not feasible, emphasizing the com-
plexities in resource allocation and the importance of fair distribution in such set-
tings. The chapter also introduces the “Shapley value,” a method for fair value
distribution based on players’ marginal contributions to the game. This concept is
vital in assessing contributions in collaborative environments where the overall suc-
cess is a result of the combined efforts of all participants. Additionally, the chapter
explores the concept of “dominant strategies,” illustrating how players choose strat-
egies that provide the most favorable outcomes irrespective of others’ actions and
the role of Nash equilibria in optimizing payoffs in multiplayer systems.
Lastly, Nash equilibrium is discussed as a situation where each player’s strategy
is optimal, given the strategies chosen by other players. The chapter uses practical
examples, like the “monkey climb” game, to demonstrate the application and
significance of Nash equilibrium in real-world scenarios. The chapter provides a
comprehensive view of cooperative game theory, emphasizing its practicality in
various fields and highlighting key theoretical concepts for understanding
cooperative interactions. Having explored cooperative game theory in Chap. 2,
where the focus is on the dynamics of collaboration and coalition building, we now
turn our attention to a different yet equally fascinating aspect of game theory:
noncooperative game theory in Chap. 3. The details of Chap. 3 are given in the
following subsection.
1.6 Noncooperative Game Theory
The chapter on “Noncooperative Game Theory” in the present book delves into the
competitive aspects of game theory where players act independently without form-
ing coalitions or agreements. It contrasts this with cooperative game theory, under-
scoring the independence and strategic decision-making in competitive
environments. Key concepts include Nash equilibrium, minimax strategies, and the
absence of binding agreements among players, which leads to a diverse range of
strategic interactions in various scenarios like business, politics, and social settings.
The chapter further explores various types of noncooperative games, such as the
well-known prisoner’s dilemma, rock-paper-scissors, and the Friend or Foe game,
each illustrating unique aspects of noncooperative strategy. It delves into sequential
games like chess and poker, where players’ decisions are influenced by previous
actions, and discusses the use of decision trees and game trees to analyze these games.
Other notable sections include the examination of games with imperfect infor-
mation, where uncertainty and hidden knowledge play crucial roles, and games with
perfect information, where players have complete knowledge about actions and his-
tory. The chapter also addresses advanced concepts like Bayesian games, which
incorporate elements of incomplete information, and mechanism design, which
explores creating systems to achieve specific goals in multiplayer settings. This
chapter provides a comprehensive overview of noncooperative game theory, its
applications, and the various strategic considerations involved in such games, mak-
ing it a valuable resource for understanding competitive strategic interactions in
diverse fields. Having delved into the intricate world of game theory, both coopera-
tive and noncooperative, we now start the discussion on the application of game
theory in deep learning in Chap. 4. The following subsection explains this idea.
1.7 Application of Game Theory in Deep Learning
The chapter titled “Applications of Game Theory in Deep Neural Networks” in the
book covers a wide range of applications, demonstrating the versatility and impact
of game theory in enhancing deep learning models and systems. The chapter estab-
lishes the foundational concepts of deep learning and neural networks, emphasizing
their ability to automatically classify and learn from data without human interfer-
ence. It then moves into exploring the relationships between neural networks and
game theory, using examples like the rock-paper-scissors game to illustrate how
game theory concepts can be applied to neural network strategies. Various applica-
tions of game theory in deep neural networks are then discussed in detail. These
include the following:
1. Wireless Network Security: Using game theory and deep learning to protect
wireless networks from jamming attacks, with transmitters and jammers engag-
ing in a strategic deception game.
2. Scene Recognition in Human-Robot Interaction: Integration of deep neural
networks (DNNs) and game theory for advanced scene recognition, crucial in
enhancing robot intelligence and interaction with humans.
3. Adversarial Machine Learning in Cybersecurity: Implementing game theory to
develop strategies for protecting machine learning systems against adversarial
attacks, a major concern in cybersecurity.
4. Mobile Crowdsensing: Applying game theory in mobile crowdsensing to opti-
mize data collection and improve the resistance of systems against fraudulent
data attacks.
5. Learning Games and Supervised Learning Problems: Exploring the use of
game theory in developing no-regret learning strategies for large-scale games,
which can be applied in supervised learning contexts.
6. Adversarial Attack Strategies in Deep Learning: Studying various adversarial
attacks and defenses in deep learning models, focusing on game-theoretic
approaches to understanding and countering these threats.
7. Nuclear Security: Using game theory and deep learning to analyze complex
data related to nuclear security and terrorism, highlighting the role of game
theory in managing nonlinear algorithms for critical societal and business events.
8. Real-Time Strategy Games: Application of deep reinforcement learning in real-
time strategy games, showcasing the use of advanced AI methods in complex
gaming environments.
9. Green Security Games: Implementing deep Q-learning in Green Security
Games to develop real-time responsive strategies for protecting natural
resources against illegal activities.
10. Multiagent Reinforcement Learning: Exploring the impact of opponent-learning
awareness in multiagent environments, particularly in scenarios involving
cooperation and competition.
11. Multiagent Reinforcement Learning in Traffic Scenarios: Utilizing deep rein-
forcement learning combined with game theory for modeling decision-making
in multiagent traffic scenarios, emphasizing the strategic interactions among
different drivers.
12. Internet of Vehicles (IoV) and Edge Computing: Discussing the application of
game theory in the IoV for optimizing service offloading and improving the
quality of service through advanced computational methods.
13. AI-Assisted Diagnosis in Dermatology: Exploring the use of game theory in
developing computer-aided diagnostic systems for melanoma detection,
employing deep learning models to enhance diagnostic accuracy.
14. Smart Grid Technology: Investigating the application of game theory in smart
grid systems for enhancing load frequency control and addressing security
challenges through artificial neural networks.
15. Image Restoration Techniques: Utilizing generative adversarial networks
(GANs), inspired by game theory, for image restoration tasks, where the battle
between the generative and discriminative models mirrors game-theoretic
concepts.
16. Space Situational Awareness (SSA): Implementing machine learning and game
theory to analyze satellite behavior and enhance the accuracy of behavior clas-
sification in space.
17. Adversarial Attacks and Defenses: Exploring a game theory framework to
understand the interplay between adversarial attacks and defenses in machine
learning models.
18. Resource Allocation in MIMO Networks: Analyzing the use of game theory in
optimizing resource allocation in multiple-input multiple-output (MIMO) net-
works, utilizing deep reinforcement learning for strategy optimization.
19. Image Segmentation: Application of game theory and deep learning in image
segmentation, improving feature extraction and generalization capacity.
20. Autonomous Vehicle Path Planning: Combining game theory and memory neu-
ral networks to enhance decision-making in autonomous vehicles, particularly
in complex driving scenarios.
21. Pricing Strategy for Perishable Food Items: Implementing game theory and
deep learning models to develop effective pricing strategies for perishable
goods in supermarkets, addressing environmental and supply chain concerns.
Each of these applications demonstrates the innovative ways game theory
principles are being integrated with deep learning technologies to solve problems
in these domains. Having journeyed through the theoretical landscapes of game
theory in deep learning and its diverse applications, we now arrive at a pivotal
segment of our exploration: the case studies, which are explained in Chap. 5. The
details of Chap. 5 are described in the following subsection.
1.8 Case Studies and Different Applications
The chapter titled “Case Studies and Different Applications” in the book presents a
diverse range of practical applications of game theory in various domains, demon-
strating its versatility and depth. The chapter begins by delving into auction theory,
a subset of economics and game theory, which studies the mechanisms of auctions
and strategic interactions of bidders. It covers the different types of auctions like
English, Dutch, and Vickrey auctions and examines how game theory aids in under-
standing bidder strategies, revenue maximization, and fraud detection. In the realm
of pricing, the chapter explores how game theory aids in strategic decision-making
by firms in competitive markets, including models like the Bertrand model and anal-
ysis of Nash Equilibria in price competition games. It also discusses cooperative
pricing games and cartel stability, using examples of organizations to illustrate these
concepts.
The application of game theory in generative adversarial networks (GANs) is
another focus. It explains how GANs, consisting of a generator and discriminator,
can be viewed as a minimax game, a fundamental concept in game theory. This
perspective helps in understanding the training dynamics and stabilization of GANs.
Furthermore, the chapter touches upon the use of convolutional neural networks
(CNNs) in game theory. It illustrates how CNNs can analyze strategic interactions,
predict player behavior, and assist in game content generation and strategy recom-
mendations in various game types, including board, card, and video games. In the
context of reinforcement learning, the chapter explains how game theory concepts
like Nash equilibria and best responses can analyze the strategic behavior of agents
in multiagent reinforcement learning environments. This includes adversarial inter-
actions and the learning process in environments analogous to repeated games.
Additionally, the chapter covers a wide range of other applications, including
oligopolies, educational institutions, farming, and computer vision, demonstrat-
ing the extensive reach of game theory in modeling strategic interactions and
decision-making in diverse fields. Notable mentions include the use of coevolu-
tionary neural population models to simulate strategy evolution in repeated games
and game theory in computer vision for strategic interactions in visual scenes. In
summary, this chapter provides a comprehensive view of the multifaceted applica-
tions of game theory across various domains, showcasing its practicality and the
significant insights it offers in understanding and solving complex strategic
interactions. The final chapter, Chap. 6, concludes the book and outlines
future research directions. The details of Chap. 6 are presented in the following
subsection.
1.9 Conclusion and Future Research Directions
As we draw our exploration to a close in the final chapter of this book, we reflect on
the insights gained and the journey undertaken in understanding the intersection of
game theory and deep learning. This concluding chapter serves as both a summary
of our findings and a forward-looking perspective on the potential future develop-
ments in this dynamic field. In this chapter, we synthesize the key concepts, theo-
ries, and applications discussed throughout the book, highlighting the most
significant contributions and the practical implications of our exploration. We revisit
the core ideas that form the backbone of our discussion, drawing connections
between the diverse topics covered, from the fundamentals of game theory and deep
neural networks to the intricate applications and case studies.
Beyond summarization, this chapter delves into the future of game theory in
deep learning. We identify emerging trends, nascent technologies, and unexplored
areas that hold promise for further research. This forward-looking section is
designed to inspire researchers, practitioners, and enthusiasts to continue exploring,
innovating, and contributing to this field. We discuss potential advancements in
algorithms, applications in new domains, and the evolution of current methodolo-
gies. Moreover, we address the challenges and open questions that remain in the
integration of game theory and deep learning. These reflections not only underscore
the complexities and nuances of this field but also serve as a call to action for the
research community to address these challenges and further advance our
understanding.
In essence, the final chapter is crafted to leave readers with a comprehensive
understanding of where we stand in the present and a clear vision of the exciting
possibilities that lie ahead. It’s an invitation to ponder, participate, and propel the
future of game theory in deep learning.
References
Alber, M., Buganza Tepole, A., Cannon, W. R., De, S., Dura-Bernal, S., Garikipati, K., et al. (2019).
Integrating machine learning and multiscale modeling—Perspectives, challenges, and opportu-
nities in the biological, biomedical, and behavioral sciences. npj Digital Medicine, 2(1), 1–11.
Chivukula, A. S. (2020). Game theoretical adversarial deep learning algorithms for robust neural
network models (Doctoral dissertation).
Dasgupta, P., & Collins, J. (2019). A survey of game theoretic approaches for adversarial machine
learning in cybersecurity tasks. AI Magazine, 40(2), 31–43.
https://doi.org/10.1609/aimag.v40i2.2847
Gimpel, H., Graf-Drasch, V., Kammerer, A., Keller, M., & Zheng, X. (2020). When does it pay
off to integrate sustainability in the business model? A game-theoretic analysis. Electronic
Markets, 30(4), 699–716.
Hassan, M. M., Gumaei, A., Alsanad, A., Alrubaian, M., & Fortino, G. (2020). A hybrid deep
learning model for efficient intrusion detection in big data environment. Information Sciences,
513, 386–396.
Ho, E., Rajagopalan, A., Skvortsov, A., Arulampalam, S., & Piraveenan, M. (2022). Game theory
in defence applications: A review. Sensors, 22(3), 1032.
Hossain, K. F., Tavakkoli, A., & Sengupta, S. (2022). A game theoretical vulnerability analysis of
adversarial attack. In International symposium on visual computing (pp. 369–380). Springer.
Kamhoua, C. A., Kiekintveld, C. D., Fang, F., & Zhu, Q. (Eds.). (2021). Game theory and machine
learning for cyber security. Wiley.
Kelleher, J. D. (2019). Deep learning. MIT Press.
Li, G., Huang, Y., Chen, Z., Chesser, G. D., Purswell, J. L., Linhoss, J., & Zhao, Y. (2021). Practices
and applications of convolutional neural network-based computer vision systems in animal
farming: A review. Sensors, 21(4), 1492.
Li, N., Ho, C. P., Xue, J., Lim, L. W., Chen, G., Fu, Y. H., & Lee, L. Y. T. (2022). A progress review
on solid-state LiDAR and nanophotonics-based LiDAR sensors. Laser & Photonics Reviews,
16(11), 2100511.
Mirzaei-Nodoushan, F., Bozorg-Haddad, O., & Loáiciga, H. A. (2022). Evaluation of cooperative
and non-cooperative game theoretic approaches for water allocation of transboundary rivers.
Scientific Reports, 12(1), 1–11.
Monti, A., Bertugli, A., Calderara, S., & Cucchiara, R. (2021, January). Dag-net: Double attentive
graph neural network for trajectory forecasting. In 2020 25th international conference on pat-
tern recognition (ICPR) (pp. 2551–2558). IEEE.
Neu, D. A., Lahann, J., & Fettke, P. (2022). A systematic literature review on state-of-the-art deep
learning methods for process prediction. Artificial Intelligence Review, 55(2), 801–827.
Pal, A., & Vidal, R. (2020). A game theoretic analysis of additive adversarial attacks and defenses.
Advances in Neural Information Processing Systems, 33, 1345–1355.
Pang, B., Nijkamp, E., & Wu, Y. N. (2020). Deep learning with tensorflow: A review. Journal of
Educational and Behavioral Statistics, 45(2), 227–248.
Rajeswaran, A., Mordatch, I., & Kumar, V. (2020, November). A game theoretic framework
for model based reinforcement learning. In International conference on machine learning
(pp. 7953–7963). PMLR.
Song, L., Luo, Y., Chang, Z., Jin, C., & Nicolas, M. (2022). Blockchain adoption in agricultural
supply chain for better sustainability: A game theory perspective. Sustainability, 14(3), 1470.
Toğaçar, M., Cömert, Z., & Ergen, B. (2021). Enhancing of dataset using DeepDream, fuzzy color
image enhancement and hypercolumn techniques to detection of the Alzheimer's disease stages
by deep learning model. Neural Computing and Applications, 33(16), 9877–9889.
Zhou, L., Zheng, Y., Zhao, Q., Xiao, F., & Zhang, Y. (2022). Game-based coordination control of
multi-agent systems. Systems & Control Letters, 169, 105376.
Chapter 2
Cooperative Game Theory
Cooperative game theory is an important area of game theory. In cooperative games,
players form coalitions and work together to achieve certain goals. The players
form groups and then distribute the resulting payoff among themselves; such a game
is called a “coalition game.” Examples of cooperative games include situations
where players collaborate in business ventures, share resources, negotiate
contracts, or work together to achieve mutual benefits. Solution concepts for
cooperative games include “the Core” and the “Shapley Value.” Cooperative games
place more emphasis on the stability and fairness of a solution. This chapter
presents a number of coordination games, cooperative games, and theoretical
concepts of cooperative games. The chapter also discusses various real-life
applications of cooperative game theory.
2.1 Introduction
Game theory is a branch of mathematics and computer science. It deals with
situations or problems that have to be analyzed and in which players have to make
decisions. Game theory was developed by John von Neumann and Oskar Morgenstern,
and their book Theory of Games and Economic Behavior (1944) is considered a
starting point of the field. Game theory is applied in various fields. In
economics, it is applied to analyze market structures, auctions, and pricing
strategies. In political science, it has been applied to conflict resolution. It
is applied in the study of biology as well. Game theory is used to provide
insights into various auctions, such as first-price auctions and Japanese
auctions. It helps in analyzing voting systems, candidate strategies, and the
results of campaign policies. Game theory is integrated with psychology to study
how individuals make decisions under different factors, such as uncertainty and
various physical and emotional conditions.

In the healthcare and medicine field, it helps in understanding the evolution of
drugs so that medicines can withstand and handle different pathogens. Game theory
is used to study the behavior of financial institutions. In the field of
cybersecurity, game theory models provide mechanisms for understanding strategies
and plans to deal with cyberattacks. Game theory models help in explaining the
emergence of trust and cooperation in social interactions. It is also used to
examine the management of forests, fisheries, and natural resources and to study
pollution and its impact on the environment. Game theory plays a significant role
in computer science and artificial intelligence, and it continues to find new
applications. Its treatment of decision-making dynamics makes it a valuable tool
in understanding and predicting outcomes in a wide range of real-world scenarios
(Rezek et al., 2008). In the classification of game theory, there are several key
types of games that help in analyzing different situations and strategies,
including cooperative and noncooperative games, extensive form games, symmetric
games, asymmetric games, sequential games, matrix games, simultaneous-move games,
zero-sum games, and nonzero-sum games. A zero-sum game, which is competitive in
nature, is a game in which the payoffs of all players sum to zero. With two
players, one player has a positive outcome and the other has a negative outcome.
The gain and the loss are equal in magnitude with opposite signs, meaning that the
loss of player 1 equals the gain of player 2 (e.g., +50 and −50). A classic
example of a zero-sum game is chess. In nonzero-sum games, the sum of the outcomes
of all the players need not be zero, which means that one player's gain is not
necessarily balanced by another player's loss. An example of a nonzero-sum game is
the prisoner’s dilemma. Real-world scenarios can often be matched with nonzero-sum
games. Cooperative games can also be considered examples of nonzero-sum games;
they are nonzero-sum because of group formation and the adoption of cooperative
policies. A sequential game is one in which the decisions are not made
simultaneously. The decisions are represented in a tree in which each internal
node represents a player's choice, each arrow represents an action, and the leaf
nodes represent the outcomes.
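The distinction between zero-sum and nonzero-sum games described above can be checked mechanically on a payoff table. The following is a minimal sketch, not taken from the book: the function name `is_zero_sum`, the stake game with ±50 payoffs, and the prisoner's dilemma payoffs (negative years in prison) are illustrative assumptions.

```python
# Each cell holds (payoff to player 1, payoff to player 2).

# Hypothetical stake game: whoever wins gains 50 and the other loses 50.
stake_game = [
    [(50, -50), (-50, 50)],
    [(-50, 50), (50, -50)],
]

# Prisoner's dilemma with commonly used illustrative payoffs
# (rows/columns: stay silent, confess; values are negative years in prison).
prisoners_dilemma = [
    [(-1, -1), (-3, 0)],
    [(0, -3), (-2, -2)],
]


def is_zero_sum(game):
    """Return True if the players' payoffs sum to zero in every cell."""
    return all(sum(cell) == 0 for row in game for cell in row)


if __name__ == "__main__":
    print("stake game zero-sum:", is_zero_sum(stake_game))
    print("prisoner's dilemma zero-sum:", is_zero_sum(prisoners_dilemma))
```

Under these assumed payoffs, the stake game passes the check in every cell, while the prisoner's dilemma fails it because both players can lose (or both do relatively well) at the same time.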
When resources need to be allocated among a group of players or organizations,
cooperative game theory can help determine fair and efficient ways to distribute
resources and share costs. In business contexts, cooperative game theory can guide
decision-making for joint ventures, mergers, and acquisitions by assessing the ben-
efits and risks of collaboration. In the field of artificial intelligence, cooperative game
theory can be applied to study interactions among autonomous agents and robots
collaborating on tasks. Noncooperative game theory can be used with auctions, as
there is competition involved where bidders have to compete to obtain the product.
2.2 Cooperative Game Theory
Cooperative game theory can be defined as the study of games in which players
form groups to obtain the best solution. It works in terms of “coalitions.” It is
also called the “black box” (Hazra & Anjaria, 2022), as the way to find the
solution is abstracted. There are several examples to explain it, some of which
are as follows:
Ice Cream Example Let us imagine that there are three people, Alice, Bob, and
Christine, who have Rs. 60, Rs. 40, and Rs. 30, respectively. They need to
decide how to buy ice cream. There are three ice cream tubs of 50 grams, 70
grams, and 100 grams, priced at Rs. 70, Rs. 90, and Rs. 110, respectively. The
problem can be approached cooperatively or noncooperatively. No one can afford
even the smallest tub alone, so the cooperative approach makes more sense in this
example, as the members of a coalition can reach an agreement. For instance,
Alice and Bob need to coordinate their choices to maximize their combined payoff.
This ice cream example showcases how game theory can be applied to analyze
decision-making and strategic interactions in various scenarios.
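To make the example concrete, the sketch below computes what each possible coalition could buy by pooling its money. It rests on assumptions that the text does not state explicitly: a coalition buys a single tub, and its worth is the weight in grams of the largest tub it can afford. The names `budgets`, `tubs`, `coalition_value`, and `characteristic_function` are hypothetical.

```python
from itertools import combinations

# Money held by each player in Rs., as stated in the example.
budgets = {"Alice": 60, "Bob": 40, "Christine": 30}

# Available tubs as (price in Rs., weight in grams), as stated in the example.
tubs = [(70, 50), (90, 70), (110, 100)]


def coalition_value(members):
    """Grams of the largest single tub the coalition can afford (assumption)."""
    pooled = sum(budgets[m] for m in members)
    return max((grams for price, grams in tubs if price <= pooled), default=0)


def characteristic_function():
    """Map every nonempty coalition to its worth in grams."""
    players = sorted(budgets)
    return {
        frozenset(c): coalition_value(c)
        for size in range(1, len(players) + 1)
        for c in combinations(players, size)
    }


if __name__ == "__main__":
    for coalition, grams in characteristic_function().items():
        print(sorted(coalition), grams)
```

Under these assumptions every single player has a worth of zero, which is why cooperation is the natural way to analyze the example.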
Voting Example Voting games focus on how individuals can form a group or
coalition to maximize their benefits. Their outcomes depend on each player’s
contribution. These games are often used to understand voting systems, elections,
and other situations where a group of individuals has to make a choice
collectively. Voting games have a wide range of applications, including
understanding real-world elections, decision-making in committees, and problems
in fields such as political science, social choice, and economics. Let us assume
a scenario with four parties, namely, E, F, G, and H, that hold 400 seats, 250
seats, 170 seats, and 180 seats, respectively. In majority voting, the option
with the most votes wins. There is also a “feasibility condition”: the seats held
by a coalition cannot exceed the total number of seats. A majority vote is needed
and no single party holds a majority of the 1000 seats, so the parties have to
form a coalition and decide what benefits each party would get, what their
contributions will be, and whether they agree to participate in the process or
not. In this example, the share that each party earns and the agreements that the
parties reach are the questions that cooperative game theory addresses.
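The seat counts above can be treated as a weighted voting game. The following sketch lists the coalitions that reach a majority and cannot lose any member without falling below it. The quota (a strict majority of all seats) is an assumption, since the text only says that a majority vote is needed; the function names are hypothetical.

```python
from itertools import combinations

# Seats held by each party, as stated in the example.
seats = {"E": 400, "F": 250, "G": 170, "H": 180}

# Assumed quota: strictly more than half of all seats.
quota = sum(seats.values()) // 2 + 1


def is_winning(coalition):
    """A coalition wins if its combined seats reach the quota."""
    return sum(seats[p] for p in coalition) >= quota


def minimal_winning_coalitions():
    """Winning coalitions that contain no smaller winning coalition."""
    parties = sorted(seats)
    winning = [
        frozenset(c)
        for size in range(1, len(parties) + 1)
        for c in combinations(parties, size)
        if is_winning(c)
    ]
    return [c for c in winning if not any(other < c for other in winning)]


if __name__ == "__main__":
    print("quota:", quota)
    for coalition in minimal_winning_coalitions():
        print(sorted(coalition), sum(seats[p] for p in coalition))
```

With this quota, party E needs only one partner, whereas F, G, and H can reach a majority without E only by all joining together, which illustrates how bargaining power depends on more than the raw seat count.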
2.2.1 Coalitional Games
These games model situations where players cooperate and collaborate to achieve
objectives such as resource allocation and bargaining. The players in a
coalition game work together to enhance their individual or collective payoffs.
Coalitional games have the following conditions:
(a) Utility is transferable, that is, the outcome of a coalition can be divided
among its members.
(b) The set of players N is finite.
Any subset of N can be a coalition; for example, {1,2,4,8} is a coalition when
players 1, 2, 4, and 8 belong to N. The characteristic function is defined on the
set of all coalitions, $v : 2^N \to \mathbb{R}$.
A transferable utility game consists of the player set and the characteristic
function. For a coalition of members, V(members) is the outcome that results if
they form the coalition, that is, the amount of money or utility that the
coalition can divide between its members. The game is denoted by both the player
set and the value function from which the solution is obtained.
2.2.2 Stability
The concept of stability in coalitional games refers to the notion of stable outcomes
or coalitions that are unlikely to break apart due to the players’ incentives for defec-
tion. Stability is a central concern in cooperative game theory, as it helps to identify
stable cooperation structures and predicts how players will behave in coalition for-
mations. Determining whether a given coalitional game has a stable outcome can be
challenging, and not all coalitional games have stable solutions. There are different
notions for stability, but the most commonly used are core and stable coalitions
(Hazra & Anjaria, 2022).
A coalition is considered stable if no player or subgroup of players has any
motivation to leave the coalition to join another one. This concept ensures that
no subgroup of players can benefit by forming their own coalition or joining any
other group. The aim is to achieve optimality. When a coalition is stable, it
indicates that the members have reached an agreement that is resistant to
deviations; in other words, no one backs off and everyone is satisfied with what
they get. Stability can help resolve conflicts more effectively and reduce
uncertainty. Stability can also lead to fairness and equality in the outputs and
establish a robust structure for decision-making. Let us understand this through
an ice cream example in which there are three sizes of ice cream: small (550
grams), medium (700 grams), and large (900 grams), which cost approximately $7,
$9, and $11, respectively. Players A, B, and C have $5, $3, and $4, respectively,
and they form the grand coalition, which can afford the large size. An outcome is
a vector X = (Xa, Xb, Xc) with Xa + Xb + Xc = 900. Individual rationality
requires Xa ≥ v(A), Xb ≥ v(B), and Xc ≥ v(C); since no player can afford any ice
cream alone, these conditions only require the shares to be nonnegative. If these
three players play the game, then the possible outcomes include the following:
Case 1: When the ice cream is divided equally between the three players A, B, and
C, the outcome is (900/3, 900/3, 900/3), so each player obtains 300 grams. This
outcome is “nonstable.”
Case 2: Let us assume that A and C form a group without B. Their total money sums
up to $9, so they obtain 700 grams of ice cream. With (Xa = 350, Xc = 350), each
of them receives more than in Case 1, so A leaves B and forms a group with C to
get a better result, but this outcome is still “not stable.”
Case 3: Let us suppose that there is unequal division and a coalition is formed.
A and B form a coalition that excludes C, and their contributions add up to $8,
which lets them buy the small size ice cream (Xa = 350 grams and Xb = 350
grams). If B makes an agreement with A that B will provide A 400 grams and end up
taking 300 grams, it is more profitable than the individual division of the
amount of ice cream. Group deviations can be considered as breaking the group or
dividing the coalition. The set of outcomes obtained after A and B form a group
without C is {A, B} = {400, 300}. Thus, A receives 400 grams and B receives 300
grams, while C does not form a coalition.
In this way, coalitions can be formed and broken. If a member becomes more
influential or gains more bargaining power, they may choose to break away or form
a new coalition that better serves their interests. Players may receive better offers
from players outside the existing coalition.
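The idea that a division is unstable when some group can do better on its own can be sketched as a search for blocking coalitions. The coalition worths below are an assumption: they are derived from the budgets ($5, $3, $4) and the prices ($7 for 550 grams, $9 for 700 grams, $11 for 900 grams) by taking the largest size each coalition's pooled money can buy. The function name and the second division are hypothetical illustrations, not results from the book.

```python
# Assumed coalition worths in grams, derived from the stated budgets and prices:
# A+B = $8 -> small, A+C = $9 -> medium, B+C = $7 -> small, A+B+C = $12 -> large.
v = {
    frozenset("A"): 0,
    frozenset("B"): 0,
    frozenset("C"): 0,
    frozenset("AB"): 550,
    frozenset("AC"): 700,
    frozenset("BC"): 550,
    frozenset("ABC"): 900,
}


def blocking_coalitions(x):
    """Coalitions whose worth is strictly more than their members get under x."""
    return [sorted(c) for c in v if sum(x[p] for p in c) < v[c]]


if __name__ == "__main__":
    equal_split = {"A": 300, "B": 300, "C": 300}
    print("equal split blocked by:", blocking_coalitions(equal_split))

    # Hypothetical unequal division of the 900 grams, for comparison only.
    unequal_split = {"A": 350, "B": 200, "C": 350}
    print("unequal split blocked by:", blocking_coalitions(unequal_split))
```

Under these assumed worths, the equal split is blocked by A and C, who together hold only 600 grams but could obtain 700 grams by themselves, which matches the move from Case 1 to Case 2.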
2.2.3 Core
The core is a popular solution concept in game theory, and it has important
implications for understanding how players can cooperate effectively to achieve
the best outcomes. In other words, it can be considered the set of all stable
outcomes; the solution concept helps to identify feasible and stable outcomes
when players form coalitions to achieve common goals. Every outcome in the core
is one from which no coalition wants to deviate. A payoff vector in the core
satisfies rationality and feasibility for every coalition, and the total payoff
of the members of any coalition C is not less than V(C). The core might be empty,
which indicates the presence of instability and means that no payoff vector
satisfies these conditions. So, other solution concepts such as the Shapley value
or nucleolus
might help in the game. Mathematically, it is written as Core Core (N,v) =
n
{(x1, …., xn) ∈ Rn: ∑ x, xi = V(N); Xi∈C, Xi ≥ v(C) ∀C ⊆ N}.
1
i
Or G = { ∑ x ∈ C, such that Xi>= V(C) for all c ≤ n}, where C is the grand coalition
0
formed and G is the denotion for the core (Hazra & Anjaria, 2022).
Consider a coalition game with a set of players {1, 2, 3} and a characteristic
function v(S) that assigns a value to each coalition S of players:
v({1}) = 3, v({2}) = 4, v({3}) = 5, v({1, 2}) = 8, v({1, 3}) = 8, v({2, 3}) = 9,
v({1, 2, 3}) = 12
Now, let us check whether the grand coalition {1, 2, 3}, whose worth is 12, admits an
allocation in the core. The core conditions require that no smaller coalition has an
incentive to deviate. For example, consider the coalition {1, 2}, whose worth is 8. If
players 1 and 2 act alone instead, their combined worth is v({1}) + v({2}) = 3 + 4 = 7.
Since 8 is greater than 7, they want to collaborate with each other, and the same holds
for the other pairs. For the grand coalition not to be blocked, however, every pair must
also receive at least its own worth: x1 + x2 ≥ 8, x1 + x3 ≥ 8, and x2 + x3 ≥ 9. Adding
these three inequalities gives 2(x1 + x2 + x3) ≥ 25, that is, x1 + x2 + x3 ≥ 12.5, which
exceeds v({1, 2, 3}) = 12. Some pair can therefore block every division of 12, so in
this example the core is empty. If an allocation of the grand coalition's worth is not
blocked by any subset of players, then it is in the core.
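The blocking test described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names `blocking_coalitions` and `is_in_core` are hypothetical, and the characteristic function is the three-player example from the text.

```python
from itertools import combinations

# Characteristic function of the three-player example (values from the text).
v = {
    frozenset(): 0,
    frozenset({1}): 3, frozenset({2}): 4, frozenset({3}): 5,
    frozenset({1, 2}): 8, frozenset({1, 3}): 8, frozenset({2, 3}): 9,
    frozenset({1, 2, 3}): 12,
}


def blocking_coalitions(x, v, tol=1e-9):
    """Return the proper coalitions whose worth exceeds what x pays their members."""
    players = sorted(x)
    blockers = []
    for size in range(1, len(players)):
        for members in combinations(players, size):
            if sum(x[i] for i in members) < v[frozenset(members)] - tol:
                blockers.append(set(members))
    return blockers


def is_in_core(x, v, tol=1e-9):
    """x is in the core if it distributes exactly v(N) and no coalition blocks it."""
    efficient = abs(sum(x.values()) - v[frozenset(x)]) <= tol
    return efficient and not blocking_coalitions(x, v, tol)


equal_split = {1: 4, 2: 4, 3: 4}
print(blocking_coalitions(equal_split, v))  # coalitions that would deviate
print(is_in_core(equal_split, v))
```

For the equal split (4, 4, 4), player 3 alone and the pair {2, 3} both block, so the allocation is not in the core. Because the three pair constraints sum to 25 while only 2 × 12 = 24 is available, the same check fails for every division of 12 in this game.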
The core has the following properties:
Feasibility: The total payoff allocated to the players does not exceed the value that
is available. In other words, the sum of the payoffs allocated to all players is at
most v(N), the value of the grand coalition. Feasibility is important because it
reflects the practical limitations of resource allocation within coalitions. It also
ensures that the payoff allocation is realistic.
Individual Rationality: The payoff allocation must be individually rational, meaning
that no player should receive less than their individual worth v({i}).
Not all coalitional games have a nonempty core. An empty core means that no payoff
allocation satisfies the feasibility and rationality conditions for every coalition at
the same time. Also, each player in a core allocation must receive at least the amount
that they can obtain individually. In certain cases, the core might consist of a single
allocation, meaning that there is only one stable outcome.
2.2.4 Epsilon Core
The epsilon core is an extension of the core concept in cooperative game theory. It
addresses situations where the core of a coalitional game might be empty, meaning that
there is no feasible payoff allocation that satisfies every coalition. It permits
players to receive payoffs slightly lower than their actual worth in the game, as long
as the shortfall (epsilon) stays within a specified bound. It helps to capture nearly
stable outcomes in situations where the core is empty.
The instability displayed by various examples and some large games (such as the
glove market example) can be somewhat overcome using the notion of the
epsilon-approximate core. Given any number ε ≥ 0, an allocation x is in the
epsilon-approximate core of the coalitional game (N, v) if
$$\sum_{i \in C} x_i \ge v(C) - \varepsilon\,|C| \quad \forall C \subseteq N,$$
where N is the set of players, C is a coalition with worth v(C) and |C| members, and
x_i is the payoff of player i. In the core, no coalition has an incentive to deviate
from the existing arrangement. In the epsilon core, a few coalitions may still earn a
higher payoff by breaking away, but the gain is not large: it is bounded by ε for each
member of the coalition. Epsilon is usually taken to be very small. The smallest
admissible ε is characterized as an infimum, that is, the greatest number that is no
larger than any element of a set, and an infimum need not itself belong to that set.
In the epsilon core as well, the outcomes belong to the feasible set. Like the core,
it may consist of a single allocation.
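The epsilon-approximate core condition can be checked directly. The sketch below reuses the three-player game of Sect. 2.2.3, with values v({1}) = 3, v({2}) = 4, v({3}) = 5, v({1, 2}) = 8, v({1, 3}) = 8, v({2, 3}) = 9, and v({1, 2, 3}) = 12. The function name `in_epsilon_core` is hypothetical, and requiring x to distribute exactly v(N) is an assumption made here for illustration.

```python
from itertools import combinations

# Three-player game from Sect. 2.2.3 (values from the text).
v = {
    frozenset({1}): 3, frozenset({2}): 4, frozenset({3}): 5,
    frozenset({1, 2}): 8, frozenset({1, 3}): 8, frozenset({2, 3}): 9,
    frozenset({1, 2, 3}): 12,
}


def in_epsilon_core(x, v, eps, tol=1e-9):
    """Check sum_{i in C} x_i >= v(C) - eps * |C| for every nonempty coalition C."""
    players = sorted(x)
    # Assumption: an allocation must distribute exactly the worth of the grand coalition.
    if abs(sum(x.values()) - v[frozenset(players)]) > tol:
        return False
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            paid = sum(x[i] for i in members)
            if paid < v[frozenset(members)] - eps * size - tol:
                return False
    return True


x = {1: 3, 2: 4, 3: 5}
for eps in (0.0, 0.25, 0.5):
    print(eps, in_epsilon_core(x, v, eps))
```

The allocation (3, 4, 5) pays the pair {1, 2} only 7 against a worth of 8, so it fails for ε = 0 and ε = 0.25 and passes for ε = 0.5, where the pair's allowed shortfall is 0.5 × 2 = 1.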
2.2.5 Fairness
Fairness involves treating individuals or groups in a morally right and unbiased manner,
ensuring that everyone is given equal opportunities and consideration (Narahari,
2022). The main concern in fairness within coalitional games is how to allocate the
value or payoff of the grand coalition (the entire set of players) among the players
in a way that is considered equitable. In many organizations and joint ventures, the
amount is divided in a ratio and shared between the members. Any surplus
is also distributed among the members in their collaboration ratio
(Hazra & Anjaria, 2022). Therefore, fairness can be understood as the amount that
every individual should receive once their collaboration is formed. For example, if
an individual has contributed more than the other members, that individual
should earn more in terms of profit.
Let us take an example where N is the set of members, N = {1, 2}, with
v({1}) = v({2}) = 50 for players 1 and 2 individually and v({1, 2}) = 200 if they form a
group. Let us consider the outcome in which player 1 earns 150 and player 2 earns 50.
It is unfair, as player 1 earns a larger payoff although the two players are identical;
however, this solution is stable. Any outcome generated from the grand coalition
satisfies X1 + X2 = v({1, 2}) = 200, and stability requires X1 ≥ 50 and X2 ≥ 50. Other
stable outcomes are fairer, such as X = (100, 100), in which both players receive an
equal share.
Players have marginal contributions, which are defined as the worth of the group minus
the worth of the group without the player:
Marginal contribution of a player = worth of the coalition − worth of the coalition
without that player
For player 1, we obtain v({1, 2}) − v({2}) = 200 − 50 = 150, and for player 2,
v({1, 2}) − v({1}) = 200 − 50 = 150. The payoff should be proportional to each
player’s contribution, which is called fairness; since both contributions are equal,
the equal division (100, 100) is the fair outcome. If player 1 is alone in the game and
there is no coalition, the marginal contribution of player 1 is v({1}) − v(ϕ) = 50,
because the empty coalition is worth nothing. Player 1’s fair share is the average of
these two marginal contributions, (50 + 150)/2 = 100. Fairness in game theory is often
associated with solution concepts that distribute
the payoffs among players in a way that is perceived as equitable or justifiable.
2.2.6 Nontransferable Utility
Nontransferable utility (NTU) describes situations where individuals or agents
cannot exchange or transfer their utility or preferences directly with each other,
whereas “transferable utility” refers to situations where the value or utility
associated with goods can be transferred or shared between individuals. Transferable
utility typically occurs in situations where goods are divisible. Nontransferable
utility is particularly relevant in the context of bargaining, negotiations, and
allocation problems. Side payments, or negotiations over them, are not allowed under
nontransferable utility. For example, in a project
where team members have distinct roles and responsibilities, the contributions of
each member are not interchangeable, which makes it a case of nontransferable
utility. This concept has applications in areas such as economics, negotiation,
coalition formation, and resource allocation. Key characteristics of nontransferable
utility:
(i) The value formed by the coalition cannot be freely divided among the members. The
benefit of a coalition may accrue only to certain members, or to none at all, which
results in indivisibility.
(ii) Since utilities are not transferable, they cannot be easily compared across
different players. This lack of comparability makes it challenging to establish a
common measure of value or contribution. Cooperative games with nontransferable
utility may require their own variants of solution concepts such as the core, the
nucleolus, and the Shapley value.
(iii) The allocation of resources or outcomes in NTU settings can be more complex
than under transferable utility. Here, redistribution between different members
is not straightforward.
(iv) Players may have specific roles, skills, or responsibilities that make them
essential; thus, sharing of utility is difficult. Nontransferable
utility settings are typically studied within the framework of cooperative
game theory.
2.2.7 Shapley Value
The Shapley value was introduced by the economist Lloyd Shapley in the early 1950s. It
provides a fair way to distribute the total value generated by the strategies of
different players based on their contributions to the game (Narahari, 2022). The
Shapley value has been widely
applied in various fields, including economics, political science, and computer
science, to allocate resources, assess contributions, analyze cooperative scenarios, and
even solve allocation problems in multiagent systems. According to the Shapley
value, the amount that player i is given in a coalitional game (N, v) with n players is
$$\phi_i(v) = \sum_{C \subseteq N \setminus \{i\}} \frac{|C|!\,(n - |C| - 1)!}{n!}\,\bigl(v(C \cup \{i\}) - v(C)\bigr).$$
Let’s consider an example with three players A, B, and C. For any single
player, v({A}) = 300, v({B}) = 400, v({C}) = 500. For any pair of players, v({A,
B}) = 900, v({A, C}) = 700, v({B, C}) = 1000, and v({A, B, C}) = 2000. To calculate the
marginal contribution of a player, we consider every order in which the three players
can join the coalition; there are 3! = 6 such orders. In each order, the marginal
contribution of a player is the worth of the coalition once that player has joined
minus its worth just before, with v(ϕ) = 0. The cases are as follows:
Case 1: Considering A’s cases
(A, B, C) and (A, C, B) → v({A}) − v(ϕ) = 300 − 0 = 300
(B, A, C) → v({A, B}) − v({B}) = 900 − 400 = 500
(C, A, B) → v({A, C}) − v({C}) = 700 − 500 = 200
(B, C, A) and (C, B, A) → v({A, B, C}) − v({B, C}) = 2000 − 1000 = 1000.
Case 2: Considering B’s cases
(B, A, C) and (B, C, A) → v({B}) − v(ϕ) = 400 − 0 = 400
(A, B, C) → v({A, B}) − v({A}) = 900 − 300 = 600
(C, B, A) → v({B, C}) − v({C}) = 1000 − 500 = 500
(A, C, B) and (C, A, B) → v({A, B, C}) − v({A, C}) = 2000 − 700 = 1300.
Similarly, for C, the six orders can be considered.
The Shapley value for A is S(A) = (300 + 300 + 500 + 200 + 1000 + 1000)/6 = 3300/6 = 550.
Similarly, for B, the Shapley value is 4500/6 = 750, and for C, it is 4200/6 = 700.
Therefore, according to the Shapley value, A’s fair share of the total value is 550,
B’s fair share is 750, and C’s fair share is 700; the three shares add up to
v({A, B, C}) = 2000. The Shapley value considers how much each player
contributes to the value of a coalition when joining it. To calculate it, we consider
the players or subsets of players who come together to form groups,
where all possible coalitions are considered, ranging from individual players to the
full grand coalition containing all players. Then, every possible sequence in which
players could join the coalition is considered. A value is associated with each
player, and this value could be in terms of monetary payoffs, resources, or any
other relevant measure. To calculate the Shapley value for each player, the average
of their marginal contributions across all possible permutations of player orders
is taken.
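The averaging over player orders can be sketched as follows. This is a minimal illustration for small games only, since it enumerates all n! orders; the function name `shapley_values` is hypothetical, and the characteristic function is the A, B, C example from the text with v(ϕ) = 0 assumed.

```python
from fractions import Fraction
from itertools import permutations

# Characteristic function of the A, B, C example (values from the text).
v = {
    frozenset(): 0,
    frozenset("A"): 300, frozenset("B"): 400, frozenset("C"): 500,
    frozenset("AB"): 900, frozenset("AC"): 700, frozenset("BC"): 1000,
    frozenset("ABC"): 2000,
}


def shapley_values(players, v):
    """Average each player's marginal contribution over all joining orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        joined = frozenset()
        for p in order:
            # Marginal contribution of p to the players who joined before it.
            totals[p] += v[joined | {p}] - v[joined]
            joined = joined | {p}
    return {p: totals[p] / len(orders) for p in players}


phi = shapley_values("ABC", v)
print({p: int(share) for p, share in phi.items()})  # {'A': 550, 'B': 750, 'C': 700}
print(sum(phi.values()))  # 2000
```

Exact fractions are used so that the shares can be compared with v({A, B, C}) without rounding error. The shares sum to 2000, which is the efficiency property of the Shapley value.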
Properties of Shapley values:
(i) The Shapley value works on the principle of the average marginal contribution of
a player to all possible coalitions they can join. It considers how the player’s
presence impacts the payoff of other players as they form different coalitions.
(ii) The Shapley value always exists and is unique for a given game, so it provides
a consistent solution. The Shapley value is considered a fair allocation method.
(iii) If a player’s contribution does not affect the worth of any coalition, that
player is assigned a Shapley value of zero, which ensures that a player who adds
nothing does not receive any benefit; such players are called null players.
(iv) The solution is efficient, which means that the total worth of the grand
coalition is fully distributed among the players.
(v) Calculating the Shapley value may be computationally expensive, especially
for large games.
(vi) The Shapley value treats players symmetrically, meaning that if two players
make equal contributions to every coalition, then they receive the same Shapley
value.
(vii) In monotone games, where adding a player never lowers the worth of a
coalition, the Shapley value of each player is nonnegative, meaning that no player
receives a negative allocation.
(viii) A related property is the dummy property, which concerns a player who has no
impact on a coalition, that is, no effect on the total worth generated by that
coalition.
(ix) The marginal contribution of a player to a coalition is the increase in value
that occurs when that player joins the coalition compared to when they are not
part of it, that is, what they contribute to the group.
Other solution concepts are similar to the Shapley value, provide alternatives to it,
and handle fairness and stability. One of them is the nucleolus, a well-known solution
concept in cooperative game theory that, like the Shapley value, provides a way to
allocate the value or payoff of a coalitional game among the players. It was introduced
by David Schmeidler in 1969 as an alternative to the Shapley value and
other solution concepts. It works on the “idea of excess.” The excess of a coalition
under a given allocation is the difference between the worth of the coalition and the
total payoff that its members receive, so it measures how dissatisfied the coalition
is. The nucleolus ensures that the payoff allocation is as fair as possible in the
sense that the maximum excess among coalitions is minimized. The nucleolus provides an
interesting alternative to other solution concepts, such as the Shapley value or the
core, and can be particularly useful in situations where players have different minimum
requirements or expectations for joining coalitions. Another such concept is the kernel.
Both the core and the kernel deal with the stability
and fairness of payoff allocations in coalitional games. The kernel is not contained in
the core in general, but it contains the nucleolus, and when the core is nonempty, the
kernel and the core have allocations in common.
References
Hazra, T., & Anjaria, K. (2022). Applications of game theory in deep learning: A survey. Multimedia
Tools and Applications, 81(6), 8963–8994.
Narahari, Y. (2022). Game theory lecture notes.
Rezek, I., Leslie, D. S., Reece, S., Roberts, S. J., Rogers, A., Dash, R. K., & Jennings, N. R. (2008).
On similarities between inference in game theory and machine learning. Journal of Artificial
Intelligence Research, 33, 259–283.
Chapter 3
Noncooperative Game Theory
In game theory, a noncooperative game is a game in which there is competition
between the members, but they do not form groups. In contrast to cooperative game
theory, players cannot form coalitions, and they also cannot make binding agreements.
Noncooperative game theory is used in the analysis of various real-world problems,
including business competition, auctions, pricing decisions, international relations,
and many other strategic interactions. The solution concepts include the Nash
equilibrium and strategies such as the minimax mixed strategy, which is due to
John von Neumann. The key difference is that players are not bound by any
agreement. There is no compulsion for them to form groups and share the inputs and
outputs. Every player is independent of the others. This is termed
“independence.” The term noncooperative game theory was first used in 1951 by John Nash
in an article in the journal Annals of Mathematics. A noncooperative game is specified
by the number of players, their objective functions, the actions available to and the
constraints imposed on the players, and the outcome of any probabilistic event.
3.1 Comparing Cooperative and Noncooperative Theory and Their Strategies
John Nash made a statement that explains the difference between the two
theories. The statement is as follows: “This (cooperative game) theory is based on
an analysis of the interrelationships of the various coalitions that can be formed by
the players of the game. Our (noncooperative game) theory, in contradistinction, is
based on the absence of coalitions in that it is assumed that each participant acts
independently, without collaboration or communication with any of the others.”
There is no binding agreement, and players have to anticipate their opponents’
actions. Noncooperative game theory is concerned with understanding strategic
interactions in competitive environments without formal agreements, while
cooperative game theory focuses on the study of how players can work together and
distribute the gains from cooperation. Noncooperative game theory therefore assumes
less coordination among players than cooperative game theory. In noncooperative game
theory, the concept of Nash equilibrium is used. In cooperative games, the concept of
a stable solution is generally more complex. One common solution concept is the core,
and another is the Shapley value. In noncooperative games, players make decisions
without any direct communication or cooperation with each other. Communication and
cooperation between players are essential in cooperative game theory. Classic examples
of cooperative games include the bargaining problem, coalition formation,
and the assignment problem. Examples of noncooperative games include the
prisoner’s dilemma and the Cournot duopoly.
3.2 Nash Equilibrium
A Nash equilibrium in game theory describes a situation in which
each player’s strategy is the best possible choice given the strategies
selected by all other players (Goeree et al., 2002). It is an outcome
in which no player has an incentive to change his/her behavior, given the actions of
the other players. Put simply, it is a state where no player can improve their position
by changing their strategy while the others keep their strategies unchanged. Not all
games have a Nash equilibrium in pure strategies, and some games have multiple
equilibria. Nash equilibrium provides a mutually optimal solution, meaning that no
player can increase their payoff by deviating alone, assuming that there is no change
in the other players’ actions. It is a
stability concept and a regret-free concept, as players are “best responders” to
each other’s strategies. It does not necessarily lead to the best overall outcome for
all players, and it may not capture all aspects of real-world behavior. It has various
applications in the fields of economics, politics, biology, etc. Nash
equilibrium helps to analyze bidding strategies in auctions. In evolutionary game
theory, Nash equilibrium models are used to study the evolution of behaviors and
strategies in populations of organisms. Nash equilibrium has applications in
political science to study strategic voting behavior in elections and the formation of
political coalitions. Companies often use game theory, including Nash equilibrium, to
make strategic decisions about pricing, advertising, and product differentiation. It
helps predict pricing strategies and market shares.
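The verbal definition can also be stated as an inequality. The notation below is assumed here for illustration and is not taken from the text: s_i denotes the strategy of player i, s_{-i} the strategies of all other players, and u_i the payoff of player i. A strategy profile s^* is a Nash equilibrium if

$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \text{for every player } i \text{ and every strategy } s_i \text{ available to } i.$$

The inequality says that, holding the other players at their equilibrium strategies, no single player can gain by switching to a different strategy.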
3.3 Mixed Strategies
In a mixed strategy, players randomize their choices with specific probabilities. Mixed
strategy equilibria are particularly relevant in games where there is not a clear
dominant strategy for any player. In a mixed strategy equilibrium, players assign
probabilities to their available strategies. For each player, the expected value or
expected payoff is calculated by multiplying the probabilities and payoffs and summing
the products. The mixed strategy Nash equilibrium captures uncertainty and
unpredictability in decision-making.
It is a way to find a balanced way to play a game when the players have
different options and preferences. Each player aims to maximize their expected
payoff. Identifying a mixed strategy Nash equilibrium involves a series of steps that
lead to the optimal combination of strategies of all the players:
1. Recognize the players, their available strategies, and the associated payoff
matrix.
2. Assign a probability to each strategy of each player. These probabilities must sum
to 1 for each player, and at equilibrium no player can increase their expected payoff
by moving away from their chosen strategy.
3. Solve for the probabilities that each player allocates to their strategies so that
their expected payoff is maximized, iterating through the candidates if necessary.
Let us consider a popular game, Battle of the Sexes, which illustrates the mixed
strategy Nash equilibrium.
Battle of the Sexes is a classic game-theoretic scenario that explores the dynamics
of coordination and conflicting interests between two players. The two
players have different preferences about the event that they want to attend.
Along with cooperation, there is also competition. There are two choices: either go
to a fight match or go to the opera. The man prefers to go to the fight. The woman
wants to go to the opera, and both prefer being together. A higher number indicates a
more preferred outcome. The game has two pure strategy Nash equilibria, in which both
players attend the same event, and one mixed strategy equilibrium. In each pure
strategy equilibrium, one player has a higher payoff than the other. If both players
go to the fight, the payoffs are (200, 100), and if both go to the opera, the payoffs
are (100, 200); these are the highest combined payoffs. But if the man chooses the
fight and the woman chooses the opera, or the other way round, they are apart and each
receives a payoff of 0.
Table 3.1 explains the Nash equilibrium concept by a popular example of a Battle
of Sexes game.
Assume that the man goes to the fight with probability q and to the opera with
probability 1 − q. Similarly, the woman goes to the fight with probability p and to the
opera with probability 1 − p. To obtain an expected value, each payoff is multiplied by
the probability of the corresponding outcome. For the man, the payoff for meeting at
the opera is 100, and the payoff for meeting at the fight is 200. For the woman,
meeting at the opera gives a payoff of 200, and meeting at the fight gives 100. For
(fight, opera), the multiplication of probabilities yields q ∗ (1 − p), and so on. The
man’s expected value for attending the fight is 200 ∗ p, because he meets the woman
there with probability p, and for attending the opera it is 100 ∗ (1 − p). In a mixed
equilibrium he must be indifferent between the two, so 200 ∗ p = 100 ∗ (1 − p), which
gives p = 1/3.
Table 3.1 Payoff matrix of Battle of Sexes game
                  Player 2 (woman)
Player 1 (man)    Fight (p)     Opera (1 − p)
Fight (q)         (200, 100)    (0, 0)
Opera (1 − q)     (0, 0)        (100, 200)
For the woman, going to the fight is associated with an expected reward of 100 times
q, while going to the opera yields 200 times (1 − q), because the man is at the fight
with probability q. Setting 100 ∗ q = 200 ∗ (1 − q) and solving for q gives q = 2/3. In
the mixed strategy equilibrium, each player therefore attends their preferred event
with probability 2/3.
For the woman, if q > 2/3, then the fight is chosen, but if q < 2/3, the opera is
chosen. For the man, if p > 1/3, then he goes to the fight, but if p < 1/3, he goes to
the opera.
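The indifference conditions for Table 3.1 can be solved mechanically. The sketch below assumes a 2 × 2 game with an interior mixed equilibrium (nonzero denominators); the function name `mixed_equilibrium_2x2` is hypothetical. Note that the man's indifference pins down the woman's probability p, and the woman's indifference pins down the man's probability q.

```python
from fractions import Fraction

# Payoffs from Table 3.1, indexed [man's choice][woman's choice]; 0 = fight, 1 = opera.
man = [[200, 0], [0, 100]]
woman = [[100, 0], [0, 200]]


def mixed_equilibrium_2x2(row_payoff, col_payoff):
    """Return (q, p): the row and column players' probabilities of action 0."""
    # p makes the row player indifferent between row 0 and row 1.
    p = Fraction(
        row_payoff[1][1] - row_payoff[0][1],
        row_payoff[0][0] - row_payoff[0][1] - row_payoff[1][0] + row_payoff[1][1],
    )
    # q makes the column player indifferent between column 0 and column 1.
    q = Fraction(
        col_payoff[1][1] - col_payoff[1][0],
        col_payoff[0][0] - col_payoff[1][0] - col_payoff[0][1] + col_payoff[1][1],
    )
    return q, p


q, p = mixed_equilibrium_2x2(man, woman)
man_expected = 200 * q * p + 100 * (1 - q) * (1 - p)
woman_expected = 100 * q * p + 200 * (1 - q) * (1 - p)
print(q, p, man_expected, woman_expected)  # 2/3 1/3 200/3 200/3
```

Each player's expected payoff in the mixed equilibrium is 200/3, which is lower than what either player receives in the two pure strategy equilibria, where the payoffs are 200 and 100.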
A game can have multiple Nash equilibria or, in pure strategies, none at all.
Moreover, a Nash equilibrium does not guarantee an optimal solution; it
can produce an outcome that is suboptimal for the players.
There are various examples of games in noncooperative game theory:
Spotlight Game The “Spotlight game” is a popular example in game theory that
illustrates the concept of mixed strategy Nash equilibrium. There are two players,
namely, Player 1 and Player 2, and each has two options: to go or to stop; that is,
they can either move or stay. This results in four outcomes that are discussed
in the form of cases below. In the payoff matrix, the first
value in each cell represents the payoff for Player 1, and the second value shows the
payoff for Player 2. If both players choose “Go,” they collide, and both receive the
most negative payoff. If both players choose “Stop,” neither makes progress, and both
receive a smaller negative payoff. If one player goes while the other stops, the player
who goes receives 1, and the player who stops receives 0.
Case 1: Both players go and enter the collision state.
Case 2: Player 1 goes and Player 2 stops.
Case 3: Player 1 stops and Player 2 goes.
Case 4: Both players stop.
The second and third cases are pure strategy Nash equilibria, where the outcomes
are not negative, and the first and fourth cases lead to negative results. A mixed
strategy Nash equilibrium also occurs when both players randomize between “Go” and
“Stop.”
Table 3.2 explains the Nash equilibrium concept by a popular example of
Spotlight game.
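The mixed strategy equilibrium of the Spotlight game can be worked out from the payoffs (Go, Go) = (−6, −6), (Go, Stop) = (1, 0), (Stop, Go) = (0, 1), and (Stop, Stop) = (−2, −2). As a sketch, let p denote the probability with which Player 2 chooses “Go”; the symbol p is introduced here for illustration. Player 1 is indifferent between “Go” and “Stop” when both give the same expected payoff:

$$-6p + 1\,(1 - p) = 0 \cdot p - 2\,(1 - p) \;\Longrightarrow\; 1 - 7p = -2 + 2p \;\Longrightarrow\; p = \tfrac{1}{3}.$$

Because the game is symmetric, the same calculation applies to Player 1, so in the mixed equilibrium each player goes with probability 1/3 and stops with probability 2/3. The expected payoff of each player is then 1 − 7/3 = −4/3.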
Table 3.2 Payoff matrix of Spotlight game
            Player 2
Player 1    Go          Stop
Go          (−6, −6)    (1, 0)
Stop        (0, 1)      (−2, −2)
Stag Hunt Game The “stag hunt” game is generally used to explain the “Nash
equilibrium” concept. In the stag hunt game, there are two pure strategy Nash
equilibria. It is a scenario where two individuals must decide whether to collaborate
or act independently, with the goal of achieving the best possible outcome. There are
two hunters who have two options: either hunting a stag or hunting a rabbit (hare),
which is smaller and easier to catch than a stag. Hunting a rabbit can be done
individually with less effort but generates a lower payoff, whereas hunting a stag
requires more effort and mutual coordination between the players but provides a higher
payoff. If both players decide to hunt the stag, they each receive a payoff of 50, and
the solution is in Nash equilibrium, because a player who switches to the rabbit would
receive only 20. If one player chooses to hunt the stag and the other chooses to hunt
the rabbit, the player who hunted the rabbit receives 20, while the other receives 0.
The stag hunter then prefers to switch to the rabbit to obtain 10 instead of 0, so
both hunt rabbit, and this solution is also in Nash equilibrium. We get to know that
(50, 50) and (10, 10) are in Nash equilibrium and (20, 0) and (0, 20) are not. For
example, if X hunts rabbit and Y hunts stag, X earns 20, but Y will want to change the
plan and hunt rabbit to get 10. In business, companies often face decisions about
whether to collaborate with competitors: collaborating corresponds to hunting the
stag, while acting alone corresponds to hunting the rabbit. Conservation efforts can
be seen as a stag hunt scenario. Individuals or nations
may need to decide whether to address global environmental issues such as climate
change or resource depletion (hunting a stag) or pursue short-term interests that
could harm the environment (hunting a hare).
Table 3.3 explains the Nash equilibrium concept by a popular example of the stag
rabbit hunt game.
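The pure strategy equilibria of small payoff matrices such as Tables 3.2 and 3.3 can be found by checking every cell for profitable unilateral deviations. The following is a minimal sketch; the function name `pure_nash_equilibria` is hypothetical, and the matrices are copied from the two tables.

```python
def pure_nash_equilibria(payoffs):
    """Return all cells (r, c) from which neither player gains by deviating alone.

    payoffs[r][c] is a pair (row player's payoff, column player's payoff).
    """
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            # Row player compares row r with every other row in column c.
            row_best = all(payoffs[r][c][0] >= payoffs[k][c][0] for k in range(rows))
            # Column player compares column c with every other column in row r.
            col_best = all(payoffs[r][c][1] >= payoffs[r][k][1] for k in range(cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


# Table 3.2: index 0 = Go, 1 = Stop.
spotlight = [[(-6, -6), (1, 0)], [(0, 1), (-2, -2)]]
# Table 3.3: index 0 = stag, 1 = rabbit.
stag_hunt = [[(50, 50), (0, 20)], [(20, 0), (10, 10)]]

print(pure_nash_equilibria(spotlight))  # [(0, 1), (1, 0)]
print(pure_nash_equilibria(stag_hunt))  # [(0, 0), (1, 1)]
```

For the Spotlight game, the equilibria are the two cells in which exactly one player goes. For the stag hunt, they are the cells in which both players make the same choice.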
Table 3.3 Payoff matrix of stag hunt game
Player X/Player Y    Y hunts stag    Y hunts rabbit
X hunts stag         50, 50          0, 20
X hunts rabbit       20, 0           10, 10
Rock Paper Scissors Rock-paper-scissors is a classic hand game played in different
countries that provides a simple example for game theory and decision-making
discussions. It is usually played between two people who simultaneously
form one of three shapes with an outstretched hand. The three actions are “rock,”
“paper,” and “scissors.” If Player 1 chooses scissors and Player 2
chooses rock, then Player 1 loses and Player 2 wins. If Player 1 chooses scissors
and Player 2 chooses paper, then Player 1 wins. A few cases are discussed below:
• Case 1: (paper, scissors) player with scissors will win.
• Case 2: (paper, rock) player with paper will win.
• Case 3: (rock, scissors) player with rock will win.
There are three outcomes: win, lose, and tie. A tie decides nothing, so the players
may restart the match. Rock-paper-scissors is an essential game in
game theory because it is a noncooperative two-player game with no pure strategy
Nash equilibrium; its only Nash equilibrium is in mixed strategies, with each action
played with probability 1/3. A pure strategy Nash equilibrium is a combination of pure
strategies in which no player can improve their outcome by changing their strategy
alone. There is also no concept of
collaboration between teams or players. Let us go through a simple example of a
rock-paper-scissors game between two players, Alice and Christine. In round 1,
Alice chooses rock, while Christine chooses scissors. The result is that Alice
wins because “rock” crushes “scissors.” In round 2, Christine chooses rock and
Alice chooses scissors. The result is that Christine wins because “rock” crushes
“scissors.” Similarly, the game goes on for multiple rounds until there is a clear
winner. Rock-paper-scissors can be used as a simple and unbiased way of making a
random choice, particularly when a truly random source is not available. In various
situations where parties or organizations have conflicts and cannot agree on
a decision, rock-paper-scissors can be used as a way to resolve disputes.
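The equilibrium structure of rock-paper-scissors can be checked numerically. The sketch below assumes payoffs of +1 for a win, 0 for a tie, and −1 for a loss; these values and the helper names are illustrative and are not taken from the text.

```python
from fractions import Fraction

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, 0 tie, -1 loss (assumed)."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def has_pure_equilibrium():
    """True if some pair of actions leaves neither player wanting to deviate alone."""
    for a in ACTIONS:
        for b in ACTIONS:
            p1_ok = all(payoff(a, b) >= payoff(x, b) for x in ACTIONS)
            p2_ok = all(payoff(b, a) >= payoff(y, a) for y in ACTIONS)
            if p1_ok and p2_ok:
                return True
    return False


# Expected payoff of each pure action against an opponent who mixes uniformly.
uniform = {a: Fraction(1, 3) for a in ACTIONS}
expected = {a: sum(uniform[b] * payoff(a, b) for b in ACTIONS) for a in ACTIONS}

print(has_pure_equilibrium())  # False
print(all(value == 0 for value in expected.values()))  # True
```

No pair of pure actions is stable, because the loser of any pairing can always switch to the winning action. Against a uniformly mixing opponent every action earns an expected payoff of 0, so no deviation from the uniform mix is profitable.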
Prisoner’s Dilemma The game provides an example of a situation in which the two
players have two choices: either cooperate with or betray each other. The prisoner’s
dilemma was first introduced as a concept in game theory by mathematicians Merrill
M. Flood and Melvin Dresher in 1950. The prisoner’s dilemma has gained significant
attention and popularity in the field of game theory because of its ability to
model situations of cooperation and conflict. It is used to study various economic,
social, and political scenarios, arms races, business competition, environmental
issues, and international relations. It is an important tool in decision-making.
Here, there are two prisoners who cannot communicate with each other. The police do
not have enough evidence to convict them of the main charge, so they make a plan.
They offer each prisoner a set of choices, and in this way, the
payoff matrix is formed.
The possible outcomes are as follows:
• Case 1: If A and B both betray each other, then they will each be given 4 years
in prison.
• Case 2: If A betrays B but B does not speak up, then A will be released from
prison and B will be in prison for 10 years.
• Case 3: If A remains silent but B betrays A, then A will be in prison
for 10 years and B will be set free.
• Case 4: If A and B both stay silent, then they will each serve the lesser
charge of 2 years in prison.
If both prisoners remain silent, that is, they cooperate with each other, neither of
them is convicted of the main charge, but they each receive a moderate sentence for the
lesser charge. If one prisoner betrays the other, the betrayer is set free, while the
other prisoner receives the heaviest punishment. If both of them confess, then
each stays in jail for fewer years than a prisoner who is betrayed while silent, but
for more years than if both had stayed silent. In the payoff matrix, the first entry
refers to prisoner A, and the second entry refers to prisoner B. Both players have a
dominant strategy, that is, a choice that is best for them individually regardless of
what the other does. If both players follow their dominant
strategy, which is to betray, they each serve 4 years. If both
of them cooperate, they receive a moderate sentence of 2 years each. From an
individual perspective, each prisoner does better by betraying the other, as
this could lead to being set free. However, if both prisoners betray
each other, then they both end up with longer sentences than if both had stayed
silent. The prisoner’s dilemma serves as a powerful model to understand situations of
conflict and cooperation and is widely used to analyze various real-life scenarios,
such as business competition, environmental issues, and negotiations. In business, the
prisoner’s dilemma can be used to understand and analyze decisions involving
partnerships between companies, competitive strategies, and negotiation tactics.
Table 3.4 shows the moves of players A and B along with their incentives.
Table 3.4 Payoff matrix of prisoner’s dilemma game
Prisoners A/B              Prisoner B is silent       Prisoner B speaks up
Prisoner A stays silent    Both serve 2 years         A: 10 years, B: 0 years
Prisoner A speaks up       A: 0 years, B: 10 years    Both serve 4 years
Table 3.5 Payoff matrix of friend and foe game
Player A/Player B    Friend    Foe
Friend               2, 2      0, 5
Foe                  5, 0      1, 1
Friend or Foe The “friend and foe” game is inspired by the famous prisoner’s
dilemma: individuals choose either to cooperate (be a friend) or to defect (be a
foe) against their partners. It takes its name from a game show, played over several
rounds, that aired from 2002 to 2003 on the Game Show Network in the USA. Each
player is given two choices: “friend” or “foe.” If both players choose “friend,” both
receive moderate rewards. If one chooses “friend” and the other chooses “foe,” the
player who chose “foe” receives the highest reward, and the loyal player receives
nothing. If both choose “foe,” the rewards are lower than under mutual cooperation.
The game helps researchers to study trust and cooperation patterns among
individuals and companies. It is widely used in economics, behavioral studies,
psychology, and the study of decision-making, because it models how decisions are
made in a competitive environment. Understanding cooperation and competition is
important in organizational settings, and the game can inform collaborations
between organizations by clarifying the outcomes each side can expect.
Table 3.5 shows the moves of players A and B along with their points and penalties.
Matching Pennies Matching pennies is a two-player zero-sum game and an example
of a simultaneous game (Narahari, 2012). Although the game is simple, its
applications extend to a wide range of fields that involve decision-making: it is
used in intrusion detection systems, in the study of evolutionary theory, and in
cryptographic protocols. Let us move through an example. Suppose there are two
players, each holding a penny with two sides, one marked “heads” (H) and the other
“tails” (T). Both choose a side simultaneously, without knowing each other’s move.
If the pennies match (head-head or tail-tail), the opponent pays some pennies to
Player 1. If the pennies differ, Player 1 has to give some pennies to the opponent.
Because one player’s gain is exactly the other’s loss, if both players choose “heads”
or “tails” uniformly at random, the expected payoff of each player is the same,
namely zero. Now suppose there are two players A and B, Player A chooses “heads”
(H), and Player B chooses “tails” (T). In this case, Player A receives a payoff of
−2, and Player B obtains 2; that is, Player B wins 2 points, and Player A loses
2 points. The payoffs in this scenario are (A: −2, B: 2). Both players are trying to
maximize their payoffs. In the payoff matrix, the first value is earned by Player A
and the second value is earned by Player B.
Table 3.6 represents the moves of heads and tails for both players, and the
points they earn are provided in the cells.

Table 3.6 Payoff matrix of matching pennies game

| Player A/Player B | Heads | Tails |
| --- | --- | --- |
| Heads | 2, −2 | −2, 2 |
| Tails | −2, 2 | 2, −2 |
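The expected payoff under random play can be checked numerically. The following is a minimal sketch, assuming the payoffs of Table 3.6; the function name `expected_payoff_a` and its probability arguments are illustrative choices, not part of the original text.

```python
# Payoffs to Player A from Table 3.6 (rows: A plays H, T; columns: B plays H, T).
# The game is zero-sum, so Player B's payoff is the negative of each entry.
PAYOFF_A = [[2, -2],
            [-2, 2]]


def expected_payoff_a(p_heads_a, p_heads_b):
    """Expected payoff to Player A when each player picks heads with the given probability."""
    probs_a = [p_heads_a, 1 - p_heads_a]
    probs_b = [p_heads_b, 1 - p_heads_b]
    return sum(probs_a[i] * probs_b[j] * PAYOFF_A[i][j]
               for i in range(2) for j in range(2))


print(expected_payoff_a(0.5, 0.5))  # both players randomize uniformly
print(expected_payoff_a(1.0, 0.0))  # A always plays heads, B always plays tails
```

When Player A mixes uniformly, the expected payoff stays at zero whatever Player B does, which is why uniform randomization is the safe way to play this game.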
3.4 Sequential Game
A sequential game is one in which the first player chooses an action before the
other players choose theirs. To find optimal strategies, solution concepts such as
Nash equilibrium are used, together with backward induction. Chess, tic-tac-toe, and
poker are some basic sequential games, in which the state of the game changes with
every move. Here, players make decisions after taking into consideration the moves
of previous players. Games that are repeated over several rounds are also examples
of sequential games. Sequential games are represented in extensive form. They are
used in devising business strategies and in making investment decisions. In finance,
organizations form strategies based on market values and on information that
becomes available over time, which helps them in making financial decisions.
Sequential games also help in policymaking: political parties can use them to
consider possible responses and outcomes. In legal institutions, sequential
games are also used, with each next move made on the basis of the information and
evidence available. Combinatorial games are also examples of sequential
games. Simultaneous games, on the other hand, are games where players choose
without prior knowledge of the moves of the other players. Examples of such games
are rock-paper-scissors, the prisoner’s dilemma, the Battle of the Sexes, the
Hawk-Dove game, and the Cournot duopoly game. For analyzing such games, the Nash
equilibrium concept is used. Simultaneous games are applied in auctions where
bidders submit their bids without knowing the bids of the other participants. In
economics, they are used to analyze market interactions and competition among
companies. In cybersecurity, defenders make plans without knowing the
moves of attackers and vice versa.
3.5 Decision Trees
In game theory, decision trees are used in the analysis of sequential games. They
provide a systematic representation of the moves, actions, and their respective
payoffs. The decision tree helps players to understand the consequences of their
actions and make rational decisions based on the strategies of others. The tree
consists of nodes, which record the moves and the succession of events that occur
during the game. The nodes are divided into three types: root nodes,
internal nodes, and terminal nodes. To build the tree, the players are identified
along with their moves and strategies, and payoffs are assigned to each terminal
node. Backward induction then starts from the last level and moves backward,
choosing the best strategy for each player at each node based on their payoffs.
Decision trees are used in bargaining, in producing optimal business strategies, and
in making political decisions. They can also be used in military planning that
considers the potential moves of the opposition, and to analyze investment
decisions and resource management.
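The backward pass described above can be sketched in a few lines. This is an illustrative example only: the nested-dictionary tree format, the move names, and the payoffs are assumptions made for the sketch, not taken from the text.

```python
# A node is either a leaf (a tuple of payoffs, one per player) or a decision
# node of the form {"player": index, "moves": {move_name: child_node}}.
def backward_induction(node):
    """Return (payoffs, path) reached when every player picks their best move."""
    if isinstance(node, tuple):          # terminal node: payoffs are already known
        return node, []
    player = node["player"]
    best = None
    for move, child in node["moves"].items():
        payoffs, path = backward_induction(child)   # solve the subtree first
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [move] + path)
    return best


# Hypothetical two-stage game: player 0 moves first, then player 1 responds.
tree = {"player": 0, "moves": {
    "L": {"player": 1, "moves": {"l": (3, 1), "r": (0, 0)}},
    "R": {"player": 1, "moves": {"l": (1, 2), "r": (2, 3)}},
}}

print(backward_induction(tree))
```

Ties are broken in favor of the first move listed, which is a simplification; a fuller treatment would report every optimal move.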
3.6 Game with Imperfect Information
Games with imperfect information are strategic games in which players have
incomplete or hidden knowledge about the actions of other players or the state of
the game (Hazra & Anjaria, 2022). This uncertainty makes solutions and
decision-making more complex. Such games are usually represented in extensive form,
and players generally reason with probabilities to make decisions.
Imperfect-information games are more complex than perfect-information games, as
they involve uncertainty. Some common examples include poker, certain board games,
auctions, and business negotiations. Analyzing security scenarios, such as intrusion
detection, network defense, and cybersecurity games, involves dealing with uncertain
or hidden information about potential threats and attackers’ actions. Auctions with
incomplete information can also be considered an application of
imperfect-information games. Medical diagnosis and treatment planning often require
dealing with uncertain medical data and hidden patient information.
3.7 Games with Perfect Information
In perfect information games (Hazra & Anjaria, 2022), players have complete
knowledge of the actions taken by other players and of the game’s history. Perfect
information games are often represented in extensive form or as decision trees.
Here, players can observe all moves, and the game is played in a fully known
environment, so there is essentially no uncertainty about the state of the game. The
examples considered here are deterministic, with no random elements or chance
involved. Instances of perfect information games include checkers, tic-tac-toe, and
chess. Solution concepts used in perfect information games include backward
induction and subgame perfect Nash equilibrium.
3.8 Extensive Form Games
The extensive form is a graphical representation used in game theory. It provides
a structured, visual way to analyze the sequential nature of a game and to depict
the sequence of actions, strategies, moves, and payoffs. Games in extensive form are
solved using techniques such as Nash equilibrium and backward induction. Because
the extensive form can also represent imperfect information, it can be used for both
sequential and simultaneous games. These games are generally represented in
the form of game trees. Economic games, such as the Stackelberg model, involve
firms making sequential decisions. Let us consider a simple sequential game where
Player A makes a choice first and then Player B makes a move accordingly. In the
game, suppose Player A chooses between H (head) and T (tail). If Player A chooses
H, Player B can choose between center and down. If Player A chooses T, Player B
can choose between left and right. This information can be drawn as a tree, which
is shown in Fig. 3.1.
3.9 Game Tree
A game tree is a graphical structure that specifies all the states within a game,
the moves, and the penalties or rewards (Ho et al., 2022). Game trees indicate the
complexity of a particular game and allow the outcomes of moves to be estimated,
along with which moves are taken from the set of all possible moves. They trace how
a game can unfold move by move. Games such as tic-tac-toe and chess are analyzed
with game trees. Game trees have certain limitations: searching them can consume
considerable time, and a search may not always yield the optimal solution. A search
also considers the moves of opponents and uses the history and knowledge obtained
to make decisions, and it requires complete knowledge of the legal moves. A game
tree has an initial node that represents the initial state of the game, edges that
represent the moves that can be taken, internal nodes that represent intermediate
states or decision points, and, finally, leaf nodes that hold the final outcomes.
For example, the 8-puzzle can be described with nodes and edges: nodes are the
different configurations of the tiles, and edges are the different actions, such as
moving up or down.

Fig. 3.1 A game tree where A has options of heads and tails and the outcomes by moves
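To make the node and edge view of the 8-puzzle concrete, the sketch below generates the edges leaving one node. It assumes a state is stored as a tuple of nine numbers in row-major order with 0 marking the blank square, and that an action moves the blank; both conventions, and the name `successors`, are choices made for this illustration.

```python
# Each action shifts the blank square by (row offset, column offset).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def successors(state):
    """Return the (action, next_state) edges leaving the node `state`."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    edges = []
    for action, (d_row, d_col) in MOVES.items():
        new_row, new_col = row + d_row, col + d_col
        if 0 <= new_row < 3 and 0 <= new_col < 3:   # stay on the 3x3 board
            target = new_row * 3 + new_col
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            edges.append((action, tuple(nxt)))
    return edges


corner = (0, 1, 2, 3, 4, 5, 6, 7, 8)   # blank in the top-left corner
for action, nxt in successors(corner):
    print(action, nxt)
```

A corner state has two outgoing edges and a center state has four, which gives a rough sense of how quickly the tree branches.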
3.10 Search Strategies for Game Trees
There are different search strategies for traversing a graph or a tree. Some of them
are listed below.
3.10.1 Breadth-First Search (BFS)
BFS explores all nodes at a given level before going to the next level; that is, it
traverses the nodes of a graph or tree level by level. This allows for a
comprehensive exploration of the game’s decision-making possibilities. BFS in a
game tree is commonly used in artificial intelligence and game development to
analyze different scenarios, compute optimal moves, and determine the best
strategies for players. BFS can use a very large amount of memory because the
number of nodes grows exponentially with depth, so it is often applied with
heuristics and pruning techniques. It is implemented with a queue as its data
structure. The BFS algorithm explores the nodes level by level and guarantees that
every node reachable from the root is visited. It can be used to analyze all
possible moves and their consequences, and it helps in finding a shortest sequence
of moves that leads to a desired outcome. BFS also helps in understanding the
structure of the game tree, the number of possible paths, and the complexity of the
game.
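A minimal sketch of the queue-based traversal follows. The tree is stored as a dictionary mapping each node to its list of children; this representation and the node names are assumptions chosen for illustration.

```python
from collections import deque


def bfs_order(tree, root):
    """Return the nodes in the order BFS visits them, level by level."""
    order = []
    queue = deque([root])               # FIFO queue of nodes waiting to be expanded
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))   # children wait behind the current level
    return order


# Hypothetical game tree: the root has two moves, each leading to further states.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
print(bfs_order(tree, "root"))
```

Because every node of a level sits in the queue before the next level is reached, the memory use grows with the width of the tree, which is the cost mentioned above.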
3.10.2 Depth-First Search (DFS)
DFS uses a stack, a LIFO (last in, first out) data structure. It traverses deeper
into the tree before considering sibling nodes, following one branch down through
the levels before backtracking to the previous level. DFS is often used for tasks
such as finding a path, exploring all possibilities, and searching for solutions. On
very deep or infinite trees it may descend without bound and require enormous
computational effort, so stopping criteria such as a depth limit need to be applied.
It is used in decision-making scenarios. One disadvantage of DFS is that, unlike
BFS, it may not return the shortest path. DFS is well suited for analyzing
sequential games, where players take turns making decisions, and it can be applied
to solve puzzles.
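The stack-based traversal with a stopping criterion can be sketched as follows. The dictionary-of-children tree format and the `max_depth` parameter are assumptions made for this example; a depth limit is only one possible stopping rule.

```python
def dfs_limited(tree, root, max_depth):
    """Return the nodes in DFS order, never descending below `max_depth`."""
    order = []
    stack = [(root, 0)]                 # LIFO stack of (node, depth) pairs
    while stack:
        node, depth = stack.pop()
        order.append(node)
        if depth < max_depth:           # stopping criterion
            # Push children in reverse so the leftmost child is expanded first.
            for child in reversed(tree.get(node, [])):
                stack.append((child, depth + 1))
    return order


tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
print(dfs_limited(tree, "root", 2))
print(dfs_limited(tree, "root", 1))
```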
3.11 Min-Max Strategy
Minimax is a decision-making strategy (Ho et al., 2022) in which the objective of
one player is to maximize their payoff while the other player aims to minimize it.
It is mainly used in two-player zero-sum games. The primary idea behind the minimax
approach is for a player to consider the worst-case scenario for themselves and then
make the move that minimizes the maximum potential loss, given the opponent’s best
response. There are two players: Player A aims to maximize their own payoff, and
Player B aims to minimize A’s payoff, which in a zero-sum game is the same as
maximizing their own. This leads to the concept of the “minimax equilibrium,” which
is a fundamental solution concept in game theory. The minimax algorithm is often
used in board games, such as chess and tic-tac-toe. It requires a fully observable
environment, and the game generally proceeds by sequential moves. The algorithm
recursively explores the game tree, starting from the current state and considering
all possible moves for the current player. When the algorithm reaches a terminal
state, it stops and evaluates the payoff; these values are then passed back up the
tree, with the value of each node updated to the maximum or minimum of its
children, until the best move at the current state is known. Minimax has certain
limitations. It can be computationally expensive for large game trees, and it does
not work well for games with imperfect information, as it requires a fully
observable environment. The computational cost, however, can be reduced using
alpha-beta pruning.
There are some features of the min-max game, which are discussed below:
• It is a two-player zero-sum game in which any gain for the first player results
in an equal loss for the other player.
• It is a perfect-information game. This means that each player knows the entire
game tree, including all possible moves and their corresponding outcomes.
• It works on the Nash equilibrium concept.
• It is an alternating game: play progresses in a sequential manner, with each
player reacting to the other player’s previous move.
• To determine the optimal strategy in a minimax game, players use backward
induction. They start from the terminal nodes (final outcomes) and work their
way back to the initial state, making decisions that maximize or minimize the
utility at each decision point. Minimax uses a recursive approach to explore the
game tree.
• For a finite game tree, it is a complete algorithm and provides an optimal
solution against an optimally playing opponent.
• A disadvantage is that real-world opponents may not play perfectly or may
introduce elements of randomness (Hazra & Anjaria, 2022).
• The worst-case time complexity of the basic minimax algorithm without any
optimizations is exponential in the depth of the tree; alpha-beta pruning reduces
the number of nodes that have to be examined.
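The last point can be illustrated with a short sketch of minimax with alpha-beta pruning. The game tree is written as nested lists whose leaves are payoffs to the maximizing player; this encoding and the example values are assumptions made for the illustration.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `node`, skipping branches that cannot change it."""
    if not isinstance(node, list):       # leaf: payoff to the maximizing player
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # the minimizer will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                # the maximizer already has a better option
            break
    return value


# Hypothetical depth-2 tree: the maximizer picks a branch, then the minimizer picks a leaf.
tree = [[3, 5], [2, 9], [0, -1]]
print(alphabeta(tree, True))
```

In this example the leaves 9 and −1 are never evaluated: once a branch is known to be worse than the first one, the remaining leaves in it cannot affect the result. The value returned is the same as plain minimax would give.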
3.11.1 Steps in Game Tree
To apply the minimax strategy, the game is typically represented as a game tree,
where each node corresponds to a specific game state, and the edges represent
possible moves by the players. First, utilities are assigned: at the leaf nodes of
the game tree, payoffs are set to represent the outcome for each player. In a
two-player zero-sum game, the utilities of the two players are opposite in sign,
for example, −2 and +2. The minimax strategy then applies the concept of backward
induction: moving up the tree from the leaves, one player maximizes and the other
minimizes the payoff at each node. The minimax strategy helps to identify the
optimal strategy for each player to achieve the best possible outcome for their
respective goals. In theory, the optimal strategies for both players are found at
the Nash equilibrium of the game.
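The steps above can be summarized as a recursion on the value of a node. The notation is introduced here for illustration and is not taken from the text: v(s) denotes the value of state s, u(s) the utility assigned to a leaf for the maximizing player, and children(s) the states reachable from s in one move.

```latex
v(s) =
\begin{cases}
u(s) & \text{if } s \text{ is a leaf node},\\[4pt]
\max\limits_{s' \in \mathrm{children}(s)} v(s') & \text{if the maximizing player moves at } s,\\[4pt]
\min\limits_{s' \in \mathrm{children}(s)} v(s') & \text{if the minimizing player moves at } s.
\end{cases}
```

The value at the root, v(s_0), is the payoff the maximizing player can guarantee, and the moves attaining the maximum or minimum at each node form the minimax strategies.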
3.12 Strategic Form Games
The strategic form is a basic representation of a game using a matrix. Strategic
form games are also called normal form games. Each cell in the matrix corresponds
to a combination of strategies chosen by the players, and the entries represent the
payoffs for each player. The number of players is finite, and each has a finite
number of strategies. This allows for a clear and concise depiction of the game’s
structure. Let us illustrate with a simple example of a strategic form for a
two-player game, often referred to as a “prisoner’s dilemma”: both players have two
strategies, cooperate or do not cooperate. The values in each cell represent the
payoffs for Player 1 and Player 2 based on their chosen strategies. For example, if
both players choose to cooperate, each receives a payoff of 10.
Table 3.7 represents the prisoner’s dilemma, and the payoffs for cooperation and
betraying are given in the table.

Table 3.7 Strategic form of the prisoner’s dilemma example

| Player 1/Player 2 | Cooperate | Don’t cooperate |
| --- | --- | --- |
| Cooperate | 10, 10 | 0, 7 |
| Don’t cooperate | 7, 0 | 5, 5 |

3.13 Extensive Form Games
These games are useful in situations where players make decisions in a sequence
and can base their strategies on previous actions or on imperfect information. In
other words, the extensive form depicts sequential games, including those with
incomplete information. Extensive form games are used in scenarios such as
bargaining and negotiations. They represent all the information regarding the
players, their moves, and their payoffs in tree form. They can be analyzed using
various solution concepts, such as backward induction, which involves working
backward from the terminal nodes to obtain optimal results.
3.14 Dominant Strategies
A dominant strategy is a concept in game theory that denotes a strategy yielding a
player the highest payoff irrespective of the actions chosen by their opponents.
Let us consider a simple example to illustrate the concept, the prisoner’s dilemma,
a classic game in game theory that involves two players, Player A and Player B,
who are given the choice to cooperate or not cooperate with their partner. Let us
determine the dominant strategies using the payoffs of Table 3.8. If Player B
cooperates, Player A’s best choice is to not cooperate (as payoff 6 is more than
2). If Player B betrays, Player A’s best choice is still to not cooperate (as 1 is
greater than 0). Therefore, Player A’s dominant strategy is “don’t cooperate”
because it always yields a higher payoff. The same reasoning applies to Player B,
comparing the second entry of each cell. If Player A cooperates, Player B’s best
choice is to not cooperate (6 > 2). If Player A chooses not to cooperate, Player
B’s best choice is still to not cooperate (1 > 0). Player B’s dominant strategy is
therefore also “don’t cooperate.”
Table 3.8 represents the dominant strategies for both players, each of whom has two
choices: either to cooperate or to betray the other.
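The comparison carried out above can be automated. The following sketch assumes the payoffs of Table 3.8, stored as a list of rows where each cell is a pair (row player’s payoff, column player’s payoff); the function name `dominant_strategy` is an illustrative choice.

```python
def dominant_strategy(payoffs, player):
    """Return the index of a strictly dominant strategy for `player` (0 = row, 1 = column), or None."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    own = range(n_rows) if player == 0 else range(n_cols)
    other = range(n_cols) if player == 0 else range(n_rows)

    def u(s, t):
        # Payoff to `player` when playing own strategy s against opponent strategy t.
        return payoffs[s][t][0] if player == 0 else payoffs[t][s][1]

    for s in own:
        # s is dominant if it beats every other own strategy against every opponent move.
        if all(u(s, t) > u(r, t) for r in own if r != s for t in other):
            return s
    return None


# Table 3.8: index 0 = cooperate, index 1 = don't cooperate.
TABLE_3_8 = [[(2, 2), (0, 6)],
             [(6, 0), (1, 1)]]

print(dominant_strategy(TABLE_3_8, 0), dominant_strategy(TABLE_3_8, 1))
```

Both calls return index 1, that is, “don’t cooperate,” in agreement with the reasoning in the text.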
3.14.1 Strict Dominance
Strict dominance in game theory describes a situation where a particular action of
a player results in a strictly greater payoff than another action for every action
of the other player. We express this with the “strictly greater than” sign (>).
Decisions are easy to make when such an action exists, but some games have no
strictly dominant action and may call for mixed strategies. Suppose we have two
players, with Player 1 having two moves A and B and Player 2 having three moves C,
D, and E. A strictly dominates B for Player 1 when the payoffs of A are always more
than those of B under each of C, D, and E, which is illustrated in Table 3.9a.

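Table 3.8 Payoff matrix for the dominant strategy example

| Player A/Player B | Cooperate | Don’t cooperate |
| --- | --- | --- |
| Cooperate | 2, 2 | 0, 6 |
| Don’t cooperate | 6, 0 | 1, 1 |

Table 3.9a Example of strict domination

| Player 1/Player 2 | C | D | E |
| --- | --- | --- | --- |
| A | 60 | 70 | 80 |
| B | 50 | 60 | 70 |

Table 3.9b Example of weak dominance

| Player 1/Player 2 | C | D | E |
| --- | --- | --- | --- |
| A | 60 | 70 | 90 |
| B | 60 | 70 | 80 |

Table 3.9c Example of equivalence dominance

| Player 1/Player 2 | C | D | E |
| --- | --- | --- | --- |
| A | 60 | 70 | 80 |
| B | 60 | 70 | 80 |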
In Table 3.9a, it is observed that A strictly dominates B for Player 1, since
60 > 50, 70 > 60, and 80 > 70.
In Table 3.9b, it is observed that A weakly dominates B: the payoffs are equal
under C and D, and A is strictly better under E, as 90 > 80.
Equal dominance refers to a scenario where several strategies are equally
favorable for a player because they generate the same payoffs, and therefore the
player is indifferent between those strategies. Table 3.9c depicts such an
equivalence with two moves of Player 1 and three moves of Player 2. A is equivalent
to B, which is a case of equivalence dominance.
3.14.2 Weak Dominance
Weak dominance is a situation where one action dominates another without being
better in every case, which helps to eliminate less favorable strategies. In other
words, a particular strategy for a player leads to a payoff at least as high as
that of another strategy regardless of what the other player chooses, and to a
strictly higher payoff for at least one choice of the other player. In some
games, weakly dominant strategies are not present, so analyzing those games can
become more complex, which is a disadvantage. Also, there is no guarantee of a
strictly higher payoff in every case. Here the comparison uses the “greater than or
equal to” sign (≥) rather than the strict “greater than” sign. Weak dominance is
thus a more lenient criterion, allowing strategies with equal payoffs to coexist.
Table 3.10a depicts weak dominance of the middle action in a game with three
moves of Player 1 and three moves of Player 2.
In this case, middle weakly dominates up for Player 1. If Player 2 moves right,
middle yields 60 against 30 for up; if Player 2 moves center, middle yields 40
against 30; and if Player 2 moves left, both yield 20.
Table 3.10b depicts weak dominance of the center action for Player 2 in a game
with three moves of Player 1 and three moves of Player 2.
For Player 2, center dominates left: if Player 1 plays up, the two moves pay the
same (20 = 20); if Player 1 plays middle, center earns more points than left
(40 > 30); and if Player 1 plays down, center again earns more points than left
(60 > 30).
Center likewise dominates right for Player 2: the payoffs are equal when Player 1
plays up (20 = 20), and center is superior when Player 1 plays middle (40 > 20) or
down (60 > 50).
Table 3.10c depicts that center dominates right, strictly so in the middle and
down rows.
Once right is removed and Player 2 is expected to play center, Player 1 will go in
the middle, since 40 exceeds both 30 (up) and 25 (down).
Table 3.10d depicts the reduced game. Therefore, (middle, center), with payoffs
(40, 40), appears to be the solution.
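Table 3.10a Example of weak dominance

| Player 1/Player 2 | Left | Center | Right |
| --- | --- | --- | --- |
| Up | 20, 20 | 30, 20 | 30, 20 |
| Middle | 20, 30 | 40, 30 | 60, 20 |
| Down | 0, 0 | 0, 0 | 0, 0 |

Table 3.10b Example of weak dominance

| Player 1/Player 2 | Left | Center | Right |
| --- | --- | --- | --- |
| Up | 20, 20 | 30, 20 | 30, 20 |
| Middle | 20, 30 | 40, 40 | 60, 20 |
| Down | 25, 30 | 25, 60 | 50, 50 |

Table 3.10c Example of strict dominance

| Player 1/Player 2 | Left | Center | Right |
| --- | --- | --- | --- |
| Up | 20, 20 | 30, 20 | 30, 20 |
| Middle | 20, 30 | 40, 40 | 60, 20 |
| Down | 25, 30 | 25, 60 | 50, 50 |

Table 3.10d Solution to the problem

| Player 1/Player 2 | Left | Center |
| --- | --- | --- |
| Up | 20, 20 | 30, 20 |
| Middle | 20, 30 | 40, 40 |
| Down | 25, 30 | 25, 60 |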
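The elimination of dominated moves in Tables 3.10b to 3.10d can be reproduced with a short sketch. It assumes the payoffs of Table 3.10b, with each cell stored as (Player 1’s payoff, Player 2’s payoff); the function names and the `first_player` parameter are illustrative. With weak dominance the result can in general depend on the order of elimination, so the order is made explicit.

```python
def weakly_dominated(payoffs, rows, cols, player):
    """Return one remaining strategy of `player` that is weakly dominated, or None."""
    own, other = (rows, cols) if player == 0 else (cols, rows)

    def u(s, t):
        # Payoff to `player` for own strategy s against opponent strategy t.
        return payoffs[s][t][0] if player == 0 else payoffs[t][s][1]

    for s in own:
        for r in own:
            if r == s:
                continue
            never_worse = all(u(r, t) >= u(s, t) for t in other)
            sometimes_better = any(u(r, t) > u(s, t) for t in other)
            if never_worse and sometimes_better:
                return s
    return None


def iterated_elimination(payoffs, first_player=1):
    """Remove weakly dominated strategies until none remain; return surviving (rows, cols)."""
    rows = list(range(len(payoffs)))
    cols = list(range(len(payoffs[0])))
    changed = True
    while changed:
        changed = False
        for player in (first_player, 1 - first_player):
            s = weakly_dominated(payoffs, rows, cols, player)
            while s is not None:
                (rows if player == 0 else cols).remove(s)
                changed = True
                s = weakly_dominated(payoffs, rows, cols, player)
    return rows, cols


# Table 3.10b: rows are up, middle, down; columns are left, center, right.
TABLE_3_10B = [[(20, 20), (30, 20), (30, 20)],
               [(20, 30), (40, 40), (60, 20)],
               [(25, 30), (25, 60), (50, 50)]]

print(iterated_elimination(TABLE_3_10B))
```

For this table only row index 1 (middle) and column index 1 (center) survive, matching the solution read from Table 3.10d.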
3.15 No Dominant Strategy
Let us consider an example of two players, Bonnie and Mike, each of whom has two
actions, going up or going down. In Table 3.11, Bonnie chooses the row and Mike
chooses the column; the first value in each cell is Bonnie’s payoff, and the second
is Mike’s. First, we determine Bonnie’s best responses to Mike’s moves. If Mike
goes up (the first column), Bonnie earns 100 when she goes up and 10 when she goes
down, so up is the best move for her. If Mike goes down (the second column), Bonnie
earns 150 by going down, which is more than she earns by going up, so down is her
best move when Mike chooses down. Bonnie therefore has no dominant strategy,
because her best move differs with Mike’s action. Next, we determine Mike’s best
responses by traversing the rows and reading the second value in each cell. In the
first row, Mike’s best move is down with 800 points (against 500 for up), and in the
second row, it is again down with 70 (against 60 for up). Mike thus has a dominant
strategy, down, and Bonnie’s best response to it is down, so the solution is the
down-down strategy (150, 70). Therefore, down-down is the Nash equilibrium solution
for the problem. In summary, a lack of dominant strategy implies that the best
choice for a player depends on the choices made by other players, making the
decision-making process more complex and leading to considerations of Nash
equilibria.
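The best-response reasoning above amounts to checking each cell for profitable unilateral deviations. The sketch below does this for pure strategies, using the values printed in Table 3.11 with each cell stored as (Bonnie’s payoff, Mike’s payoff); the function name is an illustrative choice.

```python
def pure_nash_equilibria(payoffs):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # Row player cannot do better by switching rows in column j.
            row_best = all(payoffs[i][j][0] >= payoffs[k][j][0] for k in range(n_rows))
            # Column player cannot do better by switching columns in row i.
            col_best = all(payoffs[i][j][1] >= payoffs[i][k][1] for k in range(n_cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria


# Table 3.11: rows are Bonnie's moves (up, down); columns are Mike's moves (up, down).
TABLE_3_11 = [[(100, 500), (7, 800)],
              [(10, 60), (150, 70)]]

print(pure_nash_equilibria(TABLE_3_11))
```

The only cell returned is (1, 1), the down-down outcome. A game such as matching pennies returns an empty list, since it has no equilibrium in pure strategies.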
Table 3.11 illustrates the Mike-Bonnie game, in which Bonnie has no dominant
strategy; the equilibrium is the down-down cell (150, 70).

Table 3.11 Example of no dominance

| Bonnie/Mike | Up | Down |
| --- | --- | --- |
| Up | (100, 500) | (7, 800) |
| Down | (10, 60) | (150, 70) |

3.16 Bayesian Games
In these games, players have uncertain or incomplete knowledge, and this
uncertainty is represented using probabilities. Players are assigned probabilities
(beliefs) that take into account their own private information and the beliefs of
other players. The solution concept for such scenarios is the Bayesian Nash
equilibrium. These games are not easy to solve, as players must consider the
complete distribution of the opponents’ possible strategies and choose their
optimal strategies against it. They have applications in the fields of economics,
auctions, industries, decision-making, politics, etc. In a Bayesian Nash
equilibrium, each player’s strategy is optimal given their beliefs about the types
of other players, and no player can deviate to achieve something better. In
traditional game theory, by contrast, players make decisions based on their own
information and the expected actions of other players. A Bayesian game proceeds in
the following stages:
• At the beginning of the game, nature chooses the type of each player according
to some probability distribution.
• Players have complete information about their own types, moves, and actions, but
they do not have this information about other players.
• Players choose their strategies based on their private type and beliefs. A
strategy takes into account the player’s goals, the actions and possible strategies
of other players, and the potential outcomes of different choices. Strategies can
be classified into several types, such as mixed strategies, dominant strategies,
and tit for tat strategies. Then, payoffs are generated with respect to the moves.
3.17 Matrix Games
A matrix game models strategic interaction between players, each of whom has a
finite number of strategies to choose from, and the outcome (payoff) of the game
depends on the strategies chosen (Zhu et al., 2021). The essential elements of the
game are players, strategies, and payoffs. Each player aims to maximize their
payoff, which is also influenced by the choices made by the other players. The
strategic choices that lead to optimal outcomes are often analyzed through
equilibrium concepts, the most common one being the Nash equilibrium. A matrix game
is defined by a payoff matrix that captures the possible outcomes and payoffs
associated with the different strategies chosen by the players. The rows and
columns of the payoff matrix depict the strategies of the different players
(Carmon et al., 2020). For example, Player 1’s moves can be expressed in rows and
Player 2’s moves in columns, or vice versa. It is a simple way of representing
strategic interactions between players. The intersection of a row and a column is
a cell whose value is the corresponding payoff. Matrix games possess several
properties that make them valuable tools for decision-making and analysis in
various fields. Some of these properties include the following:
1. Matrix games are represented in a simple tabular format, making them easy to
understand and analyze. The matrix shows the possible combinations of decisions
made by the players and the corresponding outcomes.
2. They are noncooperative in nature, which means that players independently
choose their strategies without direct communication or negotiation.
3. The games considered here are two-player games, in which each player may have
multiple strategies.
Let us walk through a simple matrix game example involving two players, Alice
and Bob. In this scenario, they run competing companies launching new products,
and each has two choices: “A” and “B.” The matrix represents their profits based
on their combined marketing strategies. In each cell, the first entry represents
Alice’s profit, while the second entry represents Bob’s profit. For instance, if
both of them choose “A,” each earns $50.
Table 3.12 Example of matrix game

| Alice/Bob | A | B |
| --- | --- | --- |
| A | $50, $50 | $10, $70 |
| B | $70, $10 | $30, $30 |

Table 3.13 Min-max strategy for matrix game

| Player A/Player B | First move | Second move | Third move | Row minima |
| --- | --- | --- | --- | --- |
| First move | −600 | −300 | 500 | −600 |
| Second move | 200 | 0 | 200 | 0 |
| Third move | 600 | −300 | −400 | −400 |
| Column maxima | 600 | 0 | 500 | |

Max min = 0, Min max = 0
Case 1: If both choose “A,” both will get $50.
Case 2: If Alice chooses “A” and Bob chooses “B,” Alice will have a profit of $10
and Bob will get $70.
Case 3: If Bob chooses “A” and Alice chooses “B,” Alice will have a profit of $70
and Bob will have a profit of $10.
Case 4: If both choose “B,” then both get $30.
Table 3.12 illustrates a matrix game with aggressive and minimal marketing as the
two strategies and Alice and Bob as players.
Let us consider one more example where there are two players, A and B, each with
three moves to choose from. Let us determine the value of this game by using the
maximin and minimax strategies.
Table 3.13 illustrates the matrix game and the search for a solution using the
min-max strategy. Taking the maximum of the row minima, we obtain
max(−600, 0, −400) = 0, and taking the minimum of the column maxima, we obtain
min(600, 0, 500) = 0. Since the maximum of the minima and the minimum of the maxima
are both zero, the value of the game is 0, so neither player has any advantage and
the game is fair. The optimal strategy for Player A, as well as for Player B, is
the second move.
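The maximin and minimax computation can be sketched as follows, assuming the payoffs to Player A from Table 3.13; the function name `saddle_point` and the choice to return None when the two values differ are illustrative.

```python
def saddle_point(matrix):
    """Return (row, col, value) if max of row minima equals min of column maxima, else None."""
    row_minima = [min(row) for row in matrix]
    col_maxima = [max(col) for col in zip(*matrix)]
    maximin = max(row_minima)      # best guaranteed payoff for the row player
    minimax = min(col_maxima)      # smallest loss the column player can guarantee
    if maximin != minimax:
        return None                # no solution in pure strategies
    return row_minima.index(maximin), col_maxima.index(minimax), maximin


# Table 3.13: payoffs to Player A (rows) against Player B (columns).
TABLE_3_13 = [[-600, -300, 500],
              [200, 0, 200],
              [600, -300, -400]]

print(saddle_point(TABLE_3_13))
```

The result (1, 1, 0) uses zero-based indices, so it corresponds to the second move for both players and a game value of 0. For matching pennies the function returns None, because the maximin (−2) and the minimax (2) differ.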
3.18 Repeated Games
Repeated games, which are also known as supergames, are games in which a
particular game is played multiple times by the same players. They are represented
in extensive form. Unlike one-shot games, where players make decisions only once,
repeated games introduce a dynamic element: players’ actions and payoffs in
one round can influence their choices and outcomes in subsequent rounds. Popular
strategies such as tit for tat, trigger strategies, and the grim trigger strategy
are used.
• Tit for tat involves cooperating in the first round and thereafter copying the
opponent’s previous move: if the opponent cooperated, the player cooperates, but
if the opponent defected, the player defects. This technique is easy to understand
and promotes fairness and reciprocity.
• Trigger strategies are the ones where Player 1 collaborates unless and until
Player 2 defects. Once a defection occurs, Player 1 stops cooperating as a
punishment, possibly indefinitely. Trigger strategies are particularly effective
when dealing with defectors and aim to bring about cooperation.
• Grim trigger strategies are strategies in which a player cooperates until the
opponent defects, and then the player defects forever, imposing a severe penalty on
the opponent.
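The strategies listed above can be compared in a small simulation. The sketch below assumes the stage-game payoffs of Table 3.8 and a fixed number of rounds; the function names and the always-defect opponent are additions made for illustration.

```python
# Stage-game payoffs (first player, second player); "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (0, 6),
          ("D", "C"): (6, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own_history, opp_history):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if not opp_history else opp_history[-1]


def grim_trigger(own_history, opp_history):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in opp_history else "C"


def always_defect(own_history, opp_history):
    return "D"


def play(strategy_1, strategy_2, rounds):
    """Play the repeated game and return the cumulative payoffs of both players."""
    h1, h2, total_1, total_2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strategy_1(h1, h2), strategy_2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        total_1, total_2 = total_1 + p1, total_2 + p2
        h1.append(a1)
        h2.append(a2)
    return total_1, total_2


print(play(tit_for_tat, grim_trigger, 5))
print(play(tit_for_tat, always_defect, 5))
```

Two reciprocating strategies cooperate in every round, while against a permanent defector both tit for tat and grim trigger lose only the first round before switching to defection.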
Repeated games can be classified into finite and infinite games, depending on
whether the number of repetitions is finite or infinite.
3.18.1 Finitely Repeated Games
Finitely repeated games are a specific class of repeated games in which the game
is played for a fixed and known number of rounds: players know in advance how
many rounds they will play, and they receive the cumulative payoffs of the
decisions made in each round. Because the last round is known, backward induction
can be applied from it; when the stage game has a unique Nash equilibrium, each
round is then treated separately, and no history is taken into account in the
strategy for the present round. One of the most well-known models of finitely
repeated games is the repeated prisoner’s dilemma, which is a classic example of a
game with a temptation to defect. Other examples include bidding in an auction
with a fixed number of rounds and certain board games such as chess or tic-tac-toe.
3.18.2 Infinitely Repeated Games
These are games in which the stage game is repeated without a predetermined
endpoint. In these games, players face a strategic dilemma over an indefinite number
of rounds, and their decisions in each round can have long-term consequences for
their overall payoff. The two most common tools for analysing them are trigger
strategies and the folk theorem. Applications of infinitely repeated games include
various social and economic contexts, such as business competition, scientific
research, and environmental issues.
Let us consider an example where there are two players, and they try to collabo-
rate over some unit of money with an interest rate of r%. If they collaborate, their
amount will be $15 each, but if one tries to collaborate and the other tries to leave,
the person who leaves will end up in a defect state for the next iteration. The players
can collaborate if their amount of collaboration is more than the individual amounts.
Table 3.14 Example of infinitely long repeated game with defect cooperation

Player 1 \ Player 2    Defect     Cooperate
Defect                 (0,0)      (H,50)
Cooperate              (50,H)     (150,150)
Table 3.14 illustrates an example of an infinitely repeated game in which two
players can either defect or cooperate. Suppose that Player 1 defects while Player 2
cooperates. We assume that Player 2 will then not allow Player 1 to collaborate again
in the next iteration. Therefore, the outcome of that round is (H, 50), and from then
on the game stays in the defect state (0,0). Such a punishment can follow a grim
trigger or a tit for tat strategy.
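Whether cooperation survives under a grim trigger punishment can be checked
numerically. The sketch below assumes that an interest rate r per round corresponds
to a discount factor δ = 1/(1 + r), a common convention that the text does not state
explicitly, and it uses a hypothetical value H = 200 because H is left unspecified in
Table 3.14. The cooperation payoff 150 and the punishment payoff 0 come from the
table.

```python
def discount_factor(interest_rate):
    """Assumed convention: delta = 1 / (1 + r), with r > 0 as a fraction (0.10 = 10%)."""
    return 1.0 / (1.0 + interest_rate)


def cooperation_value(per_round_payoff, delta):
    """Present value of receiving the same payoff in every round, forever."""
    return per_round_payoff / (1.0 - delta)


def defection_value(temptation, punishment, delta):
    """Temptation payoff now, then the punishment payoff in every later round."""
    return temptation + delta * punishment / (1.0 - delta)


def cooperation_is_sustainable(coop, temptation, punishment, interest_rate):
    """True if cooperating forever is worth at least as much as defecting once."""
    delta = discount_factor(interest_rate)
    return cooperation_value(coop, delta) >= defection_value(temptation, punishment, delta)


H = 200  # hypothetical temptation payoff
for r in (0.10, 1.0, 5.0):
    print(r, cooperation_is_sustainable(150, H, 0, r))
```

With these assumed numbers, cooperation is sustainable as long as δ ≥ 1 − 150/H, so a
very high interest rate (a very impatient player) makes defection attractive.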
3.19 Incentives
Incentives refer to the motivations or rewards that influence the decisions and
actions of players in a strategic interaction. They come in various forms and can be
positive or negative, depending on how they affect the players’ payoffs. Players in a
game are often rational decision-makers who aim to maximize their own utility or
payoff. There are various types of incentives:
Positive Incentives: Positive incentives are rewards or benefits that encourage play-
ers to take certain actions. When a player perceives a potential gain or benefit
from a particular strategy, they are more likely to choose that option.
Negative Incentives: Negative incentives are deterrents or penalties that discourage
players from choosing certain actions. When a player faces potential losses or
adverse consequences from a particular strategy, they are less likely to select
that option.
References
Carmon, Y., Jin, Y., Sidford, A., & Tian, K. (2020). Coordinate methods for matrix games.
Springer.
Goeree, J. K., Holt, C. A., & Palfrey, T. R. (2002). Risk averse behavior in generalized matching
pennies games. Games and Economic Behavior, 45(1), 97–113.
Hazra, T., & Anjaria, K. (2022). Applications of game theory in deep learning: A survey. Springer.
Ho, E., Rajagopalan, A., Skvortsov, A., Arulampalam, S., & Piraveenan, M. (2022). Game theory
in defence applications: A review. Sensors, 22(3), 1032.
Narahari, Y. (2012). Game theory lecture notes by Y. Narahari Indian Institute of Science
Bangalore.
Zhu, M., Anwar, H., Wan, Z., Cho, H. J., Kamhoua, C., & Singh, P. M. (2021). Game-theoretic
and machine learning-based approaches for defensive deception: A survey. arXiv preprint
arXiv:2101.10121.
Chapter 4
Applications of Game Theory in Deep
Neural Networks
Over the last decade, deep learning has been a hot topic of discussion because of
its ability to learn from data. The notion of deep learning (DL) emerged in 2006 as a
new area of study within machine learning (ML). To understand the applications
(Hazra & Anjaria, 2022) of game theory in deep neural networks (DNNs), let us first
go through some basic concepts of DL and game theory.
Deep learning techniques are a subset of machine learning that classify
automatically by learning hierarchical representations in deep architectures. Have
you ever wondered how your mobile gallery is automatically organized on the basis of
different human faces? This is a product of DL. Why do we opt for DL in place of ML?
In classical ML, we have to tell the machine which features help it to distinguish
between classes. For example, to classify samples from a mixture of guavas and
apples, features such as color, size, and shape play an important part. In DL,
however, the features are picked by the neural network without human intervention.
4.1 Introduction
Deep learning models are based on DNNs, which are capable of supervised and
unsupervised learning and are typically trained on a huge collection of labelled data
with back propagation. To understand neural networks (NNs), let us take an example.
We have to recognize the letter “A,” written by three different students. Humans can
easily identify the letter, but is this possible for a DNN, given that each student’s
handwriting is different? The answer is yes; a DNN can classify the letters.
Fig. 4.1 An artificial neural network architecture
The NN consists of three layers: the input layer, hidden layer, and output layer, as
shown in Fig. 4.1. In deep learning, the NN is trained to identify letters. The
letters are images of 28 × 28 pixels. Initially, the image is fed into the input
layer. Each pixel of a given image is fed into the input; let us say X1, X2, …, X784.
The inputs then pass to the hidden layer through channels. Each channel carries a
weight.
Hence, they are called weighted channels, which can be represented as W1, W2, …,
W784. All the neurons of hidden layers are associated with numbers called bias,
which can be represented by b1, b2, …, b784. Now, we have to find a weighted sum
for all input neurons X1 · W1 + X2 · W2 + … + Xn · Wn:
$$\sum_{i=1}^{n} X_i \cdot W_i + b_i$$
After that, bias is added. The result is then passed through an activation function,
which determines which neurons are triggered. Each activated neuron transmits
information to the next layer, up to the second-to-last layer. In the output layer,
the single neuron that is active represents the letter. Back propagation is a
technique for repeatedly adjusting the weights and biases to create well-trained
models. There are numerous applications of DL in healthcare, business, agriculture,
research, and many other fields.
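The forward computation described above can be sketched in a few lines of Python.
This is a minimal illustration only: the sigmoid activation, the 16 hidden neurons,
the random weights, and the function names are assumptions made for the example and
are not specified in the text.

```python
import math
import random


def sigmoid(z):
    """Assumed activation function; squashes the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum + bias)


def layer_output(inputs, weight_rows, biases):
    """One output per neuron; each neuron has its own weights and its own bias."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


rng = random.Random(0)
n_inputs, n_hidden = 28 * 28, 16  # 784 pixels in; the hidden size is an assumption
pixels = [rng.random() for _ in range(n_inputs)]  # stand-in for a 28 x 28 image
weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_inputs)] for _ in range(n_hidden)]
biases = [0.0] * n_hidden

hidden = layer_output(pixels, weights, biases)
print(len(hidden), min(hidden), max(hidden))
```

Back propagation would then adjust `weights` and `biases` to reduce the
classification error; that step is omitted here.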
Healthcare: Regular health factor analysis, coronary heart disease risk prediction,
cancer classification, diagnosis and detection of COVID-19.
Natural language processing: Text summarization, sentiment analysis.
Cybersecurity: Malware detection, suspect detection, network intrusion detection,
security incident and fraud analysis.
IoT and security: Smart parking system, air quality prediction, cybersecurity in
smart cities.
Smart agriculture: Plant disease detection, soil quality evaluation, smart agriculture
IoT system.
Business: Stock trend prediction, financial loan default prediction, power consump-
tion forecasting.
There are also some limitations of DL. Deep learning is one of the most effective
ways to deal with unstructured data, but training a neural network requires a
considerable amount of data, and not every machine can process such a volume. Another
limitation is the computational power needed: training a neural network requires
graphical processing units (GPUs). A GPU consists of thousands of cores, far more
than a CPU, and is more costly than a CPU. A further limitation is the training time,
which can be days or months, depending on the amount of data and the number of layers
in the network. Now that we understand the basic concept of neural networks, let us
turn to the applications of game theory. There are many applications of game theory
in deep neural networks, so it is critical to comprehend the relation between game
theory and deep neural networks.
4.2 Relation of Neural Network to Game Theory
To understand it better, let us consider an example of a game (Van den Nouweland,
2007). We have two players, Joy and Roy. The main components of a game are the
players, the rules, and the outcome. To understand the players’ strategic behavior, a
payoff matrix is used for representation. Most of us are familiar with
rock-paper-scissors, a zero-sum game in which players simultaneously choose either
rock, paper, or scissors. Let us discuss the basic rules of this game.
Rules:
1. If Joy opts for rock and Roy opts for paper, then Roy will win one point and
vice versa.
2. If Joy opts for paper and Roy opts for scissors, then Roy will win one point and
vice versa.
3. If Joy opts for rock and Roy opts for scissors, then Joy will win one point and
vice versa.
4. If Joy opts for paper and Roy opts for rock, then Joy will win one point and
vice versa.
5. If Joy and Roy both opt for the same, then both will get zero points.
Figure 4.2 shows the payoff matrix as per the given game rules. The left entry of
each cell is Joy’s payoff, and the right entry is Roy’s payoff. For example, if Joy
opts for paper and Roy opts for rock, then Joy will get 1 point and Roy will get
−1 point.
There are two types of games: static games and dynamic games. In a static game, the
players choose their actions simultaneously: the strategies and the payoff matrix are
known to every player, but no player observes the other’s choice before moving. In
dynamic games, players make their decisions sequentially (e.g., the game of chess).
The above game is a static game, as all players have to play simultaneously. The
players are unaware of each other’s chosen strategy and play according to their
assumptions.
Fig. 4.2 Payoff matrix of rock-paper-scissor game
Fig. 4.3 Payoff matrix of rock-paper-scissor game with mixed strategies
In this game, there is no Nash equilibrium in pure strategies (where each player
chooses one action with certainty), as the player who loses or ties can always switch
to another strategy and win. We therefore look at mixed strategies. Let x, y, and
1 − x − y be the probabilities that Joy picks rock, paper, and scissors,
respectively. Similarly, let l, m, and 1 − l − m be the probabilities that Roy picks
rock, paper, and scissors, respectively. Each player’s set of actions is {rock,
paper, scissors}, with (p(rock), p(paper), p(scissors)) ≥ (0,0,0) and
p(rock) + p(paper) + p(scissors) = 1.
Consider Joy’s expected payoff from playing the mixed strategy (x, y, 1 − x − y)
when Roy plays the mixed strategy (l, m, 1 − l − m). According to Fig. 4.3, the
payoff to Joy amounts to:
$$
\begin{aligned}
r &= x \cdot l \cdot 0 + x \cdot m \cdot (-1) + x \cdot (1 - l - m) \cdot 1 + y \cdot l \cdot 1 + y \cdot m \cdot 0 + y \cdot (1 - l - m) \cdot (-1) \\
  &\quad + (1 - x - y) \cdot l \cdot (-1) + (1 - x - y) \cdot m \cdot 1 + (1 - x - y) \cdot (1 - l - m) \cdot 0 \\
  &= -xm + x - xl - xm + yl - y + yl + ym - l + lx + ly + m - mx - my \\
  &= -3xm + x + 3yl - y - l + m \\
  &= x + m - l - y - 3xm + 3yl
\end{aligned}
$$
Our main goal is to maximize the payoff expression. From the expression, we can
see that the payoff expression of Joy also depends on Roy’s probability (l, m).
Similarly, we can find the payoff expression of Roy according to Fig. 4.3:
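$$
\begin{aligned}
q &= l \cdot x \cdot 0 + m \cdot x \cdot 1 + (1 - l - m) \cdot x \cdot (-1) + l \cdot y \cdot (-1) + m \cdot y \cdot 0 + (1 - l - m) \cdot y \cdot 1 \\
  &\quad + l \cdot (1 - x - y) \cdot 1 + m \cdot (1 - x - y) \cdot (-1) + (1 - l - m) \cdot (1 - x - y) \cdot 0 \\
  &= xm - x + xl + xm - yl + y - yl - ym + l - lx - ly - m + mx + my \\
  &= 3xm - x - 3yl + y + l - m \\
  &= y + l - x - m + 3xm - 3yl
\end{aligned}
$$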
Similar to the last expression, this expression of the payoff of Roy also depends
on Joy’s probability (x, y).
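The two payoff expressions can be checked numerically. The sketch below encodes the
rules of the game as Joy's payoff matrix (Roy's payoff is the negative, since the
game is zero-sum), compares the expected payoff computed cell by cell with the closed
form r = x + m − l − y − 3xm + 3yl, and confirms by enumeration that no pure-strategy
Nash equilibrium exists. The function names are illustrative assumptions.

```python
from itertools import product

ACTIONS = ("rock", "paper", "scissors")

# JOY_PAYOFF[i][j]: Joy's payoff when Joy plays ACTIONS[i] and Roy plays ACTIONS[j].
JOY_PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_payoff_joy(x, y, l, m):
    """Sum over all nine cells: P(Joy's action) * P(Roy's action) * Joy's payoff."""
    joy = (x, y, 1 - x - y)
    roy = (l, m, 1 - l - m)
    return sum(joy[i] * roy[j] * JOY_PAYOFF[i][j]
               for i, j in product(range(3), repeat=2))


def closed_form_joy(x, y, l, m):
    """Simplified expression for r derived in the text."""
    return x + m - l - y - 3 * x * m + 3 * y * l


def pure_nash_equilibria():
    """Return every cell in which neither player gains by deviating alone."""
    found = []
    for i, j in product(range(3), repeat=2):
        joy_best = all(JOY_PAYOFF[i][j] >= JOY_PAYOFF[k][j] for k in range(3))
        roy_best = all(-JOY_PAYOFF[i][j] >= -JOY_PAYOFF[i][k] for k in range(3))
        if joy_best and roy_best:
            found.append((ACTIONS[i], ACTIONS[j]))
    return found


print(expected_payoff_joy(0.2, 0.5, 0.6, 0.1), closed_form_joy(0.2, 0.5, 0.6, 0.1))
print(expected_payoff_joy(1 / 3, 1 / 3, 0.6, 0.1))
print(pure_nash_equilibria())
```

When Joy mixes uniformly (x = y = 1/3), her expected payoff is zero whatever Roy
does, which is why the uniform mixed strategy is the equilibrium of this game.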
In the last section, we discussed the game rock-paper-scissors, in which two human
players, Joy and Roy, were involved. Now that we are familiar with the concept of
neural networks, imagine that the game is played between a human and an agent. How
will the agent be able to decide which strategy to choose? Learning is needed to
develop such systems. From the rock-paper-scissors example, we know that with the
help of the payoff matrix and mixed strategy values, it becomes easy for a player to
decide which strategy to use. An algorithm can be developed in which we have to
perform a simple classification task. To understand the algorithm properly, let us
first take a game with a payoff matrix of size 2 × 2. In place of players, we will
consider neurons that participate in the game.
Let’s consider a game in which there is a race to complete and there are two
options: either run or jump. To determine the reward matrix for the payoff function
and the mixed strategies, we require a learning algorithm for neural networks based
on a theoretical understanding of game theory.
In Fig. 4.4, all entries are unknown, and an algorithm is used to find their values.
Let us consider a supervised learning classification task that is one-dimensional,
linearly separable, and simple. Here, we need to classify between two classes: run
and jump. For classification, we consider m objects, which form the training set for
the learning algorithm. The m objects of the two classes are represented in the x-y
plane, and each object i has a coordinate Xi ∈ [0,1].
Consider red balls as class-1, representing the run state, and blue balls as
class-2, representing the jump state. There are two points, q and q′. The point q′
correctly separates the two classes, but q misclassifies two balls, as shown in
Fig. 4.5. At the beginning of the learning process, q is placed randomly on the
x-axis and then moves toward the origin until it reaches q′. Consider a mixed
strategy (p, 1 − p) for neuron-1, expressing neuron-1’s scepticism toward neuron-2;
neuron-1’s objective is to build a model of neuron-2’s anticipated behavior.
Fig. 4.4 Payoff matrix
Fig. 4.5 Classification of two classes that are linearly separable and 1D
In Fig. 4.6, f1, f2, and f3 are the payoff functions. Here, f1 is fixed and
represents the payoff function of class-1, the run state of neuron-1. f2 is
determined by the angle R ∈ [0°, 90°] and is the payoff function of class-2, the jump
state of neuron-1. As p belongs to [0,1], it is directly proportional to the angle R,
which belongs to [0°, 90°]. For example, if p = 0.5, then the angle R is equal to
45°. The point q represents the random initial assignment, from which function f2 is
formed. We then increase the angle R until, at angle S, the classification is
correct. Several such steps are needed to reach the point q′. The intersection point
of the payoff functions f1 and f3 is the final solution to our problem. From this
position, we are able to find the payoff functions and mixed strategies of neuron-1,
which is the goal of our learning algorithm.
Consider a point x1 located to the left of p′, and compare the payoff functions f1
and f3 at that point. Here, f1(x1) > f3(x1); thus, the payoff for the run state is
greater than that for the jump state. Similarly, at a point x3 we can see that
f3(x3) > f1(x3), which means that the payoff of the jump state is greater than that
of the run state. As a final result, we can say that f(x) = run for x < q′;
otherwise, jump.
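The movement of q toward q′ can be imitated with a very small threshold search. The
sketch below is only an illustration of the resulting decision rule, not the
game-theoretic learning algorithm itself: it assumes that the run objects lie to the
left of the jump objects, uses hypothetical sample coordinates in [0,1], and moves q
toward the origin in fixed steps until no training object is misclassified.

```python
def classify(x, q):
    """Decision rule from the text: run to the left of the threshold, jump otherwise."""
    return "run" if x < q else "jump"


def count_errors(samples, q):
    """Number of training objects that the threshold q misclassifies."""
    return sum(1 for x, label in samples if classify(x, q) != label)


def learn_threshold(samples, q_start, step=0.01):
    """Move q toward the origin until every training object is classified correctly."""
    q = q_start
    while q > 0 and count_errors(samples, q) > 0:
        q -= step
    return q


# Hypothetical one-dimensional training set with coordinates in [0, 1].
samples = [(0.1, "run"), (0.2, "run"), (0.3, "run"),
           (0.6, "jump"), (0.7, "jump"), (0.9, "jump")]

q_initial = 0.8  # misclassifies the two jump objects at 0.6 and 0.7
q_learned = learn_threshold(samples, q_initial)
print(count_errors(samples, q_initial), count_errors(samples, q_learned), q_learned)
```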
Fig. 4.6 From the perspective of the learning algorithm, neuron-1
4.3 Applications
We have discussed some basics of deep neural networks and the relation between
game theory and deep neural networks. Now, we will focus on the applications part.
The authors (Weerasinghe et al., 2018) discussed how game theory combined with a
deep learning approach can protect wireless networks from jamming attacks. The main
contribution is the use of an adversarial (deep) learning approach. The transmitter
and jammer aim to deceive one another by purposefully falsifying the data that the
opponent uses to make its decisions. To understand jamming-related problems, the
authors therefore used a game theory approach. In the problem statement, the
transmitter has to form a complete communication network with the receivers, and the
job of the jammer is to get in the way. By reducing the signal-to-interference-plus-
noise ratio at the receiving end, a jammer can disrupt communication. Therefore,
instead of using one particular channel for communication over a defined period of
time, the transmitter can shift to other available channels with some probability.
The jammer can, however, estimate the transmitter’s shifting probabilities over that
time period and use them to mount a jamming attack. This can be prevented when the
transmitter changes its probabilities periodically. The transmitter uses a
pseudorandom number generator function for the probability distribution, which
changes periodically. When the transmitter’s shifting probabilities change randomly,
it becomes difficult for the receiver to communicate on the right channel. To halt
transmission successfully, the jammer would need to know the probability distribution
to be used in the upcoming time interval. If the jammer is effective in predicting
the probability values over time, the transmitter can respond either by increasing
the adversarial distortion intensity or by changing the generating functions. The
jammer, in turn, intentionally modifies its jamming patterns to fool the transmitter
into believing that it has not been able to figure out the underlying patterns.
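The channel-shifting idea can be illustrated with a toy simulation. The sketch below
is not the model of the cited paper: the naive jammer that always targets the channel
it has observed most often, the way the distribution is redrawn, and all names and
parameters are assumptions made for illustration.

```python
import random


def random_distribution(rng, n_channels):
    """Draw a random probability distribution over the channels."""
    weights = [rng.random() for _ in range(n_channels)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(n_slots, n_channels, period, seed=0):
    """Return the fraction of time slots in which the jammer hits the used channel.

    period=None keeps one fixed distribution; otherwise the transmitter redraws
    its channel distribution from its pseudorandom generator every `period` slots.
    """
    tx_rng = random.Random(seed)  # transmitter's pseudorandom number generator
    dist = random_distribution(tx_rng, n_channels)
    counts = [0] * n_channels  # jammer's record of past channel usage
    jammed = 0
    for t in range(n_slots):
        if period is not None and t > 0 and t % period == 0:
            dist = random_distribution(tx_rng, n_channels)
        # Assumed jammer: targets the channel observed most often so far.
        target = max(range(n_channels), key=counts.__getitem__)
        channel = tx_rng.choices(range(n_channels), weights=dist)[0]
        if channel == target:
            jammed += 1
        counts[channel] += 1
    return jammed / n_slots


print("fixed distribution   :", simulate(5000, 5, None))
print("redrawn every 50 slots:", simulate(5000, 5, 50))
```

Comparing the two printed rates for different seeds and periods gives a feel for how
much a frequency-counting jammer can learn in each setting.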
In (Wang et al., 2019), the authors propose a scene recognition model, built on DNNs
and game theory, with applications to human-robot interaction. Conventional scene
identification systems typically use either low-level features or high-level
features. In addition to being clear-cut and easy to use, DNNs and game theory offer
the benefits of solid logic and agreement with intuitive human perception. Deep
learning-based methods can also describe large volumes of scene data accurately. The
main components of the deep learning-based scene identification system are front-end
local descriptor development, back-end optimization, and loop detection of the
created map. The key functions of the front end are feature point extraction, camera
position calculation, and local map generation. Preprocessing is crucial for the
effective processing of the images, and the authors use game theory to complete this
task. The two most frequently used preprocessing approaches work in the “spatial
domain” and the “frequency domain.” Game theory is used to handle simultaneous
decision-making problems in a setting of conflict. Large systems are decoupled into
modules, each of which makes its own decisions based on the relations in the data. To
complete the registration, the two kinds of features are treated as two players,
which play a non-zero-sum, competitive game. In this game, each of the two players
wants to maximize its own payoff or keep its cost to a minimum, which leads to two
different types of features, known as decouplings. Through rational decision-making,
an equilibrium frame-matching solution is obtained, and this balance gives the
desired performance together with generality. The authors find the Nash equilibrium
point of the game, which corresponds to the lowest energy level and completes the
image preprocessing, and use the EM model to obtain the best result. Currently, the
most important technology for increasing a service robot’s intelligence is
vision-based scene analysis and identification, which is also the first step in
making robots intelligent. With DNN-assisted scene identification, the descriptive
analysis and training of scenes can be carried on continuously: scenes whose labels
were missed in earlier recognition rounds are labelled, erroneous tags are corrected,
and the processing of samples is improved, so that the set of dominant features is
renewed indefinitely. Game theory is at the heart of this paper. The images are
treated as a collection of players, each of which possesses its own set of
strategies. The model’s degrees of freedom can approach infinity, which helps improve
accuracy and registration ability.
In (Dasgupta & Collins, 2019), the authors survey game-theoretic approaches to
adversarial machine learning in cybersecurity tasks. Using the computational
framework of game theory, the paper presents methods for building machine learning
systems that are resistant to adversarial attacks. In the form of machine learning
termed adversarial learning, two parties, known as the learner and the adversary,
each try to shape a prediction mechanism for data pertaining to the current problem
domain, but with different goals. The learner’s goal in learning the prediction
mechanism is to categorize or predict the data accurately. The adversary’s goal, on
the other hand, is to mislead the learner into making inaccurate predictions about
future data. Adversarial learning poses a serious threat to cybersecurity in several
areas that use machine learning-based classification systems, including automatic
spam filters, antivirus programs, image classification in defense and security,
medical applications, and many more. Game theory provides an interesting tool for
adversarial learning because it offers a way to model the behavior of the learner
and the adversary mathematically in terms of defensive and offensive strategies and
to identify appropriate strategies that reduce the learner’s loss due to adversarial
attacks. The paper also discusses the background of game theory and adversarial
learning.
The authors (Li et al., 2017) discussed a secure mobile crowdsensing game with deep
reinforcement learning. Smartphones carry a number of sensors, such as
accelerometers, and mobile crowdsensing (MCS) uses them to provide location-based
services. An MCS server seeks to recruit a few nearby smartphone users to collect
sensing data for an MCS application. Game theory has been used to formulate MCS
processes, for example to design auctions and pricing and to offer reputation-based
mechanisms that encourage user participation in MCS services. In this game, the
server uses a classification algorithm to assess the accuracy of each sensing report
and determine the winner. To reduce the incentive for faked sensing attacks, each
user is paid according to the accuracy of the sensing report. Users who cheat are not
paid anything. The Stackelberg equilibria (SEs) of the secure MCS game are derived to
show how the outcome depends on the sensing costs, the contribution each user makes
to the server’s accuracy, the number of active smartphone users, and the accuracy
with which the MCS application evaluates the sensing reports. A larger payment
motivates more smartphone users to participate and suppresses faked sensing attacks.
Overpayment, however, causes oversensing and can occasionally cause network
congestion, which reduces the server’s utility.
The Q-learning-based MCS server determines its payment policy with a Q-function, or
quality function, which represents the expected discounted long-term reward of each
state-action pair; the state consists of the previous payment policy and the quality
of the sensing reports received. With a large state space, the Q-learning-based MCS
system suffers from the curse of dimensionality, which slows its learning. One
solution to this problem is deep reinforcement learning, in particular the deep
Q-network (DQN) technique that Google DeepMind developed in 2014 for playing video
games. More precisely, the authors propose a DQN-based MCS payment strategy that
uses a deep convolutional neural network (CNN) to estimate the Q-value of each
payment level and to compress the state space during training. By combining
Q-learning with deep learning, this MCS payment strategy speeds up the learning of
the optimal payment policy and thus improves the performance of the system in
defending against faked sensing attacks.
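The tabular Q-learning update that underlies such a payment policy can be sketched as
follows. The environment here is entirely hypothetical and far simpler than the MCS
game of the cited paper: three payment levels, an assumed probability of receiving an
accurate report for each level, and a state that records only whether the previous
report was accurate. A DQN would replace the table `q` with a neural network.

```python
import random

PAYMENTS = [0, 1, 2]             # hypothetical payment levels (the actions)
P_ACCURATE = [0.1, 0.9, 0.95]    # assumed chance of an accurate report per level
BENEFIT = 3.0                    # assumed gain to the server from an accurate report


def step(action, rng):
    """Toy environment: returns (reward, next_state) for a chosen payment level."""
    accurate = rng.random() < P_ACCURATE[action]
    reward = (BENEFIT if accurate else 0.0) - PAYMENTS[action]
    next_state = 1 if accurate else 0  # state: was the last report accurate?
    return reward, next_state


def q_learning(steps=50000, alpha=0.02, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(PAYMENTS) for _ in range(2)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(PAYMENTS))  # explore
        else:
            action = max(range(len(PAYMENTS)), key=lambda a: q[state][a])  # exploit
        reward, next_state = step(action, rng)
        # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q_table = q_learning()
policy = [max(range(len(PAYMENTS)), key=lambda a: row[a]) for row in q_table]
print(q_table)
print("greedy payment level per state:", policy)
```

With these assumed numbers, the middle payment level gives the best trade-off between
the cost of paying and the benefit of accurate reports, while the highest level
overpays.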
The authors (Schuurmans & Zinkevich, 2016) showed how training an acyclic neural
network with differentiable convex gates can be cast as a game, by establishing a
bijection between the Nash equilibria of the game and the critical (or KKT) points of
the deep learning problem. Their aim is a novel reduction of supervised learning to
game playing. An intriguing finding is that no-regret strategies, which are used to
solve large-scale games, can provide effective stochastic training methods for
supervised learning problems. The authors begin with the simpler one-layer learning
problem (OLP), which allows the fundamental ideas to be presented, and then extend
the treatment to deep models. For the OLP, they identify a straightforward game whose
Nash equilibria correspond to the global minima of the one-layer learning problem.
This fundamental connection creates a link between supervised learning and game
playing. Building on the OLP, the authors define the one-layer learning game (OLG),
the one-layer constrained learning problem (OCP), and the one-layer constrained
learning game (OCG).
For learning from expert advice, the authors considered two algorithms: the
normalized exponentiated weight algorithm (EWA) and regret matching (RM), a simpler
technique from the economics and game theory literature. For supervised learning,
these algorithms perform their updates using a random sample of the gradient. The
authors compared them with projected stochastic gradient descent (PSGD), a simple
modification of stochastic gradient descent (SGD) that has a similar regret bound.
They ran experiments on both synthetic data and the MNIST dataset to examine the
usefulness of these strategies for supervised learning. A significant contribution of
this research is to demonstrate the game-like nature of training a feedforward neural
network with differentiable convex gates. A useful outcome of this reduction is that
it suggests novel training strategies for deep models, motivated by techniques that
have recently succeeded in solving large-scale games. Many studies use regret
minimization to address offline optimization problems. Adagrad and conventional
stochastic gradient descent are currently two popular strategies. The idea of
simplifying the class of losses by selecting a minimizer from a specific family of
functions first appeared in the regret minimization literature and has subsequently
been expanded upon.
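To give a feel for regret matching, the sketch below applies it to the
rock-paper-scissors game of Sect. 4.2 rather than to neural network training, so it
is not the procedure of the cited paper. Each player plays in proportion to its
positive cumulative regrets, and the time-averaged strategies approach the uniform
equilibrium. The use of expected payoffs instead of sampled actions, the asymmetric
starting regrets, and all names are assumptions of this illustration.

```python
# Row player's payoffs (rock, paper, scissors); the column player gets the negative.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def action_values_row(col_strategy):
    """Expected payoff of each row action against the column player's mix."""
    return [sum(PAYOFF[i][j] * col_strategy[j] for j in range(3)) for i in range(3)]


def action_values_col(row_strategy):
    """Expected payoff of each column action against the row player's mix."""
    return [sum(-PAYOFF[i][j] * row_strategy[i] for i in range(3)) for j in range(3)]


def regret_matching(iterations=50000):
    """Self-play with regret matching; returns the two time-averaged strategies."""
    # Asymmetric start (an assumption) so that play is not uniform from round one.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        row = strategy_from_regrets(regrets[0])
        col = strategy_from_regrets(regrets[1])
        values = [action_values_row(col), action_values_col(row)]
        for player, strategy in enumerate((row, col)):
            expected = sum(s * v for s, v in zip(strategy, values[player]))
            for a in range(3):
                # Regret of action a: what it would have earned minus what was earned.
                regrets[player][a] += values[player][a] - expected
                strategy_sums[player][a] += strategy[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_row, average_col = regret_matching()
print(average_row)
print(average_col)
```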
In (Ren et al., 2020), the theoretical foundations, methods, and applications of
adversarial attack strategies are introduced. Adversarial attack and defense
techniques have attracted growing attention from both the machine learning and
security communities and have become a popular research topic in recent years. The
research community has identified a serious risk to the security of current DL
algorithms: by manipulating benign samples, adversaries can easily fool DL models
without being noticed by humans. Perturbations that are imperceptible to human vision
or hearing are sufficient to make the model predict incorrectly with a high degree of
confidence. This adversarial sample phenomenon is seen as a substantial barrier to
the widespread use of DL models in industrial settings. According to the threat
model, current adversarial attacks can be divided into white-box, gray-box, and
black-box attacks. In the white-box threat model, the adversaries are assumed to have
full knowledge of the architecture and parameters of the target model. In the
gray-box threat model, the adversaries know only the architecture of the target
model. In the black-box threat model, the adversaries can produce adversarial samples
only through query access.
Heuristic and certificated defenses, among others, have recently been proposed as
defensive methods for adversarial sample detection and classification. A heuristic
defense is a defense mechanism that successfully counters specific attacks in
practice but lacks theoretical accuracy guarantees. The most effective heuristic
defense at the moment is adversarial training, which aims to increase the robustness
of the DL model by including adversarial samples in the training phase. Since most
heuristic defenses cannot stop adaptive white-box attacks, the community is beginning
to concentrate on certificated defenses. A certificated defense is meant to ensure
defensive effectiveness in specific circumstances, independent of the attack strategy
employed by adversaries. The authors examine and review the adversarial attacks and
countermeasures that reflect the most recent developments in this field.
Particular adversarial attacks on DL models include the L-BFGS algorithm, the fast
gradient sign method, BIM and PGD, the momentum iterative attack, the
distributionally adversarial attack, the Carlini and Wagner attack, GAN-based
attacks, and many more. The authors also summarize typical defenses that have been
introduced recently, primarily adversarial training, randomization-based schemes,
denoising techniques, and provable defenses, along with a few other new defenses.
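As an illustration of one of these attacks, the fast gradient sign method perturbs an
input x by a step of size ε in the direction of the sign of the gradient of the loss
with respect to x. The sketch below applies it to a small logistic regression model
instead of a deep network so that the gradient can be written by hand; the weights,
the input, and the values of ε are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that x belongs to class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def loss(weights, bias, x, label):
    """Binary cross-entropy loss for one example."""
    p = predict(weights, bias, x)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, label, epsilon):
    """x_adv = x + epsilon * sign(gradient of the loss with respect to x)."""
    p = predict(weights, bias, x)
    # For logistic regression with cross-entropy loss, d(loss)/dx = (p - label) * w.
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


weights, bias = [2.0, -1.0, 0.5], 0.1   # hypothetical trained model
x, label = [0.5, 0.2, 0.8], 1           # hypothetical correctly classified input

x_adv_small = fgsm(weights, bias, x, label, epsilon=0.3)
x_adv_large = fgsm(weights, bias, x, label, epsilon=0.5)
print(predict(weights, bias, x), loss(weights, bias, x, label))
print(predict(weights, bias, x_adv_small), loss(weights, bias, x_adv_small, label))
print(predict(weights, bias, x_adv_large), loss(weights, bias, x_adv_large, label))
```

Adversarial training, mentioned above as a heuristic defense, would add such
perturbed inputs with their correct labels to the training set.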
In (Woo, 2019), the overall idea is to support nuclear security with the help of a
non-zero-sum game and deep learning. Nuclear power plants have been the target of
many attacks, and game theory plays an important role in ensuring their security. To
quantify the safety of nuclear power facilities, a complex nonlinear algorithm based
on the non-zero-sum technique of game theory is employed. Moreover, deep learning is
employed in the same areas of data computation as neural networking. There are
various ways to manipulate data, including information retrieval and the analysis of
event structures, which allow a deeper investigation of the data. The nonlinear
complex algorithm of game theory is used to analyze terrorism-related information
because it provides a more accurate description of complex phenomena such as
terrorist attacks on nuclear infrastructure. For modeling purposes, it uses a payoff
matrix that consists of five factors of nuclear security and two factors of nuclear
terrorism. A game tree is then built from the payoff matrix, and a payoff graph is
made from the game tree. One of the key attributes is the ability to handle nonlinear
algorithms, which do not require exact solutions to identify the outcomes of
investigations, particularly for critical events in society and business. Examining
human error, in addition to safety and security, has also been a high priority.
In (Arora et al., 2017), the use of deep reinforcement learning for real-time
strategy (RTS) games is discussed. The study presents the Deep RTS game environment
as a platform for investigating new AI methods in real-time strategy games. Deep RTS
is a high-performance game created specifically for AI research. Reinforcement
learning has already been used in simpler game settings, such as those of the Arcade
Learning Environment, but it has not been applied as effectively in more complex
games. The researchers therefore propose Deep RTS, a new game platform geared toward
deep reinforcement learning research. Deep RTS is an RTS simulator inspired by
StarCraft II, the well-known video game from Blizzard Entertainment. The Deep RTS
environment enables research on strategy, reasoning, and control at different levels
of difficulty. The work draws on microRTS and StarCraft II, with the objective of
creating an environment that includes challenges from each of them. The goal in Deep
RTS is to build a base consisting of a town hall and then to expand the base with
gathered resources to gain the military upper hand. Attacks are carried out by
military units with the main objective of destroying the opponent’s base. Initially,
each player has a worker unit. The main task of the worker units is to expand the
offensive and defensive capabilities of the base and to gather resources in the game
world. Buildings can spawn additional units that strengthen the player’s offensive
capabilities. To reach the terminal state, a player needs to eliminate every opposing
unit. A typical RTS game can be described in three stages. The first stage is the
gathering and base expansion phase. The second stage emphasizes economic and military
power, whereas the third stage is typically a death match between the players until
the game is over. Because Deep RTS targets a wide variety of reinforcement learning
tasks, it offers numerous game scenarios, including resource gathering, military, and
defensive tasks, each of which reduces the complexity of a full real-time strategy
game. Deep RTS aims to simulate RTS scenarios realistically and with very high
performance. The performance indicators are the speed at which the game engine
updates the game state and the speed at which the game state can be rendered as an
image. As a high-performance RTS simulator, the Deep RTS game platform enables fast
exploration and testing of cutting-edge reinforcement learning methods.
In paper (Yu et al., 2018), DeDOL is presented as one of the first attempts to tackle
extensive-form security games using deep Q-learning. DeDOL (Deep-Q Network
based Double Oracle enhanced with Local modes), a procedure based on deep
reinforcement learning, computes a patrolling strategy for zero-sum Green Security
Games (GSGs) that responds to real-time information. GSGs were developed to model
the strategic interaction between law enforcement agencies (the defenders) and their
adversaries (the attackers) in green security domains.
The defender and the attacker are the only two players in the game. At the start of
the interaction, the attacker selects a single entry point x from a set of entry
points, while the defender always starts from the patrol station. The attacker starts
with a limited number of attack tools and uses them to attack the targets he has in
mind. This game model suits several green security domains. The interaction ends
when a final time step T is reached or when the defender has discovered the attacker
and all of the attack tools. The outcome of the game is the defender’s total reward.
The study focuses on zero-sum games, in which the attacker’s reward is the negative
of the defender’s. The game assumes that both players have access only to local
observations: each observes the opponent’s footprints only in its current cell
rather than over the entire grid, reflecting the fact that players often have a
restricted view of their surroundings because of thick foliage, difficult terrain,
or severe weather.
Deep Q-learning (Hasselt et al., 2015) is used to estimate the best response with a
convolutional neural network. Training the vanilla version of the deep Q-network
(DQN) was difficult because the GSG-I environment is highly dynamic, particularly
when the other player used a randomized strategy. Therefore, to improve learning
stability, it uses double DQN. Finally, DeDOL is presented for zero-sum GSG-I, but
it can be adapted to general-sum GSG-I, particularly when the game is close to zero
sum.
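The difference between the vanilla DQN target and the double DQN target can be
illustrated with a minimal sketch. It is not the DeDOL implementation: the function
names and the Q-value lists are hypothetical, and the networks are replaced by plain
lists of action values for the next state.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Vanilla DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-value estimates for the next state (three actions).
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [1.5, 1.2, 3.0]

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
```

Because the action is chosen by one set of estimates and valued by another, the
double DQN target never exceeds the vanilla target for the same inputs, which is the
mechanism usually credited with reducing overestimation and stabilizing learning.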
Learning with opponent-learning awareness (LOLA) is an approach in which each
agent shapes the anticipated learning of the other agents in the environment. The
LOLA learning rule includes an additional term that accounts for the effect of one
agent’s policy on the anticipated parameter update of the other agents. When two
LOLA agents meet in the iterated prisoner’s dilemma (IPD), tit-for-tat and, as a
result, cooperation emerge, whereas independent learning does not lead to
cooperation. In this domain, LOLA also earns higher payoffs than a naive learner and
is robust against exploitation by higher-order gradient-based methods. Recent years
have seen a boom in multiagent reinforcement learning (RL) because of the
development of RL techniques that make it possible to study many agents in rich
environments (Fukushima, 1980).
The learning outcomes in games with cooperative and competitive elements have
long been studied in game theory. In particular, the IPD is frequently used to
study the tension between cooperation and defection. In this game, pursuing selfish
goals may lead to poorer overall outcomes for all agents, whereas cooperating
increases social welfare, one measure of which is the total payoff received by
all agents.
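The tension between individual incentives and social welfare can be checked on a
single round of the prisoner’s dilemma. The sketch below is illustrative only: the
payoff numbers are an assumption chosen to have the usual dilemma structure, and the
helper names are hypothetical.

```python
# One-shot prisoner's dilemma; PAYOFFS[(a1, a2)] = (payoff to agent 1, payoff to agent 2).
# The numbers are an illustrative assumption, not taken from the text.
C, D = "cooperate", "defect"
PAYOFFS = {
    (C, C): (-1, -1),
    (C, D): (-3, 0),
    (D, C): (0, -3),
    (D, D): (-2, -2),
}


def best_response(opponent_action):
    """Action that maximizes agent 1's own payoff against a fixed opponent action."""
    return max((C, D), key=lambda a: PAYOFFS[(a, opponent_action)][0])


def social_welfare(a1, a2):
    """Total payoff received by both agents."""
    return sum(PAYOFFS[(a1, a2)])


# Defection is the best response to either opponent action...
selfish_choice = {opp: best_response(opp) for opp in (C, D)}
# ...yet mutual defection yields lower total payoff than mutual cooperation.
welfare_gap = social_welfare(C, C) - social_welfare(D, D)
```

With these payoffs, each agent prefers to defect whatever the other does, while
mutual cooperation gives the higher total payoff, which is the conflict the IPD is
used to study.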
The additional term in the LOLA learning rule accounts for how one agent’s policy
affects the next learning step of the other agents. The approach is not restricted
to zero-sum games and can be used in general-sum settings as well. LOLA has been
applied to the deep RL setting using likelihood ratio policy gradients, which allows
it to scale to settings with high-dimensional input and parameter spaces. LOLA leads
to cooperation with high social welfare, whereas independent policy gradients, a
standard multiagent RL technique, do not. LOLA is also applicable when the
opponent’s policy is unknown and must be inferred from observations of the
opponent’s actions. In iterated matching pennies (IMP), LOLA leads to stable
learning of the Nash equilibrium. In a round-robin tournament against other
multiagent learning algorithms, exact LOLA agents obtain the highest average returns
on the IPD and perform reasonably well on IMP.
In paper (Lanctot et al., 2017), the authors note a renewed interest in multiagent
reinforcement learning (MARL) following the success of deep reinforcement learning.
In MARL, multiple agents interact and learn in the same environment at the same
time, either competitively or cooperatively. Independent RL (InRL) is the most basic
form of MARL, in which each agent is unaware of the other agents and simply treats
every interaction as part of its (“localized”) environment. The paper presents a new
metric for quantifying the correlation effects of policies learned by independent
learners and demonstrates the severity of the overfitting problem. It proposes a new
algorithm that uses deep RL to compute best responses to a distribution over
policies and empirical game-theoretic analysis to compute new meta-strategy
distributions. It assumes centralized training for decentralized execution, as is
typical in the MARL setting.
Policy-space response oracles (PSRO) is the main conceptual algorithm. It is a
natural generalization of Double Oracle in which the choices in the meta-game are
policies rather than actions. The procedure is described in detail in the paper.
The meta-game is represented as an empirical game that starts with a single policy
(uniform random) and grows each epoch by adding policies (“oracles”) that
approximate best responses to the meta-strategy of the other players. When the
other players in a partially observable multiagent environment are held fixed, the
environment becomes Markovian, and computing a best response reduces to solving a
form of MDP. In every episode, one player is set to oracle (learning) mode, and a
fixed policy is sampled for each opponent from the opponents’ meta-strategies.
Although PSRO is a natural and appealing generalization, the RL step may take a
long time to converge to a good response. In complex environments, many of the
basic behaviors learned in one epoch may have to be relearned when starting again
from scratch. To address these issues, the authors propose a practical parallel
form of PSRO in which the number of levels is fixed in advance instead of allowing
an unbounded number of epochs. This is known as deep cognitive hierarchy (DCH).
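The Double Oracle loop that PSRO generalizes can be sketched on a small matrix game.
The example below is a simplified illustration, not the algorithm from the paper: it
uses rock-paper-scissors, replaces the RL oracle with exact best responses by
enumeration, and approximates the meta-strategy with fictitious play. All function
names are hypothetical.

```python
# Row player's payoffs in rock-paper-scissors (zero-sum: the column player gets the negative).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]


def fictitious_play(rows, cols, iterations=2000):
    """Approximate meta-strategies of the game restricted to `rows` x `cols`."""
    row_counts = [0] * len(rows)
    col_counts = [0] * len(cols)
    row_counts[0] = col_counts[0] = 1
    for _ in range(iterations):
        # Each side best-responds to the other's empirical play so far.
        r = max(range(len(rows)), key=lambda i: sum(
            col_counts[j] * A[rows[i]][cols[j]] for j in range(len(cols))))
        c = max(range(len(cols)), key=lambda j: sum(
            -row_counts[i] * A[rows[i]][cols[j]] for i in range(len(rows))))
        row_counts[r] += 1
        col_counts[c] += 1
    total_r, total_c = sum(row_counts), sum(col_counts)
    return [n / total_r for n in row_counts], [n / total_c for n in col_counts]


def double_oracle():
    rows, cols = [0], [0]  # start from a single strategy per player
    while True:
        sigma_r, sigma_c = fictitious_play(rows, cols)
        # Oracle step: best response in the FULL game to the opponent's meta-strategy.
        br_row = max(range(3), key=lambda a: sum(
            p * A[a][cols[j]] for j, p in enumerate(sigma_c)))
        br_col = max(range(3), key=lambda b: sum(
            -p * A[rows[i]][b] for i, p in enumerate(sigma_r)))
        if br_row in rows and br_col in cols:
            return rows, cols, sigma_r, sigma_c  # no new strategy: stop
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


rows, cols, sigma_r, sigma_c = double_oracle()
```

Starting from rock alone, the loop adds paper and then scissors before it stops, so
the restricted game grows only as far as the best responses require. In PSRO the
entries of `rows` and `cols` would be learned policies, and the payoff table would be
estimated empirically by simulation.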
The paper proposes joint policy correlation (JPC) matrices to quantify the effect of
overfitting in independent reinforcement learners. It demonstrates that PSRO and
DCH produce general policies that reduce JPC significantly in partially observable
cooperative games, as well as robust counterstrategies that safely exploit
opponents in a common competitive imperfect-information game.
An approach for predicting driver behavior in highway driving scenarios is proposed
that merges deep reinforcement learning and hierarchical game theory. A large number
of test-driving miles is thought to be necessary for self-driving vehicles to
achieve the same level of safety assurance as human-driven cars. The game-theoretic
driver models (Albaba & Yildiz, 2021) are compared with actual human driving
behavior extracted from traffic data to demonstrate the fidelity of the proposed
modeling framework. In the open literature, a number of methods have been proposed
for obtaining high-fidelity models of human drivers. Markov dynamic models (MDMs)
are used for the recognition and prediction of driving actions such as braking and
steering control. A method called SITRAS (simulation of intelligent transport
systems) has been proposed to model lane-changing and merging behaviors. Dynamic
Bayesian networks (DBNs) can recognize lane changing or speeding. A number of
articles have thus been published on problems related to self-driving vehicles.
The paper offers a probabilistic modeling approach based on game theory and deep
reinforcement learning that enables simultaneous decision-making in multiagent
traffic scenarios. Previous research models a decision-making ego driver and assumes
fixed responses for the other drivers; this approach differs because every driver in
a multi-move scenario makes strategic decisions simultaneously. This is accomplished
by combining deep Q-learning (DQN), a reinforcement learning technique, with level-k
reasoning, which comes from hierarchical game theory (Albaba & Yildiz, 2021).
Level-k reasoning is a game-theoretic concept used to model the strategic thinking
process of human drivers. It is a hierarchical decision-making framework that
assumes that individuals have different levels of reasoning ability. Level-k
reasoning is insufficient on its own, so DQN and level-k reasoning are combined to
produce driver models that respond well to each of the agents’ predicted behaviors
in a multi-move scenario. In this study, the model is validated using two sets of
traffic data collected on the US101 and I80 highways. Of these two sets, the US101
set is used to determine the values of the observation and action space parameters.
Two reinforcement learning (RL) techniques, deep Q-learning (DQN) and its continuous
counterpart, c-DQN, are used together with level-k reasoning to train driver
policies. The Kolmogorov-Smirnov goodness-of-fit test (K-S test) is used to compare
the proposed policies, or driver models, with the policies derived from real traffic
data.
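The level-k hierarchy can be illustrated with a toy two-car merge. This is a sketch
under strong assumptions: the payoff table is invented for illustration, and each
level computes its best response by enumeration, whereas the paper trains each level
with DQN against the level below it.

```python
ACTIONS = ("yield", "go")

# PAYOFF[(mine, other)] for a hypothetical two-car merge; the values are assumptions.
PAYOFF = {
    ("yield", "yield"): -1,   # both hesitate and lose time
    ("yield", "go"): 0,       # I wait while the other car merges
    ("go", "yield"): 2,       # I merge first
    ("go", "go"): -10,        # collision
}


def level_k_action(k, level0_action="go"):
    """Level-0 is non-strategic; level-k best-responds to level-(k-1)."""
    if k == 0:
        return level0_action
    other = level_k_action(k - 1, level0_action)
    return max(ACTIONS, key=lambda a: PAYOFF[(a, other)])


policies = [level_k_action(k) for k in range(4)]
```

Here a level-1 driver yields to the non-strategic level-0 driver, and a level-2
driver, expecting a level-1 opponent to yield, goes. Each level therefore encodes a
different depth of reasoning about the other driver.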
The authors (Xu et al., 2022) discussed the Internet of Vehicles (IoV). The IoV is a
flexible mobile networking framework that connects users, vehicles, sensing devices,
and service providers to enable communication between vehicles and people on the
network. Vehicles typically have limited computing capabilities, which prevents them
from processing large volumes of service requests quickly. However, when a vehicle
is traveling at high speed, time-critical user requests such as collision warnings
must be handled immediately. Cloud computing partially improves the quality of
service (QoS) of vehicular services in the IoV by reducing service execution
latency. The emergence of edge computing (EC) offers a further way to address the
drawbacks of cloud computing in data transmission. By placing servers in roadside
units (RSUs) close to the users, EC, as a new computing paradigm, dramatically
shortens the data transmission distance. In most cases, however, the edge device has
limited storage and computing capabilities. When the numbers of RSUs and service
requests are small, the execution plan for users’ services can be determined through
intelligent control methods such as reinforcement learning.
It is difficult to maintain the QoS of services in the IoV within the limits of RSU
resources, which necessitates a fair service offloading mechanism based on an
estimate of future traffic flow. To increase resource utilization, the RSUs have to
optimize the allocation of hardware and software resources. The RSU load state and
the service offloading decisions must therefore be optimized using a fast, reliable
traffic flow prediction method. To address these issues, the research proposes a
game-theoretic service offloading method for edge computing in the IoV. The
Takagi-Sugeno fuzzy neural network (T-S FNN) is used in the method to forecast
short-term traffic flow. In contrast to conventional traffic prediction methods
(such as ST-ResNet), which include more hidden neurons and learnable parameters, the
T-S FNN updates its parameters with a hybrid of back propagation and the least
squares algorithm. The T-S FNN is built on the Sugeno model, which integrates neural
networks with fuzzy systems and uses the network to tune the variables of the fuzzy
system. A fuzzy neural network with multiple inputs and one output is created. The
goal is to improve the overall QoS of all services provided to users.
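The inference step of a first-order Takagi-Sugeno model can be sketched in a few
lines. The example is a simplification under stated assumptions: it uses a single
input rather than multiple inputs, Gaussian membership functions, and two hypothetical
rules whose parameters are hand-picked rather than learned by back propagation and
least squares.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


# Each hypothetical rule: (center, sigma, slope, intercept).
# IF the input is near `center` THEN output = slope * input + intercept.
RULES = [
    (0.0, 3.0, 1.0, 0.0),
    (10.0, 3.0, 0.5, 10.0),
]


def ts_predict(x, rules=RULES):
    """First-order Takagi-Sugeno output: average of rule outputs weighted by firing strength."""
    weights = [gaussian(x, c, s) for c, s, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


midpoint_prediction = ts_predict(5.0)
```

At the midpoint between the two rule centers both rules fire equally, so the
prediction is the plain average of the two linear consequents. In a T-S FNN the
membership parameters and the consequent coefficients are the quantities adjusted
during training.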
The authors (Foahom Gouabou et al., 2022) discussed AI-assisted diagnosis, which is
regarded as a potential solution for early melanoma detection. Deep learning, and
convolutional neural networks (CNNs) in particular, has achieved significant
breakthroughs, yet computer-aided diagnostic (CAD) technologies are not yet widely
deployed in clinical settings. Game theory, which focuses on finding the decision
alternative that maximizes the decision maker’s expected payoff, is a potential
remedy. The paper presents a new paradigm for automatic melanoma detection. CAD
systems are a promising way to help dermatologists with melanoma detection in real
practice. Indeed, CNN-driven CAD systems have achieved exceptional results, matching
dermatologists in a clinical trial. Most of the commonly proposed methods for
improving skin cancer detection accuracy fall into one of three groups: ensemble
learning, data augmentation, and transfer learning.
The objective of this work is to build a reliable CAD system for melanoma diagnosis
and to explain the reasoning behind its decisions. Because the workflow of the
research is an ensemble method combining multiple models, the models were built from
pretrained architectures, and the training data were augmented with synthetic images
produced by artificial data generation methods. Given the available resources, the
models used in this work are based on the B5 version of EfficientNet. The authors
modified the original network by replacing the classification layer with a second
fully connected (FC) layer that has two nodes for binary classification or three
nodes for ternary classification.
To create a decision-making process that is intelligible to users and nonexperts,
the study offers a novel hierarchical architecture motivated by game theory. Game
theory can model conflict between rival players and also allows the behavior of the
different players to be analyzed. Two strategies are defined for the players in the
game-theoretic method used in this work. The first is to group the predictions by
degree of confidence (confident or unsure), and the second is to produce the final
prediction. The output probabilities of the models determine the payoff value.
Moreover, a novel method for determining the level of confidence in a prediction is
provided. The architecture has been integrated with a heatmap display to make the
results easy to interpret. This strategy enables the CAD system to make its
decisions more transparent and to outperform earlier approaches. The findings of
this work demonstrate that it is feasible to develop explainable automatic medical
systems based on deep learning without sacrificing diagnostic accuracy.
The authors (Khadarvali et al., 2022) discussed the smart grid, the most recent
technology for producing and distributing the right quantity of power. Most modern
smart grid applications require load frequency control (LFC), which is critical and
offers superior control of two-area systems. Whenever two power systems are linked
to one another, an LFC is applied to maintain the stability of the connection. The
power and frequency can vary whenever there is a significant change, such as a
generator shift or load shift. The frequency error must be zero to synchronize the
two systems. To achieve this, fuzzy logic controllers (FLC), proportional integral
controllers (PIC), and integral controllers (IC) are all used. The authors replaced
the conventional integral controller in the two areas with an artificial neural
network (ANN) trained on the results of an ant lion optimization (ALO)-tuned PI
controller (ALO-PI). Both the input and the output training data were obtained from
the ALO-PI controller, and a backpropagation technique was used to train the ANN to
improve control.
The grid is becoming smarter thanks to recent technological advancements, and it
relies on wireless data communication. Wireless data transmission carries risks, and
game theory plays a significant role in analyzing them. This study focuses on
stability, the most crucial property of a power system, and frequency is one of the
most important quantities to maintain. This is because a power system’s frequency
rises or falls with the load demand placed on it. The frequency control system has
three modes of operation. When a step-load change is introduced to the system, the
primary and secondary frequency controls together constitute a parallel PIC that can
drive the frequency deviation to zero. Tertiary frequency control relies on offline
optimization. The paper describes different attacks on LFC and presents
countermeasures against them. The conventional integral controller and the proposed
ANN were compared, taking into account the behaviors of the attacker and the
defender, and the ANN outperformed the IC in settling time.
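Why the integral term is needed to drive the frequency deviation to zero after a step
load change can be seen in a toy simulation. The sketch assumes a single first-order
plant with made-up constants, which is far simpler than the two-area model in the
paper; the function name and all parameter values are hypothetical.

```python
def simulate(kp, ki, load_step=0.1, a=1.0, dt=0.01, steps=2000):
    """Euler simulation of df/dt = -a*f + u - load_step with a PI controller."""
    f = 0.0          # frequency deviation
    integral = 0.0   # running integral of the deviation
    for _ in range(steps):
        u = -kp * f - ki * integral      # parallel PI control signal
        f += dt * (-a * f + u - load_step)
        integral += dt * f
    return f


f_p_only = simulate(kp=2.0, ki=0.0)   # proportional action only
f_pi = simulate(kp=2.0, ki=5.0)       # proportional plus integral action
```

With proportional action alone, the deviation settles at a nonzero offset of
-load_step / (a + kp); adding the integral term removes the offset, so the deviation
returns to zero.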
The authors (Kishorea et al., 2020) discussed image restoration techniques.
Restoration is a process that improves the parts of a picture distorted by damage or
by blur due to faulty image capture. The authors proposed a deep learning generative
adversarial network (GAN) with two main components: a discriminator (a convolutional
neural network) and a generator (a deconvolutional neural network). The generator
network takes input data and estimates its joint probability distribution in order
to generate further data points that follow the same distribution. The discriminator
network is a simple classifier that labels the images produced by the generator as
either different from or similar to the target image. The aim is to obtain images
from the generator whose probability distribution is as close as possible to that of
the target image. The Nash equilibrium and the minimax algorithm are two key ideas
from game theory that relate to the contest between generator and discriminator. The
primary goal of the generator network is to increase the error of the discriminator.
A well-known dataset is used as training data for the discriminator, and training
dataset samples are used so that it achieves high accuracy and a low error rate. The
generator adjusts its training according to whether it is able to fool the
discriminator.
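The minimax contest can be made concrete with the standard GAN value function, which
the discriminator tries to maximize and the generator tries to minimize. The sketch
below evaluates that function on hypothetical discriminator outputs; it is the
textbook objective, not the objective used in the paper discussed here.

```python
import math


def gan_value(d_real, d_fake):
    """V(D, G) = mean(log D(x)) + mean(log(1 - D(G(z)))) over two sample batches."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs (estimated probability that an image is real).
confident = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

When the discriminator can do no better than output 0.5 for every image, the value
equals -log 4, which is the value at the equilibrium of the standard GAN game. A
discriminator that still separates real from generated images achieves a higher
value, which the generator then tries to push back down.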
Backpropagation is used in both neural networks: it enables the generator to produce
better results, and it helps the discriminator learn to recognize, or flag,
synthesized images. In this article, the authors propose a new technique called
EIRGAN, an improved generative adversarial network for image restoration that
deblurs blurry images and is formulated as an alternative to the Nash equilibrium.
Additionally, they use the outlined mathematical strategy, restoring images through
GAN training with the negative-f divergence function. They also compared their
generative adversarial network with other common and state-of-the-art GAN designs
that are publicly available by training and evaluating the model on the same dataset
as those designs.
In paper (Shen et al., 2020), machine learning and game theory were used to study
satellite behavior and to gather critical information from space. Space situational
awareness (SSA) is used to control satellite mobility, and it depends on fast and
accurate classification and detection of space object behavior. The biggest barrier
to applying machine learning (ML) methods to SSA and assessing their effectiveness
is the absence of a labeled dataset for training and validation. The authors
introduce a data augmentation methodology that incorporates game theory to generate
datasets for ML-based satellite behavior identification. Data such as elevation
angle, azimuth angle, range, and range rate are obtained through SGP4/SDP4. To
create evasive maneuvering strategies for space objects, they use a two-player
pursuit-evasion game. The game-theoretic approach is used to address SSA behavior
detection for resident space objects (RSOs). While the SSA observer improves
tracking performance, the RSO manipulates tracking estimates through a sensing and
monitoring system to deceive the SSA observer. The analysis of RSOs can be
coordinated using user-defined operation pictures (UDOP) to carry out SSA correctly.
They also used GANs to further augment the simulated data and thereby enrich the
training data.
A 3D-CNN is used to classify satellite behavior and to assess performance. The
filters of the convolutional network are initialized randomly, so the initial
classification is random. As the number of iterations increases, the accuracy of the
model on the training data reaches 100%. The rest of the dataset is used to evaluate
the trained CNN model, which classifies satellite behaviors with 97% accuracy. A
ResNet is used to improve the GAN model so that it can perform data augmentation
correctly. This model-based, game-theoretic synthetic data improves the
effectiveness of machine learning algorithms during training and validation. The
purpose of the study is to show that classification using deep learning models,
supported by game-theory-enabled data augmentation and generated training data, can
identify the behavior of space objects with greater accuracy.
In paper (Pal & Vidal, 2020), a game-theoretic framework is proposed for studying
adversarial attacks and defenses. Under a linear assumption on the decision boundary
of the underlying binary classifier, a set of attacks and defenses exists in Nash
equilibrium in their model. The model specifies which input modifications the
attacker and defender are permitted to make, how much information the attacker and
defender have about one another, and which strategies are allowed. The authors
provide an optimization-based approach to determine the equilibrium defense for any
given classifier and derive generalization bounds for the classifier on unseen test
data. Given a training set of n unique samples, they provide methods to approximate
the optimal defense and derive generalization bounds on the performance of the
finite-sample estimate. Their bounds show that the estimate approaches the optimal
defense at a fast rate of O(sqrt(log n / n)) with respect to the sample count n.
They also review classical game-theoretic approaches to the adversarial
classification problem, as well as other contemporary defenses that offer
theoretical guarantees on performance under attack. The classical adversarial
classification work focuses on email spam detection: given a dataset (X, Y), an
adversary can alter the positive (spam) examples at a transformation cost determined
by a cost function c, and the defender is free to select a classifier h that divides
every x in X into spam and nonspam categories.
In contrast to previous research, the defender in this study applies an additive
perturbation rather than choosing a classifier. The base classifier is treated as
fixed and given, to be attacked or defended, in accordance with practice. The
authors also concentrate on the scenario in which the attacker is capable of adding
perturbations. This work seeks to characterize adversarial attacks and defenses that
are game-theoretically optimal against each other, in the sense that the attacker
cannot reduce the robust accuracy if the defense is fixed and the defender cannot
improve the robust accuracy if the attack is fixed.
In paper (Wong et al., 2021), the aim is to analyze the resource allocation problem
in multicell MIMO networks and to develop a distributed algorithm that optimizes
frequency allocation for the users of every base station, given channel state
information (CSI) at every base station. Multiple-input multiple-output (MIMO)
systems achieve exceptional capacity by using multiple antennas at the input and
output ends. In practice, however, the number of antennas should not be too high; in
the case of 5G, for example, the number of antennas should not be greater than 64.
Because all users within the same cell share the same frequency channels,
interference can be handled with zero-forcing based on the channel state. The next
task is to optimize distributed dynamic frequency allocation (DFA) across all the
base stations. The DFA problem is set up as an artificial forward-looking game in
which the base stations compete strategically with one another for frequency
resources. The difficulty with game theory in this application is that a rational
player is self-interested and concerned solely with its own payoff, which is
determined both by its own strategy and by the reactions of its rivals. A player
ought to optimize its strategy based on the long-term payoff rather than the
immediate reward, which means it has to consider what will happen in the future as
well as in the present. This study aims at joint optimization in multicell scenarios
by exploiting the synergy between MIMO and DFA created by forward-looking games.
The base stations are trained to perfect their game-theoretic reconciliation
strategies using multiagent deep reinforcement learning (DRL) with centralized
offline training so that they optimize the network capacity. In the end, each base
station has a trained neural network that encodes extensive knowledge about
reconciling with other base stations, so that the system converges to a
network-efficient equilibrium. The authors also used the QMIX architecture in
conjunction with TD3 (a double Q-learning variant) to optimize network capacity.
The authors (Hu et al., 2022) discussed how image segmentation is performed with the
help of deep learning and game theory. They introduced the Unet-Ore neural network
model, an improved technique that addresses inadequate segmentation caused by ore
that is only fuzzily delineated in each picture. Both game theory and deep learning
are used to segment the ore images. The proposed Unet-Ore network differs from the
conventional neural network structure: by changing parts of the conventional
structure, it improves feature extraction and generalization capacity.
The authors (Ardekani et al., 2022) discussed path selection for self-driving cars
in complicated situations using a novel method based on memory neural networks and
Nash equilibrium. The proposed strategy is assessed in a racing game with two
vehicles: the ego and the opponent. The proposed approach is known as GT-LSTM.
First, memory neural networks are used to learn and predict the behavior of the
opponent’s car. Second, the agents use game payoff matrices to decide which course
of action will result in the greatest payoff. Third, PID controllers smooth out lane
changes and path following. The two domains (game theory and neural networks) are
merged to make decision-making easier and more accurate, which is done by training
neural networks to predict how the opponent car behaves. In simulation, the proposed
approach outperforms the two other methods, with a success rate of 55%, compared to
15% for game theory without knowledge of the opponent’s vehicle and payoff matrices
and 32% for game theory with traditional neural networks. In addition, it has been
demonstrated that in 90.2% of the cases, the output of the proposed algorithm
corresponds to a Nash equilibrium.
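Checking whether a chosen pair of actions is a Nash equilibrium of a payoff matrix
can be done by enumeration. The sketch below is illustrative only: the two actions
and the payoff values are hypothetical and are not taken from the racing experiments.

```python
from itertools import product

LANE_ACTIONS = ("keep_lane", "change_lane")

# Hypothetical payoffs: LANE_PAYOFFS[(ego, opp)] = (ego payoff, opponent payoff).
LANE_PAYOFFS = {
    ("keep_lane", "keep_lane"): (1, 1),
    ("keep_lane", "change_lane"): (0, 3),
    ("change_lane", "keep_lane"): (3, 0),
    ("change_lane", "change_lane"): (-5, -5),
}


def pure_nash_equilibria(payoffs=LANE_PAYOFFS, actions=LANE_ACTIONS):
    """Return the action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for ego, opp in product(actions, repeat=2):
        ego_ok = all(payoffs[(ego, opp)][0] >= payoffs[(alt, opp)][0] for alt in actions)
        opp_ok = all(payoffs[(ego, opp)][1] >= payoffs[(ego, alt)][1] for alt in actions)
        if ego_ok and opp_ok:
            equilibria.append((ego, opp))
    return equilibria


equilibria = pure_nash_equilibria()
```

With these payoffs there are two pure equilibria, each with exactly one car changing
lanes. A predictor of the opponent’s next action, such as the memory network
described above, is what lets the ego vehicle choose between them.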
The authors (Mamoudan et al., 2022) discuss a pricing strategy for perishable food
items that takes into account both the product’s brand value and the prices of rival
producers. Consider a customer buying a product in a supermarket who must choose
between product A, whose expiration date is near, and product B, whose expiration
date is far off. The customer will clearly prefer product B. Many factors influence
the demand for a product, such as the company’s brand value, the product’s price,
and the demand rate. The prices of food products are set at the factory, and the
salesman cannot change them afterward. The model therefore helps predict a
reasonable price. Algorithms powered by deep learning and artificial intelligence
can be used to forecast the pricing strategies of competing producers. At both the
micro- and macro-levels, it is important to predict how much different products will
cost. Because salesmen cannot change the price of a product, products that are not
sold before they expire can add significant costs to the supply chain and increase
environmental damage. Applying game theory, which can lead to positive interaction,
is consequently one useful tactic for managing the green supply chain.
The pricing model is formulated using a game-theoretic method. The model involves
the supplier, the vendor, and consumers, and it treats the supplier’s brand value as
a variable. The company name can have a large impact on consumer behavior, and
consumers’ decisions to select and purchase items can be influenced by the brand.
Accurate food price forecasting paves the way for the adoption of consumer and
producer protection measures. The forecasting model should have high accuracy and
low error.
The recommended framework, known as CNN-LSTM-GA, blends a CNN, an LSTM, and a
genetic algorithm (GA). The CNN layer extracts features across the variables that
affect food prices, and the LSTM layer models the temporal data, which follow
irregular patterns over time. The hyperparameters of the CNN-LSTM model are then
tuned with the GA to reduce the prediction error. The authors compare the proposed
CNN-LSTM-GA technique with other deep learning models using validation metrics
including the root mean square error (RMSE), mean absolute error (MAE), R-squared
(R2), and mean squared error (MSE). RMSE, MAE, and MSE reflect the degree of
prediction error, so a smaller value indicates a more effective model, whereas an
R2 closer to 1 indicates a better fit. The model is also useful for seasonal food
items; even if they are not perishable, they incur a significant storage cost.
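The four validation metrics named above can be computed in a few lines. The following is a minimal sketch using only the Python standard library; the function name and the price series are hypothetical illustrations and are not taken from Mamoudan et al. (2022).

```python
import math

def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE and R2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}

# Hypothetical weekly prices and forecasts (illustrative numbers only).
actual_prices = [10.0, 12.0, 11.0, 13.0]
predicted_prices = [11.0, 12.0, 10.0, 13.0]
metrics = regression_metrics(actual_prices, predicted_prices)
```

For these numbers, MSE and MAE are both 0.5 and R2 is 0.6. The three error metrics shrink toward 0 as forecasts improve, while R2 grows toward 1.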
References
Albaba, B. M., & Yildiz, Y. (2021). Driver modeling through deep reinforcement learning and
behavioral game theory. IEEE Transactions on Control Systems Technology, 30(2), 885–892.
Ardekani, A. A., Chahe, A., & Yazdi, M. R. H. (2022, November). Combining deep learning and
game theory for path planning in autonomous racing cars. In 2022 10th RSI international con-
ference on robotics and mechatronics (ICRoM) (pp. 564–571). IEEE.
Arora, S., Ge, R., Liang, Y., Ma, T., & Zhang, Y. (2017). Generalization and equilibrium in genera-
tive adversarial nets (gans). In: Proceedings of the 34th international conference on machine
learning, Vol. 70. JMLR.org (pp. 224–232).
Dasgupta, P., & Collins, J. (2019). A survey of game theoretic approaches for adversarial machine
learning in cybersecurity tasks. AI Magazine, 40(2), 31–43.
https://doi.org/10.1609/aimag.v40i2.2847.
Foahom Gouabou, A. C., Collenne, J., Monnier, J., Iguernaissi, R., Damoiseaux, J. L., Moudafi,
A., & Merad, D. (2022). Computer aided diagnosis of melanoma using deep neural networks
and game theory: Application on Dermoscopic images of skin lesions. International Journal of
Molecular Sciences, 23(22), 13838.
Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism
of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202.
Hasselt, H. V., Guez, A, & Silver, D. (2015). Deep reinforcement learning with double q-learning.
AAAI. https://doi.org/10.1609/aaai.v30i1.10295.
Hazra, T., & Anjaria, K. (2022). Applications of game theory in deep learning: A survey. Multimedia
Tools and Applications, 81, 8963–8994. https://doi.org/10.1007/s11042-022-12153-2
Hu, W., Liu, X., & Xie, Z. (2022). Ore image segmentation application based on deep learning and
game theory. In World science: Problems and innovations (pp. 71–76).
Khadarvali, S., Madhusudhan, V., & Kiranmayi, R. (2022). Artificial neural network controller
in two-area and five-area system with security attack and game-theory based defender action.
Energies, 15(15), 5715.
Kishorea, A., Kumarb, A., & Dangc, N. (2020). Enhanced image restoration by GANs using game
theory. International conference on smart sustainable intelligent computing and applications.
Procedia Computer Science, 173(2020), 225–233.
Lanctot, M., et al. (2017). A unified game-theoretic approach to multiagent reinforcement learning.
Advances in Neural Information Processing Systems., 30.
Li, Y. (2017). Deep reinforcement learning: An overview. ArXiv, abs/1701.07274.
https://doi.org/10.48550/arXiv.1701.07274.
Mamoudan, M. M., Mohammadnazari, Z., Ostadi, A., & Esfahbodi, A. (2022). Food products
pricing theory with application of machine learning and game theory approach. International
Journal of Production Research, 1–21.
Pal, A., & Vidal, R. (2020). A game theoretic analysis of additive adversarial attacks and
defenses. In 34th conference on neural information processing systems (NeurIPS 2020),
Vancouver, Canada.
Ren, K., Zheng, T., Qin, Z., & Liu, X. (2020). Adversarial attacks and defenses in deep learning.
Engineering, 6, 346–360.
Schuurmans, D., & Zinkevich, M. A. (2016). Deep learning games. In: Advances in neural infor-
mation processing systems (pp. 1678–168).
Shen, D., Sheaff, C., Chen, G., Guo, M., Sullivan, N., Blasch, E., & Pham, K. (2020). Game theo-
retic synthetic data generation for machine learning based satellite behavior detection. In The
advanced Maui optical and space surveillance technologies (AMOS) conference.
van den Nouweland, A. (2007). Rock-paper-scissors; a new and elegant proof. Department of
Economics, University of Oregon, and Department of Economics, the University of Melbourne,
Australia.
Wang, R. Q., Wang, W. Z., Zhao, D. Z., Chen, G. H., & Luo, D. S. (2019). Scene recognition based
on DNN and game theory with its applications in human-robot interaction. arXiv preprint
arXiv:1912.01293.
Weerasinghe, S., Alpcan, T., Erfani, S. M., Leckie, C., Pourbeik, P., & Riddle, J. (2018).
Deep learning based game-theoretical approach to evade jamming attacks. In L. Bushnell,
R. Poovendran, & T. Başar (Eds.), Decision and game theory for security. GameSec
2018 (Lecture notes in computer science) (Vol. 11199). Springer.
https://doi.org/10.1007/978-3-030-01554-1_22
Wong, K. K., Liu, G., Cun, W., Zhang, W., Zhao, M., & Zheng, Z. (2021). Truly distributed multi-
cell multi-band multiuser MIMO by synergizing game theory and deep learning. IEEE Access,
9, 30347–30358.
Woo, T. H. (2019). Game theory based complex analysis for nuclear security using nonzero sum
algorithm. Annals of Nuclear Energy, 125, 12–17.
Xiao, L., Li, Y., Han, G., Dai, H., & Poor, H. V. (2017). A secure mobile crowdsensing game
with deep reinforcement learning. IEEE Transactions on Information Forensics and Security,
13(1), 35–47.
Xu, X., Jiang, Q., Zhang, P., Cao, X., Khosravi, M. R., Alex, L. T., et al. (2022). Game the-
ory for distributed IoV task offloading with fuzzy neural network in edge computing. IEEE
Transactions on Fuzzy Systems, 30(11), 4593–4604.
Yu, L., et al. (2018). Deep reinforcement learning for green security game with online information.
Workshops at the thirty-second AAAI conference on artificial intelligence.
Chapter 5
Case Studies and Different Applications
Game theory is a branch of mathematics that studies interactions and
decision-making in situations where the outcomes of one individual’s choices depend on the
choices made by others. It has a wide range of applications across various fields,
including economics, politics, biology, psychology, and more. Some notable appli-
cations of game theory are discussed in this chapter.
5.1 Auctions
Auction theory is a branch of economics and game theory that studies the design
and analysis of auctions. It explores how auctions work as mechanisms for allocat-
ing goods, services, or resources to potential buyers and how bidders strategically
interact in the auction process. The main goals of auction theory are to understand
the properties of different auction formats, predict the outcomes of auctions, and
design auctions that meet the seller’s objectives. Every auction is a mechanism
defined by an allocation rule and a payment rule.
The main elements of an auction consist of the following:
Players: The players are the bidders, who compete to obtain the item. Each bidder
has a private value for the item.
Strategies: A strategy is a bidding plan intended to increase a player’s chance of
winning. Players generally choose their bids based on their own values and on
the other players’ actions and strategies.
Revenue Maximization: For the auctioneer, revenue maximization is often a pri-
mary objective. By understanding bidder behavior and strategic interactions, the
auctioneer can design auction formats that optimize the revenue generated from
the auction.

Different types of auctions are discussed below (Cintuglu
et al., 2015):
1. English auction: Bidders compete by raising their bids until no one is willing
to bid higher. The auctioneer announces the current highest bid, and the process
iterates until only one bidder remains.
2. Dutch auction: The auctioneer starts with a high price and lowers it until
a bidder accepts it.
3. First-price auction: Bidders submit sealed bids without knowing the bids of
the other bidders, and the highest bidder wins the item and pays the seller the
amount of their own bid. This format is commonly applied in real estate, for
example in house or property auctions.
4. Vickrey auction: Bidders simultaneously submit sealed bids to the seller. The
highest bidder gets the object but pays the second-highest bid.
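The two sealed-bid formats in the list above share the same allocation rule (the highest bid wins) and differ only in what the winner pays. The sketch below illustrates this with hypothetical bidders and bid amounts; the function name is an assumption made for this illustration, ties are broken in favor of the bidder listed first, and at least two bids are assumed.

```python
def sealed_bid_outcome(bids, rule):
    """Return (winner, price) for a sealed-bid auction.

    bids: dict mapping bidder name to bid amount (at least two bidders).
    rule: "first_price" or "vickrey" (second-price).
    """
    # Sort bidders from highest to lowest bid; the sort is stable on ties.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, highest = ranked[0]
    if rule == "first_price":
        return winner, highest          # winner pays their own bid
    if rule == "vickrey":
        return winner, ranked[1][1]     # winner pays the second-highest bid
    raise ValueError("unknown rule: " + rule)

# Hypothetical bids (illustrative numbers only).
bids = {"Ann": 120, "Bob": 100, "Cid": 90}
first_price = sealed_bid_outcome(bids, "first_price")
vickrey = sealed_bid_outcome(bids, "vickrey")
```

Here Ann wins under both rules, paying 120 in the first-price auction and 100 in the Vickrey auction.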
Game theory helps to analyze how bidders strategize and bid in auctions. It
supports decision-making by predicting the strategies that bidders are likely to
choose and the outcomes that result from the other players’ bids, which helps to
increase payoffs. Game theory investigates how bidders strategically determine
their bids to maximize their utility or profits. Game theory can also help to
detect fraud in auctions by analyzing bidding patterns and behaviors.
Pricing: A pricing decision in game theory refers to the strategic choices made by
competing firms in setting the prices of their products or services. Game theory
provides a framework to analyze the interactions and strategies of firms in a
competitive market, where each firm’s pricing decision affects its own profits as
well as the profits of other firms and organizations.
Price Competition (Pricing Games): Game theory is used to model and analyze
pricing games where competing firms set their prices strategically to maximize their
profits. In a market with two firms, for example, the firms may engage in
price competition to gain a larger market share by observing each other’s moves.
There are various models of pricing games; some of them are discussed below.
Bertrand Model: The Bertrand model, named after the French economist Joseph
Bertrand, is a classic example of a price competition game. Two firms selling
similar products simultaneously set their prices, each without knowing the
price set by the other firm, and the firm with the lowest price wins the
entire market (Tremblay & Tremblay, 2011). If one firm sets a higher price and
the other a lower price, the firm offering the lower price wins. One significant
limitation is that the assumption of simultaneous decision-making might not
hold in all markets.
Nash Equilibrium: Game theory helps to identify the Nash equilibrium in price
competition games. In the Bertrand model with identical products and equal
costs, both firms set prices at their marginal cost, resulting in zero economic
profits. This equilibrium demonstrates that firms have limited pricing power
under intense price competition, as in perfectly competitive markets.
Cooperative Pricing Games: Game theory helps to analyze cooperative pricing
games, where firms in a cartel collude to set prices collectively to maximize joint
profits.
Cartel Stability: Game theory is used to study the stability and sustainability of
cartels as well as the factors that can lead to cartel breakdowns due to noncoop-
erative behavior.
Let us consider an example of two organizations, A and B, each with two possible
strategies: charging a price of $5 or $10. In this example, 200 units are demanded
at $5 and 400 units at $10. If both organizations charge the same price, they split
the demand equally; otherwise, the lower-priced organization serves the 200 units
demanded at $5, and the other sells nothing.
Case 1: If both charge $5, each sells 200/2 = 100 units, so the profit is
5*100 = 500 for each of them.
Case 2: If both charge $10, each sells 400/2 = 200 units, so the profit is
200*10 = 2000 for each organization.
Case 3: (5,10) A earns 5*200 = 1000, and B earns 0.
Case 4: (10,5) B earns 200*5 = 1000, and A earns 0.
In Table 5.1, ($5, $5) is a Nash equilibrium in which both organizations obtain
500 as profit: an organization that unilaterally raises its price to $10 loses all
of its sales. If the organizations cooperate, they obtain ($10, $10), with a profit
of 2000 each. With the payoffs of this example, ($10, $10) is also a Nash
equilibrium, because undercutting to $5 would lower the deviating organization’s
profit from 2000 to 1000.
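The equilibria of this small game can be verified by brute force: a price pair is a pure-strategy Nash equilibrium if neither organization can increase its own profit by changing only its own price. The sketch below copies the payoffs from Table 5.1; the function and variable names are assumptions made for this illustration.

```python
from itertools import product

PRICES = (5, 10)
# payoff[(price_A, price_B)] = (profit_A, profit_B), copied from Table 5.1
payoff = {
    (5, 5): (500, 500),
    (5, 10): (1000, 0),
    (10, 5): (0, 1000),
    (10, 10): (2000, 2000),
}

def is_pure_nash(price_a, price_b):
    """True if neither organization gains by changing only its own price."""
    profit_a, profit_b = payoff[(price_a, price_b)]
    a_cannot_gain = all(payoff[(alt, price_b)][0] <= profit_a for alt in PRICES)
    b_cannot_gain = all(payoff[(price_a, alt)][1] <= profit_b for alt in PRICES)
    return a_cannot_gain and b_cannot_gain

pure_equilibria = [cell for cell in product(PRICES, PRICES) if is_pure_nash(*cell)]
```

With these payoffs, the check returns both (5, 5) and (10, 10): at (10, 10), undercutting to $5 would reduce the deviating organization’s profit from 2000 to 1000, so that cell is stable as well.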
Machine learning techniques offer a data-driven approach to studying strategic
interactions and decision-making in game theory. Machine learning can be
used to model and predict player behavior in strategic games: by analyzing
historical data on players’ actions and outcomes, machine learning models can
learn patterns and tendencies, which aids in understanding strategic
decision-making. Machine learning can also be used to predict the best strategies,
which helps to analyze potential outcomes, and to simulate and analyze
interactions among multiple agents in complex scenarios. Machine-learning-based
algorithms have also been developed for auctions, market pricing, and resource
allocation. Machine learning algorithms can help to identify potential coalitions
and estimate the prices and payoffs that firms would obtain if they collaborate.
In recent times, machine learning has been used extensively for multiagent systems
(Hazra & Anjaria, 2022).
Table 5.1 Payoff matrix (each cell lists the profit of A, then the profit of B)

Organization A/B    $5 for B      $10 for B
$5 for A            500, 500      1000, 0
$10 for A           0, 1000       2000, 2000
5.2 Game Theory in GAN
Game theory has various applications in generative adversarial networks (GANs),
which are used to generate synthetic data. A GAN consists of a generator and a
discriminator that are trained in an adversarial manner. Goodfellow et al. discussed
minimizing the worst-case error, which is possible using adversarial training of
a GAN (Goodfellow et al., 2014). A GAN can be formulated as a two-player minimax
game (Goodfellow et al., 2014) in which the generator and discriminator are the
players. The generator tries to produce data that can deceive the discriminator,
while the discriminator aims to distinguish accurately between real and generated
data. This formulation is analogous to a two-player zero-sum game: each player
maximizes its own utility (minimizes its own loss) while considering the other
player’s actions.
Game theory-inspired regularization techniques are employed to stabilize GAN
training and improve the performance of both the generator and discriminator.
Berthelot et al. designed a new loss function for the training algorithm (Berthelot
et al., 2017). Oliehoek et al. (2018) formulate GANs as finite games in mixed
strategies, with the aim of going beyond local Nash equilibria.
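For reference, the minimax game of Goodfellow et al. (2014) can be written as a single value function. The symbols below follow that paper rather than notation introduced in this chapter: D is the discriminator, G the generator, p_data the data distribution, and p_z the noise prior from which the generator’s input z is drawn.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes the same quantity, which is what makes the game zero-sum.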
Nash Equilibrium and GAN: The Nash equilibrium in a GAN is reached when both the
generator and discriminator have found their optimal strategies, and neither player
has an incentive to change its strategy given the other player’s strategy. At this
equilibrium point, the generator produces data that are indistinguishable from real
data, and the discriminator cannot differentiate between real and generated data.
Mathematically, the generator seeks to minimize its loss function, which measures
how well it can fool the discriminator (Berthelot et al., 2017), and the
discriminator seeks to minimize its own loss function, which measures how well it
can distinguish between real and generated data. The training process involves
iteratively updating the generator and discriminator in a competitive manner until
they reach a Nash equilibrium.
Liu and Chawla (2010) discussed a two-player Stackelberg zero-sum game, which is
used to construct a loss function (Foerster et al., 2018). They trained a
convolutional neural network (CNN) as the learner. The adversary is the leader L,
and the learner is the follower F.
A “geometry-aware GAN” is a type of GAN that incorporates geometric information
into its generative process, such as information about shapes, sizes, positions, and
orientations (Huang et al., 2019). A geometry-aware GAN might incorporate an
encoder-decoder architecture, and it could be easily incorporated into, and
improve, any existing GAN architecture (Kossaifi et al., 2017). It has various
applications, such as converting 2D floor plans into 3D models, generating 3D models
from 2D images, and generating scenes or environments. In a game-theoretic setting,
a geometry-aware GAN could generate visualizations that accurately represent the
spatial positioning of players, resources, and interactions, aiding in the analysis
of game dynamics. In multiagent settings, geometry-aware GANs might capture and
generate player behaviors that respect geometric constraints. Because both machine
learning and game theory develop rapidly, more recent research on geometry-aware
GANs and game theory may exist.
GANs also use supervised learning to estimate the cost function and are also used
for evaluation purposes (Hazra & Anjaria, 2022). In addition, GANs can be used in
continuous, non-convex games (Goodfellow, 2016).
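The equilibrium described in this section can be checked numerically on a toy example. For a fixed generator, Goodfellow et al. (2014) show that the best discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)). The sketch below evaluates this on hypothetical three-outcome distributions; the distributions and function names are assumptions chosen for illustration, and no neural network is trained.

```python
import math

def optimal_discriminator(p_data, p_gen):
    """D*(x) = p_data(x) / (p_data(x) + p_gen(x)) for each outcome x."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_gen)]

def value(p_data, p_gen, disc):
    """V(D, G) = E_data[log D(x)] + E_gen[log(1 - D(x))] on a finite support."""
    return sum(
        pd * math.log(d) + pg * math.log(1.0 - d)
        for pd, pg, d in zip(p_data, p_gen, disc)
    )

p_data = [0.5, 0.3, 0.2]        # hypothetical real-data distribution
p_gen_poor = [0.2, 0.3, 0.5]    # hypothetical generator far from the data
p_gen_equal = [0.5, 0.3, 0.2]   # generator that matches the data exactly

d_poor = optimal_discriminator(p_data, p_gen_poor)
d_equal = optimal_discriminator(p_data, p_gen_equal)
v_poor = value(p_data, p_gen_poor, d_poor)
v_equal = value(p_data, p_gen_equal, d_equal)
```

When the generator matches the data, the optimal discriminator outputs 0.5 for every outcome, so it can do no better than guessing, and the value equals -log 4. For the mismatched generator the value is strictly larger, which is the advantage the generator tries to remove.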
5.3 Game Theory in CNN
Game Theory and CNN: The application of convolutional neural networks (CNNs)
in game theory is an emerging area of research that explores how CNNs can be used
to analyze strategic interactions, predict player behavior, and model decision-
making processes in various games. CNNs can be employed to analyze strategic
interactions in various games, including board games, card games, and video games.
By inputting game states or actions as images, CNNs can learn patterns and features
that represent optimal strategies, winning moves, or potential threats. CNNs can be
used to predict player behavior in games by analyzing their actions, game states,
and historical data. This prediction can help game designers to understand players’
preferences, adapt game content dynamically, and enhance the gaming experience.
CNNs can provide strategy recommendations to players based on the current game
state. By analyzing past game playing data, CNNs can suggest optimal moves or
strategies to players, assisting them in decision-making during the game. CNNs can
be employed to generate game content, such as level designs, game maps, and virtual
environments. By learning from existing game content, CNNs can generate new content
that adheres to the rules and constraints of the game (Hazra & Anjaria, 2022).
Training a CNN also requires tuning hyperparameters such as the dropout rate,
learning rate, and regularization parameters, which can itself be viewed as a game:
each hyperparameter configuration is equivalent to a player’s strategy, and the
goal is to attain equilibrium. CNNs can be used to represent game states in extensive-form games.
By encoding the game tree structure and player actions into an image-like format,
CNNs can learn to recognize patterns and features that represent the state of the
game. CNNs can be trained to learn strategies from large datasets of game plays.
This can provide insights into how players adapt their strategies over time, which is
particularly useful in dynamic and repeated games. CNNs can predict the likely
outcomes of extensive-form games based on players’ moves and strategies. This
prediction can help to assess the potential success of different approaches in a game
setting. In the WaveNet model, CNNs are used to generate realistic audio waveforms,
including music. GANs also use CNNs for image generation tasks, and the interaction
between the generator and discriminator resembles a min-max game (Hazra & Anjaria, 2022).
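Feeding a game state to a CNN requires an image-like encoding. A common choice for board games is one binary plane (channel) per piece type. The sketch below encodes a tic-tac-toe position in this way; the game, the three-plane layout, and the function name are assumptions chosen for illustration rather than a scheme prescribed by the works cited above.

```python
def encode_board(board):
    """Encode a tic-tac-toe board as three 3x3 binary planes (X, O, empty)."""
    symbols = ("X", "O", ".")
    return [
        [[1 if cell == symbol else 0 for cell in row] for row in board]
        for symbol in symbols
    ]

# Hypothetical position; "." marks an empty square.
board = [
    "X.O",
    ".X.",
    "O..",
]
planes = encode_board(board)  # shape: 3 channels x 3 rows x 3 columns
```

Each square is set in exactly one plane, so the result has the same channel-by-height-by-width layout as a small three-channel image and can be passed to a convolutional layer after conversion to a tensor.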
5.4 Game Theory and Reinforcement Learning
Game Theory in Reinforcement Learning: In multiagent reinforcement learning (MARL),
multiple agents interact in an environment and learn from their experiences to
improve their strategies (Foerster et al., 2018). Game theory concepts, such as Nash
equilibria and best responses, can be used to analyze the strategic behavior of agents
in MARL settings. The agents’ learning process can converge to Nash equilibria,
which represent stable points of interaction in the game (Foerster et al., 2018). Some
reinforcement learning scenarios involve adversarial interactions, where agents are
competing against each other. Game theory can help to model these adversarial
interactions and study the strategies and counterstrategies employed by agents. In
reinforcement learning, agents learn from repeated interactions with their environ-
ment. This process is analogous to repeated games in game theory, where players
learn from their past experiences to improve their strategies. Game theory has been
used to analyze the interaction between attackers and defenders in the context of
adversarial attacks on deep learning models. By viewing this scenario as a two-
player game, researchers can develop robust defense strategies against adversarial
attacks and improve the resilience of deep learning models.
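The idea that agents learning from repeated play can approach a Nash equilibrium can be illustrated with fictitious play, a classical learning rule from game theory in which each player best-responds to the opponent’s observed action frequencies. It is used here as a simple stand-in for a learning agent and is not a deep reinforcement learning method. The game (matching pennies), the tie-breaking rule, and all names are assumptions made for this sketch.

```python
# Payoff to the row player in matching pennies; the column player receives the negative.
# Actions: 0 = heads, 1 = tails.
ROW_PAYOFF = [[1, -1], [-1, 1]]

def best_response_row(col_counts):
    """Row player's best action against the column player's empirical counts."""
    scores = [sum(ROW_PAYOFF[a][b] * col_counts[b] for b in range(2)) for a in range(2)]
    return 0 if scores[0] >= scores[1] else 1  # ties go to heads

def best_response_col(row_counts):
    """Column player's best action (it minimizes the row player's payoff)."""
    scores = [sum(-ROW_PAYOFF[a][b] * row_counts[a] for a in range(2)) for b in range(2)]
    return 0 if scores[0] >= scores[1] else 1  # ties go to heads

def fictitious_play(rounds):
    """Play the game repeatedly and return each player's empirical action frequencies."""
    row_counts, col_counts = [0, 0], [0, 0]
    for _ in range(rounds):
        a = best_response_row(col_counts)
        b = best_response_col(row_counts)
        row_counts[a] += 1
        col_counts[b] += 1
    return [c / rounds for c in row_counts], [c / rounds for c in col_counts]

row_freq, col_freq = fictitious_play(10000)
```

The actions themselves keep cycling, but the empirical frequencies of both players approach the mixed Nash equilibrium of matching pennies, in which each action is played half of the time.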
5.5 Other Applications
An oligopoly is a market structure that consists of a small number of firms that
compete with each other (Tremblay & Tremblay, 2011). The firms are interdependent:
any strategic move, such as changing prices or quantities, can affect rival firms,
which makes decision-making complex. Oligopolies also involve non-price competition,
such as advertising, marketing, product quality improvements, customer service, and
innovation (Tremblay & Tremblay, 2011), which results in both mutual interests and
conflicts among the firms. Game theory helps to analyze the strategic decisions of
firms in choosing their pricing or production levels to maximize their profits while
considering the reactions of their competitors. Concepts such as Nash equilibrium,
the prisoner’s dilemma, and tit-for-tat strategies are commonly applied to
understand the behavior of firms in oligopolistic markets. The study of oligopolies
in game theory often involves models such as the Cournot model, Bertrand model,
and Stackelberg model, which describe the behavior and decision-making of firms in
these market structures.
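As a concrete instance of these models, consider a Cournot duopoly in which two firms choose quantities. The sketch below assumes a linear inverse demand P = a - b(q1 + q2) and a constant unit cost c, with hypothetical values a = 100, b = 1, and c = 10. Under these assumptions, each firm’s best response to its rival’s quantity q is (a - c - b*q) / (2b), and the Nash equilibrium quantity is (a - c) / (3b) for each firm.

```python
def best_response(rival_quantity, a=100.0, b=1.0, c=10.0):
    """Profit-maximizing quantity against a rival, for P = a - b*(q1 + q2) and unit cost c."""
    return max(0.0, (a - c - b * rival_quantity) / (2.0 * b))

def iterate_best_responses(rounds=60):
    """Let both firms repeatedly best-respond to the rival's previous quantity."""
    q1, q2 = 0.0, 0.0
    for _ in range(rounds):
        q1, q2 = best_response(q2), best_response(q1)
    return q1, q2

q1, q2 = iterate_best_responses()
cournot_quantity = (100.0 - 10.0) / (3.0 * 1.0)  # closed-form equilibrium: 30 units each
```

Starting from zero output, the repeated best responses settle at 30 units per firm, the point at which neither firm wants to change its quantity given the other’s choice.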
TD-Gammon is a backgammon program that combines neural networks and reinforcement
learning; it was developed by Gerald Tesauro in the early 1990s. The neural network
was trained to predict the outcomes of moves and evaluate positions, and the network
weights were updated in each iteration. It reached a level of play close to that of
the best human players of the time. TD-Gammon (TDG) uses a three-layered ANN
architecture together with a reinforcement learning technique called TD-Lambda
(Sutton & Barto, 1960). The ANN evaluates the candidate moves, and the best-rated
move is selected at each turn.
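The TD-Lambda update can be shown in its simplest, tabular form. TD-Gammon applied the same idea to the weights of a neural network; the sketch below instead stores one value per state of a hypothetical three-state chain with a single terminal reward, so it is a simplification for illustration and not Tesauro’s implementation. The step size, trace decay, and chain are assumptions.

```python
def td_lambda_chain(episodes=1000, alpha=0.1, lam=0.8, gamma=1.0):
    """Tabular TD(lambda) with accumulating traces on a 3-state chain.

    The agent moves 0 -> 1 -> 2 -> terminal and receives reward 1 on the last step.
    """
    n_states = 3
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset every episode
        for state in range(n_states):
            is_last = state == n_states - 1
            reward = 1.0 if is_last else 0.0
            next_value = 0.0 if is_last else values[state + 1]
            td_error = reward + gamma * next_value - values[state]
            traces[state] += 1.0
            for s in range(n_states):
                # Recently visited states share credit for the current TD error.
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam
    return values

values = td_lambda_chain()
```

Because every episode ends with reward 1 and there is no discounting, the true value of each state is 1, and the learned values approach it. The traces pass the final reward back to earlier states within the same episode, which speeds up learning compared with one-step updates.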
AlphaGo is a computer program developed by DeepMind to play Go. Go is an ancient
Chinese board game with simple rules but an extremely large number of possible
positions, making it vastly more complex than games such as chess. AlphaGo utilized
deep neural networks to evaluate game positions and predict moves. These networks
were trained on a large dataset of human games to learn patterns and strategies and
were further improved through reinforcement learning from games played against
itself; they were combined with a Monte Carlo tree search algorithm, which is a
heuristic search algorithm (Silver et al., 2016). A later version, known as
“AlphaGo Zero”, was trained by reinforcement learning from self-play alone, without
human game data (Silver et al., 2017). Sometimes, CNNs are used to analyze the
current board state, and recurrent neural networks (RNNs) help to capture the
game’s temporal dynamics.
TorchCraft is a research framework that serves as a bridge between video games and
machine learning platforms. It connects the StarCraft game engine with AI
frameworks, enabling AI agents to interact with the game environment and thereby
providing a reinforcement learning environment (Synnaeve et al., 2016).
TorchCraft has been used by researchers and AI enthusiasts to develop and share
their AI-driven agents for decision-making in dynamic and strategic environments.
Handling real-time situations and imperfect information about actions and
strategies remains a challenge in this setting.
Generative moment matching networks (GMMNs) are a type of model used for
generative tasks, particularly for generating data that closely match the underlying
data distribution (Li et al., 2015). They are trained by matching the moments (such
as the mean and variance) of the generated data distribution to those of the real
data distribution, using the maximum mean discrepancy as the training criterion.
They use neural networks as their primary modeling tool and can be applied to tasks
such as image generation, data augmentation, and feature learning. In game theory,
they can help in creating realistic simulation environments for testing different
strategies and outcomes. Game theory often involves working with limited or
incomplete data. GMMNs could aid in data augmentation by generating additional
samples that capture the distribution of the available data, helping to improve the
reliability of analysis and conclusions. In multiagent settings, they could also be
used to mimic the behavior of other players by generating synthetic data.
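The training criterion of GMMNs in Li et al. (2015) is the maximum mean discrepancy (MMD), a kernel-based distance between two samples that is small when their moments agree. The sketch below computes a biased estimate of the squared MMD for one-dimensional samples with a Gaussian kernel; the bandwidth, the sample values, and the function names are assumptions chosen for illustration, and no network is trained.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)) for scalar inputs."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    def mean_kernel(a, b):
        return sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b) / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

real = [0.0, 0.5, 1.0, 1.5]     # hypothetical real samples
close = [0.1, 0.6, 0.9, 1.4]    # hypothetical generated samples near the data
far = [4.0, 4.5, 5.0, 5.5]      # hypothetical generated samples far from the data

mmd_same = mmd_squared(real, real)
mmd_close = mmd_squared(real, close)
mmd_far = mmd_squared(real, far)
```

The estimate is zero when the two samples coincide and grows as the generated samples move away from the real ones, so minimizing it with respect to the generator’s parameters pulls the generated distribution toward the data.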
Autoencoders are a type of neural network architecture used for unsupervised
learning in machine learning (Vincent et al., 2010). They help to reduce noise
and to reduce the number of dimensions (Hazra & Anjaria, 2022). In game theory, the
payoffs and strategies of multiple players can lead to complex data structures, and
autoencoders could help to reduce the dimensionality of these data. They could also
support the analysis of equilibrium points in strategic games. By training an
autoencoder on the observed actions or decisions of players, patterns and features
of their behavior can be captured. Autoencoders could also generate new instances of
game-related data that maintain the essential characteristics of the game, which
could improve the reliability of analysis.
Coevolutionary neural population models are a type of computational framework used
to simulate the interactions and evolution of populations of neural networks
(Moran & Pollack, 2018). They are used for tasks in which the environment is dynamic
and changes over time. These models contain multiple populations of neural networks,
and fitness functions are used to measure the quality of their solutions. They
are used to solve optimization and adaptation problems and have applications in
fields such as artificial intelligence, neuroscience, and evolutionary
computation.
Coevolutionary neural population models (Moran & Pollack, 2018) can simulate
how agents adjust their strategies dynamically in response to the strategies employed
by other agents. Coevolutionary models can represent the evolution of strategies
over generations of play. In game theory, this could involve simulating the evolution
of strategies in scenarios such as repeated games, where players learn from past
interactions. Some games involve intricate and multistep strategies. Coevolutionary
models might reveal how agents learn and develop such complex strategies through
interactions.
Game Theory and Computer Vision: Game theory can be used to model and
analyze strategic interactions among multiple agents in visual scenes. For example,
in a surveillance scenario, multiple cameras might compete to observe certain areas
effectively while avoiding detection by potential intruders. Computer vision
techniques can be applied to track and identify objects or entities in visual data.
Computer vision can also help developers to design and test new game mechanics,
features, and user experiences.
Game theory has several potential applications in the field of farming, particu-
larly in regard to decision-making and resource allocation. Game theory can be used
to model the allocation of water resources among competing agricultural users.
Game theory can model the interactions between farmers and suppliers (e.g., seed,
fertilizer) to analyze pricing and negotiation strategies. Game-theoretic models can
explore how farmers’ decisions about land use and crop selection affect local biodi-
versity, leading to more sustainable agricultural practices. Game theory can analyze
how farmers’ decisions about pricing, production, and timing affect their
competitiveness in markets. It can also help farmers adapt to continually changing
environmental and climatic conditions.
Game theory has many applications in educational institutions. It can help to
model the interactions between students and colleges during the admission process,
which helps in understanding strategies such as early decision applications and
merit-based selections. Game theory can also be used to analyze how teachers design
grading and incentive systems and how students collaborate and make decisions in
group projects. Finally, game theory can be applied to design adaptive learning
platforms that tailor educational content and experiences to individual student
needs and learning styles.
References
Berthelot, D., Schumm, T., & Metz, L. (2017). BEGAN: Boundary equilibrium generative adver-
sarial networks. CoRR, abs/1703.10717.
Cintuglu, M. H., Martin, H., & Mohammed, O. A. (2015). Real-time implementation of multiagent-
based game theory reverse auction model for Microgrid market operation. IEEE.
Foerster, J., Chen, R. Y., Al-Shedivat, M., Whiteson, S., Abbeel, P., & Mordatch, I. (2018).
Learning with opponent-learning awareness. In Proceedings of the 17th international confer-
ence on autonomous agents and multiagent systems (pp. 122–130). International Foundation
for Autonomous Agents and Multiagent Systems.
Goodfellow, I. (2016). NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint
arXiv:1701.00160.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., &
Bengio, Y. (2014). Generative adversarial nets. In Proceedings of advances in neural informa-
tion processing systems (NIPS) (pp. 2672–2680).
Hazra, T., & Anjaria, K. (2022). Applications of game theory in deep learning: A survey. Springer.
Huang, D., Tao, X., Lu, J., & Do, M. N. (2019). Geometry-aware GAN for face attribute transfer.
IEEE Access, 99, 1–1.
Kossaifi, J., Tran, L., Panagakis, Y., & Pantic, M. (2017). GAGAN: Geometry-aware GAN. In
IEEE computer society conference on computer vision and pattern recognition.
Li, Y., Swersky, K., & Zemel, R. S. (2015). Generative moment matching networks. In
Proceedings of international conference on machine learning (ICML).
Liu, W., & Chawla, S. (2010). Mining adversarial patterns via regularized loss minimization.
Machine Learning, 81(1), 69–83.
Moran, N., & Pollack, J. (2018). Coevolutionary neural population models. In IEEE symposium
on artificial life.
Oliehoek, F. A., Savani, R., Gallego, J., van der Pol, E., & Groß, R. (2018). Beyond local Nash
equilibria for adversarial networks. arXiv preprint arXiv:1806.07268.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser,
J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J.,
Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., &
Hassabis, D. (2016). Mastering the game of go with deep neural networks and tree search.
Nature, 529(7587), 484–489.
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T.,
Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L., Van Den Driessche,
G., Graepel, T., & Hassabis, D. (2017). Mastering the game of go without human knowledge.
Nature, 550(7676), 354–359.
Sutton, R. S., & Barto, A. G. (1960). Chapter 12: Introductions. Acta Physiologica Scandinavica,
48(Mowrer 1960), 57–63.
Synnaeve, G., Nardelli, N., Auvolat, A., Chintala, S., Lacroix, T., Lin, Z., Richoux, F., & Usunier,
N. (2016). TorchCraft: A library for machine learning research on real-time strategy games.
arXiv preprint arXiv:1611.00625. Retrieved from http://arxiv.org/abs/1611.00625.
Tremblay, C. H., & Tremblay, V. J. (2011). The Cournot–Bertrand model and the degree of product
differentiation. Economics Letters, 111, 233–235. Elsevier.
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P. A. (2010). Stacked denoising
autoencoders: Learning useful representations in a deep network with a local denoising crite-
rion. Journal of Machine Learning Research, 11(Dec), 3371–3408.
Chapter 6
Conclusion and Future Research Directions
In the concluding chapter of Game Theory in Deep Learning, we endeavor to encapsulate
the book’s essence, weaving together a comprehensive summary of its content
with a deep dive into the pivotal insights gleaned from each section. This chapter
serves as a thorough recapitulation of the book and carves a path forward for future
research in this intriguing field. We commence by revisiting the significant findings
and theories presented, distilling the wealth of information into key takeaways that
highlight the profound intersection of game theory and deep learning. This synthesis
is designed to foster a deeper understanding of the subject and to ignite a passion for
further exploration and innovation in these dynamic and intersecting fields of study.
6.1 A Summary of Key Insights
The book Game Theory in Deep Learning explores game theory, a branch of
mathematics that studies strategic decision-making among multiple interacting
players. The opening chapter lays the foundation by introducing the core principles
of game theory and demonstrating its application across various fields, including
economics, political science, and evolutionary biology. It emphasizes game theory’s
role in understanding human behavior in strategic situations, its contributions to
business strategy, and the predictive analysis of events involving multiple players.
The book then categorizes games into cooperative and noncooperative types,
illuminating different aspects of strategic interaction. In cooperative games, players
work together toward a common goal, whereas in noncooperative games, they pursue
individual strategies and compete. A key concept introduced here is the Nash
equilibrium: a combination of strategies in which no player can improve their payoff
by changing strategy unilaterally. A Nash equilibrium is therefore a stable outcome,
though not necessarily the best one for the players collectively, and it is
instrumental in shaping our understanding of strategic interactions within these games.
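To make the equilibrium idea concrete, the following minimal sketch enumerates the pure-strategy Nash equilibria of a two-player game given as payoff tables. The function name and the payoff numbers are illustrative assumptions, not values taken from the book; the prisoner's dilemma entries use a common textbook convention in which higher payoffs are better.

```python
import itertools

def pure_nash_equilibria(payoff_row, payoff_col):
    """Return all (row, col) index pairs that are pure-strategy Nash equilibria."""
    n_rows, n_cols = len(payoff_row), len(payoff_row[0])
    equilibria = []
    for r, c in itertools.product(range(n_rows), range(n_cols)):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_best = all(payoff_row[r][c] >= payoff_row[alt][c] for alt in range(n_rows))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_best = all(payoff_col[r][c] >= payoff_col[r][alt] for alt in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

# Hypothetical prisoner's dilemma payoffs: strategy 0 = cooperate, 1 = defect.
PD_ROW = [[-1, -3], [0, -2]]
PD_COL = [[-1, 0], [-3, -2]]

# Hypothetical matching pennies payoffs (zero-sum): no pure equilibrium exists.
MP_ROW = [[1, -1], [-1, 1]]
MP_COL = [[-1, 1], [1, -1]]

if __name__ == "__main__":
    print(pure_nash_equilibria(PD_ROW, PD_COL))  # mutual defection only
    print(pure_nash_equilibria(MP_ROW, MP_COL))  # empty list
```

With these assumed payoffs, mutual defection is the only equilibrium even though mutual cooperation would leave both players better off, which illustrates that an equilibrium is stable rather than collectively optimal.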

Transitioning into the realm of deep learning, the book describes it as a sophisti-
cated subset of machine learning that involves training intricate neural networks on
large datasets. It explains how deep learning emulates human brain functions by
processing information across multiple layers. It also discusses its applications in
fields like image and speech recognition and natural language processing, as well as
its challenges, particularly in dealing with nonlinear problems and data scarcity.
Lastly, the book highlights the intersection of game theory and deep learning, show-
casing how game theory can profoundly enhance deep learning. It covers the appli-
cation of game theory in modeling multiagent system interactions, optimizing deep
learning model performance, and bolstering security against adversarial attacks.
The book also underscores the integration of game-theoretic principles in designing
neural networks for tasks such as computer vision and the role of game theory in
adversarial training, illustrating its potential to improve the robustness and
efficiency of deep learning models.
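The game-theoretic reading of adversarial learning can be summarized by two standard min-max objectives, given here as a reminder rather than as the book's own notation. The symbols are assumptions: G and D are the generator and discriminator of a GAN, p_data and p_z are the data and noise distributions, f_theta is a classifier with parameters theta, L is a loss function, and delta is a perturbation bounded in norm by epsilon.

```latex
% GAN training as a two-player zero-sum game (Goodfellow et al., 2014)
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Adversarial training as a game between the learner and a bounded attacker
\min_{\theta} \; \mathbb{E}_{(x, y)}
  \Bigl[ \max_{\|\delta\| \le \epsilon} L\bigl(f_{\theta}(x + \delta),\, y\bigr) \Bigr]
```

In both cases one player minimizes the quantity that the other maximizes, so a solution is a saddle point of the objective rather than an ordinary minimum.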
Building upon the foundational insights of game theory and deep learning, the
book delves into practically implementing these concepts in real-world scenarios. A
particularly intriguing area of application is in the field of autonomous systems.
Here, game theory provides a framework for understanding the strategic interactions
of autonomous agents, such as drones or self-driving cars. For instance, in
multiagent environments, these systems must make decisions that not only advance
their own objectives but also account for the actions of others in the shared
environment. The present book is aligned with current research that illustrates the
use of game theory in predicting the behaviors of autonomous agents and optimizing
their decision-making processes.
Furthermore, the book explores the use of game theory in augmenting the capa-
bilities of deep neural networks in complex decision-making tasks. For example,
deep learning models can benefit from game-theoretic approaches in medical diag-
nosis and treatment planning to make more informed and strategic decisions. This
is particularly relevant when multiple outcomes must be weighed and optimized.
The book provides a theoretical understanding of game theory and deep learning
and offers a window into their practical applications and future research potential. It
guides academics and practitioners, inspiring further exploration and innovation in
these interrelated fields. The following subsection discusses the open questions and
challenges regarding applying game theory in deep learning.
6.2 Open Questions, Challenges, and Cross-Disciplinary Opportunities
The application of game theory to deep learning, a rapidly evolving area of artificial
intelligence, has shown promising results and potential in various research fields.
However, this integration also presents several open questions and challenges. The
present subsection discusses some open questions and challenges in applying game
theory in deep learning.
The first challenge concerns model complexity (Rodriguez et al., 2022). Combining
game-theoretic models with deep learning architectures makes it difficult to keep
the models accurate and efficient while retaining the interpretability of their
decisions. The performance and relevance of deep learning models, particularly when
applied in a game-theoretic framework, are highly contingent on both the
availability and the quality of data. This
reliance becomes problematic in scenarios characterized by limited or substandard
data, as it can significantly impair the model’s effectiveness and practical applicabil-
ity (Hu et al., 2021). The challenge lies in obtaining and curating high-quality data-
sets that faithfully inform and train these complex models. Within multiagent
systems, the task of accurately modeling strategic interactions using game theory in
conjunction with deep learning frameworks is notably intricate. It encompasses the
ability to predict behaviors and outcomes in dynamic and frequently unpredictable
environments. This complexity is heightened by the need to account for many vari-
ables and potential interactions, making it a particularly challenging aspect of
research in this field. While game theory has been instrumental in bolstering the
defense mechanisms of deep learning models against adversarial attacks, developing
systems robust enough to counter sophisticated and evolving cyberthreats remains an
open problem. This necessitates continuously refining defensive strategies so that
these models stay resilient to malicious attacks.
In the context of game-theoretic models, especially when dealing with expansive
and intricate deep learning systems, identifying optimal strategies and achieving
equilibrium states present significant computational and algorithmic hurdles. The
complexity of these systems often entails a substantial demand for computational
resources and sophisticated algorithms to navigate the search for equilibrium solu-
tions efficiently (Lins et al., 2021). The effective amalgamation of game theory and
deep learning insights necessitates a profound comprehension of both domains.
Successfully bridging these two fields cohesively and practically poses a formidable
challenge for researchers, calling for an interdisciplinary approach that harmonizes
each discipline’s methodologies and theoretical underpinnings.
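One small illustration of why equilibrium finding is algorithmically delicate is the bilinear game f(x, y) = x * y, in which one player minimizes over x and the other maximizes over y. Its unique equilibrium is (0, 0), yet plain simultaneous gradient descent-ascent moves away from it. The sketch below is a toy example under that assumption, not a method from the book; the function name, step size, and starting point are chosen only for illustration.

```python
import math

def simultaneous_gda(x, y, lr, steps):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.

    x takes a descent step and y takes an ascent step, both computed
    from the same (old) iterate.
    """
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y

start = (1.0, 1.0)
end = simultaneous_gda(*start, lr=0.1, steps=100)

# Distance from the equilibrium (0, 0) before and after training.
dist_start = math.hypot(*start)
dist_end = math.hypot(*end)

if __name__ == "__main__":
    print(f"distance before: {dist_start:.3f}, after: {dist_end:.3f}")
```

Each update multiplies the squared distance to the equilibrium by (1 + lr^2), so the iterates spiral outward for any positive step size. Remedies studied in the literature, such as extragradient or optimistic updates, modify the update rule to counter this behavior.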
Furthermore, the ethical considerations and real-world applicability of models
that combine game theory and deep learning cannot be overlooked (Anjaria, 2021).
This is particularly critical in sensitive domains such as healthcare, finance, and the
development of autonomous systems, where the consequences of decisions made by
these models can have significant real-world impacts (Wu, 2022). Ensuring that
these models adhere to ethical guidelines and are designed for practical utility
remains a paramount concern. Addressing these challenges requires a multifaceted
approach, blending advanced technical solutions with thoughtful consideration of
ethical and practical aspects. As the field evolves, these challenges present both
obstacles and opportunities for groundbreaking research and development.
As we navigate these multifaceted challenges, the future of integrating game
theory with deep learning shines with potential. Innovative approaches are needed
to simplify model complexity, such as developing new frameworks that balance
accuracy, efficiency, and interpretability (Kim et al., 2022). Embracing advance-
ments in data science could alleviate data scarcity and quality issues, leveraging
techniques like synthetic data generation and advanced data augmentation. In stra-
tegic interaction modeling, the focus should shift toward more adaptive and resilient
models that can handle the unpredictability inherent in multiagent systems (Ma
et al., 2023). Further, enhancing adversarial robustness will require advanced algo-
rithms and a continuous monitoring and updating mechanism to adapt to evolving
cyberthreats. In terms of optimization and equilibrium finding, implementing more
powerful computational techniques, perhaps drawing from emerging fields like
quantum computing, could provide the necessary processing capabilities.
Interdisciplinary collaboration is critical to bridging game theory and deep learn-
ing effectively. Encouraging cross-disciplinary research initiatives and educational
programs can foster a new generation of researchers with a holistic understanding
of both fields. Finally, the ethical and practical aspects demand a comprehensive
framework considering these technologies’ societal, ethical, and real-world impli-
cations. This includes developing guidelines and standards for ethical AI and tailor-
ing solutions to meet the specific needs of various sectors like healthcare and
finance. Addressing these challenges and exploring these avenues can lead to
groundbreaking advancements, making this integration a robust field of academic
inquiry and a catalyst for innovative applications in artificial intelligence. The fol-
lowing section discusses the future research direction of applying game theory in
deep learning.
6.3 Future Research Direction
The future of applying game theory in deep learning is ripe with exciting possibili-
ties, encompassing diverse research avenues. One key direction is the development
of hybrid models that merge the predictive prowess of deep learning with the strate-
gic analytical strength of game theory. Additionally, there is a push toward crafting
adaptive algorithms capable of navigating dynamic environments, especially criti-
cal in scenarios requiring real-time decision-making. Another pivotal area is enhanc-
ing the robustness of deep learning models against adversarial attacks through
game-theoretic methods. The exploration of quantum computing’s role in unravel-
ling complex game-theoretic challenges within deep learning systems promises
groundbreaking advances.
Moreover, the focus on ethical and fair AI development underlines the impor-
tance of creating morally sound and equitable models. Interdisciplinary studies are
encouraged to amalgamate diverse insights, broadening the scope of game theory’s
application in deep learning. Real-world applications in sectors like finance, health-
care, and autonomous systems, where strategic decisions are crucial, are also a sig-
nificant focus. Another exciting area is developing systems capable of automated
strategy learning based on game-theoretic principles. Lastly, efforts are being
directed toward formulating explainable AI models, integrating game theory and
deep learning to make AI decisions more transparent and understandable to users.
Each avenue addresses existing challenges and paves the way for innovative
breakthroughs.
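One established bridge between cooperative game theory and explainability is the Shapley value, which attribution methods apply by treating input features as players and the model output as the worth of a coalition. The sketch below computes exact Shapley values for a three-player game by averaging marginal contributions over all orderings. The characteristic function WORTH is entirely hypothetical, and the approach is shown as one possible direction rather than as the book's proposal.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical characteristic function: worth of every coalition of players 1-3.
WORTH = {
    frozenset(): 0,
    frozenset({1}): 10, frozenset({2}): 20, frozenset({3}): 30,
    frozenset({1, 2}): 40, frozenset({1, 3}): 50, frozenset({2, 3}): 60,
    frozenset({1, 2, 3}): 90,
}

phi = shapley_values([1, 2, 3], WORTH.__getitem__)

if __name__ == "__main__":
    print(phi)
```

The values sum to the worth of the grand coalition, which is the efficiency property that makes the Shapley value attractive for attribution. Exact enumeration scales factorially with the number of players, so practical feature attribution relies on sampling or other approximations.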
Future research should also focus on fine-tuning the balance between computa-
tional efficiency and the complexity inherent in these hybrid models (Kamal &
Bablu, 2022). Particular emphasis must be placed on developing user-friendly inter-
faces that allow practitioners from various domains to leverage these models effec-
tively. Additionally, establishing comprehensive benchmarks and standards for
evaluating the performance of game theory-enhanced deep learning systems will be
crucial. As the field progresses, fostering a collaborative ecosystem that brings
together researchers, industry experts, and policymakers will be key to translating
these advanced theoretical concepts into tangible, real-world solutions that can rev-
olutionize various sectors.
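One candidate metric for such benchmarks, offered here as an assumption rather than something the book prescribes, is exploitability: how much the players could gain in total by switching to a best response against the other's current strategy. For a two-player zero-sum matrix game it is zero exactly at a Nash equilibrium. The function name and the matching pennies payoffs below are illustrative.

```python
def exploitability(payoff, row_strategy, col_strategy):
    """Total best-response gain of both players in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. Strategies are probability vectors over rows / columns.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Row player's expected payoff for each pure row against col_strategy.
    row_values = [
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Row player's expected payoff when the column player picks each pure
    # column against row_strategy; the column player wants this to be small.
    col_values = [
        sum(payoff[i][j] * row_strategy[i] for i in range(n_rows))
        for j in range(n_cols)
    ]
    return max(row_values) - min(col_values)

# Hypothetical matching pennies payoffs for the row player.
MATCHING_PENNIES = [[1, -1], [-1, 1]]

uniform = [0.5, 0.5]
always_heads = [1.0, 0.0]

at_equilibrium = exploitability(MATCHING_PENNIES, uniform, uniform)
predictable_row = exploitability(MATCHING_PENNIES, always_heads, uniform)

if __name__ == "__main__":
    print(at_equilibrium, predictable_row)
```

A learned strategy with lower exploitability is harder for an opponent to take advantage of, which gives a single number for comparing learning methods on the same game.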
In this evolving landscape, an essential aspect will be continuously exploring
new datasets and environments to test and refine these game theory and deep learn-
ing models. This exploration will not only validate their efficacy across different
scenarios but also uncover new challenges and opportunities for improvement.
Integrating advanced technologies like augmented and virtual reality could also
offer innovative platforms for simulating complex game-theoretic scenarios,
enhancing the training and capabilities of deep learning systems (Zhu et al., 2022).
Ultimately, the goal is to create a dynamic, iterative process of learning and adapta-
tion, fostering an environment where both game theory and deep learning can evolve
in tandem, driving forward the frontiers of artificial intelligence.
The future of applying game theory in deep learning holds immense promise,
with several key research directions emerging. A primary focus is on developing
new game-theoretic models that are more suited for deep learning applications. This
entails designing efficient algorithms specifically tailored for solving game-theoretic
problems within deep learning contexts. Another significant area is the integration
of game theory principles directly into deep learning frameworks and tools, enhanc-
ing their strategic decision-making capabilities. Furthermore, evaluating the perfor-
mance of these game-theoretic deep learning methods across various real-world
problems will be crucial. These are just a few of the promising research directions
for applying game theory in deep learning. As both fields continue to advance, we
can expect a surge of creative and innovative applications that leverage the
strengths of both, making substantial contributions across diverse domains.
References
Anjaria, K. (2021). A framework for ethical artificial intelligence-from social theories to
cybernetics-based implementation. International Journal of Social and Humanistic Computing,
4(1), 1–28.
Hu, X., Chu, L., Pei, J., Liu, W., & Bian, J. (2021). Model complexity of deep learning: A survey.
Knowledge and Information Systems, 63, 2585–2619.
Kamal, M., & Bablu, T. A. (2022). Machine learning models for predicting click-through rates
on social media: Factors and performance analysis. International Journal of Applied Machine
Learning and Computational Intelligence, 12(4), 1–14.
Kim, D., Lee, J., Moon, J., & Moon, T. (2022). Interpretable deep learning-based hippocampal
sclerosis classification. Epilepsia Open, 7(4), 747–757.
Lins, S., Pandl, K. D., Teigeler, H., Thiebes, S., Bayer, C., & Sunyaev, A. (2021). Artificial intel-
ligence as a service: Classification and research directions. Business & Information Systems
Engineering, 63, 441–456.
Ma, Y., Li, Z., Xie, X., & Yue, D. (2023). Adaptive consensus of uncertain switched nonlinear
multi-agent systems under sensor deception attacks. Chaos, Solitons & Fractals, 175, 113936.
Rodriguez, D., Nayak, T., Chen, Y., Krishnan, R., & Huang, Y. (2022). On the role of deep learning
model complexity in adversarial robustness for medical images. BMC Medical Informatics and
Decision Making, 22(2), 1–15.
Wu, Y. (2022). Ethically responsible and trustworthy autonomous systems for 6G. IEEE Network,
36(4), 126–133.
Zhu, J., Ji, S., Yu, J., Shao, H., Wen, H., Zhang, H., et al. (2022). Machine learning-augmented
wearable triboelectric human-machine interface in motion identification and virtual reality.
Nano Energy, 103, 107766.
