Neat Python
Reinforcement Learning
Jerin Paul Selvan
Dept. of Computer Engineering
Pune Institute of Computer Technology
Pune, India
[email protected]

Dr. P. S. Game
Dept. of Computer Engineering
Pune Institute of Computer Technology
Pune, India
[email protected]
Abstract—For over a decade now, robotics and the use of artificial agents have become a common thing. Testing the performance of new path-finding or search-space optimisation algorithms has also become a challenge, as they require a simulation or an environment to test them. The creation of artificial environments with artificial agents is one of the methods employed to test such algorithms. Games have also become an environment to test them. The performance of the algorithms can be compared by using artificial agents that behave according to the algorithm in the environment they are put in. A performance parameter can be how quickly the agent is able to differentiate between rewarding actions and hostile actions. This can be tested by placing the agent in an environment with different types of hurdles, where the goal of the agent is to reach the farthest by taking decisions on actions that avoid all the obstacles. The environment chosen is a game called "Flappy Bird". The goal of the game is to make the bird fly through a set of pipes of random heights. The bird must go in between these pipes and must not hit the top, the bottom, or the pipes themselves. The actions that the bird can take are either to flap its wings or drop down with gravity. The algorithms that are enforced on the artificial agents are NeuroEvolution of Augmenting Topologies (NEAT) and Reinforcement Learning. The NEAT algorithm takes an 'N' initial population of artificial agents. They follow genetic algorithms by considering an objective function, crossover, mutation, and augmenting topologies. Reinforcement learning, on the other hand, remembers the state, the action taken at that state, and the reward received for the action taken, using a single agent and a Deep Q-learning Network. The performance of the NEAT algorithm improves as the initial population of the artificial agents is increased.

Keywords—NeuroEvolution of Augmenting Topologies (NEAT), Artificial agent, Artificial environment, Game, Reinforcement Learning (RL)

I. INTRODUCTION

An intelligent agent is anything that can detect its surroundings, act independently to accomplish goals, and learn from experience or use knowledge to execute tasks better. The agent's surroundings are considered an environment in artificial intelligence. The agent uses actuators to send its output to the environment after receiving information from it through sensors [11]. There are several types of environments: Fully Observable vs Partially Observable, Deterministic vs Stochastic, Competitive vs Collaborative, Single-agent vs Multi-agent, Static vs Dynamic, Discrete vs Continuous, Episodic vs Sequential, and Known vs Unknown.

An approach to machine learning known as NEAT, or NeuroEvolution of Augmenting Topologies, functions similarly to evolution. In its most basic form [1], NEAT is a technique for creating networks that are capable of performing a certain activity, like balancing a pole or operating a robot. Significantly, NEAT networks can learn using a reward function as opposed to back-propagation. By executing actions and observing the outcomes of those actions, an agent learns how to behave in a given environment via reinforcement learning, a feedback-based machine learning technique. The agent receives positive feedback for each positive action and is penalised or given negative feedback for each negative action. In contrast to supervised learning, reinforcement learning uses feedback to autonomously train the agent without the use of labelled data. The agent can only learn from its experience because there is no labelled data. In situations like gaming, robotics, and the like, where decisions must be made sequentially and with a long-term objective, RL provides a solution. The agent engages with the environment and independently explores it. In reinforcement learning, an agent's main objective is to maximise the positive rewards it accumulates while improving its behaviour.
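To make the feedback loop described above concrete (observe a state, take an action, receive positive or negative feedback), a generic agent-environment interaction can be sketched as follows. This is an illustrative sketch only: the env and agent objects and their methods are placeholders, not part of the paper's implementation.

```python
def run_episode(env, agent):
    """Generic reinforcement-learning feedback loop: observe, act, receive feedback."""
    state = env.reset()                                 # initial observation
    total_reward, done = 0.0, False
    while not done:
        action = agent.act(state)                       # agent chooses an action
        next_state, reward, done = env.step(action)     # environment responds with a reward
        agent.learn(state, action, reward, next_state)  # positive or negative feedback
        state = next_state
        total_reward += reward
    return total_reward
```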
II. LITERATURE SURVEY

Games have been used a lot to act as an environment to test algorithms. There is a lot of research [3] done to create an AI bot that can challenge a player in a multi-player or two-player game. Neuroevolution and Reinforcement Learning algorithms are some of the algorithms that are used to create AI bots or artificial agents. [1], [7] and [8] have implemented a configuration of an ANN called Neuroevolution. The algorithm does not depend on the actions taken by the agents as a whole. [3], [4], [5], [6] and [7] use a Reinforcement Learning algorithm with Deep Q-Learning to train the agents.

The performance of the Neuroevolution algorithm depends on the objective function, the initial population, the mutation rate, the weights and biases added to the network, the activation function used, and the overall topology of the network. The authors in [2] discuss how superior the Neuroevolution algorithm is to the traditional Reinforcement Learning algorithm with Deep Q-Learning. Neuroevolution has the upper hand when it comes to the time taken by the artificial agent to train itself. There are other parameters that need to be taken into consideration while using a Neural Network. The topology of the network plays a vital role in the performance. Two strategies were proposed by Evgenia Papavasileiou (2021) [2]: using fixed topologies in the neural networks and using augmented topologies. In the fixed-topology approach, the network topology is a single hidden layer of neurons, with each hidden neuron connected to every network input and every network output. Evolution searches the space of connection weights of this fully-connected topology by allowing high-performing networks to reproduce. The weight space is explored through the crossover of network weight vectors and through the mutation of single networks' weights. Thus, the goal of fixed-topology NE is to optimise the connection weights that determine the functionality of a network. The topology, or structure, of neural networks also affects their functionality, and modifying the network structure has been effective as part of supervised training.

There are two ways of making use of the environment. The authors in [3], [4], [6] and [7] use a DNN to extract features from the frames of the game, and these form the input to the agent. However, [1], [5] and [8] make use of the game itself and place the agent in it to perceive its surroundings. There are several combinations of Reinforcement Learning algorithms possible, like Deep Neural Networks (DNN), Long Short-Term Memory (LSTM), Deep Q-Network (DQN) and the like. However, depending on the type of obstacle and the type of game, their performance varies.

A Reinforcement Learning algorithm with a DNN and LSTM has been used in [3]. This algorithm addresses issues like a vast search space, dependencies between the actions taken by the agent, the state and the environment, inputs, and imperfect information. To reduce the complexity of the data generated by the perception of the agent, data skipping techniques are implemented. There is, however, a drawback with this algorithm: it takes a lot of time for the agent to train. For every discrete step taken by the agent, it receives a state that belongs to a set S and it sends an action from the set of actions A to the environment. The environment makes a transition from state S_t to S_{t+1}, and a gamma value in [0, 1] determines the preference for immediate reward over long-term reward. A self-play method is used by storing the parameters of the network to create a pool of past agents; this pool of past agents is used to sample opponents. This method allows RL to learn the Nash equilibrium strategy. Data skipping techniques, which refer to the process of dropping certain data during the training and evaluation process, were also proposed in this paper. The data skipping techniques proposed are "no-op" and "maintain move decision". The network is composed of an LSTM-based architecture, which has four heads with a shared state representation layer. An actor-critic off-policy learning algorithm was proposed.
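To make the role of the discount factor concrete, the sketch below computes the discounted return of a recorded episode from its per-step rewards. It is illustrative only and does not reproduce the actor-critic setup of [3]; the reward values in the example are borrowed from the scheme described later for [6].

```python
def discounted_returns(rewards, gamma=0.9):
    """Discounted return G_t = r_t + gamma * G_{t+1} for every step of an episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running  # immediate reward plus discounted future reward
        returns[t] = running
    return returns

# Three frames survived (+0.5 each) followed by a crash (-1000).
print(discounted_returns([0.5, 0.5, 0.5, -1000.0], gamma=0.9))
```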
Botong Liu (2020) [4] has used Reinforcement Learning with a DQN. The game was split into frames, and each game image was sequentially scaled, grayed, and adjusted for brightness. The Deep Q-Network algorithm was used to convert the game decision problem into a classification and recognition problem over multi-dimensional images and solve it using a CNN. Reinforcement learning works best for continuous decision-making problems. However, Deep Reinforcement Learning has a limitation of not converging, for which Neural Fitted Q-learning (NFQ) and DQN algorithms were used to overcome the issue. Since NFQ can work with numerical information only, the author suggests the use of DQN. By combining Q-learning with a CNN, the DQN can achieve self-learning. ReLU and maximum pooling layers are added to the CNN. Gradient descent (the Adam optimizer) was used to train the DQN parameters.
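The frame preprocessing described above (scaling, grayscaling, and brightness adjustment) can be sketched with OpenCV as follows. The 80 x 80 target size and the brightness parameters are illustrative assumptions for the example, not values reported in [4].

```python
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray, size=(80, 80), alpha=1.2, beta=10):
    """Scale, grayscale and brighten a raw RGB game frame before feeding it to a DQN."""
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)                     # grayscale
    resized = cv2.resize(gray, size)                                   # scale down
    brightened = cv2.convertScaleAbs(resized, alpha=alpha, beta=beta)  # brightness adjustment
    return brightened.astype(np.float32) / 255.0                       # normalise to [0, 1]
```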
Q-value function based algorithms are the focus of Aidar Shakerimov (2021) [5]. For the DQN algorithms, improvements in performance could be achieved by using a cumulative reward for training actions. To speed up training, an RNN-ReLU was used instead of an LSTM or GRU. An LSTM or GRU performs better than an RNN-ReLU but takes 7 times more time to train. Label smoothing was used to prevent vanishing gradients in the RNN-ReLU. However, the DQN is sensitive to seed randomization.

SARSA is a slight variation of the traditional Q-Learning algorithm. The authors in [6] use the SARSA and Q-Learning algorithms with modifications such as an ε-greedy policy, discretization, and backward updates. Some variants of Q-Learning were also implemented, such as a tabular approach, Q-value approximation using linear regression, and a neural network. In their implementation, [6] finds the SARSA algorithm to have outperformed Q-learning. The reward specification is a positive 5 for passing a pipe, a negative 1000 for hitting a pipe, and a positive 0.5 for surviving a frame. A feed-forward NN was used with a 3-neuron input layer, a 50-neuron hidden layer, a 20-neuron hidden layer, and a 2-neuron output layer (ReLU activation function). The CNN is used with a preprocessed input image obtained by removing the background, converting to grayscale, and resizing to 80 x 80; two CNN layers were used, one with sixteen 5 × 5 kernels with stride 2, and another with thirty-two 5 × 5 kernels with stride 2.
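The two network architectures described for [6] can be sketched in PyTorch as below. This is a reconstruction from the textual description only; details such as padding, the flattened feature size, and the output interpretation are assumptions.

```python
import torch.nn as nn

class FlappyQNet(nn.Module):
    """Feed-forward Q-network following the 3-50-20-2 layout described in [6]."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(3, 50), nn.ReLU(),   # 3 state features -> 50 hidden units
            nn.Linear(50, 20), nn.ReLU(),  # second hidden layer of 20 units
            nn.Linear(20, 2),              # Q-values for the two actions (flap / do nothing)
        )

    def forward(self, x):
        return self.layers(x)

class FlappyConvNet(nn.Module):
    """Two-layer CNN over 80 x 80 preprocessed frames, following the description in [6]."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),   # sixteen 5x5 kernels, stride 2
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),  # thirty-two 5x5 kernels, stride 2
        )
        # With no padding, an 80x80 input shrinks to 38x38 and then 17x17 feature maps.
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 17 * 17, 2))

    def forward(self, x):
        return self.head(self.conv(x))
```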
[7] proposes the use of specific feature selection and represents the state by the bird's velocity and the difference between the bird's position and the next lower pipe. This reduces the feature space and eliminates the need for deeper models. The agent is provided with rational, human-level inputs along with generic RL and a standard 3-layer NN with a genetic optimization algorithm. The reward for the agent is a positive 1 for every pipe crossed and a negative 100 if the agent dies. The neuroevolution has the following characteristics: the NN weights and the number of hidden-layer units undergo changes, the mutation rate is kept at 0.3, and the initial population size is 200. [8] proposes the use of two levels for the Flappy Bird game. The fitness function is calculated from the distance traveled by the agent and the current distance to the closest gap. The mutation rate is kept at 0.2, and there are 5 neurons in the hidden layer.

III. METHODOLOGY

The NEAT algorithm implementation is dependent on the objective function, crossover, mutation, and a population of agents. For a given position of the bird, say (x, y), there are
two actions that the agent can take: either the bird flaps its wings or it does not. The vertical and horizontal distances traveled by the agent are determined by the following equations.
d_{vertical} = v_{jump} \cdot t + \frac{1}{2} \cdot a \cdot t^{2}    (1)

d_{horizontal} = v_{floor} \cdot t    (2)

d_{floor} = v_{floor} \cdot t    (3)

d_{pipe} = v_{pipe} \cdot t    (4)
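A per-frame update implementing Eqs. (1)-(4) could look like the following sketch. The velocity and acceleration constants are illustrative placeholders; the paper does not report the exact values.

```python
# Illustrative constants in pixels per frame; not the paper's actual values.
V_JUMP = -9.0    # upward velocity applied when the bird flaps
A_GRAVITY = 0.5  # constant downward acceleration a
V_FLOOR = 4.0    # horizontal scroll speed of the floor
V_PIPE = 4.0     # horizontal scroll speed of the pipes

def displacements(t: float):
    """Displacements after time t, following Eqs. (1)-(4)."""
    d_vertical = V_JUMP * t + 0.5 * A_GRAVITY * t ** 2  # Eq. (1)
    d_horizontal = V_FLOOR * t                          # Eq. (2)
    d_floor = V_FLOOR * t                               # Eq. (3)
    d_pipe = V_PIPE * t                                 # Eq. (4)
    return d_vertical, d_horizontal, d_floor, d_pipe
```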
Eq. (1) determines the vertical displacement of the agent,
where a is the acceleration that is a constant [12]. As shown in
Fig. 2, the y coordinate of the agent, the distance between the top pipe and the agent (y - T') and the distance between the bottom pipe and the agent (T') are the inputs to the neural network. The gap between the top and the bottom pipe is fixed to 320 pixels, and the heights are randomly generated. The distance between subsequent pipes is also kept constant. With respect to the NEAT algorithm, the fitness of the agent is determined by the number of pipes that the agent is able to cross without collision. As soon as the agent collides with a pipe, hits the roof, or falls down to the ground, it is removed from the environment. The performance of the algorithm depends on the initial population that is taken into consideration. The activation function used is the hyperbolic tangent function. The mutation rate is kept at 0.03.

TABLE I
ENCODING OF A CHROMOSOME BEFORE CROSSOVER AND MUTATION

Weight  | 0.25 | 2.31 | 1.55 | 0.98 | 5.11 | 1.17 | 0.07
From    |    1 |    2 |    3 |    1 |    3 |    4 |    2
To      |    2 |    3 |    2 |    3 |    4 |    3 |    4
Enabled |    1 |    0 |    1 |    1 |    1 |    1 |    1

Fig. 3. Diagrammatic view of the encoded chromosome in Table I

The encoding of the chromosome is shown in Table I. The weight of the connection from a node in one layer to a node in the other layer, and whether the connection is dropped, are also part of the encoding. If a connection is to be dropped, it is marked as 0 in the 'Enabled' row, and the nodes are represented by the rows 'From' and 'To'. Table I shows the encoding of the network before mutation. After the mutation, or rather after topology augmentation, the encoding of the edges is shown in Table II. The resultant connections are shown in Fig. 4. The edges that are in red are the edges that were dropped, and the edges that are in green are the ones that have been added as a result of the mutation. The cross-over process happens between any two
randomly selected parents. The next population is determined
by the fitness of the individual agents.
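To make the encoding of Table I concrete, each connection gene can be stored as a record of weight, source node, destination node, and an enabled flag. The sketch below mirrors Table I and shows one possible topology-augmenting mutation that disables or re-enables connections and adds a new random edge; the exact mutation operators of the implementation are not reproduced here.

```python
import random
from dataclasses import dataclass

@dataclass
class ConnectionGene:
    weight: float
    src: int       # 'From' node
    dst: int       # 'To' node
    enabled: bool  # 'Enabled' row; a dropped connection is marked 0 (False)

# Chromosome from Table I, before crossover and mutation.
chromosome = [
    ConnectionGene(0.25, 1, 2, True), ConnectionGene(2.31, 2, 3, False),
    ConnectionGene(1.55, 3, 2, True), ConnectionGene(0.98, 1, 3, True),
    ConnectionGene(5.11, 3, 4, True), ConnectionGene(1.17, 4, 3, True),
    ConnectionGene(0.07, 2, 4, True),
]

def mutate(genes, rate=0.03):
    """Toggle existing connections with a small probability and add one new random edge."""
    for gene in genes:
        if random.random() < rate:
            gene.enabled = not gene.enabled  # drop (or re-enable) a connection
    nodes = sorted({g.src for g in genes} | {g.dst for g in genes})
    src, dst = random.sample(nodes, 2)
    genes.append(ConnectionGene(random.uniform(-1.0, 1.0), src, dst, True))  # augment topology
    return genes
```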
IV. RESULTS

The implementation of the algorithm requires no historic data or any dataset. The algorithm makes use of the sensory data perceived from the environment by the artificial agent as the program runs. The inputs to the algorithm are the y position of the agent, the vertical distance of the agent from the top pipe, and the vertical distance of the agent from the lower pipe. The output of the algorithm is the action that the agent is to take, i.e., jump or drop down owing to gravity.
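Assuming the implementation uses the neat-python library, as the project name suggests, the per-frame decision from these three inputs could be sketched as follows. The argument names and the 0.5 threshold are illustrative assumptions, not details taken from the paper.

```python
import neat  # neat-python

def should_flap(net: "neat.nn.FeedForwardNetwork",
                bird_y: float, top_pipe_y: float, bottom_pipe_y: float) -> bool:
    """Return True if the agent should flap, given the three sensory inputs."""
    # Inputs: y position, distance to the top pipe, distance to the bottom pipe.
    output = net.activate((bird_y, abs(bird_y - top_pipe_y), abs(bird_y - bottom_pipe_y)))
    # With a tanh activation the single output lies in [-1, 1]; flap above a threshold.
    return output[0] > 0.5
```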
The NEAT algorithm was implemented by taking different initial populations. Fig. 5, Fig. 6 and Fig. 7 show the average score and the scores reached in every generation, when the game is played by the agents over 50 generations.

Fig. 5. Gameplay when initial population is 80
Fig. 7. Gameplay when initial population is 120

The change in the average scores over the change in the initial population is separately shown in Fig. 8 for generations 30 to 50. The average score of the agent steadily increases as the initial population grows from 20 to 100. The maximum score is observed when the population is 160. The average fitness value of the population is higher when the initial population size is 100. This is shown in Fig. 9. The initial training phase is less than 5 generations. When the initial population has fewer agents, it takes more generations for the average score of the game to spike. This can be observed from Fig. 10. Table III shows the average score and the maximum score gained by the agent over 50 generations. A maximum score of 1025 is obtained when the initial population is 160 and the gameplay is run for 50 generations.

TABLE III
SCORES OVER CHANGE IN INITIAL POPULATION

Fig. 10. Speed of agents getting trained over initial population change

CONCLUSION AND FUTURE SCOPE

By using a 2D game, the performance of the algorithms can be determined very efficiently. Unlike simulation, the creation of an environment gives better control over the environment. Through various iterations of changing the initial population size, the average score gained by the agent has increased. The initial population of agents also affects the training speed. The more the agents, the quicker the training is done. The highest