Dynamic Difficulty Adjustment Via Fast User Adaptation

Dynamic difficulty adjustment (DDA) is a key element in game development that provides continuous motivation and immersion to the player. However, conventional DDA methods require tuning in-game parameters to generate levels for various players. Recent DDA approaches based on deep learning can shorten the time-consuming tuning process, but require sufficient user demo data for adaptation. In this paper, we present a fast user adaptation method that can adjust the difficulty of the game for various players using only a small amount of demo data by applying a meta-learning algorithm. In a video game environment user test (n=9), our proposed DDA method outperformed a typical deep learning-based baseline method.

Author Keywords
Dynamic difficulty adjustment; deep learning; meta-learning.

CCS Concepts
•Human-centered computing → User models; •Computing methodologies → Neural networks; •Applied computing → Computer games;

INTRODUCTION
Difficulty balancing is a key element in game development because players easily become bored or frustrated when a game is too easy or too difficult for them. Dynamic difficulty adjustment (DDA) is a method for adapting the difficulty of a game according to the player's ability, providing continuous motivation to the player. Various studies in the HCI field [1, 2, 4] have revealed that DDA has positive effects, such as increasing the immersion [3] and long-term motivation of players [11].

Several studies have been conducted on how to implement DDA [6, 14, 8, 12]. One of the most straightforward but powerful methods is to increase or decrease the game's strength index, e.g., the in-game parameters or the AI level, according to the player's in-game performance. However, this parameter adjustment method requires careful, time-consuming tuning, a burden that can be overcome by using deep neural networks. In [11], a method was proposed for adapting the game challenge to the player by generating an enemy agent based on a player model trained using the player's actual movement and strategy. This method outperformed conventional DDA in several subjective metrics but required an adequate data acquisition process because the player model had to be newly trained for each player.

In this paper, we propose a novel DDA approach, referred to as fast user adaptation, based on deep neural networks that can quickly adapt to a player's capabilities with a small amount of play data. To use the sparse user demo data effectively, we employ the model-agnostic meta-learning (MAML) algorithm [5]. Meta-learning focuses on fast adaptation to various tasks, i.e., the generalization of network parameters, so that a model can easily respond to new, unseen tasks. We apply this meta-learning concept to create a DDA model that quickly adapts to new players.
Figure 1. Overview of the DDA network models we implemented. (a) Fast user adaptation. (b) LSTM-FC Net.
Figure 2. (a) Objective evaluation results and (b) subjective evaluation results of each DDA method.
FAST USER ADAPTATION
Our fast user adaptation method modifies the MAML algorithm [5] to train a model that can quickly respond to different users (Figure 1(a)). Dividing the training data obtained from various tasks into D^demo and D^valid, the MAML method first updates the network parameters θ in a few gradient steps calculated using D^demo, and then trains to minimize the loss on D^valid calculated with the updated θ. This training method can be expressed by the following equation:

min_θ Σ_T L(θ − α ∇_θ L(θ, D_T^demo), D_T^valid),

where L(θ, D_T) denotes the loss value when data D_T, obtained from task T, is fed into the model with parameters θ. Our fast user adaptation method applies the MAML algorithm, but in place of training data from various tasks (D_T), it uses data from various players (D_P). Similar to [11], we hypothesize that a DDA that makes players encounter agents whose behavior and strategy are similar to their own can effectively boost player motivation. Therefore, our DDA method is intended to make an agent quickly learn the player's movements so that the player faces an agent who plays similarly to himself/herself.
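As a concrete illustration, the inner/outer update in the objective above can be sketched in a few lines of code. This is a minimal sketch, assuming a PyTorch policy network trained by behavior cloning (MSE) on (state, action) pairs from each player, a single inner gradient step, and an illustrative inner learning rate α; these implementation details are our assumptions, not the authors' exact setup.

```python
# Minimal sketch of the meta-objective above: one inner gradient step on a
# player's demo data, evaluated on that player's held-out validation data.
import torch
import torch.nn.functional as F


def adapted_loss(model, demo, valid, inner_lr=0.01):
    """L(theta - alpha * grad L(theta, D^demo), D^valid) for one player."""
    states_d, actions_d = demo    # demo trajectories D_P^demo
    states_v, actions_v = valid   # held-out trajectories D_P^valid
    params = dict(model.named_parameters())

    # Inner update: theta' = theta - alpha * grad_theta L(theta, D^demo)
    inner_loss = F.mse_loss(
        torch.func.functional_call(model, params, (states_d,)), actions_d)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    adapted = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # Outer loss: L(theta', D^valid), still differentiable with respect to theta
    return F.mse_loss(
        torch.func.functional_call(model, adapted, (states_v,)), actions_v)


def meta_train_step(model, meta_opt, player_batches):
    """Sum the adapted losses over a batch of players and update theta."""
    meta_opt.zero_grad()
    loss = sum(adapted_loss(model, d, v) for d, v in player_batches)
    loss.backward()
    meta_opt.step()
```

At test time, the same inner update applied to a new player's short demo yields the player-specific agent without retraining the model from scratch.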
EXPERIMENT DETAILS
For the user test, we developed a virtual Air Hockey game environment in which two players compete with their respective strikers and a single puck on a slippery surface. A player can freely move the striker within his/her area and score points by hitting the puck into the opponent's goal. We conducted a user test that confronts participants with DDA-applied agents in this Air Hockey game environment.

To validate our fast user adaptation model, we implemented two baseline DDA methods: another data-driven approach utilizing neural networks, and a conventional DDA approach. For the data-driven baseline, referred to as LSTM-FC Net, we implemented a neural network model incorporating long short-term memory (LSTM) layers that extract user embedding information, e.g., the user's proficiency, from user demo data, and fully connected (FC) layers that output appropriate actions based on the current game state and the embedding information (Figure 1(b)). For the conventional DDA baseline, we generated agents corresponding to progressive difficulty levels from 1 to 9; the level was increased or decreased depending on whether the player won or lost.
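The conventional baseline's adjustment rule amounts to a few lines of code; the one-level step per game and the clamping to levels 1–9 follow the description above, while the direction convention (a harder agent after a player win) is our assumption.

```python
# Sketch of the conventional DDA baseline: move one difficulty level up after
# a player win and down after a loss, clamped to the available levels 1-9.
def next_level(level: int, player_won: bool) -> int:
    level += 1 if player_won else -1
    return max(1, min(9, level))
```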
In detail, our DDA network consists of four FC layers with 80 hidden units, and the LSTM-FC Net consists of two LSTM layers with 10 hidden units and four FC layers with 80 hidden units. 60M timesteps of artificial agent data and 0.2M timesteps of data acquired from one human player were used for model training. When the same data were exploited for five epochs, our model trained about nine times faster than the LSTM-FC Net (2 hours vs. 18 hours).
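A minimal sketch of the two architectures follows, assuming PyTorch; the state, action, and demo-feature dimensions are placeholders, since the paper reports only the layer counts and hidden-unit sizes.

```python
# Rough sketch of the two network architectures described above.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, DEMO_FEAT_DIM = 8, 2, 8  # hypothetical sizes


def fc_stack(in_dim, out_dim, hidden=80, n_layers=4):
    """Four FC layers with 80 hidden units (as reported), with ReLU in between."""
    layers, d = [], in_dim
    for _ in range(n_layers - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)


# Fast-user-adaptation policy network: four FC layers with 80 hidden units.
dda_net = fc_stack(STATE_DIM, ACTION_DIM)


class LSTMFCNet(nn.Module):
    """LSTM-FC Net baseline: two LSTM layers (10 hidden units) embed the demo
    trajectory; four FC layers (80 hidden units) map state + embedding to an action."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(DEMO_FEAT_DIM, 10, num_layers=2, batch_first=True)
        self.fc = fc_stack(STATE_DIM + 10, ACTION_DIM)

    def forward(self, state, demo_seq):
        _, (h, _) = self.lstm(demo_seq)  # final hidden state as the user embedding
        return self.fc(torch.cat([state, h[-1]], dim=-1))
```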
Nine participants between 22 and 29 years of age (mean age = 25.33) were recruited for the user test. All participants were given sufficient practice time to avoid skill increases during the user test. After the practice time, participants took part in three sessions with the three different types of DDA agents in random order. The initial difficulty adjustment of each session was performed using data acquired during a pre-session of about one minute conducted immediately before each session. Each session lasted about four minutes; halfway through each session (i.e., after two minutes), a short break was given and an additional difficulty adjustment was performed.

RESULTS
We evaluated the DDA methods using both objective and subjective evaluation metrics. As objective metrics of how successfully the DDA model adapted to the user, we measured the participants' win/loss rate and puck possession, i.e., the percentage of time with the puck on the participant's side. An even game is expected to result in 50 percent for each metric. Figure 2(a) indicates that our method achieves a win/loss rate comparable to the conventional method and superior to that of the LSTM-FC Net. In terms of puck possession, our method also shows a result comparable to the conventional method and superior to that of the LSTM-FC Net.
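The two objective metrics can be computed from a session log roughly as follows; the log format and field names are hypothetical.

```python
# Hypothetical computation of the objective metrics from one session:
# win/loss rate over scored goals, and puck possession as the share of
# frames in which the puck is on the participant's side (~0.5 = even game).
def objective_metrics(goal_events, frames):
    wins = goal_events.count("player_goal")
    losses = goal_events.count("agent_goal")
    win_rate = wins / max(1, wins + losses)
    possession = sum(f["puck_on_player_side"] for f in frames) / max(1, len(frames))
    return win_rate, possession
```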
As subjective metrics, we asked the participants to complete a questionnaire that assessed enjoyment, suitable difficulty, engrossment, and personal gratification, modified from [7]. Figure 2(b) shows the subjective evaluation results of our user test. Our DDA method shows superior results to the LSTM-FC Net in terms of enjoyment, engrossment, and, in particular, the suitable-difficulty score.

CONCLUSION
In this paper, we proposed a novel DDA method named fast user adaptation, based on a meta-learning algorithm. Our method surpassed a deep neural network-based baseline in both objective and subjective evaluations and showed a much faster learning speed. In addition, our method achieved performance comparable to the conventional DDA while having the advantage of not requiring time-consuming parameter tuning.
ACKNOWLEDGMENTS
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043580).

REFERENCES
[1] Alexander Baldwin, Daniel Johnson, and Peta A Wyeth. 2014. The effect of multiplayer dynamic difficulty adjustment on the player experience of video games. In CHI'14 Extended Abstracts on Human Factors in Computing Systems. 1489–1494.
[2] Thomas Constant and Guillaume Levieux. 2019. Dynamic difficulty adjustment impact on players' confidence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI'19). 1–12.
[3] Alena Denisova and Paul Cairns. 2015. Adaptation in digital games: the effect of challenge adjustment on player performance and experience. In Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play (CHI PLAY'15). 97–101.
[4] Alena Denisova and Paul Cairns. 2019. Player experience and deceptive expectations of difficulty adaptation in digital games. Entertainment Computing 29 (2019), 56–68.
[5] Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning (ICML). 1126–1135.
[6] Suoju He, Junping Wang, Xiao Liu, Wan Huang, and others. 2010. Dynamic difficulty adjustment of game AI by MCTS for the game Pac-Man. In 2010 Sixth International Conference on Natural Computation, Vol. 8. IEEE, 3918–3922.
[7] Takahiro Kusano, Yunshi Liu, Pujana Paliyawan, Tomohiro Harada, and Ruck Thawonmas. 2019. Motion gaming AI using time series forecasting and dynamic difficulty adjustment for improving exercise balance and enjoyment. In 2019 IEEE Conference on Games (CoG).
[8] David Melhart, Ahmad Azadvar, Alessandro Canossa, Antonios Liapis, and Georgios N Yannakakis. 2019. Your gameplay says it all: modelling motivation in Tom Clancy's The Division. In 2019 IEEE Conference on Games (CoG). IEEE, 1–8.
[9] Hee-Seung Moon and Jiwon Seo. 2019a. Observation of human response to a robotic guide using a variational autoencoder. In 2019 Third IEEE International Conference on Robotic Computing (IRC). IEEE, 258–261.
[10] Hee-Seung Moon and Jiwon Seo. 2019b. Prediction of human trajectory following a haptic robotic guide using recurrent neural networks. In 2019 IEEE World Haptics Conference (WHC). IEEE, 157–152.
[11] Johannes Pfau, Jan David Smeddinck, and Rainer Malaka. 2020. Enemy within: Long-term motivation effects of deep player behavior models for dynamic difficulty adjustment. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI'20). 1–10.
[12] I-Chen Wu, Ti-Rong Wu, An-Jen Liu, Hung Guei, and Tinghan Wei. 2019. On strength adjustment for MCTS-based programs. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'19), Vol. 33. 1222–1229.
[13] Ziming Wu, Yulun Jiang, Yiding Liu, and Xiaojuan Ma. 2020. Predicting and diagnosing user engagement with mobile UI animation via a data-driven approach. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI'20). 1–13.
[14] Haiyan Yin, Linbo Luo, Wentong Cai, Yew-Soon Ong, and Jinghui Zhong. 2015. A data-driven approach for online adaptation of game difficulty. In 2015 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 146–153.
[15] Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, and Sergey Levine. 2018. One-shot imitation from observing humans via domain-adaptive meta-learning. arXiv preprint arXiv:1802.01557 (2018).