Moving Beyond Test Scores: Analyzing The Effectiveness of A Digital Learning Game Through Learning Analytics
Bruce M. McLaren
Carnegie Mellon University
[email protected]
487 Proceedings of The 13th International Conference on Educational Data Mining (EDM 2020)
sult, we derived lessons for improving learning support in
Decimal Point as well as in a more general learning game
context.
2. RELATED WORK
2.1 Learning Analytics in Games
In-game formative assessment can be a powerful comple-
mentary tool for capturing students’ learning progress [59].
Traditional formative measures typically make use of game-
based metrics, such as the number of completed levels or
the highest level beaten [2, 11], but these metrics may not
always align with actual learning. Prior studies on Deci-
mal Point, for instance, reported that students who played
more mini-game rounds did not learn more than those who
played fewer [18, 39]. An alternative approach is to employ
learning analytics methods from ITS studies. For exam-
ple, learning curve analysis, which visualizes students’ error
rates over time, has been applied in several learning games
and yielded valuable insights, ranging from instructional redesign lessons to the discovery of unforeseen student strategies [17, 29, 42].
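The learning curve analysis mentioned above can be sketched in a few lines: average the first-attempt error rate across students at each practice opportunity, per skill. The log format here is a hypothetical simplification of ITS-style transaction data, not the cited authors' implementation:

```python
from collections import defaultdict

def learning_curve(log):
    """Average error rate at each opportunity index, per skill.

    `log` is an iterable of (student, skill, opportunity_index, correct)
    tuples, where `correct` is the first-attempt outcome (True/False).
    Returns a dict mapping (skill, opportunity_index) -> error rate.
    """
    errors = defaultdict(list)
    for student, skill, opp, correct in log:
        errors[(skill, opp)].append(0 if correct else 1)
    return {key: sum(v) / len(v) for key, v in errors.items()}

# Toy log: two students, two opportunities on the same skill
log = [
    ("s1", "Sorting", 1, False), ("s1", "Sorting", 2, True),
    ("s2", "Sorting", 1, False), ("s2", "Sorting", 2, False),
]
curve = learning_curve(log)
print(curve[("Sorting", 1)], curve[("Sorting", 2)])  # 1.0 0.5
```

A flat curve (error rate not dropping across opportunities) is the signal, discussed later in the paper, that a skill is not being learned from practice alone.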
Table 1: Survey items before and after game play.
Pre-intervention surveys
Dimension (item count) Example statement Cronbach’s α
Decimal efficacy (3) [44] I can do an excellent job on decimal number math assignments. .83
Computer efficacy (3) [31] I know how to find information on a computer. .71
Identification agency (2) [50] I work on my classwork because I want to learn new things. .60
Intrinsic agency (2) [50] I work on my classwork because I enjoy doing it. .86
External agency (3) [50] I work on my classwork so the teacher won’t be upset with me. .61
Perseverance (3) [12] Setbacks don’t discourage me. I don’t give up easily. .79
Math utility (3) [13] Math is useful in everyday life. .63
Math interest (2) [14] I find working on math to be very interesting. .75
Expectancy (1) [23] I plan to take the highest level of math available in high school. -
Post-intervention surveys
Dimension (item count) Example statement Cronbach's α
Affective engagement (3) [5] I felt frustrated or annoyed. .78
Cognitive engagement (3) [5] I tried out my ideas to see what would happen. .54
Game engagement (5) [7] I lost track of time. .74
Achievement emotion (6) [43] Reflecting on my progress in the game made me happy. .89
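The internal-consistency values in Table 1 are Cronbach's α. As a sketch of how such a value is computed from an item-response matrix (the responses below are made up, not the study's survey data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) response matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variance
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses: 4 students x 3 items
responses = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 3]]
print(round(cronbach_alpha(responses), 2))  # 0.92
```

Values near 1 indicate that the items move together; the .54 for cognitive engagement in Table 1 is the weakest scale by this measure.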
we observed that: there were 4 students who did not master any skill, 20 students who mastered one skill, 33 students who mastered two skills, 42 students who mastered three skills, 34 students who mastered four skills, and 26 students who mastered all five skills. Next, we counted how many opportunities each student who mastered a skill took to reach mastery in that skill. An opportunity is defined as one complete decimal exercise; each mini-game round consists of one opportunity, except for those in Sequence, which contain three opportunities (i.e., students have to fill in three decimal sequences per round). The distributions of opportunity count until mastery are plotted in Figure 2, which shows that Number Line and Sorting took the longest to master, at around 5 opportunities on average. For Number Line, one student even needed 26 opportunities to reach mastery.

Figure 3: Over-practice ratio in each skill. The number next to each skill indicates the count of students who mastered that skill and were included in the violin plot.
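The opportunity counting described above can be sketched as follows. Since the paper's actual mastery criterion is not restated in this section, this illustration assumes a placeholder rule of three consecutive correct first attempts, not the study's mastery model:

```python
def opportunities_until_mastery(attempts, streak_needed=3):
    """Opportunities a student takes to reach mastery in one skill.

    `attempts` is the student's chronological sequence of first-attempt
    outcomes (True = correct). Mastery here is a placeholder criterion:
    `streak_needed` consecutive correct attempts. Returns None if the
    student never reaches mastery.
    """
    streak = 0
    for i, correct in enumerate(attempts, start=1):
        streak = streak + 1 if correct else 0
        if streak == streak_needed:
            return i
    return None

print(opportunities_until_mastery([True, False, True, True, True]))  # 5
print(opportunities_until_mastery([False, False]))                   # None
```

Aggregating this count per skill across students yields distributions like those in Figure 2; attempts after the returned index count toward over-practice as in Figure 3.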
Table 2: Results of feature selection for predicting game enjoyment. The Overall performance row indicates
the selected model’s scores when trained and evaluated on the entire dataset.
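The selection procedure summarized in Table 2 can be sketched as forward sequential feature selection wrapped around a linear regression. The paper cites the mlxtend library [45]; this illustration instead uses scikit-learn's analogous SequentialFeatureSelector, and all data here are synthetic stand-ins for the study's features and enjoyment ratings:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

# Synthetic stand-in data: 100 students x 6 candidate features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
# Only features 0 and 3 actually drive the (synthetic) outcome
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=100)

# Forward selection of 2 features, scored by cross-validated R^2
sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2,
    direction="forward", scoring="r2", cv=5,
)
sfs.fit(X, y)
print(sfs.get_support(indices=True))  # recovers the two informative features
```

The same wrapper approach, with cross-validated R² as the criterion, yields the per-model feature subsets whose performance Table 2 reports.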
than those of the test score models. Even when trained and evaluated on the entire dataset, Linear Regression could only explain about 20% of the variance in game engagement and affective engagement. On the other hand, the achievement emotion model did have reasonable performance (adjusted R² = .386), so we focused on analyzing the features in this model. The linear regression table showed the coefficient and significance of each feature as follows: computer efficacy with β = 0.047, p = .063; identification agency with β = 0.099, p = .024; intrinsic agency with β = 0.116, p = .002; math interest with β = 0.114, p = .001; pretest score with β = −0.017, p = .011; and opportunity count with β = 0.009, p = .033. In other words, computer efficacy had a positive and marginally significant association, while pretest score had a negative and significant association; the remaining features (identification agency, intrinsic agency, math interest and opportunity count) each had a positive and significant association.

5. DISCUSSION
5.1 Investigating in-game learning
Based on the opportunity count until mastery in each skill (Figure 2), we identified Sorting and Number Line as the most difficult skills in the game. Our prior learning curve analysis [40] on a different Decimal Point study reported a consistent finding – that the learning curves of these two skills were mostly flat and reflected small learning rates. Based on previous research in decimal learning, a plausible explanation is that there are several misconceptions which can lead to students making mistakes in Sorting or Number Line problems, including (1) treating decimals as whole numbers, (2) treating decimals as fractions, and (3) ignoring the zero in the tenths place [46]. Furthermore, even when students recognize their misconception, they may shift to a different misconception instead of arriving at the correct understanding [56]. This phenomenon likely also occurred in Decimal Point, as the game provides corrective feedback (whether an answer is right or wrong) but does not emphasize the underlying reasoning; consequently, for example, a student realizing it is wrong to assume longer decimals are larger may end up concluding that shorter decimals must be larger, thereby adopting a new misconception. This highlights the need for more refined tracing of the student's dynamic learning states in a digital learning environment. While the standard KC modeling technique can track when students make an intended mistake (e.g., longer decimals are larger), it does not investigate their specific input to see whether a new misconception (e.g., shorter decimals are larger) has emerged. To address this issue, future iterations of the game should provide more instructional support that can react to various misconceptions from students, for example via explanatory feedback [19] or predefined error messages for different types of error [36].

Once students have mastered a skill, however, our analysis showed that over-practice was very common, i.e., students kept playing more mini-games in the mastered skill. At the same time, only 26 out of 159 students mastered all five skills, suggesting that the majority of students still had room for improvement in the unmastered skills but chose not to practice them. One possible reason is that the game environment did not explicitly indicate when a student had reached mastery or force them to switch to practicing a different skill. Consequently, young students, who were likely to be weak at self-regulated learning [37, 53], simply played the mini-games that they thought were engaging, which in this case involved the skills they had already mastered. A prior study by [29] similarly found that, in a game about locating fractions on a number line, students were more engaged when the game was easier, contradicting game design theories that optimal engagement would occur at a moderate difficulty level.

5.2 Investigating factors related to posttest and delayed posttest performance
We saw that our linear regression models were able to predict posttest and delayed posttest performance well, capturing about 75% of the variance in test scores with only 3–5 features. The three features present in both models are pretest score, Sorting mastery and Bucket mastery. The inclusion of pretest score is not surprising, as it is consistent with the standard practice of controlling for prior knowledge when analyzing posttest scores [58]. On the other hand, both Sorting mastery and Bucket mastery suggest that the ability to compare decimal numbers plays a large role in test performance. This is likely due to the game and test materials focusing on the four most common decimal misconceptions (Megz, Segz, Pegz, Negz), three of which are related to decimal comparison [25]. Based on the distribution of practice opportunities until mastery, however, students took many more attempts to master Sorting problems than Bucket problems, which may explain why they did not achieve high scores on the posttest and delayed posttest, averaging only around 30 out of 52 points [22]. Therefore, improving students' performance on Sorting problems, potentially by incorporating hints and error messages as we previously discussed, is crucial in future studies of the game.
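Targeted error messages of this kind presuppose classifying a wrong comparison answer by the misconception it reflects, rather than only marking it incorrect. A minimal sketch, where the rule and category labels are illustrative and not the game's internal model:

```python
def classify_comparison_error(chosen, other):
    """Guess which comparison misconception a wrong answer reflects.

    `chosen` is the decimal string the student picked as larger and
    `other` the rejected one; both are assumed to be in "0.xxx" form.
    Returns None for a correct answer. Labels are illustrative only.
    """
    if float(chosen) > float(other):
        return None  # correct answer: no misconception evident
    if len(chosen) > len(other):
        return "longer-is-larger"   # e.g., picked 0.125 over 0.9
    if len(chosen) < len(other):
        return "shorter-is-larger"  # e.g., picked 0.3 over 0.45
    return "other-error"

print(classify_comparison_error("0.125", "0.9"))  # longer-is-larger
print(classify_comparison_error("0.3", "0.45"))   # shorter-is-larger
print(classify_comparison_error("0.9", "0.125"))  # None (correct)
```

Logging such labels over time would let the kind of analysis above detect when a student trades one misconception for another instead of resolving it.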
At the same time, we saw that Number Line mastery had a significant positive association with delayed posttest score, but was not selected in the posttest model. An interpretation of this result is that Number Line tasks, which we identified as among the most difficult in the game, could be at a desirable difficulty level, which can promote deeper and longer-lasting learning than more straightforward tasks [61]. For instance, a prior study comparing erroneous examples and problem-solving decimal tasks found that erroneous examples, which are more aligned with desirable difficulty, led to significantly higher delayed posttest scores but similar posttest scores [34]. In our case, we likewise saw that Number Line is an important feature for predicting delayed posttest but not posttest performance.

Similar to Number Line mastery, gender (male = 0, female = 1) was not a feature in the posttest model, but had a positive association with delayed posttest scores. In other words, with other factors being equal, females could achieve higher delayed posttest scores than males. While this association is only marginally significant (p = .074), similar findings about females' tendency to outperform males in retention and delayed posttests have been reported in previous mathematics intervention studies [1, 20]. Using the same dataset as in this work, [22] also found that females demonstrated significantly higher pre-post and pre-delayed learning gains than males, with a larger effect size in pre-delayed learning gains. Therefore, an important next step is to conduct future studies of Decimal Point with a larger sample size to draw more conclusive findings about whether the game promotes more retention in females and what could lead to this effect.

5.3 Investigating factors related to enjoyment
Our enjoyment prediction models did not perform as well as the learning models and could explain only about 20% of the variance in game engagement and affective engagement. These poor model fits likely result from the lack of appropriate features in our data. To track student engagement, previous work has emphasized the use of fine-grained measures such as time spent on decision making [47], social engagement profiles [49] and interaction traces [6]; in contrast, our feature set consists mainly of quantitative scores (e.g., Likert responses) and aggregate data (e.g., error count). Related to this direction, a previous study of Decimal Point by [57] clustered students based on their mini-game selection orders and found that the cluster which demonstrated more agency reported higher enjoyment. Adopting their method of encoding students' mini-game sequences is a good first step in building more fine-grained features for our prediction tasks. On the other hand, the lack of association between our in-game learning measures (e.g., skill mastery, over-practice opportunity count, error count) and game engagement or affective engagement implies that students' game performance, whether good or bad, was unlikely to yield negative emotions such as confusion or frustration. This is a positive outcome, indicating that our game environment does not impose performance pressure on students – one of the primary principles of learning games [15].

At the same time, we did find that a linear regression model was able to predict achievement emotion reasonably well from students' identification agency, intrinsic agency, math interest, computer efficacy, pretest score and opportunity count. Identification and intrinsic agency indicate that, with all other factors being equal, the more students identified their learning as coming from intrinsic motivation (rather than external pressures), the more achievement they felt after learning. Math interest and computer efficacy suggest that students' acquaintance with the learning domain or medium could also be positively associated with achievement emotion [26]. On the other hand, pretest score had a negative association, likely because students with lower prior knowledge were able to learn more from the game and therefore felt more achievement than those with high prior knowledge. Similarly, for opportunity count, a plausible reason for students choosing to play more mini-game rounds is that they felt the mini-games were helpful, which contributed to their achievement emotion after game play. Overall, the features we identified could serve as a guideline for promoting achievement emotion in learning games and in more general instructional contexts.

6. CONCLUSIONS
From our analyses, we gained several insights into students' learning outcomes and enjoyment in Decimal Point. First, we found that Sorting and Number Line are important skills for posttest and delayed posttest performance, but students required more instructional support to effectively master them. Second, very few students mastered all five decimal skills from the game, while the majority engaged in over-practice, likely due to their preference for playing easy mini-games, i.e., those they had already mastered. Third, expanding on prior findings about gender effects in Decimal Point [22, 33], we identified a trend of females outperforming males on the delayed posttest, which should be investigated with a larger sample size. Fourth, we learned that students' achievement emotion can be reasonably captured by their level of computer efficacy, learning motivation, prior knowledge and number of mini-game rounds. All of these insights can be derived from log data alone and would serve as useful metrics to assist digital learning game researchers in evaluating and improving their own games. For Decimal Point, in particular, an important next step is to perform similar analyses on other studies of the game to see which of our findings can be replicated. Identifying consistent trends in student data could allow us to construct a more generalized model of students' game play that combines existing theories with novel exploratory analyses [38].

In a broader context, we have seen the rapid growth of digital learning games in recent years, from being conceived as a novel learning platform [15, 21] to having their effectiveness validated by rigorous studies [10]. The game Decimal Point, in particular, has been shown to significantly improve students' learning across several research works [18, 22, 35, 39]. When viewed from a learning analytics perspective, however, one can identify room for improvement that would otherwise not be reflected in pretest and posttest scores alone. For instance, a game may not adequately support all of its learning objectives, or students may engage in non-optimal learning behavior due to a lack of self-regulation. At the heart of these issues is the question of how digital learning games can optimize student learning while retaining their core value as playful environments, where players are free to exercise their agency. Addressing this question is an important step for future work in the field.
7. REFERENCES
[1] J. Ajai and B. Imoko. Gender differences in mathematics achievement and retention scores: A case of problem-based learning method. International Journal of Research in Education and Science, 1(1):45–50, 2015.
[2] E. Andersen, E. O'Rourke, Y.-E. Liu, R. Snider, J. Lowdermilk, D. Truong, S. Cooper, and Z. Popovic. The impact of tutorials on games of varying complexity. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 59–68, 2012.
[3] R. S. Baker. Personal correspondence, 2019.
[4] R. S. Baker, M. J. Habgood, S. E. Ainsworth, and A. T. Corbett. Modeling the acquisition of fluent skill in educational action games. In International Conference on User Modeling, pages 17–26. Springer, 2007.
[5] A. Ben-Eliyahu, D. Moore, R. Dorph, and C. D. Schunn. Investigating the multidimensionality of engagement: Affective, behavioral, and cognitive engagement across science activities and contexts. Contemporary Educational Psychology, 53:87–105, 2018.
[6] P. Bouvier, K. Sehaba, and É. Lavoué. A trace-based approach to identifying users' engagement and qualifying their engaged-behaviours in interactive systems: Application to a social game. User Modeling and User-Adapted Interaction, 24(5):413–451, 2014.
[7] J. H. Brockmyer, C. M. Fox, K. A. Curtiss, E. McBroom, K. M. Burkhart, and J. N. Pidruzny. The development of the game engagement questionnaire: A measure of engagement in video game-playing. Journal of Experimental Social Psychology, 45(4):624–634, 2009.
[8] H. Cen, K. R. Koedinger, and B. Junker. Is over practice necessary? Improving learning efficiency with the cognitive tutor through educational data mining. Frontiers in Artificial Intelligence and Applications, 158:511, 2007.
[9] C.-H. Chen, K.-C. Wang, and Y.-H. Lin. The comparison of solitary and collaborative modes of game-based learning on students' science learning and motivation. Journal of Educational Technology & Society, 18(2):237–248, 2015.
[10] D. B. Clark, E. E. Tanner-Smith, and S. S. Killingsworth. Digital games, design, and learning: A systematic review and meta-analysis. Review of Educational Research, 86(1):79–122, 2016.
[11] G. C. Delacruz, G. K. Chung, and E. L. Baker. Validity evidence for games as assessment environments. CRESST Report 773. National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010.
[12] A. L. Duckworth, C. Peterson, M. D. Matthews, and D. R. Kelly. Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92(6):1087, 2007.
[13] A. M. Durik, M. Vida, and J. S. Eccles. Task values and ability beliefs as predictors of high school literacy choices: A developmental analysis. Journal of Educational Psychology, 98(2):382, 2006.
[14] W. Fan and C. A. Wolters. School motivation and high school dropout: The mediating role of educational expectation. British Journal of Educational Psychology, 84(1):22–39, 2014.
[15] J. P. Gee. What video games have to teach us about learning and literacy. Computers in Entertainment (CIE), 1(1):20–20, 2003.
[16] J. F. Hair, W. C. Black, B. J. Babin, R. E. Anderson, R. L. Tatham, et al. Multivariate Data Analysis (Vol. 6), 2006.
[17] E. Harpstead and V. Aleven. Using empirical learning curve analysis to inform design in an educational game. In Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play, pages 197–207, 2015.
[18] E. Harpstead, J. E. Richey, H. Nguyen, and B. M. McLaren. Exploring the subtleties of agency and indirect control in digital learning games. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge, pages 121–129, 2019.
[19] J. Hattie and H. Timperley. The power of feedback. Review of Educational Research, 77(1):81–112, 2007.
[20] L. L. Haynes and J. V. Dempsey. How and why students play computer-based mathematics games: A consideration of gender differences. 2001 Annual Proceedings-Atlanta: Volume, page 178.
[21] M. A. Honey and M. L. Hilton. Learning Science Through Computer Games. National Academies Press, Washington, DC, 2011.
[22] X. Hou, H. Nguyen, J. E. Richey, and B. M. McLaren. Exploring how gender and enjoyment impact learning in a digital learning game. In International Conference on Artificial Intelligence in Education. Springer, 2020.
[23] C. S. Hulleman, O. Godes, B. L. Hendricks, and J. M. Harackiewicz. Enhancing interest and performance with a utility value intervention. Journal of Educational Psychology, 102(4):880, 2010.
[24] A. Illanas Vila, J. R. Calvo-Ferrer, F. J. Gallego-Durán, F. Llorens Largo, et al. Predicting student performance in foreign languages with a serious game. 2013.
[25] S. Isotani, D. Adams, R. E. Mayer, K. Durkin, B. Rittle-Johnson, and B. M. McLaren. Can erroneous examples help middle-school students learn decimals? In European Conference on Technology Enhanced Learning, pages 181–195. Springer, 2011.
[26] M. Jansen, O. Lüdtke, and U. Schroeders. Evidence for a positive relation between interest and achievement: Examining between-person and within-person variation in five domains. Contemporary Educational Psychology, 46:116–127, 2016.
[27] S. Karumbaiah, R. S. Baker, and V. Shute. Predicting quitting in students playing a learning game. International Educational Data Mining Society, 2018.
[28] K. R. Koedinger, E. Brunskill, R. S. Baker, E. A. McLaughlin, and J. Stamper. New potentials for data-driven intelligent tutoring system development and optimization. AI Magazine, 34(3):27–41, 2013.
[29] D. Lomas, K. Patel, J. L. Forlizzi, and K. R. Koedinger. Optimizing challenge in an educational game using large-scale design experiments. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 89–98, 2013.
[30] M. Manske and C. Conati. Modelling learning in an educational game. In AIED, pages 411–418, 2005.
[31] G. Marakas, R. Johnson, and P. F. Clay. The evolving nature of the computer self-efficacy construct: An empirical investigation of measurement construction, validity, reliability and stability over time. Journal of the Association for Information Systems, 8(1):2, 2007.
[32] R. E. Mayer. Computer Games for Learning: An Evidence-Based Approach. MIT Press, 2014.
[33] B. McLaren, R. Farzan, D. Adams, R. Mayer, and J. Forlizzi. Uncovering gender and problem difficulty effects in learning with an educational game. In International Conference on Artificial Intelligence in Education, pages 540–543. Springer, 2017.
[34] B. M. McLaren, D. M. Adams, and R. E. Mayer. Delayed learning effects with erroneous examples: A study of learning decimals with a web-based tutor. International Journal of Artificial Intelligence in Education, 25(4):520–542, 2015.
[35] B. M. McLaren, D. M. Adams, R. E. Mayer, and J. Forlizzi. A computer-based game that promotes mathematics learning more than a conventional approach. International Journal of Game-Based Learning (IJGBL), 7(1):36–56, 2017.
[36] B. M. McLaren, S.-J. Lim, D. Yaron, and K. R. Koedinger. Can a polite intelligent tutoring system lead to improved learning outside of the lab? Frontiers in Artificial Intelligence and Applications, 158:433, 2007.
[37] J. Metcalfe and N. Kornell. The dynamics of learning and allocation of study time to a region of proximal learning. Journal of Experimental Psychology: General, 132(4):530, 2003.
[38] R. J. Mislevy, J. T. Behrens, K. E. Dicerbo, and R. Levy. Design and discovery in educational assessment: Evidence-centered design, psychometrics, and educational data mining. Journal of Educational Data Mining, 4(1):11–48, 2012.
[39] H. Nguyen, E. Harpstead, Y. Wang, and B. M. McLaren. Student agency and game-based learning: A study comparing low and high agency. In International Conference on Artificial Intelligence in Education, pages 338–351. Springer, 2018.
[40] H. Nguyen, Y. Wang, J. Stamper, and B. M. McLaren. Using knowledge component modeling to increase domain understanding in a digital learning game. In International Conference on Educational Data Mining, pages 139–148, 2019.
[41] M. Ninaus, K. Moeller, J. McMullen, and K. Kiili. Acceptance of game-based learning and intrinsic motivation as predictors for learning success and flow experience. 2017.
[42] Z. Peddycord-Liu, R. Harred, S. Karamarkovich, T. Barnes, C. Lynch, and T. Rutherford. Learning curve analysis in a large-scale, drill-and-practice serious math game: Where is learning support needed? In International Conference on Artificial Intelligence in Education, pages 436–449. Springer, 2018.
[43] R. Pekrun. Progress and open problems in educational emotion research. Learning and Instruction, 15(5):497–506, 2005.
[44] P. Pintrich, D. Smith, T. Garcia, and W. McKeachie. A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). Ann Arbor, MI: National Center for Research to Improve Postsecondary Teaching and Learning, pages 1–76, 1991.
[45] S. Raschka. Mlxtend: Providing machine learning and data science utilities and extensions to Python's scientific computing stack. The Journal of Open Source Software, 3(24), Apr. 2018.
[46] L. B. Resnick, P. Nesher, F. Leonard, M. Magone, S. Omanson, and I. Peled. Conceptual bases of arithmetic errors: The case of decimal fractions. Journal for Research in Mathematics Education, pages 8–27, 1989.
[47] V. Riemer and C. Schrader. Impacts of behavioral engagement and self-monitoring on the development of mental models through serious games: Inferences from in-game measures. Computers in Human Behavior, 64:264–273, 2016.
[48] J. P. Rowe and J. C. Lester. Modeling user knowledge with dynamic Bayesian networks in interactive narrative environments. In Sixth AI and Interactive Digital Entertainment Conference, 2010.
[49] J. A. Ruiperez-Valiente, M. Gaydos, L. Rosenheck, Y. J. Kim, and E. Klopfer. Patterns of engagement in an educational massive multiplayer online game: A multidimensional view. IEEE Transactions on Learning Technologies, 2020.
[50] R. M. Ryan and J. P. Connell. Perceived locus of causality and internalization: Examining reasons for acting in two domains. Journal of Personality and Social Psychology, 57(5):749, 1989.
[51] J. L. Sabourin, L. R. Shores, B. W. Mott, and J. C. Lester. Understanding and predicting student self-regulated learning strategies in game-based learning environments. International Journal of Artificial Intelligence in Education, 23(1-4):94–114, 2013.
[52] R. Sawyer, A. Smith, J. Rowe, R. Azevedo, and J. Lester. Is more agency better? The impact of student agency on game-based learning. In International Conference on Artificial Intelligence in Education, pages 335–346. Springer, 2017.
[53] W. Schneider. The development of metacognitive knowledge in children and adolescents: Major trends and implications for education. Mind, Brain, and Education, 2(3):114–121, 2008.
[54] V. J. Shute, L. Wang, S. Greiff, W. Zhao, and G. Moore. Measuring problem solving skills via stealth assessment in an engaging video game. Computers in Human Behavior, 63:106–117, 2016.
[55] J. Stamper, K. Koedinger, R. S. J. d. Baker, A. Skogsholm, B. Leber, J. Rankin, and S. Demi. PSLC DataShop: A data analysis service for the learning science community. In International Conference on Intelligent Tutoring Systems, pages 455–455. Springer, 2010.
[56] W. Van Dooren, D. De Bock, A. Hessels, D. Janssens, and L. Verschaffel. Remedying secondary school students' illusion of linearity: A teaching experiment aiming at conceptual change. Learning and Instruction, 14(5):485–501, 2004.
[57] Y. Wang, H. Nguyen, E. Harpstead, J. Stamper, and B. M. McLaren. How does order of gameplay impact learning and enjoyment in a digital learning game? In International Conference on Artificial Intelligence in Education, pages 518–531. Springer, 2019.
[58] B. E. Whitley and M. E. Kite. Principles of Research in Behavioral Science. Routledge, 2013.
[59] J. Wiemeyer, M. Kickmeier-Rust, and C. M. Steiner. Performance assessment in serious games. In Serious Games, pages 273–302. Springer, 2016.
[60] M. V. Yudelson, K. R. Koedinger, and G. J. Gordon. Individualized Bayesian knowledge tracing models. In International Conference on Artificial Intelligence in Education, pages 171–180. Springer, 2013.
[61] C. L. Yue, E. L. Bjork, and R. A. Bjork. Reducing verbal redundancy in multimedia learning: An undesired desirable difficulty? Journal of Educational Psychology, 105(2):266, 2013.