4.1. Behavioral Correlates of Simple Video Game Training
The accuracy obtained in the change detection task differed across conditions; specifically, accuracy was greater for smaller set sizes, which is congruent with previous studies that examined accuracy as a function of set size in the change detection task [5, 13].
Regarding the effect of the proposed training on accuracy, we found a significant increase in the smallest set size condition (two-square) when comparing the mid- with the post-training session. In addition, we found a significant decrease in accuracy in the eight-square condition. These changes were observed for both groups. Moreover, the highest accuracy values were observed after one month of training. Because training-related changes were observed in both groups, which differs from our initial hypothesis, training with the video game played by the RT group also led to important changes. However, the question also arises as to whether the observed benefits are specific to training with simple video games or are due to a test–retest effect of the CDT.
In contrast to our findings, a previous study that used an action first-person shooter video game [5] for VWM capacity training reported significant accuracy benefits for set sizes of four, five, and six squares. Similarly, it has been shown that expert players of ARS video games reach higher accuracy values than non-experienced players in a change detection task with one, two, and four squares [13]. Both studies used action video games and attribute the increases in accuracy to aspects of the genre, such as responding to a stimulus under pressure, high cognitive demand, divided attention, and alternation between periods of concentration and divided attention. The lack of accuracy gains for the larger set sizes (four-, six-, and eight-square) observed in this work can be attributed to the genre of the video games used for training (i.e., simple 2D puzzle games) and also to the fact that, in our experiment, participants were not obliged to respond under pressure, since there was no time limit in the test set phase. However, a similar lack of training effect on accuracy in this task (two, four, six, and eight squares) has been previously reported [26]: when a simple attentional task adapted as a mobile app was used to train VWM, no significant effects were found on accuracy, but rather on response time in the change detection task.
We found that both groups took less time to respond (hits and correct rejections) to the two-square set in the post-training session. For the correct rejection RT, we found an interaction between set size and group. This interaction seems to be driven by the apparent difference between the two groups in this value for the two-square condition in the pre-training session; however, this difference is not significant (t(16) = −1.23, p = 0.23). The behavior of the groups is quite similar for the other conditions (Table 3). As stated before [42, 43], the quality of the information encoded in VWM can be measured through RTs. In other words, dealing with visual information in an optimal way allows for shorter retrieval times [44]. Thus, the decrease in RT in the simplest condition of the change detection task (two-square) could be considered a benefit of our training with simple video games in the VWM group. Since this benefit is not applicable to the RT group, it can be directly attributed to the training with the simple video game designed for memory.
The Cowan estimator changed over time in both groups. The mean Cowan estimator values for the VWM group were K = 3.06, 3.77, and 2.56 (for the pre-, mid-, and post-training sessions, respectively), and those for the RT group were K = 3.16, 3.27, and 2.16. These results suggest that the estimated VWM capacity of the participants in both groups improved from the pre- to the mid-training session. Congruent with the accuracy values, the participants presented an increase in the estimated capacity for the two-square array in the post-training session. Moreover, it was in the session after one month of training that a progressive increase in this value as a function of set size was noted.
Also, both groups showed a decrease in this value for the eight-square condition in the post-training session, to a level even lower than the pre-training values (see Table 4). This behavior, where the VWM measures decrease considerably in the post-training session, has not been reported in previous works that use video games to train VWM [5, 26]. This decrease could be attributed to confounding variables that cannot be measured with the methodology proposed in this work, such as the motivation, expectation, or engagement that the participants had before and after their training. It is possible that the participants experienced fatigue, low motivation, or low interest in this condition of the task. It is worth remembering that, for the post-training evaluation, the participants came from training at the highest difficulty of the video games, which increased progressively as a function of time. However, these assertions cannot be made without the support of an exit survey that would allow us to assess these variables. Previous works that have not found this behavior have made use of adaptive training, where the difficulty of the training is modified according to the demands of the participant [26].
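For reference, the Cowan estimator discussed above is commonly computed for single-probe change detection as K = N × (H − FA), where N is the set size, H is the hit rate, and FA is the false-alarm rate. The following is a minimal sketch under that assumption; the exact variant used in this study is not restated in this section, and the function name and trial counts are hypothetical.

```python
def cowan_k(set_size, hits, misses, false_alarms, correct_rejections):
    """Estimate VWM capacity with Cowan's K = N * (H - FA).

    Assumes a single-probe change detection task; all counts are
    numbers of trials for one participant, session, and set size.
    """
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    return set_size * (hit_rate - false_alarm_rate)


# Hypothetical counts: 50 change trials and 50 no-change trials, four squares.
k_four = cowan_k(4, hits=45, misses=5, false_alarms=10, correct_rejections=40)
print(f"K (four-square) = {k_four:.2f}")
```

Because K is bounded by the set size, small arrays such as the two-square condition cap the estimate at 2, which is one reason capacity is usually read from the larger set sizes.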
In sum, contrary to our hypothesis, both groups presented changes and benefits in the dependent measures considered for the two-square condition, including increased accuracy, shorter response times, and an increase in the estimated VWM capacity. These benefits are therefore not directly attributable to training with simple video games. The exception is the correct rejection RT: in the VWM group, it was lower for the two-square condition than for all other conditions, whereas the RT group exhibited similar RTs regardless of the condition.
4.2. Neural Correlates of Simple Video Game Training
Current results show the significant sensitivity of ERP amplitudes to the number of elements in the memory set, with more negative amplitudes for larger set sizes and similar amplitude values for set sizes that exceed VWM capacity, specifically for sets of six and eight squares. These findings are consistent with what has been reported in the literature about the NSW component as a neural correlate of the change detection task [22, 33, 35, 45]. We found that this relationship tends to be less pronounced as the training sessions progress and was not significant in the mid- and post-training session evaluations. The NSW is more robust at posterior EEG electrode sites (parietal, temporal, and occipital) [46].
The NSW analysis did not reveal significant effects of video game training on the average voltage of this potential, with only a couple of significant differences observed at electrode O1 for the two-square condition in the VWM group. Although these results could imply that training with simple video games was not effective in enhancing VWM capacity, we chose to analyze the average voltage of the component along with four additional features: the area under the curve (to assess the energy of the waveforms) and the three terms of the second-degree polynomial approximation. This approach was used because the NSW resembles a quadratic function whose apex is related to the retention period of the change detection task, as well as to the amplitude and curvature of the potential. In addition, considering the small sample size, we decided to apply DTs to features of individual EEG epochs instead of analyzing the grand averages of the NSW. In this way, it was possible to increase our sample in order to explore whether there were changes that machine learning could detect.
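To make the five per-epoch features concrete (mean voltage, area under the curve, and the three coefficients of a second-degree polynomial fit), the following is a minimal sketch for a single channel. It assumes NumPy, a trapezoidal approximation of the area, and a time axis re-referenced to the start of the window; the function name, the dictionary keys, and the synthetic epoch are hypothetical and do not reproduce the processing pipeline of this study.

```python
import numpy as np


def nsw_epoch_features(voltage_uv, times_ms, t_start=480.0, t_end=900.0):
    """Return five features of one single-channel epoch within a time window.

    voltage_uv: samples in microvolts; times_ms: matching sample times in ms.
    The default window mirrors the 480-900 ms window mentioned in the text.
    """
    v = np.asarray(voltage_uv, dtype=float)
    t = np.asarray(times_ms, dtype=float)
    mask = (t >= t_start) & (t <= t_end)
    v_win = v[mask]
    t_s = (t[mask] - t[mask][0]) / 1000.0  # seconds from window start

    # Trapezoidal area under the curve (microvolt-seconds).
    auc = np.sum((v_win[1:] + v_win[:-1]) / 2.0 * np.diff(t_s))
    # Least-squares fit v(t) ~ a*t^2 + b*t + c (highest power first).
    a, b, c = np.polyfit(t_s, v_win, deg=2)

    return {
        "mean_uv": float(v_win.mean()),
        "auc_uv_s": float(auc),
        "quad_a": float(a),
        "quad_b": float(b),
        "quad_c": float(c),
    }


# Synthetic, noiseless epoch sampled every 4 ms (illustrative values only).
demo_times = np.arange(0.0, 1200.0, 4.0)
demo_ts = (demo_times - 480.0) / 1000.0
demo_epoch = -5.0 * demo_ts**2 + 3.0 * demo_ts - 2.0
print(nsw_epoch_features(demo_epoch, demo_times))
```

Computing these features per epoch rather than per grand average yields one feature vector per trial and channel, which is what enlarges the data set available to the classifier.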
4.3. Decision Trees Provide Insight on Cognitive Benefits
The obtained features were used to train ML models to determine whether the sets were distinguishable based on time instant or group, i.e., to analyze whether there were changes in the NSW epochs. The results from these models would indicate the degree of distinction, helping us understand if and how the features of the potential changed. Therefore, we characterized the EEG epochs to increase the amount of data available for training the machine learning models. As previously discussed, characterizing EEG waveforms for training ML models is a methodology that has been used to classify various brain states, such as cognitive load and learning, among others [15, 16, 37]. However, in this study, we chose to characterize the epochs used for averaging and obtaining ERPs. This allowed us to leverage ML models that specialize in identifying characteristic patterns in large datasets, which may not be immediately apparent.
The ML algorithms allowed distinctions to be made between the different sets of features obtained from the NSW epochs.
DTs revealed that, for the VWM group in the 480 ms to 900 ms time window, the highest F1 scores were achieved when comparing the NSW epoch features of the pre- and post-training sessions for the two-square condition. Similarly, DTs found the highest F1 score for this comparison when considering the 500 ms to 1200 ms window.
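Since the F1 score is used here as the measure of how distinguishable two sets of epochs are, a minimal sketch of its computation for a binary comparison may be useful. The labels below are hypothetical (for example, 0 for a pre-training epoch and 1 for a post-training epoch), and the helper is illustrative rather than the implementation used in this study.

```python
def f1_binary(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    # By convention, return 0.0 when the positive class never appears.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical session labels for six held-out epochs and a classifier's guesses.
true_session = [1, 1, 1, 0, 0, 0]
predicted_session = [1, 1, 0, 1, 0, 0]
print(f"F1 = {f1_binary(true_session, predicted_session):.3f}")
```

A higher F1 score on held-out epochs means that the classifier separates the two sessions (or groups) more reliably, which is the sense in which the feature sets are described as more or less distinguishable.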
The decision trees (see Figure 6 and Figure 7) show that this group presents more negative AUC_T5 values in the session after two months of training. Such a change can be related to the training with the simple memory video game, as it is not present in the RT group. In addition, the post-training session is also related to more negative AUC_Pz and mv_T6 values compared to the pre-training session.
The topographical distribution of the NSW has been described in terms of the different cognitive processes it might reflect. Recently, Zickerick and colleagues examined the fronto-central slow wave as an index of WM maintenance [47], which is thought to reflect processes of cognitive control associated with the maintenance of task-relevant information [46, 48], but they additionally investigated the posterior slow wave, which has been associated with cognitive operations following target identification [49, 50]. According to these authors, an increase in a more centrally distributed component may indicate improved WM maintenance and rehearsal activity.
In light of our findings, we could argue that the increased negativity observed in temporoparietal regions after training in the memory group may indicate improved WM maintenance and rehearsal activity. Functional magnetic resonance imaging studies have shown that cortical activation in WM tasks is not limited to the prefrontal cortex [51, 52, 53]; co-activated regions have been found consistently in the inferior temporal cortex and the posterior parietal cortex. The inferior temporal cortex shows greater sensitivity to the identity and features of a stimulus, such as shape or color, independent of the stimulus location [54, 55], whereas a location-specific signal in the posterior parietal cortex has shown strong retinotopic mapping of a remembered target location [56]. According to [57], the left IPS plays an important role in phonological storage during VWM; the authors suggest that stimuli appear to be recoded and maintained by verbal rehearsal in a phonological short-term store in very similar regions, regardless of whether they are auditory or visual. In fact, the left-lateralized effect observed over the T5 scalp region could be associated with active verbal rehearsal of visual and spatial information, meaning that it is not only related to the activation of posterior visual areas. As for the RT group, more negative AUC_T5 values can also be observed for the later session; however, they occur in conjunction with mostly positive mv_O1 values and for the four-square condition.
On the other hand, for both time windows, the DTs show that the F1 scores obtained when comparing the pre- and mid-training sessions of the VWM group are higher than those of the RT group for all set sizes. Thus, it can be inferred that the features of the VWM group are more distinguishable between these time instants than those of the RT group. This may indicate that, after 1 month of training, the simple memory video game produced greater changes in the EEG waveforms than the video game used for the training of the RT group.
Also, the DTs show that the greatest changes for the RT group are observed when comparing the mid- and post-training sessions, specifically for the two- and four-square conditions, i.e., after 2 months of training. Among these changes, more negative AUC_T5 values are observed in the post-session, which coincides with what was observed for the VWM group.
Both groups show a change in AUC_T5 for the post session; however, for the RT group, this change is for a larger set size and occurs after 1 month-long training. In addition, the VWM group also shows more negative values in the right temporal region (T6), while the RT group does not.
When the two groups are compared with each other, the DTs show that the F1 scores for the pre-training session are lower than those for the mid-training session. This indicates that the features of the groups before training with the video games are practically indistinguishable, which allows us to attribute the changes in the other sessions to the training.
Finally, the DTs show that the groups differ mostly in the intermediate session for the two- and two-square conditions. Changes in the two-square condition again imply more negative values in the temporal region compared to the RT group, while the RT group also presents more positive values in the right occipital region for the same condition.
4.4. Video Game Design Factors
Contrary to what has been reported in the literature, in which commercial action [5, 14, 58] or adventure [24] video games are used for VWM training, we decided to design our own platform as a hybrid between a commercial video game and a cognitive trainer. This gave us the freedom to modify various aspects of the game, such as its visual components and its difficulty, and to add competitive factors such as a scoring table that lists the best scores obtained by the participants. In addition, it allowed us to control the game time of the participants, since they could do the training from home at the time they preferred. The aim was for participants to carry out their training in the most enjoyable way possible, thus avoiding possible negative effects of perceiving the training as a tedious task. Moreover, factors such as fun, motivation to play, constant difficulty changes, and rewards have been identified as directly influencing the video game training of cognitive abilities [59, 60, 61]. The use of hybrid platforms has become popular because of their positive results in cognitive training [62]. The changes in the Cowan estimator show an improvement for the mid-training session, while they tend to return to baseline for the post-training session. This may be because, in general, the participants completed fewer levels in the second month of training than in the first. The levels of the second training period were more difficult, and the difficulty increased monotonically; that is, the game became more complex than it already was, which could have led some participants to play for less time. Perhaps dynamic difficulty changes should be considered, that is, varying the difficulty non-progressively so that the game does not remain simple or complex for a long period of time. This could help maintain the participants’ interest in continuing their training. The exploration of this relationship in the present work is limited, since there is no exhaustive information from the participants describing how they felt when performing the training.
4.5. Limitations
While the present study provides a novel exploration of the benefits of working memory training by combining ERP measures and machine learning, it has limitations. Research has shown that participants tend to benefit more from cognitive training that adapts its difficulty based on their individual performance. In other words, the training difficulty should change in response to how well participants are performing. In this study, however, we used progressive changes in difficulty, where the difficulty varied as a function of time rather than participant performance. The lack of cognitive benefits in the post-training session may be attributed to this design, as participants probably experienced a lack of motivation to perform their training or were faced with a task that was too complex to solve. The change in difficulty could be modulated through performance data recorded in the simple video game or through a cognitive estimation algorithm; the latter would involve parallel EEG monitoring during training, which could negatively affect the participant. However, in order to determine the cause of the lack of benefit, it would be necessary to implement exit surveys that would provide us with information on the participant’s perception of the training video game, the change detection task, and video game training in general. Also, the transfer effects of the proposed training need to be assessed. Participants were trained with a memory video game and tested on a change detection task. However, it is necessary to assess whether the memory benefits extend to complex tasks that require other cognitive domains, such as attention, and that are closer to participants’ everyday tasks. Likewise, a non-immediate post-training evaluation session is needed at least three months after training in order to monitor changes and whether they are maintained as a long-term benefit. We believe that the major limitations of our work are the sample size and the lack of blinding in the experimental design.
The relatively small sample can explain the lack of robust behavioral (accuracy, RT, and estimated VWM capacity) and electrophysiological (NSW) results. We tried to address this problem with the ML analysis, which used the individual NSW epochs of each participant and thereby multiplied the available data, allowing us to find more robust results about the change in the NSW in the temporoparietal regions. In turn, the experimental design was such that, from the beginning of the study, participants knew the experimental condition, the control condition, and which one they belonged to. The lack of blinding may have affected our results, as participants’ behavior could have been biased, even unconsciously. Therefore, it would be necessary to thoroughly examine the effect of this part of the experimental design in order to increase the robustness of our results.
The present study uncovered minor changes in the dependent measures of the CDT. However, the observation of these benefits in both the VWM group and the RT group raises the question of a potential test–retest effect related to the change detection task, a phenomenon previously documented in cognitive research [5, 21, 63]. This test–retest effect could lead to improved performance due to familiarity rather than the training intervention itself. To further investigate this possibility, a test–retest reliability analysis [5] could help determine the extent to which the observed enhancements are influenced by this effect. Experimental design, control tasks, and group assignment need careful consideration in training paradigms where a test–retest effect is possible. Addressing these concerns and utilizing robust methodologies is crucial in drawing meaningful conclusions from cognitive training studies. Nevertheless, we found that the VWM group became faster at making correct rejections in the CDT, a behavior that was not present in the RT group. In addition, the changes in brain electrical activity in the temporoparietal region in the same group could indicate that training with the memory video game brought behavioral benefits and neural changes to the group that trained with it.
Finally, our study lacks additional neuropsychological measures of working memory capacity or other executive functions, such as attentional control, which would allow a more precise characterization of our sample and differential analysis of the effect of training in relation to another parameter. While the participants in the present study were young cognitively normal individuals, we believe that the training model could be applied to explore improvement capacity in patients who have some deficits in WM capacity or other cognitive deficits that correlate with WM capacity.