Computer Assisted Learning - 2022 - Hahnel - Patterns of Reading Behaviour in Digital Hypertext Environments
DOI: 10.1111/jcal.12709
ARTICLE
1 DIPF, Leibniz Institute for Research and Information in Education, Frankfurt, Germany
2 Centre for International Student Assessment (ZIB), Frankfurt am Main, Germany
3 Assessment and Psychometric Research Division, Australian Council for Educational Research (ACER), Camberwell, Australia

Correspondence
Carolin Hahnel, DIPF, Leibniz Institute for Research and Information in Education, Rostocker Strasse 6, 60323 Frankfurt am Main, Germany.
Email: [email protected]

Abstract
Background: Computer-based assessment allows for the monitoring of reader behaviour. The identification of patterns in this behaviour can provide insights that may be useful in informing educational interventions.
Objectives: Our study aims to explore what different patterns of reading activity exist, and investigates their interpretation and consistency across different task sets (units), countries, and languages. Three patterns were expected: on-task, exploring and disengaged.
Methods: Using log data from the PISA 2012 digital reading assessment (9226 students from seven countries), we conducted hierarchical cluster analyses with typical process indicators of digital reading assessments. We identified different patterns and explored whether they remained consistent across different units. To validate the interpretation of the identified patterns, we examined their relationship to performance and student characteristics (gender, socio-economic status, print reading skills).
Results and Conclusions: The results indicate a small number of transnational clusters, with unit-specific differences. Cluster interpretation is supported by associations with student characteristics; for example, students with low print reading skills were more likely to show a disengaged pattern than proficient readers. Exploring behaviour tended to be exhibited only once across the three units: It occurred in the first unit for proficient readers and in later units for less skilled readers.
Major Takeaways: Behavioural patterns can be identified in digital reading tasks that may prove useful for educational monitoring and intervention. Although task situations are designed to evoke certain behaviours, the interpretation of observed behavioural patterns requires validation based on task requirements, assessment context and relationships to other available information.
KEYWORDS
digital reading, hypertext, log data analysis, PISA, reading process
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,
provided the original work is properly cited.
© 2022 The Authors. Journal of Computer Assisted Learning published by John Wiley & Sons Ltd.
(Rouet et al., 2017). Such decisions generate differences in what readers search for and select in task-oriented settings (Foltz, 1996; Vidal-Abarca et al., 2010), which creates traces in process data that can be recorded during a reading situation (e.g., tracking mouse clicks in log files; for an overview, see Goldhammer et al., 2021). Investigating reading behaviour with process data can uncover how students approached a task and reveal successful strategies or difficulties they have experienced. This new knowledge about student activities and associated patterns may provide helpful information to target remediation (e.g., by individualizing instructions or providing feedback).

The present study uses process indicators derived from log data to examine behavioural patterns that may emerge from these indicators. Specifically, we consider process indicators of navigation (hypertext coverage, precision of navigation) and time allocation (processing time, orientation time) for three digital reading units (i.e., hypertext stimuli with three comprehension items each). Previous studies have made significant contributions by empirically investigating the predictive value of process indicators for comprehension outcomes. However, they often focused on examining single indicators without considering broader patterns emerging in specific contexts (see Rouet et al., 2017). Therefore, based on process indicators of digital reading, our study explores if different meaningful activity patterns can be identified. Moreover, it compares the occurrence of identified patterns in seven countries and three languages, examines whether students stick consistently to certain patterns and investigates the validity of the patterns' interpretation by examining their relationship to other variables (i.e., gender, socio-economic status, print reading skills).

1.1 | Reading and learning with digital text

Digital reading refers to reading activities in digital environments that are often non-linearly structured and require the use of links to access different text pages (Wiley et al., 2018). When dealing with digital hypertext, readers face ‘a series of unknowns related to possible links, possible texts, possible decisions and possible interactions [where they] must ignore distractions, anticipate and predict meaningful moves with minimal text information’ (Afflerbach & Cho, 2008, p. 81ff.). Contemporary frameworks addressing digital reading situations acknowledge these specific demands and highlight the multiple actions that readers must perform. For example, the new literacies perspective on online research and comprehension (Leu et al., 2014) emphasizes that digital reading typically occurs as part of an inquiry and learning process, as the Internet is often used as an information resource. Given a specific question, students engage in locating information online, critically evaluating information, synthesizing information and communicating results. A similar stance is taken in the IPS-I-model (Brand-Gruwel et al., 2009), which describes digital reading situations as information problems. To solve an information problem, readers make use of several constituent skills, including those for searching for information, making initial judgements about the usefulness of information, deeply processing selected information and synthesizing the information gathered. Salmerón et al. (2018) summarized previous research more broadly, distinguishing comprehension processes in digital reading that relate to navigating a hypertext, integrating information from multiple sources, and evaluating information for relevance and trustworthiness.

As adolescents and young adults grew up with information and communication technologies (ICTs), they are often believed to be naturally proficient digital readers, but their behaviour does not necessarily live up to this expectation (Brun-Mercer, 2019; Lumley & Mendelovits, 2012). In an early attempt to shed light on how students acquire information when learning from hypertext, Lawless and Kulikowich (1996) classified readers according to their behaviour in a hypertext of 150 pages. They instructed 42 participants to study a provided subject ‘as carefully as they could’ (p. 390) and identified three behavioural patterns: Knowledge seekers generally showed a high rate of node visits, which the authors interpreted as them trying to gather as much information as possible. Feature seekers, in contrast, invested more time in trying to understand how the hypertext worked and what kinds of screens it had. The students with the third pattern, the apathetic users, spent little time in the hypertext, did not use many special features and inspected few, if any, pages. After learning, the students performed on a multiple-choice recall test, with the knowledge seekers outperforming the other two clusters, while the apathetic users showed the poorest performance.

1.2 | Assessment of digital reading behaviour

Digital reading is a key competence to participate in many societal activities, and therefore of interest to comparative educational large-scale assessments, such as the Programme for International Student Assessment (PISA; e.g., Lim & Jung, 2019; Naumann & Sälzer, 2017; OECD, 2015). In the last decade, large-scale assessments have shifted towards technology-based assessments. In this context, process data have proven insightful for explaining differences in competence measures (see Goldhammer et al., 2021) or improving their reliability (Ramalingam & Adams, 2017). Process data from reading assessments are often investigated in terms of navigation and time allocation, investigating general reading processes (e.g., skilled readers are fast readers; Perfetti, 2007) or digital reading-specific processes (e.g., skilled readers are able to locate task-relevant websites; e.g., Hahnel et al., 2016; Sullivan & Puntambekar, 2015).

1.2.1 | Indicators of navigation

The need for navigation is a major difference between print and digital reading (Salmerón et al., 2018). When reading print text, the page and chapter structure suggests a natural order, whereas in a hypertext, readers actively select their own pathway by choosing which links to click on (e.g., OECD, 2010; Reinking, 1997; Rouet & Levonen, 1996). Accordingly, simple indicators, such as the number of navigation steps, can provide insight into how readers locate information. Often, such indicators also incorporate which websites contain information relevant for a task at hand in order to get in-depth measures of task-relevant navigation (or task-irrelevant steps, as less-skilled readers are more prone to process distracting information than skilled readers;
13652729, 2023, 3, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jcal.12709 by National Kaohsiung Normal, Wiley Online Library on [04/09/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
HAHNEL ET AL. 739
Cerdán et al., 2011). While the exact definition of ‘task-relevant’ has differed, there is a consensus among researchers as to its importance. Unsurprisingly, research on task-relevant navigation suggests a strong relationship with proficiency in digital reading, even after controlling for print reading skill (Hahnel et al., 2016; Naumann, 2015; Naumann & Salmerón, 2016; OECD, 2011, 2015; Sullivan & Puntambekar, 2015). However, task-specific effects may occur (e.g., the magnitude of the association between task-relevant navigation and digital reading performance depends on the navigation intensity of tasks; Naumann, 2015).

1.2.2 | Indicators of time allocation

Time allocation is often assessed as the time a reader has taken to complete a reading item (i.e., from the beginning of the item until the next item is triggered). This processing time, or time on task, is affected by individual characteristics (e.g., comprehension ability, knowledge of reading strategies, reading enjoyment) and the demands of an item (e.g., difficulty; Naumann, 2019). The relationship of processing time with task performance (‘time on task effect’) can differ depending on the involvement of automatic cognitive processes or more demanding ones (Goldhammer et al., 2014). In the former case, a negative relationship between time on task and reading performance can be expected, whereas the relationship should be positive when higher order cognitive processes are involved. Based on their results, Naumann and Goldhammer (2017) argue that PISA digital reading tasks, particularly those with a high navigation demand, cannot be completed using automatic processes. A slightly different interpretation is suggested by OECD (2015), namely that while for some items working faster might be associated with improved performance, in others, working too fast might indicate a lack of focus.

A second, less considered indicator of time allocation is orientation time (i.e., the time from the start of an item to when the first click occurs). While there is little work making use of this indicator in digital reading research, it has been used more widely in other domains. Problem solving research suggests that able problem solvers invest more time in analysing a problem prior to enacting a solution (e.g., Eichmann et al., 2019; Whimbey & Lochhead, 1991). Zoanetti (2010) offers an alternative view, that, rather than orientation time having a strictly positive relationship with performance, there exists an optimum point. He reached this conclusion after noting that for some students, a period of extended lack of activity at the beginning of the problem signified that they had not understood what to do. Case studies of student navigation behaviour in the PISA digital reading items (OECD, 2015) support this view by identifying a negative relationship between orientation time (or as known in that work ‘initial reaction time’) and digital reading performance in a relatively simple item.

digital reading assessment (i.e., three hypertexts, each with three comprehension items). Instead of providing an open learning activity, the PISA digital reading tasks require students to actively search for, select and process digital information to meet specific task demands (task-oriented reading, Vidal-Abarca et al., 2010). We aim to explore patterns that reveal how students cope with the task requirements and hypertext structures in the PISA digital reading units and, moreover, investigate whether identified patterns can be generalized across different hypertexts, countries and languages. As process indicators, we examined at the unit-level how intensively a hypertext stimulus was explored (hypertext coverage), and at the item level how closely navigation behaviour adhered to task-relevant pages and did not deviate to irrelevant pages (precision of navigation). Concerning time allocation, we considered processing time and orientation time as indicators.

Based on previous work, we expected to find at least three patterns:

• An on-task pattern, in which student behaviour meets the requirements of respective tasks (i.e., navigation indicators reflect that mostly task-relevant navigation steps were performed; processing times reflect an adjustment to task demands with less-complex tasks being processed faster than more-complex tasks).
• An exploring pattern, in which student processing activities show a high coverage of hypertext pages as well as low navigational precision and long processing times in the first item of a digital reading unit.
• A disengaged pattern in which students show generally low activity in navigation, short processing times and, given the processing time, relatively long orientation times, potentially indicating struggle or disinterest.

Based on these considerations, our study explores the following questions:

1. Can the expected reading patterns be identified for different digital reading units?
2. Are the identified reading patterns similar across different countries and languages?
3. To what extent do students stick to specific behavioural patterns across digital reading units?
4. How do students' patterns relate to performance in digital reading and other student characteristics (gender, socio-economic status, print reading skills)?

2 | METHOD
                                                        % female           Age                         ESCS
Country  Language  n original  n clean  % original  Original  Clean   Original     Clean         Original     Clean
AUS      English   3937        3113     79.1        47.7      48.3    15.8 (0.00)  15.8 (0.01)   0.24 (0.01)  0.26 (0.01)
IRL      English   855         669      78.2        52.2      53.5    15.7 (0.01)  15.7 (0.01)   0.14 (0.02)  0.16 (0.02)
USA      English   843         703      83.4        50.7      51.4    15.8 (0.01)  15.8 (0.01)   0.17 (0.03)  0.19 (0.03)
CAN      English   2634        1862     70.7        50.3      50.1    15.8 (0.01)  15.8 (0.01)   0.40 (0.03)  0.41 (0.04)
CAN      French    901         619      68.7        50.8      52.7    15.8 (0.01)  15.8 (0.01)   0.45 (0.03)  0.45 (0.03)
FRA      French    1010        744      73.7        50.4      51.1    15.9 (0.01)  15.9 (0.01)   0.05 (0.02)  0.03 (0.03)
AUT      German    900         744      82.7        51.6      53.0    15.8 (0.01)  15.8 (0.01)   0.15 (0.03)  0.15 (0.03)
DEU      German    979         772      78.9        49.1      49.9    15.8 (0.01)  15.8 (0.01)   0.16 (0.03)  0.19 (0.04)
Note: AUS, Australia; AUT, Austria; CAN, Canada; DEU, Germany; FRA, France; IRL, Ireland; USA, United States of America.
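
The plausibility criteria behind the cleaned sample in the table above (no missing item events, positive processing times, and processing times no more than three SD above the country/language-specific mean) could be applied along the following lines. This is a minimal Python sketch under stated assumptions, not the procedure used in the study: the record layout and all names are hypothetical, the threshold is applied to each student's total processing time, the sample SD is used, and the check for duplicated events is omitted. The source does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev


def clean_cases(cases, n_sd=3.0):
    """Return the cases that pass the plausibility checks.

    Each case is a dict with two hypothetical keys:
      'group': country/language label, e.g. 'CAN-French'
      'times': item processing times in seconds (None = missing event)
    """
    # Step 1: drop cases with missing, zero, or negative processing times.
    plausible = [
        c for c in cases
        if all(t is not None and t > 0 for t in c["times"])
    ]

    # Step 2: collect total processing times per country/language group.
    totals = defaultdict(list)
    for c in plausible:
        totals[c["group"]].append(sum(c["times"]))

    # Step 3: group-specific cut-off at mean + n_sd * SD
    # (sample SD; assumed, the source does not say which SD was used).
    cutoff = {}
    for group, ts in totals.items():
        if len(ts) > 1:
            cutoff[group] = mean(ts) + n_sd * stdev(ts)
        else:
            cutoff[group] = float("inf")  # no SD for a single case

    # Step 4: keep cases at or below their group's cut-off.
    return [c for c in plausible if sum(c["times"]) <= cutoff[c["group"]]]
```

Because the cut-off is computed from the same data it filters, a single extreme case can only exceed it when the group is reasonably large; with the group sizes reported in the table this is not a practical concern.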
year cycles. In PISA 2012, the participating countries had the option to additionally assess student performance in digital reading, computer-based mathematics and problem solving in a computer-based study. This study comprised a 20-min tutorial and two 20-min clusters of items focusing on one or two of the assessed domains. The participating students were randomly assigned to one of 24 testlets, eight of which contained the digital reading cluster CR2 (i.e., second cluster of the Computer-based Reading material, see Section 2.2). The OECD released the items from CR2. The result and log data of participating countries are publicly available on the OECD website.1

We selected seven countries with a comparable cultural background that carried out the PISA study in English, French or German. In these countries, a total of 12,059 students had been assigned to work on the PISA digital reading cluster CR2, with 86.4% of students having scores on all items. We only regarded data of students whose log data were complete and fulfilled certain plausibility criteria. That is, we excluded cases with missing or duplicated item start and end events, and with zero or negative processing times (1.6% of the original sample). Additionally, we excluded cases that showed processing times beyond a country/language-specific threshold of three SD above the mean (8.2% of the original sample).

After cleaning, the data of 9226 students were used (76.5% of the original sample). The sample included 50.3% female students. Students were on average 15.8 years old (SD = 0.29) and their average on the PISA index of economic, social and cultural status (ESCS) was 0.25 (SD = 0.84, Min = −3.42, Max = 3.12). Table 1 presents further details of the sample. A comparison of the sample characteristics gives no indication that data cleaning systematically biased the sample.

2.2 | Material

The released digital reading material contained nine items organized in three units. The units provided a hypertext stimulus shared between items and varied in terms of how much material was accessible to students. The units and items follow a fixed order and students could not return to an item once they had moved on to the next item (OECD, 2014, p. 36). Table 2 provides an overview of unit and item characteristics. A brief description of the units' hypertexts follows. Not mentioned are the back and forward buttons, which were available in every unit. These buttons allowed students to move around within the hypertext, but did not allow them to move between items or units in the assessment.

The unit Language Learning presented a fictitious social media platform, in which students were logged in as ‘Rafael’ and had access to his profile information, friends list, messages and more. The first two items started by displaying the platform's welcome page; the last item started displaying Rafael's messages. Across the top of all pages was a navigation bar with seven active links which were the main means of navigating within the unit. On most pages new links were available that led to other sites without further new links.

The unit Sports Club opened with an email exchange between two girls who discuss joining a gym. There were four active links included in the exchange that led to the homepages of four gyms (see Figure 1). When clicked, each of these links opened a new tab. Students could navigate back to the content of the email exchange using the tabs. Of the four gym homepages, two contained no further links. The other two contained additional links that opened new pages within the website.

The unit Seraing included the fictitious homepage of the town Seraing, with a total of 36 clickable links on this page, appearing as either hyperlinks or in drop-down menus. Only three of these links led to further content, with the remaining 33 leading to a generic ‘no content’ page and a link that students could click to return to the previous page. The main active link from the town homepage took students to the homepage for the Community Cultural Centre, a new website which opened in a new tab. This website had further content available, including seven new links. Of these seven, two led to further content, while the other five again led to a generic ‘no content’ page. The hypertext of Seraing was the most complex one of the three units. The first two items started with the town's homepage. The last item introduced an additional email portal as start page.

1 http://www.oecd.org/pisa/pisaproducts/database-cbapisa2012.htm.
Columns per item: Item | Task | Response format | No. of links | No. of pages | No. rel. pages | % correct (partial credit) | % correct (full credit)

Unit CR017, Language Learning
Context: fictitious social media platform languagelearning.com; logged in as ‘Rafael’
Overall min. coverage required: 15.7% (3 of 19 pages)
CR017Q01 | identify what kind of service the platform provides | Multiple choice | 22 | 19 | 1 | - | 50.9
CR017Q04 | locate specific information in Rafael's profile | Multiple choice | 22 | 19 | 2 | - | 92.5
CR017Q07 | reflect on a scam message that Rafael had received in his inbox | Open text | 22 | 19 | 1 | 26.0 | 40.9

Unit CR013, Sports Club
Context: email exchange between two girls who want to join a gym; four links to gym homepages are provided in the e-mail; the girls express several preferences (e.g., swimming and soccer, a price limit)
Overall min. coverage required: 66.7% (8 of 12 pages)
CR013Q01 | find out on what day the girls can meet for the gym | Multiple choice | 11 | 12 | 1 | - | 70.6
CR013Q04 | identify the cheapest gym | Multiple choice | 11 | 12 | 8 | - | 72.3
CR013Q07 | give a reasoned recommendation for a gym | Open text | 11 | 12 | 5 (a) | 12.1 | 43.1

Unit CR002, Seraing
Context: fictitious homepage of the town Seraing (5 pages; only first and second item); homepage for the Community Cultural Centre (18 pages); e-mail portal (5 pages; only third item)
Overall min. coverage required: 46.1% (12 of 26 pages)
CR002Q01 | find information about local festivities (town's homepage) | Multiple choice | 43 | 21 | 1 | - | 97.0
CR002Q03 | find the movie schedule (Cultural Centre's homepage) | Multiple choice | 43 | 21 | 4 (b) | - | 85.3
CR002Q05 | write an email to a friend recommending a concert (e-mail portal) | Open text | 44 | 23 | 10 (b) | 7.4 | 51.2

(a) The pages in this item contained useful information for a correct item solution, but it was not critical to visit them.
(b) In these items, there were two different ways to find the correct answer.
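
To illustrate how the page counts in the table above enter the two navigation indicators defined in Section 2.3.1, the following minimal Python sketch computes hypertext coverage and precision of navigation from a list of visited pages. The function names, the page names and the treatment of every visited page outside the relevant set as task-irrelevant are assumptions made for illustration; the original indicators were derived with the R package LogFSM, whose implementation may differ.

```python
def hypertext_coverage(visited_pages, n_available_pages):
    """Percentage of a unit's available pages that were visited at least once."""
    return 100.0 * len(set(visited_pages)) / n_available_pages


def precision_of_navigation(visited_pages, relevant_pages):
    """Opened relevant pages divided by the sum of all available relevant
    pages and opened irrelevant pages, expressed as a percentage."""
    visited = set(visited_pages)
    relevant = set(relevant_pages)
    opened_relevant = len(visited & relevant)
    opened_irrelevant = len(visited - relevant)  # assumption: any non-relevant page
    return 100.0 * opened_relevant / (len(relevant) + opened_irrelevant)


# Hypothetical click path in a unit with 19 available pages;
# repeated visits to the same page count once.
path = ["welcome", "profile", "messages", "welcome"]
print(round(hypertext_coverage(path, 19), 1))  # 3 of 19 unique pages

# Hypothetical item with two task-relevant pages; 'welcome' counts as irrelevant.
print(round(precision_of_navigation(path, {"profile", "messages"}), 1))
```

Under this reading, precision reaches 100 only when every available task-relevant page was opened and no other page was visited, which matches the interpretation that high values reflect navigation that stays on task.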
The digital reading items were presented in either a multiple-choice or open text response format. The multiple-choice items required students to select one of four response alternatives. Responses to these items were dichotomously scored (0–incorrect, 1–correct). Items with open formats required students to write a short text response. These responses were evaluated and partial credit coded (0–no credit, 1–partial credit, 2–full credit) by trained personnel (OECD, 2014, pp. 257–276). Four items could be solved without any navigation required, since all relevant information could be retrieved from the item start page. The other items required students to visit different pages to access target information for solving the item correctly. The percentages of correct item responses indicate how difficult the items were to solve (i.e., CR002Q01 was the easiest item; CR017Q07 was the most difficult item).

2.3 | Analytical approach and measures

For the identification of different behavioural patterns, we conducted a series of cluster analyses based on sets of 10 process indicators derived from the PISA digital reading log files (see Section 2.3.1). First, country/language-specific analyses were performed per unit. Second, the analyses were repeated for all data per unit and the results were compared with the country/language-specific results. Following the approach of Lawless and Kulikowich (1996), we conducted hierarchical cluster analysis (e.g., Wierzchoń & Kłopotek, 2018) using the agglomeration method of Ward (1963) that minimizes the total within-cluster variation. That is, the individual cases were initially considered as single clusters; then the clusters closest to each other were successively grouped together in an iterative process until a single cluster emerged. Grouping was based on the dissimilarity of the clusters, which was determined using City Block (Manhattan) distance measures (i.e., the absolute distance between two vectors). Before the analyses, we z-standardized all process indicators per country/language to compare variables on a common scale. The number of clusters was determined with the R package NbClust (Charrad et al., 2014) that provides several indices to find the optimal number of clusters. The optimal number is determined by the majority principle (i.e., the solution with the most indices in favour; see documents on cluster indices in Data S1).

For cluster interpretation, we considered the mean values on the process indicators per cluster in light of the item requirements and performance measures (correct item response rates and average digital reading skill; see Section 2.3.2). Furthermore, we predicted cluster membership by means of multinomial regression analysis with the R package nnet (Venables & Ripley, 2002). The predictors were students' gender, socio-economic status (ESCS), and print reading skills. We performed two analyses per digital reading unit, since the print reading skill measure was not available for all students. Since we focused on relations at cluster level and not deriving estimates for the population, we analysed unweighted data.

2.3.1 | Measures for cluster identification

For cluster analysis, we derived 10 process indicators from the log data using the R package LogFSM (Kroehne, 2019) as follows. Note that the first indicator, hypertext coverage, was created once per unit and the other three measures three times per unit (i.e., once per item). The time indicators were additionally log transformed and
group-mean centred by country/language to account for language-related time differences.

• Hypertext coverage represents how intensively the hypertext was used. We counted how many unique pages students had visited within a unit and divided the count by the number of available pages, multiplied by 100. The higher this percentage value is, the more intensively a hypertext was explored.
• Precision of navigation is an indicator adapted from Rouet (2003) that reflects the degree of task-oriented navigation. The indicator results from dividing the number of opened relevant pages by the sum of all available task-relevant pages and other opened pages whose information does not contribute to the correct task solution, multiplied by 100. High percentage values indicate that students visited more task-relevant pages and did not wander off to pages with irrelevant content.
• Processing time is the time difference in seconds from the item start event to the item end event. The indicator represents how much time students have spent in total to complete an item.
• Orientation time is the time difference in seconds from the item start event to the first mouse click event (including clicks on links, radio buttons, text fields, or the next button). The indicator represents how much time students took to first look at the stimulus and the item instruction before starting to interact with the item.

2.3.2 | Measures for cluster interpretation

Digital reading skill
To represent students' digital reading skill, we used all response data to the 19 digital reading items from PISA's 2012 CBA study [smallest sample: n = 1146 (Canada/French); largest sample: n = 5613 (Australia)]. The item responses were modelled per country/language with a partial credit model (Masters, 2010) using the R package TAM (Robitzsch et al., 2021). Weighted likelihood estimates (WLE; Warm, 1989) were used as test scores [WLE reliability between 0.65 (Ireland) and 0.72 (Germany)]. Test scores of zero indicate an average digital reading skill within countries; higher values indicate higher skill levels (min = −5.65; max = 3.56).

Print reading skill
To represent students' reading skill, we used all response data to the 44 reading items [smallest sample: n = 3152 (France); largest sample: n = 10,987 (Canada/English)], administered during the PISA 2012 main study as paper-based assessment (OECD, 2013). Note that the PISA 2012 main study took place before the CBA study. Since PISA used a rotational design to assess other competences in addition to reading, students who participated in the paper-based reading assessment did not necessarily participate in the digital reading assessment. For our data, the overlap is 6378 students (69.1% of the cleaned sample). As for the digital reading items, the item responses to the paper-based reading items were modelled per country/language with a partial credit model. Weighted likelihood estimates served as test scores [WLE reliability between 0.73 (Ireland) and 0.77 (Australia)]. Test scores of zero indicate an average paper-based reading skill within countries; higher values indicate higher skill levels (min = −5.99; max = 4.54).

3 | RESULTS

In the following, we report the results of the overall analyses. Results of country/language-specific analyses can be found in the online supplement (see results in Data S1; Section 3.2 provides a summary of the findings).

3.1 | Results of the cluster analysis

We identified five clusters for the unit Language Learning, three clusters for Sports Club and four clusters for Seraing. Tables 3–5 present the descriptive statistics of the process indicators for the overall sample and each cluster, together with information on the average item and test scores. For each cluster, we chose labels based on the extent to which the process indicators indicated noticeable differences compared to the overall sample results or those of the other clusters. In all units, we found a cluster of students whose navigation behaviour and time allocation corresponded with the demands of the respective items (e.g., longer processing times in more navigation-intensive and/or difficult items). Accordingly, we labelled these clusters on-task. In the following, we use the on-task pattern as a reference to explain our reasoning for labelling the other clusters in a unit.

For Language Learning, we labelled the identified clusters passive, on-task, exploring, hasty, and disengaged (Table 3). The passive cluster resembled the on-task pattern regarding time allocation, but these students stood out for hardly using the navigation options. Their orientation time in the last item, the scam message reflection, was also above-average, which might indicate that they were unsure how to approach this item, reinforcing the impression of passive waiting. The exploring cluster shows the anticipated pattern of the highest hypertext coverage, combined with the lowest navigational precision and longest processing time in the first item. This pattern indicates that students explored the hypertext of the unit first before they engaged in completing the items. They also seem to react quickly in the second item, potentially due to their exploration efforts in the first item. The hasty cluster featured high precision in navigation and short orientation times, especially in the first two items, indicating that these students were quick to act. Finally, the disengaged cluster showed a hypertext coverage and time allocation that is too low to reasonably deal with the item content. Their precision values were comparatively high in the first and third items; however, this is well-explained by the lack of navigation requirements within these items.

Looking at the item and test scores for Language Learning, the students of the exploring cluster were the best performers, performing even slightly better than students with an on-task pattern. The passive and the hasty clusters showed comparatively lower but still
TABLE 3 Unit Language Learning: Means and SD of the process indicators and performance scores
TABLE 4 Unit Sports Club: Means and SD of the process indicators and performance scores
average performance. The disengaged cluster performed the poorest on the digital reading items.
For Sports Club, we labelled the clusters on-task, exploring and persistent (Table 4). Similar to the exploring cluster in Language Learning, the exploring cluster of Sports Club showed the highest hypertext coverage (although not much different from that of the on-task cluster) and, for the first item, the lowest navigational precision and the longest processing time of all three clusters. Compared with the on-task cluster, the students were also faster in reacting to the second item and completed it in a shorter time, hinting again at a supporting effect of their navigation activities in the first item. The persistent cluster showed navigation activities that were far too low. The cluster's
TABLE 5 Unit Seraing: Means and SD of the process indicators and performance scores
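The cluster profiles summarised in Tables 3 to 5 stem from hierarchical cluster analyses of the process indicators. As a rough illustration of the general technique only, the following Python sketch runs agglomerative clustering on invented data. The linkage rule (complete linkage), the Euclidean distance, the function name `agglomerative`, the two indicators, and all values are assumptions made for this example; they are not taken from the study.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (complete linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per student

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # combinations() yields index pairs (a, b) with a < b.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]


# Hypothetical standardised indicators per student: (hypertext coverage, processing time).
students = [
    (1.0, 0.9), (0.9, 1.1), (1.1, 1.0),        # comparatively active profile
    (-1.0, -1.1), (-0.9, -0.8), (-1.2, -1.0),  # reduced-activity profile
]

print(agglomerative(students, n_clusters=2))
```

In practice, the number of clusters passed to such a routine is itself a decision that has to be justified, which is one of the limitations the authors discuss.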
processing times were also short, but compared with the processing times of the disengaged cluster in Language Learning, not so short that one could assume that tasks were skipped or that students gave up. This is also consistent with the persistent cluster showing comparatively long orientation times.
For Sports Club, students of the exploring cluster were less successful in correctly solving the items than students of the on-task cluster. There was also a noticeable difference in the digital reading test scores compared to the exploration cluster in Language Learning, suggesting a different cluster composition (i.e., the Sports Club explorers are not the same students as the Language Learning explorers). The item and test scores of the persistent cluster were the lowest of the Sports Club clusters, suggesting that these students experienced difficulties completing the items.
For Seraing, we labelled the clusters on-task, exploring, disengaged and lost interest (Table 5). It was noticeable that the average hypertext coverage in all clusters was lower than the required coverage, potentially indicating that with a comparatively large number of available links, students became lost. The exploring cluster exhibited similar behavioural patterns as in the units before (i.e., lowest navigation precision and longest processing time in the first item). The disengaged cluster in Seraing resembled the disengaged pattern in Language Learning (e.g., coverage that was too low, processing and orientation times that were too short). The last cluster, whose pattern we interpreted as lost interest, looked at first similar to the on-task cluster, but the cluster's activity was noticeably reduced in the last item (e.g., low navigation precision, almost halved processing time). This pattern suggests that students engaged with the items in a task-oriented way at the start of the unit, but lost interest or patience in continuing from the second item onwards.
The item and test scores in Seraing showed that students in the on-task cluster were the highest performers, closely followed by the explorers. Students showing a disengaged pattern performed very poorly on digital reading. Finally, the students of the lost interest cluster showed item success rates that were high in the first item, moderate in the second item, and nearly zero in the last item.

3.2 | Comparison across countries

The identified patterns were found to a large degree in the country/language-specific analyses (see results in Data S1). For Language Learning, the patterns passive, on-task and hasty were observed in all countries; the exploring cluster was only absent once (for Canada/English); and the disengaged cluster was not observed for the Irish, United States and Canadian/French data. Only the Canadian/English data revealed a new cluster that was similar to the passive cluster, but showed slightly more activity in navigation (though, still less compared with the on-task cluster).
The largest diversity of results was observed for the unit Sports Club. Although the on-task, exploring, and persistent clusters appeared in all countries/languages, there were some country-specific additions. For the US data, we observed an additional disengaged pattern characterized by a low navigation activity and short processing and orientation times. In the last item, though, the time indicators showed values comparable with those of the persistent cluster. The data from Austria, France and Germany also suggested a disengaged pattern but the similarities to the persistent cluster, which were observed for the US data, were less pronounced and rather indicated
FIGURE 2 Flow diagram of students' cluster membership across units. Box sizes are proportional to sample sizes within clusters; line thickness represents the sample size of students in boxes connected by the line; colours indicate students' cluster membership in the first unit
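The flows drawn in Figure 2 amount to a cross-tabulation of students' cluster labels in consecutive units. A minimal sketch of how such transition counts could be derived is given below; the helper `transition_counts` and the five student records are invented for illustration and do not reproduce the study's data.

```python
from collections import Counter

# Hypothetical cluster labels per student, one label per unit in test order
# (Language Learning, Sports Club, Seraing).
memberships = [
    ("passive", "on-task", "on-task"),
    ("passive", "on-task", "lost interest"),
    ("on-task", "on-task", "on-task"),
    ("exploring", "on-task", "on-task"),
    ("disengaged", "persistent", "disengaged"),
]


def transition_counts(rows, from_unit, to_unit):
    """Count students moving from one cluster to another between two units."""
    return Counter((row[from_unit], row[to_unit]) for row in rows)


first_to_second = transition_counts(memberships, 0, 1)
for (source, target), n in first_to_second.most_common():
    print(f"{source} -> {target}: {n}")
```

Each count corresponds to the thickness of one flow line; rare transitions show up as small counts.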
fast reaction and task processing. The Austrian data even revealed a fifth cluster, which indicated rushed behaviour in the last item: The mean values of the process indicators of this cluster resembled those of the on-task cluster; however, the navigation precision and the processing time for the last item were far lower.
For Seraing, the on-task, exploring and lost interest clusters were observed for all countries. The country/language-specific analyses revealed the disengaged cluster only for the data from Australia, Austria and Germany. New clusters were not observed.

3.3 | Cluster membership across units

We explored students' cluster membership across the digital reading units with a flow diagram. Figure 2 shows that, after the first unit, most students who had demonstrated either on-task, hasty, passive or exploring behaviour in the first unit went on to exhibit on-task behaviour in later units. In particular, the large proportion of the passive cluster switching to on-task behaviours in later units suggests a warm-up effect for these students. Interestingly, a considerable number of students classified as disengaged in the first unit also demonstrated on-task behavioural patterns later on. This flow may indicate that these students were surprised that backward navigation to previous units was not possible, although this was covered in the CBA instructions. Concerning the exploring behaviours, students were more likely to demonstrate an exploring pattern once rather than multiple times. This finding speaks against the idea that exploring might be an intra-individual, consistent task processing behaviour. Instead, it suggests that exploring in different units might result from various functions or motivations (e.g., exploring for orientation or out of curiosity or boredom).
Most students who showed disengaged behaviour in the first unit continued to show behaviours with a profile of reduced activity (persistent, disengaged, lost interest). The finding demonstrates a similar level of consistency in adhering to similar behavioural patterns as for the on-task behaviours. However, looking at the persistent and lost interest clusters, most members actually demonstrated a passive pattern in the first unit. Finally, other combinations of cluster transitions than those presented were possible, but, as the thin flow lines in Figure 2 suggest, they occurred rarely.

3.4 | Relations to the external measures

To further investigate the interpretation of the identified clusters, we predicted cluster membership per unit by gender, socio-economic status and print reading skills, with the reference category being ‘on-task’. The results of these multinomial regressions are reported in Table 6. The coefficients are log odds, which means that, when a predictor changes by one SD, they reflect the change in the log odds of belonging to a considered cluster rather than to the on-task cluster. For example, for Language Learning, it was more likely that a student belonged to the passive cluster than to the on-task cluster: on average (b = 1.13), when the student was female (b = 0.21), and was a student with lower socio-economic status (b = 0.21). Including print reading skills did not change this effect pattern (Intercept: b = 1.24, Gender: b = 0.31, ESCS: b = 0.10). Students with poor print reading skills also had a higher chance of belonging to the passive cluster than the on-task cluster (b = 0.21).
Overall, the results show that, rather than being in the on-task cluster, boys were more likely to show a hasty behaviour or, in later
TABLE 6 Results of multinomial regressions to predict cluster membership (reference category is ‘on-task’)

                      Language Learning                                       Sports Club                 Seraing
                      Passive       Hasty         Exploring     Disengaged    Exploring     Persistent    Exploring     Disengaged    Lost interest
Sample 1 (n = 9226)
Intercept             1.13 (0.05)   1.27 (0.10)   1.09 (0.09)   1.65 (0.11)   1.75 (0.06)   1.71 (0.06)   2.32 (0.08)   4.01 (0.18)   1.33 (0.06)
Gender (male)         0.21 (0.05)   0.55 (0.10)   0.22 (0.08)   0.61 (0.11)   0.37 (0.06)   0.57 (0.06)   0.35 (0.07)   1.20 (0.18)   0.51 (0.06)
ESCS                  0.21 (0.03)   0.25 (0.06)   0.02 (0.05)   0.47 (0.06)   0.02 (0.03)   0.55 (0.04)   0.06 (0.04)   0.49 (0.09)   0.49 (0.03)
Sample 2 (n = 6378)
Intercept             1.24 (0.07)   1.09 (0.12)   1.00 (0.11)   1.74 (0.14)   1.72 (0.07)   2.07 (0.09)   2.22 (0.10)   4.41 (0.23)   1.42 (0.07)
Gender (male)         0.31 (0.06)   0.36 (0.12)   0.16 (0.10)   0.19 (0.13)   0.33 (0.07)   0.34 (0.08)   0.33 (0.09)   0.93 (0.22)   0.31 (0.07)
ESCS                  0.10 (0.04)   0.09 (0.07)   0.02 (0.06)   0.12 (0.08)   0.01 (0.04)   0.15 (0.05)   0.09 (0.06)   0.00 (0.12)   0.14 (0.04)
Print reading skills  0.21 (0.03)   0.24 (0.05)   0.00 (0.04)   0.82 (0.05)   0.11 (0.03)   1.01 (0.04)   0.20 (0.04)   1.20 (0.08)   0.84 (0.03)

Note: Results in bold are significant with p < 0.05 or below. Country/language was included as categorical predictor, but effects are not reported in the table (reference category is Australia/English).
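Log-odds coefficients such as those in Table 6 are easier to read once converted to odds ratios or probabilities. The sketch below shows the standard conversions for a multinomial logit model with ‘on-task’ as reference category. The function names and the linear-predictor values passed to `membership_probabilities` are illustrative assumptions and are not taken from the table; 0.21 is used only as an example magnitude.

```python
from math import exp


def odds_ratio(b):
    """Multiplicative change in the odds (cluster vs. reference) per unit of the predictor."""
    return exp(b)


def membership_probabilities(log_odds):
    """Convert log odds against the reference cluster into probabilities for all clusters.

    `log_odds` maps each non-reference cluster to its linear predictor; the
    reference cluster ('on-task') has log odds of 0 by construction.
    """
    scores = {"on-task": 0.0, **log_odds}
    total = sum(exp(v) for v in scores.values())
    return {cluster: exp(v) / total for cluster, v in scores.items()}


# A coefficient of magnitude 0.21 corresponds to an odds ratio of about 1.23.
print(round(odds_ratio(0.21), 2))

# Hypothetical linear predictors for one student (illustrative values only).
probs = membership_probabilities({"passive": -1.0, "hasty": -2.0})
print({cluster: round(p, 3) for cluster, p in probs.items()})
```

A coefficient of zero leaves the odds unchanged (odds ratio of 1), which is why a value such as b = 0.00 indicates that the predictor does not distinguish the cluster from the reference category.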
units, exploring or different shades of reduced activity (disengaged, persistent, lost interest). Students with lower socio-economic status were less likely to be in the on-task or exploring clusters than in any other cluster. Some of these effects of gender and socio-economic status were explainable by considering print reading skills, though.
As one might assume, students with lower print reading skills tended to be part of any other cluster than the on-task cluster in each unit. It is noteworthy, though, that print reading skills were not predictive for an on-task versus exploring cluster membership in the first unit (b = 0.00). However, they became increasingly predictive for this distinction in later units (b = 0.11 and b = 0.20), indicating different cluster compositions of the three exploring clusters, as did the comparison of the clusters' average digital reading test scores.

4 | DISCUSSION

The present study explored behavioural patterns that students exhibit when they need to actively search for, select and process digital information in a task-oriented way. Based on log data, we created process indicators (hypertext coverage, precision of navigation, processing time, orientation time) that refer to students' processing of multiple items within three different digital reading units. Their analysis revealed multiple clusters per unit. Some of these clusters could be summarized as on-task, exploring and disengaged across units, while others were unit-specific (passive, hasty, persistent, lost interest) or indicated country/language-specific cluster collapses and differentiations. The investigation of cluster composition by gender, socio-economic status and print reading skills, as well as cluster membership across units provided further insight for the interpretation of the clusters. In the following, we discuss the results in terms of lessons for the cognitive assessment of digital reading and in light of educational monitoring.

4.1 | Key results and implications

The study results show that clusters indicating certain activity patterns are identifiable for students who deal with hypertext. While these clusters indicate context-dependency, they also seem to be roughly comparable to those derived from dealing with an open learning task (Lawless & Kulikowich, 1996). On-task behaviours were observed in all units for a large part of students, and the flow of cluster membership across units indicated that most students, independent of their behaviour in the first unit, went on to exhibit on-task behaviour in later units. This is good news for educational monitoring, as it supports the assumption that students will try to show what they know or can do. Many large-scale programs, such as PISA, are so-called low-stakes assessments. That is, their results are not used to make far-reaching educational decisions for individuals (however, they are high stakes in terms of policy conclusions; Singer et al., 2018). Accordingly, effects of disengagement during test-taking are critically discussed, since associations between low skill and test-taking disengagement raise questions about whether test scores may underestimate true skill levels (e.g., Goldhammer et al., 2017). The high comparability of the country/language-specific results is also noteworthy. It stresses that there are sets of response processes in the digital reading assessment that are invariant between countries/languages, providing indirect evidence for the comparability of test scores across countries (see Vandenberg & Lance, 2000).
In particular, the differentiation between the passive and hasty clusters may prove helpful for intervention. These clusters highlight a lack in specific features (passive: navigation activity; hasty: time allocation) where students deviate from the supposed ideal of on-task behaviour. Although we did not explicitly observe passive or hasty clusters in other units than Language Learning, their absence does not mean that students did not show comparable behaviours at all. A general problem with explorative approaches, such as cluster analysis, is that only clusters are identified that are represented in the data prominently
enough. Therefore, country/language-specific clusters may be observed that did not emerge in the transnational analyses. From a theoretical point of view, it might be possible that passive or hasty behaviours result from overconfidence in one's assessment of task requirements or comprehension of a text (e.g., Cerdán et al., 2011; Rouet et al., 2017). This interpretation is supported by their digital reading performance, which was lower than that of students in the on-task cluster. For educational purposes, it could be fruitful to predefine specific activity patterns that, when detected, raise adaptive feedback or prompts to support learners. For example, it would be useful to develop a mechanism based on disengaged patterns to detect disengaged test taking and intervene to motivate test-takers or determine reasons for their disengagement (see Wise et al., 2019). It is important to remember that making maximal use of such activity patterns requires careful preparation during item and test development and piloting.
The exploring clusters illustrate most clearly the value of validating the interpretation of the clusters. Based on their description, the exploring clusters are highly similar across units. Interestingly, exploration in the sense of feature seeking after Lawless and Kulikowich (1996) does not seem to be pronounced in any of the clusters, which seems plausible as the use of hypertexts is not novel anymore (e.g., due to higher levels of familiarity with internet-based activities). Nevertheless, there are striking differences in the compositions of these clusters across units. While there were no statistically significant differences between the exploring and on-task clusters in the first unit in terms of digital reading performance, gender, socio-economic status and print reading skills, students in the exploring clusters in later units showed lower skill in both digital and print reading and were more likely to be boys. It was also rare for students to exhibit exploring behaviours in more than one unit. While the synthesis of the findings suggests that the exploring cluster in the first unit might have used the hypertext for orientation, or because of curiosity or task novelty, it suggests for later units that students had other reasons or motives for their actions (e.g., losing focus, not understanding the task, getting tired, being bored). Consequently, patterns of similar behaviours provide different interpretations, making a unit-aggregated analysis of this behaviour inconclusive and requiring a renaming of these clusters. Future research needs to validate our conclusion, for example, by experimentally rotating units within clusters and observing if the compositions for a certain unit would change across rotations.
The clusters disengaged, persistent and lost interest raise questions, but are not without insight. They showed significantly reduced activity in task processing compared with the other clusters; the question is why. Previous research (e.g., Hahnel et al., 2016; Lim & Jung, 2019; Naumann & Salmerón, 2016; Sullivan & Puntambekar, 2015) identifies low reading skills as one explanation, which is also supported by our analyses. However, it remains unclear whether understanding the hypertext is challenging for students or whether they already struggle with understanding the task. Especially the longer orientation times in the persistent and the lost interest clusters could be interpreted as an indication of the latter (see OECD, 2015; Zoanetti, 2010), whereas the short orientation times in the disengaged cluster even speak against an engagement with task instructions. Other explanations are also conceivable. For example, since Seraing was the last digital reading unit in a 20-min session, one might argue that students in the lost interest cluster may have spent too much time on previous items and ran out of time. However, their low performance in digital and print reading suggests otherwise. In an additional post-hoc analysis (see Data S1 ‘Post-hoc-analysis.pdf’), we also could not find evidence for this interpretation: Quite the opposite, students in the lost interest group spent significantly less time on all digital reading units than the adaptive and exploring groups. Alternatively, poor self-regulatory or ICT skills, lack of interest in the unit, or other motivational effects may contribute to students showing a pattern of reduced activity (e.g., Naumann, 2015, 2019). Such effects might even explain the differences that we observed between clusters in terms of socio-economic status (Lim & Jung, 2019).
All in all, our study emphasizes that both task requirements and the assessment context must be considered when interpreting log data (e.g., Goldhammer et al., 2021; Maddox et al., 2018; Rouet et al., 2017). While simple indicators provide insight into how students interact with digital reading tasks, depending on the task/context, they may not allow for unambiguous inferences at the individual level. For example, the processing time of the exploring students in the first item of the Language Learning unit would not directly allow conclusions about their reading efficiency. Proficient readers can be expected to process simple reading tasks faster than less proficient readers (e.g., Goldhammer et al., 2014). However, the processing time of exploring students, who were good readers, was substantially higher than that of students who were poor readers. Only with the additional information that they spent much more time in accessing other pages than other students does it become clear that this time indicator does not only capture the time needed for the cognitive processing of the item.

4.2 | Limitations

Apart from the discussed ambiguities in interpreting the clusters, other limitations deserve attention. First, the primary focus of PISA 2012 was on the reliable assessment of student competencies, with the computer-based assessment being an available add-on option. The recording and analysis of log data were considered a by-product of potential interest rather than a data source allowing for substantial conclusions. There are indications that this affected the raw data quality. For example, navigation between tab structures did not seem to be recorded consistently. Therefore, we used count indicators of page occurrences instead of sequence indicators. However, with this concern for data completeness in mind, one might wonder whether certain cluster structures would have emerged with more reliable log data records (e.g., passive pattern, which is characterized by a lack of navigation activity).
Second, we decided to use a cluster-analytic approach to distinguish between different activity patterns. Although this approach is suitable for uncovering structures in complex data, it requires numerous decisions. For example, the number of clusters must be
determined before the analysis and can rarely be determined unambiguously. In addition, our results showed high variation in the process indicators within individual clusters. This indicates that the behaviour of individuals within a cluster can significantly differ from one to another. Cluster analysis does not provide any information about how likely individuals belong to different clusters, which might facilitate conclusions about representative behaviours within clusters. Latent class analysis is a viable alternative, but comes with the drawback that students are not assigned to fixed clusters but receive probabilities of belonging to a cluster, which may complicate the interpretation of further analyses, such as from multinomial regression. However, despite these limitations, our study still provides value in demonstrating that the analysis and interpretation of process data benefits from thorough considerations of task demands and the assessment context (see Goldhammer et al., 2021). In-depth knowledge about resulting activity patterns may help guide students in their learning process and ensure that data that is maximally useful for educational interventions is collected in assessment situations.

ACKNOWLEDGMENT
This research was funded by the Centre for International Student Assessment (ZIB), Germany and the Australian Council for Educational Research (ACER), Australia.

CONFLICT OF INTEREST
The authors declare no conflicts of interest.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/jcal.12709.

DATA AVAILABILITY STATEMENT
The PISA 2012 database is publicly available on the homepage of the OECD (https://www.oecd.org/pisa/data/pisa2012database-downloadabledata.htm). Other study materials are available under https://osf.io/zrg8t/.

ORCID
Carolin Hahnel https://orcid.org/0000-0003-2394-3944
Dara Ramalingam https://orcid.org/0000-0002-0612-0092
Ulf Kroehne https://orcid.org/0000-0002-0412-169X
Frank Goldhammer https://orcid.org/0000-0003-0289-9534

REFERENCES
Afflerbach, P., & Cho, B.-Y. (2008). Identifying and describing constructively responsive comprehension strategies in new and traditional forms of reading. In S. E. Israel & G. G. Duffy (Eds.), Handbook of research on reading comprehension (pp. 69–90). Routledge.
Brand-Gruwel, S., Wopereis, I., & Walraven, A. (2009). A descriptive model of information problem solving while using internet. Computers & Education, 53(4), 1207–1217. https://doi.org/10.1016/j.compedu.2009.06.004
Brun-Mercer, N. (2019). Online reading strategies for the classroom. English Teaching Forum, 57(4), 2–10.
Cerdán, R., Gilabert, R., & Vidal-Abarca, E. (2011). Selecting information to answer questions: Strategic individual differences when searching texts. Learning and Individual Differences, 21(2), 201–205. https://doi.org/10.1016/j.lindif.2010.11.007
Charrad, M., Ghazzali, N., Boiteau, V., & Niknafs, A. (2014). NbClust: An R package for determining the relevant number of clusters in a data set. Journal of Statistical Software, 61(6), 1–36.
Eichmann, B., Goldhammer, F., Greiff, S., Pucite, L., & Naumann, J. (2019). The role of planning in complex problem solving. Computers & Education, 128, 1–12. https://doi.org/10.1016/j.compedu.2018.08.004
Foltz, P. (1996). Comprehension, coherence, and strategies in hypertext and linear text. In J.-F. Rouet, J. J. Levonen, A. Dillon, & R. J. Spiro (Eds.), Hypertext and cognition (pp. 109–136). Lawrence Erlbaum Associates.
Goldhammer, F., Hahnel, C., Kroehne, U., & Zehner, F. (2021). From byproduct to design factor: On validating the interpretation of process indicators based on log data. Large-Scale Assessments in Education, 9(1), 1–25. https://doi.org/10.1186/s40536-021-00113-5
Goldhammer, F., Martens, T., & Lüdtke, O. (2017). Conditioning factors of test-taking engagement in PIAAC: An exploratory IRT modelling approach considering person and item characteristics. Large-Scale Assessments in Education, 5, 1–25. https://doi.org/10.1186/s40536-017-0051-9
Goldhammer, F., Naumann, J., Stelter, A., Tóth, K., Rölke, H., & Klieme, E. (2014). The time on task effect in reading and problem solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. Journal of Educational Psychology, 106(3), 608–626.
Hahnel, C., Goldhammer, F., Naumann, J., & Kroehne, U. (2016). Effects of linear reading, basic computer skills, evaluating online information, and navigation on reading digital text. Computers in Human Behavior, 55, 486–500. https://doi.org/10.1016/j.chb.2015.09.042
Kroehne, U. (2019). LogFSM: Analyzing log data from educational assessments using finite state machines (LogFSM). http://www.logfsm.com
Lawless, K. A., & Kulikowich, J. M. (1996). Understanding hypertext navigation through cluster analysis. Journal of Educational Computing Research, 14(4), 385–399. https://doi.org/10.2190/DVAP-DE23-3XMV-9MXH
Leu, D. J., Forzani, E., Rhoads, C., Maykel, C., Kennedy, C., & Timbrell, N. (2014). The new literacies of online research and comprehension: Rethinking the Reading achievement gap. Reading Research Quarterly, 50(1), 37–59. https://doi.org/10.1002/rrq.85
Lim, H. J., & Jung, H. (2019). Factors related to digital reading achievement: A multi-level analysis using international large scale data. Computers & Education, 133, 82–93. https://doi.org/10.1016/j.compedu.2019.01.007
Lumley, T., & Mendelovits, J. (2012). How well do young people deal with contradictory and unreliable information on line? What the PISA digital reading assessment tells us. Annual Conference of the American Educational Research Association (AERA), Vancouver.
Maddox, B., Bayliss, A. P., Fleming, P., Engelhardt, P. E., Edwards, S. G., & Borgonovi, F. (2018). Observing response processes with eye tracking in international large-scale assessments: Evidence from the OECD PIAAC assessment. European Journal of Psychology of Education, 33(3), 543–558. https://doi.org/10.1007/s10212-018-0380-2
Masters, G. N. (2010). The partial credit model. In M. L. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models (pp. 109–122). Routledge.
Naumann, J. (2015). A model of online reading engagement: Linking engagement, navigation, and performance in digital reading. Computers in Human Behavior, 53, 263–277. https://doi.org/10.1016/j.chb.2015.06.051
Naumann, J. (2019). The skilled, the knowledgeable, and the motivated: Investigating the strategic allocation of time on task in a computer-based assessment. Frontiers in Psychology, 10, 1429. https://doi.org/10.3389/fpsyg.2019.01429
Naumann, J., & Goldhammer, F. (2017). Time-on-task effects in digital reading are non-linear and moderated by persons' skills and tasks' demands. Learning and Individual Differences, 53, 1–16. https://doi.org/10.1016/j.lindif.2016.10.002
Naumann, J., & Salmerón, L. (2016). Does navigation always predict performance? Effects of navigation on digital reading are moderated by comprehension skills. The International Review of Research in Open and Distributed Learning, 17(1), 42–59. https://doi.org/10.19173/irrodl.v17i1.2113
Naumann, J., & Sälzer, C. (2017). Digital reading proficiency in German 15-year olds: Evidence from PISA 2012. Zeitschrift für Erziehungswissenschaft, 20(4), 585–603.
OECD. (2010). PISA 2009 assessment framework: Key competencies in reading, mathematics and science. OECD Publishing. http://www.oecd.org/pisa/pisaproducts/44455820.pdf
OECD. (2011). PISA 2009 Results: Students On Line. OECD Publishing.
OECD (Ed.). (2013). PISA 2012 assessment and analytical framework: Mathematics, reading, science, problem solving and financial literacy. OECD.
OECD. (2014). PISA 2012 technical report. OECD Publishing. http://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
OECD (Ed.). (2015). Students, computers and learning: Making the connection. OECD Publishing.
Perfetti, C. (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383.
Pirolli, P. (2005). Rational analyses of information foraging on the web. Cognitive Science, 29(3), 343–373. https://doi.org/10.1207/s15516709cog0000_20
Ramalingam, D., & Adams, R. J. (2017). How can the use of data from computer-delivered assessments improve the measurement of twenty-first century skills? In E. Care, P. Griffin, & M. Wilson (Eds.), Assessment and teaching of 21st century skills: Research and applications (pp. 225–238). Springer International Publishing. https://doi.org/10.1007/978-3-319-65368-6_13
Reinking, D. (1997). Me and my hypertext: A multiple digression analysis of technology and literacy (sic). The Reading Teacher, 50(8), 626–643.
Robitzsch, A., Kiefer, T., & Wu, M. (2021). TAM: Test analysis modules. https://CRAN.R-project.org/package=TAM
Rouet, J.-F. (2003). What was I looking for? The influence of task specificity and prior knowledge on students' search strategies in hypertext. Interacting with Computers, 15(3), 409–428.
Rouet, J.-F., Britt, M. A., & Durik, A. M. (2017). RESOLV: Readers' representation of reading contexts and tasks. Educational Psychologist, 52(3), 200–215. https://doi.org/10.1080/00461520.2017.1329015
Rouet, J.-F., & Levonen, J. J. (1996). Studying and learning with hypertext: Empirical studies and their implications. In J.-F. Rouet, J. J. Levonen, A. Dillon, & R. J. Spiro (Eds.), Hypertext and cognition (pp. 9–24). Lawrence Erlbaum Associates.
Salmerón, L., Strømsø, H. I., Kammerer, Y., Stadtler, M., & van den Broek, P. (2018). Comprehension processes in digital reading. In M. Barzillai, J. Thomson, S. Schroeder, & P. van den Broek (Eds.), Learning to read in a digital world (pp. 91–120). John Benjamins.
Singer, J., Braun, H., & Chudowsky, N. (2018). International education assessments: Cautions, conundrums, and common sense. National Academy of Education. https://doi.org/10.31094/2018/1
Sullivan, S. A., & Puntambekar, S. (2015). Learning with digital texts: Exploring the impact of prior domain knowledge and reading comprehension ability on navigation and learning outcomes. Computers in Human Behavior, 50, 299–313. https://doi.org/10.1016/j.chb.2015.04.016
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70. https://doi.org/10.1177/109442810031002
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (Fourth ed.). Springer. https://www.stats.ox.ac.uk/pub/MASS4/
Vidal-Abarca, E., Mañá, A., & Gil, L. (2010). Individual differences for self-regulating task-oriented reading activities. Journal of Educational Psychology, 102(4), 817–826. https://doi.org/10.1037/a0020062
Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427–450.
Whimbey, A., & Lochhead, J. (1991). Problem solving and comprehension. Lawrence Erlbaum Associates.
Wierzchoń, S. T., & Kłopotek, M. A. (2018). Cluster analysis. In Modern algorithms of cluster analysis (Vol. 34, pp. 9–66). Springer International Publishing. https://doi.org/10.1007/978-3-319-69308-8_2
Wiley, J., Thomson, J., Leppänen, P. H. T., Ackerman, R., Kanniainen, L., & Prieler, T. (2018). Cognitive processes and digital reading. In M. Barzillai, J. Thomson, S. Schroeder, & P. van den Broek (Eds.), Learning to read in a digital world (pp. 57–90). John Benjamins Publishing Company.
Wise, S. L., Kuhfeld, M. R., & Soland, J. (2019). The effects of effort monitoring with proctor notification on test-taking engagement, test performance, and validity. Applied Measurement in Education, 32(2), 183–192. https://doi.org/10.1080/08957347.2019.1577248
Zoanetti, N. (2010). Interactive computer based assessment tasks: How problem-solving process data can inform instruction. Australasian Journal of Educational Technology, 26(5), 585–606.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Hahnel, C., Ramalingam, D., Kroehne, U., & Goldhammer, F. (2023). Patterns of reading behaviour in digital hypertext environments. Journal of Computer Assisted Learning, 39(3), 737–750. https://doi.org/10.1111/jcal.12709