(2014) Developing Coding Schemes For Program Comprehension Using Eye Movements
Keywords: POP-II.B. program comprehension, POP-III.B. java, POP-V.B. eye tracking, POP-VI.F. exploratory
Abstract
This paper introduces an approach to using eye movement data in program comprehension studies. The
central aspect is the development of coding schemes that reflect the cognitive processes behind the
observable visual behavior of programmers. For this purpose, we propose first using a quantitative
approach to find those episodes in the eye movements that hold the most potential for analysis.
Subsequently, qualitative methods can be applied to this subset.
1. Introduction
Tracking a programmer's gaze is probably one of the closest and most direct measurements we have
for inferring cognitive processes from observable behavioral data. However, using eye movement data
poses several challenges. Even though eye movements and cognitive processes are connected, the
relation is complex and there is no simple mapping between the two. Moreover, eye tracking produces
huge data sets. In this paper we discuss a structured approach to developing an analytical research
instrument for studying code reading and comprehension. Central to this approach is the development
of coding schemes to label and aggregate eye movement data of programmers understanding source code.
As a start, we will focus on eliciting program comprehension (PC) strategies from eye movement data.
However, the presented approach is suitable for a multitude of questions, e.g. on the difference
between novices and experts, the interaction of task type and comprehension process, influencing
factors such as programming language and paradigm, program length, additional visualization,
tool and interface issues, plan-like and unplan-like programs, and debugging.
The question we address here is: What strategies do expert programmers use during program
comprehension? We decided to look into experts first, since experts are assumed to have developed
successful strategies that can be taught to less experienced programmers. Furthermore, we expect to
find some established strategies that are shared by more than one individual.
In November 2013, the first international workshop on Eye Movements in Programming Education:
Analyzing the expert's gaze was conducted as an attempt to broaden the knowledge about PC
strategies. The focus was on cognitive processes behind observable eye movements during source code
reading. The workshop was organized in association with the 13th KOLI CALLING Conference in
Computing Education. Before the workshop, two sets of eye movement records of expert programmers
reading Java were given to the participants.1
Participants were asked to analyze and code these records with a provided scheme containing code
areas at different levels of detail, eye movement patterns, and presumed comprehension strategies.
Based on this analysis, the participants wrote position papers describing the eye movement data,
commenting on the coding scheme, and discussing possible applications of eye movement research in
computer science education. The scheme was revised following suggestions given in the position
papers and during the workshop.2
In the following we will introduce a systematic approach to develop coding schemes for eye move-
ment records of programmers in order to study PC processes - without getting lost in the data.
2 The position papers and the full version of the coding scheme can be found in the technical report
(Bednarik, Busjahn, & Schulte 2014) and at http://www.mi.fu-berlin.de/en/inf/groups/ag-ddi/Gaze_Workshop/.
3 See (Mayrhauser & Vans 1994) for a detailed description.
The codes are systematically split into a number of segments that encode particular aspects of a
cognitive process. The starting point is the mental model (program model, situation model, or domain
model). It reflects the level of abstraction at which a programmer is working. Programmers can start
building a mental model at any level that appears opportune and switch between any of the three
model components during comprehension. The second segment, element, classifies what the programmer
does at that level with the general notions of cognition goals, hypotheses, and actions that
support the hypothesis-driven understanding process. These actions can be analyzed further and
coded in segments with greater detail. Only the first two parts of the coding scheme (mental model
and element) are mandatory.
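To make the segmented structure concrete, a code can be thought of as a record with two mandatory
segments and an optional, more detailed tail. The following Python sketch is only an illustration
and not part of the scheme itself: the class name, the dot-separated notation, and the example
label are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple

# The three mental model components named in the text.
MENTAL_MODELS = {"program", "situation", "domain"}
# The three element types named in the text.
ELEMENTS = {"goal", "hypothesis", "action"}


@dataclass(frozen=True)
class SegmentedCode:
    mental_model: str              # mandatory first segment
    element: str                   # mandatory second segment
    details: Tuple[str, ...] = ()  # optional, finer-grained segments

    def __post_init__(self):
        if self.mental_model not in MENTAL_MODELS:
            raise ValueError(f"unknown mental model: {self.mental_model}")
        if self.element not in ELEMENTS:
            raise ValueError(f"unknown element: {self.element}")

    def truncate(self, depth: int) -> "SegmentedCode":
        """Reduce the code to at most `depth` detail segments."""
        return SegmentedCode(self.mental_model, self.element, self.details[:depth])


def parse_code(text: str) -> SegmentedCode:
    """Parse a dot-separated label such as 'program.action.read_code'.

    The dot notation and the detail segment 'read_code' are hypothetical.
    """
    parts = text.split(".")
    if len(parts) < 2:
        raise ValueError("a code needs at least a mental model and an element")
    return SegmentedCode(parts[0], parts[1], tuple(parts[2:]))
```

Truncating a code to fewer detail segments corresponds to reducing the scheme to a coarser level
of detail, while the two mandatory segments always remain.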
The analysis proceeds from identifying actions of various types to determining action sequences and
extracting cognition processes and strategies. The results can be used for statistical analyses, e.g. dis-
covering patterns of cognitive behavior or analyzing frequencies of certain actions.
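The step from single actions to action sequences and frequencies can be sketched as follows. This
is a minimal illustration, assuming the coded protocol is available as a plain list of labels; the
function names and the example labels are hypothetical.

```python
from collections import Counter


def code_frequencies(codes):
    """Count how often each code occurs in a coded protocol."""
    return Counter(codes)


def sequence_frequencies(codes, length=2):
    """Count contiguous subsequences of `length` codes (n-grams)."""
    windows = zip(*(codes[i:] for i in range(length)))
    return Counter(windows)


# Hypothetical coded protocol, one label per coded episode.
coded = ["goal", "hypothesis", "action", "action", "hypothesis", "action"]
print(code_frequencies(coded).most_common())
print(sequence_frequencies(coded, length=2).most_common(3))
```

Frequent subsequences are only candidates for patterns of cognitive behavior; whether they reflect
a strategy still has to be judged by the researcher.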
The coding scheme can be expanded or reduced according to the level of detail desired. Due to this
flexibility, the scheme can be adjusted to answer a variety of research questions for various aspects of
PC. Hence, researchers can tailor AFECS to their own needs instead of developing a coding scheme
from scratch. Often results from different studies are difficult to compare. By using the same scheme,
results maintain a degree of standardization and enable comparisons across studies.
This scheme is especially interesting in the context of our approach, as it directly provides a broad
range of codes for cognitive processes during program comprehension.
The object classification comprises how objects present in the program are described. There are seven
object categories, e.g.
program only: refers to items which occur only in the program domain, and which would not
have a meaning in another context, like a counter
program-domain: object descriptions which contain a mixture of program and problem do-
main references, e.g. a list of marks
domain: an object which is described in domain terms, rather than by its representation within
the program, e.g. a distance.
Possible analyses with this scheme are the proportion of information types used and the level of
abstraction featured in the summary. Good and Brna assume that the scheme can also be applied to
verbal protocols gathered during comprehension tasks.
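As a rough illustration of the first kind of analysis, the proportion of each object category in a
coded summary can be computed directly from the assigned labels. The sketch below uses only the
three categories quoted above; the function name and the sample data are assumptions.

```python
from collections import Counter


def information_type_proportions(labels):
    """Return the share of each object category among the coded objects."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


# Hypothetical coding of the objects mentioned in one program summary.
summary_codes = ["program only", "domain", "domain", "program-domain"]
print(information_type_proportions(summary_codes))
```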
Besides primitive categories that denote fixations on a certain point in the program, there are two cate-
gories of codes for a series of eye movements called pattern and strategy. Patterns are observable se-
quences of fixations, while strategies require the interpretation of a pattern concerning the intention
behind this visual behavior. Several researchers involved in computer science education and eye
movements defined an initial set of codes, covering observables as well as potential strategies.
Only very few codes, like the scan pattern, were adopted from previous research on eye movements in
programming. This scheme was given to the workshop participants with the task of coding the provided
eye movement records using the video annotation software ELAN4 and modifying the coding scheme as
they saw fit.
The workshop participants' suggestions for the scheme were compiled into a revised version which
was discussed during the workshop. Further revisions were included accordingly. Finally, the
observable codes of the scheme which only relate to single fixations were abstracted. Table 1
presents an excerpt from the final workshop coding scheme, and Table 2 a subblock example.
4 See http://tla.mpi.nl/tools/tla-tools/elan/.
Using this scheme on eye movement records provided an excellent basis for the rich discussion during
the workshop. The codes were developed partly top-down and partly bottom-up on only two eye
movement records of a single program. It is therefore not possible to draw conclusions about the
reliability or the completeness of the scheme. Additionally, it is hard to describe the codes
unambiguously; some codes are still rather fuzzy. These shortcomings lie somewhat in the nature of
the data, the analysis instrument, and the kind of research problem. Nevertheless, applying the
scheme illustrated the usefulness of this kind of analysis, and we can draw on the lessons learned.
On a more general level, it is worth trying to find the most common global patterns in how
programmers go about understanding source code. Finally, we would like to choose a few control
structures that are of special interest, e.g. loops and conditions. For these longer sets of
fixations, instruments for comparing fixation sequences, as proposed e.g. by Cristino, Mathôt,
Theeuwes & Gilchrist (2010) and West, Haake, Rozanski & Karn (2006), can be applied in addition to
counting frequencies.
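To give an impression of what comparing fixation sequences involves, the sketch below computes a
plain Levenshtein edit distance between two sequences of codes. This is a deliberately simpler
stand-in and not a reimplementation of the cited instruments; the function names and the example
labels are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of codes."""
    previous = list(range(len(b) + 1))
    for i, code_a in enumerate(a, start=1):
        current = [i]
        for j, code_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (code_a != code_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalize the distance to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical label sequences of two programmers reading the same method.
reader_1 = ["Signature", "MethodBody", "MethodBody"]
reader_2 = ["Signature", "MethodBody", "Signature"]
print(edit_distance(reader_1, reader_2), similarity(reader_1, reader_2))
```

A plain edit distance treats every substitution as equally costly. A more refined comparison could
weight substitutions, for example by the spatial distance between the fixated code areas.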
Besides computing possible data points to analyze, it is still a good idea to have a human looking for
interesting data sequences. There might be patterns which are not frequent but nevertheless yield rich
information, like extreme or very unexpected behavior or ideal cases that can be predicted from cur-
rent theory.
5 Mayring (2000) refers to conventional content analysis as inductive and directed content analysis as
deductive category development.
analyzing the data. It aims at extending or refining an existing theory. The summative approach
counts single words or content and interprets the underlying context. Conventional content analysis is
generally used to gain a richer understanding of a phenomenon, when prior theory or research is lim-
ited. The coding scheme is derived from data during data analysis. Codes are sorted into categories
and relationships among categories are identified (Hsieh & Shannon 2005; Mayring 2000).
The overall intention of qualitative content analysis, to interpret meaning from content, matches
our proposition. Furthermore, the level of our intended outcome corresponds to what seems feasible
with qualitative content analysis: producing a coding scheme with categories and codes describing
PC processes in order to contribute to theory building. Nevertheless, eye movements are not exactly
the kind of data this approach aims to analyze.
5.2.2. Phenomenography
Phenomenography is an empirical, qualitative research approach that describes the variation in the
way people understand or experience a certain phenomenon. The analysis consists of an iterative
process, in which the researcher goes back to the data again and again. The outcome space of this
analysis is a set of categories specifying different levels of understanding. These categories often have
a hierarchical structure, going from categories with few features of the phenomenon to depicting
richer or deeper understanding. Phenomenography is usually used in educational settings (Eckerdal
2009). The data for phenomenographic research has the form of people's accounts of their own
experience and is usually gathered via interviews. Richardson (1999) points out that other data
sources are possible, though those are in general just other forms of discourse that have the same
evidential status as oral accounts.
While the goal to find ways in which programmers understand source code in general agrees with the
phenomenographic paradigm, the intended outcome is still different. At the current point, it is of inter-
est to develop a coding scheme to capture different cognitive processes during PC. The outcome space
obtained by phenomenography is already a step further than what seems reasonable right now for the
coding scheme.
6. Conclusion
We presented an approach to use eye movement data for PC studies. For that purpose, we combine
quantitative and qualitative methods to develop coding schemes. As an initial example, we worked on
a scheme about comprehension strategies of expert programmers. Taking into account previous
coding schemes in this context and the procedures of their development allowed us to reflect on
potential pitfalls, such as the lack of comparability of results, in advance.
Coding schemes for eye movement data should contain observable behavior as well as interpreted
cognitive processes. For the most part, the observable codes can be assigned automatically, which is
an advantage over previous coding schemes. Following the proposed procedure facilitates the
comparison of data-driven results with other studies, without having to adopt their theoretical
premises. Having a consistent yet flexible naming scheme, as suggested by von Mayrhauser & Lang
(1999) and Salinger & Prechelt (2013), will help with that.
In order to use the qualitative methods discussed above, the gaze data could be translated into
textual form using observable codes as labels. The resulting records would have this form:
Signature - MethodBody - MethodBody or Scan - JumpControl - LinearHorizontal, enriched with
information on the line and a unique name for the element. This might be seen as a representation
of the raw data, similar to the transcript of an interview. However, unlike a transcript, any chosen
label already implies a certain interpretation. Hence, this translation process has to be done
carefully. It would now be interesting to explore the possibility of producing such a representation
of the eye movements in a rigorous, and probably automated or semi-automated, way.
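Such a translation could, for instance, map each fixation to the code area that contains the
fixated line and join the resulting labels. The following Python sketch makes several assumptions
that are not part of the approach described here: fixations are already mapped to source lines,
code areas are given as line ranges, and the type and function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Area(NamedTuple):
    label: str        # observable code, e.g. "Signature"
    first_line: int   # first source line covered (inclusive)
    last_line: int    # last source line covered (inclusive)


class Fixation(NamedTuple):
    line: int         # source line the fixation was mapped to
    duration_ms: int  # fixation duration in milliseconds


def label_for(fixation: Fixation, areas: List[Area]) -> Optional[str]:
    """Return the label of the first area containing the fixated line."""
    for area in areas:
        if area.first_line <= fixation.line <= area.last_line:
            return area.label
    return None


def to_record(fixations: List[Fixation], areas: List[Area]) -> str:
    """Render a fixation sequence as 'Label(Lline) - Label(Lline) - ...'."""
    parts = []
    for fixation in fixations:
        label = label_for(fixation, areas)
        if label is not None:  # fixations outside all areas are skipped
            parts.append(f"{label}(L{fixation.line})")
    return " - ".join(parts)


# Hypothetical areas and fixations for a single short method.
areas = [Area("Signature", 3, 3), Area("MethodBody", 4, 8)]
fixations = [Fixation(3, 210), Fixation(4, 180), Fixation(6, 250)]
print(to_record(fixations, areas))
# prints: Signature(L3) - MethodBody(L4) - MethodBody(L6)
```

Even this small sketch embeds decisions, such as skipping fixations outside all areas, which
illustrates why the choice of labels is already an act of interpretation.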
7. Acknowledgements
We would like to thank Shahram Eivazi, Tersia //Gowases, Andrew Begel and Bonita Sharif as well
as the other participants of the Koli workshop for discussing the ideas presented in this paper.
8. References
Bednarik, R., Busjahn, T., & Schulte, C. (2014). Eye Movements in Programming Education:
Analyzing the expert's gaze. Joensuu, Finland: University of Eastern Finland.
Cristino, F., Mathôt, S., Theeuwes, J., & Gilchrist, I. D. (2010). ScanMatch: A novel method for
comparing fixation sequences. Behavior Research Methods, 42(3), 692–700.
Eckerdal, A. (2009). Novice Programming Students' Learning of Concepts and Practise. Uppsala
University, Uppsala.
Good, J., & Brna, P. (2004). Program comprehension and authentic measurement: a scheme for
analysing descriptions of programs. Empirical Studies of Software Engineering, 61(2), 169–185.
Hsieh, H.-F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative
Health Research, 15(9), 1277–1288.
Mayring, P. (2000). Qualitative Content Analysis. Forum Qualitative Sozialforschung / Forum:
Qualitative Social Research, 1(2).
Mayring, P. (2001). Combination and integration of qualitative and quantitative analysis. In Forum
Qualitative Sozialforschung/Forum: Qualitative Social Research (2).
O'Brien, M. P., Shaft, T. M., & Buckley, J. (2001). An Open-Source Analysis Schema for Identifying
Software Comprehension Processes. In Proceedings of 13th Workshop of the Psychology of
Programming Interest Group (pp. 129–146). Bournemouth, UK.
Pennington, N. (1987). Stimulus structures and mental representations in expert comprehension of
computer programs. Cognitive Psychology, 19(3), 295–341.
Rayner, K. (1998). Eye Movements in Reading and Information Processing: 20 Years of Research.
Psychological Bulletin, 124(3), 372–422.