Would ChatGPT-facilitated Programming Mode
Int J Educ Technol High Educ (2024) 21:14
International Journal of Educational Technology in Higher Education
https://doi.org/10.1186/s41239-024-00446-5
*Correspondence: [email protected]
1 Chinese Education Modernization Research Institute, Hangzhou Normal University, Yu Hang Tang Rd 2318, 311121 Hangzhou, China
2 Zhejiang University, College of Education, Yu Hang Tang Rd 866, 310058 Hangzhou, Zhejiang, China
3 Xiaoshan High School, Gongxiu Rd 538, 311201 Hangzhou, China
Abstract
ChatGPT, an AI-based chatbot with automatic code generation abilities, has shown its promise in improving the quality of programming education by providing learners with opportunities to better understand the principles of programming. However, limited empirical studies have explored the impact of ChatGPT on learners’ programming processes. This study employed a quasi-experimental design to explore the possible impact of ChatGPT-facilitated programming mode on college students’ programming behaviors, performances, and perceptions. 82 college students were randomly divided into two classes. One class employed ChatGPT-facilitated programming (CFP) practice
and the other class utilized self-directed programming (SDP) mode. Mixed methods
were utilized to collect multidimensional data. Data analysis uncovered some intriguing results. Firstly, students in the SDP mode had more frequent behaviors of debugging and receiving error messages, as well as pasting console messages on the website and reading feedback. At the same time, students in the CFP mode had more frequent behaviors of copying and pasting codes from ChatGPT and debugging, as well as pasting codes to ChatGPT and reading feedback from ChatGPT. Secondly, CFP
practice would improve college students’ programming performance, while the results
indicated that there was no statistically significant difference between the students
in CFP mode and the SDP mode. Thirdly, student interviews revealed three highly
concerned themes from students’ user experience about ChatGPT: the services offered
by ChatGPT, the stages of ChatGPT usage, and experience with ChatGPT. Finally, col-
lege students’ perceptions toward ChatGPT significantly changed after CFP practice,
including its perceived usefulness, perceived ease of use, and intention to use. Based
on these findings, the study proposes implications for future instructional design
and the development of AI-powered tools like ChatGPT.
Keywords: ChatGPT, Programming learning, Behavioral analysis, Perception, College
student
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits
use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original
author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third
party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the mate-
rial. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or
exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://
creativecommons.org/licenses/by/4.0/.
Sun et al. Int J Educ Technol High Educ (2024) 21:14 Page 2 of 22
Introduction
Programming education has become increasingly important in the current higher edu-
cation system because it could promote college students’ computational thinking skills,
which are vital in various working situations (Jancheski, 2017; Stehle & Peters-Burton,
2019). However, the endeavor to harness the advantages of programming is not devoid
of obstacles. Extensive research has determined that students actively engaged in pro-
gramming education may encounter a range of challenges (Looi et al., 2018; Sun et al.,
2021a). These challenges include a lack of relevant technical skills and barriers to accessing crucial resources (Bau et al., 2017; Tom, 2015), and they can have a negative impact on students’ overall educational experience. In this regard, it seems necessary to offer comprehensive support to college students in their programming learning process (Lu et al., 2017; Sun et al., 2021d).
Technological enablers such as interactive coding platforms, integrated development environments (IDEs), and online coding communities have a substantial impact on programming performance and behavior. They can promote the acquisition of programming skills, facilitate their practical application, and cater to the personalized preferences of programming learners (Chevalier et al., 2020; Ghatrifi et al., 2023; Nurbekova et al., 2020). ChatGPT is a prominent example of such an enabler in programming education. Its impact on learning has been remarkable, as it has brought about a wave of technological advancements that have greatly facilitated teaching and learning processes (Firaina & Sulisworo, 2023; Lo, 2023). ChatGPT may be a resource for assessing performance, learning, and discussing numerous programming-related topics.
ChatGPT can formulate, clarify, and illustrate code samples in response to students’
inquiries. It has the potential to serve as an exhibition platform for various programming
solutions, explanations of various methodologies, and illustrative approaches (Chen
et al., 2023; Yilmaz & Yilmaz, 2023a). ChatGPT can generate a customized learning tra-
jectory based on the learner’s existing proficiency and desired objectives in program-
ming learning. ChatGPT can provide recommendations for online courses, tutorials,
books, and many other resources to enable ongoing personal and professional develop-
ment (Jalil et al., 2023; Surameery & Shakor, 2023; Tian et al., 2023). ChatGPT exhibits
exceptional characteristics, including the capability to produce text that faithfully rep-
licates authentic human dialogue when given inputs. This feature is very different from
traditional teacher-led instruction, and it makes people think about how it could be
used and incorporated into programming education (Firaina & Sulisworo, 2023; Javaid
et al., 2023). Further significant factors exist that necessitate scrutiny. Notably, conven-
tional teaching methods do not significantly impact the accessibility of vital information;
rather, it is ChatGPT’s sophisticated algorithmic architecture and computational prow-
ess that ensure its availability. In addition, it’s important to acknowledge that there may
be some inaccuracies in the error detection system of ChatGPT, which could potentially
impact students’ motivation and capacity to utilize the feedback provided.
What learning mode is more effective and preferred by college students for programming learning remains uncertain. This uncertainty arises partly because programming learning studies to date have been based on comparisons between the
most commonly used technological sources and devices and traditional learning
forms. As an illustrative instance, researchers have juxtaposed and contrasted various
programming methodologies. Sun et al. (2021a), for instance, compared text-based programming techniques with isomorphic block-based techniques. Sun et al. (2021b) compared learner-oriented, unplugged programming with instructor-led delivery. Sun et al. (2021d) addressed lectures on novice programming instruments and the transition from Logo to Scratch. To address these
obstacles, teachers have implemented innovative pedagogical approaches, including
game-based learning, offline instruction, and project-centric programming, which
are commonly used in informal learning environments. The primary aim of these
methodologies is to convert traditional instructor-led programming instruction into
interactive, learner-focused programming activities (Brackmann et al., 2017; Hosseini
et al., 2019; Nurbekova et al., 2020).
For this study, we utilized Davis’ (1989) technology acceptance model (TAM) as a theoretical framework to investigate students’ acceptance of ChatGPT in programming education. Drawing on the published academic literature, researchers have used TAM as a framework to examine how students interact with AI-based learning technologies (Xie et al., 2023), and the framework is extensively employed in studies of programming learning. Studies conducted by Cheng (2019) and Thongkoo et al. (2020) have provided evidence that TAM is a valid predictive model for technology adoption in programming learning. In today’s modern era,
the widespread use of computing technology has made it an essential tool for stu-
dents. It has become so prevalent that it is now considered indispensable, enabling
students to take charge of their learning and strive toward long-term objectives such
as skill enhancement (Peng et al., 2023). Developed by Davis (1989), TAM aims to
shed light on the factors that influence user behavior in adopting new technologies
(Xie et al., 2023). According to the TAM, users’ perceptions of a technology’s use-
fulness and ease of use play a vital role in determining their adoption and regular
usage of it (Davis, 1989). In addition, the attitudes and behavioral intentions of users
play a crucial role in determining their willingness to adopt technology in the learn-
ing process (Yang & Tsai, 2008; Yi et al., 2016). Perceived ease of use is a concept
that revolves around how users assess the simplicity or complexity of a technological
device or system based on their impressions and expectations (Davis, 1989). When
considering the value of a particular technology, it is important to take into account
an individual’s assessment of how it will enhance their work efficiency (Xie et al.,
2023). When a new piece of technology is perceived as user-friendly, the chances of
people embracing it are significantly higher. Nevertheless, individuals tend to become
less inclined to utilize technology when they encounter challenges in acquiring the
necessary skills to operate it (Teo et al., 2008).
It is therefore worth exploring the effectiveness of ChatGPT in programming learning. In
this regard, this study aimed to explore the impact of ChatGPT-facilitated programming
on college students’ programming behaviors, performance, and perception by compar-
ing two kinds of learning modes: ChatGPT-facilitated programming (CFP) mode and
traditional self-directed programming (SDP) mode. There were four research questions:
RQ1: What are the differences in the programming behaviors of students engaged in
CFP mode compared with those in SDP mode?
RQ2: What are the differences in the programming performances of students
engaged in CFP compared with those in SDP mode?
RQ3: How do college students describe their user experiences with ChatGPT in CFP
mode?
RQ4: How do college students’ perceptions of ChatGPT change following their expe-
rience with CFP mode?
The results of this study offer significant insights that can inform policy-making
regarding the most effective approaches to expand and support programming education
at the college and university levels. Despite being conducted within the specific context
of China, our research findings and consequences hold significance for scholars, poli-
cymakers, and practitioners worldwide. Countries, regardless of their level of develop-
ment, can recognize and rectify possible deficiencies in programming education through
careful consideration of the data presented in our research. The advent of AI tools such
as ChatGPT introduces novel complexities to programming instruction. Our research
makes a significant and innovative contribution to this urgent matter at an international
level. The ramifications transcend academic settings and universities, encompassing the
approaches and policies of governments across the globe. In conclusion, the findings of
this inquiry substantially advance our comprehension of the perspectives held by aspir-
ing teachers worldwide regarding the application of ChatGPT in computer program-
ming instruction.
Literature review
Advancing programming education with AI‑technologies
Technology-enhanced programming learning has been gaining momentum in recent
years with the advance of AI technologies. This literature review examines the new
trends in the programming and learning fields that have emerged in the age of AI. One of
the significant trends in programming education is the integration of AI-powered tools
such as chatbots, intelligent tutoring systems, and automated programming assessment
software. These tools offer students personalized instruction and immediate feedback to
help them progress at their own pace and improve their programming skills. For exam-
ple, Skalka et al. (2021) proposed a conceptual framework that combines micro-learn-
ing and automatic evaluation of source code to give students immediate feedback and
involve them in software development in a virtual learning environment. This framework
was shown to significantly improve the results of students in advanced programming
courses. In the Programming 1 course, Malik et al. (2022) introduced a chatbot that was
specifically engineered to highlight problem-solving strategies, common programming
errors, syntax, and semantics, with the ultimate goal of assisting inexperienced learn-
ers in simultaneously mastering a variety of competencies. The students perceived the
chatbot’s methodology as advantageous in the above-mentioned points. Klašnja-Milićević
et al. (2016) also designed an Intelligent Tutoring System called Programming-Tutor that
uses AI to provide an immersive learning experience for online programming courses
in the Pacific. This ITS is expected to help students learn programming more easily and
efficiently in an online mode and provide valuable formative assessment while enhanc-
ing student learning.
Furthermore, the application of data analytics, machine learning, and natural language
processing technologies in programming education has gained popularity. These inno-
vations can help analyze programming-related data, extract valuable insights into learn-
ers’ programming skills, and pinpoint areas where improvement is needed to enable
effective feedback and instruction. For instance, Khan et al. (2019) developed a model
to predict the performance of introductory programming students based on their early
semester grades. The researchers used WEKA to compare eleven different machine-
learning approaches and found that the Decision Tree algorithm did the best in terms of
recognizing instances, being accurate on the F-measure, and finding true positives. This
model is expected to enable students to forecast their probable final grades, empower-
ing them to modify their study approaches for better academic results. In a similar vein,
Sivasakthi et al. (2017) developed a predictive data mining model that utilized classifi-
cation algorithms to predict the performance of first-year Computer Application bach-
elor’s degree students in introductory programming. The research ascertained the most
efficient classification algorithm and demonstrated the accuracy of each algorithm in
use. In addition, Shen et al. (2023) proposed a student profile model that includes code
information and other student characteristics as input to a deep neural network for per-
formance prediction and found that a four-layer deep neural network using all available
dimensions of student profiles achieved the best performance.
Despite these promising trends, challenges remain in effectively integrating AI tech-
nologies into the learning process. These challenges include ethical considerations such
as privacy, security, and bias, and tensions between the use of AI-powered tools and the
human aspect of programming involving creativity and problem-solving skills (Gervasi
et al., 2021; Mousavinasab et al., 2021; Wang et al., 2023). Recent advancements in large-
scale language modeling and AI, such as interactive text generators capable of respond-
ing to user prompts (Yilmaz & Yilmaz, 2023c), have not been adequately explored or
recognized in programming education. This leaves a gap in implementing state-of-the-
art technologies in programming education pedagogy.
education outside the formal programming curriculum. The lack of a controlled com-
parison or quasi-experimental design in previous studies complicates researchers’ ability
to make definitive claims regarding the effectiveness of the method in assisting individu-
als with coding learning. Additionally, there has been minimal involvement by students
in both the practice and reflection of ChatGPT in programming learning. As a result,
there is a gap in knowledge and practical application of the impact of ChatGPT on pro-
gramming education.
Research methodology
Research context, participants, and instructional procedures
We conducted the research in a mandatory “Python Programming” course offered to Educational Technology majors at a Chinese university during the spring term of 2023. We employed a quasi-experimental design to compare learners’ programming behaviors, performances, and perceptions between the control condition (self-directed programming, SDP) and the experimental condition (ChatGPT-facilitated programming, CFP). There were 39 learners (female = 16; male = 15) in the control SDP class and 43 learners (female = 19; male = 13) in the experimental CFP class. The students had prior C programming experience, and neither class was informed of the different treatments. Written approval was obtained from the review committee to interact with learners and collect data for research purposes. Both classes were taught by the same instructor (the first author of this paper), who maintained the same teaching style, offered the same instructional materials, and used the same teaching guidance in each class; the sole distinction was the use of ChatGPT in the experimental condition. With guidance and support from the research team, the instructor divided the course into three phases and five instructional sessions (each session lasted 80 min). In Phase I and Phase II, the instructor taught the first four sessions, which covered the basic concepts of Python programming: an introduction to Python (e.g., IDLE, input(), eval(), print()), data structures (e.g., int, float, set, list, dictionary), control structures (e.g., if, for, while), functions, and methods (e.g., recursion, lambda). In Phase III, the instructor taught the last session, which was devoted to a comprehensive programming project (radar chart).
The design of the instructional sessions referred to the book titled Python Programming (ICOURSE, 2023). During Phase I and Phase II, both classes first received the instructor’s lectures with oral presentations and Python demonstrations, followed by self-directed practice on the programming tasks demonstrated earlier. In Phase III,
students were required to complete a radar chart task within 100 min. Students in the
SDP mode solved problems according to their own knowledge (e.g., by referring to additional materials). In the CFP mode, the ChatGPT Next platform, which was deployed by the first author, was adopted as the major tool to facilitate students’ problem-solving process (see Fig. 1). On this platform, gpt-3.5-turbo was chosen as the model, and students could initiate various thematic conversations with ChatGPT by typing their questions in the main window and receiving feedback from ChatGPT.
Behavior (code): Description
Understanding task (UT): A student transferred to the task window to understand the programming projects
Coding in Python (CP): A student wrote codes with the Python language in the system
Debugging in Python (DP): A student debugged in PyCharm
Understanding Python codes (UPC): A student attempted to understand the code with the mouse moving back and forth on the code
Checking Radar Chart (CRC): A student checked the radar chart output in PyCharm
Reading console message (RCM): A student read error messages in the output console in PyCharm
Asking new questions (ANQ): A student asked new questions in ChatGPT/browser
Pasting console message (PCM): A student pasted error messages from the output console window in ChatGPT/browser
Pasting Python codes (PPC): A student pasted Python codes to ChatGPT/browser
Reading feedback (RF): A student read feedback in ChatGPT/browser
Copy and paste codes (CPC): A student copied and pasted codes from ChatGPT/browser
Referring to additional materials (RAM): A student referred to additional materials from instructors
Failure in ChatGPT (FC): ChatGPT failed to give feedback due to technical problems (e.g., cannot connect to the server, get stuck)
Idle operation (IO): A student had no operation
Based on this coding framework, each coder independently recorded the data in chronological order, annotating learner behaviors every 10 s, and the coders then validated the results with one another. We then employed lag-sequential analysis (LsA) to assess the behavioral patterns of the learners, using the video coding outcomes as a foundation (Faraone & Dorfman, 1987). This entailed evaluating the frequency of transitions between pairs of behaviors and displaying them as network representations for the two instructional modes. We chose Yule’s Q to represent the strength of transitional associations because it has descriptive value and can account for base numbers of contributions (it ranges from − 1 to + 1, with zero indicating no association). This study used a previously developed network visualization technique (Chen et al., 2017) to illustrate the outcomes of LsA in networks. Nodes represent behavior codes along with their corresponding frequencies, while edges denote transitional Yule’s Q values. The arrow on each edge denotes the direction of the transition.
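To make the transition measure concrete, the following is a minimal Python sketch of how Yule’s Q can be computed for one lag-1 transition in a sequence of behavior codes. It is an illustration under stated assumptions, not the analysis script used in this study: the function names and the short example sequence are hypothetical, and the 2 × 2 table is built from adjacent code pairs only.

```python
from collections import Counter


def transition_counts(codes):
    """Count lag-1 transitions: each code paired with the code that follows it."""
    return Counter(zip(codes, codes[1:]))


def yules_q(codes, given, target):
    """Yule's Q for the transition given -> target within one coded sequence.

    The 2 x 2 table is:
        a: given followed by target
        b: given followed by any other code
        c: any other code followed by target
        d: all remaining transitions
    and Q = (a*d - b*c) / (a*d + b*c).
    """
    counts = transition_counts(codes)
    total = sum(counts.values())
    a = counts[(given, target)]
    row = sum(n for (first, _), n in counts.items() if first == given)
    col = sum(n for (_, second), n in counts.items() if second == target)
    b = row - a
    c = col - a
    d = total - a - b - c
    denominator = a * d + b * c
    if denominator == 0:
        return None  # Q is undefined when the table is degenerate
    return (a * d - b * c) / denominator


# Hypothetical sequence of 10-second annotations (not data from this study).
sequence = ["CP", "DP", "RCM", "CP", "DP", "RCM", "CP", "DP", "CRC", "CP"]
print(yules_q(sequence, "DP", "RCM"))
```

A value near + 1 means the target behavior follows the given behavior far more often than the base rates would suggest, and a value near − 1 means it rarely does.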
Secondly, we evaluated the students’ programming performance by assessing their programming assignments. The programming task involved developing a radar chart: a simplified radar chart was the primary requirement, and a Holland radar chart was the advanced requirement. The first author scored the assignments on a 100-point scale using a rubric that covered code correctness, programming project aesthetics, and functional integrity. A T-test was employed to compare the performance of students in the two classes.
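The comparison of two classes described above can be sketched in Python as follows. This is a minimal illustration of an independent-samples T-test with pooled variance, assuming equal variances in the two groups; the function name and the score lists are hypothetical and are not the data or the software used in this study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return Student's t (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical assignment scores on a 100-point scale.
cfp_scores = [90, 85, 70, 95, 80]
sdp_scores = [75, 80, 65, 85, 70]
t_value, df = independent_t(cfp_scores, sdp_scores)
print(f"t({df}) = {t_value:.2f}")
```

The p value is then read from the t distribution with the returned degrees of freedom, which a statistics package would normally report directly.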
Thirdly, we conducted semi-structured post-class interviews with students, spe-
cifically focusing on their experiences using ChatGPT to practice programming (the
interview instrument can be found in Appendix A). We followed a rigorous procedure
of thematic analysis to examine the interview data (Cohen et al., 2013). The method-
ology consisted of the subsequent steps: (1) getting interview transcripts ready for
analysis; (2) having two separate coders assign codes to different parts of the data; (3)
recording the coded segments so that they can be analyzed later; (4) talking about and
comparing parts of the data that have the same codes; (5) aligning and improving the
codes to make themes that make sense; and (6) checking the validity and reliability
of the themes that were made. Transcripts containing statements such as “I appre-
ciated the human-like explanations ChatGPT provided whenever I made a syntax
error; it facilitated my learning pace” and coded segments including “ChatGPT pro-
vided human-like explanations; it facilitated my learning pace; identify and correct
code errors” serve as an example of this procedure in our research. The codes were
subsequently integrated and refined into the theme “experience with ChatGPT”. To
offer additional elucidation, the themes refer to the discernible patterns or concepts
that permeate the data and are substantial in delineating a phenomenon while also
being relevant to the research inquiry. A theme such as "experience with ChatGPT" emerges in our research when student responses address the pros and cons of ChatGPT. Codes, in turn, are the labels assigned to specific data points to concisely represent or summarize the data. For example, instances in which students refer to "extensive programming knowledge" and "various and contextualized responses" could be coded as "ChatGPT experience." Finally, we refer to the data segments that have been assigned codes as "coded segments." In our research, a coded segment may be a sentence or paragraph extracted from the interview transcript in which a participant recounts personal experiences.
Fourthly, at the beginning and end of the sessions, we distributed questionnaires to
determine the students’ perceptions of ChatGPT in the CFP mode. We employed a
modified version of the survey instrument created by Venkatesh and Davis (2000) as well
as Sánchez and Hueros (2010) for this investigation. Two distinct sections comprised
the survey. The primary objective of the initial segment was to gather data from partici-
pants concerning their demographic attributes, prior encounters, and anticipations con-
cerning ChatGPT. The following segment comprised questions rated on a seven-point
Likert scale, where a rating of one indicated strong disagreement and a rating of seven
indicated strong agreement. The items covered four core dimensions: perceived usefulness, perceived ease of use, intention to use, and attitude. For additional
information, please consult Appendix B. An independent T-test and descriptive analysis
were used to ascertain the outcomes based on the gathered data.
Results
Impact of CFP on college students’ programming behaviors
Regarding RQ1 (What are the differences in the programming behavior of students
engaged in CFP mode compared with those in SDP mode?), we gathered data on stu-
dents’ behaviors during the programming process, and we employed lag-sequential
analysis to identify any differences in the behavioral patterns between the two groups.
Learners’ behavioral patterns exhibited both similarities and discrepancies between
the two learning modes. Firstly, in terms of frequency analysis, coding in Python (CP) was the most frequent behavior observed in both classes, followed by either understanding Python code (UPC) or reading feedback (RF). In the SDP mode, the most frequent behaviors were coding in Python (CP; frequency = 5290), reading feedback (RF; frequency = 2443), and understanding Python code (UPC; frequency = 1760). Comparatively, learners in the SDP mode engaged in reading feedback behaviors more frequently than those in the CFP mode (RF; frequency = 1439) (see Fig. 2). In the CFP mode, the most frequent behaviors were coding in Python (CP; frequency = 5604), understanding Python code (UPC; frequency = 1779), and reading feedback (RF; frequency = 1439). Conversely, learners in the CFP mode had a higher frequency of code debugging behaviors (DP; frequency = 857) compared to the SDP mode (DP; frequency = 423) (see Fig. 3).
Fig. 2 College students’ behavioral sequence diagram in SDP mode. Yule’s Q was marked on the line to represent the strength of transitional association. UT Understanding task, CP Coding in Python, DP Debugging in Python, UPC Understanding Python codes, CRC Checking Radar Chart, RCM Reading console message, ANQ Asking new questions, PCM Pasting console message, PPC Pasting Python codes, RF Reading feedback, CPC Copy and paste codes, RAM Referring to additional materials, FC Failure in ChatGPT, IO Idle operation
Secondly, regarding sequence analysis, there were 37 and 44 significant programming
learning sequences in the SDP and CFP modes, respectively, with numerous links among
the different codes. Generally, students in the SDP mode: (1) were more likely to debug
Python codes (DP) and receive error messages (RCM) (DP → RCM, Yule’s Q = 0.97),
or check radar charts (CRC) (DP → CRC, Yule’s Q = 0.87); (2) often pasted error messages from the output console (PCM) or asked new questions (ANQ) in the browser to look for solutions (RF) (PCM → RF, Yule’s Q = 0.88; ANQ → RF, Yule’s Q = 0.87); and
(3) preferred to copy and paste codes directly from the browser into their current codes
(CPC) and debug to test the correctness of the borrowed codes (DP) (CPC → DP, Yule’s
Q = 0.85).
As for students in the CFP mode (see Table 2), they (1) frequently copied and pasted codes from ChatGPT (CPC), debugged to test their correctness (DP) (CPC → DP, Yule’s Q = 0.95), and then read error messages in the output console window (DP → RCM, Yule’s Q = 0.93); (2) preferred to directly copy and paste Python codes (PPC) and error messages from the output console window to ChatGPT (PCM) (PPC → PCM, Yule’s Q = 0.86); (3) spent significant time reading feedback in ChatGPT (RF) and copying codes from ChatGPT (CPC) (RF → CPC, Yule’s Q = 0.85); and (4) encountered technical problems (FC) while pasting error messages to ChatGPT (PCM) (FC → PCM, Yule’s Q = 0.79). These differences suggest that students in each class employ distinct learning strategies depending on the context and available resources.
Table 2 LsA transition frequency of programming behaviors of learners in two learning modes (columns: SDP transition, Yule’s Q; CFP transition, Yule’s Q)
Table 3 Independent T-test of college students’ programming performance in SDP and CFP modes (columns: Modes, n, M, SD, t, p)
we presented the results of a T-test that compared the post-test scores between the two
groups of learners (see Table 3). After the intervention, learners in the CFP mode had a
higher average score (M = 84.11, SD = 19.45) than learners in the SDP mode (M = 78.36,
SD = 17.59). A T-test indicated that no statistically significant difference was found
between the two classes (t (77) = 1.28, p = 0.204).
Table 5 Statistical results of CFP students’ scores on the four dimensions in the pre- and post-test (columns: Perceived usefulness, Perceived ease of use, Intention to use, Attitude)
intention to use, and attitude”—are presented in Table 5. The T-test results indicate that
the post-test perceptions of CFP students regarding "perceived usefulness" (t = − 2.34,
p = 0.027 < 0.05), "perceived ease of use" (t = − 2.84, p = 0.009 < 0.01), and "intention
to use" (t = − 3.07, p = 0.005 < 0.01) were significantly higher than the pre-test percep-
tions of these variables. However, the statistical analysis did not find a significant dif-
ference between the pretest and posttest scores of CFP students in "Attitude" (t = 2.79,
p = 0.100 > 0.05).
After the t-test analysis of score differences, we calculated the effect sizes to assess
the magnitude of the differences between the pre-test and post-test scores. Researchers
often quantify the effect size in a t-test using Cohen’s d (Cohen, 1988). A d value
of 0.2 indicates a small effect size, d = 0.5 a medium effect size, and d = 0.8 a
large effect size. The findings from Table 4 indicate moderate effects on students’
evaluations of ChatGPT in terms of "perceived usefulness" (0.46 > 0.40), "perceived
ease of use" (0.55 > 0.50), and "intention to use" (0.65 > 0.50).
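As an illustration of how such an effect size can be obtained, the sketch below computes Cohen’s d as the mean difference divided by a pooled standard deviation. This is one common variant and is an assumption here, since the paper does not specify which formula was applied to the paired pre- and post-test data; the function name and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(pre_scores, post_scores):
    """Cohen's d using the average of the two sample variances as the pooled SD."""
    pooled_sd = sqrt((stdev(pre_scores) ** 2 + stdev(post_scores) ** 2) / 2)
    return (mean(post_scores) - mean(pre_scores)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical questionnaire ratings, not the study's data.
    pre = [3.0, 3.5, 4.0, 2.5, 3.0]
    post = [3.5, 4.0, 4.0, 3.0, 3.5]
    print(round(cohens_d(pre, post), 2))
```

A positive d means that post-test scores are higher than pre-test scores; its absolute value is then read against the 0.2, 0.5, and 0.8 benchmarks.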
According to our analysis, the “perceived usefulness, perceived ease of use, and intention
to use” of ChatGPT among college students improved significantly after a period
of usage. This finding suggests that students came to perceive ChatGPT as more practical and
user-friendly and were more inclined to use it again. Conversely, these improvements did not
translate into a more favorable “attitude towards utilization.” This implies that while
students were receptive to using ChatGPT for practical purposes, the perceived benefits
did not result in favorable emotional responses or override preexisting inclinations.
This highlights a discrepancy between utilitarian and affective considerations. Further
investigation is necessary to determine which other factors, in addition to the perception
of utility, influence students’ attitudes.
Discussions
The primary objective of this research was to investigate the impact of ChatGPT on
programming learning among college students. To this end, the study compared two
learning modes, traditional self-directed programming and ChatGPT-facilitated
programming, in terms of college students’ programming behaviors and programming
performance. Additionally, the study explored students’ perceptions of
ChatGPT-facilitated programming learning.
Firstly, the results showed that students in each class employed distinct learning strat-
egies depending on the context and available resources. In terms of frequency analysis,
coding in Python was the most frequent behavior observed in both classes, followed by
either understanding Python code or reading feedback. However, ChatGPT enabled stu-
dents to paste their original codes or complete error messages, allowing them to receive
personalized feedback more easily. This personalized feedback was found to facilitate
their programming learning (Sun et al., 2021c). As highlighted by Chen et al. (2023),
the improvement in programming learning performance through the use of ChatGPT
mostly stems from the provision of code explanations and debugging assistance. In addi-
tion to this, Yilmaz and Yilmaz (2023a) noted the features of ChatGPT that facilitate
personalized learning, provide clear explanations and programming examples, enable
inquiries and searches, and offer resources on advanced topics. Hands-on experience and
participation in coding activities remain crucial to learning Python. By analyzing how
college students in a ChatGPT-enabled coding course approach learning, we may better
understand how AI can supplement and improve education rather than replace established
methods (Wang et al., 2023). Teachers and college students can apply the findings of our
research to adapt their teaching and learning approaches and make the most of tools such
as ChatGPT. Our study also opens an important conversation about how to use these
AI-powered tools ethically and responsibly, and it offers a chance to look into the best
ways to help people learn while maintaining academic integrity. A concern that emerged
during our investigation was the possibility that students might use ChatGPT or comparable
resources inappropriately, for instance by including generated content in their work
without proper acknowledgment. Scholars such as Iqbal et al. (2022) have
emphasized the need for transparent communication and explicit protocols regarding
the use of AI assistance to enhance the learning experience while upholding
rigorous academic standards.
Furthermore, the results of this research indicated that there was no statistically
significant difference in programming performance between students in the CFP mode and
those in the SDP mode. This suggests that a basic implementation of ChatGPT as a
facilitator in the programming course does not yield a considerable improvement
in student programming performance compared with the traditional instructional
approach. This result differs from the findings of Yilmaz and Yilmaz (2023a), which suggest
that programming instruction aided by ChatGPT and similar tools can enhance students’
programming abilities through code explanations and diagnostic assistance. At the same
time, this study, like Qureshi’s (2023), found that students who used ChatGPT achieved
higher scores in programming. However, ChatGPT’s code contained errors and
contradictions, hindering students from achieving perfect scores in both investigations.
Contextual elements specific to this study could explain the divergence between the
present findings and those of previous investigations. Other potential influences
include the sample size, the expertise level of the participants, the quality of
instruction, and the specific evaluation methods employed. The
results of this study indicate that the use of ChatGPT in isolation does not yield
a substantial benefit in comparison to conventional self-directed approaches to
learning programming. However, proponents argue that the use of AI in schools does not
of the results obtained, it is possible to deduce that the intervention produced positive
outcomes with respect to the students’ perceptions of, and intentions to use, the technology.
However, the impact of the intervention on their attitudes was negligible. The students’
willingness to adopt the technology can be attributed to the benefits they perceive
as well as the technology’s user-friendly characteristics. The lack of a significant
change in attitude, however, indicates that students may still hold reservations or
conflicting views about using it. In general, the
results indicate that college students perceive ChatGPT as advantageous and intuitive
and are inclined to integrate it into their future academic and vocational pursuits.
This study highlights the importance of understanding the potential advantages and
disadvantages of integrating AI language models into programming education, as well as
the potential impact of these technologies on the perspectives and attitudes of
teachers and learners towards technology.
Implications
Considering the research findings, this study proposes pedagogical and developmental
suggestions for the future integration and implementation of ChatGPT. To begin with,
from an instructional standpoint, teachers ought to remain aware of the potential merits
and demerits of incorporating ChatGPT or other AI language models into their
computer programming courses. While ChatGPT can
provide benefits like personalized feedback and contextualized responses, it can also
produce inaccurate and inconsistent code (Chen et al., 2023). Therefore, instructors
should plan and implement AI-based educational resources and systems carefully
and thoughtfully. For instance, instructors could improve students’ understanding of the
material by providing clear explanations and programming examples of how to use
ChatGPT. A "prompt" establishes the context for a task or body of text that the
language model then completes. Constructing an appropriate prompt
is crucial, as the model strives to generate text that aligns with the context established by
the initial prompt (Liu et al., 2023; Reynolds & McDonell, 2021).
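To make the idea of prompt construction concrete, the following sketch assembles a debugging prompt from a task description, the student’s code, and the console message, mirroring the pasting behaviors (PPC and PCM) observed in the CFP mode. The function name, the wording of the template, and the example inputs are hypothetical; this is one possible layout, not a prompt used in the study.

```python
def build_debug_prompt(task, code, console_message):
    """Combine task, code, and console output into a single prompt string."""
    return (
        "You are helping a student learn Python.\n\n"
        f"Task description:\n{task}\n\n"
        f"Student code:\n{code}\n\n"
        f"Console output:\n{console_message}\n\n"
        "Explain the likely cause of the error and suggest a fix."
    )


if __name__ == "__main__":
    # Hypothetical example inputs for illustration.
    prompt = build_debug_prompt(
        task="Print the sum of the numbers from 1 to 10.",
        code="total = 0\nfor i in range(1, 11)\n    total += i\nprint(total)",
        console_message="SyntaxError: expected ':'",
    )
    print(prompt)
```

Stating the task, the code, and the error together gives the model the full context in one message, which is the property of a prompt that the paragraph above emphasizes.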
In addition, during the ChatGPT-facilitated programming process, instructors need
to integrate effective strategies for using ChatGPT, and they should also monitor
students closely for instances of academic dishonesty, such as plagiarism, when using
ChatGPT or other web resources. On the development level, the accuracy and
consistency of the model’s generated code are the top priorities. As highlighted by this
study and previous studies (Jalil et al., 2023; Yilmaz & Yilmaz, 2023b), inaccuracies in
the code generated by ChatGPT limit its effectiveness and may lead to instances of
academic dishonesty such as plagiarism. By improving the accuracy and consistency of the
generated code, ChatGPT can become a more reliable and valuable tool for
programming learning. Developers can also consider integrating into ChatGPT a framework
for heuristic guidance modeled after the Socratic method of questioning.
Rather than providing direct answers to programming problems, this framework
should guide students toward solving problems by asking sequential and probing
questions. By encouraging students to think creatively and critically, the heuristic guidance
framework can help enhance their problem-solving skills. Additionally, this framework
has the potential to provide students with a more personalized learning experience by
Appendix A
Interview protocol
Do you like using ChatGPT and why?
What services does ChatGPT offer you?
What do you think about using ChatGPT in this course so far?
At what stages of program design do you usually use ChatGPT?
How do you use ChatGPT during the problem-solving process in programming?
What do you think is good about ChatGPT?
What are the biggest problems you’ve encountered with ChatGPT?
Appendix B
Survey for the perception of ChatGPT
Name.
Sex.
Author contributions
Dan Sun conceptualized the study, designed the methodology to conduct formal analysis, and wrote the original draft.
Azzeddine Boudouaia wrote the original draft and conducted data analysis. Chengcong Zhu conducted data collection
and analysis. Yan Li proofread the manuscript. All authors revised the draft and contributed to the final manuscript.
Funding
This work was supported by funding from the National Natural Science Foundation of China [Grant No. 62307011] and
the Zhejiang University [Grant No. S20230157] and Graduate Education Research Center of Zhejiang University [Grant
No. YJSJY20240101].
Declarations
Competing interests
There is no competing interest to declare.
References
Bau, D., Gray, J., Kelleher, C., Sheldon, J., & Turbak, F. (2017). Learnable programming: Blocks and beyond. Communications
of the ACM, 60(6), 72–80. https://doi.org/10.48550/arXiv.1705.09413
Chen, B., Resendes, M., Chai, C. S., & Hong, H. Y. (2017). Two tales of time: Uncovering the significance of sequential pat-
terns among contribution types in knowledge-building discourse. Interactive Learning Environments, 25(2), 162–175.
https://doi.org/10.1080/10494820.2016.1276081
Chen, E., Huang, R., Chen, H. S., Tseng, Y. H., & Li, L. Y. (2023). GPTutor: A ChatGPT-powered programming tool for code
explanation. arXiv preprint arXiv:2305.01863.
Cheng, G. (2019). Exploring factors influencing the acceptance of visual programming environment among boys and
girls in primary schools. Computers in Human Behavior, 92, 361–372. https://doi.org/10.1016/j.chb.2018.11.043
Chevalier, M., Giang, C., Piatti, A., & Mondada, F. (2020). Fostering computational thinking through educational robotics: A
model for creative computational problem solving. International Journal of STEM Education, 7(1), 1–18. https://doi.org/10.
1186/s40594-020-00238-z
Cohen, J. (1988). The effect size (pp. 77–83). Routledge.
Cohen, L., Manion, L., & Morrison, K. (2013). Research methods in education. Routledge.
Davis, F. D. (1989). Perceived usefulness, perceived ease of use and user acceptance of information technology. MIS Quarterly,
13(3), 319–340.
Elkhodr, M., Gide, E., Wu, R., & Darwish, O. (2023). ICT students’ perceptions towards chatgpt: An experimental reflective lab
analysis. STEM Education, 3(2), 70–88. https://doi.org/10.3934/steme.2023006
Faraone, S. V., & Dorfman, D. D. (1987). Lag sequential analysis: Robust statistical methods. Psychological Bulletin, 101(2),
312–323. https://doi.org/10.1037/0033-2909.101.2.312
Firaina, R., & Sulisworo, D. (2023). Exploring the usage of ChatGPT in higher education: Frequency and impact on productivity.
Buletin Edukasi Indonesia, 2(01), 39–46. https://doi.org/10.56741/bei.v2i01.310
Gervasi, O., Murgante, B., Misra, S., Garau, C., Blečić, I., Taniar, D., Apduhan, B. O., Rocha, A. M. A. C., Tarantino, E., & Torre, C. M.
(2021). Inference engines performance in reasoning tasks for intelligent tutoring systems. Computational science and its
applications (pp. 471–482). Springer International Publishing AG. https://doi.org/10.1007/978-3-030-86960-1_33
Ghatrifi, M. O., Amairi, J. S., & Thottoli, M. M. (2023). Surfing the technology wave: An international perspective on enhancing
teaching and learning in accounting. Computers and Education. Artificial Intelligence, 4, 100144. https://doi.org/10.1016/j.
caeai.2023.100144
ICOURSE. Retrieved 28 July, 2023 from https://www.icourse163.org/course/BIT-268001?from=searchPage&outVendor=
zw_mooc_pcssjg_
Iqbal, N., Ahmed, H., & Azhar, K. A. (2022). Exploring teachers’ attitudes towards using Chat GPT. Global Journal for Management
and Administrative Sciences, 3(4), 97–111. https://doi.org/10.46568/gjmas.v3i4.163
Jalil, S., Rafi, S., LaToza, T. D., Moran, K., & Lam, W. (2023). Chatgpt and software testing education: Promises & perils. In 2023 IEEE
international conference on software testing, verification and validation workshops (ICSTW) (pp. 4130–4137). IEEE.
Jancheski, M. (2017). Improving teaching and learning computer programming in schools through educational software.
Olympiads in Informatics, 11(1), 55–75. https://doi.org/10.15388/ioi.2017.05
Javaid, M., Haleem, A., Singh, R. P., Khan, S., & Khan, I. H. (2023). Unlocking the opportunities through ChatGPT tool towards
ameliorating the education system. BenchCouncil Transactions on Benchmarks, Standards and Evaluations, 3(2), 100115.
https://doi.org/10.1016/j.tbench.2023.100115
Khan, I., Al Sadiri, A., Ahmad, A. R., & Jabeur, N. (2019). Tracking student performance in introductory programming by means
of machine learning. In Proceedings of MEC international conference on big data and smart city (ICBDSC) (pp.1–6).
Klasnja-Milićević, A., Vesin, B., Ivanović, M., Budimac, Z., & Jain, L. C. (2016). Case study: Design and implementation of program-
ming tutoring system. Springer International Publishing AG.
Limna, P., Kraiwanit, T., Jangjarat, K., Klayklung, P., & Chocksathaporn, P. (2023). The use of chatgpt in the digital era: Perspectives
on Chatbot Implementation. Journal of Applied Learning & Teaching, 6(1), 64–74. https://doi.org/10.37074/jalt.2023.6.1.32
Lo, C. K. (2023). What is the impact of chatgpt on education? A Rapid Review of the literature. Education Sciences, 13(4), 410.
https://doi.org/10.3390/educsci13040410
Looi, C. K., How, M. L., Wu, L. K., Seow, P., & Liu, L. (2018). Analysis of linkages between an unplugged activity and the develop-
ment of computational thinking. Computer Science Education, 28(3), 255–279. https://doi.org/10.1080/08993408.2018.
1533297
Lu, O. H. T., Huang, J. C. H., Huang, A. Y. Q., & Yang, S. J. H. (2017). Applying learning analytics for improving students engage-
ment and learning outcomes in an MOOCs enabled collaborative programming course. Interactive Learning Environ-
ments, 25(2), 220–234. https://doi.org/10.1080/10494820.2016.1278391
Malik, S. I., Ashfque, M. W., Tawafak, R. M., Al-Farsi, G., Ahmad Usmani, N., & Hamza Khudayer, B. (2022). A chatbot to facilitate
student learning in a programming 1 course: A gendered analysis. International Journal of Virtual and Personal Learning
Environments, 12(1), 1–20. https://doi.org/10.4018/IJVPLE.310007
Mousavinasab, E., Zarifsanaiey, N., NiakanKalhori, R. S., Rakhshan, M., Keikha, L., & Ghazi Saeedi, M. (2021). Intelligent tutoring
systems: A systematic review of characteristics, applications, and evaluation methods. Interactive Learning Environments,
29(1), 142–163. https://doi.org/10.1080/10494820.2018.1558257
Nurbekova, Z., Tolganbaiuly, T., Nurbekov, B., Sagimbayeva, A., & Kazhiakparova, Z. (2020). Project-based learning technol-
ogy: An example in programming microcontrollers. International Journal of Emerging Technologies in Learning, 15(11),
218–227. https://doi.org/10.3991/ijet.v15i11.13267
OpenAI. (2023). Introducing ChatGPT. Retrieved July 30, from https://openai.com/blog/chatgpt
Peng, R., Hu, Q., & Kouider, B. (2023). Teachers’ acceptance of online teaching and emotional labor in the EFL context. Sustain-
ability, 15(18), 13893–13893. https://doi.org/10.3390/su151813893
Qureshi, B. (2023). Exploring the use of chatgpt as a tool for learning and assessment in undergraduate computer science curricu-
lum: Opportunities and challenges. arXiv preprint arXiv:2304.11214.
Rahman, M. M., & Watanobe, Y. (2023). ChatGPT for education and research: Opportunities, threats, and strategies. Applied
Sciences, 13(9), 5783.
Ray, B., Posnett, D., Filkov, V., & Devanbu, P. (2014). A large scale study of programming languages and code quality in github.
In Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering (pp. 155–165).
Sánchez, R. A., & Hueros, A. D. (2010). Motivational factors that influence the acceptance of Moodle using TAM. Computers in
Human Behavior, 26(6), 1632–1640. https://doi.org/10.1016/j.chb.2010.06.011
Shen, G., Yang, S., Huang, Z., Yu, Y., & Li, X. (2023). The prediction of programming performance using student profiles. Educa-
tion and Information Technologies, 28(1), 725–740. https://doi.org/10.1007/s10639-022-11146-w
Sivasakthi, M. (2017). Classification and prediction based data mining algorithms to predict students’ introductory program-
ming performance. In Proceedings of international conference on inventive computing and informatics (ICICI) (pp. 346–350).
Skalka, J., Drlik, M., Benko, L., Kapusta, J., Pino, D., Rodríguez, J. C., Smyrnova-Trybulska, E., Stolinska, A., Svec, P., & Turcinek, P.
(2021). Conceptual framework for programming skills development based on microlearning and automated source
code evaluation in virtual learning environment. Sustainability, 13(6), 3293. https://doi.org/10.3390/su13063293
Stehle, S. M., & Peters-Burton, E. E. (2019). Developing student 21st century skills in selected exemplary inclusive STEM high
schools. International Journal of STEM Education, 6(1), 1–15. https://doi.org/10.1186/s40594-019-0192-1
Sun, D., Ouyang, F., Li, Y., & Zhu, C. (2021b). Comparing learners’ knowledge, behaviors, and attitudes between two instruc-
tional modes of computer programming in secondary education. International Journal of STEM Education, 8(1), 54–54.
https://doi.org/10.1186/s40594-021-00311-1
Sun, D., Ouyang, F., Li, Y., & Chen, H. (2021c). Three contrasting pairs’ collaborative programming processes in China’s second-
ary education. Journal of Educational Computing Research, 59(4), 740–762. https://doi.org/10.1177/0735633120973430
Sun, L., Hu, L., & Zhou, D. (2021d). Which way of design programming activities is more effective to promote K-12 students’
computational thinking skills? A meta-analysis. Journal of Computer Assisted Learning. https://doi.org/10.1111/jcal.12545
Sun, D., Xu, F., & Li, Y. (2021a). Using learning analytics in understanding students’ behavior in block-based and text-based
programming modality. In 2021 tenth international conference of educational innovation through technology (EITT). https://
doi.org/10.1109/eitt53287.2021.00060
Surameery, N. M. S., & Shakor, M. Y. (2023). Use chatgpt to solve programming bugs. International Journal of Information Tech-
nology & Computer Engineering (IJITC), 3(1), 17–22.
Teo, T., Luan, W. S., & Sing, C. C. (2008). A cross-cultural examination of the intention to use technology between Singaporean
and Malaysian pre-service teachers: An application of the technology acceptance model (TAM). Educational Technology
and Society, 11(4), 265–280.
Thongkoo, K., Daungcharone, K., & Thanyaphongphat, J. (2020). Students’ acceptance of digital learning tools in program-
ming education course using technology acceptance model. IEEE Xplore. https://doi.org/10.1109/ECTIDAMTNCON482
61.2020.9090771
Tian, H., Lu, W., Li, T. O., Tang, X., Cheung, S. C., Klein, J., & Bissyandé, T. F. (2023). Is ChatGPT the ultimate programming
assistant–How far is it? arXiv preprint arXiv:2304.11938.
Tlili, A., Shehata, B., Adarkwah, M. A., Bozkurt, A., Hickey, D. T., Huang, R., & Agyemang, B. (2023). What if the devil is my guardian
angel: ChatGPT as a case study of using chatbots in education. Smart Learning Environments, 10(1), 15.
Tom, M. (2015). Five cs framework: A student-centered approach for teaching programming courses to students with diverse
disciplinary background. Journal of Learning Design, 8(1), 21–27. https://doi.org/10.5204/jld.v8i1.193
Venkatesh, V., & Davis, F. D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field stud-
ies. Management Science, 46(2), 186–204. https://doi.org/10.1287/mnsc.46.2.186.11926
Wang, H., Tlili, A., Huang, R., Cai, Z., Li, M., Cheng, Z., Yang, D., Li, M., Zhu, X., & Fei, C. (2023). Examining the applications of intel-
ligent tutoring systems in real educational contexts: A systematic literature review from the social experiment perspec-
tive. Education and Information Technologies, 28(7), 9113–9148. https://doi.org/10.1007/s10639-022-11555-x
Xie, Y., Boudouaia, A., Xu, J., AL-Qadri, A. H., Khattala, A., Li, Y., & Aung, Y. M. (2023). A study on teachers’ continuance intention to
use technology in English instruction in western China junior secondary schools. Sustainability, 15(5), 4307. https://doi.
org/10.3390/su15054307
Yang, F.-Y., & Tsai, C.-C. (2008). Investigating university student preferences and beliefs about learning in the web-based
context. Computers and Education, 50, 1284–1303.
Yilmaz, R., & Yilmaz, F. G. K. (2023a). Augmented intelligence in programming learning: Examining student views on the use
of ChatGPT for programming learning. Computers in Human Behavior: Artificial Humans, 1(2), 100005. https://doi.org/10.
1016/j.chbah.2023.100005
Yilmaz, R., & Yilmaz, F. G. K. (2023b). The effect of generative artificial intelligence (AI)-based tool use on students’ computa-
tional thinking skills, programming self-efficacy and motivation. Computers and Education. Artificial Intelligence, 4, 100147.
https://doi.org/10.1016/j.caeai.2023.100147
Yilmaz, R., & Yilmaz, F. G. K. (2023c). The effect of generative artificial intelligence (AI) based tool use on students’ computa-
tional thinking skills, programming self-efficacy and motivation. Computers and Education: Artificial Intelligence, 4, 100147.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Dan Sun (Ph.D.) is an assistant professor in Jing Hengyi School of Education at Hangzhou Normal Uni-
versity. Her research interests include artificial intelligence in education, programming education, learning
analytics and computational thinking, etc.
Azzeddine Boudouaia (Ph.D.) is a postdoctoral fellow in educational technology at the College of Edu-
cation, Zhejiang University, PR China. His research interests include curriculum studies, technology and arti-
ficial intelligence in EFL education, EFL teaching approaches, and teacher professional development.
Chengcong Zhu is a high school teacher in Xiaoshan High School. His research interests include infor-
mation technology, programming education, etc.
Yan Li (Ph.D.) is a professor in College of Education at Zhejiang University. Her research interests include
e-learning, distance education, ICT education, etc.