Unleashing ChatGPT’s Power: A Case Study on Optimizing Information Retrieval in Flipped Classrooms via Prompt Engineering
Abstract—This research project investigates the impact of prompt engineering, a key aspect of chat generative pretrained transformer (ChatGPT), on college students’ information retrieval in flipped classrooms. In recent years, an increasing number of students have been using AI-based tools, such as ChatGPT, rather than traditional search engines to learn and to complete course assignments. Despite this growing trend, previous research has largely overlooked the influence of prompt engineering on students’ use of ChatGPT and effective strategies for improving the quality of information retrieval in learning settings. To address this research gap, this study examines the information quality obtained from ChatGPT in a flipped classroom by evaluating its effectiveness in task completion among 26 novice undergraduates from the same major and cohort. The experimental results provide evidence that proficient mastery of prompt engineering improves the quality of information obtained by students using ChatGPT. Consequently, by acquiring proficiency in prompt engineering, students can maximize the positive impact of ChatGPT, obtain high-quality information, and enhance their learning efficiency in flipped classrooms.

Index Terms—Chat generative pretrained transformer (ChatGPT), flipped classrooms, information retrieval, prompt engineering.

Manuscript received 31 July 2023; revised 15 September 2023; accepted 27 September 2023. Date of publication 16 October 2023; date of current version 24 January 2024. This work was supported in part by the National Natural Science Foundation of China under Grant 61976050 and Grant 61972384, in part by the Ministry of Education, China under Grant 2021BCF01002, and in part by the Jilin Provincial Department of Education Social Science Research Planning Project under Grant JJKH20221137SK and Grant JJKH20231252SK. (Mo Wang and Minjuan Wang are co-first authors.) (Corresponding authors: Xin Xu; Minghao Yin.)
Research with minimal or no impact on participants is exempted by the University where the study was conducted. In addition, participating in this study was completely voluntary and with full consent of the participants.
Mo Wang, Xin Xu, Lanqing Yang, and Minghao Yin are with the Northeast Normal University, Changchun 130024, China (e-mail: wangm875@nenu.edu.cn; [email protected]; [email protected]; [email protected]).
Minjuan Wang is with Learning Design and Technology, San Diego State University, San Diego, CA 92182 USA (e-mail: [email protected]).
Dunbo Cai is with the Center for Technology Research and Innovation, China Mobile (Suzhou) Software Technology Company Ltd., Suzhou 215000, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/TLT.2023.3324714

I. INTRODUCTION

ChatGPT, an artificial intelligence (AI) conversational chatbot developed by OpenAI, has garnered worldwide interest since its release on November 30, 2022 [1]. As commented in the journal Nature, “Conversational AI is a game-changer for science” [2]. Conversational AI tools like chat generative pretrained transformer (ChatGPT) [3] and ChatGLM [4] employ large language models (LLMs) to mimic human conversation, comprehending and providing responses to user queries in a manner reminiscent of human interaction. These AI-based chatbots have proven to be invaluable assets in various sectors, such as customer service [5] and healthcare [6], as they can optimize workflow efficiencies, minimize expenses, and elevate the overall user experience.

ChatGPT is an advanced language model with the remarkable capability of generating text that closely resembles human language in response to given prompts [7]. ChatGPT can be prompted by providing a concise description of a code snippet, along with pertinent details and constraints. To illustrate, suppose we require a function to compute the average of a set of numbers. In this case, the following prompt can be used to initiate ChatGPT: “Write a Python function that takes a list of numbers as an argument and returns their average.”

AI-based chatbots have the potential to significantly improve flipped learning [8]. The flipped classroom pedagogy, also known as the inverted classroom, is an innovative and widely embraced teaching approach. It involves the transformation of traditional in-class activities, such as instructor presentations, into homework assignments, while projects or tasks typically assigned as homework are completed during class time [9], [10], [11], [12].

Recent meta-analyses have indicated that flipped learning has the potential to enhance student achievement across various subject disciplines [13], [14]. However, it is essential to acknowledge that implementing this approach comes with its fair share of challenges. Akçayır et al. [9] discovered two prominent issues associated with flipped learning. First, students often lack proper guidelines or instructions when studying at home. Second, they encounter difficulty in seeking assistance during the preclass learning phase, which subsequently hinders their active participation in in-class activities. Many studies have shown that AI-based chatbots can play a variety of roles, such as learning chatbots [15], [16], [17], assistant chatbots [18], [19], and mentor chatbots [20], [21]. AI-based chatbots can address these challenges by providing students with 24/7 assistance and personalized support, thereby enhancing their engagement in preclass learning activities [8].
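To make the example prompt from this section concrete (“Write a Python function that takes a list of numbers as an argument and returns their average.”), the following is a minimal sketch of the kind of function such a prompt might elicit. It is illustrative code written for this purpose, not output reproduced from ChatGPT. The function name `average` and the decision to raise an error for an empty list are assumptions; the prompt itself leaves both unspecified.

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    # Assumption: an empty list is treated as an error, since the prompt
    # does not say what should happen in that case.
    if not numbers:
        raise ValueError("average() requires at least one number")
    return sum(numbers) / len(numbers)


if __name__ == "__main__":
    # Sum is 12 and there are 3 items, so this prints 4.0.
    print(average([2, 4, 6]))
```

The unspecified empty-list case shows why the details and constraints supplied with a prompt matter: a prompt that stated the expected behavior would remove the ambiguity.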
1939-1382 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: MVJ College of E ngineering - Bengaluru. Downloaded on February 16,2024 at 10:19:38 UTC from IEEE Xplore. Restrictions apply.
630 IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 17, 2024
Although there are many studies on flipped classrooms and AI-based chatbots [14], [22], [23], few have focused on how to improve students’ skills in using chatbots to obtain high-quality responses, thus highlighting the need for further research in this area. ChatGPT has been criticized for relying on biased data and potentially providing inaccurate or false information [24]. Issues can therefore arise when students use ChatGPT without guidance. It is thus imperative to strengthen students’ ability to use ChatGPT, so that they avoid harmful information when it is incorporated into a flipped classroom setting.

Prompt engineering, a technique involving strategically designed input prompts, plays a crucial role in obtaining more precise responses from AI-based chatbots. These responses are generated by LLMs, which are trained on a substantial amount of content and generally exhibit the ability to generate accurate results based on task descriptions. In this context, the task description is referred to as a prompt. Unlike human communication, which often relies on ambiguous cues, LLMs require clearer guidance or specific phrasing to achieve optimal understanding and response generation. Prompt engineering encompasses various elements, such as questions, keywords, and contextual information, all aimed at enhancing the model’s comprehension of user needs and improving response accuracy.

Effective prompts empower users to leverage the powerful capabilities of LLMs, obtaining accurate and relevant responses that enhance work efficiency and problem-solving capabilities. By giving the model a better understanding of user needs, prompt engineering has the potential to significantly improve the user experience and users’ satisfaction with the tool, and thus to realize the optimal use of LLMs.

Having a strong grasp of prompt engineering techniques is essential for unlocking the full benefits discussed in the previous paragraph. Mastery of these techniques allows users to effectively harness the potential of LLMs. By leveraging prompts in the right way, users gain access to the powerful capabilities of LLMs, resulting in responses that are more accurate and relevant. This, in turn, leads to improved work efficiency and enhanced problem-solving capabilities. Understanding how to effectively use prompts guides users in maximizing the potential of LLMs so as to reap the benefits of prompt engineering.

Despite the significance of prompt engineering highlighted in numerous studies and the exploration of methods to elicit high-quality responses from LLMs [25], [26], [27], [28], [29], there remains a research gap regarding the impact of prompt engineering on students’ participation in ChatGPT-enabled flipped classrooms. Therefore, the primary objective of our study is to investigate the influence of prompt engineering on students’ retrieval of information and knowledge in flipped learning. It aims to address two specific questions as follows:

Q1: Does mastering prompt engineering methods help improve the quality of information students obtain from ChatGPT?
Q2: How can the content of prompt engineering be effectively arranged in teaching to enhance the quality and efficiency of flipped classroom instruction?

By incorporating generative AI into teaching methods, educators can assist students in developing future-proof skills that enable them to excel in the job market and adeptly adapt to emerging challenges. For students who have grown up in an education system centered on skill acquisition, there can be a significant challenge in reconciling the skills they have acquired during their studies with those demanded by the future job market. For example, the advent of the first industrial revolution brought about disruption in the light industry, leading to the replacement of manual labor with machines. Similarly, generative AI technologies such as ChatGPT have the potential to replace a substantial number of jobs that only require fundamental skills. During the era of mechanization, the appropriate response was to acquire knowledge of mechanics and become adept at operating machinery. The same logic seems to apply today with the wide use of generative AI. Instead of shying away from this technology, individuals should strive to learn and become proficient in using generative AI systems. Hence, conducting research on the impact of generative AI on current teaching methods and exploring effective ways to use generative AI to enhance teaching is timely and can make a significant contribution to this field of study.

II. RELATED WORK

AI-generated content (AIGC) encompasses content generated through deep learning models like generative pretrained transformer (GPT). These advanced technologies exhibit remarkable proficiency in processing diverse data formats, including natural language, images, audio, and video. By harnessing a variety of multimodal data sources, including tutorial videos, academic papers, and other reliable information, AIGC holds the promise of achieving substantial progress in the field of education [1]. The incorporation of these diverse data sources can greatly enhance the personalized educational experience. Google Research introduced Minerva [30], an advanced model that builds upon PaLM [31] while incorporating a science-and-math-focused dataset. The proposed approach attains state-of-the-art performance in reasoning tasks by employing a combination of innovative techniques, such as few-shot prompting, majority voting, and chain of thought (scratchpad prompting) [1].

A. ChatGPT

ChatGPT, developed by OpenAI, is an advanced conversational chatbot powered by the GPT-3 language model [32]. It produces text that closely resembles human language in response to prompts and engages in open-ended conversations [7]. Its training uses a “prompt–response” dialogue structure, incorporating reinforcement learning with a human-in-the-loop approach. This approach involves gathering feedback from humans, who rank the model’s responses, allowing for fine-tuning through proximal policy optimization. ChatGPT demonstrates exceptional capabilities, such as answering follow-up questions, acknowledging errors, challenging incorrect assumptions, and refusing inappropriate queries. Despite receiving subpar grades, ChatGPT demonstrates the potential to acquire a university
degree [33]. ChatGPT has attracted an impressive 100 million active users in a mere two months since its inception [34].

The ChatGPT model employs the Transformer architecture, leveraging vast corpora during training to facilitate a thorough understanding and generation of human language. The technical specifications of the ChatGPT model include the following.

1) Transformer Architecture: ChatGPT is built upon the powerful foundation of the Transformer architecture, serving as the backbone for numerous state-of-the-art models [1], such as GPT-3 [25], DALL-E-2 [35], Codex [36], and Gopher [37]. The Transformer is a deep neural network architecture that revolutionized “sequence-to-sequence” tasks by introducing self-attention mechanisms. It comprises an encoder–decoder structure [38], where the encoder transforms the input sequence into intermediate hidden representation vectors. These vectors are then processed by the decoder to generate the target sequence. By leveraging this architecture, ChatGPT gains a deep understanding of the contextual meaning in natural language, allowing it to produce coherent and contextually relevant responses.
2) Self-Attention Mechanism: The self-attention mechanism plays a crucial role in the Transformer model [1]. It enables ChatGPT to focus on specific elements of the input during processing and assign weights to them according to their significance. By employing this mechanism, ChatGPT can effectively capture long-range dependencies in language and incorporate contextual information into its generated responses.
3) Large-scale Pretraining: Through pretraining on a vast corpus of text, ChatGPT acquires language understanding capabilities. During the pretraining phase, ChatGPT is trained to anticipate the succeeding word or phrase in a given context [39], which helps it grasp statistical patterns and language structures. This contributes to its comprehension of language concepts, grammar, and semantics. Through this pretraining process, ChatGPT gains extensive linguistic knowledge, empowering it to generate responses using flexible language reasoning.
4) Fine-Tuning and Response Generation: After pretraining, ChatGPT enhances its performance through fine-tuning on specific tasks. For dialogue generation tasks, ChatGPT uses a supervised learning approach, where human experts provide example dialogue data to fine-tune the model. This process enables ChatGPT to generate responses that are contextually relevant and meaningful.
5) Interactive Response and Model Iteration: Through user interactions, ChatGPT undergoes continuous improvement. When users provide input, ChatGPT considers it as context and generates an appropriate response. The interactive feedback can be used to refine the model, improving the quality and accuracy of the generated replies. Through this iterative process, ChatGPT can steadily optimize its conversational generation capabilities.

ChatGPT is a powerful automated conversational system based on the Transformer architecture, specifically designed for natural language processing. It achieves dialogue generation through a combination of large-scale pretraining and fine-tuning. By using self-attention mechanisms, ChatGPT effectively captures contextual information and generates meaningful responses based on user input. Through iterative refinement and interactive learning, ChatGPT continuously enhances its dialogue generation capabilities.

Despite being a powerful natural language processing model, ChatGPT does possess certain notable limitations.

1) Lack of Common Sense and Deep Understanding [40]: Despite extensive pretraining and possessing a broad range of linguistic knowledge, ChatGPT still lacks an accurate grasp of real-world common sense and deep understanding. In some situations, it may generate responses that seem reasonable but are actually incorrect or absurd. Bian et al. [41] highlighted that while GPT exhibits the capacity to effectively generate common knowledge by using “Prompt” and displays competence in responding to general knowledge inquiries, it faces challenges when it comes to addressing specific types of knowledge and lacks the ability to precisely identify the common knowledge needed to answer specific questions.
2) Limitations of Controlling Generated Content: The control over the generated responses of ChatGPT is relatively limited due to its foundation in large-scale unsupervised pretraining and supervised fine-tuning [42]. ChatGPT may generate inappropriate, offensive, or inaccurate replies, particularly when faced with sensitive topics or culturally diverse scenarios. The challenge persists in ensuring that ChatGPT generates responses that align with user expectations. Dave et al. [43] highlighted the inherent lack of controllability in ChatGPT’s output, which raises concerns regarding its applicability in the healthcare domain. Specifically, they identified potential issues, such as copyright violations, inadequate handling of complex medical legalities, and a failure to meet the growing demand for transparency in AI-generated content.
3) Contextual Oversensitivity: ChatGPT’s context handling can be excessively sensitive, leading to conservative and repetitive conversation generation. It often produces replies that resemble previous dialogue, lacking innovation and diversity. Consequently, this diminishes the user experience by resulting in less fluid and varied conversations.
4) Easily Misleading: The training of ChatGPT involves a vast corpus of Internet text, encompassing information from diverse sources that may contain errors, biases, and inaccuracies. Consequently, the model may demonstrate biases, misunderstandings, or disseminate incorrect information in its responses. To prevent potential misinformation or the misleading of users, it is crucial to approach the model’s generated responses with caution and diligently verify them. The use of ChatGPT warrants caution due to its potential to mislead both authors and readers. The model has been observed to produce wrong facts, generate references that do not exist, and exhibit a tendency to staunchly and persuasively support assertions that may be untrue. Consequently, it is crucial to employ this tool
ethically and prioritize its application for the betterment of humanity [44].
5) Vulnerabilities in Adversarial Attacks: ChatGPT and other deep learning models exhibit vulnerability to adversarial attacks, wherein malicious users manipulate the model by providing specific inputs that lead to the generation of deceptive, harmful, or inappropriate responses. Liu et al. [45] highlighted that users have the ability to launch adversarial attacks on ChatGPT, leading the model to be influenced by specific conversational contexts. This manipulation involves introducing irrelevant premises into the generated content, potentially resulting in inappropriate responses. Although certain defense measures have been implemented to mitigate these attacks, the challenge of addressing adversarial attacks remains an ongoing research area that requires further investigation and resolution.

In addition to the aforementioned limitations, according to the research conducted by Lo et al. [24], ChatGPT exhibits significant variability in its performance across various domains. This performance spectrum encompasses levels ranging from highly satisfactory to acceptable, mediocre, and even poor. Notably, in domains such as economics [46] and programming [47], ChatGPT’s performance has been consistently satisfactory. Moreover, it demonstrates commendable proficiency in English comprehension. However, when applied in fields such as law [33], [48] and medical education [49], [50], its performance is about average. In contrast, ChatGPT’s performance in mathematics [51], sports science [52], and software testing [53] is notably poor.

B. Teaching and Learning With ChatGPT

ChatGPT is steadily gaining prominence in the education field, mirroring the extensive research and application of other AI tools in the educational domain [54], [55], [56], [57]. Leveraging its robust natural language processing capabilities and adept conversational generation skills, ChatGPT holds significant potential in education and learning support [24]. ChatGPT engages with students and educators, offering customized educational support, answering inquiries, and facilitating innovative learning experiences.

In teaching and learning, ChatGPT is mainly used in four areas: 1) instructional preparation (e.g., generating course materials, providing suggestions, and conducting language translation); 2) instructional assessment (e.g., generating assessment tasks and evaluating student performance); 3) instructional support (e.g., assisting students with practice); and 4) instructional enhancement (e.g., improving the effectiveness of existing teaching methods).

In terms of instructional preparation, ChatGPT can provide assistance and suggestions for instructors. A study has shown that ChatGPT is a valuable tool for educators, helping them identify essential curriculum content and providing an outline [58]. Megahed et al. [47] requested ChatGPT to prepare a course outline tailored to an undergraduate statistics course, noting that these instructional suggestions require minimal modifications. According to Zhai [59], ChatGPT provides valuable suggestions in the context of special education, which he described as being particularly advantageous for students with unique learning requirements.

For instructional assessment, instructors can use ChatGPT to craft scenarios for interactive learning, assessments, and student evaluations [60]. As an illustration, Han et al. [61] directed ChatGPT to generate multiple-choice questions for medical topics, including illustrations and experimental values. Nonetheless, Al-Worafi et al. [62] advised only using ChatGPT as a supplementary tool for assessment preparation, cautioning that it may not cover all intended learning objectives and should not replace instructors or human tutors.

For instructional support, Topsakal et al. [63] employed ChatGPT to facilitate English language learning for students through interactive dialogues. Once the accuracy of the materials has been verified, instructors can use ChatGPT to adapt them for tools like Google Dialogflow (available online at https://cloud.google.com/dialogflow). This enables the provision of interactive and personalized learning environments for students.

As to instructional enhancement, ChatGPT has the potential to improve active learning strategies. For instance, Rudolph et al. [64] proposed the use of a flipped classroom approach, wherein students are expected to study preclass materials in preparation for the class. Nevertheless, students in traditional flipped classrooms often face challenges in preclass learning [65], and there is a need to improve classroom engagement [14]. The COVID-19 pandemic has exacerbated this issue, as fully online flipped learning has resulted in diminished classroom engagement and decreased interaction among students [66], [67]. Assuming the role of a virtual tutor, ChatGPT assists students in online independent learning by addressing their queries [68], while also bolstering group dynamics through the provision of discussion structures and real-time feedback [69].

While ChatGPT has achieved considerable success, its implementation in education has brought forth a range of new challenges and potential threats. A particular concern arises from ChatGPT’s potential to enable AI-assisted cheating, as it allows students to “substitute” themselves during exams and written assignments. According to Susnjak [70], the analytical thinking prowess of ChatGPT, coupled with its ability to produce compelling text with minimal guidance, raises credibility concerns. This is particularly significant given their prevalence in higher education. Besides, the text generated by ChatGPT presents challenges for conventional plagiarism detection tools. Ventayen [71] conducted a study wherein ChatGPT authored an article by drawing from existing publications. In addition, Khalil and Er [72] showcased the exceptional content generation capability of ChatGPT by creating 50 articles that yielded average similarity scores of 13.72% and 8.76% when assessed, respectively, by Turnitin and iThenticate (two popular plagiarism detection applications [24]). A recent study revealed that peer reviewers could only identify 63% of fraudulent abstracts produced by ChatGPT, raising significant concerns about AI-driven text in scientific literature [73].
Furthermore, in a comprehensive review of 60 articles encompassing various academic disciplines, Sallam [74] identified the challenges linked to the use of ChatGPT in education, specifically in relation to accuracy and reliability. The research conducted by Mbakwe et al. [75] highlighted that ChatGPT’s training on vast data corpora makes it vulnerable to biases and the potential incorporation of inaccurate information. Such biases may originate from research predominantly conducted in high-income countries or from textbooks that lack relevance to diverse regions. In addition, it should be noted that ChatGPT’s knowledge is based on data available only until 2021 [50], [60], [76], rendering its responses on professional topics and current events susceptible to inaccuracy and unreliability. This limitation becomes particularly worrisome when considering ChatGPT’s potential to generate incorrect or false information [47], [76], [77], posing a significant risk for students who heavily depend on ChatGPT as a source of information. Due to such concerns, some schools have already banned the use of ChatGPT on campus [24]. However, as the saying goes, we should not throw out the baby with the bathwater. As Mhlanga’s research [78] indicates, the impact of using ChatGPT for educational purposes needs immediate attention to ensure maximum use of its advantages while minimizing its disadvantages.

Researchers have conducted thorough investigations into the potential issues associated with integrating ChatGPT into education. They have also put forth a range of strategies to tackle these concerns, which can be categorized into three primary aspects: 1) task design; 2) AI writing detection; and 3) institutional policies.

In terms of task design, researchers have explored various approaches. Zhai [79] proposed the exploration of innovative formats to foster students’ creative and critical thinking. Choi et al. [33] highlighted the significance of demanding students to analyze cases instead of merely recalling knowledge. Geerling et al. [46] proposed that students should be tasked with applying the concepts learned in the course, including the creation of nonreplicable materials for artificial intelligence. According to Stutz et al. [80], future assessments should prioritize higher levels of Bloom’s taxonomy, including application, analysis, and creation.

In terms of AI writing detection and institutional policies, Szabo [52] reported that while conventional plagiarism detection tools may not detect text produced by ChatGPT, AI detectors can still identify such content. In addition, generating accurate reference lists could be a challenge for ChatGPT, and this could serve as a vital indicator for instructors to identify student usage of ChatGPT [32], [77], [81]. In addition to identifying students’ plagiarism behavior, researchers highlight the significance of implementing antiplagiarism guidelines and providing education on academic integrity [32], [64], [72].

While ChatGPT holds immense potential for assisting instructors in tasks like generating course materials, providing suggestions, and serving as a virtual tutor for students through answering questions and facilitating collaboration, it also presents challenges in the form of generating inaccurate or fabricated information and evading plagiarism detectors. Promptly adapting teaching approaches and institutional policies in educational institutions is essential to tackle these challenges. Moreover, instructor training and student education play a vital role in effectively responding to the challenges posed by ChatGPT in the educational landscape [24].

The attitude toward ChatGPT should be to retain its essence while eliminating its shortcomings. Efforts should be made to enhance its ability to accelerate the learning process and improve teaching effectiveness, while also minimizing the harm caused by errors and irrelevant information [82]. The effectiveness of integrating ChatGPT into educational activities hinges on assisting students in obtaining accurate information through its use. In the application of ChatGPT, prompt engineering is extensively employed to guide ChatGPT in avoiding the delivery of irrelevant or erroneous information.

C. Prompt Engineering

Prompt engineering, the art of skillful prompt construction for LLMs like ChatGPT, is essential in guiding the generation of desired responses [83]. Prompts primarily facilitate communication between users and ChatGPT. Prompts provide guidance to ensure that ChatGPT generates responses aligned with the user’s intent. As a result, well-engineered prompts greatly improve the efficacy and appropriateness of ChatGPT’s responses.

Effective prompt engineering is crucial for optimizing models as it involves designing, optimizing, and refining prompts to accurately convey the user’s intent to ChatGPT [84]. Prompt engineering plays a vital role in bridging the gap between user intent and the models’ understanding, thereby significantly impacting the quality of generated replies. Thus, it becomes essential for users to master prompt engineering in order to fully leverage ChatGPT’s potential and achieve optimal results in various applications, considering the direct influence of prompt quality on generated replies.

With the continuous improvement of language models, mastering prompt engineering has become crucial for users to fully unleash the potential of ChatGPT and achieve optimal results in various applications [83]. Numerous studies have explored the impact of prompt engineering on AI generative models across domains, including image generation tasks [85] and classification tasks [86], highlighting the critical role of excellent prompt design for LLMs like ChatGPT [87], [88]. To guide models in generating effective natural language prompts, Reynolds et al. [89] introduced the concept of meta-prompts. In addition, Lo et al. [90] proposed the CLEAR framework, consisting of five fundamental principles that enhance interaction with AI language models and facilitate more effective evaluation and content creation. Furthermore, White et al. [91] presented a prompt engineering catalog, akin to software patterns, offering reusable solutions to challenges faced when interacting with LLMs. As suggested by Giray [92], academic writers can adapt to the ever-changing landscape by acquiring expertise in prompt engineering and harnessing the power of large-scale language models. To enhance the effectiveness of using ChatGPT, it is important to incorporate prompt engineering into educational activities, empowering both instructors and students.
Authorized licensed use limited to: MVJ College of E ngineering - Bengaluru. Downloaded on February 16,2024 at 10:19:38 UTC from IEEE Xplore. Restrictions apply.
634 IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 17, 2024
WANG et al.: UNLEASHING CHATGPT’S POWER: A CASE STUDY ON OPTIMIZING INFORMATION RETRIEVAL 635
To control ChatGPT’s influence on students’ learning outcomes during the experiment, we proposed an isolated questioning method: students were required to ask ChatGPT questions based on the task, and the instructor recorded the questions and entered them into ChatGPT to obtain answers. In practice, students were asked to pose questions twice: once during a pretest when they had no prior knowledge of the topic, and again after gaining an understanding of TCP socket programming. We used basic prompts to improve students’ questions, simulating the types of questions that students familiar with basic prompt engineering would ask ChatGPT. In addition, we used a popular prompt engineering framework (CRISPE) [29] to enhance students’ questions, simulating the questions that students proficient in prompt engineering would pose to ChatGPT.
The CRISPE framework proposed by Shieh [29] is new but considered an excellent template for writing prompts. CRISPE stands for the following.
1) CR: Capacity and Role. What role do you want ChatGPT to play?
2) I: Insight. What background information and context do you want to give ChatGPT?
3) S: Statement. What do you want ChatGPT to do?
4) P: Personality. In what style or manner do you want ChatGPT to answer you?
5) E: Experiment. Ask ChatGPT to provide multiple answers for you.
Prompts generated by following this framework can elicit more complete and in-depth answers compared with free-style, unstructured prompts.
This framework-guided approach offers the advantage of addressing potential negative impacts of new technology on students’ learning. By requiring students to formulate task-specific questions for ChatGPT, it ensures that they actively engage with the technology. In addition, while incorporating this technological aspect, the teaching approach in this experiment followed the flipped classroom model, so as to align with established pedagogical practices.
D. Implementation Phase
Before delving into network layer concepts, the students were assigned the task of learning TCP socket programming with Python. They were required to write code for both the server and client sides, adhering to the client/server model. The task involved the client connecting to the server and sending a string. The server was expected to convert the client’s string to uppercase and return it to the client.
The instructor delivered a comprehensive introduction to TCP socket programming and client/server mode communication to all students in this study. In a 45-min session, the students were provided with a detailed explanation of the assignment they would work on. The assignment required the students to research and apply various programming concepts related to TCP sockets, including the TCP protocol, socket programming, IP, ports, and more. Their objective was to use these concepts to ask questions to ChatGPT.
Our research team manually annotated the answers obtained from students’ questioning. The quality of the answers provided by ChatGPT, as measured by CQS scores, was evaluated by experts in AIGC. The evaluation of the answers focused on four dimensions of general intelligence standards, as outlined by Paul [94].
1) Relevance: Is the response provided by ChatGPT relevant to the task? Does it assess the complexity and comprehensiveness of addressing the task?
2) Clarity: Is the response from ChatGPT clear, appropriately organized, and logically coherent? Does it employ suitable terminology and diction for the users?
3) Accuracy: Is ChatGPT’s response accurate? Does it contain any errors in knowledge?
4) Precision: Is the answer provided by ChatGPT specific and detailed enough? Is it precise and unambiguous?
The CQS score of a question varied based on the relevance of the answer to TCP socket programming, increasing with relevance and decreasing with lack of relevance.
To mitigate the potential influence of new technology on students, this experiment adopted the isolated questioning method. The instructor assigned the task, introduced ChatGPT to the students, and provided instructions on its usage. The students were informed that they could acquire the necessary knowledge to complete the TCP socket programming assignment by asking ChatGPT questions. Each student was required to design five high-quality, task-related questions to obtain relevant information. To evaluate the relevance of the obtained answers, experts were invited to manually annotate them and assign scores to each question based on its alignment with TCP socket programming.
The instructor employed prompt engineering methods to enhance the quality of students’ questions, while AIGC experts from our research team assessed the answers generated by ChatGPT. In an ideal scenario, students acquire the techniques of prompt engineering, enabling them to refine their own questions. Consequently, the quality of the answers they obtain is expected to be comparable to those derived from the instructor’s improved questions. The isolated questioning method offers two distinct advantages. First, it eliminates concerns about the potential impact of new technology on the teaching process for students. Second, it facilitates the investigation of prompt engineering’s influence on students’ use of ChatGPT in flipped classroom tasks.
The experimental steps, shown in Fig. 1, are detailed as follows.
1) Students were randomly assigned to Group 1 or Group 2. The instructor explained the task objective: to create a Python program for communication using TCP sockets. Group 1 students, lacking prior knowledge in TCP socket programming, generated a set of 5 questions for ChatGPT, designated as Question Set A1. Group 2 students, equipped with knowledge of TCP socket programming, posed a separate set of 5 questions to ChatGPT, named Question Set C2.
2) To refine the questions in A1, the researcher added a simple prompt that restricted ChatGPT’s responses to the specific
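As an illustration of how the five CRISPE components described in this section could be assembled into a single prompt, the following Python sketch may help. It is not the procedure used in the study: the class name, the field names, and the example wording of each component are assumptions made here for illustration, and the example question is only loosely modeled on the TCP socket programming task.

```python
from dataclasses import dataclass


@dataclass
class CrispePrompt:
    """Holds the five CRISPE components (names are illustrative assumptions)."""
    capacity_and_role: str  # CR: the role ChatGPT should play
    insight: str            # I: background information and context
    statement: str          # S: what ChatGPT is asked to do
    personality: str        # P: style or manner of the answer
    experiment: str         # E: request for multiple answers

    def render(self) -> str:
        """Join the non-empty components into one prompt, in CRISPE order."""
        parts = [
            self.capacity_and_role,
            self.insight,
            self.statement,
            self.personality,
            self.experiment,
        ]
        return " ".join(p.strip() for p in parts if p.strip())


# Hypothetical student question about the TCP socket assignment.
prompt = CrispePrompt(
    capacity_and_role="Act as an instructor of a computer networking course.",
    insight="I am a first-year student learning TCP socket programming with Python.",
    statement=("Explain how a client can send a string to a server "
               "and receive it back in uppercase."),
    personality="Answer step by step in plain language.",
    experiment="Give two alternative explanations.",
)
print(prompt.render())
```

Keeping each component in its own field makes it easy to compare a free-style question (only the statement filled in) with a fully structured one, which mirrors the contrast between unstructured and CRISPE-enhanced questions examined in this study.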
TABLE IV
P-VALUE OF COMPARING THE CQS SCORES OF GROUPS WITH SIMPLE PROMPTING METHOD OR NOT
TABLE V
P-VALUE OF COMPARING THE CQS SCORES OF GROUPS WITH THE SIMPLE PROMPTING METHOD OR CRISPE
TABLE VII
P-VALUE OF COMPARING THE CQS SCORES OF GROUPS WITH AND WITHOUT THE COMBINATION OF CRISPE AND TASK-RELATED KNOWLEDGE
the CQS scores between the C1 and E2 groups, as well as the C2 and E1 groups (please refer to Table III), we observe no statistically significant difference in the mean CQS scores between the C1 and C2 groups. However, the E2 group shows a remarkable improvement of 0.7409 in the CQS scores compared to the C1 group, with a P-value of 0.0033, well below the significance level (alpha) of 0.05. Similarly, the E1 group demonstrates an impressive improvement of 1.2515 compared to the C2 group, with a P-value of 7.555e-7.
Furthermore, when comparing the C1 group with the E1 group and the C2 group with the E2 group (please see Table III), we found that the E1 (E2) group exhibits a significant improvement of 1.083 (0.9091) in the average CQS score compared to the C1 (C2) group, with P-values of 0.0007 (2.108e-7).
These findings provide compelling evidence supporting the effectiveness of the CRISPE framework in enhancing the quality of answers obtained from ChatGPT. The evidence is derived from both the pretest and posttest results of the same students over time and a controlled experiment conducted between different student groups.
2) Quasi-Experiment 2: By comparing the questions from groups A1 and B1, we assessed the effectiveness of the simple prompting method in improving the quality of answers from ChatGPT (please see Table IV). Upon analyzing the gain scores, the average CQS score obtained from ChatGPT by the B1 group was significantly higher than that of the A1 group, with an improvement of 0.5167 and a P-value of 0.0003.
3) Quasi-Experiment 3: A comparison between the quality of answers obtained in B1 and D1 highlights the effectiveness of CRISPE over the simple prompting method (please see Table V). Analyzing the gain scores reveals that the information obtained from ChatGPT by D1 was superior to that obtained by B1 (an improvement of 0.3333). The difference between the simple prompting method and CRISPE was statistically significant, with a P-value of 0.0153.
As this study reveals, the implementation of a sound framework, such as the CRISPE framework, in comparison to unstructured prompts, enabled students to acquire higher quality information from ChatGPT in flipped classroom tasks. Consequently, the CRISPE method leads to a statistically significant improvement in the quality of answers obtained by students from ChatGPT. The significant impact of the prompting method based on the CRISPE framework suggests that mastering prompt engineering can enhance the efficiency and effectiveness of using ChatGPT to acquire knowledge in certain learning settings, ultimately improving students’ learning outcomes.
B. Quantitative Results for RQ2
This section presents quantitative results aimed at answering the second research question, RQ2: How can the content of prompt engineering be arranged in teaching to improve the quality and efficiency of flipped classroom teaching?
1) Quasi-Experiment 4: To investigate whether task-related prior knowledge can enhance the quality of answers obtained from ChatGPT, a comparison was made between the answers obtained in A1 and C1, as well as A1 and C2 (please see Table VI). The comparison of gain CQS scores revealed that the information quality from ChatGPT in C1/C2 was higher compared to A1 (0.2334/0.0702). Nonetheless, inferential statistical analysis showed that the difference between the two groups was not significant (P-value: 0.1158/0.7282). The absence of a significant difference suggests that task-related knowledge does not significantly enhance the quality of information obtained by students from ChatGPT. Therefore, in order to receive high-quality task-related answers from ChatGPT, it is necessary to explore alternative methods that can improve the efficiency of using ChatGPT.
2) Quasi-Experiment 5: To investigate how to arrange prompt engineering methods during the learning process, we conducted a comparative analysis of the improvement in CQS scores between Groups D and E, as opposed to Group A (please see Table VII). The comparison of gain CQS scores revealed that the information quality from ChatGPT in E1 was higher compared to D1 (0.4667), with a P-value of 0.0079.
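The group comparisons in these quasi-experiments report a gain in mean CQS score together with a P-value. The text here does not state which inferential test produced those P-values, so the sketch below substitutes a two-sided permutation test on the difference in means, purely to illustrate how such a gain and P-value can be computed. The function name, the number of resamples, the seed, and the scores themselves are all assumptions made for illustration; the scores are invented and are not the study's data.

```python
import random


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for the difference in group means.

    Returns (observed gain of group_b over group_a, estimated p-value).
    """
    rng = random.Random(seed)
    observed = _mean(group_b) - _mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign scores to the two groups.
        rng.shuffle(pooled)
        diff = _mean(pooled[n_a:]) - _mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up CQS scores for illustration only; NOT the study's data.
baseline = [2.0, 2.5, 3.0, 2.5, 2.0, 3.0, 2.5, 2.0]
crispe = [3.5, 4.0, 3.0, 4.5, 3.5, 4.0, 3.5, 4.0]

gain, p = permutation_test(baseline, crispe)
print(f"gain in mean CQS: {gain:.4f}, p = {p:.4f}")
```

A permutation test is used here because it needs only the standard library and makes few distributional assumptions; it should not be read as the analysis behind the reported values.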
Our recommendation is that instructors should provide students with essential task-related knowledge before ensuring that they have mastered the skills of prompt engineering. This will facilitate the effective use of ChatGPT to accomplish flipped classroom tasks with a focus on high-quality outcomes.
C. Discussion
1) Interpret the Results: The series of experiments indicates that mastering prompt engineering methods contributes to improving the quality of information that students receive from ChatGPT. In addition, the findings suggest that, within the context of flipped classroom instructional activities, it is essential for educators to ensure that students possess solid foundational knowledge before incorporating prompt engineering elements. This approach leads to enhanced efficiency in using ChatGPT, consequently elevating the overall teaching quality in flipped classrooms.
In line with other research findings [83], [84], [87], [88] that prompt engineering plays a pivotal role in efficiently conveying users’ intentions to ChatGPT, our study explores this assertion within the realm of education and informs educators on the use of prompt engineering in helping students harness ChatGPT’s capabilities.
2) Limitations: This study has certain limitations. First, it primarily focuses on a specific cohort of students, which may constrain the generalizability of the research findings to other academic disciplines or to students with different educational backgrounds. Our choice of first-year students as the target group for this research was deliberate: they have not undergone extensive specialized training, and their academic backgrounds are still diverse and more representative of undergraduate students in other universities. Such a choice of subjects helps mitigate constraints when extending the research findings to student populations in other disciplines and educational contexts. It is advisable for future studies to expand the scope of the research objectives to enhance research effectiveness.
Second, the study employed the “isolated questioning method” to simulate students’ mastery of prompt engineering. The study assumed that all participants could ideally master prompt engineering methods without being distracted by the novelty of the technology. However, this approach did not account for individual differences among students, which is another factor for future studies to explore.
3) Recommendations: The research findings indicate that in learning activities, prompt engineering has a positive impact on students’ information retrieval quality when using ChatGPT. In this era of AI, the integration of AI tools into teaching activities is becoming increasingly inevitable. When it comes to the use of AI tools, such as ChatGPT, we recommend that educators prioritize the mastery and use of prompt engineering techniques.
We also encourage more educational research institutions and scholars to delve into the application of prompt engineering in education across various disciplines and age groups. This will contribute to a deeper understanding of the practical effects of prompt engineering on students’ use of AI tools, such as ChatGPT, and to the identification of best practices. Regarding the use of generative AI tools by students, we suggest that schools and educational institutions consider offering courses to instruct students on effective and standardized AI tool usage, thereby enhancing their learning achievements.
V. CONCLUSION
The primary objective of this study was to answer two research questions. First, the research team investigated the impact of prompt engineering on students’ ability to obtain high-quality answers from ChatGPT in flipped classroom settings. The study employed multiple comparative quasi-experiments, and the quantitative results indicate a statistically significant and positive influence of mastering prompt engineering on the effectiveness of information retrieval from ChatGPT. Second, we explored the optimal arrangement of prompt engineering content to enhance the quality and efficiency of flipped classroom teaching. The findings indicate that in order to maximize the positive impact of ChatGPT in similar learning settings, students should not only master prompt engineering techniques but also possess the prerequisite knowledge relevant to the assigned task. In conclusion, this study highlights the positive impact of prompt engineering on students’ completion of tasks when using ChatGPT. In addition, this study underscores the changing role of instructors in this age of AI, from “sage on the stage” to mentors and coaches. We recommend that generative AI be used under the guidance of instructors to effectively harness its positive impact. This study carries some limitations, such as using the isolated questioning method and simulated ideal conditions. It also assumes that students have already mastered prompt engineering. Nevertheless, the results and findings still offer valuable insights for furthering our understanding of the role of generative AI tools in assisting students with information acquisition.
In addition, this study may serve as a launch pad for IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES’s upcoming special issue on Education in the World of ChatGPT and other Generative AI, which is scheduled to be published by June 2024. We hope researchers around the world will join this ongoing dialogue and provide their insights on the use of Generative AI in education and training.
ACKNOWLEDGMENT
One of the co-authors serves on this journal’s editorial board. However, she was not involved in the review or the decision-making of this paper.
REFERENCES
[1] Y. Cao et al., “A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT,” 2023, arXiv:2303.04226.
[2] E. A. V. Dis, J. Bollen, W. Zuidema, R. v. Rooij, and C. L. Bockting, “ChatGPT: Five priorities for research,” Nature, vol. 614, no. 7947, pp. 224–226, 2023.
[3] L. Ouyang et al., “Training language models to follow instructions with human feedback,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2022, pp. 27730–27744.
[4] Z. Du et al., “General language model pretraining with autoregressive blank infilling,” 2021, arXiv:2103.10360.
[5] L. Nicolescu and M. T. Tudorache, “Human-computer interaction in customer service: The experience with AI chatbots—A systematic literature review,” Electronics, vol. 11, no. 10, 2022, Art. no. 1579.
[6] L. Xu, L. Sanders, K. Li, and J. C. Chow, “Chatbot for health care and oncology applications using artificial intelligence and machine learning: Systematic review,” JMIR Cancer, vol. 7, no. 4, 2021, Art. no. e27850.
[7] Y. Lu, H. Wang, and W. Wei, “Machine learning for synthetic data generation: A review,” 2023, arXiv:2302.04062.
[8] P. Diwanji, K. Hinkelmann, and H. F. Witschel, “Enhance classroom preparation for flipped classroom using AI and analytics,” in Proc. Int. Conf. Enterprise Inf. Syst., 2018, pp. 477–483.
[9] G. Akçayır and M. Akçayır, “The flipped classroom: A review of its advantages and challenges,” Comput. Educ., vol. 126, pp. 334–345, 2018.
[10] J. Bergmann and A. Sams, Flip Your Classroom: Reach Every Student in Every Class Every Day. Washington, DC, USA: Int. Soc. Technol. Educ., 2012.
[11] B. Sohrabi and H. Iraj, “Implementing flipped classroom using digital media: A comparison of two demographically different groups perceptions,” Comput. Hum. Behav., vol. 60, pp. 514–524, 2016.
[12] J.-W. Lin, H.-C. K. Lin, and H.-R. Chen, “Developing an E-learning platform capable of being aware of self-regulated learning behaviors of role models,” IEEE Trans. Learn. Technol., vol. 15, no. 6, pp. 697–708, Dec. 2022.
[13] L. Cheng, A. D. Ritzhaupt, and P. Antonenko, “Effects of the flipped classroom instructional strategy on students’ learning outcomes: A meta-analysis,” Educ. Technol. Res. Develop., vol. 67, pp. 793–824, 2019.
[14] K. F. Hew, S. Bai, P. Dawson, and C. K. Lo, “Meta-analyses of flipped classroom studies: A review of methodology,” Educ. Res. Rev., vol. 33, 2021, Art. no. 100393.
[15] D. E. Gonda and B. Chu, “Chatbot as a learning resource? Creating conversational bots as a supplement for teaching assistant training course,” in Proc. IEEE Int. Conf. Eng., Technol. Educ., 2019, pp. 1–5.
[16] W. Huang, K. F. Hew, and D. E. Gonda, “Designing and evaluating three chatbot-enhanced activities for a flipped graduate course,” Int. J. Mech. Eng. Robot. Res., vol. 8, pp. 813–818, 2019.
[17] T. Ito, M. S. Tanaka, M. Shin, and K. Miyazaki, “The online PBL (project-based learning) education system using AI (artificial intelligence),” in Proc. 23rd Int. Conf. Eng. Product Des. Educ., 2021, pp. 1–6.
[18] J. Li, L. Ling, and C. W. Tan, “Blending peer instruction with just-in-time teaching: Jointly optimal task scheduling with feedback for classroom flipping,” in Proc. 8th ACM Conf. Learn., Scale, 2021, pp. 117–126.
[19] A. Varnavsky, “Chatbot to increase the effectiveness of the «flipped classroom» technology,” in Proc. 2nd Int. Conf. Technol. Enhanced Learn. Higher Educ., 2022, pp. 289–293.
[20] K. F. Hew, W. Huang, J. Du, and C. Jia, “Using chatbots in flipped learning online sessions: Perceived usefulness and ease of use,” in Proc. Int. Conf. Blended Learn., 2021, pp. 164–175.
[21] K. F. Hew, W. Huang, J. Du, and C. Jia, “Using chatbots to support student goal setting and social presence in fully online activities: Learner engagement and perceptions,” J. Comput. Higher Educ., vol. 35, no. 1, pp. 40–68, 2023.
[22] S. Wollny, J. Schneider, D. D. Mitri, J. Weidlich, M. Rittberger, and H. Drachsler, “Are we there yet?—A systematic literature review on chatbots in education,” Front. Artif. Intell., vol. 4, 2021, Art. no. 654924.
[23] C. K. Lo and K. F. Hew, “A review of integrating AI-based chatbots into flipped learning: New possibilities and challenges,” Front. Educ., vol. 8, 2023, Art. no. 1175715.
[24] C. K. Lo, “What is the impact of ChatGPT on education? A rapid review of the literature,” Educ. Sci., vol. 13, no. 4, 2023, Art. no. 410.
[25] T. Brown et al., “Language models are few-shot learners,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2020, pp. 1877–1901.
[26] X. Wang et al., “Self-consistency improves chain of thought reasoning in language models,” 2022, arXiv:2203.11171.
[27] L. Gao et al., “PAL: Program-aided language models,” in Proc. Int. Conf. Mach. Learn., 2023, pp. 10764–10799.
[28] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2022, pp. 22199–22213.
[29] J. Shieh, “Best practices for prompt engineering with OpenAI API,” OpenAI, Feb. 2023. [Online]. Available: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api
[30] A. Lewkowycz et al., “Solving quantitative reasoning problems with language models,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2022, pp. 3843–3857.
[31] A. Chowdhery et al., “PaLM: Scaling language modeling with pathways,” 2022, arXiv:2204.02311.
[32] M. Perkins, “Academic integrity considerations of AI large language models in the post-pandemic era: ChatGPT and beyond,” J. Univ. Teach. Learn. Pract., vol. 20, no. 2, 2023, Art. no. 07.
[33] J. H. Choi, K. E. Hickman, A. Monahan, and D. Schwarcz, “ChatGPT goes to law school,” J. Legal Educ., vol. 71, no. 3, pp. 387–400, 2023.
[34] T. Teubner, C. M. Flath, C. Weinhardt, W. v. d. Aalst, and O. Hinz, “Welcome to the era of ChatGPT et al. the prospects of large language models,” Bus. Inf. Syst. Eng., vol. 65, no. 2, pp. 95–101, 2023.
[35] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen, “Hierarchical text-conditional image generation with CLIP latents,” 2022, arXiv:2204.06125.
[36] M. Chen et al., “Evaluating large language models trained on code,” 2021, arXiv:2107.03374.
[37] J. W. Rae et al., “Scaling language models: Methods, analysis & insights from training Gopher,” 2021, arXiv:2112.11446.
[38] C. Raffel et al., “Exploring the limits of transfer learning with a unified text-to-text transformer,” J. Mach. Learn. Res., vol. 21, no. 1, pp. 5485–5551, 2020.
[39] M. U. Hadi et al., “A survey on large language models: Applications, challenges, limitations, and practical usage,” 2023, Techrxiv.23589741.v1.
[40] M. Farrokhnia, S. K. Banihashem, O. Noroozi, and A. Wals, “A SWOT analysis of ChatGPT: Implications for educational practice and research,” Innovations Educ. Teach. Int., 2023.
[41] N. Bian, X. Han, L. Sun, H. Lin, Y. Lu, and B. He, “ChatGPT is a knowledgeable but inexperienced solver: An investigation of commonsense problem in large language models,” 2023, arXiv:2303.16421.
[42] Y. K. Dwivedi et al., “‘So what if ChatGPT wrote it?’ Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy,” Int. J. Inf. Manage., vol. 71, 2023, Art. no. 102642.
[43] T. Dave, S. A. Athaluri, and S. Singh, “ChatGPT in medicine: An overview of its applications, advantages, limitations, future prospects, and ethical considerations,” Front. Artif. Intell., vol. 6, 2023, Art. no. 1169595.
[44] I. Švab, Z. Klemenc-Ketiš, and S. Zupanič, “New challenges in scientific publications: Referencing, artificial intelligence and ChatGPT,” Slovenian J. Public Health, vol. 62, no. 3, pp. 109–112, 2023.
[45] B. Liu, B. Xiao, X. Jiang, S. Cen, X. He, and W. Dou, “Adversarial attacks on large language model-based system and mitigating strategies: A case study on ChatGPT,” Secur. Commun. Networks, vol. 2023, 2023, Art. no. 8691095.
[46] W. Geerling, G. D. Mateer, J. Wooten, and N. Damodaran, “Is ChatGPT smarter than a student in principles of economics?,” Social Sci. Res. Netw., 2023.
[47] F. M. Megahed, Y.-J. Chen, J. A. Ferris, S. Knoth, and L. A. Jones-Farmer, “How generative AI models such as ChatGPT can be (mis)used in SPC practice, education, and research? An exploratory study,” Qual. Eng., 2023.
[48] S. Hargreaves, “‘Words are flowing out like endless rain into a paper cup’: ChatGPT & law school assessments,” The Chinese University of Hong Kong Faculty of Law Research Paper, 2023. Accessed: Mar. 2023. [Online]. Available: https://ssrn.com/abstract=4359407
[49] T. H. Kung et al., “Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models,” PLoS Digit. Health, vol. 2, no. 2, 2023, Art. no. e0000198.
[50] A. Gilson et al., “How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment,” JMIR Med. Educ., vol. 9, no. 1, 2023, Art. no. e45312.
[51] S. Frieder et al., “Mathematical capabilities of ChatGPT,” 2023, arXiv:2301.13867.
[52] A. Szabo, “ChatGPT a breakthrough in science and education: Can it fail a test?,” Feb. 2023. [Online]. Available: osf.io/ks365
[53] S. Jalil, S. Rafi, T. D. LaToza, K. Moran, and W. Lam, “ChatGPT and software testing education: Promises & perils,” in Proc. IEEE Int. Conf. Softw. Testing, Verification Validation Workshops, 2023, pp. 4130–4137.
[54] F. Liu, L. Zhao, J. Zhao, Q. Dai, C. Fan, and J. Shen, “Educational process mining for discovering students’ problem-solving ability in computer programming education,” IEEE Trans. Learn. Technol., vol. 15, no. 6, pp. 709–719, Dec. 2022.
[55] V. Echeverria, K. Yang, L. Lawrence, N. Rummel, and V. Aleven, “Designing hybrid human–AI orchestration tools for individual and collaborative activities: A technology probe study,” IEEE Trans. Learn. Technol., vol. 16, no. 2, pp. 191–205, Apr. 2023.
[56] B. B. Tomić, A. D. Kijevčanin, Z. V. Ševarac, and J. M. Jovanović, “An AI-based approach for grading students’ collaboration,” IEEE Trans. Learn. Technol., vol. 16, no. 3, pp. 292–305, Jun. 2023.
[57] K. Ahmad et al., “Data-driven artificial intelligence in education: A comprehensive review,” IEEE Trans. Learn. Technol., to be published, doi: 10.1109/TLT.2023.3314610.
[58] A. Tlili et al., “What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education,” Smart Learn. Environments, vol. 10, no. 1, 2023, Art. no. 15.
[59] X. Zhai, “ChatGPT for next generation science learning,” XRDS: Crossroads, ACM Mag. Students, vol. 29, no. 3, pp. 42–46, 2023.
[60] R. A. Khan, M. Jawaid, A. R. Khan, and M. Sajjad, “ChatGPT-reshaping medical education and clinical management,” Pakistan J. Med. Sci., vol. 39, no. 2, 2023, Art. no. 605.
[61] Z. Han, F. Battaglia, A. Udaiyar, A. Fooks, and S. R. Terlecky, “An explorative assessment of ChatGPT as an aid in medical education: Use it with caution,” MedRxiv, 2023. Accessed: Feb. 2023.
[62] Y. M. Al-Worafi, A. Hermansyah, K. W. Goh, and L. C. Ming, “Artificial intelligence use in university: Should we ban ChatGPT?,” 2023. [Online]. Available: https://doi.org/10.20944/preprints202302.0400.v1
[63] O. Topsakal and E. Topsakal, “Framework for a foreign language teaching software for children utilizing AR, voicebots and ChatGPT (large language models),” J. Cogn. Syst., vol. 7, no. 2, pp. 33–38, 2022.
[64] J. Rudolph, S. Tan, and S. Tan, “ChatGPT: Bullshit spewer or the end of traditional assessments in higher education?,” J. Appl. Learn. Teach., vol. 6, no. 1, pp. 342–362, 2023.
[65] C. K. Lo and K. F. Hew, “A critical review of flipped classroom challenges in K-12 education: Possible solutions and recommendations for future research,” Res. Pract. Technol. Enhanced Learn., vol. 12, no. 1, 2017, Art. no. 4.
[66] C. K. Lo and K. F. Hew, “Design principles for fully online flipped learning in health professions education: A systematic review of research during the COVID-19 pandemic,” BMC Med. Educ., vol. 22, no. 1, 2022, Art. no. 720.
[76] D. Baidoo-Anu and L. O. Ansah, “Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning,” Social Sci. Res. Netw., 2023.
[77] J. Qadir, “Engineering education in the era of ChatGPT: Promise and pitfalls of generative AI for education,” in Proc. IEEE Glob. Eng. Educ. Conf., 2023, pp. 1–9.
[78] D. Mhlanga, “Open AI in education, the responsible and ethical use of ChatGPT towards lifelong learning,” Social Sci. Res. Netw., 2023.
[79] X. Zhai, “ChatGPT user experience: Implications for education,” Social Sci. Res. Netw., 2022.
[80] P. Stutz et al., “Ch(e)atGPT? An anecdotal approach on the impact of ChatGPT on teaching and learning GIScience,” 2023. [Online]. Available: https://doi.org/10.35542/osf.io/j3m9b
[81] D. R. Cotton, P. A. Cotton, and J. R. Shipway, “Chatting and cheating: Ensuring academic integrity in the era of ChatGPT,” Innovations Educ. Teach. Int., 2023.
[82] Y. A. Ahmed and A. Sharo, “On the education effect of CHATGPT: Is AI CHATGPT to dominate education career profession?,” in Proc. Int. Conf. Intell. Comput., Commun., Netw. Serv., 2023, pp. 79–84.
[83] S. Ekin, “Prompt engineering for ChatGPT: A quick guide to techniques, tips, and best practices,” 2023, techrxiv.22683919.v2.
[84] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig, “Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing,” ACM Comput. Surv., vol. 55, no. 9, pp. 1–35, 2023.
[85] V. Liu and L. B. Chilton, “Design guidelines for prompt engineering text-to-image generative models,” in Proc. CHI Conf. Hum. Factors Comput. Syst., 2022, pp. 1–23.
[86] X. Han, W. Zhao, N. Ding, Z. Liu, and M. Sun, “PTR: Prompt tuning with rules for text classification,” AI Open, vol. 3, pp. 182–192, 2022.
[87] J. Wei et al., “Chain-of-thought prompting elicits reasoning in large language models,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2022, pp. 24824–24837.
[88] Y. Zhou et al., “Large language models are human-level prompt engineers,” 2022, arXiv:2211.01910.
[89] L. Reynolds and K. McDonell, “Prompt programming for large language models: Beyond the few-shot paradigm,” in Proc. Extended Abstr. CHI Conf. Hum. Factors Comput. Syst., 2021, pp. 1–7.
[90] L. S. Lo, “The clear path: A framework for enhancing information literacy through prompt engineering,” J. Academic Librarianship, vol. 49, no. 4, 2023, Art. no. 102720.
[91] J. White et al., “A prompt pattern catalog to enhance prompt engineering with ChatGPT,” 2023, arXiv:2302.11382.
[92] L. Giray, “Prompt engineering with ChatGPT: A guide for academic writers,” Ann. Biomed. Eng., vol. 51, pp. 2629–2633, 2023.
[93] J. R. Fraenkel and N. E. Wallen, How to Design and Evaluate Research in Education. New York, NY, USA: McGraw-Hill, 2012, vol. 7.
[67] C. K. Lo, “Strategies for enhancing online flipped learning: A systematic
review of empirical studies during the COVID-19 pandemic,” Interactive [94] R. Paul, “The state of critical thinking today,” New Directions Community
Learn. Environments, 2023. Colleges, vol. 2005, no. 130, pp. 27–38, 2005.
[68] S. Nisar and M. S. Aslam, “Is ChatGPT a good tool for T&CM students
in studying pharmacology?,” Social Sci. Res. Netw., 2023.
[69] E. Kasneci et al., “ChatGPT for good? On opportunities and challenges
of large language models for education,” Learn. Individual Differences,
vol. 103, 2023, Art. no. 102274.
[70] F. Ali, “Let the devil speak for itself: Should ChatGPT be allowed or
banned in hospitality and tourism schools?,” J. Glob. Hospitality Tourism,
vol. 2, no. 1, pp. 1–6, 2023.
[71] R. J. M. Ventayen, “OpenAI ChatGPT generated results: Similarity in-
dex of artificial intelligence-based contents,” Adv. Intell. Syst. Comput., Mo Wang received the bachelor’s degree in art and
2023. design, the master’s degree in design, and the Ph.D.
[72] M. Khalil and E. Er, “Will ChatGPT get you caught? Rethinking of degree in education technology from Northeast Nor-
plagiarism detection,” 2023, arXiv:2302.04335. mal University, Changchun, China, in 2004, 2007,
[73] H. H. Thorp, “ChatGPT is fun, but not an author,” Science, vol. 379, and 2020, respectively.
no. 6630, pp. 313–313, 2023. She has authored or coauthored articles in a series
[74] M. Sallam, “The utility of ChatGPT as an example of large language of both domestic and international indexed journals,
models in healthcare education, research and practice: Systematic review including prestigious publications such as Education
on the future perspectives and potential limitations,” MedRxiv, 2023. Research. Her research interests include digital media
Accessed: Feb. 2023. arts, specifically in the curation and preservation of
[75] A. B. Mbakwe, I. Lourentzou, L. A. Celi, O. J. Mechanic, and A. Dagan, regional art and culture through digital media. She
“ChatGPT passing USMLE shines a spotlight on the flaws of medical has also undertaken pioneering research in the professional development of art
education,” PLOS Digit. Health, vol. 2, no. 2, 2023, Art. no. e0000205. teachers.
Minjuan Wang (Member, IEEE) received the B.A. degree in Chinese literature from Beijing/Peking University, Beijing, China, the M.A. degree in comparative literature from Penn State University, State College, PA, USA, and the Ph.D. degree in information science and learning technologies from the University of Missouri-Columbia, Columbia, MO, USA.
She is currently a Professor and Program Chair of Learning Design and Technology with San Diego State University, San Diego, CA, USA. Her research interests include STEM education, new and emerging technologies in various educational settings, Metaverse and immersive learning, and the design and implementation of artificial intelligence including AIGC for education and training.
Dr. Wang serves as the Editor-in-Chief for IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES. She is an internationally recognized scholar and has keynoted about 45 international conferences. She is recognized internationally for her research, publishing, and dedicated service to IEEE and other scholarly communities. She is a member of the IEEE Education Society. She also co-chairs the Education Society’s technical committee for immersive learning.

Xin Xu received the B.S., M.S., and Ph.D. degrees in computer science from Northeast Normal University, Changchun, China, in 2012, 2015, and 2023, respectively.
He is currently a Postdoctoral Researcher with the School of Media Science (Journalism), Northeast Normal University. His research interests include machine learning and data mining, with a specific focus on knowledge discovery and representation learning in education.

Lanqing Yang received the B.S. degree in software engineering from Northeast Normal University, Changchun, China, in 2018, where she is currently working toward the degree in computer science and technology from the School of Information Science and Technology, Northeast Normal University, in 2018.
Her primary research interests include combinatorial optimization, machine learning, and emotional computing, among others.

Dunbo Cai received the B.S. and M.S. degrees in computer science from Northeast Normal University, Changchun, China, in 2003 and 2006, respectively, and the Ph.D. degree in computer science from Jilin University, Changchun, in 2009.
His research interests include swarm intelligence, automated reasoning, and automated planning.

Minghao Yin (Member, IEEE) received the B.S. and M.S. degrees from Northeast Normal University, Changchun, China, in 2001 and 2004, respectively, and the Ph.D. degree from Jilin University, Changchun, China, in 2008, all in computer science.
He has authored two books and more than 100 articles. His research interests include swarm intelligence, automated reasoning, automated planning, and algorithms.