Accelerating Software Development Using Generative AI: ChatGPT Case Study
ABSTRACT

The Software Development Life Cycle (SDLC) comprises multiple phases, each requiring Subject Matter Experts (SMEs) with phase-specific skills. The efficacy and quality of the deliverables of each phase are skill dependent. In recent times, Generative AI techniques, including large-scale language models (LLMs) like GPT, have become significant players in software engineering. These models, trained on extensive text data, can offer valuable contributions to software development. Interacting with LLMs involves feeding prompts with context information and guiding the generation of textual responses. The quality of the response depends on the quality of the prompt given. This paper proposes a systematic prompting approach based on meta-model concepts for SDLC phases. The approach is validated using ChatGPT for small but complex business application development. We share the approach and our experience, learnings, benefits obtained, and the challenges encountered while applying the approach using ChatGPT. Our experience indicates that Generative AI techniques, such as ChatGPT, have the potential to reduce the skills barrier and accelerate software development substantially.

CCS CONCEPTS

• Software and its engineering → Software creation and management; Software development process management; Software development methods.

KEYWORDS

AI in SDLC, Large Language Models, Generative AI, ChatGPT, SDLC automation, Automated Software Development

ACM Reference Format:
Asha Rajbhoj, Akanksha Somase, Piyush Kulkarni, and Vinay Kulkarni. 2024. Accelerating Software Development Using Generative AI: ChatGPT Case Study. In 17th Innovations in Software Engineering Conference (ISEC 2024), February 22–24, 2024, Bangalore, India. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3641399.3641403

1 INTRODUCTION

The Software Development Life Cycle (SDLC) consists of multiple phases. Each phase of the SDLC produces distinct engineering artifacts and requires Subject Matter Experts (SMEs) with skills relevant to that phase. The quality, as well as the efficiency, of the deliverables is skill dependent. The recent progress in Generative AI techniques has significantly influenced software engineering, and we believe it can lower this skills barrier by enabling domain SMEs to operate at the natural-language level. Large-scale language models (LLMs), like OpenAI's Codex [1] and the Generative Pre-trained Transformer (GPT) [2, 3], are increasingly adopted in AI-driven software engineering. Trained on large corpora of text data, they have capabilities that make them valuable tools for enhancing the efficiency and quality of the development process. This can save time and effort for skilled development teams, allowing them to focus on higher-level tasks.

Interacting with LLMs in general involves feeding suitable prompts (natural language instructions) to provide a context and guide the generation of textual responses [4]. Many researchers have discussed the future of ChatGPT and other large language models as having a significant effect on how we interact with technology [5, 6]. One may guide LLMs to generate desired responses in multiple ways. For instance, one may directly ask the LLM to provide details. Another way is to first dictate to the LLM the constraints it must follow during response generation: for instance, the technology stack, design patterns, architecture, and so on are established up front to set the context for the subsequent code generation interactions. We propose a systematic prompting approach to leverage LLMs for application development. The approach defines prompt templates for SDLC phases based on meta-model concepts. The prompting approach is validated for small yet complex business application development using ChatGPT. We share the approach and our experience, learnings, benefits obtained, and challenges encountered while applying the approach using ChatGPT. In summary, this paper makes the following contributions:

• An approach for accelerating software development by leveraging Generative AI.
• Generic prompt templates for SDLC phases based on high-level meta-model concepts.
• An evaluation of the approach using ChatGPT.
• A validation of the approach on a small yet complex enough business application.
The organization of the paper is as follows: Section 2 provides a brief overview of related work. Section 3 describes the meta-model used for defining prompts. Section 4 presents the prompting approach. Section 5 presents an evaluation of the approach using the case study. Section 6 presents threats to validity. Section 7 discusses overall learning and future work.

2 RELATED WORK

There has been growing interest in using Generative AI techniques for software engineering tasks, including requirements engineering, design, and testing [1, 3, 19-22]. Several researchers have explored the application of Generative AI. This section reviews related work across the SDLC phases. To the best of our knowledge, the utilization of Generative AI techniques across various stages of the Software Development Life Cycle (SDLC) is a relatively unexplored area.

Jianzhang et al. conducted an empirical evaluation of ChatGPT on retrieving requirements information, specifically NFRs, features, and domain terms [7]. Their qualitative and quantitative results indicated impressive performance. White et al. proposed prompt design techniques for software engineering in the form of patterns to enhance the use of LLMs, such as ChatGPT, for improving requirements elicitation, rapid prototyping, code quality, refactoring, and system design [8, 9]. Ruan et al. presented an automated framework for generating requirements models from requirements written in natural language [24]. They used ChatGPT to extract requirement description elements from the requirements text and to present them in a structured format.

On the architecture and design front, Galanos et al. presented Architext, a semantic generation tool that generates architectural designs using natural language prompts as input to LLMs [10]. Ahmad et al. conducted a case analysis of a services-driven software application, demonstrating ChatGPT's potential to support software architecting [11].

Several researchers have explored the use of LLMs for a variety of code generation tasks. Researchers investigated the Text-to-SQL capabilities of the GPT-3 model [12, 13]. H. Tian et al. conducted an empirical analysis of ChatGPT's performance on code generation, program repair, and code summarization and compared it with state-of-the-art approaches [14]. They concluded that ChatGPT can handle typical programming challenges, and discussed its limited attention span. The paper highlights the significance of prompt engineering for practical applications in software engineering. Many researchers have successfully used Codex to generate accurate and efficient code snippets [15, 16]. Talasbek discussed how advanced AI technologies like ChatGPT can improve software testing process efficiency and productivity [17]. They explored the possibilities of automation through the automated generation of test plans and test scripts using Python and Selenium WebDriver. Ameya et al. studied the impact of Generative AI on software development [23]. They interviewed 30 professionals from diverse groups of software engineers, UX designers, and project managers. This research found Generative AI to be effective in the SDLC, irrespective of the size, scale, and nature of the enterprises adopting it.

3 META-MODEL

Fig. 1 depicts the high-level meta-model used for defining the prompting approach for the SDLC process. The meta-model has three parts:

Requirement specification meta-model: This covers various requirements specification concepts – Context, Process, Activity, Parameter, Rule, and RuleType – and various types of associations among these concepts. Functionality can be decomposed into multiple Processes. A Process may have subprocesses. A Process can be described in terms of one or many Activities. An Activity can be further decomposed into sub-activities. A Process may have multiple Rules; each Rule is categorized into one of the predefined RuleTypes. An Activity may have multiple input-output (IO) Parameters. Applications typically need to take care of geography-specific regulations, varying currencies, address-specific details, and so on, depending on the operating market and geography. The Context concept specifies these details.

Design specification meta-model: This covers various design specification concepts related to the presentation, service, and database layers. The presentation layer concepts include Screen, user interface class (UIClass), attributes that display data attributes (UIAttribute), and Buttons. The service layer concepts include Service and Operation. The data layer concepts include Entity and Relationship. A presentation layer Button invokes an Operation of a Service.

Code generation meta-model: This includes all the concepts of the design specification meta-model and the additional concepts Class, Attribute, and Operation.
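To make the three parts concrete, the sketch below restates the meta-model as TypeScript interfaces. The concept names follow the prose above; the exact field shapes are our assumption, since the paper specifies the meta-model only through Fig. 1 and its description.

```typescript
// Requirement specification meta-model (concept names from the text; shapes assumed).
type RuleType = "Validation" | "Display" | "Computation"; // the first two appear in Section 5.1; the third is illustrative
interface Context { geography: string; regulations: string[]; currency: string; }
interface Parameter { name: string; direction: "input" | "output"; }
interface Rule { description: string; type: RuleType; }
interface Activity { name: string; subActivities: Activity[]; parameters: Parameter[]; }
interface Process { name: string; subProcesses: Process[]; activities: Activity[]; rules: Rule[]; }

// Design specification meta-model.
interface Entity { name: string; attributes: string[]; }
interface Relationship { from: string; to: string; }        // e.g., "Employee" to "Fund Allocation"
interface Operation { name: string; inputs: Parameter[]; outputType: string; }
interface Service { name: string; operations: Operation[]; }
interface UIAttribute { name: string; }                     // attribute used to display a data attribute
interface Button { label: string; invokes: Operation; }     // a Button invokes an Operation of a Service
interface UIClass { name: string; attributes: UIAttribute[]; buttons: Button[]; }
interface Screen { title: string; uiClasses: UIClass[]; }

// Code generation meta-model: the design concepts plus Class, Attribute, and Operation.
interface Attribute { name: string; dataType: string; }
interface CodeClass { name: string; attributes: Attribute[]; operations: Operation[]; } // "Class" in the meta-model
```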
The design specification prompts are structured to first generate high-level descriptions, followed by detailed specifications. For instance, the 3rd prompt of this template generates high-level descriptions of the services. After reviewing these, service details such as the service signature with input and output data types are generated by the 4th prompt, which is executed for each service generated by the 3rd prompt. Similarly, the 5th prompt generates high-level descriptions of the screens. After reviewing these, screen details such as buttons, input/output data fields, and screen flow details are generated by the 6th prompt, which is executed for each screen generated by the 5th prompt.

Prompt Template:
Input: <Business Domain>, <Process Spec: Requirement Spec Prompt 2 Response>, <Rules Spec: Requirement Spec Prompt 3 Response>
Prompts:
1. My application domain is <Business Domain>. Consider the following requirement specification: Processes: <Processes Spec> Rules: <Rules Spec>
2. Generate Entities and Relationships descriptions for the above requirements specification.
3. Generate Services description for the above requirements specification.
4. For each <service> IN <Prompt 3 Response> - Generate further details for <service> such as input and output parameters.
5. Generate Screens for the above requirements specification.
6. For each <screen> IN <Prompt 5 Response> - Generate further details for <screen> such as UI Classes, Attributes, Buttons, and Screen flows.
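The looping steps (prompts 4 and 6) expand into one concrete prompt per generated item. In the paper the prompts were issued manually through the ChatGPT UI; the sketch below is only a hypothetical illustration of how the template could be expanded mechanically, with placeholder names of our own.

```typescript
// Hypothetical expansion of the design-specification prompt template.
// `services` and `screens` stand for the reviewed responses to prompts 3 and 5;
// in practice the per-item prompts are issued only after those reviews.
function designPrompts(domain: string, processesSpec: string, rulesSpec: string,
                       services: string[], screens: string[]): string[] {
  const prompts = [
    `My application domain is ${domain}. Consider the following requirement ` +
    `specification: Processes: ${processesSpec} Rules: ${rulesSpec}`,
    "Generate Entities and Relationships descriptions for the above requirements specification.",
    "Generate Services description for the above requirements specification.",
  ];
  // Prompt 4, once per service returned by prompt 3.
  for (const service of services) {
    prompts.push(`Generate further details for ${service} such as input and output parameters.`);
  }
  prompts.push("Generate Screens for the above requirements specification.");
  // Prompt 6, once per screen returned by prompt 5.
  for (const screen of screens) {
    prompts.push(`Generate further details for ${screen} such as UI Classes, Attributes, Buttons, and Screen flows.`);
  }
  return prompts;
}
```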
4.3 Code Generation

The code generation prompt template takes as input the business domain, technology stack, entities, services, and screen details. The technology stack specifies the technologies of interest for generating the application. The inputs related to entities, services, and screen control details are taken from the generated design specifications. The prompt template covers instructions to generate code for each service class, followed by its data access object class and input-output class. After generating the service layer code, code for screens is generated. The screen layout descriptions from the design specification are provided within the prompt, and code is generated for screen validation and event handling. Similarly, database schemas for tables are generated based on the entity-relationship specification.

After reviewing the LLM prompt responses, the generated code can be copied into the integrated development environment (IDE). The prompts for code generation are designed such that the manual effort required to copy and paste the generated code is minimal: code corresponding to a single file is generated in the same response to the extent possible. For instance, code for each service, DAO, and entity-specific class is generated in a separate response. The code is then checked for compilation errors, if any, and further augmented manually; for instance, business logic for the services, modifications to screen style, and so on can be added or changed where the output is not as desired.

Prompt Template:
Input: <Business Domain>, <Technology Stack>, <ER Spec: Design Spec Prompt 2 Response>, <Services Spec: Design Spec Prompt 3 Response>, <Screens Spec: Design Spec Prompt 5 Response>
Prompts:
1. My application domain is <Business Domain>. Application Architecture specification is as follows: <Technology Stack>. Consider the following design specification: Entities: <ER Spec>. Services Specification: <Services Spec>
2. For each <service> IN Services Spec
   i. Generate service class code for <service>.
   ii. Generate data access object class code for <service>.
   iii. Generate input output class code for <service>.
3. For each <screen layout> IN Screens Spec
   i. Consider the following screen layout and generate code: <screen layout>
   ii. Generate screen validation and form submission code.
4. For each <entity> IN ER Spec - Generate database schema for <entity> table.
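As an illustration of what prompts 2.i–2.iii aim to produce for one service, here is a skeleton of a service class with its data access object (DAO) and input/output classes. The case study generated PHP; this sketch uses TypeScript for readability, and every name in it is hypothetical.

```typescript
// Hypothetical skeleton of what prompts 2.i-2.iii request for one service.
// All names are illustrative; the case study targeted PHP rather than TypeScript.
interface EmployeeInput { name: string; department: string; startDate: string; }  // input class (2.iii)
interface EmployeeOutput extends EmployeeInput { employeeId: string; }            // output class (2.iii)

class EmployeeDAO {                                   // data access object class (2.ii)
  create(e: EmployeeInput): EmployeeOutput { throw new Error("stub: INSERT row"); }
  read(employeeId: string): EmployeeOutput | null { throw new Error("stub: SELECT by id"); }
  update(employeeId: string, e: Partial<EmployeeInput>): EmployeeOutput { throw new Error("stub: UPDATE row"); }
  remove(employeeId: string): boolean { throw new Error("stub: DELETE row"); }
}

class EmployeeService {                               // service class (2.i)
  constructor(private dao: EmployeeDAO) {}
  addEmployee(e: EmployeeInput): EmployeeOutput {
    // Business logic beyond CRUD was typically stubbed by ChatGPT and written manually (see Section 5.3).
    return this.dao.create(e);
  }
}
```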
4.4 Test Cases Generation

The test case generation prompt template takes as input the business domain, services, and screen details generated in the design phase. The prompt template covers prompts to generate various types of test cases along with test data. The 1st prompt sets the initial context using the inputs. In the 2nd prompt, test cases are generated for each service, covering test case types such as functional test cases, data validation test cases, and exception handling. Similarly, in the 3rd prompt, test cases are generated for each screen, covering test case types such as screen field interaction, screen flow navigation, and data validation. After the test case generation for all services and screens, system test cases are generated in the 4th prompt.

Prompt Template:
Input: <Business Domain>, <Services Spec: Design Spec Prompt 3 Response>, <Screens Spec: Design Spec Prompt 5 Response>
Prompts:
1. My application domain is <Business Domain>. Consider the following design specification: Services Specification: <Services Spec> Screens Specification: <Screens Spec>
2. For each <service> IN Services Spec - Generate test cases for <service> covering these test case types – Functional test case, Data validation test case, and Exception handling. Also, generate the test data for each test case.
3. For each <screen> IN Screens Spec - Generate test cases for <screen> covering these test case types – Screen field interaction, Screen flow navigation, Data validation. Also, generate the test data for each test case.
4. Generate system test cases for the application. Also, generate the test data for each test case.
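For instance, a functional test case produced by prompt 2 of this template pairs a description with concrete test data. The encoding below is a minimal sketch in TypeScript with hypothetical names; the paper reports the generated cases as text, not in any machine-readable format.

```typescript
// Hypothetical encoding of one generated test case together with its test data.
interface TestCase {
  id: string;
  type: "Functional" | "Data validation" | "Exception handling";
  description: string;
  testData: Record<string, unknown>;
  expected: string;
}

const addEmployeeValid: TestCase = {
  id: "TC-EMP-01",
  type: "Functional",
  description: "Add a new employee with valid data via EmployeeService.addEmployee",
  testData: { name: "Asha K", department: "Finance", startDate: "2023-04-01" },
  expected: "Employee record is created and an employeeId is returned",
};
```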
5 CASE STUDY

We used ChatGPT, a Generative AI technique, and an Employee Pension Plan System case study to validate the prompting approach. The requirements specification, design specification, and test case generation prompts are executed by an experienced person in one ChatGPT session.
Code generation prompts are executed by a novice developer. In this section, we share our experiences in applying the approach using ChatGPT for application development. Due to the large size of the ChatGPT response text, we do not show prompt responses in this paper. Instead, this section presents our positive and negative observations on the prompt responses. All responses are evaluated to check whether they address the prompt context, include all the necessary details, are consistent, and do not conflict with other responses.

5.1 Requirements Specification Generation

The requirements specification generation prompt template is used for the requirements specification generation of the case study. The scope of the application is limited by specifying business requirements. The input parameters used for the template are:

<Business Domain>: Employee Pension Domain
<Context>: INDIA Geography
<Business requirements>: i) On-board new employees to the pension plan and capture employee information. ii) Enable employees to allocate their funds to available asset types. iii) Allow administrators and employees to view their current and past fund balances and their contribution allocation on a monthly basis. iv) Record month-end returns for the various asset types and incorporate these returns into the employee balances. v) Provide employee-specific and overall dashboards of fund-wise balances.
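Instantiated with these parameters, the context-setting prompt of the requirements template (whose full text falls in a part of Section 4 not reproduced in this excerpt) would read along the following lines. This is a paraphrase following the template pattern used throughout the paper, not a verbatim quote:

```
My application domain is Employee Pension Domain. Context: INDIA Geography.
Consider the following business requirements: i) On-board new employees to the
pension plan and capture employee information. ii) Enable employees to allocate
their funds to available asset types. iii) Allow administrators and employees to
view their current and past fund balances and their monthly contribution
allocation. iv) Record month-end returns for the various asset types and
incorporate these returns into the employee balances. v) Provide
employee-specific and overall dashboards of fund-wise balances.
```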
Representative prompt executions and observations for requirements specification generation are shown in Table 1. Some review edit prompts were required for missed information. ChatGPT generated requirement specifications corresponding to Onboarding of New Employees, Employee Fund Allocation, Balance Tracking and Reporting, Recording of Month-end Returns, and Dashboards and Reports. The generated high-level data elements required a review edit prompt: the 1st Edit (1E) prompt shown in Table 1 was required for the addition of missing information such as grade, department, employee contact information, and employment start and end dates. ChatGPT generated processes corresponding to the five requirements – i) Onboarding of New Employees, ii) Employee Fund Allocation, iii) Balance Tracking and Reporting, iv) Recording of Month-end Returns, v) Dashboards and Reports – and provided a description of the processes. Each requirements specification covered the necessary activities, and the generated processes were accurate. ChatGPT generated multiple rules corresponding to each of the processes, along with a rules classification. A few examples of the classification output are as follows – i) Validation Rule: Ensure that all mandatory employee information fields are completed before enrolling an employee in the pension plan. ii) Display Rule: Display comprehensive information to employees upon their enrollment in the pension plan, including the selected contribution rate, available asset types, and investment strategy.

In total, ChatGPT generated 5 processes, 12 activities, 28 parameters, and 11 rules. All activities, rules, and parameters of the processes were verified for correctness. The generated requirements specifications were satisfactory and met the input business requirements.

5.2 Design Specification Generation

On finalizing the requirements specification, design specifications are generated using ChatGPT. The design specifications covered the creation of detailed entities, entity relationships, services, and screen specifications. Representative prompt executions and observations for design specification generation are shown in Table 2. ChatGPT could generate entities and relationships. The response covered five entities: Employee, Fund Allocation, Asset Type, Balance, and Returns; and four relationships: i) Employee to Fund Allocation, ii) Employee to Balance, iii) Balance to Employee and Asset, iv) Returns to Asset Type. Each entity was described with its attributes along with their data types. The primary key and foreign key attributes were also marked in the generated text. The generated entity attributes were consistent with the requirements specification parameters. We observed that all the generated specification details were in sync. ChatGPT generated textual descriptions of five Services – i) EmployeeService, ii) ContributionService, iii) DashBoardService, iv) AuthService, v) PensionPlanService. The prompt response was satisfactory. Each service covered the functionality details in terms of operations. For instance, the generated EmployeeService description covered functionality related to managing employee information and onboarding new employees to the pension plan. ChatGPT could generate service details such as the names of operations along with input and output data type details. A few of the operation specifications generated for EmployeeService are as follows: addEmployee(employee: Employee): Employee; updateEmployee(employeeId: string, updatedEmployeeInfo: Partial<Employee>): Employee; getEmployeeById(employeeId: string): Employee; validateEmployeeInfo(employee: Employee): Boolean; deleteEmployee(employeeId: string): boolean.

ChatGPT generated seven screen descriptions. Here, context forgetting was observed, and the screens were not as desired. So, we provided the screen names through a subsequent edit prompt (Table 2, 5E). The screen names given were: i) Onboarding Screen, ii) Contribution Allocation Screen, iii) Balance and Contribution Summary Screen, iv) Fund Performance Record Screen, v) Dashboard Screen. With this edit prompt, ChatGPT generated accurate screen specification details.
Table 2: Design specification generation – representative prompts and observations (excerpt)

Prompt 4: Generate further details for EmployeeService such as input and output parameters.
Observations: (+) It generated service details including the service description and method signatures with input and output parameters.

Prompt 5: Generate Screens for the above requirements specification.
Observations: (-) It generated high-level descriptions of screens – i) Login screen, ii) Employee dashboard, iii) Admin Dashboard, iv) Update employee screen, v) View employee screen. (-) These generated screens do not cover all the application's requirements.

Prompt 5E: Consider following screens: i) Onboarding Screen ii) Contribution Allocation Screen iii) Balance and Contribution Summary Screen iv) Fund Performance Record Screen v) Dashboard Screen. Generate specification details.
Observations: (+) It generated high-level functional descriptions for all the screens.

Prompt 6: Generate further details for Onboarding Screen such as UI Classes, Attributes, Buttons, and Screen flows.
Observations: (+) It generated textual information for this screen containing i) Form Fields (input/output controls), ii) Buttons, iii) validations.
Further, for each screen, ChatGPT generated detailed screen layout information. This included screen structure such as the screen title, screen fields, and screen behavior. For instance, for the On-Boarding Screen, it generated a screen structure description that covered Name (text input), Phone Number (text input), Department (dropdown list), Salary (number input), and Start Date (date picker), along with buttons for data submission. The generated text also covered screen behavior related to data validation, mandatory/optional data fields, data formats, minimum/maximum field values, and so on.

Overall, ChatGPT generated 4 entities, 4 entity relationships, 5 services, 13 operations, and 5 screens. The generated specifications were verified with respect to the input requirements specifications. Additionally, we checked whether the generated services included appropriate methods and input/output details, and whether the screen specifications included screen attributes, buttons, and screen flows. The generated design specifications were satisfactory and met the input requirements specifications.
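Restated as a structured value, the generated On-Boarding Screen description contains roughly the following. The field list comes from the response text as reported above; the TypeScript shape and the button label are our assumptions.

```typescript
// On-Boarding Screen structure as described in the generated layout text.
const onBoardingScreen = {
  title: "On-Boarding Screen",
  fields: [
    { label: "Name",         control: "text input" },
    { label: "Phone Number", control: "text input" },
    { label: "Department",   control: "dropdown list" },
    { label: "Salary",       control: "number input" },
    { label: "Start Date",   control: "date picker" },
  ],
  buttons: ["Submit"],  // "buttons for data submission"; the exact label is assumed
  behavior: [
    "data validation", "mandatory/optional data fields",
    "data format", "minimum/maximum field values",
  ],
};
```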
5.3 Code Generation

Code generation prompts are executed by a novice developer in a separate ChatGPT session. The generated entity specifications and high-level services are used as input to the prompt template. The technology stack is provided as follows: PHP, HTML, CSS, and MySQL. Representative prompt executions and observations for code generation are shown in Table 3. As shown in the 1st prompt of Table 3, the context is set through the design specification and technology stack. ChatGPT summarized the input given, which was as per the design specification. The subsequent prompts generated service class code, data access object (DAO) class code, and entity-specific class code for each service. Considering the entity relationships, ChatGPT automatically imported the dependent services' PHP files in the service class code. The DAO class code covered the CRUD (Create, Read, Update, Delete) operations; for the other functionalities, however, stubbed methods were generated, and the business logic was written manually. The generated code for services had some inconsistencies. Multiple instances of compilation errors occurred due to mismatches in parameters, classes, and operation naming, as well as in constructors. Code corrections related to missing imports were resolved manually, and the code for handling messages and user alerts was implemented manually. For instance, the ContributionService constructor did not take the DAO class as a parameter but created it inside the constructor, whereas in the EmployeeService constructor it was taken as a parameter. Business logic was plugged in wherever required, and manual code was written to establish database connectivity.

For screen generation, the screen layout descriptions of the design specification generated by the 5th prompt are given as input. The generated code is compiled and reviewed for syntax and semantic errors; errors were resolved manually. The generated HTML code was largely as per the screen specifications. There was a lack of consistency in the CSS: manual integration of CSS into HTML was required, as was manual resizing and positioning of UI components. The code generated by ChatGPT also lacked validation checks. For instance, no code is generated to prevent the insertion of duplicate records into the database, such as checking for existing entries with the same email address or other unique identifiers.
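The constructor inconsistency described above looks roughly as follows, reconstructed in TypeScript for readability (the actual generated code was PHP):

```typescript
// Reconstruction of the inconsistent construction styles across generated services.
class EmployeeDAO { /* CRUD methods */ }
class ContributionDAO { /* CRUD methods */ }

class EmployeeService {
  constructor(private dao: EmployeeDAO) {}             // DAO taken as a constructor parameter
}

class ContributionService {
  private dao: ContributionDAO;
  constructor() { this.dao = new ContributionDAO(); }  // DAO created inside instead
}
// Wiring code written against one style fails to compile against the other --
// one source of the constructor-related compilation errors noted above.
```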
Table 4: Code generation summary

Code                        Count
Number of Files Generated   42
LoC Generated               4914
LoC Written                 310
LoC Relevant                4604
LoC Not Relevant            237

5.4 Test Cases Generation

Overall, each test case was well-defined by ChatGPT and included different scenarios such as invalid inputs, data accuracy, screen navigation, and so on. For instance: 1) for the onboarding-employee operation of EmployeeService, ChatGPT generated a test case for the addition of a new employee with valid data, a test case for the addition of an employee with invalid data, etc.; 2) for the Onboarding Screen, ChatGPT generated a test case to validate that the Start Date is not later than the End Date, a test case to verify that the phone number follows a valid format, and a test case to validate that the email address is in a valid format; 3) it generated test cases for the transition from the Onboarding Screen to the Contribution Allocation Screen after successfully adding an employee; 4) for system test cases, ChatGPT covered an employee's ability to complete the onboarding process, navigate to the Contribution Allocation Screen, adjust the allocation, and save the changes.

It generated 35 test cases for screens, 39 test cases for services, and 11 test cases for system testing. Five system test cases covered non-functional testing related to performance, security, usability, accessibility, and browser compatibility. There were no review edits for the test cases. The generated test cases were used for testing: test cases related to services were tested through driver stub code, and test cases related to screens were executed from a web browser by deploying the code on the XAMPP platform. Fig. 3 shows sample representative screens of the application developed.
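Since service test cases were exercised through driver stub code, a minimal driver along the following lines would suffice. This is a sketch in TypeScript reusing the hypothetical EmployeeService skeleton from Section 4.3; the case study's actual drivers were PHP.

```typescript
// Minimal hand-written driver stub for executing one generated service test case.
import assert from "node:assert";

function runAddEmployeeValidDataTest() {
  // EmployeeService and EmployeeDAO stand for the generated classes (see Section 4.3 sketch).
  const service = new EmployeeService(new EmployeeDAO());
  const result = service.addEmployee({
    name: "Asha K", department: "Finance", startDate: "2023-04-01", // generated test data
  });
  assert.ok(result.employeeId, "a new employeeId should be assigned");
  console.log("TC-EMP-01 passed");
}
runAddEmployeeValidDataTest();
```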
Table 6 shows the prompts and effort breakup for the Employee Pension Plan application development using ChatGPT. The count of template prompts (T) is calculated considering the number of services and screens. The edit prompts (E) are used for correcting inaccuracies in the generated artifacts. Overall, the SDLC process using ChatGPT took us ∼22 person days. Productivity gain is computed as follows: Productivity Gain = (Estimated Effort – Actual Effort) / Actual Effort.
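As a worked example with hypothetical numbers (the estimated-effort figure for the case study appears in a portion of the paper not reproduced in this excerpt): if conventional development were estimated at 66 person days against the ∼22 person days actually spent, Productivity Gain = (66 − 22) / 22 = 2, i.e., a gain of 200%.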
Table 6: Prompts and Effort Summary
(T – Template prompt; E – Edit prompt; PD – Person Days)

SDLC Artifacts Generation    Prompts Count    Effort (PD)
Requirement Specifications   3T + 1E          ∼1 PD
Design Specifications        14T + 1E         ∼4 PD
Source Code                  31T + 14E        ∼15 PD
Test Cases                   12T              ∼2 PD
Total                        76 prompts       ∼22 PD

7 DISCUSSION AND FUTURE WORK

The SDLC process demands Subject Matter Experts (SMEs) with phase-specific skills, and the efficacy and quality of the software are skill dependent. To reduce this skills barrier and to accelerate software development, in this paper we propose the use of Generative AI techniques. The use of Generative AI demands prompt engineering, as the quality of the response depends on the quality of the prompt. To address this, we proposed a systematic prompting approach. The prompting strategy encompasses several key aspects: i) prompts are designed using meta-model concepts; ii) prompt templates are designed specifically for each SDLC phase and can be executed in separate LLM sessions, with the first prompt of each phase-specific template setting the appropriate context for subsequent generation; iii) SDLC phase-specific prompts also cover the specific needs of a phase. For instance, for rules generation in the requirements specification phase, predefined rule types are given in the prompt; in the design phase, the strategy was to first generate high-level descriptions for services and screens, followed by detailed generation; and for code generation, code corresponding to a single file is generated to reduce copy-paste effort.
The prompting approach is evaluated on a small yet complex business application using ChatGPT. There were many lessons learned from this evaluation. The prompting approach of breaking down the application functionality into smaller parts worked to a large extent. Using ChatGPT, we could generate multiple artifacts of the SDLC phases. However, it was essential to validate the generated artifacts at each step to prevent the propagation of errors into subsequent phases, and the involvement of a subject-matter expert was crucial for effective validation. Requirements specification generation and test case generation gave satisfactory outputs. However, multiple challenges were observed in design specification and code generation. While the response text from ChatGPT was semantically correct, there were a few instances where the labels used in the response text did not match those specified in the prompts. For example, in one iteration, the recommended entities were "Employee Pension Contribution", "Contribution Percentage", "Asset Returns", "Fund Balance", "Users", "Roles", and "Role Privileges". In other iterations, the recommended entities were "Employee", "Contribution", "Pension Plan", "Fund", "User", and "Performance Record". Similarly, the generated code, content, and output forms varied, with differences in the number of services generated, the names of functions and parameters, and the format of the output. This non-determinism resulted in increased effort required for review and modification.

Occasional difficulty in recalling past conversations was noted when referencing previously generated output. To overcome this issue, the output had to be given back as input multiple times, and subsequent improvements had to be requested through prompts. At times, past conversation references were given with annotations such as short names, keywords, etc. For inconsistencies in code generation, the previous output code was given as input, and subsequent improvements were asked for through prompts; multiple prompts were required to direct corrections. The generated code was functionally correct, but not the best: we observed that code improvements were possible for error handling and performance.

Generative AI for software engineering, particularly for large and complex applications, is a promising area of research, and further investigation is required to fully evaluate its potential and limitations. Generative AI-based code generation may not be as effective as the Model Driven Engineering (MDE) based approach, especially for large application development. MDE presents a solution that shifts the focus to creating problem-specific models and using them for automated validation, analysis, and code generation. The benefits of enhanced developer productivity through automated code generation have been proven with MDE. However, MDE poses a significant entry barrier for Subject Matter Experts (SMEs), who are typically not well-versed with the technology [18]. In the future, we plan to combine MDE with Generative AI to leverage the benefits of both: SMEs can interact with Generative AI tools like ChatGPT using purpose-specific contextual prompts, and the design model can be populated automatically. We are currently exploring the use of a meta-model to guide automatic model population using LLMs.

The maintenance of large applications is a crucial area that requires attention. The correlation between the contextual information generated by Generative AI and the custom application-specific contextual information needs to be explored to facilitate evolutionary maintenance. Non-determinism in the generation process is a significant challenge, as there are natural-language text variations, label variations, and structure variations when correlating previous and new artifacts. To address this issue, we are exploring prompt engineering to reduce non-determinism. This paper discussed a waterfall approach to the SDLC; we are exploring other approaches, such as Agile software development. We also plan to apply the approach across various domain applications and LLMs.

REFERENCES
[1] Wojciech Zaremba and Greg Brockman, OpenAI, 2021. OpenAI Codex. (Aug 2021). Available at: https://openai.com/blog/openai-codex
[2] OpenAI, 2023. Introducing ChatGPT. (Nov 2023). Available at: https://openai.com/blog/chatgpt
[3] Ashley Pilipiszyn, OpenAI, 2021. GPT-3 Powers the Next Generation of Apps. (Mar 2021). Available at: https://openai.com/blog/gpt-3-apps
[4] Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv. 55, 9, Article 195 (September 2023), 35 pages. DOI: https://doi.org/10.1145/3560815
[5] Mohammad Aljanabi, 2023. ChatGPT: Future directions and open possibilities. Mesopotamian Journal of Cybersecurity (2023), 16-17. DOI: https://doi.org/10.58496/MJCS/2023/003
[6] Mohammad Aljanabi, Mohanad Ghazi, Ahmed H. Ali, and Saad A. Abed, 2023. ChatGPT: Open Possibilities. Iraqi Journal for Computer Science and Mathematics (Jan 2023), 62-64. DOI: https://doi.org/10.52866/20ijcsm.2023.01.01.0018
[7] Jianzhang Zhang, Yiyang Chen, Nan Niu, Yinglin Wang, and Chuang Liu, 2023. A Preliminary Evaluation of ChatGPT in Requirements Information Retrieval. DOI: https://doi.org/10.48550/arXiv.2304.12562
[8] Jules White, Sam Hays, Quchen Fu, Jesse Spencer-Smith, and Douglas C. Schmidt, 2023. ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design. DOI: https://doi.org/10.48550/arXiv.2303.07839
[9] Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C. Schmidt, 2023. A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT. DOI: https://doi.org/10.48550/arXiv.2302.11382
[10] Theodoros Galanos, Antonios Liapis, and Georgios N. Yannakakis, 2023. Architext: Language-Driven Generative Architecture Design. DOI: https://doi.org/10.48550/arXiv.2303.07519
[11] Aakash Ahmad, Muhammad Waseem, Peng Liang, Mahdi Fahmideh, Mst Shamima Aktar, and Tommi Mikkonen, 2023. Towards Human-Bot Collaborative Software Architecting with ChatGPT. In Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering (EASE '23), (June 2023), 279-285. DOI: https://doi.org/10.1145/3593434.3593468
[12] Nitarshan Rajkumar, Raymond Li, and Dzmitry Bahdanau, 2022. Evaluating the Text-to-SQL Capabilities of Large Language Models. DOI: https://doi.org/10.48550/arXiv.2204.00498
[13] Aiwei Liu, Xuming Hu, Lijie Wen, and Philip S. Yu, 2023. A Comprehensive Evaluation of ChatGPT's Zero-Shot Text-to-SQL Capability. DOI: https://doi.org/10.48550/arXiv.2303.13547
[14] Haoye Tian, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, and Tegawendé F. Bissyandé, 2023. Is ChatGPT the Ultimate Programming Assistant—How Far Is It? DOI: https://doi.org/10.48550/arXiv.2304.11938
[15] Burak Yetistiren, Isik Ozsoy, and Eray Tuzun, 2022. Assessing the Quality of GitHub Copilot's Code Generation. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '22), (Nov 2022), 62-71. DOI: https://doi.org/10.1145/3558489.3559072
[16] Priyan Vaithilingam, Tianyi Zhang, and Elena L. Glassman. 2022. Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models. In CHI Conference on Human Factors in Computing Systems Extended Abstracts (CHI EA '22), (April 2022), 1-7. DOI: https://doi.org/10.1145/3491101.3519665
[17] Arailym L. Talasbek. 2023. The Automation Capabilities in the Field of Software Testing. Suleyman Demirel University Bulletin: Natural and Technical Sciences 62, (Mar 2023), 5-14.
[18] Jon Whittle, John Hutchinson, and Mark Rouncefield. 2014. The State of Practice in Model-Driven Engineering. IEEE Software 31, 3 (May 2014), 79-85. DOI: https://doi.org/10.1109/ms.2013.65
[19] Nat Friedman. 2021. Introducing GitHub Copilot: Your AI Pair Programmer. URL: https://github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer
[20] Rohith Pudari and Neil A. Ernst, 2023. From Copilot to Pilot: Towards AI Supported Software Development. DOI: https://doi.org/10.48550/arXiv.2303.04142
[21] Arghavan M. Dakhel, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, and Zhen Ming Jiang. 2022. GitHub Copilot AI Pair Programmer: Asset or Liability? DOI: https://doi.org/10.48550/arXiv.2206.15331
[22] Yihong Dong, Xue Jiang, Zhi Jin, and Ge Li, 2023. Self-collaboration Code Generation via ChatGPT. DOI: https://doi.org/10.48550/arXiv.2304.07590
[23] Ameya S. Pothukuchi, Lakshmi V. Kota, and Vinay Mallikarjunaradhya, 2023. Impact of Generative AI on the Software Development Lifecycle (SDLC). International Journal of Creative Research Thoughts 11, (Aug 2023).
[24] Kun Ruan, Xiaohong Chen, and Zhi Jin, 2023. Requirements Modeling Aided by ChatGPT: An Experience in Embedded Systems. In Proceedings of the 31st IEEE International Requirements Engineering Conference (RE '23), (Sep 2023), 170-177. DOI: https://doi.org/10.1109/REW57809.2023.00035
[25] Donald J. Reifer. 2000. Web Development: Estimating Quick-to-Market Software. IEEE Software 17, (June 2000), 57-64. DOI: https://doi.org/10.1109/52.895169
[26] Maurice H. Halstead. 1977. Elements of Software Science (Operating and Programming Systems Series). Elsevier Science Inc.
[27] Asha Rajbhoj, Padmalata Nistala, Ajim Pathan, Piyush Kulkarni, and Vinay Kulkarni, 2023. RClassify: Combining NLP and ML to Classify Rules from Requirements Specifications Documents. In Proceedings of the 31st IEEE International Requirements Engineering Conference (RE '23), 180-189. DOI: https://doi.org/10.1109/RE57278.2023.00026