Test 2 29

Uploaded by

lee parker

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

20 views14 pages

Test 2 29

Uploaded by

lee parker

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

The Performance of the LSTM-based Code Generated by Large

Language Models (LLMs) in Forecasting Time Series Data

Saroj Gopalia,∗ , Sima Siami-Naminib , Faranak Abric and Akbar Siami Namina
a Department of Computer Science, Texas Tech University, USA
b AdvancedAcademic Programs, Johns Hopkins University, USA
c Department of Computer Science, San Jose State University, USA

ARTICLE INFO ABSTRACT

Keywords: Generative AI, and in particular Large Language Models (LLMs), have gained substantial momentum
Large Language Models (LLMs) due to their wide applications in various disciplines. While the use of these game changing technolo-
arXiv:2411.18731v1 [cs.AI] 27 Nov 2024

Code Generation gies in generating textual information has already been demonstrated in several application domains,
Forecasting Time Series Data their abilities in generating complex models and executable codes need to be explored. As an intriguing
Deep Learning Models case is the goodness of the machine and deep learning models generated by these LLMs in conducting
Long Short-Term Memory (LSTM) automated scientific data analysis, where a data analyst may not have enough expertise in manually
Prompt Engineering coding and optimizing complex deep learning models and codes and thus may opt to leverage LLMs to
Falcon generate the required models. This paper investigates and compares the performance of the mainstream
LLama-2 LLMs, such as ChatGPT, PaLM, LLama, and Falcon, in generating deep learning models for analyzing
GPT-3 time series data, an important and popular data type with its prevalent applications in many application
PaLM. domains including financial and stock market. This research conducts a set of controlled experiments
where the prompts for generating deep learning-based models are controlled with respect to sensitivity
levels of four criteria including 1) Clarify and Specificity, 2) Objective and Intent, 3) Contextual
Information, and 4) Format and Style. While the results are relatively mix, we observe some distinct
patterns. We notice that using LLMs, we are able to generate deep learning-based models with
executable codes for each dataset seperatly whose performance are comparable with the manually
crafted and optimized LSTM models for predicting the whole time series dataset. We also noticed
that ChatGPT outperforms the other LLMs in generating more accurate models. Furthermore, we
observed that the goodness of the generated models vary with respect to the “temperature” parameter
used in configuring LLMS. The results can be beneficial for data analysts and practitioners who would
like to leverage generative AIs to produce good prediction models with acceptable goodness.

1. Introduction trillion to $4.4 trillion across various use cases, surpassing

the 2021 GDP of the United Kingdom at $3.1 trillion [5].
Large language models (LLMs) such as ChatGPT [15], Such integration could enhance the overall influence of
LLaMa [22], Falcon [16], and PaLM [4] are gaining popu-
artificial intelligence by up to 40% with the possibility of
larity on a regular basis and for a variety of reasons. These
further doubling this estimate by incorporating generative
generative models are already playing an integral role in AI into existing software applications used for tasks beyond
assisting people with their day-to-day duties, such as gener- the initially analyzed use cases.
ating code, writing emails, assisting with projects, and many As per Salesforce’s findings [20], 61% of employees
more. As a result, a wider range of users is involved in deal- already employ or intend to utilize generative AI for ac-
ing with LLMs. According to a report published by Markets
complishing their tasks. Additionally, 68% of employees
and Markets Research Pvt. Ltd.[13], the global market for
believe that generative AI can enhance their ability to serve
generative AI is expected to achieve at a Compound Annual customers better. Moreover, 67% of employees feel that it
Growth Rate (CAGR) of 35.6% between 2023 and 2028, can amplify the benefits derived from other technological
indicating significant potential opportunities. The CAGR investments. These insights highlight the growing adoption
was valued at $11.3 billion in 2023 and it is expected to rise of LLMs in various professional settings.
to almost $51.8 billion by 2028. Looking ahead, industry
In this research work, we have studied the performance
forecasts estimate that the value of generative AI market
of the following large language models:
might reach $191.8 billion by 2032.
Recent research suggests that generative AI-based appli- • GPT−3.5−Turbo 1 is an efficient model that are de-
cations could contribute an annual value ranging from $2.6 signed for natural languages and code comprehension.
∗ Corresponding author: Department of Computer Science, Texas Tech
• Falcon [16], is one of the most effective and optimized
University
∗ Principal corresponding author language models based on high quality, large scale
[email protected] (S. Gopali); [email protected] (S. training data.
Siami-Namini); [email protected] (F. Abri); [email protected] 1 https://platform.openai.com/docs/models/gpt-3-5
(A.S. Namin)
ORCID (s): 0000-0003-3565-9756 (S. Gopali); 0000-0002-3758-4172 (S.
Siami-Namini); 0000-0003-3028-094X (F. Abri); 0000-0002-1646-7495 (A.S.
Namin)