Abstract
In this study, we present a Federated Learning Enhanced Multi-Layer Perceptron–Long Short-Term Memory (Fed-MLP–LSTM) model. The research intends to use LSTM networks, which are proficient at capturing temporal dependencies, and integrate them with the collaborative learning framework of Federated Learning in order to augment predictive capability. In the first step, we gather stock market indices from various financial organizations, using the CAC40 stocks as the index for the French stock market. To guarantee data consistency and quality, pre-processing methods including linear interpolation and Z-score normalization are used. The Fed-MLP–LSTM comprises two core model components: an MLP for feature extraction and an LSTM for sequence modeling. Each participating institution trains a local MLP–LSTM on its institution-specific corpus, with only the model parameters being transferred to a central server through Federated Learning. A global model is created and updated through repeated rounds of training and parameter aggregation while preserving the privacy of the data at each node. In the performance evaluation, quantitative measures such as Root-Mean-Square Error (RMSE) and accuracy are used. Hypothesis testing provides good evidence that the proposed Fed-MLP–LSTM outperforms the other methods, with the lowest RMSE of 0.0108 and an accuracy of 98.3%. The proposed method is implemented in Python. This suggests that using Federated Learning with MLP and LSTM as components enhances the model's capacity and reliability in predicting stock trends. In conclusion, the present study offers a sound solution for effective and secure stock market forecasting in collaborative environments that can find use in the financial domain and securities businesses.
1 Introduction
Many non-FL frameworks and machine learning models have been used to predict stock market movements [1, 2]. One such study anticipates Indian stock markets over 10 years using Artificial Neural Network, Support Vector Machine, Random Forest, and Naive Bayes algorithms. A Hidden Markov Model is used in an existing study to forecast the stock values of four foreign airlines. Shen and Shafiq [3] gathered, pre-processed, and analyzed 2 years' worth of Chinese stock market records to develop an LSTM deep learning algorithm for stock market trend prediction. The technique suggested in [4] takes advantage of LSTM to lessen the inaccuracies that often arise in one-way forecasting LSTM models. To do this, it adds macro variables, such as interest rates, economic growth rates, and economic indicators, and analyses trade balances, exchange rates, and currency quantities; all of this information is used to train a two-way prediction model. Based on patterns of data distribution, FL frameworks are divided into three groups: horizontal federated learning, vertical federated learning, and federated transfer learning. Horizontal FL is used for datasets that share the same feature space with different samples, while vertical FL is used for the same data samples with distinct feature spaces. Federated transfer learning is employed when there is little-to-no overlap between the samples and features [5]. Two technologies underpin FL approaches: encryption technology and distributed machine learning (DML). The goal of DML (such as MapReduce) is to train machine learning models concurrently on multiple computational nodes [6]. Nevertheless, FL and DML differ for the following reasons:

- Control: Unlike typical DML methods such as MapReduce, FL does not permit the server to directly or indirectly modify the data of worker nodes; the server can only coordinate the worker nodes.

- Data distribution and load balancing: To improve model training efficiency and facilitate load balancing, DML typically assumes independent and identically distributed (IID) data. FL, on the other hand, cannot assume IID data and may give the worker nodes distinct data partitions.

- Communication cost: Since worker and server nodes are typically situated close to one another, DML applications incur low communication costs [7]. Because FL nodes are connected through cloud-based communication channels, FL applications carry a significant communication overhead.

- Communication quality: Owing to their strategic locations, DML nodes typically have access to high-speed broadband, so the operating environment and the DML network remain stable. FL worker nodes, on the other hand, experience higher workloads because of the increased demands of the distributed system [8].
The application of deep learning to stock market prediction has garnered significant interest due to its ability to model complex, non-linear relationships and capture temporal dependencies within financial data [9, 10]. Deep learning models, including Multi-Layer Perceptrons and Long Short-Term Memory networks, offer numerous advantages for stock market prediction [10]. MLPs are adept at extracting intricate features from raw data, identifying patterns that might be overlooked by conventional statistical methods [11]. LSTMs, in turn, are specifically designed to handle sequential data, making them highly effective at capturing trends and dependencies over time, which is crucial for accurate forecasting in highly dynamic and volatile stock markets. To leverage these advantages, we suggest a comprehensive workflow that integrates MLP and LSTM within a federated learning framework for stock market prediction using the CAC40 dataset [12]. The workflow begins with the gathering and pre-processing of data from multiple financial institutions, ensuring that the data are standardized and any missing values are imputed using strategies such as Z-score normalization and linear interpolation. Each institution independently trains a local MLP to extract meaningful features from the raw data, which are then passed to an LSTM network to model the temporal sequences.
Instead of sharing raw data, institutions share their model parameters with a central server. Using Federated Averaging, the server aggregates those parameters to form a global model that embodies the collective intelligence of all participating nodes. This global model is redistributed back to the institutions, iterating through several rounds of local training and parameter aggregation. This federated learning method guarantees data privacy and security while enhancing the model's generalizability and robustness by learning from diverse data sources. By combining the feature extraction skills of MLPs with the sequence modeling strengths of LSTMs, and ensuring privacy through federated learning, this integrated pipeline aims to offer an effective and secure solution for accurate stock market prediction. The use of Federated Learning (FL) in this paper to improve MLP–LSTM modeling within an integrated deep learning process for stock market prediction helps exploit the decentralized data of numerous clients while preserving data security. LSTM networks remain one of the best methods for analyzing temporal dependencies and trends in sequential data, including time series of stock prices. However, LSTMs can struggle with high-dimensional raw input data. Pre-processing the data fed to the LSTM with an MLP can enhance the input features by filtering features based on their importance and removing background noise. This cooperation improves the efficiency of the consolidated deep learning process and its related components. Conventional machine learning architectures involve storing large volumes of data on central database servers, an approach that is extremely problematic for privacy as well as data storage and management, especially with financial data. To tackle these concerns, Federated Learning permits individual clients to train models separately on private data and then transmit only the learned model parameters to a central server. This means that raw data always stay on clients' devices, which is appropriate from the perspective of data protection legislation. By combining the locally trained models, the central server can create a global model that draws on various sources of data without sharing the clients' data. For the MLP–LSTM model, this translates into a number of benefits, including resilience and flexibility across a wide variety of applications, because the model can consider diverse data patterns and stock market trends, which ultimately leads to increased accuracy and reliability of its predictions. The key contributions of the entire work can be summarized as follows:
-
The novel federated learning framework is implemented for stock market prediction, leveraging the collaborative intelligence of multiple institutions while preserving data privacy and security.
-
FED-MLP–LSTM is utilized for feature extraction and sequence modeling with enhanced prediction accuracy.
-
The federated framework ensures data privacy and security while leveraging diverse data sources to enhance model accuracy and robustness.
-
The performance evaluation metrics, including Root-Mean-Square Error, accuracy, ROC curves, and area under the curve, are calculated to quantitatively assess the models' predictive capabilities and discriminative abilities.
The structure of the paper is as follows: The difficulties of predicting the stock market are discussed in Sect. 1 along with the suggested integrated machine learning and deep learning pipelines. The field's related studies are reviewed in Sect. 2, which also highlights the drawbacks of current approaches. The problem statement and the study's goals are defined in Sect. 3. The technique is expounded upon in Sect. 4, wherein the stages of data preparation, feature design, model integration, and assessment are outlined. Experiments are shown and a thorough explanation is given in Sect. 5. Ultimately, Sect. 6 brings the work to a close by reviewing future research prospects and summarizing important results.
2 Related Works
Kumbure et al. [11] explain that a substantial amount of research has been conducted on nowcasting US GDP growth rates from 2000Q2 to 2018Q4, using a variety of approaches such as sophisticated tree-based ensemble algorithms and linear dynamic factor systems. The use of dynamic factor designs, which are based on several financial and macroeconomic variables, has helped to solve the ragged-edge problem. GDP nowcasting results show that tree-based ensemble models, including gradient tree boosting, random forests, and bagged decision trees, perform better than linear dynamic factor methods. The impact of financial and price variables on GDP projections is more pronounced in post-2008–9 periods, indicating the influence of loose monetary policies and the financial crisis on real variables, which are thought to be more influential in machine learning models. On the other hand, the dearth of research covering the years following the COVID-19 crisis reveals a serious knowledge gap regarding the effects of exceptionally loose monetary policy on GDP estimation. Further studies should take this period into account to offer a more thorough understanding of the dynamic link between monetary policy and GDP projections.
In the framework of financial market projections, Carta et al. [12] emphasize how crucial the ability to provide excellent forecasting results from a minimal quantity of input data is. Their work emphasizes the applicability of computational approaches for identifying complicated non-linear dynamics and delivering relevant predictions, particularly in the absence of prior knowledge concerning the statistical distribution of the data being used, given the inherent uncertainty in stock market modeling. The study includes a thorough analysis of more than 100 published works, including those focused on using artificial neural and neuro-fuzzy methods for predicting stock markets. These works are classified according to a number of factors, such as the features of the input data, prediction approaches, performance evaluation methods, and the particular success metrics used. The overall results from this thorough investigation highlight the general acceptance and usefulness of computerized methods for analyzing and rating stock market behavior, making a substantial contribution to the literature in the area. However, the drawbacks and difficulties of certain computational approaches in financial prediction receive inadequate analysis.
Kumar, Jain, and Singh [13] examine the limited use of analytical forecasting approaches in businesses, highlighting a predominance of management discretion in demand estimation. They emphasize how increasingly accurate estimation is required as commercial and operational characteristics change. The study outlines criteria essential for a thorough market estimation review by drawing on ideas from the research on "the diffusion of technology" and "obstacles to successful deployment." The article analyses the firm's present market prediction procedures and highlights weaknesses identified by its executives through an extensive investigation inside a multi-divisional organization. The study offers explanations for the underuse of statistical forecasting techniques. Finally, it addresses the underlying factors that contribute to inadequate forecasting practices and makes suggestions for solutions. This research examines only a single organization, which may restrict the applicability of its conclusions to other situations. It highlights problems and causes, but it does not provide enough solid real-world evidence to back up its assertions regarding the drawbacks of statistical forecasting methodologies.
Mukherjee et al. [14] explain why it is important to anticipate the behavior of the stock market, which is among the more active research disciplines at the moment. The stock market can be unpredictable and requires careful examination of trends in the data. To get past this barrier and offer a practical solution, sophisticated mathematical frameworks and AI software are required. Many ML and DL algorithms can generate a reliable forecast with minimal error probability. The two network models currently in common use for predicting stock market values are the CNN and the ANN, the latter also referred to as the Deep Feed-back Neural Network. The algorithms have been utilized for predicting values for the upcoming days based on the values from the previous few days; this cyclic process persists for as long as the dataset remains valid. This forecast has been optimized by the application of DL, with noteworthy outcomes: the ANN model achieved 97.66% accuracy, compared with the CNN model's 98.92% accuracy. The CNN model forecasts by analyzing 2D graphs generated from the quantized data over a certain time frame; this type of dataset analysis had never been done before using this approach. The framework has also been assessed using the recent COVID-19 pandemic, which prompted the stock market to drop sharply, as an illustration. The findings of this evaluation were rather good, with a 91% accuracy rate.
Stockholders have long found it interesting and challenging to anticipate the values of stock categories due to their complexity, volatility, and non-linearity. The focus of [15] is the future outlook of stock market institutions. For experimental evaluation, four divisions of the Tehran Stock Exchange were chosen: diversified finance, oil, non-metallic minerals, and fundamental metals. A decade's worth of historical data was collected for each category. Many ML techniques were used to forecast future stock market values for the groups: ANN, RNN, decision trees, bagging, random forests, gradient boosting, XGBoost, and LSTM were all employed, with ten technical characteristics selected as inputs for each forecasting model. Finally, the anticipated results for all methods were displayed according to four factors. Among all the methods used in that work, LSTM yields results with the highest accuracy and greatest model-fitting capability. Moreover, the tree-based models Adaboost, gradient boosting, and XGBoost are sometimes closely competitive.
Table 1 shows the advantages and disadvantages of the literature review. Numerous research projects have looked at the use of different computational techniques to forecast GDP growth rates and stock market behavior. When it comes to nowcasting US GDP growth rates, tree-based ensemble models perform better than linear dynamic factor techniques, especially in the post-2008–9 years [14]. This suggests that the financial crisis and loose monetary policies have influenced real variables. Machine learning approaches like LSTM surpass traditional statistical techniques in stock market prediction; among the algorithms examined, LSTM demonstrates the best model-fitting capability and highest accuracy [13]. CNNs, which are deep learning algorithms, have also been used and have demonstrated good accuracy rates, especially during times of notable market volatility such as the COVID-19 pandemic. These studies demonstrate the accuracy with which sophisticated computational tools may anticipate the dynamics of the financial markets and the significance of factoring in current events and dynamic aspects when making forecasts. The usefulness of machine learning approaches is highlighted by the available research, which also offers insights into sophisticated computational methods for predicting and nowcasting in the financial and economic spheres. While some studies are limited by their focus on specific datasets or circumstances, which limits generalizability and necessitates broader validation, others are hampered by failing to thoroughly analyze the downsides or limitations of their methodology, thereby exaggerating their apparent efficacy.
3 Problem Statement
Increasing uncertainty, the diversification of financial instruments, and the growing complexity of stock exchange markets require the development of more accurate and efficient predictive models that can reflect temporal dependencies and capture subtle patterns. Existing machine learning and deep learning methods, however, suffer from several demerits, including overfitting, poor generalization, difficulties in fusing heterogeneous data, and the requirement of exhaustive data sharing, which has proven to be a major data privacy issue. Moreover, such methods are often derived for a specific dataset, restricting their use in practice. To manage these challenges, this study aims to develop a Federated Learning Enhanced MLP–LSTM model, where the MLP is employed for feature extraction, the LSTM for sequence modeling, and federated learning as the general framework. This approach improves the accuracy of the forecast by taking advantage of the MLP and LSTM algorithms, keeps data confidentiality and integrity, and allows several financial institutions to jointly train a strong model without exchanging customers' information [13].
4 Proposed Fed-MLP–LSTM Model for Stock Market Prediction
The general process of developing the Federated Learning Enhanced MLP–LSTM Model for stock market prediction with the CAC40 dataset, illustrated in Fig. 1, reflects a strict, well-thought-out systematic process that combines the strengths of the MLP–LSTM and Federated Learning models while properly handling data privacy and security issues. It starts with data gathering, where every bank participating in the study collects and preprocesses local stock exchange data involving the CAC40 index. These features are pre-processed to a common standard, using Z-score normalization and other normalization techniques. In the first phase of local model training, each institute applies a Multi-Layer Perceptron to feature engineering, extracting substantial features that are later used in the training phase. The multi-layer structure of the MLP neural network enables it to effectively capture non-linear interaction patterns among the features of the training dataset, transforming the inputs into an enriched feature set. These data are then provided as input to the Long Short-Term Memory network, which is capable of handling sequential data and thus captures the temporal dynamics of the given stock market.
The model parameters (weights and biases) of each local model are then uploaded to a central server instead of raw data being exchanged among the institutions. The parameters shared by the institutions are aggregated by the server using the Federated Averaging technique to produce a global model containing knowledge from all learning centers. This aggregated model is then sent back to every institution; hence, the model benefits from a broad base of data sources without infringing on the privacy of the users contributing data. Training proceeds in rounds: locally calculated parameters are shared, averaged, and redistributed, and the process is repeated to enhance the model's accuracy and stability. With the MLP handling feature extraction and the LSTM specializing in sequence modeling within the federated learning workflow, the outcome is better prediction performance, less overfitting, and improved scalability. The federated approach establishes data privacy and security, which is essential for multi-site collaboration where data sensitivity is a concern. This integrated pipeline exemplifies a powerful approach to building reliable stock market prediction models.
4.1 Data Collection
The benchmark index for the French stock market is the CAC40, listed on the Bourse de Paris. Out of the 100 highest market values on Euronext Paris, the index measures the 40 most significant stocks by capitalization. CAC stands for Cotation Assistée en Continu, which translates to "continuous assisted trading," and the benchmark is used by funds that trade in the French stock market. The dataset is well suited to applying exploratory data analysis to time series, studying out-of-sample prediction error estimators and the Akaike information criterion, tuning models, establishing a stock forecasting technique, and determining the peak stress times for a certain industry [16]. The split percentages for training and testing are set to 70% and 30%, respectively. The initial time series of the three markets are not stationary; however, a time series does not have to be stationary to be used with deep learning techniques. The seven attributes listed in the dataset sections above are selected as the input variables for all of the techniques in this study, which use the index's close price on the following day as the label to generate predictions. In the MLP model, we create four hidden layers with 70, 28, 14, and 7 neurons, respectively, and one fully connected layer serving as the output layer. The activation function for the hidden layers is the ReLU function. The hidden layer of the LSTM model contains 140 units, and the output layer consists of a single fully connected layer; the sigmoid and tanh functions are used in the LSTM cell. For the CNN baseline, three convolution layers with 7, 5, and 3 channels are selected, each with a kernel size of three, and a dropout layer is applied after every convolution layer. The hidden units of the two RNNs in the UA model are set to 70, and V has an embedding size of 7; the final predictions are produced by a single fully connected layer. The time step is 20 for all approaches, and the learning rate is 0.01. Our loss function of choice is the Mean Square Error (MSE) loss, optimized using the Adam method. The batch size is set to 64, and all input data are shuffled before training. PyTorch was used to implement all methods.
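As a concrete illustration, the following is a minimal PyTorch sketch of the local MLP–LSTM with the layer sizes and training settings stated above; the class and variable names are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class MLPLSTM(nn.Module):
    """Local MLP-LSTM: an MLP feature extractor followed by an LSTM, as described above."""
    def __init__(self, n_features=7, lstm_units=140):
        super().__init__()
        # Four hidden layers with 70, 28, 14, and 7 neurons and ReLU activations.
        self.mlp = nn.Sequential(
            nn.Linear(n_features, 70), nn.ReLU(),
            nn.Linear(70, 28), nn.ReLU(),
            nn.Linear(28, 14), nn.ReLU(),
            nn.Linear(14, 7), nn.ReLU(),
        )
        # LSTM with 140 hidden units modelling the 20-step input sequences.
        self.lstm = nn.LSTM(input_size=7, hidden_size=lstm_units, batch_first=True)
        # Single fully connected output layer predicting the next-day close price.
        self.head = nn.Linear(lstm_units, 1)

    def forward(self, x):                  # x: (batch, 20, n_features)
        b, t, f = x.shape
        feats = self.mlp(x.reshape(b * t, f)).reshape(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1, :])    # prediction from the last time step

model = MLPLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # Adam, lr = 0.01
criterion = nn.MSELoss()                                   # MSE loss
```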
4.2 Data Pre-processing
The purpose of data pre-processing is to improve the value and accuracy of financial information for subsequent analysis. Using effective imputation approaches, we handle missing values while preserving the accuracy and validity of the data. The Z-score approach is also used to standardize the data by putting all features on a similar scale, which makes it easier to spot significant developments and patterns in the financial statistics. Activities such as stock market nowcasting and forecasting, risk evaluation, and investment decision-making benefit from these pre-processing procedures, because they provide a crucial basis for more accurate and reliable analysis. A two-stage approach is used to replace observed irregularities with adjustments comparable to the original data. After any outliers have been removed, a time-based imputation is applied using the pandas library. By utilizing a linear interpolation approach, the algorithm determines the value of the gap between two data points. The missing value, represented as y, may be calculated using Eq. (1):

$$y = y_{1}^{\prime} + \frac{\left(x - x_{1}^{\prime}\right)\left(y_{2}^{\prime} - y_{1}^{\prime}\right)}{x_{2}^{\prime} - x_{1}^{\prime}} \quad (1)$$

where \(x_{1}^{\prime}\) and \(x_{2}^{\prime}\) are the time indices of the two boundary points, \(y_{1}^{\prime}\) and \(y_{2}^{\prime}\) are their associated values, and x is the time index of the missing observation between \(x_{1}^{\prime}\) and \(x_{2}^{\prime}\). Data imputed with this approach deviate less from the average variation of the nearby normal data.
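For illustration, the same time-based linear imputation of Eq. (1) can be performed with the pandas library mentioned above; the prices and dates below are made-up examples.

```python
import pandas as pd

# Hypothetical daily close prices with one missing value to impute.
prices = pd.Series(
    [5510.0, 5523.4, None, 5540.2],
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# Time-based linear interpolation: the gap is filled proportionally to its
# distance from the two boundary points, as in Eq. (1).
filled = prices.interpolate(method="time")
print(filled)  # the missing value becomes (5523.4 + 5540.2) / 2 = 5531.8
```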
The Z-score normalizing method is then used for pre-processing: it is a method for getting the initial data ready for the machine learning model to analyze. Considering the mean and standard deviation of the initially collected data, Z-score normalization creates a standardized result. Equation (2) shows how the raw information is normalized using the Z-score:

$$k_{n}^{\prime\prime} = \frac{k_{n} - \mu}{\sigma} \quad (2)$$

where \(k_{n}^{\prime\prime}\) is the standardized Z-score, \(k_{n}\) is the raw value in row n, and \(\mu\) and \(\sigma\) are the mean and standard deviation of the column in which the value occurs. Applied to each column, the Z-score method produces standardized data with a mean of 0 and a standard deviation of 1. Like the Min–Max normalizing technique, the Z-score method places all features on a comparable scale.
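A minimal NumPy sketch of the Z-score step of Eq. (2); the sample values are illustrative.

```python
import numpy as np

def z_score_normalize(column: np.ndarray) -> np.ndarray:
    """Standardize one feature column to zero mean and unit standard deviation (Eq. 2)."""
    mu, sigma = column.mean(), column.std()
    return (column - mu) / sigma

close = np.array([5510.0, 5523.4, 5531.8, 5540.2])
print(z_score_normalize(close))  # mean ~0, std ~1 after normalization
```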
4.3 FED-MLP–LSTM Architecture for Stock Market Prediction
The federated learning framework for stock market prediction using the CAC40 dataset involves a collaborative approach in which multiple nodes (e.g., financial institutions or trading platforms) participate in training a shared model without directly sharing their proprietary data, as represented in Fig. 2. This framework ensures data privacy and security while leveraging diverse data sources to enhance model accuracy and robustness. The central server aggregates the parameters using a technique called Federated Averaging. This process involves computing a weighted average of the parameters received from all entities. The aggregated model reflects the collective knowledge from all local datasets.
Federated learning helps mitigate the risk of overfitting to any single dataset. The aggregated model is exposed to various data patterns, making it more resilient to overfitting and improving its ability to generalize to new, unseen data. The federated learning framework is inherently scalable: as more entities join the collaborative effort, the model's predictive power continues to improve, making it adaptable to changing market dynamics. In the Federated Learning system, the various decentralized nodes work together to develop a common machine learning model without sharing raw data. This makes it safe for everyone involved, since data such as personal and financial details never move around the network but stay within the nodes. Each node trains a model on its local data, and only the model updates (i.e., learned parameter changes) are periodically sent to a central server. The server aggregates the updates to form a global model, which is circulated back to the nodes for further updates. This process continues iteratively until the model converges, preserving data privacy while leveraging the knowledge of all nodes concerned.
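A minimal sketch of the Federated Averaging step described above, assuming each client reports its PyTorch `state_dict` and its local sample count; the function name and signature are illustrative.

```python
import torch

def federated_average(client_states, client_sizes):
    """Weighted average of client model parameters (FedAvg).

    client_states: list of state_dicts from locally trained models.
    client_sizes:  number of local training samples per client, used as weights.
    """
    total = float(sum(client_sizes))
    global_state = {}
    for name in client_states[0]:
        # Sum each parameter tensor, weighted by the client's share of the data.
        global_state[name] = sum(
            state[name] * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state
```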
The FED-MLP–LSTM architecture extends this by integrating a Multi-Layer Perceptron–Long Short-Term Memory model into the Federated Learning framework for stock market prediction. Throughout this work, the MLP has proved to be a useful feed-forward neural network for identifying non-linear dependencies among input attributes. However, it lacks memory and is therefore less capable of capturing the temporal dependencies that are important in stock market data. This is where LSTM networks come into play: the LSTM is a type of RNN best suited to sequential data, and in particular its memory cells help recall stock prices through the time series.
In this architecture, the MLP component is responsible for processing static features (for instance, technical indicators or company fundamentals), while the LSTM network is fed the sequential part of the stock data (for instance, historical prices). The combined model captures both static patterns and temporal trends and can therefore provide better predictions of stock market changes than either component alone. The federated learning part makes it possible to include all nodes (which may be different market stakeholders) in training the model without sharing sensitive financial information, which increases prediction accuracy while maintaining data protection. This approach is most valuable in domains like the stock market, where security is highly sensitive and models rely on a variety of data.
4.3.1 MLP Architecture for Feature Extraction
The MLP is a principal type of ANN with a limited number of perceptron layers; at minimum, it comprises an input layer, a hidden layer, and an output layer. Its several layers and non-linear activation functions make it possible for MLPs to learn combinations of the incoming data. Most importantly, in stock market prediction, where the interdependencies among variables such as price series, trading volume, and economic factors may be non-linear, an MLP can model these non-linear patterns and dependencies and extract features that might not be apparent to analysts. MLPs are highly flexible and can be tailored to specific datasets and tasks by adjusting the number of layers, neurons per layer, and activation functions. This adaptability allows the MLP to be fine-tuned for optimal feature extraction based on the unique characteristics of stock market data. The MLP architecture frequently has additional hidden layers that can handle approximations of numerous difficult problems, including function fitting. A single MLP is a directed network made up of several node layers, each interconnected with the subsequent layer, converting an array of input variables into a collection of output variables; these links are typically referred to as synapses or linkages. Every node represents a neuron with a non-linear activation function applied to its input vertices. The MLP is frequently trained using a supervised learning technique, the back-propagation algorithm. The limitation of the perceptron, namely its inability to classify linearly inseparable data, is addressed by its extension, the MLP. The MLP layers are fully connected, meaning that each layer's neurons are coupled to every neuron of the adjacent layer, each computing a weighted sum of its inputs.
In particular, the input layer is represented by X and the initial hidden layer by \(K_1\); Eq. (4) is

$$K_{1} = f\left(V_{1}X + C_{1}\right) \quad (4)$$

where f is typically a non-linear activation function, \(V_1\) denotes the weight variables, and \(C_1\) denotes the bias variables. The outputs of the subsequent hidden layers are obtained by performing comparable operations, each layer possessing its own unique weight and bias settings. Y represents the output passed from the final hidden layer to the output layer, as depicted in Eq. (5):

$$Y = G\left(V_{L+1}K_{L} + C_{L+1}\right) \quad (5)$$

where L denotes the number of hidden layers, \(K_L\) denotes the final hidden layer, and G is the output activation function, often Softmax. In this way, the mapping between input and output is developed. A neuron's response will vary if its weight (v) or bias (c) is slightly altered, and these variations ultimately show up in the output of one or more synapses. To obtain the anticipated outcomes, the weights V and biases C are finally updated using the back-propagation (BP) algorithm and an optimization method. The MLP architecture for feature extraction utilizes multiple hidden layers to capture complex, non-linear relationships within the input stock market data. This pre-processing step transforms raw data into a more informative representation, enhancing the subsequent sequence learning by the LSTM model.
4.3.2 LSTM Architecture for Sequence Modeling
A specific type of RNN, the LSTM outperforms plain RNNs in numerous applications where long-term dependence is required. The vanishing gradient problem makes it challenging for RNNs to learn long-lasting dependency relationships. The LSTM is a useful technique that uses memory cells to counteract vanishing gradients. All RNNs contain a repeating chain of neural cells; unlike conventional RNNs, which have a single layer in the repeating cell, LSTMs have four layers that communicate with one another in unique ways.
The cell state \(C_t\) is essential to the LSTM. The cell state may be compared to a conveyor belt of data that moves along the chain, interacting only linearly so that the data are transferred unchanged. Gate functions control the LSTM's ability to add, remove, or alter cell state data. The gate system, which consists of a point-wise multiplication operation and a sigmoid neural layer, allows data to flow selectively. The LSTM cell has three different kinds of gates to safeguard and regulate the cell state.
Each gate is governed by an equation of the form \(\sigma\left(V_{i}X + c_{i}\right)\). The sigmoid layer outputs values in the range (0, 1), indicating how much of each element of \(D_{t-1}\) should pass: an output of zero means that nothing is permitted to pass, whereas an output of one means that everything is permitted to pass. The LSTM model's architecture is depicted in Fig. 3.
Selecting which data to remove from the cell state is the initial stage in the LSTM process. This choice is made by a non-linear layer called the "forget gate layer". Based on \(h_{t-1}\) and \(x_t\), it generates a vector of values between 0 and 1 for the cell state \(D_{t-1}\): a value of 1 indicates that all of the information in \(D_{t-1}\) should be kept, whereas a value of 0 indicates that all of it should be deleted. The four gates of the LSTM network are represented by Eqs. (6) and (7):

$$f_{t} = \sigma\left(L_{f}h_{t-1} + P_{f}x_{t} + c_{f}\right), \qquad g_{t} = \tanh\left(L_{g}h_{t-1} + P_{g}x_{t} + c_{g}\right) \quad (6)$$

$$i_{t} = \sigma\left(L_{i}h_{t-1} + P_{i}x_{t} + c_{i}\right), \qquad o_{t} = \sigma\left(L_{o}h_{t-1} + P_{o}x_{t} + c_{o}\right) \quad (7)$$

where \(L_f\), \(L_g\), \(L_i\), and \(L_o\) represent the weight matrices of the preceding short-term state \(h_{t-1}\); \(P_f\), \(P_g\), \(P_i\), and \(P_o\) represent the weight matrices of the present input state \(x_t\); and \(c_f\), \(c_g\), \(c_i\), and \(c_o\) are the bias terms. With \(p_{t-1}\) denoting the previous long-term state, the present long-term state of the network \(p_t\) can be calculated using Eq. (10):

$$p_{t} = f_{t} \otimes p_{t-1} + i_{t} \otimes g_{t} \quad (10)$$
The integration of MLP and LSTM in a unified pipeline leverages the strengths of both architectures to enhance stock market prediction. Initially, the MLP component is employed for feature extraction. It processes the raw input data through several hidden layers, capturing intricate non-linear relationships and transforming the input into a rich, informative feature set. This pre-processed information is then fed into the LSTM component, which is adept at handling sequential data. The LSTM captures the temporal dependencies and tendencies within the time-series data that are essential for accurate stock market predictions. By combining the MLP's capability to distill relevant features from raw data with the LSTM's power in modeling temporal sequences, the pipeline effectively enhances prediction performance. This unified method not only improves the accuracy of the predictions but also ensures that the model can generalize well across different market conditions. Additionally, this integration enables end-to-end learning, where feature extraction and sequence modeling are optimized jointly, leading to more robust and reliable predictions. This synergy between MLP and LSTM within a single pipeline exemplifies a powerful approach for tackling complex prediction tasks in stock market analysis.
Algorithm 1
Fed-MLP–LSTM mechanism.
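A minimal Python sketch of the Fed-MLP–LSTM training rounds, reusing the `MLPLSTM`, `criterion`, and `federated_average` components sketched earlier; `num_rounds` and `client_loaders` (one DataLoader per institution) are assumed placeholders, not names from the paper.

```python
global_model = MLPLSTM()

for round_idx in range(num_rounds):                        # repeated FL rounds
    client_states, client_sizes = [], []
    for loader in client_loaders:
        local = MLPLSTM()
        local.load_state_dict(global_model.state_dict())   # start from the global model
        opt = torch.optim.Adam(local.parameters(), lr=0.01)
        for x, y in loader:                                # one local training pass
            opt.zero_grad()
            loss = criterion(local(x), y)
            loss.backward()
            opt.step()
        client_states.append(local.state_dict())           # share parameters only
        client_sizes.append(len(loader.dataset))
    # The server aggregates the parameters and redistributes the global model.
    global_model.load_state_dict(federated_average(client_states, client_sizes))
```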

5 Results and Discussion
The integrated machine learning process for stock market forecasting in this study is built using a two-step modeling method and is implemented in Python. Z-score normalization is used to standardize the data after the pre-processing stage effectively handles missing values. The model's capacity to identify relevant information essential for predictions is improved by using the MLP approach for feature extraction. The next step selects characteristics using the LSTM, which reduces complexity while keeping important information. As a testament to their ability to identify intricate patterns in financial data, the last stage uses LSTM neural networks for estimation and prediction. To enhance forecasting accuracy in the highly volatile field of stock market forecasting, the process integrates the advantages of both LSTM and MLP modeling techniques. With careful consideration of the complex interactions between the MLP, the LSTM networks, and the optimization strategies used to navigate the difficulties of financial prediction, this all-encompassing methodology seeks to yield reliable and accurate stock market forecasts.
5.1 Experimental Outcome
The standard practice in finance studies is to assess trends and likely trading signals by comparing a Simple Moving Average (SMA-20) to actual price data. The SMA-20 is computed by averaging the prices from the previous 20 days, usually the open, closing, high, or low values. By charting the SMA-20 together with the actual price data for a random sample of an asset, such as a stock or currency trading pair, researchers can visually identify trends and likely reversals. A bullish trend may be implied by a crossover of the current price above the SMA-20 line, while a bearish trend may be signaled by a crossover below. Because this comparison gives insight into the stock's previous performance and likely future movements, investors as well as traders can make more educated choices. Figure 3 depicts the Simple Moving Average vs. actual prices for a random sample.
A useful tool for trend identification and trading strategies is the comparison of the Exponential Moving Average (EMA) to the actual closing price of a random sample of economic data from 2010 to 2020. The EMA is more sensitive to recent price movements than the simple moving average, because it gives greater weight to more recent data points. The EMA-20, for instance, more quickly reveals developments and probable reversal points when plotted against the actual closing prices. A bullish sentiment is suggested if the EMA-20 line is above the actual closing price, whereas a bearish sentiment is suggested by a crossover below.
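Both indicators are straightforward to compute with pandas; the following sketch uses a synthetic `Close` series as a stand-in for the CAC40 data, so all values and column names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic daily close prices standing in for the CAC40 'Close' column.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"Close": 5500 + rng.normal(0, 25, 252).cumsum()},
    index=pd.date_range("2019-01-01", periods=252, freq="B"),
)

df["SMA_20"] = df["Close"].rolling(window=20).mean()          # 20-day simple moving average
df["EMA_20"] = df["Close"].ewm(span=20, adjust=False).mean()  # 20-day exponential moving average

# A close above the moving average suggests a bullish bias; below, a bearish one.
df["bullish_sma"] = df["Close"] > df["SMA_20"]
```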
This visual evaluation helps investors and traders find probable entry and exit opportunities by assisting them in assessing the market's outlook and making decisions on the basis of the stock's previous price fluctuations over the chosen period. The Exponential Moving Average vs. actual closing price for a random sample is represented in Fig. 4. For stock price estimation, nowcasting, and forecasting, it is essential to analyze volatility trends over the previous year by looking at significant dates such as May 2019, July 2019, November 2019, January 2020, and March 2020. The stock market saw enormous swings, sometimes followed by swift price declines, during periods of high volatility, such as March 2020 when the COVID-19 pandemic struck.
On the other hand, months with comparatively lower volatility, such as May 2019, can point toward more consistent market circumstances. Economists can forecast upcoming price fluctuations with greater accuracy by monitoring previous volatility and its relationship with marketplace and economic conditions. Furthermore, by integrating these statistics into diverse qualitative and technical indicators, investors and traders can assess volatility and improve their decision-making when determining how to allocate capital or implement trading strategies. Figure 5 represents the historical volatility over the last 12 months. Consistency should be taken into consideration when evaluating these measures on the training and test datasets.
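Continuing with the `df` from the previous sketch, historical volatility can be estimated from rolling log returns; the 21-day window and the 252-day annualization factor are illustrative choices, not values from the paper.

```python
import numpy as np

# Daily log returns; 21-day rolling standard deviation annualized
# over ~252 trading days.
log_ret = np.log(df["Close"] / df["Close"].shift(1))
df["volatility_21d"] = log_ret.rolling(window=21).std() * np.sqrt(252)
```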
A crucial aspect of stock market value forecasting, nowcasting, and prediction is inspecting the relationship between the actual closing price and the 10-day Simple Moving Average (SMA). A dynamic indicator that reflects recent price patterns, the 10-day SMA provides insightful information about market mood and possible trade signals. The closing price offers clues about short-term bullish or bearish bias when it consistently trades above or below the SMA. Crossovers between the closing price and the SMA provide crucial signals of trend reversals. By keeping an eye on this relationship, analysts may assess short-term momentum, pinpoint probable levels of support and resistance, and integrate these results into their prediction systems in order to make better judgments for both immediate trading and longer-term investing decisions. Figure 6 depicts the actual closing price vs. the 10-day SMA.
5.2 Performance Evaluation
For assessment, the following evaluation criteria were used: recall, F1-score, precision, and accuracy. In this study, prediction accuracy is measured using several metrics, including accuracy and Root-Mean-Square Error. In statistics and data analysis, the RMSE is a commonly employed metric for assessing the precision of a prediction model, most often a regression model. It measures how closely a model's predicted values match the observed values. The RMSE is calculated as the square root of the mean of the squared discrepancies between the predicted and real values. It is a way of evaluating a model's goodness of fit, with lower RMSE values indicating better model effectiveness. Accuracy refers to the degree to which the model's predictions correspond to the real prices of stocks over a certain time period. It measures how closely the model's anticipated prices match the actual market data, and it is frequently expressed using metrics designed especially for regression evaluation, such as RMSE.
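In standard notation, with n observations, actual values \(y_i\), and predicted values \(\hat{y}_i\), the RMSE described above is

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - \hat{y}_{i}\right)^{2}}$$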
In evaluating the training and testing losses for Network 1, Network 2, and the Federated Learning framework, significant insights into the models' performance and generalization capabilities are revealed in Fig. 7. Network 1, trained on a centralized dataset, demonstrates its ability to minimize the training loss effectively, indicating its capacity to learn from the supplied data. However, when evaluated on unseen testing data, Network 1 may exhibit a higher testing loss, suggesting potential overfitting or limited generalization to new data. Conversely, Network 2, also trained on centralized data but using a different architecture or hyper-parameters, may present varying training and testing losses, depending on its capability to capture the underlying patterns within the dataset. The Federated Learning approach, in contrast, involving collaboration among multiple institutions and model parameter sharing, aims to mitigate overfitting and enhance generalization by aggregating knowledge from diverse data sources. While the training loss may initially be higher due to the decentralized training process, the Federated Learning framework strives to achieve a similar or even lower testing loss by leveraging collective intelligence and capturing a broader variety of data patterns. By comparing the training and testing losses across Network 1, Network 2, and the Federated Learning framework, valuable insights into their respective strengths and limitations in stock market prediction can be gleaned, guiding further optimization efforts and model selection for real-world applications.
The measures most commonly used to gauge the efficiency of the Federated Learning Enhanced MLP–LSTM Model are the training and validation losses, depicted in Fig. 8. During the training phase, the model learns and adapts by adjusting its parameters so as to reduce the training loss, which measures the deviation of the model's predictions from the true stock prices. This allows the model to capture the details and temporal features of the dataset, mirroring the stock market closely. At the same time, to monitor the model's generalization ability and guard against overfitting during training, the validation loss is calculated by evaluating the model on a validation set not used in the training process. This helps in assessing the model's performance on fresh data not previously seen during training, as well as in curbing overfitting. The training and validation losses also help adjust the model's hyper-parameters when fine-tuning is needed to improve performance. A validation loss that remains low over the epochs, together with a shrinking training loss, indicates that the model learns from the provided data while retaining its ability to deliver accurate predictions on out-of-sample data, as needed for robust and reliable stock market prediction.
In Fig. 9, interpretation of the three ROC curves, for Network 1, Network 2, and the Federated Learning framework, respectively, helps in understanding the discriminative performance and predictive capability of the compared models for stock market prediction. The ROC curve for Network 1 shows its capability to distinguish between correct and incorrect decisions at different decision thresholds; a greater AUC implies better recognition, evidencing Network 1's ability to classify stock market prognoses effectively. Likewise, the Network 2 ROC curve provides a comparative view of how well it discriminates and how accurately it predicts. Comparing the shapes of these curves makes it possible to quantify the models' proficiency in avoiding false positives while making correct predictions. It can thus be concluded that each model, whether Network 1, Network 2, or the Federated Learning approach, has high potential as a reliable and robust method for stock market prediction, with higher AUC values indicating better forecasting capabilities. Additionally, comparing the AUC values across the models provides valuable insights into their relative performance, guiding model selection and optimization efforts for real-world applications in financial markets.
The prediction model is probably overfitting if it performs well on training data but badly on test data. On the other hand, the model might be underfitting if both the training and test measures are subpar. The objective is to find an appropriate balance that demonstrates the model's capacity to generalize successfully from training to unseen data and provide trustworthy recommendations for stock market price estimation and projection. To further improve the review process, it is advisable to take other indicators and domain-specific expertise into account. Figure 10 shows the comparison of evaluation metrics for the training and test sets.
Table 2 describes the performance metrics of the stock market forecasting assessment. With a relatively low RMSE of 0.0108, the suggested Fed-MLP–LSTM system exhibits excellent accuracy metrics, demonstrating its ability to make precise stock market forecasts. The system also has a high accuracy rate of 98.3%, which supports the claim that it successfully predicts stock market movements. Such reliable evaluations highlight the system's capacity for exact forecasting, making it a viable tool for market analysts and shareholders.
Table 2 also compares three alternative approaches based on their outcomes in an investigation of several predictive algorithms for stock price prediction. Figure 11 illustrates the integrated MLP and LSTM modeling process, providing a comparison between the Fed-MLP–LSTM model and existing methods.
The training delays of three ML models are displayed in Table 3. Based on the results of the current study, LR is the quickest machine learning model, whereas SVM is the slowest owing to its model convergence delay. Furthermore, compared to a centralized learning framework, FL and decentralized learning shorten the ML training latency, because both frameworks reduce training time through parallel data processing. Comparing FL with SVM to centralized and distributed learning, however, results in a longer ML training period; this stems from the SVM's parameter distribution, iteration frequency, and model convergence delay. Our proposed MLP–LSTM outperforms the above two methods with a considerably lower federated learning time delay of 8.54 s.
Table 3 describes the performance of various methods in terms of RMSE and accuracy for stock market prediction tasks. ARFIMA-LSTM achieves an RMSE of 0.0539 with an accuracy of 85%. GA-BPNN demonstrates superior performance with an RMSE of 0.016 and an accuracy of 97.24%. KTFCM and Fuzzy k-means show respective accuracies of 72% and 67%, but their RMSE values are not provided.
Figure 11 represents the comparison metrics. The proposed Fed-MLP–LSTM outperforms all other methods with the lowest RMSE of 0.0108 and the highest accuracy of 98.3%. This indicates that the proposed approach effectively combines the strengths of MLP and LSTM for more accurate stock market prediction.
5.3 Discussion
Using cutting-edge methods including MLP and LSTM modeling, this work offers an integrated machine learning pipeline for stock market prediction. A limitation of the suggested approach is the potential intricacy involved in tuning the Fed-MLP–LSTM model's hyper-parameters, which could require substantial computational resources and proficiency. Furthermore, the inherent complexity of LSTM networks and Federated Learning may restrict the interpretability of the model's output. All things considered, this study adds to the expanding body of research on machine learning for stock market prediction and emphasizes the importance of taking both quantitative and qualitative elements into consideration when modeling financial markets. Because of the complex and dynamic nature of stock market movements, it may not be possible for the suggested approach to depict them accurately using mathematical modeling alone; relevant elements include investor mood, geopolitical events, economic data, and market psychology.
5.4 Potential Limitations of the Proposed Study
FL addresses data privacy by keeping data decentralized at the nodes that form the network participating in the learning process; however, this can come at the cost of model quality. In FL, the update process transfers some information to some nodes while excluding others, which can lead to unbalanced, or even missing, information at some nodes. In terms of computation, MLPs consume a moderate amount of computational resources, while LSTMs, because of their sequential structure and internal memory mechanisms, need considerably more computational power. This may result in some slowness, especially in the training phase, as the model analyzes large volumes of data such as historical stock prices.
In this architecture, when utilized in the federated learning setting, each node must have the capability required to train both the MLP and LSTM models, which may be a limitation for some participants. Some nodes may be less computation-capable, which might mean that their updates are marginal, hampering the overall training exercise and resulting in lower-quality global models. Stock market forecasting usually involves real-time or high-frequency predictions. However, the proposed model, which uses the MLP for the static dataset and the LSTM for the sequential dataset, may face real-time issues due to its computational complexity. Real-time predictions with high accuracy are difficult to obtain, especially where the model is complex. The LSTM in particular trades off the speed of real-time prediction against the thorough training the model has to undergo to identify intricate patterns.
6 Conclusions and Future Work
In this paper, we discussed an efficient solution for improving the effectiveness of stock market prediction by combining MLP and LSTM modalities within a Federated Learning system, using the CAC40 dataset. In constructing this framework, we incorporate the efficiency of Multi-Layer Perceptrons for feature extraction and of Long Short-Term Memory networks for modeling temporal data. Through the application of federated learning, multiple financial institutions can jointly train a reliable model, contributing their local developments without compromising the privacy and security of their data. The federated learning approach developed here encourages a collective model whereby institutions work together but transfer parameters rather than data. The steps involved in this process include local data preparation and model training, where the data are split and the models are trained locally, after which the collected model parameters are transmitted to a central server and aggregated using the Federated Averaging technique. Through such accumulation across data sources, the global model embeds the distilled knowledge from various inputs, improving the precision and applicability of the resulting model. By alternating between rounds of training and parameter sharing, this method improves the model after every round. The results revealed that our proposed MLP–LSTM hybrid model has the potential to capture the non-linear future trends inherent in the stock market while successfully capturing temporal dependencies. Data normalization and imputation condition the input data, reducing interference and aberrations and resulting in precise predictions. Federated Learning can reduce the amount of data that is shared and leaked: each worker node individually builds a local model using a province's stock market data, and to develop the global model without exchanging stock data, the worker nodes only exchange the model parameters, or gradient losses. Applications utilizing predictive analysis would benefit from this, since market data owners are often reluctant to share raw data. True parallelism in the evaluation and analysis of Federated Learning's performance on multicomputer systems is still required. This study builds the FL framework using a hyper-threading method; however, because of the limitations of the computing platforms, it cannot execute all the threads at once. Instead of using threads to train the FL model, a distributed computing environment could be established using a Service-Oriented Architecture (SOA).
Compared with a centralized learning model, the federated learning technique reduces the risk of overfitting and increases the model's generality across different markets, which is closest to the real environment. In addition, the federated framework allows the model's predictive accuracy to improve over time as more institutions join the collaboration. In this paper, we have proposed an effective and privacy-preserving technique for stock market prediction. The proposed federated learning mechanism combining MLP and LSTM offers a flexible and accurate solution that improves predictive performance while respecting data privacy constraints and the diversity of financial markets. Beyond advancing the field of financial forecasting, this approach outlines a promising path for integrating machine learning into finance in a way that is cooperative, secure, and scalable. Future work could explore further optimizations of the federated learning process and extend the framework to additional datasets and financial indicators.
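The normalization and imputation stage referred to in the conclusions can likewise be illustrated with a short sketch. It assumes pandas and a series of daily closing prices; the column handling and sample values are illustrative assumptions, not the paper's pipeline code.

```python
# A minimal sketch of per-institution preprocessing: linear interpolation
# for missing values followed by Z-score normalization. Sample data is
# synthetic and for illustration only.
import numpy as np
import pandas as pd

def preprocess(prices: pd.Series) -> pd.Series:
    # Fill gaps (e.g., missing trading days) by linear interpolation.
    filled = prices.interpolate(method="linear")
    # Z-score normalization: zero mean, unit variance.
    return (filled - filled.mean()) / filled.std()

raw = pd.Series([100.0, np.nan, 104.0, 103.0, 107.0])  # one missing value
print(preprocess(raw))
```

Performing this step locally keeps each institution's normalization statistics on its own node, consistent with the privacy constraints of the federated setting.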
Data Availability
No datasets were generated or analyzed during the current study.
References
Yan, Y., Yang, G., Gao, Y., Zang, C., Chen, J., Wang, Q.: Multi-participant vertical federated learning based time series prediction. In: Proceedings of the 8th International Conference on Computing and Artificial Intelligence (ICCAI '22), pp. 165–171. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3532213.3532238
Pang, X., Zhou, Y., Wang, P., Lin, W., Chang, V.: An innovative neural network approach for stock market prediction. J. Supercomput. 76(3), 2098–2118 (2020). https://doi.org/10.1007/s11227-017-2228-y
Shen, J., Shafiq, M.O.: Short-term stock market price trend prediction using a comprehensive deep learning system. J. Big Data 7, 1–33 (2020)
Pourroostaei Ardakani, S., Du, N., Lin, C., Yang, J.-C., Bi, Z., Chen, L.: A federated learning-enabled predictive analysis to forecast stock market trends. J. Ambient. Intell. Human Comput. 14(4), 4529–4535 (2023). https://doi.org/10.1007/s12652-023-04570-4
Shaheen, M., Farooq, M.S., Umer, T.: Reduction in data imbalance for client-side training in federated learning for the prediction of stock market prices. J. Sens. Actuat. Netw. (2024). https://doi.org/10.3390/jsan13010001
Ahmed, U., Srivastava, G., Lin, J.C.-W.: Reliable customer analysis using federated learning and exploring deep-attention edge intelligence. Futur. Gener. Comput. Syst. 127, 70–79 (2022)
Patel, N.: F-LSTM: federated learning-based LSTM framework for cryptocurrency price prediction. Electron. Res. Arch. 31(10), 6525–6551 (2023)
Thakkar, A., Chaudhari, K.: Fusion in stock market prediction: A decade survey on the necessity, recent developments, and potential future directions. Inf. Fusion 65, 95–107 (2021). https://doi.org/10.1016/j.inffus.2020.08.019
Sakhare, N.N., Shaik, I.S.: Spatial federated learning approach for the sentiment analysis of stock news stored on blockchain. Spat. Inf. Res. 32(1), 13–27 (2024). https://doi.org/10.1007/s41324-023-00529-x
Kumar, M.R., Ramkumar, S., Saravanan, S., Balakrishnan, R., Swathi, M.: Stock market prediction via Twitter sentiment analysis using BERT: a federated learning approach. In: Handbook on Federated Learning, pp. 333–353. CRC Press
Kumbure, M.M., Lohrmann, C., Luukka, P., Porras, J.: Machine learning techniques and data for stock market forecasting: a literature review. Expert Syst. Appl. 197, 116659 (2022). https://doi.org/10.1016/j.eswa.2022.116659
Carta, S., Ferreira, A., Podda, A.S., Reforgiato Recupero, D., Sanna, A.: Multi-DQN: an ensemble of deep Q-learning agents for stock market forecasting. Expert Syst. Appl. 164, 113820 (2021). https://doi.org/10.1016/j.eswa.2020.113820
Kumar, G., Jain, S., Singh, U.P.: Stock market forecasting using computational intelligence: a survey. Arch. Computat. Methods Eng. 28(3), 1069–1101 (2021). https://doi.org/10.1007/s11831-020-09413-5
Mukherjee, S., Sadhukhan, B., Sarkar, N., Roy, D., De, S.: Stock market prediction using deep learning algorithms. CAAI Trans. Intell. Technol. 8(1), 82–94 (2023). https://doi.org/10.1049/cit2.12059
Nabipour, M., Nayyeri, P., Jabani, H., Mosavi, A., Salwana, E., Shahab, S.: Deep learning for stock market prediction. Entropy (2020). https://doi.org/10.3390/e22080840
CAC40 Stocks Dataset. https://www.kaggle.com/datasets/bryanb/cac40-stocks-dataset. Accessed 03 Jun 2024
Nti, K.O., Adekoya, A., Weyori, B.: Random forest based feature selection of macroeconomic variables for stock market prediction. SSRN 3446053, Rochester, NY (2019). https://doi.org/10.2139/ssrn.3446053
Zhang, X., Hu, Y., Xie, K., Wang, S., Ngai, E.W.T., Liu, M.: A causal feature selection algorithm for stock prediction modeling. Neurocomputing 142, 48–59 (2014). https://doi.org/10.1016/j.neucom.2014.01.057
Dong, S., Wang, J., Luo, H., Wang, H., Wu, F.-X.: A dynamic predictor selection algorithm for predicting stock market movement. Expert Syst. Appl. 186, 115836 (2021)
Miranda-Belmonte, H.U., Muñiz-Sánchez, V., Corona, F.: Word embeddings for topic modeling: an application to the estimation of the economic policy uncertainty index. Expert Syst. Appl. 211, 118499 (2023)
Al Ali, A.I., Khedr, A.M.: Enhancing financial distress prediction through integrated Chinese whisper clustering and federated learning. J. Open Innov. Technol. Market Complex. 10(3), 100344 (2024)
Funding
This research was supported by Swinburne University of Technology.
Author information
Contributions
All authors have equal contributions in this work. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to Participate
All the authors involved have agreed to participate in this submitted article.
Consent to Publish
All the authors involved in this manuscript give full consent for publication of this submitted article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kumarappan, J., Rajasekar, E., Vairavasundaram, S. et al. Federated Learning Enhanced MLP–LSTM Modeling in an Integrated Deep Learning Pipeline for Stock Market Prediction. Int J Comput Intell Syst 17, 267 (2024). https://doi.org/10.1007/s44196-024-00680-9