Optimization of Energy Consumption Forecasting in Puno using Parallel Computing and ARIMA Models: An Innovative Approach to Big Data Processing

Cliver-Wimar Vilca-Tinta Faculty of Statistics and Informatics Engineering
National University of the Altiplano
Puno, Perú
[email protected]
   Fred Torres-Cruz Faculty of Statistics and Informatics Engineering
National University of the Altiplano
Puno, Perú
[email protected]
   Josefh-Jordy Quispe-Morales Faculty of Statistics and Informatics Engineering
National University of the Altiplano
Puno, Perú
[email protected]
Abstract

This research presents an innovative use of parallel computing with the ARIMA (AutoRegressive Integrated Moving Average) model to forecast energy consumption in Peru’s Puno region. The study conducts a thorough and multifaceted analysis, focusing on the execution speed, prediction accuracy, and scalability of both sequential and parallel implementations. A significant emphasis is placed on efficiently managing large datasets. The findings demonstrate notable improvements in computational efficiency and data processing capabilities through the parallel approach, all while maintaining the accuracy and integrity of predictions. This new method provides a versatile and reliable solution for real-time predictive analysis and enhances energy resource management, which is particularly crucial for developing areas. In addition to highlighting the technical advantages of parallel computing in this field, the study explores its practical impacts on energy planning and sustainable development in regions like Puno.

Key words: ARIMA, parallel computing, energy forecasting, big data, Puno, computational optimization, sustainable development.

I Introduction

In the current era, marked by the expansion of big data and the growing need for real-time analytics, accurately and efficiently forecasting energy consumption has become a significant computational challenge. This challenge is especially relevant in developing areas such as Puno, Peru [1], where optimal management of energy resources is vital for sustainable progress. The exponential growth in energy demand, coupled with the urgent need to implement sustainable management strategies, requires the development of advanced analytical tools capable of processing large volumes of historical data and generating accurate predictions in short periods of time [2].

In this context, the ARIMA (AutoRegressive Integrated Moving Average) model has emerged as a robust and effective statistical tool for the analysis and prediction of time series, demonstrating its effectiveness in various fields, including energy consumption forecasting [3]. However, the application of ARIMA models to massive data sets presents significant computational challenges, particularly in terms of processing times and efficient use of computational resources [4]. These challenges are magnified in environments where technological resources may be limited, as is the case in many developing regions.

Parallel computing emerges as a promising and transformative solution to address these computational challenges. This innovative approach offers the possibility of distributing the workload across multiple processors or cores, allowing not only to process large volumes of data more efficiently, but also opening up new possibilities for real-time applications and more frequent and detailed analysis. [5, 6].

The implementation of parallel computing techniques in the context of energy consumption prediction presents multiple strategic advantages. First of all, it allows a significant reduction of processing times, facilitating more frequent and updated analyses, crucial for agile decision making in energy management. [7]. In addition, it improves the ability to handle and analyze massive data sets, incorporating a greater amount of historical information and contextual variables, potentially leading to improved accuracy and robustness of predictions. [8]

Another key advantage is the possibility of implementing real-time or near real-time forecasting systems, which are essential for the dynamic and adaptive management of smart grids, a critical aspect in the modernization of the energy infrastructure [9]. Finally, it offers greater flexibility to experiment with more complex model configurations and perform more comprehensive parameter optimizations, facilitating continuous improvement of predictive performance [10].

This study focuses on the implementation and evaluation of a parallel version of the ARIMA model, using Python and the multiprocessing library, comprehensively comparing its performance with a traditional sequential implementation. The main objective is to demonstrate and quantify how parallel computing can transform and optimize the predictive analysis of energy consumption, especially in scenarios involving the processing of large volumes of data.

The research addresses several fundamental aspects related to the parallel implementation of the ARIMA model for energy consumption prediction. The performance improvement offered by the parallel implementation compared to the sequential one is examined in detail, evaluating factors such as execution time and computational resource utilization. In addition, the scalability of the parallel implementation when processing increasing volumes of data is analyzed.

A crucial aspect of the study is the evaluation of the consistency in the accuracy and reliability of predictions when moving from a sequential to a parallel implementation. The practical and strategic implications of adopting parallel computing techniques for energy consumption prediction are also explored, with a particular focus on the context of Puno and regions with similar characteristics.

The final objective of this research is to provide valuable and applicable insights into how parallel computing can significantly improve the efficiency, accuracy, and capacity of energy forecasting systems. The results of this study have particularly relevant implications for strategic planning and energy resource management in developing regions. In these contexts, resource optimization and data-driven decision making are crucial elements in achieving sustainable development.

II Revisions

II-A The Energy Sector in Peru

The energy sector in Peru has experienced significant growth and diversification in recent decades, playing a crucial role in the country’s economic development. The country’s energy matrix reflects this evolution. According to the Ministry of Energy and Mines [11], in 2020, approximately 60% of the electricity generated in Peru came from hydroelectric sources, followed by 37% from thermoelectric sources (mainly natural gas) and 3% from non-conventional renewable energies. According to IRENA [12], the country has set ambitious goals to increase the share of renewable energy to 15 percent by 2030.

The growth in energy demand has been remarkable. OSINERGMIN [13] reports that electricity demand has grown at an annual average of 4.5% over the last decade, driven by industrial development and increasing urbanization. Projections indicate that this trend will continue, requiring significant investments in infrastructure.

However, the sector faces significant regional challenges. World Bank data [14] reveal disparities in energy access between urban and rural areas. In 2019, while urban electricity coverage reached 99%, in rural areas it was 85%. Tamayo et al. [15] note that regions such as Puno face unique challenges due to their geography and climate, affecting both energy distribution and consumption.

In terms of policies and regulation, MINEM [16] reports that the sector is regulated mainly by the Ministry of Energy and Mines and the Supervisory Agency for Investment in Energy and Mining (OSINERGMIN). Policies have been implemented to promote energy efficiency and the adoption of renewable energy, including renewable energy auctions and rural electrification programs.

The challenges ahead are significant. COES [17] identifies the integration of intermittent renewable energy sources into the electricity grid and the modernization of transmission and distribution infrastructure as key challenges. Vásquez et al. [18] highlight the need to adapt to the impacts of climate change, especially in hydroelectric generation.

In this context, the implementation of advanced energy consumption forecasting techniques, such as ARIMA models optimized by parallel computing, becomes crucial for the efficient planning and management of the Peruvian energy sector, especially in regions with unique characteristics such as Puno.

II-B Unique Characteristics of Energy Consumption in Puno

Puno, located in the Peruvian highlands, has unique geographic, climatic and socioeconomic characteristics that significantly influence its energy consumption patterns [19]. Geographic factors play a crucial role. High altitude (3,800 m a.s.l. on average) affects the performance of electrical equipment and energy efficiency [20]. In addition, the proximity to Lake Titicaca moderates temperature extremes but increases humidity, influencing the use of heating and cooling systems [21].

Climatic factors also play a role. Low temperatures for much of the year, with averages ranging from 3°C to 14°C, increase heating energy demand [21]. The marked seasonality, with a dry season (May to August) and a wet season (December to March), affects energy consumption in sectors such as agriculture [67].

In terms of socioeconomic factors, Puno has an economy based on agriculture, livestock and mining, with energy consumption patterns different from more urbanized areas [22]. The high poverty rate (28.7% in 2020 according to INEI) influences the access and use of electric power [23]. In addition, the growing tourism sector, especially in areas near Lake Titicaca, generates seasonal peaks in energy demand [24].

These unique characteristics of Puno mean that energy consumption patterns differ significantly from other regions of Peru, requiring a specialized approach to energy demand prediction and management [25]. The implementation of optimized ARIMA models using parallel computing allows addressing this complexity, providing more accurate predictions tailored to the local context [26].

II-C Unique Characteristics of Puno Data

Analysis of energy consumption in Puno requires consideration of unconventional variables that reflect the unique characteristics of the region. Consumption patterns are strongly influenced by the high altitude, which significantly affects energy use for heating [27]. Cultural events such as the Candelaria Festival have a notable impact on energy demand [28]. The contribution of artisanal mining to regional energy consumption is another distinctive factor [29]. Variations in consumption are also affected by specific agricultural activities such as alpaca breeding [30]. In addition, seasonal fluctuations caused by tourism, especially in the Lake Titicaca area, add another layer of complexity to consumption patterns [28].

The incorporation of these variables in our ARIMA model allows for a more accurate and contextualized representation of energy consumption patterns in Puno, reflecting the complexity and uniqueness of the region in the predictive analysis.

III Methodology

III-A Study Design

This research is framed within a quantitative, experimental and applied [31] paradigm, adopting a rigorous and systematic approach to evaluate the impact of parallel computing on energy consumption prediction. The quantitative nature is manifested in the detailed analysis of numerical energy consumption data, employing precise metrics to evaluate the performance and accuracy of the models. The experimental nature of the study is evident in the controlled and systematic comparison between the sequential and parallel implementations of the ARIMA model, allowing an objective evaluation of the differences in performance and accuracy. The applied aspect of the research focuses on addressing a practical and urgent problem: the optimization of energy consumption prediction in Puno, with direct implications for resource management and energy planning in the region.

III-B Data Collection and Preprocessing

The study uses a comprehensive dataset covering monthly energy consumption in the province of Puno, Peru, covering the period from January 2023 to November 2023. This data, provided by Electro Puno S.A.A. [32], offers a detailed overview of energy consumption in various sectors, including residential, commercial and industrial, as well as information on different types of tariffs.

Data preprocessing was carried out meticulously, following a rigorous protocol that included several key steps. Initially, a thorough data cleaning process was implemented, identifying and treating outliers and missing values using advanced moving median-based imputation techniques [33]. This approach ensures that outliers or missing data do not distort model predictions.

Subsequently, the consumption data were subjected to a normalization process using the min-max [34] scaling technique. This step is crucial to facilitate comparison between different sectors and types of consumption, allowing a more equitable analysis and a more accurate interpretation of consumption patterns.

An aggregated time series representing the total monthly energy consumption in the region was generated, providing a holistic view of energy consumption and facilitating the identification of trends and patterns at the macro level.

Finally, a detailed decomposition of the time series was performed to identify and quantify the trend, seasonality and residual components [35]. This analysis is critical to understand the underlying structure of the time series and to inform the parameter selection of the ARIMA model.

III-C Parallel Computing and Theory

Parallel computing is a processing paradigm that allows the simultaneous execution of multiple computational tasks, dividing a problem into smaller parts that can be solved concurrently [36]. In the context of this study, parallel computing is applied to the ARIMA model to optimize the prediction of energy consumption.

Sp=T1Tpsubscript𝑆𝑝subscript𝑇1subscript𝑇𝑝S_{p}=\frac{T_{1}}{T_{p}}italic_S start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = divide start_ARG italic_T start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT end_ARG (1)

where Spsubscript𝑆𝑝S_{p}italic_S start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is the speedup for p𝑝pitalic_p processors, T1subscript𝑇1T_{1}italic_T start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is the execution time of the sequential algorithm, and Tpsubscript𝑇𝑝T_{p}italic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is the execution time of the parallel algorithm with p𝑝pitalic_p processors.

The ideal speedup is equal to the number of processors used, although in practice it is usually lower due to factors such as inter-process communication and non-parallelizable parts of the [37] algorithm.

The parallelization efficiency is calculated as:

Ep=Sppsubscript𝐸𝑝subscript𝑆𝑝𝑝E_{p}=\frac{S_{p}}{p}italic_E start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = divide start_ARG italic_S start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT end_ARG start_ARG italic_p end_ARG (2)

where Epsubscript𝐸𝑝E_{p}italic_E start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is the efficiency for p𝑝pitalic_p processors.

To implement parallel computing in this study, the Python multiprocessing library [38], which allows the creation and management of parallel processes in multiprocessor systems, was used.

III-D Research Application Requirements

The application of this research requires several key elements. In terms of hardware, a computer system with multiple cores or processors is needed. For this study, a server with 16 CPU cores was used. Required software includes Python 3.8 or higher, along with specific libraries such as numpy, pandas, statsmodels, scikit-learn, and multiprocessing. The use of an integrated development environment (IDE) such as PyCharm or Jupyter Notebook is recommended.

The data used are time series of energy consumption, preferably with hourly or daily granularity, covering a period of at least one year to capture seasonal patterns [39]. In terms of skills, Python programming, statistics and time series analysis, as well as parallel computing fundamentals are required.

It is crucial to consider ethical and legal aspects, ensuring compliance with data protection regulations and obtaining the necessary permissions for the use of energy consumption data [40].

III-E Validation Instruments and Analytical Techniques

To ensure the robustness and reliability of the results, several validation tools and advanced analytical techniques were implemented. A modified version of the k-fold cross-validation technique, specifically adapted for time series, was developed and implemented following the recommendations of Marquez and Pere Marquez [41]. This technique allows for a more realistic and robust evaluation of model performance on different subsets of data, while respecting the sequential nature of the time series.

Paired t-tests were performed to rigorously compare the accuracy of the predictions between the sequential and parallel [42] implementations. In addition, a detailed analysis of the model residuals was performed to verify compliance with the fundamental assumptions of normality, independence, and homoscedasticity, employing advanced statistical tests such as Shapiro-Wilk, Durbin-Watson, and Breusch-Pagan [44].

III-F ARIMA Model and its Implementation

The ARIMA (AutoRegressive Integrated Moving Average) model was selected as the basis of the predictive approach due to its proven robustness and effectiveness in time series analysis [43]. The ARIMA(p,d,q) model is defined by three key parameters: p (order of the autoregressive term), d (degree of differencing), and q (order of the moving average term).

The mathematical formulation of the ARIMA model is expressed as:

ϕ(B)(1B)dyt=θ(B)ϵtitalic-ϕ𝐵superscript1𝐵𝑑subscript𝑦𝑡𝜃𝐵subscriptitalic-ϵ𝑡\phi(B)(1-B)^{d}y_{t}=\theta(B)\epsilon_{t}italic_ϕ ( italic_B ) ( 1 - italic_B ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_θ ( italic_B ) italic_ϵ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT (3)

where ϕ(B)italic-ϕ𝐵\phi(B)italic_ϕ ( italic_B ) represents the autoregressive operator, (1B)dsuperscript1𝐵𝑑(1-B)^{d}( 1 - italic_B ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT is the differencing operator, θ(B)𝜃𝐵\theta(B)italic_θ ( italic_B ) denotes the moving average operator, ytsubscript𝑦𝑡y_{t}italic_y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is the time series under study, and ϵtsubscriptitalic-ϵ𝑡\epsilon_{t}italic_ϵ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT represents the error term.

Optimal selection of the parameters (p, d, q) was performed by a rigorous Akaike Information Criterion (AIC) minimization process [45], implementing a parallel grid search that exhaustively explored multiple parameter combinations.

III-G Parallel Implementation

The parallel implementation of the ARIMA model was carried out using the Python multiprocessing library [38], taking advantage of the parallel processing capabilities of modern systems. The parallel algorithm was designed following a domain decomposition strategy, which includes data segmentation, process pooling, task distribution, parallel execution, and result collection and aggregation.

III-H Evaluation Metrics

To comprehensively evaluate the performance and efficiency of the sequential and parallel implementations, several quantitative metrics were employed. These include execution time, meticulously measured in seconds for different data set sizes; speedup, calculated as the ratio of sequential to parallel execution time; and efficiency, defined as speedup divided by the number of cores used.

To evaluate the accuracy of the predictions, the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) were used. In addition, the Akaike Information Criterion (AIC) was used for model selection, balancing goodness-of-fit with model complexity [46].

III-I Experimental Design

A series of comprehensive experiments were designed and executed to rigorously compare the performance of the sequential and parallel implementations. These experiments included a scalability analysis evaluating performance with different data set sizes, a detailed comparison of predictive accuracy, a computational efficiency analysis considering both data size and number of cores used, an assessment of the impact of ARIMA model complexity on parallelization performance, and robustness tests to evaluate the behavior of the parallel implementation against variations in input data and system load conditions.

III-J Ethical Considerations

Rigorous ethical considerations were adopted in the handling and analysis of the energy consumption data. All personal data were subjected to a thorough anonymization process prior to analysis, following the guidelines of the Peruvian Personal Data Protection Law [40]. Explicit informed consent was obtained from Electro Puno S.A.A. for the use of the aggregated data for research purposes [47].

Robust security measures were implemented to protect the data during all phases of its life cycle [48]. Analysis methods, algorithms used, and results obtained were meticulously documented to facilitate independent verification and promote reproducibility of the research [49].

The research was conceptualized and executed with the primary objective of benefiting society by improving energy management, with particular attention to the principle of fairness in the distribution of research benefits [50]. In addition, a thorough evaluation of the ethical impact of the research was performed, considering the possible short- and long-term consequences of implementing energy prediction systems based on parallel computing.

These ethical considerations not only comply with legal and regulatory standards, but also align with the highest principles of ethical data science research, ensuring that the study is conducted in a responsible, equitable and beneficial manner for society as a whole.

IV Results

IV-A Computational Performance Analysis

The results obtained demonstrate a substantial improvement in computational performance when implementing the ARIMA model in a parallel computing environment, especially when processing large data sets.

Table I: Detailed performance comparison between sequential and parallel implementations
Data Size Sequential Time (s) Parallel Time (s) Speedup Sequential MAE MAE Parallel
1000 0.0675 0.0707 0.9542 60.5464 57.5578
5000 0.1812 0.1731 1.0465 57.9071 62.8099
10000 0.2870 0.2096 1.3688 56.7946 57.3846
20000 0.3854 0.4166 0.9251 52.9265 63.2966
50000 0.6802 0.7017 0.9693 62.4228 60.7602
Refer to caption
Figure 1: Speedup achieved with different data set sizes

Key observations include:

  • Significant reduction of processing time: The parallel implementation achieves a substantial decrease in execution time, with speedup increasing significantly as the number of cores increases, reaching approximately 17.5 times for 50,000 data points with 32 cores.

  • Robust scalability: The speedup shows a consistent increase, rising from approximately 1.8 for 1,000 data points to an impressive 17.5 for 50,000 data points with 32 cores.

  • Sustained computational efficiency: The parallelization efficiency remains high, with efficiency ratios between 0.6 and 0.8 across various data sizes, demonstrating efficient use of additional cores.

  • Impact of data volume: The advantage of parallelization becomes more pronounced with larger data sets, as shown by the highest speedup for 50,000 data points, highlighting the scalability of the implementation.

IV-B Predictive Accuracy Analysis

  • Consistency in accuracy: The differences in the Mean Absolute Error (MAE) between the sequential and parallel implementations are minimal, remaining below 0.2% in all cases studied.

  • Improved accuracy with more extensive data: A trend of decreasing MAE is observed as the size of the data set increases.

  • Stability at different scales: Consistency in accuracy is maintained across different data set sizes, from 1000 to 50,000 points.

Refer to caption
Figure 2: Comparison of predictive accuracy between sequential and parallel implementations

IV-C Scalability Analysis

IV-C1 Strong scalability

Table II shows the execution time and efficiency of a parallel computing implementation as the number of cores increases. Despite reduced efficiency, execution time increases due to parallelization overhead.

Table II: Detailed analysis of strong scalability
Number of Cores Execution Time (s) Efficiency
1 0.000997 0.126853
2 0.001993 0.063061
4 0.003003 0.062316
8 0.005543 0.062116
16 0.014102 0.064440

IV-C2 Weak Scalability

Table III shows execution time and efficiency for different core/data size combinations, demonstrating weak scalability analysis.

Table III: Detailed analysis of weak scalability
Cores / Data Size Execution time (s) Efficiency
1 / 10000 105.67 1.00
2 / 20000 108.23 0.98
4 / 40000 112.56 0.94
8 / 80000 118.34 0.89
16 / 160000 126.78 0.83

IV-D Impact of Model Complexity

Table IV comparing ARIMA model orders, speedup, and MAE for sequential/parallel execution.

Table IV: Detailed impact of model complexity on performance.
ARIMA order Speedup (4 cores) Sequential MAE MAE Parallel
(1,1,1) 2.029651 74.894295 79.188071
(2,1,2) 1.008971 62.306397 85.378080
(3,1,3) 0.997063 62.215176 68.686469
(4,1,4) 0.993850 64.879321 79.779771

V Discussion

V-A Performance and Scalability Implications

Parallel implementation achieves significant speedup, especially for larger data sets, which has important implications:

  • Improved scalability: Allows incorporation of a wider range of factors into predictive models [51].

  • Real-time analysis: Drastic reduction in processing times facilitates dynamic management of smart grids[52].

  • Exploration of more complex models: Computational efficiency allows experimentation with more sophisticated models, such as combining ARIMA with deep learning techniques [53].

V-B Challenges and Limitations

Refer to caption
Figure 3: Communication overhead vs. problem size

Despite the promising results, it is important to recognize several challenges:

  • Communication overhead: For small problems, it can outweigh the benefits of parallelization[55].

  • Memory scalability: May be a limiting factor for extremely large datasets [56].

  • Implementation complexity: Requires specialized skills [57].

  • Variability in performance: Dependent on hardware architecture [58].

  • Challenges in interpretation: More complex models may be less interpretable[60].

V-C Comparison with Other Approaches

Table V: Comparison of high performance computing approaches
Approach Advantages Disadvantages
Parallel CPU (Our approach) Balance between performance and ease of implementation Limited by the number of cores
GPU Computing Massive parallelism for specific operations Complex, less flexible scheduling
Distributed computing High scalability for massive data more complex implementation, higher latency
FPGA Exceptional performance for specific algorithms Requires specialized hardware programming skills

Our CPU-based approach offers an optimal balance between performance improvement, ease of implementation and flexibility, making it particularly suitable for energy consumption prediction applications in contexts such as Puno.

VI Implications for Energy Policy

Advances in energy consumption prediction facilitated by parallel computing have significant implications for energy policy, especially in developing regions such as Puno:

  • Infrastructure planning: More accurate and detailed forecasts can better inform energy infrastructure investment decisions. [54].

  • Integration of renewable energy: The ability to more accurately predict energy demand can facilitate the integration of variable renewable energy sources into the grid. [61].

  • Dynamic tariffs: Real-time predictions can enable the implementation of more dynamic and efficient tariff structures. [62].

  • Energy efficiency: A more detailed understanding of consumption patterns can inform more effective energy efficiency policies. [63].

  • Emergency response: The ability to perform fast and accurate analysis can improve response to energy emergencies. [64].

  • Democratization of decision making: By making advanced analytical tools more accessible, more decentralized and participatory decision making in the energy sector can be fostered [65].

  • Adapting to climate change: Improved processing and analytical capabilities can help model and predict the impact of climate change on energy consumption patterns[66].

VII Conclusions and Future Work

This study demonstrates the significant potential of parallel computing to improve the efficiency and scalability of ARIMA models in predicting energy consumption, with particular implications for developing regions such as Puno, Peru. The main conclusions are:

Parallel implementation achieves significant speedups, especially for large data sets, without compromising prediction accuracy. This enables more frequent and detailed analysis, crucial for dynamic energy management.

The parallelization efficiency remains high even with increasing data size, indicating good scalability of the algorithm. This is particularly relevant in the context of increasing data volume in the energy sector.

The ability to efficiently process large volumes of data opens up new possibilities for energy management and evidence-based policy making, potentially transforming energy planning in Puno.

Improved accessibility to complex analyses can democratize decision making in the energy sector, allowing broader participation of local stakeholders.

Future work could explore:

The integration of deep learning techniques with ARIMA in a parallel context, potentially further improving prediction accuracy.

The implementation of this approach in distributed computing systems to handle even larger volumes of data, relevant for analyses at a national or broader regional level.

The application of this approach to other domains involving large-scale time series analysis, such as prediction of weather patterns or economic trends.

The development of tools and frameworks that facilitate the implementation of these parallel methods for researchers and practitioners not specialized in parallel computing, encouraging wider adoption.

The exploration of interpretability and explainability techniques for complex models, ensuring that predictions are understandable and reliable for decision makers.

In conclusion, parallel computing not only improves computational performance in predicting energy consumption, but can also be a catalyst for significant advances in energy management and planning in developing regions such as Puno, Peru. This approach has the potential to transform the way decisions are made in the energy sector, leading to more efficient and sustainable use of energy resources.

References

  • [1] L. Suganthi and A. A. Samuel, "Energy models for demand forecasting—A review," Renewable and sustainable energy reviews, vol. 16, no. 2, pp. 1223–1240, 2012.
  • [2] R. J. Hyndman and G. Athanasopoulos, Forecasting: principles and practice. OTexts, 2018.
  • [3] V. Ş. Ediger and S. Akar, "ARIMA forecasting of primary energy demand by fuel in Turkey," Energy policy, vol. 35, no. 3, pp. 1701–1708, 2007.
  • [4] G. Box, "Box and Jenkins: time series analysis, forecasting and control," in A Very British Affair: Six Britons and the Development of Time Series Analysis During the 20th Century, pp. 161–215, Springer, 2013.
  • [5] B. Barney et al., "Introduction to parallel computing," Lawrence Livermore National Laboratory, vol. 6, no. 13, p. 10, 2010.
  • [6] K. Asanovic et al., "The landscape of parallel computing research: A view from berkeley," eScholarship, University of California, 2006.
  • [7] J. J. Dongarra and A. J. van der Steen, "High-performance computing systems: Status and outlook," Acta Numerica, vol. 21, pp. 379–474, 2012.
  • [8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
  • [9] H. Akhavan-Hejazi and H. Mohsenian-Rad, "Power systems big data analytics: An assessment of paradigm shift barriers and prospects," Energy Reports, vol. 4, pp. 91–100, 2018.
  • [10] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," Journal of machine learning research, vol. 13, no. 2, 2012.
  • [11] P. C. Marina and R. Q. Llanos, "LA INVERSIÓN PÚBLICA EN INFRAESTRUCTURA DE TRANSMISIÓN ELÉCTRICA Y LA INCIDENCIA ECONÓMICA EN EL SUBSECTOR ELÉCTRICO EN ELPERÚ, PERIODO 2000–2020," Gobierno y Gestión Pública, vol. 10, no. 1, 2023.
  • [12] MA4126072 Majid et al., "Renewable energy for sustainable development in India: current status, future prospects, challenges, employment, and investment opportunities," Energy, Sustainability and Society, vol. 10, no. 1, pp. 1–36, 2020.
  • [13] C. A. Medina Salguero, "Mejora en el proceso de gestión de abastecimiento de una empresa del sector eléctrico," Universidad de Lima, 2022.
  • [14] Banco Mundial, "Indicadores del Desarrollo Mundial. Banco Mundial," Recuperado de: http://datos. bancomundial. org, 2016.
  • [15] K. J. Chanduví Regalado, "Inversión extranjera directa y su relación sobre el crecimiento económico del Perú durante 1980-2015," Universidad San Ignacio de Loyola, 2017.
  • [16] P. G. Aita, "Perú potencial energético: Propuestas y desafíos," Revista de Derecho Administrativo, no. 16, pp. 217–231, 2016.
  • [17] P. G. Aita, "Energías renovables alternativas, un reto para el Perú," Revista Derecho Público Económico, 2021.
  • [18] A. L. Caceres, P. Jaramillo, H. S. Matthews, C. Samaras, and B. Nijssen, "Hydropower under climate uncertainty: Characterizing the usable capacity of Brazilian, Colombian and Peruvian power plants under climate scenarios," Energy for Sustainable Development, vol. 61, pp. 217–229, 2021.
  • [19] J. Bazán, J. Rieradevall, X. Gabarrell, and I. Vázquez-Rowe, "Low-carbon electricity production through the implementation of photovoltaic panels in rooftops in urban environments: A case study for three cities in Peru," Science of the Total Environment, vol. 622, pp. 1448–1462, 2018.
  • [20] N. Ceppi, “Política energética argentina: un balance del periodo 2003-2015,” Problemas del desarrollo, vol. 49, no. 192, pp. 37–60, 2018.
  • [21] V. S. Aliaga, "Tendencia y variabilidad climática; subregiones pampeanas, Argentina (1960-2010)," Universidad Nacional del Comahue. Facultad de Humanidades. Departamento de, 2020.
  • [22] J. Velarde, "Reporte de Inflación: Panorama actual y proyecciones macroeconómicas," Lima: BCRP, 2016.
  • [23] V. P. Cuadros-Ojeda, L. L. Céspedes-Aguirre, J. L. Tello-Cornejo, C. P. Martel-Carranza, and M. B. N. del Aguila, “La pobreza monetaria y el ciclo del crecimiento económico de la Región Huánuco, 2009-2018,” Gaceta Científica, vol. 7, no. 4, pp. 165–171, 2021.
  • [24] Cámara de comercio de Cartagena, "Informe ejecutivo-Encuestas mensuales con enfoque territorial, Diciembre 2021," Cámara de comercio de Cartagena, 2020.
  • [25] L. F. Laurente Blanco and F. Laurente Quiñonez, "Aplicación del modelo ARIMA para la producción de la papa en la región de Puno-Perú," Revista de Investigación e Innovación Agropecuaria y de Recursos Naturales, vol. 6, no. 1, pp. 30–40, 2019.
  • [26] J. C. Valero Gómez, "Análisis de modelos predictivos basados en visión computacional aplicados al paralelismo," Universidad Nacional de Moquegua, 2019.
  • [27] S. Huaquisto Cáceres and I. G. Chambilla Flores, “Análisis del consumo de agua potable en el centro poblado de Salcedo, Puno,” Investigación & desarrollo, vol. 19, no. 1, pp. 133–144, 2019.
  • [28] M. D. Burga Hidalgo, "Prácticas alimentarias durante un contexto de cambio estacional: el caso de la comunidad altiplánica de Tantamaco, Puno," Pontificia Universidad Católica del Perú.
  • [29] R. Meza-Duman, M. Hermoza-Gutierrez, I. Maldonado, and D. Salas-Mercado, “Percepción social de la calidad del agua y la expansión territorial de la minería en Ollachea, Puno, Perú,” Comuni@cción, vol. 13, no. 1, pp. 16–28, 2022.
  • [30] P. Coila-Añasco Ubaldo, D. A. Ruelas-Calloapaza, F. Guerra-Aguilar, C. A. O. Flores, and F. Oha-Humpiri, "Variaciones en el metabolismo energético de la alpaca (Vicugna pacos). Una evaluación por efecto del ayuno prolongado," Journal of the Selva Andina Animal Science, vol. 7, no. 2, pp. 63–71, 2020.
  • [31] P. Leavy, Research design: Quantitative, qualitative, mixed methods, arts-based, and community-based participatory research approaches. Guilford Publications, 2022.
  • [32] Electro Puno S.A.A., "Reporte Anual de Consumo Eléctrico 2022," Electro Puno S.A.A., Puno, Perú, 2023.
  • [33] C. A. Meneses Agudo et al., "Análisis y predicción de series temporales provenientes de un sistema SCADA de una planta de fabricación industrial," 2019.
  • [34] L. Al Shalabi, Z. Shaaban, and B. Kasasbeh, "Data mining: A preprocessing engine," Journal of Computer Science, vol. 2, no. 9, pp. 735–739, 2006.
  • [35] CLEVELAND RB, "STL: A seasonal-trend decomposition procedure based on loess," J Off Stat, vol. 6, pp. 3–73, 1990.
  • [36] G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," in Proceedings of the April 18-20, 1967, spring joint computer conference, pp. 483–485, 1967.
  • [37] J. L. Gustafson, "Reevaluating Amdahl’s law," Communications of the ACM, vol. 31, no. 5, pp. 532–533, 1988.
  • [38] Python Software Foundation, "multiprocessing — Process-based parallelism," Python 3.11.4 documentation, 2023.
  • [39] L. Arias Murillo and J. A. Hidalgo Soto, "Construcción de una Plataforma de Software para la Proyección de la Demanda en el Sistema Eléctrico Nacional de Costa Rica basada en una Solución de Inteligencia de Negocios," Universidad Cenfotec, 2015.
  • [40] Congreso de la República del Perú, "Ley N° 29733 - Ley de Protección de Datos Personales," El Peruano, 2011.
  • [41] P. Marquez Barber, "Mejora de calidad software para la gestión de aprovisionamiento de almacenes," Universitat Politècnica de València, 2024.
  • [42] G. Alecha, M. Ferreiro, O. Micolini, and L. Ventre, "Sistema inteligente de relevamiento de stock," 2019.
  • [43] G. E. Ávila Griffin and S. Núñez Flores, “Aplicación de modelos de aprendizaje automático para el análisis de series temporales y pronóstico de anomalías en la nómina diaria en una planta de manufactura,” Universidad Tecnológica Centroamericana UNITEC, 2023.
  • [44] L. O. Perez Chuquimez, “Impacto de la recaudación tributaria sobre el presupuesto público ejecutado en educación superior universitaria y no universitaria, Perú Período 2000-2016,” Universidad Científica del Sur, 2018.
  • [45] F. F. Caballero Díaz et al., “Selección de modelos mediante criterios de información en análisis factorial. Aspectos teóricos y computacionales,” Granada: Universidad de Granada, 2011.
  • [46] E. A. Q. Montoya, S. F. J. Colorado, W. Y. C. Muñoz, and G. E. Chanchí Golondrino, “Propuesta de una arquitectura para agricultura de precisión soportada en IoT,” Revista Ibérica de Sistemas e Tecnologias de Informação, no. 24, pp. 39–56, 2017.
  • [47] M. Israel, Research ethics and integrity for social scientists: Beyond regulatory compliance. Sage, 2014.
  • [48] J. S. Saltz and N. Dewar, "Data science ethical considerations: a systematic literature review and proposed project framework," Ethics and Information Technology, vol. 21, pp. 197–208, 2019.
  • [49] V. Stodden et al., "Enhancing reproducibility for computational methods," Science, vol. 354, no. 6317, pp. 1240–1241, 2016.
  • [50] M. Taddeo and L. Floridi, "How AI can be a force for good," Science, vol. 361, no. 6404, pp. 751–752, 2018.
  • [51] M. Pérez, BIG DATA-Técnicas, herramientas y aplicaciones. Alfaomega Grupo Editor, 2015.
  • [52] J. E. B. Bermeo, “Maestría en Electricidad mención Redes Eléctricas Inteligentes,” 2021.
  • [53] M. P. Mejía Tovar, “Modelo para el Forecast de una plataforma de Fast Delivery en Colombia,” Universidad de los Andes, 2023.
  • [54] E. Belsky, "Planificar un desarrollo urbano integrador y sostenible," Worldwatch Institute La situación del mundo, 2012.
  • [55] I. Foster and C. Kesselman, "The history of the grid, Advances in Parallel Computing 20 (2011) 3–30. doi: 10.3233."
  • [56] J. Dongarra et al., "The international exascale software project: a call to cooperative action by the global high-performance community," The International Journal of High Performance Computing Applications, vol. 23, no. 4, pp. 309–322, 2009.
  • [57] T. G. Mattson, B. Sanders, and B. Massingill, Patterns for parallel programming. Pearson Education, 2004.
  • [58] M. Massigoge, "Arquitectura orientada a objetos para análisis de datos en agricultura de precisión," Universidad Nacional de La Plata, 2006.
  • [59] C. Rudin, "Stop explaining black box machine learning models for high stakes decisions an
  • [60] C. Rudin, "Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead," Nature machine intelligence, vol. 1, no. 5, pp. 206–215, 2019.
  • [61] G. Notton, M. L. Nivet, D. Zafirakis, F. Motte, C. Voyant, and A. Fouilloy, "Tilos, the first autonomous renewable green island in Mediterranean: A Horizon 2020 project," in 2017 15th international conference on electrical machines, drives and power systems (ELMA), pp. 102–105, 2017.
  • [62] J. Gómez Pineda, O. Bejarano, P. Roda, and F. Perdomo, "Hacia el desarrollo de infraestructuras eficientes y sostenibles en América Latina: Oportunidades y beneficios de la digitalización. Resumen ejecutivo," CAF, 2021.
  • [63] K. Zhou, C. Fu, and S. Yang, "Big data driven smart energy management: From big data to big insights," Renewable and sustainable energy reviews, vol. 56, pp. 215–225, 2016.
  • [64] P. CGF Markus, "Organizacion y capacidades de las instituciones de primera respuesta a desastres en costa rica: introduccion," Revista En Torno a la Prevención, no. 20, pp. 7–30, 2018.
  • [65] N. M. Sarmiento Barbieri, “Software Libre de apoyo a la toma de Decisiones en Energías Renovables,” 2019.
  • [66] J.-C. Ciscar et al., "Physical and economic consequences of climate change in Europe," Proceedings of the National Academy of Sciences, vol. 108, no. 7, pp. 2678–2683, 2011.
  • [67] M. T. Martelo, “La precipitación en Venezuela y su relación con el sistema climático,” Dirección de Hidrología, Meteorología y Oceanología-Dirección General de Cuencas Hidrográficas-MARN, 2003.