Search Results (2,113)

Search Parameters:
Keywords = XGBoost

19 pages, 6078 KiB  
Article
Prediction of Oil–Water Two-Phase Flow Patterns Based on Bayesian Optimisation of the XGBoost Algorithm
by Dudu Wang, Haimin Guo, Yongtuo Sun, Haoxun Liang, Ao Li and Yuqing Guo
Processes 2024, 12(8), 1660; https://doi.org/10.3390/pr12081660 (registering DOI) - 7 Aug 2024
Abstract
With the continuous advancement of petroleum extraction technologies, the importance of horizontal and inclined wells in reservoir exploitation has been increasing. However, accurately predicting oil–water two-phase flow regimes is challenging due to the complexity of subsurface fluid flow patterns. This paper introduces a novel approach to address this challenge by employing extreme gradient boosting (XGBoost, version 2.1.0) optimised through Bayesian techniques (using the Bayesian-optimization library, version 1.4.3) to predict oil–water two-phase flow regimes. The integration of Bayesian optimisation aims to enhance the efficiency of parameter tuning and the precision of predictive models. The methodology commenced with experimental studies utilising a multiphase flow simulation apparatus to gather data across a spectrum of water cut rates, well inclination angles, and flow rates. Flow patterns were meticulously recorded via direct visual inspection, and these empirical datasets were subsequently used to train and validate both the conventional XGBoost model and its Bayesian-optimised counterpart. A total of 64 datasets were collected, with 48 sets used for training and 16 sets for testing, divided in a 3:1 ratio. The findings highlight a marked improvement in predictive accuracy for the Bayesian-optimised XGBoost model, achieving a testing accuracy of 93.8%, compared to 75% for the traditional XGBoost model. Precision, recall, and F1-score metrics also showed significant improvements: precision increased from 0.806 to 0.938, recall from 0.875 to 0.938, and F1-score from 0.873 to 0.938. The training accuracy further supported these results, with the Bayesian-optimised XGBoost (BO-XGBoost) model achieving an accuracy of 0.948 compared to 0.806 for the traditional XGBoost model. Comparative analyses demonstrate that Bayesian optimisation enhanced the predictive capabilities of the algorithm.
Shapley additive explanations (SHAP) analysis revealed that well inclination angles, water cut rates, and daily flow rates were the most significant features contributing to the predictions. This study confirms the efficacy and superiority of the Bayesian-optimised XGBoost (BO-XGBoost) algorithm in predicting oil–water two-phase flow regimes, offering a robust and effective methodology for investigating complex subsurface fluid dynamics. The research outcomes are crucial in improving the accuracy of oil–water two-phase flow predictions and introducing innovative technical approaches within the domain of petroleum engineering. This work lays a foundational stone for the advancement and application of multiphase flow studies. Full article
(This article belongs to the Section Automation Control Systems)
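
The reported test accuracies can be cross-checked against the stated split of 48 training and 16 test sets. The sketch below is only an arithmetic illustration: the helper name implied_correct is hypothetical, and the counts of correct predictions it prints are inferred from the percentages, not stated in the abstract.

```python
from fractions import Fraction

def implied_correct(accuracy: float, n_test: int) -> int:
    """Nearest whole number of correct predictions for a reported accuracy."""
    return round(accuracy * n_test)

# Split sizes as stated in the abstract.
n_total, n_train, n_test = 64, 48, 16
split_ratio = Fraction(n_train, n_test)  # train:test ratio

# Reported test accuracies; the implied counts are an inference.
reported = {"BO-XGBoost": 0.938, "XGBoost": 0.75}
for name, acc in reported.items():
    k = implied_correct(acc, n_test)
    print(f"{name}: about {k}/{n_test} correct = {k / n_test:.4f}")
```

Under this reading, 93.8% on 16 test sets corresponds to a single misclassified case, which is worth keeping in mind when comparing models on a test set this small.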

17 pages, 1946 KiB  
Article
Data-Driven PM2.5 Exposure Prediction in Wildfire-Prone Regions and Respiratory Disease Mortality Risk Assessment
by Sadegh Khanmohammadi, Mehrdad Arashpour, Milad Bazli and Parisa Farzanehfar
Fire 2024, 7(8), 277; https://doi.org/10.3390/fire7080277 (registering DOI) - 7 Aug 2024
Abstract
Wildfires generate substantial smoke containing fine particulate matter (PM2.5) that adversely impacts health. This study develops machine learning models integrating pre-wildfire factors like weather and fuel conditions with post-wildfire health impacts to provide a holistic understanding of smoke exposure risks. Various data-driven models including Support Vector Regression, Multi-layer Perceptron, and three tree-based ensemble algorithms (Random Forest, Extreme Gradient Boosting (XGBoost), and Natural Gradient Boosting (NGBoost)) are evaluated in this study. Ensemble models effectively predict PM2.5 levels based on temperature, humidity, wind, and fuel moisture, revealing the significant roles of radiation, temperature, and moisture. Further modelling links smoke exposure to deaths from chronic obstructive pulmonary disease (COPD) and lung cancer using age, sex, and pollution type as inputs. Ambient pollution is the primary driver of COPD mortality, while age has a greater influence on lung cancer deaths. This research advances atmospheric and health impact understanding, aiding forest fire prevention and management. Full article
(This article belongs to the Special Issue Forest Fuel Treatment and Fire Risk Assessment)

18 pages, 3533 KiB  
Article
Rice Yield Forecasting Using Hybrid Quantum Deep Learning Model
by De Rosal Ignatius Moses Setiadi, Ajib Susanto, Kristiawan Nugroho, Ahmad Rofiqul Muslikh, Arnold Adimabua Ojugo and Hong-Seng Gan
Computers 2024, 13(8), 191; https://doi.org/10.3390/computers13080191 (registering DOI) - 7 Aug 2024
Abstract
In recent advancements in agricultural technology, quantum mechanics and deep learning integration have shown promising potential to revolutionize rice yield forecasting methods. This research introduces a novel Hybrid Quantum Deep Learning model that leverages the intricate processing capabilities of quantum computing combined with the robust pattern recognition prowess of deep learning algorithms such as Extreme Gradient Boosting (XGBoost) and Bidirectional Long Short-Term Memory (Bi-LSTM). Bi-LSTM networks are used for temporal feature extraction and quantum circuits for quantum feature processing. Quantum circuits leverage quantum superposition and entanglement to enhance data representation by capturing intricate feature interactions. These enriched quantum features are combined with the temporal features extracted by Bi-LSTM and fed into an XGBoost regressor. By synthesizing quantum feature processing and classical machine learning techniques, our model aims to improve prediction accuracy significantly. Based on measurements of mean square error (MSE), the coefficient of determination (R²), and mean absolute error (MAE), the results are 1.191621 × 10⁻⁵, 0.999929482, and 0.001392724, respectively. These values are so close to perfect that they can help inform essential decisions in global agricultural planning and management. Full article
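
For readers comparing the three reported metrics, the sketch below shows their standard definitions in plain Python. The function names and the toy numbers are illustrative assumptions; they are not the authors' code or data.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values, invented for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

MSE and MAE depend on the scale of the target variable, so very small values are only meaningful relative to the range of the yield data; R² is scale-free.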

33 pages, 2814 KiB  
Article
Explainable Graph Neural Networks: An Application to Open Statistics Knowledge Graphs for Estimating House Prices
by Areti Karamanou, Petros Brimos, Evangelos Kalampokis and Konstantinos Tarabanis
Technologies 2024, 12(8), 128; https://doi.org/10.3390/technologies12080128 - 6 Aug 2024
Viewed by 244
Abstract
In the rapidly evolving field of real estate economics, the prediction of house prices continues to be a complex challenge, intricately tied to a multitude of socio-economic factors. Traditional predictive models often overlook spatial interdependencies that significantly influence housing prices. The objective of this study is to leverage Graph Neural Networks (GNNs) on open statistics knowledge graphs to model these spatial dependencies and predict house prices across Scotland’s 2011 data zones. The methodology involves retrieving integrated statistical indicators from the official Scottish Open Government Data portal and applying three representative GNN algorithms: ChebNet, GCN, and GraphSAGE. These GNNs are compared against traditional models, including the tabular-based XGBoost and a simple Multi-Layer Perceptron (MLP), demonstrating superior prediction accuracy. Innovative contributions of this study include the use of GNNs to model spatial dependencies in real estate economics and the application of local and global explainability techniques to enhance transparency and trust in the predictions. The global feature importance is determined by a logistic regression surrogate model while the local, region-level understanding of the GNN predictions is achieved through the use of GNNExplainer. Explainability results are compared with those from a previous work that applied the XGBoost machine learning algorithm and the SHapley Additive exPlanations (SHAP) explainability framework on the same dataset. Interestingly, both the global surrogate model and the SHAP approach underscored the comparative illness factor, a health indicator, and the ratio of detached dwellings as the most crucial features in the global explainability. In the case of local explanations, while both methods showed similar results, the GNN approach provided a richer, more comprehensive understanding of the predictions for two specific data zones. Full article
(This article belongs to the Section Information and Communication Technologies)

13 pages, 3652 KiB  
Article
Research on Predictive Auxiliary Diagnosis Method for Gastric Cancer Based on Non-Invasive Indicator Detection
by Xia Zhang, Mao Zhang, Gang Wei and Jia Wang
Appl. Sci. 2024, 14(16), 6858; https://doi.org/10.3390/app14166858 - 6 Aug 2024
Viewed by 343
Abstract
Chronic atrophic gastritis is a serious health issue beyond the stomach health problems that affect normal life. This study aimed to explore the influencing factors related to chronic atrophic gastritis (CAG) using non-invasive indicators and establish an optimal prediction model to aid in the clinical diagnosis of CAG. Electronic medical record data from 20,615 patients with CAG were analyzed, including routine blood tests, liver function tests, and coagulation tests. The logistic regression algorithm revealed that age, hematocrit, and platelet distribution width were significant influences suggesting chronic atrophic gastritis in the Chongqing population (p < 0.05), with an area under the curve (AUC) of 0.879. The predictive model constructed based on the Random Forest algorithm exhibited an accuracy of 83.15%, precision of 97.38%, recall of 77.36%, and an F1-score of 70.86%, outperforming the models constructed using XGBoost, KNN, and SVC algorithms in a comprehensive comparison. The prediction model derived from this study serves as a valuable tool for future studies and can aid in the prediction and screening of chronic atrophic gastritis. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

24 pages, 7013 KiB  
Article
Comparative Analysis of Nature-Inspired Metaheuristic Techniques for Optimizing Phishing Website Detection
by Thomas Nagunwa
Analytics 2024, 3(3), 344-367; https://doi.org/10.3390/analytics3030019 - 6 Aug 2024
Viewed by 312
Abstract
The increasing number, frequency, and sophistication of phishing website-based attacks necessitate the development of robust solutions for detecting phishing websites to enhance the overall security of cyberspace. Drawing inspiration from natural processes, nature-inspired metaheuristic techniques have been proven to be efficient in solving complex optimization problems in diverse domains. Following these successes, this research paper aims to investigate the effectiveness of metaheuristic techniques, particularly Genetic Algorithms (GAs), Differential Evolution (DE), and Particle Swarm Optimization (PSO), in optimizing the hyperparameters of machine learning (ML) algorithms for detecting phishing websites. Using multiple datasets, six ensemble classifiers were trained on each dataset and their hyperparameters were optimized using each metaheuristic technique. As a baseline for assessing performance improvement, the classifiers were also trained with the default hyperparameters. To validate the genuine impact of the techniques over the use of default hyperparameters, we conducted statistical tests on the accuracy scores of all the optimized classifiers. The results show that the GA is the most effective technique, by improving the accuracy scores of all the classifiers, followed by DE, which improved four of the six classifiers. PSO was the least effective, improving only one classifier. It was also found that GA-optimized Gradient Boosting, LGBM and XGBoost were the best classifiers across all the metrics in predicting phishing websites, achieving peak accuracy scores of 98.98%, 99.24%, and 99.47%, respectively. Full article

16 pages, 1311 KiB  
Article
Hybrid Predictive Machine Learning Model for the Prediction of Immunodominant Peptides of Respiratory Syncytial Virus
by Syed Nisar Hussain Bukhari and Kingsley A. Ogudo
Bioengineering 2024, 11(8), 791; https://doi.org/10.3390/bioengineering11080791 - 5 Aug 2024
Viewed by 406
Abstract
Respiratory syncytial virus (RSV) is a common respiratory pathogen that infects the human lungs and respiratory tract, often causing symptoms similar to the common cold. Vaccination is the most effective strategy for managing viral outbreaks. Currently, extensive efforts are focused on developing a vaccine for RSV. Traditional vaccine design typically involves using an attenuated form of the pathogen to elicit an immune response. In contrast, peptide-based vaccines (PBVs) aim to identify and chemically synthesize specific immunodominant peptides (IPs), known as T-cell epitopes (TCEs), to induce a targeted immune response. Despite their potential for enhancing vaccine safety and immunogenicity, PBVs have received comparatively less attention. Identifying IPs for PBV design through conventional wet-lab experiments is challenging, costly, and time-consuming. Machine learning (ML) techniques offer a promising alternative, accurately predicting TCEs and significantly reducing the time and cost of vaccine development. This study proposes the development and evaluation of eight hybrid ML predictive models created through the permutations and combinations of two classification methods, two feature weighting techniques, and two feature selection algorithms, all aimed at predicting the TCEs of RSV. The models were trained using the experimentally determined TCEs and non-TCE sequences acquired from the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) repository. The hybrid model composed of the XGBoost (XGB) classifier, chi-squared (ChST) weighting technique, and backward search (BST) as the optimal feature selection algorithm (ChST−BST–XGB) was identified as the best model, achieving an accuracy, sensitivity, specificity, F1 score, AUC, precision, and MCC of 97.10%, 0.98, 0.97, 0.98, 0.99, 0.99, and 0.96, respectively. 
Additionally, K-fold cross-validation (KFCV) was performed to ensure the model’s reliability and an average accuracy of 97.21% was recorded for the ChST−BST–XGB model. The results indicate that the hybrid XGBoost model consistently outperforms other hybrid approaches. The epitopes predicted by the proposed model may serve as promising vaccine candidates for RSV, subject to in vitro and in vivo scientific assessments. This model can assist the scientific community in expediting the screening of active TCE candidates for RSV, ultimately saving time and resources in vaccine development. Full article
(This article belongs to the Special Issue Machine Learning Technology in Predictive Healthcare)

27 pages, 15447 KiB  
Article
High-Resolution Rainfall Estimation Using Ensemble Learning Techniques and Multisensor Data Integration
by Maulana Putra, Mohammad Syamsu Rosid and Djati Handoko
Sensors 2024, 24(15), 5030; https://doi.org/10.3390/s24155030 - 3 Aug 2024
Viewed by 372
Abstract
In Indonesia, the monitoring of rainfall requires an estimation system with a high resolution and wide spatial coverage because of the complexities of the rainfall patterns. This study built a rainfall estimation model for Indonesia through the integration of data from various instruments, namely, rain gauges, weather radars, and weather satellites. An ensemble learning technique, specifically, extreme gradient boosting (XGBoost), was applied to overcome the sparse data due to the limited number of rain gauge points, limited weather radar coverage, and imbalanced rain data. The model includes bias correction of the satellite data to increase the estimation accuracy. In addition, the data from several weather radars installed in Indonesia were also combined. This research handled rainfall estimates in various rain patterns in Indonesia, such as seasonal, equatorial, and local patterns, with a high temporal resolution, close to real time. The validation was carried out at six points, namely, Bandar Lampung, Banjarmasin, Pontianak, Deli Serdang, Gorontalo, and Biak. The research results show good estimation accuracy, with respective values of 0.89, 0.91, 0.89, 0.9, 0.92, and 0.9, and root mean square error (RMSE) values of 2.75 mm/h, 2.57 mm/h, 3.08 mm/h, 2.64 mm/h, 1.85 mm/h, and 2.48 mm/h. Our research highlights the potential of this model to accurately capture diverse rainfall patterns in Indonesia at high spatial and temporal scales. Full article
(This article belongs to the Special Issue Atmospheric Precipitation Sensors)

17 pages, 16804 KiB  
Article
Land Cover Mapping in a Mangrove Ecosystem Using Hybrid Selective Kernel-Based Convolutional Neural Networks and Multi-Temporal Sentinel-2 Imagery
by Seyd Teymoor Seydi, Seyed Ali Ahmadi, Arsalan Ghorbanian and Meisam Amani
Remote Sens. 2024, 16(15), 2849; https://doi.org/10.3390/rs16152849 - 3 Aug 2024
Viewed by 338
Abstract
Mangrove ecosystems provide numerous ecological services and serve as vital habitats for a wide range of flora and fauna. Thus, accurate mapping and monitoring of relevant land covers in mangrove ecosystems are crucial for effective conservation and management efforts. In this study, we proposed a novel approach for mangrove ecosystem mapping using a Hybrid Selective Kernel-based Convolutional Neural Network (HSK-CNN) framework and multi-temporal Sentinel-2 imagery. A time series of the Normalized Difference Vegetation Index (NDVI) products derived from Sentinel-2 imagery was produced to capture the temporal behavior of land cover types in the dynamic ecosystem of the study area. The proposed algorithm integrated Selective Kernel-based feature extraction techniques to facilitate the effective learning and classification of multiple land cover types within the dynamic mangrove ecosystems. The model demonstrated a high Overall Accuracy (OA) of 94% in classifying eight land cover classes, including mangrove, tidal zone, water, mudflat, urban, and vegetation. The HSK-CNN demonstrated superior performance compared to other algorithms, including random forest (OA = 85%), XGBoost (OA = 87%), Three-Dimensional (3D)-DenseNet (OA = 90%), Two-Dimensional (2D)-CNN (OA = 91%), Multi-Layer Perceptron (MLP)-Mixer (OA = 92%), and Swin Transformer (OA = 93%). Additionally, it was observed that the structure of the network, such as the types of convolutional layers and patch sizes, affected the classification accuracy using the proposed model and, thus, the optimum scenarios and values of these parameters should be determined to obtain the highest possible classification accuracy. Overall, it was observed that the produced map could offer valuable insights into the distribution of different land cover types in the mangrove ecosystem, facilitating informed decision-making for conservation and sustainable management efforts. Full article

16 pages, 3022 KiB  
Article
Data-Driven Optimization of Plasma Electrolytic Oxidation (PEO) Coatings with Explainable Artificial Intelligence Insights
by Patricia Fernández-López, Sofia A. Alves, Aleksey Rogov, Aleksey Yerokhin, Iban Quintana, Aitor Duo and Aitor Aguirre-Ortuzar
Coatings 2024, 14(8), 979; https://doi.org/10.3390/coatings14080979 - 3 Aug 2024
Viewed by 312
Abstract
PEO constitutes a promising surface technology for the development of protective and functional ceramic coatings on lightweight alloys. Despite its interesting advantages, including enhanced wear and corrosion resistances and eco-friendliness, the industrial implementation of PEO technology is limited by its relatively high energy consumption. This study explores the development and optimization of novel PEO processes by means of machine learning (ML) to improve the coating thickness. For this purpose, ML models random forest and XGBoost were employed to predict the thickness of the developed PEO coatings based on the key process variables (frequency, current density, and electrolyte composition). The predictive performance was significantly improved by including the composition of the used electrolyte in the models. Furthermore, Shapley values identified the pulse frequency and the TiO2 concentration in the electrolyte as the most influential variables, with higher values leading to increased coating thickness. The residual analysis revealed a certain heteroscedasticity, which suggests the need for additional samples with high thickness to improve the accuracy of the model. This study reveals the potential of artificial intelligence (AI)-driven optimization in PEO processes, which could pave the way for more efficient and cost-effective industrial applications. The findings achieved further emphasize the significance of integrating interactions between variables, such as frequency and TiO2 concentration, into the design of processing operations. Full article

17 pages, 786 KiB  
Article
A Parallel Approach to Enhance the Performance of Supervised Machine Learning Realized in a Multicore Environment
by Ashutosh Ghimire and Fathi Amsaad
Mach. Learn. Knowl. Extr. 2024, 6(3), 1840-1856; https://doi.org/10.3390/make6030090 - 2 Aug 2024
Viewed by 523
Abstract
Machine learning models play a critical role in applications such as image recognition, natural language processing, and medical diagnosis, where accuracy and efficiency are paramount. As datasets grow in complexity, so too do the computational demands of classification techniques. Previous research has achieved high accuracy but required significant computational time. This paper proposes a parallel architecture for Ensemble Machine Learning Models, harnessing multicore CPUs to expedite performance. The primary objective is to enhance machine learning efficiency without compromising accuracy through parallel computing. This study focuses on benchmark ensemble models including Random Forest, XGBoost, ADABoost, and K Nearest Neighbors. These models are applied to tasks such as wine quality classification and fraud detection in credit card transactions. The results demonstrate that, compared to single-core processing, machine learning tasks run 1.7 times and 3.8 times faster for small and large datasets on quad-core CPUs, respectively. Full article
(This article belongs to the Section Learning)
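
One way to read the reported 1.7x and 3.8x speedups on quad-core CPUs is through Amdahl's law. The sketch below assumes that law applies to these workloads, which the abstract does not claim; the function names are hypothetical and the implied parallel fractions are an inference rather than reported results.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup when a fraction of the work scales with cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

def implied_parallel_fraction(speedup: float, cores: int) -> float:
    """Invert Amdahl's law to get the parallel fraction behind a speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

cores = 4
# Speedups as reported in the abstract for small and large datasets.
for label, speedup in [("small datasets", 1.7), ("large datasets", 3.8)]:
    p = implied_parallel_fraction(speedup, cores)
    print(f"{label}: speedup {speedup}x implies parallel fraction of about {p:.3f}")
```

Under this assumption, the large-dataset case is almost fully parallel, while roughly half of the small-dataset runtime would be serial overhead.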

14 pages, 2447 KiB  
Article
Air Quality Prediction and Ranking Assessment Based on Bootstrap-XGBoost Algorithm and Ordinal Classification Models
by Jingnan Yang, Yuzhu Tian and Chun Ho Wu
Atmosphere 2024, 15(8), 925; https://doi.org/10.3390/atmos15080925 - 2 Aug 2024
Viewed by 246
Abstract
Along with the rapid development of industries and the acceleration of urbanisation, the problem of air pollution is becoming more serious. Exploring the relevant factors affecting air quality and accurately predicting the air quality index are significant in improving the overall environmental quality and realising green economic development. Machine learning algorithms and statistical models have been widely used in air quality prediction and ranking assessment. In this paper, based on daily air quality data for the city of Xi’an, China, from 1 October 2022 to 30 September 2023, we construct support vector regression (SVR), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), random forests (RF), neural network (NN) and long short-term memory (LSTM) models to analyse the influence of the air quality index for Xi’an and to conduct comparative tests. The predicted values and 95% prediction intervals of the AQI for the next 15 days for Xi’an, China, are given based on the Bootstrap-XGBoost algorithm. Further, the ordinal logit regression and ordinal probit regression models are constructed to evaluate and accurately predict the AQI ranks of the data from 1 October 2023 to 15 October 2023 for Xi’an. Finally, this paper proposes some suggestions and policy measures based on the findings of this paper. Full article
(This article belongs to the Special Issue Atmospheric Pollutants: Monitoring and Observation)
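
The bootstrap idea behind interval estimates of this kind can be sketched with the standard library alone. This is a minimal percentile-bootstrap illustration, not the authors' Bootstrap-XGBoost procedure: the sample mean stands in for the fitted model, the AQI values are invented, and bootstrap_interval is a hypothetical helper. The resulting interval reflects resampling variability of the estimate only; a full prediction interval would also need to account for residual noise.

```python
import random
import statistics

def bootstrap_interval(values, predictor, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for predictor(values)."""
    rng = random.Random(seed)
    # Refit the stand-in predictor on each resample drawn with replacement.
    estimates = sorted(
        predictor(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lower, upper

# Hypothetical daily AQI readings, invented for illustration only.
aqi = [62, 75, 81, 58, 94, 110, 67, 73, 88, 101, 79, 66]
lower, upper = bootstrap_interval(aqi, statistics.mean)
print(f"point estimate: {statistics.mean(aqi):.1f}")
print(f"95% bootstrap interval: [{lower:.1f}, {upper:.1f}]")
```

Replacing statistics.mean with a function that trains a model on the resample and returns its forecast would give the same structure with a real predictor.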

22 pages, 4022 KiB  
Article
A Multi-Modal Machine Learning Methodology for Predicting Solitary Pulmonary Nodule Malignancy in Patients Undergoing PET/CT Examination
by Ioannis D. Apostolopoulos, Nikolaos D. Papathanasiou, Dimitris J. Apostolopoulos, Nikolaos Papandrianos and Elpiniki I. Papageorgiou
Big Data Cogn. Comput. 2024, 8(8), 85; https://doi.org/10.3390/bdcc8080085 - 2 Aug 2024
Viewed by 300
Abstract
This study explores a multi-modal machine-learning-based approach to classify solitary pulmonary nodules (SPNs). Non-small cell lung cancer (NSCLC), presenting primarily as SPNs, is the leading cause of cancer-related deaths worldwide. Early detection and appropriate management of SPNs are critical to improving patient outcomes, necessitating efficient diagnostic methodologies. While CT and PET scans are pivotal in the diagnostic process, their interpretation remains prone to human error and delays in treatment implementation. This study proposes a machine-learning-based network to mitigate these concerns, integrating CT, PET, and manually extracted features in a multi-modal manner (by integrating multiple image modalities and tabular features). CT and PET images are classified by a VGG19 network, while additional SPN features in combination with the outputs of VGG19 are processed by an XGBoost model to perform the ultimate diagnosis. The proposed methodology is evaluated using patient data from the Department of Nuclear Medicine of the University Hospital of Patras in Greece. We used 402 patient cases with human annotations to internally validate the model and 96 histopathologically confirmed cases for external evaluation. The model exhibited 97% agreement with the human readers and 85% diagnostic performance in the external set. It also identified the VGG19 predictions from CT and PET images, SUVmax, and diameter as key malignancy predictors. The study suggests that combining all available image modalities and SPN characteristics improves the agreement of the model with the human readers and the diagnostic efficiency. Full article

30 pages, 24527 KiB  
Article
Application of Artificial Intelligence to Forecast Drought Index for the Mekong Delta
by Duong Hai Ha, Phong Nguyen Duc, Thuan Ha Luong, Thang Tang Duc, Thang Trinh Ngoc, Tien Nguyen Minh and Tu Nguyen Minh
Appl. Sci. 2024, 14(15), 6763; https://doi.org/10.3390/app14156763 - 2 Aug 2024
Viewed by 337
Abstract
Droughts have a substantial impact on water supplies, agriculture, and ecosystems worldwide. Agricultural sustainability and production in the Mekong Delta of Vietnam are being jeopardized by droughts caused by climate change. Conventional forecasting methods frequently struggle to capture the intricate dynamics of meteorological events connected to drought, necessitating the use of more sophisticated prediction techniques. This study assesses the effectiveness of a statistical model (ARIMA), machine learning models (Gradient Boosting, XGBoost), and deep learning models (RNN, LSTM) in forecasting the Standardized Precipitation Evapotranspiration Index (SPEI) over different time periods (1, 3, 6, and 12 months) across six prediction intervals. The models were developed and evaluated using data from 11 meteorological stations spanning from 1985 to 2022. These models incorporated various climatic variables, including precipitation, temperature, humidity, potential evapotranspiration (PET), Southern Oscillation Index (SOI) Anomaly, and sea surface temperature in the NINO4 region (SST_NINO4). The results demonstrate that the XGBoost and LSTM models performed best, with lower error metrics and higher R² values than Gradient Boosting and RNN. Model performance varied with the forecast step, with error metrics often increasing at longer prediction horizons. The use of climatic indices improved the accuracy of the models. These findings are consistent with earlier research on drought episodes in the Mekong Delta and support studies from other areas that show the effectiveness of advanced modeling tools for predicting droughts. The work emphasizes the capacity of machine learning and deep learning models to enhance the precision of drought forecasting, which is vital for efficient water resource management and agricultural planning in places prone to drought. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
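The abstract above compares models at several forecast steps. As a rough illustration of how such a comparison is commonly framed, the sketch below turns a single index series into lagged input windows and a target a fixed number of steps ahead, so that any regressor (for example a gradient-boosted tree model) could be fitted once per horizon. The function name `make_supervised`, the lag count, and the toy values are assumptions made for illustration; they are not taken from the article, and the authors' actual feature construction may differ.

```python
from typing import List, Tuple


def make_supervised(
    series: List[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (lag window, future value) pairs from a univariate series.

    Each input row holds `n_lags` consecutive observations. The target is
    the observation `horizon` steps after the last value in the window.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")

    X: List[List[float]] = []
    y: List[float] = []
    # Last window start that still leaves room for the target value.
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        X.append(series[start:start + n_lags])
        y.append(series[start + n_lags + horizon - 1])
    return X, y


# Toy monthly index values (hypothetical, not real SPEI data).
toy_index = [0.1, -0.3, -0.8, -1.2, -0.9, -0.2, 0.4, 0.9, 1.1, 0.6, 0.0, -0.5]

for h in (1, 2, 3):
    X, y = make_supervised(toy_index, n_lags=3, horizon=h)
    print(f"horizon={h}: {len(X)} training rows")
```

Longer horizons leave fewer usable rows and place the target further from the most recent observation, which is one plausible reason error metrics tend to grow with the forecast step.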

28 pages, 20313 KiB  
Article
SHAP-Driven Explainable Artificial Intelligence Framework for Wildfire Susceptibility Mapping Using MODIS Active Fire Pixels: An In-Depth Interpretation of Contributing Factors in Izmir, Türkiye
by Muzaffer Can Iban and Oktay Aksu
Remote Sens. 2024, 16(15), 2842; https://doi.org/10.3390/rs16152842 - 2 Aug 2024
Viewed by 432
Abstract
Wildfire susceptibility maps play a crucial role in preemptively identifying regions at risk of future fires and informing decisions related to wildfire management, thereby aiding in mitigating the risks and potential damage posed by wildfires. This study employs eXplainable Artificial Intelligence (XAI) techniques, particularly SHapley Additive exPlanations (SHAP), to map wildfire susceptibility in Izmir Province, Türkiye. Incorporating fifteen conditioning factors spanning topography, climate, anthropogenic influences, and vegetation characteristics, machine learning (ML) models (Random Forest, XGBoost, LightGBM) were used to predict wildfire-prone areas using freely available active fire pixel data (MODIS Active Fire Collection 6 MCD14ML product). The evaluation of the trained ML models showed that the Random Forest (RF) model outperformed XGBoost and LightGBM, achieving the highest test accuracy (95.6%). All of the classifiers demonstrated a strong predictive performance, but RF excelled in sensitivity, specificity, precision, and F1 score, making it the preferred model for generating a wildfire susceptibility map and conducting a SHAP analysis. Unlike prevailing approaches focusing solely on global feature importance, this study fills a critical gap by employing SHAP summary and dependence plots to comprehensively assess each factor’s contribution, enhancing the explainability and reliability of the results. The analysis reveals clear associations between factors such as wind speed, temperature, NDVI, slope, and distance to villages with increased fire susceptibility, while rainfall and distance to streams exhibit nuanced effects. The spatial distribution of the wildfire susceptibility classes highlights critical areas, particularly in flat and coastal regions near settlements and agricultural lands, emphasizing the need for enhanced awareness and preventive measures. These insights inform targeted fire management strategies, highlighting the importance of tailored interventions like firebreaks and vegetation management. However, challenges remain, including ensuring the selected factors’ adequacy across diverse regions, addressing potential biases from resampling spatially varied data, and refining the model for broader applicability. Full article
(This article belongs to the Special Issue Artificial Intelligence for Natural Hazards (AI4NH))
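SHAP attributions, as used in the abstract above, are based on Shapley values: each factor's contribution is its average marginal effect on the prediction over all subsets of the other factors. The sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. Everything in it is an assumption made for illustration: the function `toy_fire_score`, its coefficients, the three inputs, and the all-zero baseline are hypothetical, and "absent" features are simply replaced by the baseline value. It is not the article's model and not the SHAP library's tree-based algorithm, which estimates the same quantity far more efficiently against a background dataset.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Features outside a coalition take their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_fire_score(z: Sequence[float]) -> float:
    """Hypothetical score: wind and temperature interact, rainfall lowers it."""
    wind, temp, rain = z
    return 0.5 * wind + 0.3 * temp - 0.2 * rain + 0.1 * wind * temp


sample = [2.0, 1.0, 3.0]      # hypothetical wind, temperature, rainfall
reference = [0.0, 0.0, 0.0]   # hypothetical baseline input
contributions = shapley_values(toy_fire_score, sample, reference)
print(contributions)
```

The contributions always sum to the difference between the prediction at the sample and the prediction at the baseline, and an interaction term is split evenly between the features involved. That additivity is what makes per-factor summary and dependence plots interpretable.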
