Search Results (2,690)

Search Parameters:
Keywords = K-nearest neighbors

23 pages, 13279 KiB  
Article
Comparison of Algorithms and Optimal Feature Combinations for Identifying Forest Type in Subtropical Forests Using GF-2 and UAV Multispectral Images
by Guowei He, Shun Li, Chao Huang, Shi Xu, Yang Li, Zijun Jiang, Jiashuang Xu, Funian Yang, Wei Wan, Qin Zou, Mi Zhang, Yan Feng and Guoqing He
Forests 2024, 15(8), 1327; https://doi.org/10.3390/f15081327 - 30 Jul 2024
Abstract
The composition and spatial distribution of tree species are pivotal for biodiversity conservation, ecosystem productivity, and carbon sequestration. However, the accurate classification of tree species in subtropical forests remains a formidable challenge due to their complex canopy structures and dense vegetation. This study addresses these challenges within the Jiangxi Lushan National Nature Reserve by leveraging high-resolution GF-2 remote sensing imagery and UAV multispectral images collected in 2018 and 2022. We extracted spectral, texture, vegetation index, geometric, and topographic features to devise 12 classification schemes. Using an object-oriented approach, we employed three machine learning algorithms, Random Forest (RF), k-Nearest Neighbor (KNN), and Classification and Regression Tree (CART), to identify 12 forest types in these regions. Our findings indicate that all three algorithms were effective in identifying forest type in subtropical forests, with optimal overall accuracy (OA) above 72%; RF outperformed KNN and CART; Scheme S12, based on feature selection, was the optimal feature combination; and the combination of RF and S12 yielded the highest classification accuracy, with OA and Kappa coefficients of 90.33% and 0.82 for 2018-RF-S12 and 89.59% and 0.81 for 2022-RF-S12. This study underscores the utility of combining multiple feature types and feature selection for enhanced forest type recognition, noting that topographic features significantly improved accuracy, whereas geometric features detracted from it. Altitude emerged as the most influential characteristic, alongside significant variables such as the Normalized Difference Vegetation Index (NDVI) and the mean reflectance in the blue band of the GF-2 image (Mean_B). Species such as Masson pine, shrub, and moso bamboo were accurately classified, with optimal F1-scores surpassing 89.50%. Notably, a shift from single-species to mixed-species stands was observed over the study period, enhancing ecological diversity and stability. These results highlight the effectiveness of GF-2 imagery for refined, large-scale forest-type identification and dynamic diversity monitoring in complex subtropical forests.
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
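The three-classifier comparison described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the authors' object-oriented pipeline: the feature matrix stands in for the spectral/texture/topographic features, and OA and Kappa are computed as in the abstract.

```python
# Minimal sketch: compare RF, KNN, and CART on synthetic "forest type" data,
# scoring overall accuracy (OA) and the Kappa coefficient. Data is invented.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "CART": DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    y_pred = model.fit(X_tr, y_tr).predict(X_te)
    oa = accuracy_score(y_te, y_pred)          # overall accuracy
    kappa = cohen_kappa_score(y_te, y_pred)    # chance-corrected agreement
    print(f"{name}: OA={oa:.3f}, Kappa={kappa:.3f}")
```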
40 pages, 44470 KiB  
Article
A Decision Support System for Crop Recommendation Using Machine Learning Classification Algorithms
by Murali Krishna Senapaty, Abhishek Ray and Neelamadhab Padhy
Agriculture 2024, 14(8), 1256; https://doi.org/10.3390/agriculture14081256 - 30 Jul 2024
Abstract
Today, crop suggestions and necessary guidance have become a regular need for farmers. Farmers generally depend on their local agriculture officers for this, and it may be difficult to obtain the right guidance at the right time. Nowadays, crop datasets are available on different websites in the agriculture sector, and they play a crucial role in suggesting suitable crops. So, a decision support system that analyzes a crop dataset using machine learning techniques can assist farmers in making better crop selections. The main objective of this research is to provide quick, accurate, and effective crop recommendations to farmers by utilizing machine learning methods, global positioning system coordinates, and crop cloud data. Here, the recommendation can be more personalized, enabling farmers to predict crops in their specific geographical context, taking into account factors like climate, soil composition, water availability, and local conditions. In this regard, an existing historical crop dataset containing the state, district, year, area-wise production rate, crop name, and season was collected for 246,091 sample records from the Dataworld website, which holds data on 37 different crops from different areas of India. For better analysis, a further dataset was collected from the agriculture offices of the Rayagada, Koraput, and Gajapati districts in Odisha state, India. Both datasets were combined and stored using a Firebase cloud service. Thirteen different machine learning algorithms were applied to the dataset to identify dependencies within the data. To facilitate this process, an Android application was developed using Android Studio (Electric Eel, 2023.1.1), its emulator (version 32.1.14), and the Android SDK (SDK 33). A model has been proposed that applies SMOTE (Synthetic Minority Oversampling Technique) to balance the dataset and then implements 13 different classifiers on the cloud dataset: logistic regression, decision tree (DT), K-Nearest Neighbor (KNN), Support Vector Classifier (SVC), random forest (RF), Gradient Boost (GB), Bagged Tree, extreme gradient boosting (XGB), Ada Boost, Cat Boost, Histogram-based Gradient Boosting (HGB), Stochastic Gradient Descent (SGDC), and Multinomial Naive Bayes (MNB). After applying SMOTE, the SGDC method achieved 1.00 in accuracy, precision, recall, F1-score, and ROC AUC (Receiver Operating Characteristic Area Under the Curve), with a sensitivity of 0.91 and a specificity of 0.54. Overall, SGDC performed better than all other classifiers implemented in the predictions.
(This article belongs to the Section Digital Agriculture)
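The SMOTE step above has a simple core idea: synthesize new minority-class samples by interpolating between a minority sample and one of its k nearest minority neighbors. The sketch below implements that idea directly in NumPy on toy data; the paper presumably used a library implementation on its crop dataset, not this code.

```python
# Hand-rolled SMOTE sketch (illustrative only): each synthetic point lies on
# the segment between a random minority sample and one of its k nearest
# minority-class neighbors.
import numpy as np

def smote(X_min, n_new, k=5, seed=None):
    """Generate n_new synthetic samples from minority-class matrix X_min."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)   # distances to sample i
        neighbors = np.argsort(d)[1:k + 1]             # skip the sample itself
        j = rng.choice(neighbors)
        gap = rng.random()                             # interpolation factor in [0, 1)
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.random.default_rng(0).normal(size=(20, 4))  # 20 minority samples
X_new = smote(X_min, n_new=30, k=5, seed=1)
print(X_new.shape)
```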

41 pages, 1726 KiB  
Article
An Improved Binary Crayfish Optimization Algorithm for Handling Feature Selection Task in Supervised Classification
by Shaymaa E. Sorour, Lamia Hassan, Amr A. Abohany and Reda M. Hussien
Mathematics 2024, 12(15), 2364; https://doi.org/10.3390/math12152364 - 29 Jul 2024
Abstract
Feature selection (FS) is a crucial phase in data mining (DM) and machine learning (ML) tasks, aimed at removing uncorrelated and redundant attributes to enhance classification accuracy. This study introduces an improved binary crayfish optimization algorithm (IBCOA) designed to tackle the FS problem. The IBCOA integrates a local search strategy and a periodic-mode boundary handling technique, significantly improving its ability to search and exploit the feature space. By doing so, the IBCOA effectively reduces dimensionality while improving classification accuracy. The algorithm's performance was evaluated using support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers on eighteen multi-scale benchmark datasets. The findings showed that the IBCOA outperformed nine recent binary optimizers, attaining up to 100% accuracy and reducing the feature set size by as much as 80%. Statistical evidence supports that the proposed IBCOA is highly competitive according to the Wilcoxon rank-sum test (alpha = 0.05). This study underscores the IBCOA's potential for enhancing FS processes, providing a robust solution for high-dimensional data challenges.
(This article belongs to the Special Issue Combinatorial Optimization and Applications)
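The general shape of such wrapper-style feature selection is worth seeing: a binary mask picks features, and a classifier's cross-validated accuracy is the fitness. The sketch below uses a plain random search as a stand-in for the (much more elaborate) binary crayfish optimizer, with k-NN as the fitness classifier.

```python
# Wrapper feature-selection sketch: binary masks scored by k-NN CV accuracy.
# A random search stands in for the IBCOA metaheuristic; this is not the
# paper's algorithm, just the surrounding evaluation loop.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(mask):
    if not mask.any():                     # empty feature set is worthless
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

best_mask, best_fit = None, -1.0
for _ in range(30):                        # 30 random candidate masks
    mask = rng.random(X.shape[1]) < 0.5
    f = fitness(mask)
    if f > best_fit:
        best_mask, best_fit = mask, f

print(f"selected {best_mask.sum()}/{X.shape[1]} features, CV accuracy {best_fit:.3f}")
```

A real binary optimizer replaces the random draw with guided updates of the mask, but the fitness function stays exactly this.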

43 pages, 431 KiB  
Article
Setting Ranges in Potential Biomarkers for Type 2 Diabetes Mellitus Patients Early Detection By Sex—An Approach with Machine Learning Algorithms
by Jorge A. Morgan-Benita, José M. Celaya-Padilla, Huizilopoztli Luna-García, Carlos E. Galván-Tejada, Miguel Cruz, Jorge I. Galván-Tejada, Hamurabi Gamboa-Rosales, Ana G. Sánchez-Reyna, David Rondon and Klinge O. Villalba-Condori
Diagnostics 2024, 14(15), 1623; https://doi.org/10.3390/diagnostics14151623 - 27 Jul 2024
Abstract
Type 2 diabetes mellitus (T2DM) is one of the most common metabolic diseases in the world and poses a significant public health challenge. Early detection and management of this metabolic disorder are crucial to prevent complications and improve outcomes. This paper aims to find core differences between male and female markers for detecting T2DM from clinical and anthropometric features, seeking out ranges in the identified potential biomarkers to provide useful information as a pre-diagnostic tool while excluding glucose-related biomarkers, using machine learning (ML) models. We used a dataset containing clinical and anthropometric variables from patients diagnosed with T2DM and patients without T2DM as controls. We applied feature selection with three different techniques to identify relevant biomarker models: an improved recursive feature elimination (RFE) evaluating each set from all features down to one feature with the Akaike information criterion (AIC) to find optimal outputs; Least Absolute Shrinkage and Selection Operator (LASSO) with glmnet; and Genetic Algorithms (GA) with GALGO, with forward selection (FS) applied to the GALGO output. These were then compared using the AIC to measure the performance of each technique and collect the optimal set of global features. Next, five different ML models were implemented and compared to identify the most accurate and interpretable one: logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), k-nearest neighbors (KNN), and nearest centroid (Nearcent). The models were then combined in an ensemble to provide a more robust approximation. The results showed that potential biomarkers such as systolic blood pressure (SBP) and triglycerides are jointly significantly associated with T2DM. This approach also identified triglycerides, cholesterol, and diastolic blood pressure as biomarkers with differences between male and female subjects that have not been previously reported in the literature. The most accurate ML model used RFE with random forest (RF) as the estimator, improved with the AIC, and achieved an accuracy of 0.8820. In conclusion, this study demonstrates the potential of ML models in identifying potential biomarkers for early detection of T2DM, excluding glucose-related biomarkers, as well as differences between male and female anthropometric and clinical profiles. These findings may help to improve early detection and management of T2DM by accounting for differences between male and female subjects in terms of anthropometric and clinical profiles, potentially reducing healthcare costs and improving personalized patient attention. Further research is needed to validate these potential biomarker ranges in other populations and clinical settings.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
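One step from the pipeline above, recursive feature elimination with a random-forest estimator followed by a k-NN classifier on the kept features, can be sketched with scikit-learn. Synthetic data stands in for the clinical T2DM dataset; this is not the authors' AIC-improved variant.

```python
# RFE sketch: a random forest ranks features, RFE recursively drops the
# weakest, and k-NN is then trained on the surviving subset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=400, n_features=25, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

selector = RFE(RandomForestClassifier(n_estimators=100, random_state=0),
               n_features_to_select=6).fit(X_tr, y_tr)
knn = KNeighborsClassifier().fit(selector.transform(X_tr), y_tr)
acc = accuracy_score(y_te, knn.predict(selector.transform(X_te)))
print(f"kept features: {selector.support_.sum()}, accuracy: {acc:.3f}")
```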

20 pages, 7336 KiB  
Article
Spectral Features Analysis for Print Quality Prediction in Additive Manufacturing: An Acoustics-Based Approach
by Michael Olowe, Michael Ogunsanya, Brian Best, Yousef Hanif, Saurabh Bajaj, Varalakshmi Vakkalagadda, Olukayode Fatoki and Salil Desai
Sensors 2024, 24(15), 4864; https://doi.org/10.3390/s24154864 - 26 Jul 2024
Abstract
Quality prediction in additive manufacturing (AM) processes is crucial, particularly in high-risk manufacturing sectors like aerospace, biomedical, and automotive. Acoustic sensors have emerged as valuable tools for detecting variations in print patterns by analyzing signatures and extracting distinctive features. This study focuses on the collection, preprocessing, and analysis of acoustic data streams from a Fused Deposition Modeling (FDM) 3D-printed sample cube (10 mm × 10 mm × 5 mm). Time- and frequency-domain features were extracted at 10 s intervals at varying layer thicknesses. The audio samples were preprocessed using the Harmonic-Percussive Source Separation (HPSS) method, and the time and frequency features were analyzed using the Librosa module. Feature importance analysis was conducted, and machine learning (ML) prediction was implemented using eight different classifier algorithms (K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), Decision Trees (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LightGBM)) for the classification of print quality based on the labeled datasets. 3D-printed samples with varying layer thicknesses, representing two print quality levels, were used to generate the audio samples, and the extracted spectral features served as input variables for the supervised ML algorithms to predict print quality. The investigation revealed that the mean spectral flatness, spectral centroid, power spectral density, and RMS energy were the most critical acoustic features. Prediction metrics, including accuracy, F1-score, recall, precision, and ROC/AUC, were used to evaluate the models. The extreme gradient boosting algorithm stood out as the top model, attaining a prediction accuracy of 91.3%, precision of 88.8%, recall of 92.9%, F1-score of 90.8%, and AUC of 96.3%. This research lays the foundation for acoustic-based quality prediction and control of 3D-printed parts using Fused Deposition Modeling and can be extended to other additive manufacturing techniques.
(This article belongs to the Collection Sensors and Sensing Technology for Industry 4.0)
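The key acoustic features named above (RMS energy, spectral centroid, spectral flatness) have compact definitions that can be computed directly with NumPy. The sketch below does so on a synthetic sine-plus-noise signal rather than real printer audio, and without Librosa, so it only illustrates the quantities, not the study's exact feature extraction.

```python
# Spectral-feature sketch on 1 s of synthetic audio: RMS energy,
# amplitude-weighted spectral centroid, and spectral flatness
# (geometric mean over arithmetic mean of the power spectrum).
import numpy as np

sr = 22_050                               # sample rate, Hz (assumed)
t = np.arange(sr) / sr                    # 1 s of signal
x = np.sin(2 * np.pi * 440 * t) + 0.05 * np.random.default_rng(0).normal(size=sr)

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / sr)
power = spectrum ** 2

rms_energy = np.sqrt(np.mean(x ** 2))
spectral_centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
spectral_flatness = np.exp(np.mean(np.log(power + 1e-12))) / (np.mean(power) + 1e-12)

print(f"RMS={rms_energy:.3f}, centroid={spectral_centroid:.1f} Hz, "
      f"flatness={spectral_flatness:.2e}")
```

A tonal signal like this one has flatness near 0; white noise would push it toward 1, which is why flatness separates "clean" from "rough" acoustic signatures.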

30 pages, 7774 KiB  
Review
Analysis of Models to Predict Mechanical Properties of High-Performance and Ultra-High-Performance Concrete Using Machine Learning
by Mohammad Hematibahar, Makhmud Kharun, Alexey N. Beskopylny, Sergey A. Stel’makh, Evgenii M. Shcherban’ and Irina Razveeva
J. Compos. Sci. 2024, 8(8), 287; https://doi.org/10.3390/jcs8080287 - 26 Jul 2024
Abstract
High-Performance Concrete (HPC) and Ultra-High-Performance Concrete (UHPC) have many applications in the civil engineering industry. These two types of concrete have as many similarities as differences, such as the mix design and additive powders like silica fume, metakaolin, and various fibers; however, the optimal percentages of each element in the mixture design are completely different. This study investigated the differences and similarities between these two types of concrete to find better mechanical behavior through the mixture design and parameters of each. In addition, this paper studied the correlation matrix through machine learning methods to predict the mechanical properties and find the relationship between the concrete mix design elements and the mechanical properties. To this end, linear, ridge, Lasso, random forest, K-nearest neighbors (KNN), decision tree, and partial least squares (PLS) regressions were chosen to find the best regression types. Accuracy was assessed with the coefficient of determination (R2), mean absolute error (MAE), and root-mean-square error (RMSE). Finally, PLS, linear, and Lasso regressions performed better than the other regressions, with R2 greater than 93%, 92%, and 92%, respectively. In general, the present study shows that HPC and UHPC have different mix designs and mechanical properties, and that PLS, linear, and Lasso regressions are the best regressions for predicting the mechanical properties.
(This article belongs to the Special Issue Research on Sustainable Cement-Based Composites)
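The regression bake-off described above follows a standard pattern: fit each candidate regressor on the same split and compare R2, MAE, and RMSE. The sketch below runs it on synthetic "mix design to strength" data; the feature/target relationship is invented and not the paper's HPC/UHPC data.

```python
# Compare several regressors on one split, scoring R2, MAE, and RMSE.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

results = {}
for name, reg in [("Linear", LinearRegression()), ("Ridge", Ridge()),
                  ("Lasso", Lasso()), ("KNN", KNeighborsRegressor())]:
    pred = reg.fit(X_tr, y_tr).predict(X_te)
    results[name] = (r2_score(y_te, pred),
                     mean_absolute_error(y_te, pred),
                     np.sqrt(mean_squared_error(y_te, pred)))  # RMSE
    print(name, results[name])
```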

12 pages, 386 KiB  
Article
A Performance Analysis of Stochastic Processes and Machine Learning Algorithms in Stock Market Prediction
by Mohammed Bouasabah
Economies 2024, 12(8), 194; https://doi.org/10.3390/economies12080194 - 24 Jul 2024
Abstract
In this study, we compare the performance of stochastic processes, namely the Vasicek, Cox–Ingersoll–Ross (CIR), and geometric Brownian motion (GBM) models, with that of machine learning algorithms, such as Random Forest, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN), for predicting the trends of the stock indices XLF (financial sector), XLK (technology sector), and XLV (healthcare sector). The results showed that the stochastic processes achieved remarkable prediction performance, especially the CIR model, while the machine learning algorithms' metrics were comparatively lower. However, it is important to note that the stochastic processes use the actual current index value to predict tomorrow's value, which may overestimate their performance. In contrast, machine learning algorithms offer a more flexible approach and are not as dependent on the current index value. Optimizing the hyperparameters of the machine learning algorithms is therefore crucial for further improving their performance.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)
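Of the stochastic baselines above, GBM is the simplest to simulate: S_{t+dt} = S_t * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z) with Z standard normal. The sketch below draws Monte Carlo paths with made-up parameters, not values calibrated to XLF/XLK/XLV.

```python
# Geometric Brownian motion simulation: 1000 one-year daily paths.
# mu, sigma, and s0 are illustrative, not fitted to any index.
import numpy as np

rng = np.random.default_rng(0)
s0, mu, sigma = 100.0, 0.05, 0.2       # start price, drift, volatility (annualized)
n_steps, n_paths, dt = 252, 1000, 1 / 252

z = rng.standard_normal((n_paths, n_steps))
log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
paths = s0 * np.exp(np.cumsum(log_increments, axis=1))

# The Monte Carlo mean of the terminal price should sit near s0 * exp(mu).
print(f"mean terminal price: {paths[:, -1].mean():.2f}")
```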

34 pages, 7032 KiB  
Article
Radio Signal Modulation Recognition Method Based on Hybrid Feature and Ensemble Learning: For Radar and Jamming Signals
by Yu Zhou, Ronggang Cao, Anqi Zhang and Ping Li
Sensors 2024, 24(15), 4804; https://doi.org/10.3390/s24154804 - 24 Jul 2024
Abstract
The detection performance of radar is significantly impaired by active jamming and mutual interference from other radars. This paper proposes a radio signal modulation recognition method to accurately recognize these signals, which supports jamming-cancellation decisions. Based on an ensemble-learning stacking algorithm improved by meta-feature enhancement, the proposed method adopts random forests, K-nearest neighbors, and Gaussian naive Bayes as the base-learners, with logistic regression serving as the meta-learner. It takes the multi-domain features of the signals as input: time-domain features (fuzzy entropy, slope entropy, and Hjorth parameters), frequency-domain features (spectral entropy), and fractal-domain features (fractal dimension). A simulation experiment covering seven common radar and active-jamming signal types was performed for validation and performance evaluation. The results showed that the proposed method outperforms other classification methods and meets the requirements of low signal-to-noise ratio and few-shot learning.
(This article belongs to the Section Radar Sensors)
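The stacking layout named above (RF, k-NN, and Gaussian naive Bayes base-learners, logistic regression meta-learner) maps directly onto scikit-learn's `StackingClassifier`. Synthetic features stand in for the paper's entropy/fractal signal features, and this sketch omits the meta-feature enhancement the paper adds.

```python
# Stacking sketch: three base-learners feed a logistic-regression meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=12, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("knn", KNeighborsClassifier()),
                ("gnb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000),
)
acc = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(f"stacked accuracy: {acc:.3f}")
```

By default the base-learners' out-of-fold predicted probabilities become the meta-learner's input, which is what keeps the meta-learner from overfitting to the base models' training-set behavior.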

17 pages, 6276 KiB  
Article
Integrating Interpolation and Extrapolation: A Hybrid Predictive Framework for Supervised Learning
by Bo Jiang, Xinyi Zhu, Xuecheng Tian, Wen Yi and Shuaian Wang
Appl. Sci. 2024, 14(15), 6414; https://doi.org/10.3390/app14156414 - 23 Jul 2024
Abstract
In the domain of supervised learning, interpolation and extrapolation serve as crucial methodologies for predicting data points within and beyond the confines of a given dataset, respectively. The efficacy of these methods is closely linked to the nature of the dataset, with increased challenges when multivariate feature vectors are handled. This paper introduces a novel prediction framework that integrates interpolation and extrapolation techniques. Central to this method are two main innovations: an optimization model that effectively classifies new multivariate data points as either interior or exterior to the known dataset, and a hybrid prediction system that combines k-nearest neighbor (kNN) and linear regression. Tested on the port state control (PSC) inspection dataset at the port of Hong Kong, our framework generally demonstrates greater predictive precision than traditional kNN and linear regression models. This research enriches the literature by illustrating the enhanced capability of combining interpolation and extrapolation techniques in supervised learning.
(This article belongs to the Special Issue Big Data: Analysis, Mining and Applications)
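The hybrid idea can be sketched in a few lines: decide whether a query point is interior to the training data, then predict with k-NN (interpolation) for interior points and linear regression (extrapolation) for exterior ones. A simple per-feature bounding-box test stands in here for the paper's optimization-based interior/exterior classifier, and the data is synthetic.

```python
# Hybrid interpolation/extrapolation sketch: k-NN inside the data's bounding
# box, linear regression outside it. The bounding-box test is a crude
# stand-in for the paper's interior/exterior optimization model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 200)

knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
lin = LinearRegression().fit(X, y)
lo, hi = X.min(axis=0), X.max(axis=0)

def hybrid_predict(x):
    interior = np.all((x >= lo) & (x <= hi))
    model = knn if interior else lin      # interpolate vs. extrapolate
    return model.predict(x.reshape(1, -1))[0]

print(hybrid_predict(np.array([5.0, 5.0, 5.0])))    # interior -> routed to k-NN
print(hybrid_predict(np.array([15.0, 5.0, 5.0])))   # exterior -> routed to linear
```

The point of the split is that k-NN degrades badly outside the data's support (its neighbors are all on the boundary), while a linear model extrapolates the global trend.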

21 pages, 4723 KiB  
Article
A Comparative Study on Imputation Techniques: Introducing a Transformer Model for Robust and Efficient Handling of Missing EEG Amplitude Data
by Murad Ali Khan
Bioengineering 2024, 11(8), 740; https://doi.org/10.3390/bioengineering11080740 - 23 Jul 2024
Abstract
In clinical datasets, missing data often occur for various reasons, including non-response, data corruption, and errors in data collection or processing. Such missing values can lead to biased statistical analyses, reduced statistical power, and potentially misleading findings, making effective imputation critical. Traditional imputation methods, such as zero imputation, mean imputation, and k-Nearest Neighbors (KNN) imputation, attempt to address these gaps. However, these methods often fall short of accurately capturing the underlying data complexity, leading to oversimplified assumptions and errors in prediction. This study introduces a novel imputation model employing transformer-based architectures to address these challenges. Notably, the model distinguishes between complete and incomplete EEG signal amplitude data in two datasets: PhysioNet and CHB-MIT. By training exclusively on complete amplitude data, the TabTransformer accurately learns and predicts missing values, capturing intricate patterns and relationships inherent in EEG amplitude data. Evaluation using various error metrics and the R2 score demonstrates significant enhancements over traditional methods such as zero, mean, and KNN imputation. The proposed model achieves impressive R2 scores of 0.993 for PhysioNet and 0.97 for CHB-MIT, highlighting its efficacy in handling complex clinical data patterns and improving dataset integrity. This underscores the transformative potential of transformer models in advancing the utility and reliability of clinical datasets.
(This article belongs to the Special Issue Intelligent IoMT Systems for Brain–Computer Interface)
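The KNN-imputation baseline the transformer is compared against fills each missing entry from the rows most similar on the observed features. A toy matrix (not EEG amplitudes) makes the mechanics visible:

```python
# KNN imputation sketch: the NaN in row 0 is filled with the mean of the
# corresponding values in the 2 nearest rows (nan-aware Euclidean distance).
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0, np.nan],
              [1.1, 2.1, 3.1],
              [0.9, 1.9, 2.9],
              [8.0, 9.0, 10.0]])

imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
print(X_filled[0, 2])   # rows 1 and 2 are nearest: (3.1 + 2.9) / 2 = 3.0
```

The distant fourth row is ignored, which is exactly the advantage of KNN over mean imputation (which would have pulled the fill toward 10.0).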

19 pages, 420 KiB  
Article
k-Nearest Neighbors Estimator for Functional Asymmetry Shortfall Regression
by Mohammed B. Alamari, Fatimah A. Almulhim, Zoulikha Kaid and Ali Laksaci
Symmetry 2024, 16(7), 928; https://doi.org/10.3390/sym16070928 - 19 Jul 2024
Abstract
This paper deals with the problem of financial risk management using a new expected shortfall regression based on the expectile model for the financial risk threshold. Unlike the VaR model, the expectile threshold is constructed with an asymmetric least squares loss function. We construct an estimator of this new model using the k-nearest neighbors (kNN) smoothing approach. The mathematical properties of the constructed estimator are stated through the establishment of pointwise complete convergence. Additionally, we prove that the constructed estimator is uniformly consistent over the nearest neighbors (UCNN). Such asymptotic results constitute good mathematical support for the proposed financial risk process. We then examine the straightforward implementation of this process on artificial and real data. Our empirical analysis confirms the superiority of the kNN approach over the kernel method, as well as the superiority of the expectile over the quantile in financial risk analysis.
(This article belongs to the Section Mathematics)
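A finite-dimensional sketch of the k-NN expectile idea: take the k nearest neighbors of the query point and solve the asymmetric least-squares problem on their responses. The tau-expectile m of a sample satisfies m = sum(w*y)/sum(w) with weights tau above m and (1 - tau) below, which yields a simple fixed-point iteration. This is only an illustration; the paper works in a functional-data setting with full asymptotic theory.

```python
# k-NN expectile sketch: asymmetric least squares on the k nearest responses.
import numpy as np

def expectile(y, tau, n_iter=100):
    """tau-expectile of a sample via the weighted-mean fixed point."""
    m = y.mean()
    for _ in range(n_iter):
        w = np.where(y > m, tau, 1 - tau)   # asymmetric squared-loss weights
        m = np.sum(w * y) / np.sum(w)
    return m

def knn_expectile(X, y, x0, k=20, tau=0.95):
    idx = np.argsort(np.linalg.norm(X - x0, axis=1))[:k]
    return expectile(y[idx], tau)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = X[:, 0] + rng.normal(size=500)
est = knn_expectile(X, y, x0=np.zeros(2), k=50, tau=0.95)
print(f"0.95-expectile near x0: {est:.3f}")   # above the local mean (~0)
```

At tau = 0.5 the weights are symmetric and the expectile reduces to the mean, which is a handy sanity check.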

15 pages, 1744 KiB  
Article
Machine Learning to Estimate Workload and Balance Resources with Live Migration and VM Placement
by Taufik Hidayat, Kalamullah Ramli, Nadia Thereza, Amarudin Daulay, Rushendra Rushendra and Rahutomo Mahardiko
Informatics 2024, 11(3), 50; https://doi.org/10.3390/informatics11030050 - 19 Jul 2024
Abstract
Currently, utilizing virtualization technology in data centers often imposes an increasing burden on the host machine (HM), leading to a decline in VM performance. To address this issue, live VM migration (LVM) is employed to alleviate the load on the host. This study introduces a hybrid machine learning model designed to estimate live pre-copy migration of virtual machines within the data center. The proposed model integrates the Markov Decision Process (MDP), a genetic algorithm (GA), and random forest (RF) algorithms to forecast the prioritized movement of virtual machines and identify the optimal target host machine. The hybrid model achieves a 99% accuracy rate with quicker training times than previous studies that utilized K-nearest neighbors, decision tree classification, support vector machines, logistic regression, and neural networks, indicating the potential for optimizing virtual machine placement and minimizing downtime. The authors recommend further exploration of deep learning (DL) approaches to address other data center performance issues, and it would be beneficial to delve into the practical implementation and dissemination of the proposed model in real-world data centers.
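One piece of such a pipeline can be sketched simply: a random-forest model predicts each candidate host's load after receiving the VM, and the VM is placed on the host with the lowest prediction. The features, the linear load model used to generate training labels, and the host/VM numbers below are all invented for illustration; the paper's MDP and GA components are omitted.

```python
# VM placement sketch: RF predicts post-migration host load; pick the argmin.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# features per (host, vm) pair: host CPU %, host RAM %, vm CPU %, vm RAM %
X = rng.uniform(0, 1, size=(500, 4))
load_after = 0.6 * (X[:, 0] + X[:, 2]) + 0.4 * (X[:, 1] + X[:, 3])  # toy label
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, load_after)

hosts = rng.uniform(0, 1, size=(5, 2))          # candidate hosts: CPU %, RAM %
vm = np.array([0.3, 0.2])                       # VM to migrate: CPU %, RAM %
candidates = np.hstack([hosts, np.tile(vm, (5, 1))])
best_host = int(np.argmin(rf.predict(candidates)))
print(f"migrate to host {best_host}")
```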

22 pages, 9521 KiB  
Article
Estimation of Leaf Area Index for Dendrocalamus giganteus Based on Multi-Source Remote Sensing Data
by Zhen Qin, Huanfen Yang, Qingtai Shu, Jinge Yu, Li Xu, Mingxing Wang, Cuifen Xia and Dandan Duan
Forests 2024, 15(7), 1257; https://doi.org/10.3390/f15071257 - 19 Jul 2024
Abstract
The Leaf Area Index (LAI) plays a crucial role in assessing the health of forest ecosystems. This study utilized ICESat-2/ATLAS as the primary information source, integrating 51 measured sample datasets, and employed the Sequential Gaussian Conditional Simulation (SGCS) method to derive surface grid information for the study area. The backscattering coefficient and texture feature factor from Sentinel-1, as well as the spectral band and vegetation index factors from Sentinel-2, were integrated. The random forest (RF), gradient-boosted regression tree (GBRT) model, and K-nearest neighbor (KNN) method were employed to construct the LAI estimation model. The optimal model, RF, was selected to conduct accuracy analysis of various remote sensing data combinations. The spatial distribution map of Dendrocalamus giganteus in Xinping County was then generated using the optimal combination model. The findings reveal the following: (1) Four key parameters—optimal fitted segmented terrain height, interpolated terrain surface height, absolute mean canopy height, and solar elevation angle—are significantly correlated. (2) The RF model constructed using a combination of ICESat-2/ATLAS, Sentinel-1, and Sentinel-2 data achieved optimal accuracy, with a coefficient of determination (R2) of 0.904, root mean square error (RMSE) of 0.384, mean absolute error (MAE) of 0.319, overall estimation accuracy (P1) of 88.96%, and relative root mean square error (RRMSE) of 11.04%. (3) The accuracy of LAI estimation using a combination of ICESat-2/ATLAS, Sentinel-1, and Sentinel-2 remote sensing data showed slight improvement compared to using either ICESat-2/ATLAS data combined with Sentinel-1 or Sentinel-2 data alone, with a significant enhancement in LAI estimation accuracy compared to using ICESat-2/ATLAS data alone. (4) LAI values in the study area ranged mainly from 2.29 to 2.51, averaging 2.4. 
Research indicates that employing ICESat-2/ATLAS spaceborne LiDAR data for regional-scale LAI estimation presents clear advantages. Incorporating SAR data and optical imagery, and exploiting the complementary information in these diverse data types, significantly enhances the accuracy of LAI estimation and demonstrates the feasibility of LAI inversion with multi-source remote sensing data. This approach offers an innovative framework for regional-scale LAI inversion from multi-source remote sensing data, demonstrates a methodology for integrating various remote sensing data, and serves as a reference for low-cost, high-precision regional-scale LAI estimation.
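The modeling step this abstract describes can be sketched with a Random Forest regression of LAI on combined multi-source features. This is a minimal illustration on synthetic data: the feature columns only stand in for the study's actual variables (canopy height metrics, Sentinel-1 backscatter and texture, Sentinel-2 bands and vegetation indices), and the reported accuracies cannot be reproduced from it.

```python
# Hypothetical sketch: Random Forest regression of LAI from combined
# ICESat-2/ATLAS, Sentinel-1, and Sentinel-2 features (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 200
# Six columns stand in for e.g. canopy height, backscatter, NDVI, reflectance.
X = rng.normal(size=(n, 6))
# Synthetic LAI around the abstract's mean of 2.4, driven by two features.
y = 2.4 + 0.3 * X[:, 0] - 0.2 * X[:, 3] + rng.normal(scale=0.1, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

rmse = mean_squared_error(y_te, pred) ** 0.5
print(f"R2={r2_score(y_te, pred):.3f}  RMSE={rmse:.3f}")
```

In the study itself, this model would be fitted per data combination (ATLAS alone, ATLAS + Sentinel-1, ATLAS + Sentinel-2, all three) and the combination with the best R2/RMSE retained.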

23 pages, 665 KiB  
Review
Machine Learning Models and Applications for Early Detection
by Orlando Zapata-Cortes, Martin Darío Arango-Serna, Julian Andres Zapata-Cortes and Jaime Alonso Restrepo-Carmona
Sensors 2024, 24(14), 4678; https://doi.org/10.3390/s24144678 - 18 Jul 2024
Abstract
From the various perspectives of machine learning (ML) and the multiple models used in this discipline, there is an approach aimed at training models for the early detection (ED) of anomalies. The early detection of anomalies is crucial in multiple areas of knowledge, since identifying and classifying them enables early decision making and a better response to mitigate the negative effects caused by late detection in any system. This article presents a literature review examining which machine learning models (MLMs) operate with a focus on ED in a multidisciplinary manner and, specifically, how these models work in the field of fraud detection. A variety of models were found, including Logistic Regression (LR), Support Vector Machines (SVMs), decision trees (DTs), Random Forests (RFs), naive Bayesian classifiers (NB), K-Nearest Neighbors (KNNs), artificial neural networks (ANNs), and Extreme Gradient Boosting (XGB), among others. MLMs were found to operate either as isolated models, categorized in this article as Single Base Models (SBMs), or as Stacking Ensemble Models (SEMs). Under SBM and SEM implementations, MLMs for ED across multiple areas achieved accuracies greater than 80% and 90%, respectively; in fraud detection, the reviewed authors report accuracies greater than 90%. The article concludes that MLMs for ED in multiple applications, including fraud, offer a viable way to identify and classify anomalies robustly, with a high degree of accuracy and precision. MLMs for ED in fraud are useful because they can quickly process large amounts of data to detect and classify suspicious transactions or activities, helping to prevent financial losses.
(This article belongs to the Special Issue AI-Assisted Condition Monitoring and Fault Diagnosis)
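The SBM vs. SEM distinction the review draws can be illustrated with scikit-learn: a single classifier trained in isolation versus a stacking ensemble whose base learners feed a meta-learner. This is a sketch on synthetic data; the model choices are examples from the families the review lists, and the scores are not the review's reported accuracies.

```python
# Illustrative Single Base Model (SBM) vs. Stacking Ensemble Model (SEM).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# SBM: one classifier trained in isolation.
sbm = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# SEM: several base learners whose predictions are combined by a meta-learner.
sem = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
).fit(X_tr, y_tr)

print(f"SBM accuracy: {sbm.score(X_te, y_te):.3f}")
print(f"SEM accuracy: {sem.score(X_te, y_te):.3f}")
```

For fraud detection specifically, the same pattern applies but with heavily imbalanced classes, so precision/recall on the fraud class matters more than raw accuracy.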

20 pages, 3167 KiB  
Article
Modeling and Sustainability Implications of Harsh Driving Events: A Predictive Machine Learning Approach
by Antonis Kostopoulos, Thodoris Garefalakis, Eva Michelaraki, Christos Katrakazas and George Yannis
Sustainability 2024, 16(14), 6151; https://doi.org/10.3390/su16146151 - 18 Jul 2024
Abstract
Human behavior significantly contributes to severe road injuries, underscoring a critical road safety challenge. This study addresses the complex task of predicting dangerous driving behaviors through a comprehensive analysis of over 356,000 trips, enhancing existing knowledge in the field and promoting sustainability and road safety. The research uses advanced machine learning algorithms (e.g., Random Forest, Gradient Boosting, Extreme Gradient Boosting, Multilayer Perceptron, and K-Nearest Neighbors) to categorize driving behaviors into ‘Dangerous’ and ‘Non-Dangerous’. Feature selection techniques are applied to enhance the understanding of influential driving behaviors, while k-means clustering establishes reliable safety thresholds. Findings indicate that Gradient Boosting and Multilayer Perceptron excel, achieving recall rates of approximately 67% to 68% for both harsh acceleration and braking events. This study identifies critical thresholds for harsh events: (a) 48.82 harsh accelerations and (b) 45.40 harsh brakings per 100 km, providing new benchmarks for assessing driving risks. The application of machine learning algorithms, feature selection, and k-means clustering offers a promising approach for improving road safety and reducing socio-economic costs through sustainable practices. By adopting these techniques and the identified thresholds for harsh events, authorities and organizations can develop effective strategies to detect and mitigate dangerous driving behaviors.
(This article belongs to the Collection Emerging Technologies and Sustainable Road Safety)
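The k-means thresholding step this abstract describes can be sketched as clustering harsh-event rates per 100 km into two groups and reading a boundary off the cluster centres. This is a hypothetical illustration on synthetic data: the distributions, the two-cluster choice, and the midpoint rule are assumptions for the sketch, and the resulting threshold is not the study's 48.82 benchmark.

```python
# Hypothetical sketch: deriving a "Dangerous"/"Non-Dangerous" threshold
# from harsh-event rates with k-means (synthetic data).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Harsh accelerations per 100 km: a typical cluster and an aggressive cluster.
rates = np.concatenate([
    rng.normal(15, 5, 300),   # typical drivers
    rng.normal(70, 10, 100),  # aggressive drivers
]).reshape(-1, 1)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(rates)
centers = sorted(km.cluster_centers_.ravel())

# One simple rule: the threshold is the midpoint between the cluster centres.
threshold = sum(centers) / 2
print(f"cluster centres: {centers[0]:.1f}, {centers[1]:.1f}")
print(f"threshold: {threshold:.1f} harsh accelerations per 100 km")
```

Trips above the threshold would then be labeled ‘Dangerous’, giving the classifiers (Gradient Boosting, MLP, etc.) a target to predict.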
