Search Results (5,620)

Search Parameters:
Keywords = long short-term memory

19 pages, 7895 KiB  
Article
A Novel Trajectory Prediction Method Based on CNN, BiLSTM, and Multi-Head Attention Mechanism
by Yue Xu, Quan Pan, Zengfu Wang and Baoquan Hu
Aerospace 2024, 11(10), 822; https://doi.org/10.3390/aerospace11100822 (registering DOI) - 8 Oct 2024
Abstract
A four-dimensional (4D) trajectory is a multi-dimensional time series that embodies rich spatiotemporal features. However, its high complexity and inherent uncertainty pose significant challenges for accurate prediction. In this paper, we present a novel 4D trajectory prediction model that integrates convolutional neural networks (CNNs), bidirectional long short-term memory networks (BiLSTMs), and multi-head attention mechanisms. This model effectively addresses the characteristics of aircraft flight trajectories and the difficulties associated with simultaneously extracting spatiotemporal features using existing prediction methods. Specifically, we leverage the local feature extraction capabilities of CNNs to extract key spatial and temporal features from the original trajectory data, such as geometric shape information and dynamic change patterns. The BiLSTM network is employed to consider both forward and backward temporal orders in the trajectory data, allowing for a more comprehensive capture of long-term dependencies. Furthermore, we introduce a multi-head attention mechanism that enhances the model’s ability to accurately identify key information in the trajectory data while minimizing the interference of redundant information. We validated our approach through experiments conducted on a real ADS-B trajectory dataset. The experimental results demonstrate that the proposed method significantly outperforms comparative approaches in terms of trajectory estimation accuracy. Full article
(This article belongs to the Section Aeronautics)

20 pages, 1853 KiB  
Article
Chinese Named Entity Recognition Based on Multi-Level Representation Learning
by Weijun Li, Jianping Ding, Shixia Liu, Xueyang Liu, Yilei Su and Ziyi Wang
Appl. Sci. 2024, 14(19), 9083; https://doi.org/10.3390/app14199083 (registering DOI) - 8 Oct 2024
Abstract
Named Entity Recognition (NER) is a crucial component of Natural Language Processing (NLP). When dealing with the high diversity and complexity of the Chinese language, existing Chinese NER models face challenges in addressing word sense ambiguity, capturing long-range dependencies, and maintaining robustness, which hinders the accuracy of entity recognition. To this end, a Chinese NER model based on multi-level representation learning is proposed. The model leverages a pre-trained word-based embedding to capture contextual information. A linear layer adjusts dimensions to fit an Extended Long Short-Term Memory (XLSTM) network, enabling the capture of long-range dependencies and contextual information, and providing deeper representations. An adaptive multi-head attention mechanism is proposed to enhance the ability to capture global dependencies and comprehend deep semantic context. Additionally, GlobalPointer with rotational position encoding integrates global information for entity category prediction. Projected Gradient Descent (PGD) is incorporated, introducing perturbations in the embedding layer of the pre-trained model to enhance stability in noisy environments. The proposed model achieves F1-scores of 96.89%, 74.89%, 72.19%, and 80.96% on the Resume, Weibo, CMeEE, and CLUENER2020 datasets, respectively, demonstrating improvements over baseline and comparison models. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

20 pages, 8982 KiB  
Article
Neuropharmacological Assessment of Sulfonamide Derivatives of Para-Aminobenzoic Acid through In Vivo and In Silico Approaches
by Ankit Ganeshpurkar, Ravi Singh, Pratigya Tripathi, Qadir Alam, Sairam Krishnamurthy, Ashok Kumar and Sushil Kumar Singh
Drugs Drug Candidates 2024, 3(4), 674-693; https://doi.org/10.3390/ddc3040038 - 7 Oct 2024
Viewed by 369
Abstract
Background/Objectives: Alzheimer’s disease (AD), a complex neurodegenerative disorder, manifests as dementia and concomitant neuropsychiatric symptoms, including apathy, depression, and circadian disruption. The pathology involves a profound degeneration of the hippocampus and cerebral cortex, leading to the impairment of both short-term and long-term memory. The cholinergic hypothesis, one of the various theories proposed, assumes that the loss of the cholinergic tract contributes to the onset of AD, and treatments based on it have proved clinically effective in managing mild to moderate stages of the disease. This study explores the potential therapeutic efficacy of sulfonamide-based butyrylcholinesterase inhibitors in mitigating scopolamine-induced amnesia in rats. Methods: Behavioral assessments utilizing Y-maze, Barnes maze, and neurochemical assays were conducted to evaluate the effectiveness of the test compounds. Results: Both compounds, at a dose of 20 mg/kg, significantly reduced the impact of scopolamine administration on behavioral tasks. Neurochemical assays corroborated these findings. In silico docking analysis on rat butyrylcholinesterase (BChE) was performed to elucidate the binding mode of the compounds. Subsequent molecular dynamics studies unveiled the formation of stable complexes between the test compounds and rat BChE. Conclusions: These findings contribute valuable insights into the potential therapeutic role of sulfonamide-based butyrylcholinesterase inhibitors in addressing memory deficits associated with AD, emphasizing their in silico molecular interactions and stability. Full article
(This article belongs to the Section Preclinical Research)

28 pages, 13775 KiB  
Article
Elderly Fall Detection in Complex Environment Based on Improved YOLOv5s and LSTM
by Thioanh Bui, Juncheng Liu, Jingyu Cao, Geng Wei and Qian Zeng
Appl. Sci. 2024, 14(19), 9028; https://doi.org/10.3390/app14199028 - 6 Oct 2024
Viewed by 381
Abstract
This work was conducted mainly to provide a healthy and safe monitoring system for the elderly living in the home environment. In this paper, two different target fall detection schemes are proposed based on whether the target is visible or not. When the target is visible, a vision-based fall detection algorithm is proposed, where an image of the target captured by a camera is transmitted to the improved You Only Look Once version 5s (YOLOv5s) model for posture detection. In contrast, when the target is invisible, a WiFi-based fall detection algorithm is proposed, where channel state information (CSI) signals are used to estimate the target’s posture with an improved long short-term memory (LSTM) model. In the improved YOLOv5s model, adaptive picture scaling technology named Letterbox is used to maintain consistency in the aspect ratio of images in the dataset, and the weighted bidirectional feature pyramid (BiFPN) and the attention mechanisms of squeeze-and-excitation (SE) and coordinate attention (CA) modules are added to the Backbone network and Neck network, respectively. In the improved LSTM model, the Hampel filter is used to eliminate the noise from CSI signals and the convolutional neural network (CNN) model is combined with the LSTM to process the image made from CSI signals, and thus the object of the improved LSTM model at a point in time is the analysis of the amplitude of 90 CSI signals. The final monitoring result of the health status of the target is the result of combining the fall detection of the improved YOLOv5s and LSTM models with the physiological information of the target. 
Experimental results show the following: (1) the detection precision, recall rate, and average precision of the improved YOLOv5s model are increased by 7.2%, 9%, and 7.6%, respectively, compared with the original model, and there is almost no missed detection of the target; (2) the detection accuracy of the improved LSTM model is improved by 15.61%, 29.36%, and 52.39% compared with the original LSTM, CNN, and neural network (NN) models, respectively, while the convergence speed is improved by 90% compared with the original LSTM model; and (3) the proposed algorithm can meet the requirements of accurate, real-time, and stable applications of health monitoring. Full article

18 pages, 4420 KiB  
Article
Machine Learning Approach for Arabic Handwritten Recognition
by A. M. Mutawa, Mohammad Y. Allaho and Monirah Al-Hajeri
Appl. Sci. 2024, 14(19), 9020; https://doi.org/10.3390/app14199020 - 6 Oct 2024
Viewed by 342
Abstract
Text recognition is an important area of the pattern recognition field. Natural language processing (NLP) and pattern recognition have been utilized efficiently in script recognition. Much research has been conducted on handwritten script recognition. However, the research on the Arabic language for handwritten text recognition received little attention compared with other languages. Therefore, it is crucial to develop a new model that can recognize Arabic handwritten text. Most of the existing models used to acknowledge Arabic text are based on traditional machine learning techniques. Therefore, we implemented a new model using deep machine learning techniques by integrating two deep neural networks. In the new model, the architecture of the Residual Network (ResNet) model is used to extract features from raw images. Then, the Bidirectional Long Short-Term Memory (BiLSTM) and connectionist temporal classification (CTC) are used for sequence modeling. Our system improved the recognition rate of Arabic handwritten text compared to other models of a similar type with a character error rate of 13.2% and word error rate of 27.31%. In conclusion, the domain of Arabic handwritten recognition is advancing swiftly with the use of sophisticated deep learning methods. Full article
(This article belongs to the Special Issue Applied Intelligence in Natural Language Processing)

17 pages, 4904 KiB  
Article
Development of a Digital Twin Driven by a Deep Learning Model for Fault Diagnosis of Electro-Hydrostatic Actuators
by Roman Rodriguez-Aguilar, Jose-Antonio Marmolejo-Saucedo and Utku Köse
Mathematics 2024, 12(19), 3124; https://doi.org/10.3390/math12193124 (registering DOI) - 6 Oct 2024
Viewed by 308
Abstract
The first quarter of the 21st century has witnessed many technological innovations in various sectors. Likewise, the COVID-19 pandemic triggered the acceleration of digital transformation in organizations driven by artificial intelligence and communication technologies in Industry 4.0 and Industry 5.0. Aiming at the construction of digital twins, virtual representations of a physical system allow real-time bidirectional communication. This will allow the monitoring of operations, identification of possible failures, and decision making based on technical evidence. In this study, a fault diagnosis solution is proposed, based on the construction of a digital twin, for a cloud-based Industrial Internet of Things (IIoT) system contemplating the control of electro-hydrostatic actuators (EHAs). The system was supported by a deep learning model using Long Short-Term Memory (LSTM) networks for an effective diagnostic approach. The implemented study considers data preparation and integration and system development and application to evaluate the performance against the fault diagnosis problem. According to the results obtained, positive results are shown in the construction of the digital twin using a deep learning model for the fault diagnosis problem of an active EHA-IIoT configuration. Full article

16 pages, 3004 KiB  
Article
Time Series Prediction of Gas Emission in Coal Mining Face Based on Optimized Variational Mode Decomposition and SSA-LSTM
by Jingzhao Zhang, Yuxin Cui, Zhenguo Yan, Yuxin Huang, Chenyu Zhang, Jinlong Zhang, Jiantao Guo and Fei Zhao
Sensors 2024, 24(19), 6454; https://doi.org/10.3390/s24196454 - 6 Oct 2024
Viewed by 332
Abstract
The accurate prediction of gas emissions has important guiding significance for the prevention and control of gas disasters. To further improve the prediction accuracy of gas emissions in the mining face, a GA-VMD-SSA-LSTM gas emission prediction model (GVSL) based on genetic algorithm (GA)-optimized variational mode decomposition (VMD) and sparrow search algorithm (SSA)-optimized long short-term memory (LSTM) is proposed, using the absolute gas emission monitoring data of the 1417 working face in a coal mine in Shaanxi Province. Firstly, a VMD evaluation standard for evaluating the amount of decomposition loss is proposed. Under this standard, the GA is used to find the optimal parameters of the VMD. Then, the SSA is used to optimize the key parameters of the LSTM to establish a GVSL prediction model. The model predicts each component and finally superimposes the prediction results for each component to obtain the final gas emission result. The evaluation indexes of the GVSL model, the VMD-LSTM model, the SSA-LSTM model, and the Gaussian process regression (GPR) model are compared and analyzed horizontally and vertically under three scenarios with prediction sets of 121, 94, and 57 groups. The GVSL model has the best prediction effect, with goodness-of-fit R² values of 0.95, 0.96, and 0.99, which confirms the effectiveness of the proposed GVSL model for the time series prediction of gas emission in the mining face. Full article

23 pages, 1456 KiB  
Article
Enhancing Photovoltaic Power Predictions with Deep Physical Chain Model
by Sebastián Dormido-Canto, Joaquín Rohland, Matías López, Gonzalo Garcia, Ernesto Fabregas and Gonzalo Farias
Algorithms 2024, 17(10), 445; https://doi.org/10.3390/a17100445 - 5 Oct 2024
Viewed by 483
Abstract
Predicting solar power generation is a complex challenge with multiple issues, such as data quality and choice of methods, which are crucial to effectively integrate solar power into power grids and manage photovoltaic plants. This study creates a hybrid methodology to improve the accuracy of short-term power prediction forecasts using a model called Transformer Bi-LSTM (Bidirectional Long Short-Term Memory). This model, which combines elements from the transformer architecture and bidirectional LSTM (Long–Short-Term Memory), is evaluated using two strategies: the first strategy makes a direct prediction using meteorological data, while the second employs a chain of deep learning models based on transfer learning, thus simulating the traditional physical chain model. The proposed approach improves performance and allows you to incorporate physical models to refine forecasts. The results outperform existing methods on metrics such as mean absolute error, specifically by around 24%, which could positively impact power grid operation and solar adoption. Full article
(This article belongs to the Special Issue Artificial Intelligence for More Efficient Renewable Energy Systems)

15 pages, 856 KiB  
Article
DAFE-MSGAT: Dual-Attention Feature Extraction and Multi-Scale Graph Attention Network for Polyphonic Piano Transcription
by Rui Cao, Zushuang Liang, Zheng Yan and Bing Liu
Electronics 2024, 13(19), 3939; https://doi.org/10.3390/electronics13193939 - 5 Oct 2024
Viewed by 416
Abstract
Automatic music transcription (AMT) aims to convert raw audio signals into symbolic music. This is a highly challenging task in the fields of signal processing and artificial intelligence, and it holds significant application value in music information retrieval (MIR). Existing methods based on convolutional neural networks (CNNs) often fall short in capturing the time-frequency characteristics of audio signals and tend to overlook the interdependencies between notes when processing polyphonic piano with multiple simultaneous notes. To address these issues, we propose a dual attention feature extraction and multi-scale graph attention network (DAFE-MSGAT). Specifically, we design a dual attention feature extraction module (DAFE) to enhance the frequency and time-domain features of the audio signal, and we utilize a long short-term memory network (LSTM) to capture the temporal features within the audio signal. We introduce a multi-scale graph attention network (MSGAT), which leverages the various implicit relationships between notes to enhance the interaction between different notes. Experimental results demonstrate that our model achieves high accuracy in detecting the onset and offset of notes on public datasets. In both frame-level and note-level metrics, DAFE-MSGAT achieves performance comparable to the state-of-the-art methods, showcasing exceptional transcription capabilities. Full article
(This article belongs to the Section Artificial Intelligence)

17 pages, 4739 KiB  
Article
Deep Learning-Enabled Dynamic Model for Nutrient Status Detection of Aquaponically Grown Plants
by Mohamed Farag Taha, Hanping Mao, Samar Mousa, Lei Zhou, Yafei Wang, Gamal Elmasry, Salim Al-Rejaie, Abdallah Elshawadfy Elwakeel, Yazhou Wei and Zhengjun Qiu
Agronomy 2024, 14(10), 2290; https://doi.org/10.3390/agronomy14102290 - 5 Oct 2024
Viewed by 373
Abstract
Developing models to assess the nutrient status of plants at various growth stages is challenging due to the dynamic nature of plant development. Hence, this study encoded spatiotemporal information of plants within a single time-series model to precisely assess the nutrient status of aquaponically cultivated lettuce. In particular, the long short-term memory (LSTM) and deep autoencoder (DAE) approaches were combined to classify aquaponically grown lettuce plants according to their nutrient status. The proposed approach was validated using extensive sequential hyperspectral reflectance measurements acquired from lettuce leaves at different growth stages across the growing season. A DAE was used to extract distinct features from each sequential spectral dataset time step. These features were used as input to an LSTM model to classify lettuce grown across a gradient of nutrient levels. The results demonstrated that the LSTM outperformed the convolutional neural network (CNN) and multi-class support vector machine (MCSVM) approaches. Also, features selected by the DAE showed better performance compared to features extracted using both genetic algorithms (GAs) and sequential forward selection (SFS). The hybridization of deep autoencoder and long short-term memory (DAE-LSTM) obtained the highest overall classification accuracy of 94%. The suggested methodology presents a pathway to automating the process of nutrient status diagnosis throughout the entire plant life cycle, with the LSTM technique poised to assume a pivotal role in forthcoming time-series analyses for precision agriculture. Full article
(This article belongs to the Special Issue The Use of NIR Spectroscopy in Smart Agriculture)

23 pages, 26337 KiB  
Article
High Stability Control of a Magnetic Suspension Flywheel Based on SA-BPNN and CNN+LSTM+ATTENTION
by Weiyu Zhang and Haotian Ji
Machines 2024, 12(10), 710; https://doi.org/10.3390/machines12100710 - 5 Oct 2024
Viewed by 239
Abstract
Compared to traditional, static-based flywheel systems, vehicle-mounted magnetic suspension flywheels face more complex operating conditions, and existing control strategies usually regard disturbances in vehicles under different operating conditions to be the same problem. Therefore, it is necessary to determine the interference from complex operating conditions and reasonably distinguish among them under different operating conditions to provide flywheel systems with strong stability (the rotor offset was less than 0.025 mm). Thus, this paper proposes a high-stability control strategy for flywheels based on the classification of vehicle-driving conditions and designs its control strategy by taking the vehicle-mounted magnetic suspension flywheel with a virtual inertia spindle as an example. First, according to the different vehicle working conditions and the varying interference intensities affecting the flywheel system, the working mode is divided into four modes. Considering the obvious differences in each working mode, it is proposed to use BP neural network optimization based on the simulated annealing algorithm (SA-BPNN) to determine the flywheel’s working condition. A relatively simple neural network can improve the response speed of the whole system. It also has a good effect. Secondly, it is proposed to use deep learning models based on convolutional neural networks, long short-term memory networks and attention mechanisms (CNN+LSTM+ATTENTION) to train the corresponding control parameters under each working condition to judge and predict the control parameters under different working conditions. Three evaluation parameters are used to evaluate the training results, and all achieved good results. Finally, the classification of working conditions and performance tests are carried out. The experimental results show the effectiveness and superiority of the proposed control strategy. Full article
(This article belongs to the Special Issue Magnetic Bearing Related Technology and Its Equipment Fields)

14 pages, 1739 KiB  
Article
Older Adult Fall Risk Prediction with Deep Learning and Timed Up and Go (TUG) Test Data
by Josu Maiora, Chloe Rezola-Pardo, Guillermo García, Begoña Sanz and Manuel Graña
Bioengineering 2024, 11(10), 1000; https://doi.org/10.3390/bioengineering11101000 - 5 Oct 2024
Viewed by 318
Abstract
Falls are a major health hazard for older adults; therefore, in the context of an aging population, predicting the risk of a patient suffering falls in the near future is of great impact for health care systems. Currently, the standard prospective fall risk assessment instrument relies on a set of clinical and functional mobility assessment tools, one of them being the Timed Up and Go (TUG) test. Recently, wearable inertial measurement units (IMUs) have been proposed to capture motion data that would allow for the building of estimates of fall risk. The hypothesis of this study is that the data gathered from IMU readings while the patient is performing the TUG test can be used to build a predictive model that would provide an estimate of the probability of suffering a fall in the near future, i.e., assessing prospective fall risk. This study applies deep learning convolutional neural networks (CNN) and recurrent neural networks (RNN) to build such predictive models based on features extracted from IMU data acquired during TUG test realizations. Data were obtained from a cohort of 106 older adults wearing wireless IMU sensors with sampling frequencies of 100 Hz while performing the TUG test. The dependent variable is a binary variable that is true if the patient suffered a fall in the six-month follow-up period. This variable was used as the output variable for the supervised training and validations of the deep learning architectures and competing machine learning approaches. A hold-out validation process using 75 subjects for training and 31 subjects for testing was repeated one hundred times to obtain robust estimations of model performances. At each repetition, 5-fold cross-validation was carried out to select the best model over the training subset. Best results were achieved by a bidirectional long short-term memory (BLSTM), obtaining an accuracy of 0.83 and AUC of 0.73 with good sensitivity and specificity values. Full article

14 pages, 8341 KiB  
Article
Detecting Urban Traffic Anomalies Using Traffic-Monitoring Data
by Yunkun Mao, Yilin Shi and Binbin Lu
ISPRS Int. J. Geo-Inf. 2024, 13(10), 351; https://doi.org/10.3390/ijgi13100351 - 4 Oct 2024
Viewed by 594
Abstract
Traffic anomaly detection is crucial for urban management, yet current research is often confined to small-scale endeavors. This study collected 9 months of real-time Wuhan traffic-monitoring data from Amap. We propose Traffic-ConvLSTM, a multi-scale spatial-temporal technique based on long short-term memory (LSTM) networks and convolutional neural networks (CNNs) to effectively achieve long-term anomaly detection at the city level. First, we converted traffic track points into an image representation, which enables the spatial correlations between traffic flow, roads, and the surrounding environment to be captured. Second, the model utilizes convolution kernels of different sizes to extract spatial features at road-, regional-, and city-level scales while incorporating the temporal features of different time steps to capture hourly, daily, and weekly dynamics. Additionally, varying weights are assigned to the convolution kernels and temporal features of varying spatio-temporal scales to capture the heterogeneous strengths of spatio-temporal correlations within patterns of traffic anomalies. The proposed Traffic-ConvLSTM model exhibits improved performance over existing techniques in the task of identifying long-term and large-scale traffic anomaly occurrences. Furthermore, the analysis reveals significant traffic anomalies during holidays and urban sporting events. The diverse travel patterns observed in response to various activities offer insights for large-scale urban traffic anomaly management, providing recommendations for city-level traffic-control strategies. Full article
(This article belongs to the Special Issue Advances in AI-Driven Geospatial Analysis and Data Generation)

23 pages, 12985 KiB  
Article
Discrete Time Series Forecasting of Hive Weight, In-Hive Temperature, and Hive Entrance Traffic in Non-Invasive Monitoring of Managed Honey Bee Colonies: Part I
by Vladimir A. Kulyukin, Daniel Coster, Aleksey V. Kulyukin, William Meikle and Milagra Weiss
Sensors 2024, 24(19), 6433; https://doi.org/10.3390/s24196433 - 4 Oct 2024
Viewed by 472
Abstract
From June to October, 2022, we recorded the weight, the internal temperature, and the hive entrance video traffic of ten managed honey bee (Apis mellifera) colonies at a research apiary of the Carl Hayden Bee Research Center in Tucson, AZ, USA. The weight and temperature were recorded every five minutes around the clock. The 30 s videos were recorded every five minutes daily from 7:00 to 20:55. We curated the collected data into a dataset of 758,703 records (280,760–weight; 322,570–temperature; 155,373–video). A principal objective of Part I of our investigation was to use the curated dataset to investigate the discrete univariate time series forecasting of hive weight, in-hive temperature, and hive entrance traffic with shallow artificial, convolutional, and long short-term memory networks and to compare their predictive performance with traditional autoregressive integrated moving average models. We trained and tested all models with a 70/30 train/test split. We varied the intake and the predicted horizon of each model from 6 to 24 hourly means. Each artificial, convolutional, and long short-term memory network was trained for 500 epochs. We evaluated 24,840 trained models on the test data with the mean squared error. The autoregressive integrated moving average models performed on par with their machine learning counterparts, and all model types were able to predict falling, rising, and unchanging trends over all predicted horizons. We made the curated dataset public for replication. Full article
(This article belongs to the Special Issue Smart Decision Systems for Digital Farming: 2nd Edition)
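The forecasting setup in the abstract above (an intake of past hourly means, a predicted horizon of future hourly means, a 70/30 train/test split, and mean squared error on the test data) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the synthetic series, and the choice to split the windows chronologically are all hypothetical, since the abstract does not say how the split was made.

```python
def make_windows(series, intake, horizon):
    """Pair each run of `intake` values with the `horizon` values that follow it."""
    pairs = []
    for start in range(len(series) - intake - horizon + 1):
        past = series[start:start + intake]
        future = series[start + intake:start + intake + horizon]
        pairs.append((past, future))
    return pairs


def chronological_split(pairs, train_fraction=0.7):
    """Keep the earliest windows for training and the rest for testing (assumed)."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-in for a series of hourly means (not real hive data).
hourly_means = [float(h % 24) for h in range(100)]
pairs = make_windows(hourly_means, intake=6, horizon=6)
train, test = chronological_split(pairs)
```

Note that adjacent windows overlap, so windows on either side of the cut share some raw values; a stricter variant would split the raw series first and build windows within each part.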

12 pages, 4323 KiB  
Article
Threshold-Based Combination of Ideal Binary Mask and Ideal Ratio Mask for Single-Channel Speech Separation
by Peng Chen, Binh Thien Nguyen, Kenta Iwai and Takanobu Nishiura
Information 2024, 15(10), 608; https://doi.org/10.3390/info15100608 - 4 Oct 2024
Viewed by 305
Abstract
An effective approach to addressing the speech separation problem is utilizing a time–frequency (T-F) mask. The ideal binary mask (IBM) and ideal ratio mask (IRM) have long been widely used to separate speech signals. However, the IBM is better at improving speech intelligibility, while the IRM is better at improving speech quality. To leverage their respective strengths and overcome weaknesses, we propose an ideal threshold-based mask (ITM) to combine these two masks. By adjusting two thresholds, these two masks are combined to jointly act on speech separation. We list the impact of using different threshold combinations on speech separation performance under ideal conditions and discuss a reasonable range for fine tuning the thresholds. By using masks as a training target, to evaluate the effectiveness of the proposed method, we conducted supervised speech separation experiments applying a deep neural network (DNN) and long short-term memory (LSTM), the results of which were measured by three objective indicators: the signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio improvement (SAR). Experimental results show that the proposed mask combines the strengths of the IBM and IRM and implies that the accuracy of speech separation can potentially be further improved by effectively leveraging the advantages of different masks. Full article
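For readers unfamiliar with the two masks named in the abstract above, the sketch below computes them from the per-bin power of clean speech and noise using their commonly cited textbook definitions. It is an illustration only and does not reproduce the paper's threshold-based mask (ITM), whose combination rule the abstract does not give. The function names, the local criterion `lc_db`, and the exponent `beta` are assumptions chosen for this example.

```python
def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """1 where the local SNR exceeds the criterion `lc_db` (in dB), else 0."""
    ratio_threshold = 10.0 ** (lc_db / 10.0)  # dB criterion as a power ratio
    return [1 if s > n * ratio_threshold else 0
            for s, n in zip(speech_power, noise_power)]


def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Soft mask (S / (S + N)) ** beta per bin; 0.0 where both powers are zero."""
    mask = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        mask.append((s / total) ** beta if total > 0 else 0.0)
    return mask


# Three hypothetical time-frequency bins (power values, not real audio).
speech = [4.0, 1.0, 0.0]
noise = [1.0, 4.0, 1.0]
ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
```

The binary mask makes a hard keep-or-discard decision per bin, while the ratio mask scales each bin smoothly between 0 and 1, which is the contrast the abstract draws between the two.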