Article

Data-Driven Machine Fault Diagnosis of Multisensor Vibration Data Using Synchrosqueezed Transform and Time-Frequency Image Recognition with Convolutional Neural Network

Faculty of Automatic Control, Robotics and Electrical Engineering, Poznan University of Technology, 60-965 Poznań, Poland
Submission received: 22 May 2024 / Revised: 13 June 2024 / Accepted: 18 June 2024 / Published: 20 June 2024
(This article belongs to the Special Issue Machine Learning and Deep Learning Based Pattern Recognition)

Abstract
Accurate vibration classification using inertial measurement unit (IMU) data is critical for various applications such as condition monitoring and fault diagnosis. This study proposes a novel convolutional neural network (CNN) based approach, the IMU6DoF-SST-CNN in six variants, for robust vibration classification. The method utilizes the Fourier synchrosqueezed transform (FSST) and wavelet synchrosqueezed transform (WSST) for time-frequency analysis, effectively capturing the temporal and spectral characteristics of the vibration data. Additionally, the IMU6DoF-SST-CNN was used to explore three different fusion strategies for sensor data to combine information from the IMU’s multiple axes, allowing the CNN to learn from complementary information across the various axes. The efficacy of the proposed method was validated using three datasets. The first dataset consisted of constant fan velocity data (three classes: idle, normal operation, and fault) sampled at 200 Hz. The second dataset contained variable fan velocity data (also with three classes: normal operation, fault 1, and fault 2) sampled at 2000 Hz. Finally, a third dataset from Case Western Reserve University (CWRU) comprised bearing fault data with thirteen classes, sampled at 12 kHz. The proposed method achieved a perfect validation accuracy for the investigated vibration classification task. While all variants of the method achieved high accuracy, a trade-off between training speed and image generation efficiency was observed. Furthermore, FSST demonstrated superior localization capabilities compared to traditional methods like the continuous wavelet transform (CWT) and short-time Fourier transform (STFT), as confirmed by image representations and interpretability analysis. This improved localization allows the CNN to effectively capture transient features associated with faults, leading to more accurate vibration classification. Overall, this study presents a promising and efficient approach for vibration classification using IMU data with the proposed IMU6DoF-SST-CNN method. The best result was obtained for IMU6DoF-SST-CNN with FSST and sensor-type fusion.

1. Introduction

The ever-growing complexity of electromechanical machinery, from bustling factories to sprawling cities and even our own homes, underscores the critical need for effective maintenance practices. These systems are the backbone of our modern world, and ensuring their long lifespan while minimizing waste hinges on proper upkeep. Industrial machinery, in particular, presents a distinct challenge due to its intricate design and operation. To prevent costly disruptions and equipment damage, proactive strategies for identifying faults are essential. Ultimately, such strategies can lead to significant financial savings and environmental advantages. Machine fault diagnosis plays a central role in safeguarding the reliability and operational longevity of industrial machinery.
Vibration analysis, a well-established technique, is highly sensitive to subtle changes in machine condition, making it a popular choice for detecting faults in rotating machinery. Recent advancements have seen deep learning techniques, particularly convolutional neural networks (CNNs), deliver promising results in automating these fault diagnostic processes. The field of fault diagnosis is undergoing continuous evolution, fuelled by the growth of data sharing facilitated by the Internet of Things (IoT) and the ever-expanding capabilities of machine learning. This research delves into the potential of image-based diagnostics, derived from sensor data and processed by CNNs, to achieve robust and interpretable fault detection in industrial machinery. It proposes a novel approach that leverages the synchrosqueezed transform (SST) for improved time-frequency analysis of vibration data, followed by CNNs for robust fault classification. By combining the SST’s ability to localize frequency content with the powerful feature learning capabilities of CNNs, this work aims to achieve accurate and efficient machine fault diagnosis using multisensor vibration data.
Accurately pinpointing faults within electromechanical machinery hinges on the strategic selection of sensors and the corresponding signals they capture. This choice is ultimately driven by the specific machine design and the particular fault signatures the diagnostic system aims to identify. Vibration analysis, a widely adopted technique due to its high sensitivity to various fault types [1,2,3,4,5], remains a cornerstone approach. It is complemented by other mechanical sensors measuring displacement [6], torque [7,8], and angular velocity/position [9,10], providing a comprehensive picture of the machine’s mechanical health. Electrical measurements, including current [11,12] and voltage [13,14], offer valuable insights into power delivery and motor health issues. Beyond these traditional approaches, additional signals like temperature (internal/external) [15,16] and sound [17,18,19] can be particularly useful for detecting specific fault types. Recent research delves even further, exploring the potential of image-based diagnostics using cameras [20,21,22,23] and the conversion of other signals into virtual images [12,24,25,26,27]. This adaptability in sensor selection empowers a holistic approach to machine health monitoring and fault detection by leveraging a diverse range of data sources.
This study builds upon a series of successful investigations into leveraging CNNs for machine fault diagnosis using vibration analysis. In earlier work, Łuczak et al. [12] demonstrated the effectiveness of CNNs for fault detection and localization in a three-phase inverter system using phase current data. Furthermore, Łuczak et al. [28] explored the application of CNNs in conjunction with the continuous wavelet transform (CWT) and time-frequency image recognition for machine fault diagnosis. This research established the potential of CNNs for accurately classifying faults based on the visual representation of vibration data in the time-frequency domain. However, limitations associated with CWT, such as the trade-off between time and frequency resolution, motivated the author to explore alternative approaches. The current study investigated the application of the synchrosqueezed transform, a technique offering improved time-frequency localization compared to CWT. The author hypothesized that by utilizing SST for time-frequency analysis and combining it with the powerful feature learning capabilities of CNNs, even more accurate and robust machine fault diagnosis could be achieved using multisensor vibration data.
Previous research by the author on fault diagnosis using time-frequency image recognition has shown that the short-time Fourier transform (STFT) results in blurred images compared to the continuous wavelet transform (CWT) with a complex Morlet wavelet [28]. While CWT offers increased image sharpness, it also leads to significantly higher computational demands. This has motivated the development of alternative methods that offer better image sharpness than STFT with lower computational requirements than CWT. Synchrosqueezing methods, such as Fourier synchrosqueezed transform (FSST) and wavelet synchrosqueezed transform (WSST), appear promising due to their ability to achieve both better sharpness and lower computational cost, particularly for FSST, making them suitable for fault diagnostic applications. This research investigated two specific synchrosqueezing methods, FSST and WSST. Additionally, it addressed the second gap identified in the previous research: how to effectively organize multiple time-frequency images obtained from multiple sensors like multi-axis IMU (inertial measurement unit) data. While the previous study utilized sensor-type based fusion, where each axis (X, Y, and Z) corresponded to a colour channel in an RGB (red, green, blue) image, this research explored different fusion techniques and their impact on fault diagnosis training and accuracy.
This article prioritized the exploration of innovative image fusion techniques for effective time-frequency data representation within a CNN framework for fault diagnosis using multiaxis IMU sensor data. Previous research, particularly a study on six-switch and three-phase (6S3P) topology inverter faults [12], demonstrated the effectiveness of converting phase currents into RGB images for fault classification. This approach achieved superior results compared to traditional machine learning methods such as decision trees, naive Bayes, support vector machines (SVMs), k-nearest neighbours (KNN) and even simpler neural networks.
In the 6S3P inverter fault analysis [12], each channel of the RGB image represented a different phase of the inverter current. While this approach forms a strong foundation for the current work, a significant challenge arises when dealing with the multi-dimensional data from a 6-DoF (six degrees of freedom) IMU sensor. Unlike single-dimensional currents, data from multiple axes (accelerometer, gyroscope) necessitate a well-defined conversion strategy to the time-frequency domain for effective image representation. The existing literature acknowledges a knowledge gap in relation to the optimal conversion of multiaxis IMU sensor data into a time-frequency image format suitable for CNN-based fault classification. This work addresses this gap by focusing on the efficacy of the proposed time-frequency image fusion methods within the CNN framework. This approach highlights the significant role of data representation in improving fault diagnosis and opens the door to further exploration of CNN architectures specifically tailored for this application. While this focused approach acknowledges the need for a more comprehensive study comparing various machine learning methods, the IMU6DoF-SST-CNN variants address a critical gap by proposing a novel approach for converting the time-frequency domain of 6-DoF IMU data into greyscale, RGB by sensor, and RGB by axis alignment images. These formats effectively capture the temporal characteristics of vibration signals across all axes, paving the way for leveraging the power of CNNs for accurate fault classification in scenarios involving complex multidimensional sensor data. The proposed IMU6DoF-SST-CNN method with its variants was evaluated on three different datasets: a constant-velocity fan with an imbalance, a variable-velocity fan with two imbalances, and the Case Western Reserve University (CWRU), Cleveland, Ohio, United States of America (USA), bearing faults dataset [29].
The article [30] discusses a novel fault detection method for chemical processes using convolutional autoencoders (CAEs). It emphasizes the importance of spatial correlations between the measured variables, often neglected in traditional methods. The method in [30] utilizes only time-domain data arranged as a grayscale matrix, where rows represent variables and columns represent time. The current work proposes using data in the time-frequency domain by IMU6DoF-SST-CNN, which was not explored in the referenced article. Time-frequency analysis naturally produces a two-dimensional result interpretable as a greyscale image. For a multi-variable system, multiple time-frequency images are obtained. In the case of an inertial measurement unit (IMU) with six degrees of freedom (IMU6DoF), SST yields six such images. The literature lacks investigation into how to order these images. Therefore, Section 2.1, Section 2.2 and Section 2.3 explore different arrangement methods: side-by-side grids, RGB ordered by sensor type, and RGB ordered by sensor axis. Combining these three arrangements (grid, RGB by sensor type and RGB by sensor axis) with two SST methods (FSST and WSST) leads to the six variants investigated in Section 4.1, Section 4.2, Section 4.3 and Section 4.4. Section 5 discusses the strengths and limitations of each variant in terms of accuracy, training time, and execution time. Compared to a side-by-side grid arrangement, arranging images by the sensor axis or sensor type yields better training accuracy in fewer training iterations for both FSST and WSST. The execution time of the proposed IMU6DoF-SST-CNN is faster than the previously studied CWTx6-CNN but slower than the STFTx6-CNN; however, it is worth underlining that the time-frequency image obtained with the proposed method is sharper (see Section 4).
The current work in Section 2, Section 3 and Section 4 focuses on a single fault type (fan blade imbalance) where the visual differences between normal and faulty signals might seem significant, which allows for fast verification of the proposed method with fewer computational resources. However, this apparent distinction can be misleading. The fault diagnosis task presents challenges beyond apparent visual differences. Although the vibration signals of the first dataset exhibit a clear visual distinction under certain conditions, human interpretation of such data can be subjective. This subjectivity can lead to the overlooking of subtle features or misinterpretations that might be crucial for accurate fault classification. Addressing these challenges and demonstrating the potential of the proposed CNN-based approach for robust fault classification is the goal of this paper. Data beyond the single fault type are presented in Section 5. The efficacy of the proposed method was validated using three datasets: one with a constant fan velocity at 200 Hz, another with variable fan velocity at 2000 Hz, and a third containing bearing fault data at 12 kHz. The dataset with variable velocity and two faults introduces a new dimension of complexity by incorporating velocity variations alongside vibration signals. It also includes two distinct fault types, further testing the model’s ability to differentiate between fault conditions beyond the single imbalance case. Finally, the proposed method was tested with the CWRU dataset, which provides a comprehensive and diverse set of vibration data covering 13 different fault types. This significantly increases the variety of fault patterns, enhancing the method’s generalizability and robustness in real-world scenarios with diverse fault possibilities.
This research is motivated by the shortcomings of existing time-frequency representations in the context of fault diagnosis using multi-axis IMU sensor data. While the STFT is a prevalent technique [1], it generates blurred images. The CWT with a complex Morlet wavelet mitigates this issue but incurs significantly higher computational demands [28]. This has spurred the development of alternative methods that offer superior image sharpness compared to STFT while exhibiting lower computational requirements than CWT. Synchrosqueezing methods, such as FSST and WSST, demonstrate promise due to their ability to achieve both improved image clarity and reduced computational cost. This research is particularly motivated by the potential of synchrosqueezing methods for real-time fault diagnostic applications. Furthermore, this research addresses a critical gap identified in prior work: the effective organization of multiple time-frequency images obtained from various sensors, such as multi-axis IMU data.
For comparison, Table 1 provides a detailed breakdown of how the proposed method differs from previous approaches in terms of datasets, faults, sensors, feature extraction, features, fusion techniques, classifiers, and overall methodology. The publicly available CWRU dataset is widely used for benchmarking fault diagnosis algorithms [2,3,31,32]. It contains data collected from a test rig simulating bearing faults in a motor. The proposed IMU6DoF-SST-CNN method was successfully verified on the CWRU bearing fault dataset with thirteen classes, unlike the studies in [2,3,31], which used fewer classes. Verification of IMU6DoF-SST-CNN with thirteen classes offers a more complex and comprehensive challenge compared to the fan demonstrator datasets. In Table 1, the column “Features and time window” presents the durations of the time-domain windows in milliseconds that were used to validate the methods. The proposed method was successfully verified at a time window of 85 ms, which is much shorter than those of the other methods listed in Table 1.
This manuscript is organized as follows. Section 2 introduces the core methodology, the IMU6DoF-SST-CNN approach. Three variations of this method are presented within this section, each exploring different strategies for fusing time-frequency images generated using synchrosqueezed transform from multisensor vibration data (Section 2.1—sensor fusion, Section 2.2—grid fusion, Section 2.3—axis fusion). Section 3 details the implementation of a demonstrator showcasing the machine fault diagnosis process. Section 4 presents the results obtained from applying the different variations of the IMU6DoF-SST-CNN method using both Fourier (FSST) and wavelet (WSST) based synchrosqueezed transform for time-frequency analysis (Section 4.1—FSST sensor fusion, Section 4.2—FSST grid and axis fusion, Section 4.3—WSST). Section 4.4 delves into the interpretability of the proposed methods across all six variants. Section 5 provides an evaluation of the proposed method in the variable velocity dataset and the CWRU dataset with a discussion of the findings, and Section 6 concludes the manuscript by summarizing the key contributions and outlining potential future directions.

2. Machine Fault Diagnosis through Multisensor Vibration Data Analysis Using Synchrosqueezed Transform and Time-Frequency Image Recognition with Convolutional Neural Network

The synchrosqueezed transform (SST), introduced in [35], is a powerful time-frequency analysis technique that aims to improve the sharpness of both the time and frequency domains compared to traditional methods, while remaining invertible. It enhances the localization of frequency content in the time-frequency domain compared to traditional methods like the short-time Fourier transform (STFT) [1] or continuous wavelet transform (CWT) [28,36]. CWT-based SST was presented in [37] by Ingrid Daubechies et al.; this SST builds upon the concept of wavelets and is named the wavelet-based synchrosqueezing transform (WSST). Subsequent research extended synchrosqueezing to the Fourier domain: leveraging the STFT and inspired by concepts from [37], STFT-based synchrosqueezing was presented in [38,39], and the Fourier-based synchrosqueezing transform (FSST) was developed in [39]. Both WSST and FSST aim to localize the frequency content and the time evolution of a signal. WSST achieves this by utilizing a mother wavelet function and adapting its scale based on the instantaneous frequency of the signal [40].
As mentioned earlier, the synchrosqueezed transform (SST) comes in two primary methods: wavelet-based (WSST) and Fourier-based (FSST). At its core, WSST leverages the concept of wavelets, mathematical functions that offer localized analysis in both the time and frequency domains. WSST employs a mother wavelet function, and similar to CWT [28,36], scales and translates this function to match different frequency components within the signal. However, unlike CWT, WSST dynamically adjusts the scale of the wavelet function based on the instantaneous frequency of the signal component being analysed. The instantaneous frequencies from the CWT output can be extracted using a phase transform that is proportional to the first derivative of the CWT output with respect to the time-shift. This allows WSST to achieve a sharper concentration of frequency content in the time-frequency domain compared to CWT. The resulting time-frequency representation provides a clearer visualization of how the frequency content of the signal evolves over time. This improved localization of both time and frequency information makes WSST particularly well-suited for analysing non-stationary signals like the vibration data often encountered in machine fault diagnosis, as explored in the following section.
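For illustration, both transforms are available as built-in functions in MATLAB, the environment used in this study (see Section 5). The following minimal sketch computes the FSST and WSST of a single IMU axis; the signal is a placeholder, and the Signal Processing and Wavelet Toolboxes are assumed to be available.

```matlab
% Minimal sketch: FSST and WSST of a single IMU axis buffer.
% "x" stands in for one 128-sample vibration signal at fs = 200 Hz,
% matching the constant-velocity fan dataset described in Section 3.
fs = 200;                       % sampling frequency in Hz
x  = randn(128, 1);             % placeholder for one accelerometer axis

% Fourier synchrosqueezed transform (Signal Processing Toolbox)
[sF, fF, tF] = fsst(x, fs);     % sF: time-frequency matrix; fF in Hz; tF in s

% Wavelet synchrosqueezed transform (Wavelet Toolbox)
[sW, fW] = wsst(x, fs);         % sW: time-frequency matrix; fW in Hz

% The magnitudes serve as greyscale time-frequency images
imgF = abs(sF);
imgW = abs(sW);
```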

2.1. IMU6DoF-SST-CNN Method with Time-Frequency Images Fusion by Sensor

The proposed method for machine fault diagnosis leverages a multisensor fusion approach that incorporates information from an inertial measurement unit (IMU) with 6 degrees of freedom (IMU6DoF), representing vibration data as RGB images generated through the synchrosqueezed transform (SST). The IMU6DoF sensor data capture the vibration characteristics of the machine, while the SST extracts time-frequency features from the vibration signals. These time-frequency features are then converted into RGB images using a colour mapping strategy, where each colour channel represents a specific axis of data. The resulting RGB images effectively encode both temporal and spectral information about the machine’s vibrational state. Subsequently, a convolutional neural network (CNN) is employed to process these RGB images and automatically learn discriminative features for fault classification. The CNN architecture is designed to efficiently extract relevant features from the time-frequency image representations, enabling the model to distinguish between healthy and faulty machine conditions. This data-driven approach offers a promising solution for robust and accurate machine fault diagnosis. The proposed IMU6DoF-SST-CNN method shown in Figure 1 consists of the following steps (a code sketch of the full pipeline follows the list):
  • Sensor data acquisition. The process starts with collecting 128 samples of vibration data from each of the six sensor axes. The IMU6DoF sensor is a type of inertial measurement unit capable of measuring acceleration, gyroscopic rotation, and possibly other environmental factors (depending on the specific sensor model). These data are collected from the machine under various operating conditions (normal, faulty, etc.) to create a robust dataset for analysis.
  • Feature extraction by synchrosqueezed transform (FSST or WSST). The vibration data undergoes analysis using the synchrosqueezed transform. This is a signal processing technique specifically designed for non-stationary signals, where the frequency content changes over time. SST helps decompose the vibration signals into time-frequency components, providing a visual representation of how different frequency elements evolve over time.
  • Time-frequency image generation by fusion of six time-frequency images. The results from the SST are then converted into a visual representation suitable for the CNN used later in the process. This involves transforming the time-frequency data into an RGB image. This conversion assigns each colour channel in the RGB image to a specific axis (X as the red channel, Y as the green channel, and Z as the blue channel).
  • Fault classification with convolutional neural network. The final step utilizes a CNN to analyse the time-frequency images (RGB images) and perform fault classification. CNNs are a type of deep learning architecture particularly well-suited for image recognition tasks. In this case, the CNN is trained to automatically learn discriminative features from the time-frequency image representations. These features help the CNN distinguish between different vibration patterns that might correspond to healthy or faulty operating conditions of the machine.
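As referenced before the list, the following is a minimal MATLAB sketch of the fusion-by-sensor pipeline. The 128 × 6 buffer layout, the resizing of each partial image to 75 × 128 pixels (matching the per-axis image size stated in Section 2.2, via the Image Processing Toolbox), and the side-by-side placement of the accelerometer and gyroscope RGB blocks are assumptions for illustration; only the axis-to-channel mapping (X as red, Y as green, Z as blue) follows the text.

```matlab
% Sketch of fusion by sensor: each axis is mapped to an RGB colour channel.
% "imu" is assumed to be a 128x6 buffer: columns 1-3 accelerometer X/Y/Z,
% columns 4-6 gyroscope X/Y/Z.
fs  = 200;
imu = randn(128, 6);                            % placeholder for one acquired buffer

tfImg = cell(1, 6);
for k = 1:6
    tfImg{k} = abs(fsst(imu(:, k), fs));        % one time-frequency image per axis
    tfImg{k} = tfImg{k} ./ max(tfImg{k}(:));    % normalize magnitudes to [0, 1]
    tfImg{k} = imresize(tfImg{k}, [75 128]);    % assumed per-axis image size
end

accRGB  = cat(3, tfImg{1}, tfImg{2}, tfImg{3}); % accelerometer X/Y/Z as R/G/B
gyroRGB = cat(3, tfImg{4}, tfImg{5}, tfImg{6}); % gyroscope X/Y/Z as R/G/B
rgbImage = [accRGB, gyroRGB];                   % side-by-side placement (assumed layout)
```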

2.2. IMU6DoF-SST-CNN Method with Time-Frequency Images Fusion as Grid

In comparison to the IMU6DoF-SST-CNN method with time-frequency images fused by sensor described in Section 2.1, this method adopts a grid-based fusion approach. Previous research, like STFTx6-CNN [1] and CWTx6-CNN [28], employed sensor alignment and fusion into RGB for combining the data. The conducted research not only investigates modifications to the time-frequency technique but also explores the fusion of partial images. In the fusion-as-a-grid approach, six individual time-frequency images, each representing one axis of the accelerometer and gyroscope, are combined into a single greyscale image grid as shown in Figure 2. The first row of the grid contains the time-frequency representations of the accelerometer’s X, Y, and Z axes. The second row corresponds to the gyroscope’s X, Y, and Z axes transformed into the time-frequency domain using SST. While the fusion as a greyscale grid offers a unified representation, it results in the largest images (384 × 150 pixels) among the methods considered. The final image size is determined by the method used to fuse (combine) the results from each SST of the IMU6DoF data. In the specific case shown in Figure 2, the image has a resolution of 384 × 150 pixels: three SST images of size 128 × 75 pixels are placed side-by-side in the first row, giving a width of 384 pixels (128 × 3), and the second row contains the gyroscope data, bringing the total height to 150 pixels (75 × 2). Additionally, this method loses spatial information and does not group data by sensor type.
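A minimal sketch of this grid arrangement is given below, reusing the per-axis images tfImg{1..6} from the sketch in Section 2.1. Note that MATLAB stores images as rows × columns, so the 384 × 150 (width × height) image corresponds to a 150 × 384 matrix.

```matlab
% Sketch of fusion as a greyscale grid (Figure 2 layout):
% row 1: accelerometer X | Y | Z; row 2: gyroscope X | Y | Z.
gridImage = [tfImg{1}, tfImg{2}, tfImg{3};   % 150 x 384 greyscale grid
             tfImg{4}, tfImg{5}, tfImg{6}];  % (384 x 150 in width x height terms)
```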

2.3. IMU6DoF-SST-CNN Method with Time-Frequency Images Fusion by Axis

Similar to STFTx6-CNN [1] and CWTx6-CNN [28], earlier methods (described in Section 2.1) relied on sensor alignment and fusion into an RGB-like structure for data combination by sensor. This work delves deeper, investigating not only modifications to the time-frequency analysis but also exploring the fusion of the resulting “partial images” (individual time-frequency representations from each sensor axis). Six time-frequency images are generated, each representing a single axis (X, Y, and Z) of both the accelerometer and gyroscope. These individual images are then combined, not by sensor type, but by axis as shown in Figure 3. This creates a unique data structure for processing by the CNN. Compared to other methods, this approach generates larger 384 × 75 × 3 data structures due to the separate processing and fusion of individual axis images. By focusing on individual axes, this method benefits from the spatial relationships between data points. The fusion by axis approach provides an alternative data representation for CNN processing.
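The text specifies the output size (384 × 75 × 3) but not the exact channel assignment, so the sketch below shows one plausible arrangement consistent with that size: the three axes are placed side by side (3 × 128 = 384 columns), with the accelerometer images in the red channel and the gyroscope images in the green channel. This layout is an assumption for illustration, not necessarily the exact arrangement of Figure 3.

```matlab
% Sketch of fusion by axis (one plausible reading of Figure 3).
% Columns group the axes side by side: X | Y | Z.
axisImage = zeros(75, 384, 3);
axisImage(:, :, 1) = [tfImg{1}, tfImg{2}, tfImg{3}];  % accelerometer X | Y | Z (red)
axisImage(:, :, 2) = [tfImg{4}, tfImg{5}, tfImg{6}];  % gyroscope X | Y | Z (green)
```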

3. Demonstrator of Machine Fault Diagnosis

To ensure consistency and facilitate comparisons with previous research, this section utilizes the same demonstrator used in earlier studies [1,28,41]. This demonstrator setup has been well characterized and provides a controlled environment for evaluating the effectiveness of the proposed IMU6DoF-SST-CNN methods with different fusion approaches. The demonstrator shown in Figure 4 was designed to evaluate the feasibility of the proposed IMU6DoF-SST-CNN methods with three fusion options: greyscale grid, RGB by axis, and RGB by type for image-based recognition using IMU data. The functionality of the proposed concept was experimentally validated using a Yate Loon Electronics (Taiwan) fan, model GP-D12SH-12(F), operating at a nominal 12 V DC and 0.3 A. To facilitate a rapid proof-of-concept evaluation of the proposed IMU6DoF-SST-CNN method with its six variants, a sampling time of 5 ms was chosen, resulting in a small data volume. The fan’s nominal rotational speed of 3000 RPM (revolutions per minute) translates to 50 Hz. By supplying a reduced voltage of 5 V, the rotational speed was decreased to approximately 21 Hz. This underscores the method’s efficacy in accommodating a broad spectrum of operational parameters and allows the use of a 200 Hz sampling frequency. The investigation focused on applications demanding constant rotational speeds, a characteristic commonly encountered in numerous industrial environments. Specific examples include centrifugal pumps and blowers, spindles utilized in machine tools, conveyor belts, cooling fans employed in electronic devices, and duct fans used within air conditioning systems. Furthermore, the potential applicability extends beyond scenarios with entirely constant speeds. Owing to its inherent ability to manage variations in operational conditions, the IMU6DoF-SST-CNN approach holds promise for scenarios involving controlled adjustments in speed or minor fluctuations. This broadened adaptability paves the way for its implementation in a more extensive array of industrial machinery. The demonstrator mimics a real-world scenario where an IMU sensor is used for vibration analysis. The setup shown in Figure 4 simulates a scenario in which an IMU sensor could be mounted on a machine to monitor vibrations for fault diagnosis. Controlled vibrations generated by the unbalanced fan blade mimic potential machine faults that the proposed methods can learn to identify. IMU data are continuously acquired at a constant sampling rate of 200 Hz, corresponding to a sampling interval of 5 milliseconds (ms). This results in a buffer containing 128 samples for each axis, representing a total acquisition time of 0.64 s (128 samples × 5 ms/sample = 0.64 s). The collected data are sent from the STM32F746ZG microcontroller (NUCLEO board) to an MQTT (message queuing telemetry transport) broker using the MQTT protocol, as shown in Figure 5, and stored as JSON (JavaScript object notation) objects.
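On the receiving side, such a JSON payload can be decoded directly in MATLAB; the field names in the sketch below are hypothetical placeholders, since the actual message schema of the STM32 firmware is not given here.

```matlab
% Sketch of decoding one MQTT/JSON message into a 128x6 buffer.
% The field names (accX, ..., gyroZ) are hypothetical.
msg = jsondecode(jsonStr);               % jsonStr: payload received from the broker
imu = [msg.accX, msg.accY, msg.accZ, ...
       msg.gyroX, msg.gyroY, msg.gyroZ]; % 128 samples per axis, one column each
```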
To evaluate the effectiveness of the proposed methods, data were gathered across three distinct operational states: idle, normal operation, and fault. In the fault class, a paperclip was attached to the fan blade, similar to previous research, to induce an imbalance and generate controlled vibrations, mimicking a potential machine fault scenario. The time series data for each class are visualized in Figure 6. Each data segment comprises 128 IMU samples, capturing information from all three axes (X, Y, and Z) of both the accelerometer and the gyroscope. This results in a total of six data streams per segment (128 × 6). IMU sensor calibration typically focuses on characterizing and potentially correcting for sensor gain errors, not necessarily repeatability within a single trial. Moreover, a constant offset error only yields values at a frequency of 0 Hz in time-frequency analysis; this type of error does not affect other frequencies, which are the key components in fault recognition. The MPU6050 IMU6DoF sensor is factory calibrated. To enhance data reliability and facilitate subsequent statistical analysis of potential variations, signal acquisitions were repeated at least three times throughout the experiment. Statistical analysis of the sensor data in the idle state revealed the following standard deviations: 0.0049, 0.0036, and 0.0065 for the accelerometer X, Y, and Z axes, respectively. Similarly, the standard deviations for the gyroscope’s X, Y, and Z axes were 0.1104, 0.1033, and 0.1033, respectively. Signal variance analysis indicated minimal variance, close to zero, for all accelerometer axes. In contrast, the gyroscope axes exhibited variances of 0.0122, 0.0107, and 0.0107 for the X, Y, and Z axes, respectively.
To examine the frequency content of the data, each recorded segment (containing 128 temporal data points from the three directional readings of the accelerometer (X, Y, and Z) and the three angular rate axes of the gyroscope (X, Y, and Z)) was transformed into the frequency domain using the fast Fourier transform (FFT). This translates the time-based signal from each axis into its spectral components, allowing identification of the prominent frequencies present in the data. Figure 7 illustrates the time series data converted into the frequency domain for each operating state (idle, normal operation, and fault). Distinct frequency patterns can be observed across the different operational states. During the idle state, a prominent peak is observed at 0 Hz, indicating minimal vibration, consistent with the absence of significant activity. Conversely, normal operation is characterized by subtle vibrations distributed across a frequency spectrum ranging from 20 Hz to 90 Hz. These vibrations are likely attributable to motor operation. In contrast, the fault condition presents a unique frequency signature. Specifically, a dominant frequency of 20 Hz emerges, particularly noticeable in the X-axis of the accelerometer data and the Z-axis of the gyroscope data. This targeted occurrence of a specific frequency suggests a characteristic pattern associated with the imbalanced fan blade in the fault scenario. This finding underscores the potential of frequency analysis to discern fault conditions with precision, offering valuable insights for effective fault diagnosis and predictive maintenance strategies.
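A minimal sketch of this per-axis spectral analysis (a 128-point FFT at the 200 Hz sampling rate) is shown below, reusing the imu buffer assumed earlier.

```matlab
% Sketch of the per-axis frequency analysis illustrated in Figure 7.
fs = 200; N = 128;
f  = (0:N/2) * fs / N;              % one-sided frequency axis, 0 to 100 Hz
X  = fft(imu(:, 1));                % e.g., accelerometer X axis
P  = abs(X(1:N/2 + 1)) / N;         % one-sided magnitude spectrum
plot(f, P); xlabel('Frequency (Hz)'); ylabel('|X(f)|');
```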

4. Results of Multisensor Vibration Data Analysis Using Synchrosqueezed Transform and Time-Frequency Image Recognition with Convolutional Neural Network

This section introduces a novel approach for analysing and classifying complex vibration patterns using data from inertial measurement units. The proposed method, named IMU6DoF-SST-CNN, leverages the strengths of time-frequency analysis, sensor fusion, and deep learning to achieve robust and accurate vibration classification. The method utilizes data from all six degrees of freedom (DoF) of an IMU, capturing the complete vibration information (acceleration and angular rate) along each axis. The IMU6DoF-SST-CNN method incorporates a novel fusion strategy. It processes the time-frequency images obtained from each sensor axis (e.g., accelerometer X, Y, Z) independently using the CNN. Subsequently, the extracted features are fused along the sensor and axis dimensions, potentially capturing richer vibration representations. Additionally, the fusion leverages greyscale information within the time-frequency images, which encodes the magnitude of the signal components at different frequencies and times. To illustrate the effectiveness of the proposed IMU6DoF-SST-CNN method in capturing informative time-frequency features, Figure 8 presents a comparison of the time-frequency representations for a specific vibration class (“fault” in this example) obtained using three different approaches. The leftmost image shows the result of applying the proposed method (referred to as FSST in Figure 8). This image displays a clear and concentrated representation of the dominant frequencies associated with the “fault” vibration across all sensor axes. The middle image showcases the outcome using the CWT with a complex Morlet wavelet. While CWT offers time-frequency analysis, it does not provide the same level of spectral concentration as the proposed method, leading to a more scattered representation of the key frequencies. The rightmost image depicts the result obtained using the short-time Fourier transform. STFT suffers from limitations in resolving closely spaced frequencies, resulting in a less precise representation of the time-varying spectral components compared to the proposed IMU6DoF-SST-CNN method.
To ensure a balanced evaluation, all approaches processed a total of 8064 images. Each class (idle, normal operation, and fault) contributed an equal share of 2688 images, generated individually for each method. These images were separated into training and testing sets following an 80/20 split: 80% (2150 images per class) were allocated for training the CNN models, and the remaining 20% (538 images per class) were designated for testing to assess performance.
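Assuming the generated images are stored on disk in class-named subfolders (a hypothetical layout), the 80/20 split can be realized with a MATLAB image datastore, as sketched below.

```matlab
% Sketch of the 80/20 train/test split with an image datastore.
% The folder layout (one subfolder per class) is a hypothetical assumption.
imds = imageDatastore('tf_images', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsTest] = splitEachLabel(imds, 0.8, 'randomized');
```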

4.1. IMU6DoF-SST-CNN Method with FSST Time-Frequency Images Fusion by Sensor

Figure 9 provides further insights into the feature extraction capabilities of the IMU6DoF-SST-CNN method with sensor fusion defined in Section 2.1. This figure showcases RGB images, where each image represents the time-frequency information for a specific vibration class after processing data from all sensor axes. The RGB channels correspond to the features extracted from the three axes (X, Y, and Z) of each sensor (accelerometer and gyroscope).
The effectiveness of the feature learning process within the IMU6DoF-SST-CNN method can be evaluated by examining Figure 10 and Figure 11. Figure 10 depicts the training progress of the CNN component. Ideally, this figure should show a clear trend of increasing training accuracy over the training iterations (epochs). The training accuracy shown reaches at least 90% after five iterations, suggesting that the CNN is successfully learning discriminative features from the time-frequency images (visualized in Figure 9) generated by the IMU6DoF-SST processing.
Furthermore, Figure 11 presents the confusion matrix after training, showcasing the classification performance on both the training and testing datasets. The confusion matrix should ideally exhibit high diagonal values, indicating a large number of correctly classified samples for each vibration class. The final accuracy of 100% for both training and validation is an exceptional outcome, indicating perfect classification on the training data. However, it is crucial to assess performance on the independent test dataset as well. The test accuracy is also high (at least 90% after training), which signifies that the IMU6DoF-SST-CNN method with sensor fusion generalizes well and can effectively classify unseen vibration patterns. It is important to remember that achieving 100% accuracy on real-world datasets is often challenging. However, the significant increase in training accuracy and the high test accuracy in the confusion matrix provide strong evidence that the proposed method is capable of learning effective features for vibration classification tasks.
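The exact CNN architecture is not reproduced in this excerpt, so the layer stack below is a hypothetical minimal example for the sensor-fusion input size assumed in the Section 2.1 sketch (75 × 256 × 3); it only illustrates how such a network is defined and trained on the CPU in MATLAB.

```matlab
% Hypothetical minimal CNN for the three-class task (idle, normal, fault);
% the architecture actually used in the article may differ.
layers = [
    imageInputLayer([75 256 3])                  % sensor-fusion RGB input (assumed size)
    convolution2dLayer(3, 16, 'Padding', 'same')
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 32, 'Padding', 'same')
    reluLayer
    fullyConnectedLayer(3)
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', ...
    'MaxEpochs', 5, 'ValidationData', imdsTest, ...
    'Plots', 'training-progress', 'ExecutionEnvironment', 'cpu');
net = trainNetwork(imdsTrain, layers, options);
```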

4.2. IMU6DoF-SST-CNN Method with FSST Time-Frequency Images Fusion as Grid and Fusion by Axis

While the previous section explored the application of IMU6DoF-SST-CNN with sensor fusion, which utilized RGB images for feature representation (shown in Figure 9 for that method), the next method adopts a different approach defined in Section 2.2.
Figure 12 showcases greyscale images, one for each axis of each class (fault, idle, and normal). These greyscale images represent the time-frequency information obtained using the grid fusion method after processing data from all sensor axes by FSST. By analysing the variations in intensity and patterns across these greyscale images, one can gain insight into how this method captures class-specific features in the time-frequency domain. The goal remains the same: to extract discriminative representations that aid in differentiating between vibration types for classification tasks.
By analysing the RGB channels in Figure 13 (fusion by axis, defined in Section 2.3), one can observe how the features extracted from each sensor axis contribute to the overall vibration classification task. Each channel captures distinct aspects of the vibration in the time-frequency domain. The subsequent fusion stage within the CNN architecture combines the information from these individual axis-specific representations to achieve robust classification. It is important to compare the results obtained with this approach (fusion by axis) to those using sensor fusion (Section 4.1) to assess the potential benefits or drawbacks of each fusion strategy for vibration classification using the IMU6DoF-SST-CNN method.
Building upon the promising results observed with the IMU6DoF-SST-CNN method, the next fusion method achieves an equally impressive performance in terms of training and test accuracy. As noted previously, achieving 100% accuracy on the training set signifies perfect classification, but generalizability to unseen data is crucial. This consistency in accuracy across the two fusion methods highlights the potential effectiveness of using time-frequency analysis with sensor fusion for further applications. Figure 14 delves into the effectiveness of different fusion strategies (sensor, grid, and axis) within the IMU6DoF-SST-CNN framework. It compares the training progress of the CNN for the FSST approach alongside three fusion methods: sensor fusion, grid-based fusion, and axis-based fusion. By analysing the trends in Figure 14, we can gain insights into the learning behaviour of the CNN for each approach. The comparison was limited to the first 15 iterations of learning. Interestingly, all three fusion methods (sensor, grid, and axis) seemed to reach 100% training accuracy. This suggests that all approaches can potentially learn effective features for the classification task. However, achieving 100% on the training set does not necessarily guarantee the best generalizability. The training curves in Figure 14 reveal that fusion by axis achieves the fastest convergence to a high accuracy level. This suggests that the CNN can learn most efficiently from features where information from each sensor axis is processed and presented in groups (RGB images shown in Figure 13). Conversely, grid-based fusion appears to be the slowest, potentially indicating a more complex learning process for the CNN when dealing with the combined spatiotemporal information.

4.3. IMU6DoF-SST-CNN Method with WSST

Continuing the exploration of the IMU6DoF-SST-CNN framework, Section 4.3 focuses on the application of wavelet synchrosqueezed transform (WSST) alongside sensor data fusion. Figure 15 showcases RGB images, one for each class (fault, idle, and normal). These RGB images represent the processed data after applying WSST to each sensor axis (accelerometer and gyroscope X, Y, and Z) and subsequently employing sensor fusion within the IMU6DoF-SST-CNN method shown in Figure 1.
The greyscale images shown in Figure 16 represent the outcome of applying WSST followed by grid-based fusion within the IMU6DoF-SST-CNN architecture (see Figure 2). By examining these images, it can be observed how the grid fusion captures the interplay between time-frequency characteristics and spatial information. This combined representation might be particularly useful for vibration classification tasks where the location of the IMU sensor on the machine or object can influence the captured data.
Continuing the analysis of the IMU6DoF-SST-CNN framework with WSST, Figure 17 presents RGB images, one for each class (fault, idle, and normal). These images represent the processed data after applying WSST to each sensor axis and subsequently employing fusion by axis within the method shown in Figure 3.
Figure 18 sheds light on the training behaviour of the CNN for the IMU6DoF-SST-CNN framework when employing WSST and compares the three fusion methods: sensor, grid, and axis. Interestingly, the training progress curves for all fusion methods exhibit a similar trend, suggesting that the CNN can effectively learn from the features extracted using WSST regardless of the chosen fusion strategy. This observation aligns with the findings for the FSST presented earlier in Figure 14. In both the FSST and WSST scenarios, all three fusion methods (sensor, grid, and axis) appeared to achieve similar training progress. However, a closer look at Figure 18 reveals that fusion by axis led to slightly better results compared to the other two fusion approaches. While the difference might be subtle, it suggests that for WSST, presenting information from each sensor axis separately during training (Figure 17) offers a slight advantage for the CNN in learning discriminative features.

4.4. Interpretability of Proposed IMU6DoF-SST-CNN Method for All Variants

Moving beyond the training process and classification accuracy, Figure 19, Figure 20, Figure 21, Figure 22, Figure 23 and Figure 24 delve into the interpretability aspects of the proposed IMU6DoF-SST-CNN method. These figures showcase the results of interpretability techniques applied to the CNN models trained with various combinations of time-frequency analysis (FSST and WSST) and fusion strategies (sensor, grid, and axis). Figure 19 examines the proposed IMU6DoF-SST-CNN method with sensor fusion; it presents four columns corresponding to different techniques used to understand which parts of the time-frequency images (shown in Figure 9) contribute most significantly to the classification decisions made by the CNN. Figure 19, Figure 20 and Figure 21 depict the interpretability for the FSST approach with sensor fusion (Figure 19), greyscale grid fusion (Figure 20), and axis-based fusion (Figure 21). These figures present visualizations using techniques like Grad-CAM (gradient-weighted class activation mapping), occlusion sensitivity, and LIME (local interpretable model-agnostic explanations). By analysing these visualizations for each fusion strategy with FSST, we can gain insights into which regions of the time-frequency representations are most critical for the CNN’s classification decisions for the different vibrations in each class.
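All three techniques have built-in MATLAB implementations; a minimal sketch for a single test image is given below, reusing the net and imdsTest variables assumed in the earlier sketches.

```matlab
% Sketch of the three interpretability techniques on one test image.
img   = readimage(imdsTest, 1);                   % a time-frequency RGB image
label = classify(net, img);                       % predicted class

mapGrad = gradCAM(net, img, label);               % Grad-CAM heat map
mapOcc  = occlusionSensitivity(net, img, label);  % occlusion sensitivity map
mapLime = imageLIME(net, img, label);             % LIME importance map

imshow(img); hold on
imagesc(mapGrad, 'AlphaData', 0.5); colormap jet  % overlay the Grad-CAM map
```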
Specifically considering Figure 21, which showcases the interpretability results for the IMU6DoF-SST-CNN with FSST and fusion by axis, the occlusion sensitivity analysis reveals valuable insights into the feature importance for different classes. The results indicate that for the fault class, occluding the information from the accelerometer’s X-axis has a significant impact on the CNN’s prediction. This suggests that the features extracted from the X-axis accelerometer data play a dominant role in identifying faulty vibration patterns using the FSST and axis-based fusion approach. Conversely, for the normal class, occluding the gyroscope’s Y-axis data led to a more substantial drop in classification accuracy compared to occluding other regions. This implies that the features captured from the gyroscope’s Y-axis are particularly crucial for the CNN to correctly classify normal patterns in this specific scenario.
Similarly, Figure 22, Figure 23 and Figure 24 illustrate the interpretability for the WSST approach with sensor fusion (Figure 22), greyscale grid fusion (Figure 23), and axis-based fusion (Figure 24). By comparing the interpretability results for WSST (Figure 22, Figure 23 and Figure 24) with those for FSST (Figure 19, Figure 20 and Figure 21), it can be observed how the choice of time-frequency analysis technique (FSST and WSST) might influence the features that the CNN focuses on for classification.

5. Discussion

A high-performance computing environment, established as a remote virtual machine courtesy of Poznan University of Technology, ensured a fair and controlled evaluation of the proposed methods. The virtual machine, leveraging VMware, Palo Alto, CA, USA, technology, offered ample RAM (16 GB) for efficient memory handling. Processing power was delivered by a powerful AMD EPYC 7402 CPU (central processing unit), Santa Clara, CA, USA with two cores and four threads specifically dedicated to this task. Notably, convolutional neural network training was performed entirely on the CPU to maintain consistent conditions for comparison across all methods. MathWorks, Natick, MA, USA MATLAB R2023a served as the software foundation for this research. This comprehensive suite provided a toolbox for data preparation, image generation, CNN implementation, and performance assessment.
The selection of model parameters for the IMU6DoF-SST-CNN method is important for achieving accurate fault diagnosis. In preliminary research, the following approaches were used: (a) manual hyperparameter tuning to explore good combinations of hyperparameters; (b) analysis of confusion matrices; and (c) visualisation techniques like Grad-CAM to show which regions of the time-frequency image contribute most to the model’s classification decisions. The IMU6DoF-SST-CNN method was employed to generate time-frequency representations from multiple sensor data. A critical parameter in the fault diagnosis system is the window size. A large window captures more detail but leads to unwanted delays in decision making and requires more computational resources. To reduce delay, a smaller window can be applied; however, it yields more blurred images in the time-frequency domain. In this article, small time windows of 640 ms, 512 ms, and 85 ms were investigated for the fan imbalance dataset with constant velocity, the fan imbalance dataset with variable velocity, and the CWRU bearing faults dataset, respectively.
Table 2 provides a valuable comparison of the training progress achieved by the proposed methods in the first dataset of the fan fault with constant velocity. By analysing the data in this table, we can gain insights into the learning behaviour of the CNNs for each approach. The training time required by the proposed methods varies depending on the chosen time-frequency analysis technique (FSST vs. WSST) and fusion strategy (sensor, grid, and axis). Generally, methods utilizing FSST are faster to train compared to those using WSST. For instance, FSST with sensor fusion takes only 2 min and 58 s, whereas WSST with sensor fusion requires 8 min and 25 s. All methods (reference and proposed) achieved a perfect validation accuracy of 100%. However, the number of training iterations required to reach this accuracy can differ. For most methods, the validation accuracy surpasses 90% within the first 10 iterations. Notably, FSST with grid fusion takes slightly longer (10th iteration) to achieve this milestone compared to other FSST-based methods (5th iteration). The comparison between fusion strategies within each time-frequency analysis technique is also insightful. For FSST, sensor fusion appears to be the fastest training option (2 min 58 s), while grid fusion takes slightly longer (3 min 20 s) and axis-based fusion requires the most training time (4 min 13 s). A similar trend was observed for WSST, with sensor fusion being the fastest (8 min 25 s) followed by grid fusion (9 min 30 s) and axis-based fusion (12 min 17 s).
Table 3 presents the time taken for each method to convert the raw sensor data into the specific image format employed for the CNN (e.g., RGB images for FSST with sensor fusion or greyscale images for FSST with grid fusion) for the first dataset of the fan fault with constant velocity. This information is crucial for evaluating the overall efficiency of the proposed methods, particularly in scenarios where real-time fault diagnosis is desired. Complementing the training progress analysis in Table 2, Table 3 sheds light on the efficiency of image generation for fault diagnosis using the proposed methods. Examining this data reveals interesting insights. The time required to process the sensor data and generate the time-frequency images varies across the methods. Generally, methods utilizing STFT for time-frequency analysis (reference methods) exhibit the fastest image generation times. For instance, STFTx6-CNN with sensor fusion takes only 75.4 s to process all image iterations, while the fastest proposed method (FSST with grid fusion) requires 142.6 s. This difference can be attributed to the inherent computational complexity of the chosen time-frequency analysis technique. Within each time-frequency analysis approach (FSST and WSST), the image generation time shows minimal variation among the three fusion strategies (sensor, grid, and axis). This suggests that the choice of fusion strategy has a less significant impact on image generation efficiency compared to the time-frequency analysis technique itself. A clear distinction is observed between methods using FSST and those using WSST. For both sensor and grid fusion, FSST-based methods (around 142–149 s) are considerably faster than their WSST counterparts (around 246–256 s) in generating images. This difference aligns with the inherent complexity of WSST compared to FSST. When combined with the observations from Table 2, a more comprehensive picture emerges. While some proposed methods (like FSST with sensor fusion) achieve a good balance between training speed and image generation efficiency, others (like WSST with any fusion strategy) might be less suitable for real-time applications due to their higher image generation times.
All proposed methods using FSST (sensor, grid, and axis fusion) achieved similar training times (around 3–4 min) compared to CWTx6-CNN (3 min 4 s). This suggests that FSST might be computationally more efficient for training the CNN. The accumulation of 128 × 6 samples from IMU6DoF took 640 ms at 200 Hz sampling time. The time of image generation is given in Table 3 in column “Average time of single iteration in milliseconds (ceiling round)”. The CWTx6-CNN exhibits a slower image generation time (average iteration time 28.452 ms) compared to most proposed methods using FSST (ranging from 17.681 ms to 18.543 ms). This is because CWT has a higher inherent complexity than FSST for the specific task. While both FSST and CWT are time-frequency analysis techniques, a key advantage of FSST lies in its ability to achieve better localization in the time-frequency domain. This improved localization can be crucial for capturing the transient features of vibration data associated with faults. Figure 8 showcases this advantage by comparing the RGB image representations generated for the fault class using FSST (left), CWT with a complex Morlet wavelet (middle), and STFT (right). The visual clarity of the fault-related features within the FSST image suggests a more precise localization of these features in both the time and frequency domains compared to the CWT and STFT representations. This improved localization by FSST can be further corroborated by analysing the interpretability results in Figure 19. Figure 19 depicts the local interpretable model-agnostic explanations analysis for the proposed IMU6DoF-SST-CNN with FSST and sensor fusion. The interpretability map highlights the specific regions in the time-frequency domain that the CNN heavily relies on for fault classification. The clear and focused nature of these highlighted regions aligns with the superior localization capabilities of FSST, suggesting that the CNN can effectively leverage the precise temporal and spectral information captured by FSST for accurate fault detection.
The verification of the proposed IMU6DoF-SST-CNN method using FSST and WSST with different fusion strategies was performed using a modified configuration of the demonstrator apparatus (illustrated in Figure 25) within a subsequent experimental scenario. This scenario incorporated variations in fan speed while maintaining a constant 12 V DC power supply. To achieve this, the demonstrator depicted in Figure 4 was augmented with a P-channel MOSFET (a type of field-effect transistor) to facilitate granular control over the fan velocity. The MOSFET enabled adjustments in 10% increments, ranging from 10% to 100% of the fan’s nominal speed. To simulate a distinct fault condition, an additional paper clip was introduced. For this specific scenario, the data acquisition process employed a sampling frequency of 2000 Hz.
A segment of 1024 samples in the time-domain data of the 3-axis gyroscope for the modified demonstrator with variable velocity is presented in Figure 26. The red, green, and blue lines represent the X, Y, and Z axes, respectively. Analysing these time series signals alongside the implemented fault conditions and varying fan speeds was crucial for evaluating the effectiveness of the proposed IMU6DoF-SST-CNN method. Each row in Figure 26 corresponds to the same velocity and the columns address the fan condition.
A total of 58,380 images (10 velocities × 3 classes × 1946 images) were generated to comprehensively represent the data from the second scenario. This encompasses 10 distinct fan velocities, three operational classes (fault (fault 1), fault 2, and normal operation), and 1946 images generated for each combination of velocity and class. Figure 27 presents exemplar RGB images, one for each class (fault 1, fault 2, and normal operation), generated using the IMU6DoF-SST-CNN method with sensor-level fusion and FSST. These images serve to visually depict the characteristics extracted from the sensor data by the proposed method under varying fan speeds and operating conditions. By analysing these visualizations alongside the actual sensor data, researchers can gain valuable insights into the effectiveness of the IMU6DoF-SST-CNN method for fault classification. Each row in Figure 27 corresponds to the same velocity and the columns address the fan condition.
The training process of the IMU6DoF-SST-CNN method on the second scenario dataset (fan with variable velocity) is shown in Figure 28. The images were split into training and testing sets with an 80/20 ratio: of the 58,380 images, 46,704 were randomly assigned to training and 11,676 to testing (3892 images × 3 classes). The figure plots training accuracy and loss over the training epochs, allowing the model to be evaluated for each of the six variants. Following training, performance was further assessed using the confusion matrices in Figure 29, where each row corresponds to one synchrosqueezed transform (FSST or WSST) and the columns correspond to the image fusion method (by sensor, by axis, or as a grid). Each confusion matrix presents the classification results for the second scenario test set, visualizing the correctly classified instances (matrix diagonal) and the misclassifications for each operational class (fault 1, fault 2, and normal operation). Comparing this set of matrices identifies the most accurate variant: FSST with image fusion by sensor type (first row, first column of Figure 29).
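The 80/20 split described above can be reproduced in a few lines, as in the sketch below. The arrays `images` and `labels` are hypothetical stand-ins for the generated images and their class indices; the stratified option is an assumption that reproduces the equal per-class test counts reported above, whereas the study states only that the split was random.

```python
from sklearn.model_selection import train_test_split

# `images` (N, H, W, 3) and `labels` (N,) are hypothetical arrays holding the
# 58,380 generated images and their class indices (fault 1, fault 2, normal).
X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, stratify=labels, random_state=0)
# -> 46,704 training and 11,676 test images (3892 per class in the test set)
```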
The FSST with fusion by sensor variant of the IMU6DoF-SST-CNN method, which achieved the best accuracy on the second dataset (fan with variable velocity), was further evaluated on a third dataset. This dataset, obtained from the Bearing Data Center of Case Western Reserve University (CWRU) [29], targets bearing fault detection and uses three single-axis accelerometers. Figure 30 presents a sample of the time-domain data from the third dataset for a drive-end bearing fault scenario sampled at 12 kHz. The red, green, and blue lines correspond to the drive-end (DE), fan-end (FE), and base (BA) accelerometers, respectively. Each row in Figure 30 corresponds to one motor load from 0 HP to 3 HP, and the columns correspond to the fault condition (IR—inner race fault, B—ball fault, OR—outer race fault; @6—centered, @3—orthogonal, @12—opposite) at fault diameters of 0.007”, 0.014”, 0.021”, and 0.028”. Analysing these raw signals provides initial insight into the characteristics of the bearing faults. The normal and IR028 conditions were excluded from further analysis because no BA data are provided for them. All signals were divided into windows of 1024 samples, corresponding to 85 ms at the 12 kHz sampling rate.
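The windowing step can be sketched as follows; the non-overlapping hop is an assumption (the overlap is not stated here), and `de_accel` is a hypothetical array holding one CWRU accelerometer record.

```python
import numpy as np

def segment(signal, win=1024, hop=1024):
    """Split a 1-D vibration record into fixed-length windows; with hop == win
    the windows are non-overlapping. At fs = 12 kHz, 1024 samples span ~85 ms."""
    n_windows = (len(signal) - win) // hop + 1
    return np.stack([signal[i * hop : i * hop + win] for i in range(n_windows)])

# e.g. de_windows = segment(de_accel)   # shape: (number of windows, 1024)
```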
The three single-axis accelerometers of the CWRU dataset can be treated in the same way as the 3-axis accelerometer of the IMU6DoF to obtain time-frequency images with fusion by sensor. Because the data come from three accelerometers rather than a six-axis IMU, the method was renamed ACCELEROMETER×3-SST-CNN to underline the fused data. Figure 31 presents example RGB images generated using the ACCELEROMETER×3-SST-CNN method with FSST and the sensor-level fusion defined in Section 2.1, derived from the drive-end bearing fault data at 12 kHz. Each row in Figure 31 corresponds to one fault condition, and the columns correspond to motor loads from 0 HP to 3 HP. The method extracts the essential time-frequency features from the raw sensor data and represents them visually in this RGB format. In total, 34,788 images were obtained (four motor loads × 13 classes × 669 images): four motor loads, 13 fault classes, and 669 images for each combination of load and class. Generating a single image took around 150 ms, and all images were generated in around 87 min. Comparing these time-frequency images across operating conditions and fault types can yield valuable insights for classification tasks. The images of the third dataset were split into training and testing sets with an 80/20 ratio: 27,833 images were randomly assigned to training and 6955 to testing (535 images × 13 classes). The efficacy of the ACCELEROMETER×3-SST-CNN method on the drive-end bearing fault classification task was evaluated using the confusion matrices in Figure 32, which show the model’s perfect performance on both the training and test sets of the third dataset.
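A confusion matrix of the kind shown in Figure 32 can be computed as in the illustrative sketch below, where `model`, `X_test`, and `y_test` are hypothetical stand-ins for the trained network and the 6955-image test split.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# `model.predict` is assumed to return per-class scores for the 13 classes.
y_pred = np.argmax(model.predict(X_test), axis=1)    # predicted class per image
cm = confusion_matrix(y_test, y_pred)                # rows: true, cols: predicted
print(cm)                                            # diagonal = correct counts
print("accuracy =", accuracy_score(y_test, y_pred))  # 1.0 for a perfect classifier
```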

6. Conclusions

This research investigated the effectiveness of CNNs for vibration classification tasks using IMU data and proposed a novel approach, the IMU6DoF-SST-CNN, which leverages FSST and WSST for time-frequency analysis and explores different fusion strategies for the sensor data. The proposed method achieved a perfect validation accuracy of 100% on the vibration classification task with the fan running at constant velocity. All six variants of the IMU6DoF-SST-CNN were also evaluated on a second scenario dataset consisting of fan vibration data collected at variable velocities, ranging from 10% to 100% in 10% increments, acquired at a sampling frequency of 2000 Hz. The best performance was achieved by the variant employing FSST for feature extraction and sensor-type fusion. This variant was subsequently applied successfully to the CWRU bearing fault dataset, which comprises 13 classes of bearing faults sampled at 12 kHz. Verification on the CWRU dataset required adapting the proposed method to the number of accelerometers used, leading to the ACCELEROMETER×3-SST-CNN method. These results demonstrate the capability of the CNN architecture, in conjunction with FSST, to learn discriminative features from sensor data for accurate classification. While all variants achieved high accuracy, analysis of the training time and image generation efficiency revealed trade-offs between the approaches: the FSST-based variants trained and generated images faster than their WSST counterparts, and generated images faster than the CWT reference. This study also highlighted the benefit of FSST over other time-frequency analysis techniques such as CWT and STFT. FSST’s superior localization was visually confirmed through the image representations and further supported by the interpretability analysis using LIME. This improved localization allows the CNN to effectively capture the transient features of vibration data associated with faults, leading to more accurate classification. Building upon these findings, future research can explore optimizing the IMU6DoF-SST-CNN method for real-time deployment on resource-constrained embedded systems for fault diagnosis tasks.
This research addressed the knowledge gap concerning the optimal conversion of multi-axis IMU sensor data into a time-frequency image format suitable for CNN-based fault classification. It proposed and evaluated six variants of the IMU6DoF-SST-CNN method, combining FSST and WSST with three image fusion techniques. These variants effectively captured the temporal and spectral characteristics of the vibration signals across all axes, paving the way for leveraging the power of CNNs for accurate fault classification in scenarios involving complex multidimensional sensor data.
In conclusion, this study presents a promising approach for vibration classification using IMU data with the proposed IMU6DoF-SST-CNN method. The effectiveness of FSST for time-frequency analysis and the exploration of different fusion strategies contribute to the overall accuracy and efficiency of the system.

Funding

This research was funded by the Poznan University of Technology, grant number 0214/SBAD/0249.

Data Availability Statement

All data are contained within the article.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Łuczak, D.; Brock, S.; Siembab, K. Cloud Based Fault Diagnosis by Convolutional Neural Network as Time–Frequency RGB Image Recognition of Industrial Machine Vibration with Internet of Things Connectivity. Sensors 2023, 23, 3755.
2. Chen, H.-Y.; Lee, C.-H. Vibration Signals Analysis by Explainable Artificial Intelligence (XAI) Approach: Application on Bearing Faults Diagnosis. IEEE Access 2020, 8, 134246–134256.
3. Wang, Y.; Yang, M.; Li, Y.; Xu, Z.; Wang, J.; Fang, X. A Multi-Input and Multi-Task Convolutional Neural Network for Fault Diagnosis Based on Bearing Vibration Signal. IEEE Sens. J. 2021, 21, 10946–10956.
4. Rauber, T.W.; da Silva Loca, A.L.; Boldt, F.d.A.; Rodrigues, A.L.; Varejão, F.M. An Experimental Methodology to Evaluate Machine Learning Methods for Fault Diagnosis Based on Vibration Signals. Expert Syst. Appl. 2021, 167, 114022.
5. Meyer, A. Vibration Fault Diagnosis in Wind Turbines Based on Automated Feature Learning. Energies 2022, 15, 1514.
6. Li, Z.; Zhang, Y.; Abu-Siada, A.; Chen, X.; Li, Z.; Xu, Y.; Zhang, L.; Tong, Y. Fault Diagnosis of Transformer Windings Based on Decision Tree and Fully Connected Neural Network. Energies 2021, 14, 1531.
7. Gao, S.; Xu, L.; Zhang, Y.; Pei, Z. Rolling Bearing Fault Diagnosis Based on SSA Optimized Self-Adaptive DBN. ISA Trans. 2022, 128, 485–502.
8. Wang, C.-S.; Kao, I.-H.; Perng, J.-W. Fault Diagnosis and Fault Frequency Determination of Permanent Magnet Synchronous Motor Based on Deep Learning. Sensors 2021, 21, 3608.
9. Feng, Z.; Gao, A.; Li, K.; Ma, H. Planetary Gearbox Fault Diagnosis via Rotary Encoder Signal Analysis. Mech. Syst. Signal Process. 2021, 149, 107325.
10. Ma, J.; Li, C.; Zhang, G. Rolling Bearing Fault Diagnosis Based on Deep Learning and Autoencoder Information Fusion. Symmetry 2022, 14, 13.
11. Huang, W.; Du, J.; Hua, W.; Lu, W.; Bi, K.; Zhu, Y.; Fan, Q. Current-Based Open-Circuit Fault Diagnosis for PMSM Drives with Model Predictive Control. IEEE Trans. Power Electron. 2021, 36, 10695–10704.
12. Łuczak, D.; Brock, S.; Siembab, K. Fault Detection and Localisation of a Three-Phase Inverter with Permanent Magnet Synchronous Motor Load Using a Convolutional Neural Network. Actuators 2023, 12, 125.
13. Jiang, L.; Deng, Z.; Tang, X.; Hu, L.; Lin, X.; Hu, X. Data-Driven Fault Diagnosis and Thermal Runaway Warning for Battery Packs Using Real-World Vehicle Data. Energy 2021, 234, 121266.
14. Chang, C.; Zhou, X.; Jiang, J.; Gao, Y.; Jiang, Y.; Wu, T. Electric Vehicle Battery Pack Micro-Short Circuit Fault Diagnosis Based on Charging Voltage Ranking Evolution. J. Power Sources 2022, 542, 231733.
15. Wang, Z.; Tian, B.; Qiao, W.; Qu, L. Real-Time Aging Monitoring for IGBT Modules Using Case Temperature. IEEE Trans. Ind. Electron. 2016, 63, 1168–1178.
16. Dhiman, H.S.; Deb, D.; Muyeen, S.M.; Kamwa, I. Wind Turbine Gearbox Anomaly Detection Based on Adaptive Threshold and Twin Support Vector Machines. IEEE Trans. Energy Convers. 2021, 36, 3462–3469.
17. Cao, Y.; Sun, Y.; Xie, G.; Li, P. A Sound-Based Fault Diagnosis Method for Railway Point Machines Based on Two-Stage Feature Selection Strategy and Ensemble Classifier. IEEE Trans. Intell. Transp. Syst. 2022, 23, 12074–12083.
18. Shiri, H.; Wodecki, J.; Ziętek, B.; Zimroz, R. Inspection Robotic UGV Platform and the Procedure for an Acoustic Signal-Based Fault Detection in Belt Conveyor Idler. Energies 2021, 14, 7646.
19. Karabacak, Y.E.; Gürsel Özmen, N.; Gümüşel, L. Intelligent Worm Gearbox Fault Diagnosis under Various Working Conditions Using Vibration, Sound and Thermal Features. Appl. Acoust. 2022, 186, 108463.
20. Zhou, Q.; Chen, R.; Huang, B.; Liu, C.; Yu, J.; Yu, X. An Automatic Surface Defect Inspection System for Automobiles Using Machine Vision Methods. Sensors 2019, 19, 644.
21. Yang, L.; Fan, J.; Liu, Y.; Li, E.; Peng, J.; Liang, Z. A Review on State-of-the-Art Power Line Inspection Techniques. IEEE Trans. Instrum. Meas. 2020, 69, 9350–9365.
22. Davari, N.; Akbarizadeh, G.; Mashhour, E. Intelligent Diagnosis of Incipient Fault in Power Distribution Lines Based on Corona Detection in UV-Visible Videos. IEEE Trans. Power Deliv. 2021, 36, 3640–3648.
23. Kim, S.; Kim, D.; Jeong, S.; Ham, J.-W.; Lee, J.-K.; Oh, K.-Y. Fault Diagnosis of Power Transmission Lines Using a UAV-Mounted Smart Inspection System. IEEE Access 2020, 8, 149999–150009.
24. Ullah, Z.; Lodhi, B.A.; Hur, J. Detection and Identification of Demagnetization and Bearing Faults in PMSM Using Transfer Learning-Based VGG. Energies 2020, 13, 3834.
25. Long, H.; Xu, S.; Gu, W. An Abnormal Wind Turbine Data Cleaning Algorithm Based on Color Space Conversion and Image Feature Detection. Appl. Energy 2022, 311, 118594.
26. Xie, T.; Huang, X.; Choi, S.-K. Intelligent Mechanical Fault Diagnosis Using Multisensor Fusion and Convolution Neural Network. IEEE Trans. Ind. Inform. 2022, 18, 3213–3223.
27. Zhou, Y.; Wang, H.; Wang, G.; Kumar, A.; Sun, W.; Xiang, J. Semi-Supervised Multiscale Permutation Entropy-Enhanced Contrastive Learning for Fault Diagnosis of Rotating Machinery. IEEE Trans. Instrum. Meas. 2023, 72, 1–10.
28. Łuczak, D. Machine Fault Diagnosis through Vibration Analysis: Continuous Wavelet Transform with Complex Morlet Wavelet and Time–Frequency RGB Image Recognition via Convolutional Neural Network. Electronics 2024, 13, 452.
29. Case Western Reserve University Bearing Data Center Website. Available online: https://engineering.case.edu/bearingdatacenter/welcome (accessed on 9 May 2024).
30. Ma, F.; Ji, C.; Xu, M.; Wang, J.; Sun, W. Spatial Correlation Extraction for Chemical Process Fault Detection Using Image Enhancement Technique Aided Convolutional Autoencoder. Chem. Eng. Sci. 2023, 278, 118900.
31. Seong, G.; Kim, D. An Intelligent Ball Bearing Fault Diagnosis System Using Enhanced Rotational Characteristics on Spectrogram. Sensors 2024, 24, 776.
32. Ruiz-Sarrio, J.E.; Antonino-Daviu, J.A.; Martis, C. Comprehensive Diagnosis of Localized Rolling Bearing Faults during Rotating Machine Start-Up via Vibration Envelope Analysis. Electronics 2024, 13, 375.
33. Kim, M.S.; Yun, J.P.; Park, P. Deep Learning-Based Explainable Fault Diagnosis Model with an Individually Grouped 1-D Convolution for Three-Axis Vibration Signals. IEEE Trans. Ind. Inform. 2022, 18, 8807–8817.
34. Zhang, X.; Zhao, Z.; Wang, Z.; Wang, X. Fault Detection and Identification Method for Quadcopter Based on Airframe Vibration Signals. Sensors 2021, 21, 581.
35. Daubechies, I.; Maes, S. A Nonlinear Squeezing of the Continuous Wavelet Transform Based on Auditory Nerve Models. In Wavelets in Medicine and Biology; Routledge: London, UK, 1996; ISBN 978-0-203-73403-2.
36. Łuczak, D. Mechanical Vibrations Analysis in Direct Drive Using CWT with Complex Morlet Wavelet. Power Electron. Drives 2023, 8, 65–73.
37. Daubechies, I.; Lu, J.; Wu, H.-T. Synchrosqueezed Wavelet Transforms: An Empirical Mode Decomposition-like Tool. Appl. Comput. Harmon. Anal. 2011, 30, 243–261.
38. Thakur, G.; Wu, H.-T. Synchrosqueezing-Based Recovery of Instantaneous Frequency from Nonuniform Samples. SIAM J. Math. Anal. 2011, 43, 2078–2095.
39. Oberlin, T.; Meignen, S.; Perrier, V. The Fourier-Based Synchrosqueezing Transform. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 315–319.
40. Thakur, G.; Brevdo, E.; Fučkar, N.S.; Wu, H.-T. The Synchrosqueezing Algorithm for Time-Varying Spectral Analysis: Robustness Properties and New Paleoclimate Applications. Signal Process. 2013, 93, 1079–1094.
41. Łuczak, D. Machine Fault Diagnosis through Vibration Analysis: Time Series Conversion to Grayscale and RGB Images for Recognition via Convolutional Neural Networks. Energies 2024, 17, 1998.
Figure 1. IMU6DoF-SST-CNN method with time-frequency images fusion by sensor.
Figure 2. IMU6DoF-SST-CNN method with time-frequency images fusion as grid.
Figure 3. IMU6DoF-SST-CNN method with time-frequency images fusion by axis.
Figure 4. Microcontroller based demonstrator of machine fault diagnosis.
Figure 5. Data transmission by MQTT protocol.
Figure 6. Single sensor data window with 128 samples, capturing temporal measurements from the three axes (X, Y, and Z) of both the accelerometer and gyroscope.
Figure 7. Frequency spectrum for a single data segment, representing each class (idle, normal operation, and fault). Data from all accelerometer and gyroscope axes (X, Y, and Z) are combined.
Figure 8. Comparison of the RGB image (six time-frequency components) with fusion by sensor for the class fault calculated by the proposed method FSST (left), reference CWT with complex Morlet wavelet (middle) and reference STFT method (right).
Figure 9. RGB image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with fusion by sensor and FSST, where RGB channels are defined in Section 2.1.
Figure 10. Training progress of the CNN for IMU6DoF-SST-CNN with fusion by sensor and FSST.
Figure 11. Matrix of confusion after training IMU6DoF-SST-CNN with fusion by sensor and FSST (train—left; test—right).
Figure 12. Greyscale image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with fusion as grid and FSST, where greyscale image is defined in Section 2.2.
Figure 13. RGB image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with fusion by axis and FSST, where RGB channels are defined in Section 2.3.
Figure 14. Training progress of the CNN for IMU6DoF-SST-CNN with FSST compared with three fusion methods (sensor, grid, and axis).
Figure 15. RGB image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with WSST and fusion by sensor, where RGB channels are defined in Section 2.1.
Figure 16. Greyscale image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with WSST and fusion as grid, where greyscale image is defined in Section 2.2.
Figure 17. RGB image for each class (fault, idle, and normal) of IMU6DoF-SST-CNN method with WSST and fusion by axis, where RGB channels are defined in Section 2.3.
Figure 18. Training progress of the CNN for IMU6DoF-SST-CNN with WSST compared with three fusion methods (sensor, grid, and axis).
Figure 19. Interpretability of proposed IMU6DoF-SST-CNN with FSST and fusion by sensor.
Figure 20. Interpretability of proposed IMU6DoF-SST-CNN with FSST and fusion as greyscale grid.
Figure 21. Interpretability of proposed IMU6DoF-SST-CNN with FSST and fusion by axis.
Figure 22. Interpretability of proposed IMU6DoF-SST-CNN with WSST and fusion by sensor.
Figure 23. Interpretability of proposed IMU6DoF-SST-CNN with WSST and fusion as greyscale grid.
Figure 24. Interpretability of proposed IMU6DoF-SST-CNN with WSST and fusion by axis.
Figure 25. Modification of microcontroller-based demonstrator of machine fault diagnosis with variable velocity.
Figure 26. Time domain data of 3 axis gyroscope for modified demonstrator with variable velocity (red—gyroscope x axis, green—gyroscope y axis, blue—gyroscope z axis).
Figure 27. RGB image for second scenario at each class (fault 1, fault 2, and normal) of IMU6DoF-SST-CNN method with fusion by sensor and FSST, where RGB channels are defined in Section 2.1.
Figure 28. Training progress of the IMU6DoF-SST-CNN for second scenario dataset (fan with variable velocity).
Figure 29. Test confusion matrix after training IMU6DoF-SST-CNN for each of six variants for a fan with variable velocity—second scenario dataset.
Figure 30. Time domain data for dataset from Bearing Data Center—drive end bearing fault data at 12 kHz (red—DE—drive end accelerometer data, green—FE—fan end accelerometer data, blue—BA—base accelerometer data).
Figure 31. RGB image for dataset of drive end bearing fault data at 12 kHz of method ACCELEROMETER×3-SST-CNN with FSST and fusion by sensor, where red channel of RGB is an FSST of DE data, green channel is an FSST of FE data, and blue channel is an FSST of BA data.
Figure 32. Matrix of confusion after training ACCELEROMETER×3-SST-CNN with FSST and fusion by sensor of drive end bearing fault data at 12 kHz (train—left; test—right).
Table 1. Comparison of proposed fault diagnosis IMU6DoF-SST-CNN method.

| Datasets | Faults | Sensors | Features Extraction | Features and Time Window | Fusion | Classifier | Method |
|---|---|---|---|---|---|---|---|
| (a) Demonstrator with fan blade imbalance with constant velocity at 200 Hz; (b) demonstrator with fan blade imbalance with variable velocity at 2 kHz; (c) publicly available CWRU bearing fault dataset at 12 kHz | (a) Normal, fan turned off, fan fault; (b) fan imbalance 1, fan imbalance 2, and normal; (c) bearing faults (thirteen classes): inner race fault, ball fault, outer race fault (@6—centered, @3—orthogonal, @12—opposite) at fault diameters of 0.007”, 0.014”, and 0.021” for motor loads 0 HP, 1 HP, 2 HP, and 3 HP | (a) IMU 6 DoF; (b) IMU 6 DoF; (c) three accelerometers (drive end, fan end, and base) | Fourier synchrosqueezed transform (FSST) and wavelet synchrosqueezed transform (WSST) | 1. RGB image made of six time-frequency components; 2. time windows of (a) 640 ms, (b) 512 ms, (c) 85 ms | Fusion by sensor, fusion as grid, or fusion by axis (three variants) | CNN-2D (image recognition) | Proposed |
| Demonstrator with fan blade imbalance with constant velocity at 200 Hz | Normal, fan turned off, fan fault | IMU 6 DoF | CWT with complex Morlet wavelet | 1. RGB image made of six time-frequency (time-scale) components; 2. time window of 640 ms | Fusion by sensor | CNN-2D | [28] |
| Demonstrator with fan blade imbalance with constant velocity at 200 Hz | Normal, fan turned off, fan fault | IMU 6 DoF | SDFT (sliding discrete Fourier transform) or STFT on 6 axes | 1. RGB image made of six spectrograms; 2. time window of 640 ms | Fusion by sensor | CNN-2D | [1] |
| Publicly available CWRU bearing fault dataset at 12 kHz | Bearing faults (four classes): normal, inner ring, outer ring, ball; fault diameters not distinguished (e.g., 7-mil and 14-mil inner ring faults treated as one class) | Single accelerometer (drive end) | STFT | 1. Colour spectrogram of a single signal; 2. time window of 1000 ms | No fusion (single sensor data) | CNN-2D | [2] |
| Southeast University (SEU) bearing fault dataset at 2 kHz | Four faulty bearing classes (ball, inner ring, outer ring, inner + outer) and healthy | Three-axis accelerometer | Frequency transformation with a weight map | 1. Frequency domain for each axis; 2. time window of 512 ms | Each of 3 axes as a separate RGB colour | Group of CNN-1D | [33] |
| Quadcopter blades under three health states at 200 Hz | Undamaged blades and two faults (5% and 15% damaged blades) | From unidirectional to the three axes of angular velocity | WPT (wavelet packet transform)—wavelet name unspecified | 1. WPT at third level of decomposition; 2. time window of 1000 ms | No fusion | LSTM (long short-term memory) | [34] |
| (a) Publicly available CWRU bearing fault dataset at 12 kHz; (b) Machinery Failure Prevention Technology (MFPT) bearing vibration dataset at 48.828 kHz | (a) Bearing faults (four classes): normal, outer, ball, inner; (b) bearing faults (three classes): normal, inner ring, and outer ring | Single accelerometer (drive end) | CWT (wavelet unspecified), STFT | 1. CWT, time-domain, and frequency-domain feature aggregation; 2. time window (a) not given, (b) 105.5 ms | No fusion | MIMTNet (multiple-input, multiple-task CNN) | [3] |
| Publicly available CWRU bearing fault dataset at 12 kHz | Bearing faults (no classes): normal, inner-ring, ball, and outer-ring faults | One of the accelerometers (drive end, fan end, or base) | Envelope STFT | 1. Envelope spectrum; 2. time window of 5000 ms | No fusion | No classification | [32] |
| Publicly available CWRU bearing fault dataset at 48 kHz down-sampled to 1 kHz | Two classes: normal and fault | Single accelerometer (not specified) | Rotational characteristic emphasis (RCE) spectrogram | 1. RCE filter bank; 2. time window of 1000 ms | No fusion | CNN | [31] |
Table 2. Comparison of training progress for proposed methods.

| Method | Time-Frequency Method | Time-Frequency Images Fusion | Total Number of Images for Training | Training Time | Training Iterations | Iteration with Validation Accuracy More than 90% | Final Validation Accuracy |
|---|---|---|---|---|---|---|---|
| Reference method STFTx6-CNN [1] | STFT | fusion by sensor | 6450 RGB images | 1 m 59 s | 50 | 5th iteration | 100% |
| Reference method CWTx6-CNN [28] | CWT | fusion by sensor | 6528 RGB images | 3 m 4 s | 51 | 5th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | FSST | fusion by sensor | 6450 RGB images | 2 m 58 s | 50 | 5th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | FSST | fusion as grid | 6450 greyscale images | 3 m 20 s | 50 | 10th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | FSST | fusion by axis | 6450 RGB images | 4 m 13 s | 50 | 5th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | WSST | fusion by sensor | 6450 RGB images | 8 m 25 s | 50 | 5th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | WSST | fusion as grid | 6450 greyscale images | 9 m 30 s | 50 | 5th iteration | 100% |
| Proposed method IMU6DoF-SST-CNN | WSST | fusion by axis | 6450 RGB images | 12 m 17 s | 50 | 5th iteration | 100% |
Table 3. Image generation efficiency comparison for fault diagnosis.

| Time Measurement Condition | Time-Frequency Method | Time Series Segment Size | Total Number of Images | Total Time in Seconds for All Iterations (Ceiling Round) | Average Time of Single Iteration in Milliseconds (Ceiling Round) |
|---|---|---|---|---|---|
| Reference method STFTx6-CNN, fusion by sensor | STFT | 128 × 6 samples | 8064 | 75.417 s | 9.353 ms |
| Reference method CWTx6-CNN, fusion by sensor | CWT | 96 × 6 samples | 8160 | 232.162 s | 28.452 ms |
| Proposed method IMU6DoF-SST-CNN, fusion by sensor | FSST | 128 × 6 samples | 8064 | 146.532 s | 18.172 ms |
| Proposed method IMU6DoF-SST-CNN, fusion as grid | FSST | 128 × 6 samples | 8064 | 142.575 s | 17.681 ms |
| Proposed method IMU6DoF-SST-CNN, fusion by axis | FSST | 128 × 6 samples | 8064 | 149.53 s | 18.543 ms |
| Proposed method IMU6DoF-SST-CNN, fusion by sensor | WSST | 128 × 6 samples | 8064 | 256.519 s | 31.82 ms |
| Proposed method IMU6DoF-SST-CNN, fusion as grid | WSST | 128 × 6 samples | 8064 | 246.754 s | 30.06 ms |
| Proposed method IMU6DoF-SST-CNN, fusion by axis | WSST | 128 × 6 samples | 8064 | 251.22 s | 31.16 ms |
