1. Introduction
Diabetes is a chronic metabolic disorder that affects millions of people worldwide [1]. It is characterized by elevated blood glucose levels resulting from either insufficient insulin production or the body’s resistance to insulin. One of the key indicators of long-term glycemic control in diabetic patients is the glycated hemoglobin (HbA1c) level, which represents the average blood glucose concentration over the past 3 months [2]. The accurate and timely measurement of HbA1c levels is essential for effective diabetes management, prevention of complications, and improvements in patients’ quality of life [3].
Traditional methods for HbA1c estimation require blood samples to be analyzed in a laboratory setting, which is invasive and inconvenient for patients [4]. This has led to a growing interest in noninvasive and continuous monitoring techniques that can be easily integrated into patients’ daily lives [5]. One promising approach is the use of wrist photoplethysmography (PPG) data, which measure the changes in blood volume in the microvascular bed of tissue via light-emitting diodes (LEDs) and a photodetector (PD). PPG data have previously been employed for various health-monitoring applications such as heart rate, SpO2, and blood pressure estimation [6,7]. In addition, recent studies on glucose or HbA1c estimation using PPG signals have been published [8,9,10,11,12,13]. One previous study used a simple Beer–Lambert-law-based model to estimate the HbA1c level in vivo [9]. Another study focused on estimations based on photon diffusion theory, considering both transmission- and reflection-type PPG signals [10]. These are mathematical, theory-based models developed to noninvasively estimate HbA1c values from fingertip PPG signals. In [11], features are extracted from the PPG signals, and machine learning algorithms (random forest and XGBoost) are employed to predict the blood glucose level from the extracted features. In [12], features are selected from the acquired PPG signals based on feature importance, and machine learning algorithms are used to estimate the glycated hemoglobin value from the selected features. It should be noted that the results in the aforementioned references [8,9,10,11,12] are based on fingertip PPG data, whereas the present study focuses on wrist PPG data with wearable applications in mind.
In this study, we present a comparative analysis of various machine learning algorithms to improve the accuracy of HbA1c estimation using wrist PPG data. We investigate the performance of the XGBoost [14], random forest (RF) [15], CatBoost [16], and LightGBM [17] algorithms on a dataset comprising 22 subjects’ PPG data. A leave-one-out cross-validation scheme was employed to validate the models [18], while a grid search technique was used to optimize the hyperparameters [19]. Each model was assessed based on its ability to predict HbA1c levels from the PPG data. Our results indicate that the machine learning algorithms achieved varying levels of success in HbA1c estimation. In addition to applying feature-importance-based selection to each machine learning algorithm, we further improved the performance by adding features extracted from the PPG signal, such as the AC-to-DC ratio (AC/DC) values at various wavelengths. In general, blood flow affects the alternating current (AC) component of the PPG signal, while tissue properties and motion artifacts affect the direct current (DC) component. The findings of this study suggest that wrist PPG data combined with advanced machine learning techniques have the potential to provide a noninvasive and convenient alternative to laboratory-based HbA1c estimation, paving the way for more accessible diabetes monitoring and management.
The contributions of this study can be summarized as follows.
(1) We provide a thorough comparison of different machine learning techniques for estimating HbA1c using wrist PPG data. Here, we evaluate the relative performance of some of the most well-known algorithms: XGBoost, RF, LightGBM, and CatBoost.
(2) To reduce the heavy dependence of existing feature sets on PPG waveform characteristics, we propose new features extracted from PPG signals, such as AC/DC values at various wavelengths, and demonstrate the resulting performance improvements.
(3) We improve performance through feature-importance-based selection, and we compare and analyze performance across combinations of RGB wavelengths.
(4) To analyze the performance of HbA1c estimation with machine learning algorithms, we compare several regression algorithms and provide the groundwork for implementing the results in wrist PPG-based hardware for completely noninvasive HbA1c estimation.
2. Methodology
The wrist PPG signal data were acquired using a TMD3719-based [20] prototype with a photodetector (PD) and a low-power white LED, sensed at three wavelengths (465 nm, 525 nm, and 615 nm). Under the direct supervision of the institutional review board (IRB) of Kookmin University (IRB protocol number: KMU-202111-BR-286), Seoul, Republic of Korea, all of the data used to evaluate the experimental results were collected from 22 subjects. The raw data, recorded for 2 min at a rate of 24 samples per second, were then passed through the filter. The PPG signal was segmented into 3 s intervals, yielding approximately 72 samples per interval. The segmented PPG signal was then fed through the system’s feature extraction module. We then conducted an experiment using the 15 features from the PPG signal as in [12], and also conducted an experiment using the dominant features selected based on feature importance. Because the number of data samples was relatively small, the hyperparameters of all regression algorithms were tuned to prevent overfitting, and the leave-one-out cross-validation (LOOCV) method was used for training each model. According to this method, one data point is left out from the dataset and the model is trained on the remaining data points. The left-out data point is then used as a validation set to test the model’s performance. This process is repeated for each data point in the dataset.
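The LOOCV procedure described above can be sketched as follows. This is a minimal illustration, not the study's code: the synthetic data, the helper name `loocv_predictions`, and the choice of `RandomForestRegressor` with these settings are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut


def loocv_predictions(X, y, make_model):
    """Return one held-out prediction per sample using leave-one-out CV.

    `make_model` is a hypothetical factory that returns a fresh, unfitted
    regressor, so that no fold sees parameters learned in another fold.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = make_model()                           # fresh model per fold
        model.fit(X[train_idx], y[train_idx])          # train on all but one point
        preds[test_idx] = model.predict(X[test_idx])   # predict the left-out point
    return preds


# Synthetic stand-in data (NOT the study's dataset): 22 samples, 5 features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(22, 5))
y_demo = 5.5 + 0.8 * X_demo[:, 0] + rng.normal(scale=0.1, size=22)

loocv_preds = loocv_predictions(
    X_demo, y_demo,
    lambda: RandomForestRegressor(n_estimators=50, random_state=0),
)
```

Each entry of `loocv_preds` comes from a model that never saw the corresponding sample, so the whole vector can be compared against the reference values with any regression metric.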
Figure 1 shows a block diagram of the entire proposed system.
2.1. Hardware Device
The TMD3719 color sensor module is used in this study for color (RGB) sensing. The color sensor has three different filters on top of the sensor die: blue (465 nm), green (525 nm), and red (615 nm). An ESP32 microcontroller was used to communicate with the TMD3719 module. We used a single white LED (CLM3C-WKW) as the light source, instead of three high-intensity light sources of different wavelengths.
Figure 2 represents the block diagram of the proposed hardware system.
Figure 3 presents a detailed microcontroller unit (MCU) peripheral circuit diagram for the MCU part in the block diagram of Figure 2. Finally, Figure 4 represents the structure diagram of the device.
2.2. Regression Models
Four regression models were fit and tested in this study. These four algorithms were chosen because of their strong performance in similar prediction tasks [21,22]. They are also known for their robustness and ability to capture complex patterns in data [23]. All regression algorithms were implemented in the Python programming language using the scikit-learn library. The possibility of overfitting was addressed through carefully chosen hyperparameters, and LOOCV was used for training each model. The regression models used in this study are described below.
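One way to combine grid search with LOOCV in scikit-learn is sketched below. The parameter grid, the synthetic data, and the use of `RandomForestRegressor` as the example estimator are assumptions for illustration; the source does not list the grids that were actually searched. Mean squared error is used for scoring because R² is undefined on a single held-out sample.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, LeaveOneOut

# Synthetic stand-in data (NOT the study's dataset).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(22, 4))
y_demo = 6.0 + 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=22)

# Hypothetical, deliberately small grid; shallow trees limit model complexity.
param_grid = {"n_estimators": [20, 50], "max_depth": [2, 4]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=LeaveOneOut(),                   # one fold per data point
    scoring="neg_mean_squared_error",   # valid with single-sample folds
)
search.fit(X_demo, y_demo)
best_params = search.best_params_
```

The same pattern should apply to any regressor that follows the scikit-learn estimator interface, with the grid adapted to that model's own hyperparameters.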
2.2.1. XGBoost
The XGBoost regressor is widely known as a powerful boosting algorithm. In this approach, decision trees are constructed in sequence, and each new tree places additional emphasis on the cases that the preceding trees predicted poorly. In the end, the sum of these individual regressors yields a reliable and accurate model [14]. Although XGBoost is capable of high performance, it frequently requires careful parameter tuning to prevent overfitting and optimize the algorithm.
2.2.2. Random Forest (RF)
Random forest fits a number of decision trees, each on a subsample drawn from the dataset by sampling with replacement, so that every subsample remains the same size as the original input. The predictions of the trees are then averaged to prevent overfitting and improve prediction accuracy [15]. However, the model output is difficult to interpret, and the method is less effective in terms of speed and accuracy when compared to boosting algorithms.
2.2.3. CatBoost
CatBoost generates “oblivious trees”, with the constraint that all nodes at the same level must test the same predictor under the same condition, so that the index of a leaf can be computed using bitwise operations. The tree structure serves as a regularizer to find the optimal solution while avoiding overfitting, and the oblivious tree technique provides a simple fitting strategy and excellent CPU efficiency [16]. However, it may take a longer time to train.
2.2.4. LightGBM
When generating decision trees, LightGBM uses an expansion strategy known as leaf-wise expansion. That is, at each step only the leaf with the largest gain is split. The training procedure for the standard gradient boosting decision tree can be sped up with the help of LightGBM [17]. Although LightGBM tends to overfit on small or noisy datasets, this risk can be mitigated through the cautious selection and tuning of hyperparameters. Given the size of our dataset, we were able to obtain a good balance between model complexity and overfitting risk using the tuned parameters. The effectiveness of our model indicates that, even with a limited dataset, LightGBM can be a useful tool for predicting HbA1c from wrist PPG signals when configured properly.
2.3. Dataset Description
In this study, PPG-based data from 22 subjects were used along with the variables BMI, SpO2, and HbA1c. Statistics of the dataset used are shown in Table 1. For the reference values, we measured the HbA1c and SpO2 values of the subjects using an invasive SD Biosensor F200 analyzer [24] and an MD300C26 fingertip pulse oximeter device [25], respectively.
2.4. PPG Signal Processing
The raw PPG signal was passed through two-stage filtering to remove high- and low-frequency noise. A second-order Butterworth low-pass filter (LPF) with an 8 Hz cutoff frequency was used to eliminate the high-frequency noise. The DC component is easily obtained by averaging the LPF output signal. After that, a second-order high-pass filter with a cutoff frequency of 0.5 Hz was used to remove the DC and respiratory components (less than 0.33 Hz). Baseline drift removal was also performed to keep the DC values constant. The AC and DC values in the PPG signal represent the pulsatile and the baseline (or static) components of the PPG signal, respectively.
Figure 5 shows the AC and DC values of a typical PPG signal. The DC value can be defined as the average of the valley values (DC1) or as the average of the intermediate values (DC2) between a peak value and a valley value.
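The two-stage filtering can be sketched with SciPy as follows. The sampling rate, filter orders, and cutoff frequencies are taken from the text; the use of zero-phase filtering (`filtfilt`), the Butterworth design for the high-pass stage, the function name `split_ac_dc`, and the synthetic test signal are assumptions made for this illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 24.0  # sampling rate in Hz, as stated in the text


def split_ac_dc(raw, fs=FS):
    """Return (pulsatile signal, DC level) after two-stage filtering."""
    raw = np.asarray(raw, dtype=float)

    # Stage 1: second-order Butterworth low-pass at 8 Hz removes
    # high-frequency noise.
    b_lp, a_lp = butter(2, 8.0, btype="low", fs=fs)
    smoothed = filtfilt(b_lp, a_lp, raw)

    # The DC level is taken as the mean of the low-pass output.
    dc_level = float(np.mean(smoothed))

    # Stage 2: second-order high-pass at 0.5 Hz removes the DC offset and
    # slow (respiratory) components. Butterworth design is assumed here.
    b_hp, a_hp = butter(2, 0.5, btype="high", fs=fs)
    pulsatile = filtfilt(b_hp, a_hp, smoothed)
    return pulsatile, dc_level


# Synthetic 2 min signal: baseline + 1.2 Hz pulse + 0.1 Hz drift.
t = np.arange(int(120 * FS)) / FS
raw_demo = (1000.0
            + 5.0 * np.sin(2 * np.pi * 1.2 * t)
            + 20.0 * np.sin(2 * np.pi * 0.1 * t))
ac_demo, dc_demo = split_ac_dc(raw_demo)
```

In this synthetic case the slow 0.1 Hz drift is strongly attenuated while the 1.2 Hz pulse passes almost unchanged, which is the behavior the two cutoffs are meant to produce.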
2.5. Correct Peak and Valley Detection for Determining AC and DC Value from PPG Signal
To obtain the AC and DC components of a PPG signal accurately, the peaks and valleys of the signal must be detected correctly, and we built an algorithm for this purpose. Pseudocode for determining the AC and DC values from PPG signals is shown as Algorithm 1.
Algorithm 1: Pseudocode for determining the AC and DC values from the PPG signal.
In Algorithm 1, the function calculates the AC and DC components of a given input signal. The function loops through each value in the input signal and updates the maximum and minimum values. Finally, the function computes the AC value as the difference between the average of the maxima and the average of the minima, and the DC value as the average of the midpoint values, and returns these values.
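A minimal Python sketch of this kind of computation is given below. It is not a reproduction of Algorithm 1: it uses SciPy's `find_peaks` for peak and valley detection, and the function name `ac_dc_from_extrema` and the minimum beat spacing `min_beat_s` are hypothetical. The DC value is computed as the midpoint between the mean peak and the mean valley, which corresponds to the DC2 definition and is a simplification when the numbers of peaks and valleys differ.

```python
import numpy as np
from scipy.signal import find_peaks


def ac_dc_from_extrema(signal, fs=24.0, min_beat_s=0.4):
    """Estimate (AC, DC, AC/DC) from the peaks and valleys of one segment."""
    x = np.asarray(signal, dtype=float)
    min_gap = max(1, int(min_beat_s * fs))         # min samples between beats
    peaks, _ = find_peaks(x, distance=min_gap)     # local maxima
    valleys, _ = find_peaks(-x, distance=min_gap)  # local minima
    if len(peaks) == 0 or len(valleys) == 0:
        raise ValueError("no peaks or valleys found in the segment")

    mean_peak = x[peaks].mean()
    mean_valley = x[valleys].mean()
    ac = mean_peak - mean_valley           # pulsatile amplitude
    dc = 0.5 * (mean_peak + mean_valley)   # midpoint level (DC2-style)
    return ac, dc, ac / dc


# Synthetic 3 s segment at 24 samples/s: baseline 1000, 1 Hz pulse, amplitude 5.
t = np.arange(72) / 24.0
segment_demo = 1000.0 + 5.0 * np.sin(2 * np.pi * 1.0 * t)
ac_val, dc_val, ratio_val = ac_dc_from_extrema(segment_demo)
```

Calling the function once per wavelength channel would give one AC/DC ratio per color for each segment.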
2.6. Feature Extraction
Physiological attributes derived from the PPG signal, together with physical parameters, can be regarded as features. Fifteen significant and distinctive features were extracted from the PPG signal using the time series feature extraction library (TSFEL) [11,12]. The 15 features are the zero-crossing rate (ZCR); autocorrelation (ACR); kurtosis (kurt), variance (var), and mean of the power spectral density (PSD); kurtosis (kurt), variance (var), mean, and skewness (skew) of the Kaiser–Teager energy (KTE); kurtosis (kurt) and skewness (skew) of the spectral analysis (spec); mean of the wavelet analysis; autoregressive (AR) coefficients; skewness; and sum of absolute difference (SAD). Among them, ZCR, ACR, and SAD are temporal features of the PPG signal, while the mean of the wavelet analysis and PSD are spectral features. In addition, two other demographic features were considered: BMI and SpO2. The final feature vector can be obtained using Equation (1) for each frame of the PPG signal [12].
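To make the per-frame feature idea concrete, the sketch below segments a signal into 3 s frames (72 samples at 24 samples per second) and computes three of the simpler features by hand. It is an illustration only: the definitions are plain NumPy formulas written for this example and are not claimed to match TSFEL's implementations, and the names `segment` and `frame_features` are hypothetical.

```python
import numpy as np


def segment(signal, fs=24, window_s=3):
    """Split a 1-D signal into non-overlapping frames of fs * window_s samples."""
    x = np.asarray(signal, dtype=float)
    n = int(fs * window_s)
    n_frames = len(x) // n                 # drop any incomplete final frame
    return x[: n_frames * n].reshape(n_frames, n)


def frame_features(frame):
    """Compute a few simple temporal/statistical features for one frame."""
    centered = frame - frame.mean()
    signs = np.sign(centered)
    # Count sign changes of the mean-removed signal.
    zero_crossings = int(np.sum(signs[:-1] * signs[1:] < 0))
    # Sum of absolute differences between consecutive samples.
    sad = float(np.sum(np.abs(np.diff(frame))))
    # Skewness as the third standardized moment.
    std = centered.std()
    skew = float(np.mean(centered ** 3) / std ** 3) if std > 0 else 0.0
    return {"zcr": zero_crossings, "sad": sad, "skew": skew}


# Synthetic 2 min, 1 Hz signal at 24 samples/s (NOT real PPG data).
demo_signal = np.sin(2 * np.pi * 1.0 * np.arange(24 * 120) / 24.0)
frames = segment(demo_signal)
features = [frame_features(f) for f in frames]
```

A 2 min recording yields 40 such frames, each contributing one feature vector.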
In particular, in this study, AC/DC values at different wavelengths calculated from the PPG signals were considered as additional features to improve HbA1c estimation performance. The feature selection procedure is primarily based on feature importance. We used different techniques to determine feature importance (Gini for random forest, the gain-based method for XGBoost and CatBoost, and both split- and gain-based methods for LightGBM), and further improved the models by selecting the most significant features according to these importance metrics.
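Importance-based selection of this kind can be sketched with scikit-learn's impurity-based importances for a random forest. The helper name `top_k_features`, the feature names, the value of `k`, and the synthetic data are all assumptions for illustration; the source does not state how many features were retained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def top_k_features(X, y, names, k=5, seed=0):
    """Rank features by impurity-based importance and keep the top k names."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    # feature_importances_ sums to 1; larger means more important.
    order = np.argsort(model.feature_importances_)[::-1]
    return [names[i] for i in order[:k]]


# Synthetic stand-in data: only feature "f3" actually drives the target.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(60, 8))
y_demo = 3.0 * X_demo[:, 3] + rng.normal(scale=0.1, size=60)
feature_names = [f"f{i}" for i in range(8)]  # hypothetical names

selected = top_k_features(X_demo, y_demo, feature_names, k=3)
```

The model would then be refit on the selected columns only, and its LOOCV performance compared with that of the full feature set.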
2.7. AC/DC Value as a Feature for Various Wavelengths
The AC/DC value represents the ratio of the pulsatile (i.e., AC) to the baseline or static (i.e., DC) component of the PPG signal. There could be an underlying relation between the AC/DC value and HbA1c. However, there are few studies on this issue. One study demonstrated a proportional relationship between the AC/DC value and glucose [26]. Another study showed a correlation between blood glucose levels and the AC/DC values of PPG signals [27]. Although there is not yet sufficient evidence to support a direct relationship between the AC/DC value and HbA1c levels, we obtained significantly better results using the AC/DC value as a feature.
Figure 6 shows AC/DC values versus HbA1c for the 22 subjects. For the green wavelength, the AC/DC value is generally larger than for the other two wavelengths (blue and red), and the results for all three wavelengths tend to increase roughly with the HbA1c value.
2.8. Importance-Based Feature Selection
In reference [12], feature-importance-based selection was performed on 15 features obtained from the PPG signal. In this study, by contrast, 15 features were considered for each of the three wavelengths, and feature selection was performed over a total of 47 features, including the external features (BMI and SpO2). The corresponding feature importance plots for each of the four machine learning algorithms can be seen in Figure 7. With the AC/DC value at each wavelength included as an additional feature, the feature importance plots for the total of 50 features are shown in Figure 8. To illustrate the efficacy of the AC/DC values for the three wavelengths as new features, their feature importance was compared with that of the 47 existing features. As shown in Figure 8, the AC/DC values are superior in importance and allow the HbA1c to be estimated more accurately.
The performance of all regression models was significantly improved using these new AC/DC features. Results are described in Section 3.2.
4. Conclusions
In this study, we presented an efficient and noninvasive HbA1c measurement system based on machine learning algorithms that utilizes the wristband PPG signal as well as physical parameters such as BMI and SpO2. We employed several regression models (RF, XGBoost, LightGBM, and CatBoost) to estimate HbA1c levels for a PPG-based dataset of 22 subjects, and our results demonstrated that the inclusion of PPG-based features such as the AC/DC values of three wavelengths significantly improved the accuracy of the models. For the RGB combination, Pearson’s r improved from 0.803 to 0.914, 0.796 to 0.904, 0.822 to 0.925, and 0.766 to 0.917 for the RF, XGBoost, LightGBM, and CatBoost algorithms, respectively, after the AC/DC values were included as features. We also showed that feature-importance-based selection can improve performance while reducing computational complexity. As shown in Table 5, Pearson’s r improved slightly overall after feature-importance-based selection removed the redundant features. The results of various error metrics, including Pearson’s r, indicate that the LightGBM algorithm outperforms the other algorithms in terms of both accuracy and predictive power. The LightGBM algorithm achieved the lowest MSE of 0.061 and RMSE of 0.246, and also achieved the highest score of 0.881. Finally, EGA and B&A analyses were performed to verify the clinical safety of the proposed noninvasive HbA1c estimation method. Consistent with the Pearson’s r performance in Table 5 and the evaluation metrics in Table 7, the LightGBM algorithm showed the best performance; in this case, the area accuracy of the EGA plot was 100% and 0% for zone A and zone B, respectively. By calculating the 95% limits of agreement through B&A analysis, we showed that all four algorithms have comparatively small limits of agreement, with LightGBM’s limit of agreement being the lowest at 0.49.
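For readers who wish to reproduce this type of evaluation, the sketch below computes Pearson’s r, MSE, RMSE, and Bland–Altman style 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The function name `agreement_metrics` and the four example values are hypothetical and are not data from this study; the exact conventions used in the paper's analysis are not restated here, so this is an assumption-laden illustration.

```python
import numpy as np


def agreement_metrics(reference, estimate):
    """Return correlation, error, and limits-of-agreement statistics."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    diff = est - ref
    mse = float(np.mean(diff ** 2))
    bias = float(np.mean(diff))          # mean difference
    sd = float(np.std(diff, ddof=1))     # sample SD of the differences
    return {
        "pearson_r": float(np.corrcoef(ref, est)[0, 1]),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,   # lower 95% limit of agreement
        "loa_upper": bias + 1.96 * sd,   # upper 95% limit of agreement
    }


# Hypothetical reference and estimated values (illustration only).
ref_demo = [5.0, 6.0, 7.0, 8.0]
est_demo = [5.1, 5.9, 7.1, 7.9]
metrics = agreement_metrics(ref_demo, est_demo)
```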
Our findings in this study suggest that the proposed noninvasive HbA1c measurement system has the potential to provide accurate and reliable measurements of HbA1c levels, which may have significant clinical implications for diabetic patients. Although our study provides a promising proof-of-concept, further validation with a larger sample size is needed to fully evaluate the performance of the proposed system.