
Informatics in Medicine Unlocked 26 (2021) 100717

Contents lists available at ScienceDirect

Informatics in Medicine Unlocked


journal homepage: www.elsevier.com/locate/imu

Research on feature mining algorithm and disease diagnosis of pulse signal based on piezoelectric sensor☆,☆☆
Fan Lin a, b, Jincheng Zhang a, *, Zhongmin Wang a, b, Xiaokang Zhang a, Ruiling Yao a, Yan Li a
a School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an, Shaanxi, 710121, China
b Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an, Shaanxi, 710121, China

ARTICLE INFO

Keywords:
Pulse diagnosis
Pathological feature mining
Physiological signal feature
Pulse wave analysis

ABSTRACT

The human pulse contains various information reflecting the internal environment of the human body. However, the classical method of pulse diagnosis in traditional Chinese medicine (TCM) relies too heavily on the doctor’s experience, and its diagnosis results are too subjective. Based on the principle of TCM pulse diagnosis, using photoelectric sensors to collect the pulse signals of multiple healthy people and patients with chronic diseases, organizing the detailed pulse information into a data set, and analyzing it with algorithms is a way to overcome this problem through modern technology. However, with this method it is still difficult to understand the patient’s physiological condition in detail, and it is also difficult to explain the internal connection between abnormal pulse conditions and the underlying physiological conditions. In the experiment, after denoising, smoothing, and eliminating the baseline drift of the subjects’ pulse data, we designed two algorithms to describe the difference between the two-dimensional images of the pulse data of normal people and patients with chronic diseases. The specific feature values obtained are converted into a multi-dimensional array and used to train a support vector machine (SVM) classifier. The classification accuracy is higher than that obtained with the basic temporal features. Experimental results show that it is feasible to use specific feature mining algorithms for disease detection. Through analysis, this paper found the pathological characteristics reflected in the two-dimensional pulse image, discovered the internal connection between the pulse waveform characteristics of the human body and the disease, and tried to describe it through algorithms, in an attempt to establish a method for detecting specific diseases using photoelectric signals.

1. Introduction

Pulse diagnosis is a very common physical diagnosis method in TCM, that is, the use of fingers to press the pulse to diagnose human diseases. Before the popularization of modern medicine, wrist pulse diagnosis was always the main method of diagnosing diseases in TCM [1–3]. However, diagnosing the pulse condition only by doctors’ fingers is very subjective, so it is not easy to confirm the clinical holistic diagnosis (CHD) of TCM systematically [4–8]. Pulse signals can be effectively used to analyze a person’s health status and reflect the pathological changes of a person’s physical condition [9]. As an important source for health status evaluation, the wrist pulse signal contains important information about the status of the human body; fitting a bi-modal Gaussian model to the data and extracting spatial features from the fit can, to some extent, effectively diagnose diseases in a short time and at a low cost [10]. However, in the practical implementation of clinical diagnosis, the accuracy of pulse diagnosis depends heavily on the practitioner’s skills and experience. Different practitioners may not give identical results for the same patient [11,12]. The lack of high-precision testing equipment and of the recording and analysis of pulse data are inherent shortcomings of ancient Chinese medicine, and its clinical diagnosis results are also difficult to verify.

At present, the use of electronic equipment to detect pulse signals is the main way to combine modern TCM with computers. Moreover, the pulse signal can also reflect the physiological characteristics of the subject [13,14]. The research of pulse diagnosis in TCM is to analyze the


This document is the results of the research project funded by the National Science Foundation.☆☆ This note has no numbers. In this work we demonstrate ab the
formation Y_1 of a new type of polariton on the interface between a cuprous oxide slab and a polystyrene micro-sphere placed on the slab.
* Corresponding author.
E-mail addresses: [email protected] (F. Lin), [email protected] (J. Zhang), [email protected] (Z. Wang), [email protected] (X. Zhang),
[email protected] (R. Yao), [email protected] (Y. Li).

https://doi.org/10.1016/j.imu.2021.100717
Received 3 July 2021; Received in revised form 16 August 2021; Accepted 22 August 2021
Available online 7 October 2021
2352-9148/© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
F. Lin et al. Informatics in Medicine Unlocked 26 (2021) 100717

pulse of the subject to judge the state of the patient. Using sensors and computers to obtain more detailed information than the experience of TCM practitioners is a technically feasible method, and machine learning techniques are exploited to analyze health conditions based on the acquired pulse signals [15]. Because the pulse signal is a biological signal with great clinical diagnostic value, many sensors have been used for pulse signal acquisition, including pressure (selected in this paper), photoelectric, electric pulse, and ultrasonic sensors [16–19]. Recently, the use of effective signal processing technology to process the pulse pressure signal of the wrist has become a research hotspot. Effective methods for pulse signal analysis include the statistical analysis of wrist pulse signals, multi-feature fusion, temporal and spatial feature extraction, and independent factor (composite factor) analysis and prediction [20–24]. The basic idea of pulse wave analysis (PWA) is to use sensor equipment to collect human pulse biological information accurately and without interference [25,26]. The waveform of these data can be used to extract the physiological characteristics of the test set. PWA can not only identify the physiological characteristics of individuals but can also be used to analyze the differences between individuals caused by specific diseases [13,21]. PWA allows quantifying the changes in vascular impedance caused by arterial stiffness and/or endothelial dysfunction in patients with cardiovascular disease. For example, a pulse oximeter sensor based on organic materials is used to estimate the risk of disease by analyzing the photoelectric changes of the blood vessels in the fingertips [27]. Alternative indicators of arterial stiffness and peripheral arterial resistance [28,29] are variables related to the prognosis of cardiovascular disease and can be determined by PWA. With technological innovation, the accuracy and data volume of piezoelectric sensors will increase, making the results more reliable. One team uses a two-dimensional sensor built from a piezoelectric array to collect pulse signals, which can perform omnidirectional and high-efficiency analysis of deformation. Real-time measurement can be achieved through a fast three-dimensional digital image correlation (3D-DIC) method and finally applied to the PWA method [30,31]. In the field of disease recognition, there is evidence that PWA, using pulse time-domain feature data images and pulse information extracted by detecting the physiological state of the human body, can achieve high accuracy with the SVM classification model in disease classification research, and can serve as a non-invasive, reliable evaluation method for cardiovascular disease [32]. However, when this model is used for disease identification, the existing research does not take into account the interference caused by individual physiological differences. Because each person has different physiological characteristics, the pulse waveform also differs with age, gender, and activity [33]. For example, indices such as the augmentation index (AIx) and augmented pressure (AP) extracted from the pulse wave of the wrist or proximal right carotid artery differed between genders and ages [34–36]. Due to the influence of many factors, it is difficult to define the extracted features as disease features. This research takes these factors into account in the classification algorithm by comparing each sample after feature extraction, and proposes corresponding solutions.

Therefore, this article first analyzes the characteristics of the data graph generated by the data set. The two pathological feature extraction methods designed in this paper are the stability index and the three-peaks index. These two sets of features are extracted by two algorithms. The pathological characteristics of the subject’s pulse image are reflected in the two sets of data, so the two sets of indices are used for classifier identification. The SVM classification algorithm is applied to the extracted features, and finally the detection of human diseases can be realized while allowing for individual physiological differences. With this experimental goal, we also tried to use photoelectric sensors to measure the pulse signal in pulse diagnosis and to explore the internal relationship between the human pulse waveform distribution and pathological characteristics.

2. Data processing and feature extraction of chronic diseases

2.1. Pulse wave decomposition

The research of this paper mainly uses photoelectric sensors for measurement, and the measured two-dimensional pulse change image shows a typical peak and trough structure (as shown in Fig. 1). After we decompose each part of the subject’s image, we can obtain the first-level feature values that distinguish the individual.

As Fig. 1 shows, the pulse wave images of most people have a structure of three peaks (red circles in the picture) and three troughs (red triangles in the picture), from which several characteristic indexes can be extracted, such as the peak value of the first wave peak (h1), the peak value of the second wave peak (h2), the peak value of the third wave peak (h3), the interval between h1 and h2 (Δta), the interval between h2 and h3 (Δtb), and the time points of the three troughs (t1, t2, t3). Fig. 1(b) is an example of the characteristic decomposition of a human wrist pulse image, but some people’s pulse wave images do not fully conform to the above because of different measurement methods and physiological status. A few examples are decreased vascular elasticity due to advanced age or illness, patients with arrhythmia or weak heartbeat, and patients with some diseases affecting the circulatory system. By comparing the waveforms of patients, it is easy to find the interference in the waveforms and classify them in a targeted way.

However, according to our observations on a large amount of data, not all the individual differences of subjects identified by this method are due to diseases. Most of the differences in pulse characteristics are due to the individual’s physiological characteristics, and the degree to which they affect the characteristic values is often greater than that of pathological features; these differences make individual pathological features more difficult to find. Therefore, due to the different physiological characteristics of the subjects, the segmentation algorithm should consider the factors that affect the shape of the pulse waveform. Because the feature extraction algorithm of the experiment is designed based on the subject’s two-dimensional waveform image data, individual physiological differences and the type of sensor used will affect the accuracy of the final classification result in the process of disease recognition, so these factors must be considered before experimenting. Under this premise, we can distinguish which factors have caused the difference in pulse waveform between patients with the disease and normal people.

As the subject ages, the elasticity of the blood vessels gradually decreases. This is reflected in the elderly subject’s pulse image, where the boundary between the systolic and diastolic phases of the pulse also becomes inconspicuous [28,33]. In addition, according to the information reflected in our data, the heartbeat difference between the sexes is also reflected in different pulse waveform states. Females’ hearts are more inclined to “beat”, and males’ are more inclined to “systole.” Specifically, the ratio of systolic and diastolic blood pressure in the pulse image of women to the entire pulse cycle is lower than that of men [32,34,35].

An important finding in this study and in the existing literature is that, in the pulse cycle, the ratio of systolic to diastolic phase is much higher in men than in women, which makes it necessary to consider some physiological differences in pulse diagnosis. The individual physiological differences caused by gender and age are easier to find. These factors are mainly reflected in the intensity of the pulse waveform and the shape of the systolic waveform. As Fig. 2(a) illustrates, the intensity of men’s pulse waveforms is usually higher than women’s when other factors are the same. Similarly, the pulse profile of the diastolic phase of young people is much clearer than that of old people (because the elasticity of young people’s blood vessels can cause the change of the electric pulse waveform); that is, the diastolic phase of the elderly’s pulse waveforms lacks an easily observable vasoconstriction point, as Fig. 2(b) illustrates.

Moreover, acupuncture points are also a factor, and different


Fig. 1. (a) Signal image difference between piezoelectric (upper) and photoelectric sensor (lower) (b) Characteristic value of the typical pulse period measured by
the photoelectric sensor (after wavelet denoising).
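The characteristic values labelled in Fig. 1(b) (h1 to h3, Δta, Δtb, and t1 to t3) could be read off a single denoised period roughly as follows. This is only a minimal sketch under stated assumptions: the helper names `local_extrema` and `period_features`, the strict local-extremum rule, and the toy samples are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def local_extrema(y: List[float]) -> Dict[str, List[int]]:
    """Return indices of strict local maxima and minima of a sampled curve."""
    peaks, troughs = [], []
    for k in range(1, len(y) - 1):
        if y[k - 1] < y[k] > y[k + 1]:
            peaks.append(k)
        elif y[k - 1] > y[k] < y[k + 1]:
            troughs.append(k)
    return {"peaks": peaks, "troughs": troughs}


def period_features(y: List[float], dt_ms: float) -> Dict[str, float]:
    """Collect h1..h3, the peak-to-peak intervals and trough times for one period.

    Assumes the period is already denoised and holds at least three peaks and
    three troughs; returns an empty dict otherwise.
    """
    ext = local_extrema(y)
    peaks, troughs = ext["peaks"][:3], ext["troughs"][:3]
    if len(peaks) < 3 or len(troughs) < 3:
        return {}
    return {
        "h1": y[peaks[0]], "h2": y[peaks[1]], "h3": y[peaks[2]],
        "dt_a": (peaks[1] - peaks[0]) * dt_ms,   # interval between h1 and h2
        "dt_b": (peaks[2] - peaks[1]) * dt_ms,   # interval between h2 and h3
        "t1": troughs[0] * dt_ms,
        "t2": troughs[1] * dt_ms,
        "t3": troughs[2] * dt_ms,
    }


# Toy period with three peaks and three troughs (hypothetical values).
toy = [0, 5, 9, 6, 3, 4, 6, 4, 2, 3, 4, 2, 1, 2]
features = period_features(toy, dt_ms=10.0)
```

Periods that do not show the full three-peak structure (for example those of elderly subjects, as discussed in the text) yield an empty result here, so they can be handled separately rather than silently mislabelled.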

acupuncture points reflect different images. In the pulse diagnosis theory of TCM, the acupoints on the human wrist are usually divided into three points: Cun, Guan, and Chi, reflecting the signals conducted by different organs of the human body and corresponding to three different pulse waveforms. In the measurement with the piezoelectric sensor, the three acupoints are measured separately to obtain three different pulse waveform images (see Fig. 3).

Fig. 2. The main individual differences of human pulse images found according to existing medical research. (a) The difference in waveform intensity between subjects of different gender. (b) Pulse waveform shape at different ages (dicrotic notch), with the old man on the left and the young man on the right; the elderly subjects lack the trough of vasoconstriction in their pulse waveforms.

2.2. Research and experiment steps

There is already existing research on extracting physiological information features and using machine learning algorithms for disease diagnosis. However, these studies often lack a clear process, lack analysis of pulse graph data, and lack algorithms that can turn pathological differences into values. Although they dig into disease characteristics in much greater depth and breadth than TCM practitioners, they are still very limited. Moreover, the feature value extracted from the signal is limited to single feature data, and it is impossible to quantify the characteristics of the pulse data reflected in the two-dimensional image and the rate of change of its waveform. Therefore, it is impossible to dig deeper into pathological information and use those data for disease diagnosis, and the accuracy is also very limited.

Usually, the pulse signal is collected by a sensor. Because the pulse signal can reflect the physiological characteristics of the subject, the data returned by the sensor, that is, the measurement of the human pulse, can be used as the basis for assessing the physiological characteristics of the subject and for diagnosing disease. In the theory of TCM, different acupoints reflect the pulse signals of different organs of the human body. There are also differences in the shape and characteristic values of the pulse between different acupoints of the same person.

Fig. 4 shows the standard process for this experiment, the core of which is data collection and processing. Depending on their health status, the subjects were divided into normal people and pancreatitis patients. After removing the baseline of the recorded data set, we extracted from the collected pulse data the features required for the algorithms, which must show a certain distinction between the two kinds of subjects. We designed two feature extraction schemes to calculate the results of the pulse dataset, then calculated and grouped all pulse wave data for use in SVM classification.

While measuring, controlling measurement errors is a step that cannot be ignored. For instance, during the measurement, the subject’s wrist may move or vibrate, which may lead to abnormal oscillation of the pulse waveform. A more systematic source of error is that the data analysis of different subjects needs to consider factors such as measurement location, temperature, subject status, and equipment interference, so that horizontal comparisons are performed under a unified standard. The key to the experiment is how to extract the features that can distinguish the two pulse patterns, and how to design appropriate algorithms to make these features more distinguishable.

2.3. Preparation and characterization of the dataset

2.3.1. Data acquisition

This article uses piezoelectric sensors and embedded devices to measure changes in arterial blood flow, connects them to a computer, records the pulse data in one-dimensional form, and then samples these data at regular intervals and converts them into data sets.

In order to verify our experimental method, we selected 57 subjects (29 normal people and 28 patients) of different ages and physiques for pulse wave data collection. Take 25 effective pulse periods as a group for


single pulse measurement, and record 5–25 groups for each patient with chronic diseases. Similarly, we chose healthy people and recorded 4–20 groups of pulse wave data in the same way (the number of groups is less than the number of patients).

Fig. 3. Examples of data acquisition instruments and measurement methods (above). The three acupoints of the human pulse and the corresponding physiological information in the theory of TCM (bottom).

2.3.2. Pulse waves processing

Before the data analysis step, the required pulse wave image records must be processed before they can be converted into a usable data set; that is, the pulse baseline changes caused by the measurement or instrument algorithm are removed, the noise in the original waveform data is removed, and the pulse wave curve is extracted.

The principle of the baseline drift elimination algorithm is to calculate the average period of the pulse of the subject. After the pulse data is made into a two-dimensional graph, a point is selected at the peak position of the first pulse image and its height is obtained; for the other pulses, the difference at that point of the period is calculated, and this value is used to move the entire pulse period vertically.

We overlap the pulse images of each subject over multiple cycles and store them in the same file as a sample for analysis. Therefore, we can analyze the pulse image of each subject more accurately. Secondly, when the data is integrated, the entire data model is easier to visualize, so it is easier to observe the pulse distribution of each subject, making the comparison between the normal group and the patient group clearer.

Based on a comparative analysis of massive data, this article makes a preliminary comparison between the two groups with the largest differences (pancreatitis patients and normal people) as test samples for SVM classification.

2.3.3. Data images analysis

Using MATLAB software, all the collected one-dimensional data are plotted as charts in two-dimensional coordinates, and the pulse function of each cycle is superimposed, so that the distribution range of the pulse function image can be directly observed.

We selected 30 groups of subjects (evenly distributed in age and balanced in gender) with different numbers of pulse wave data and divided them into groups according to disease. After processing these data according to Fig. 4, all pulse images are grouped and clustered.

As a control group, first, by transforming the pulse data set into a chart, we can observe the pulse changes of the two kinds of people in the most intuitive way. By superimposing all pulse wave images of a subject, we can clearly observe the range of the subject’s pulse waves in the form of a two-dimensional image and analyze the difference between pancreatitis patients and normal people. Second, in the abscissa range of 30 to 60, there are two very similar peaks in the pulse wave images of patients with pancreatitis, whereas normal people have only one higher peak.

The original data need to be screened for anomalies using the stability algorithm (refer to Equation (1)); that is, cycles whose stability value is significantly greater than the mean value are screened out. It must also be ensured that the corresponding period has been trimmed; the principle is to find the fixed reference point of each cycle (as shown in Fig. 5(a)). By superimposing the multi-cycle pulse image signals, it is convenient to observe the pulse data distribution of this type of subject and find the unique pathological characteristics of patients with chronic diseases. If the periodic distribution of the pulse image is disordered, the pulse image that deviates from the reference value can be manually adjusted horizontally and aligned with the other periodic images at the peak position.

The superimposed pulse waves of sixteen groups of normal people, from four to twenty groups (Fig. 6(a)), have higher stability than those in Fig. 6(b). In the last two-thirds of the pulse wave images in the 20 groups, that is, from 40 to 80 on the abscissa scale, the pulse waves of pancreatitis patients show an obvious disturbance phenomenon.

The two pulse forms are divided by a specific algorithm that samples the 40 to 80 scale range of the pulse wave cycle. Along the horizontal axis, the data is sampled at short intervals, and then the difference between the positions of two adjacent sampling points (unit amplitude is the ordinate, time in milliseconds is the abscissa) on the two-dimensional image coordinates is taken (refer to Fig. 7). Because the pulse wave image of patients with pancreatitis vibrates more than that of normal people in the control group, it can be used as an important indicator to divide the two kinds of subjects by SVM classification. To make this method more accurate, an appropriate sampling frequency should be set so that the partition is the most effective. Generally, for pulse electrical signals, the most suitable sampling frequency is between 30 Hz and 90 Hz [37]; with our sampling frequency set at about 40 Hz, the pulse image is the clearest.

2.4. Classification algorithm design

Based on the previous analysis and observations from visualizing the existing PWA data, one of the factors that caused the pathological difference was discovered, namely the stability of the pulse waveform. Therefore, a classification algorithm is designed that can amplify the difference between normal and pancreatitis pulse wave images and can effectively extract the value of the pulse image difference between the two types of cases.

Take the black circles on the pulse waveform graph as the sampling points; each black circle has its own height value. Select one of the cyan pulse waveform lines for sampling, starting from 40 Hz and sampling from the abscissa position backward (the interval between the abscissas of the sampling points is the sampling frequency). As shown in Fig. 7, the absolute value of the height difference between two adjacent sampling points is the absolute value difference. The instability of this part of the line graph is reflected in the absolute value difference of the


Fig. 4. General flowchart of pulse diagnosis experimental research program and description of each step.


Fig. 5. (a) Selecting the datum points in the pulse images and aligning them to the unified baseline. (b) The effect of denoising and smoothing the original pulse waveform.
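The two operations illustrated in Fig. 5 can be sketched as follows: each cycle is shifted vertically so that it agrees with the first cycle at a datum point, and the curve is then smoothed. This is a hedged illustration, not the authors' MATLAB procedure. It assumes that all cycles have equal length, that the datum point is the maximum of the first cycle, and that a simple moving average stands in for the denoising step; the names `align_to_first_peak` and `moving_average` are hypothetical.

```python
from typing import List


def moving_average(y: List[float], window: int = 3) -> List[float]:
    """Smooth a curve with a centred moving average (window shrinks at the edges)."""
    half = window // 2
    out = []
    for k in range(len(y)):
        lo, hi = max(0, k - half), min(len(y), k + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def align_to_first_peak(cycles: List[List[float]]) -> List[List[float]]:
    """Shift every cycle vertically so it matches the first cycle at the datum point.

    The datum point is taken as the index of the maximum of the first cycle
    (an assumption made for this sketch).
    """
    ref = cycles[0]
    datum = max(range(len(ref)), key=lambda k: ref[k])
    aligned = []
    for cyc in cycles:
        offset = ref[datum] - cyc[datum]   # vertical drift of this cycle
        aligned.append([v + offset for v in cyc])
    return aligned


# Two toy cycles; the second has drifted upward by 2 units (hypothetical data).
cycles = [[0.0, 4.0, 8.0, 5.0, 1.0], [2.0, 6.0, 10.0, 7.0, 3.0]]
aligned = align_to_first_peak(cycles)
smoothed = moving_average(aligned[0], window=3)
```

After alignment the two toy cycles coincide, which is the precondition for superimposing cycles and comparing their spread.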

adjacent sampling points (the deviation of the cyan curve from the normal value reflects its stability; real experiments usually remove the extremely unstable blue pulse line). We can use this idea to design an algorithm; the equation for calculating the instability of the pulse wave line is then:

\[
\begin{aligned}
D_i &= \sum_{i=0}^{N} \left| \vec{X}_{t+ni} - \vec{X}_{t+(n-1)i} \right|^{2} \\
    &= \sum_{i=0}^{N} \left[ \left(x_{t+ni} - x_{t+(n-1)i}\right)^{2} + \left(y_{t+ni} - y_{t+(n-1)i}\right)^{2} \right], \\
& n \in SF, \quad X_i \in \{(x_t, y_t), (x_{t+n}, y_{t+n}), \ldots, (x_{t+N}, y_{t+N})\}
\end{aligned}
\tag{1}
\]

The equation records the change rate of the second half of each pulse wave image of the same tester. D_i represents the sampling point variance of each pulse. The parameter X_i represents the absolute coordinate value of the sampling point at the abscissa position i. The subscript t of X represents the starting position of the sampling points, and the parameter n represents the sampling frequency SF. Different sampling frequencies not only give different results but also affect the accuracy of the classification algorithm. Equation (1) is used to calculate and record the changes in the second half of each pulse wave line chart for each subject, and the results form the dataset divided by SVM classification. The perceptron strategy is selected to divide the two kinds of subjects in SVM classification.

To accurately determine whether a patient has certain chronic diseases, it is usually necessary to extract multiple pathological features from the subject’s pulse data for judgment. An effective classification model extracts multiple pathological characteristic values from several sets of pulse waveforms of the subjects and represents them as multi-dimensional data. Through our observation of massive sample data, we discovered a phenomenon: the data calculated by the method in the PWA experimental process designed in this article are usually distributed in clusters. In various fields of machine learning, the core method of SVM can quickly and effectively classify massive multi-dimensional data when the data present a clustered distribution. Therefore, each group of multi-dimensional data is converted into a data set for storage according to the different recording times of different subjects, so that it can be classified effectively.

For a dataset:

\[
D = \{D_1, D_2, D_3, \ldots, D_N\} \tag{2}
\]

Using the perceptron model and the perceptron strategy we designed to make the whole model linearly separable, we calculated all of the D_i from the two kinds of subjects. Therefore, suppose that there is a hyperplane that can accurately divide the sampling point variances of each pulse from pancreatitis patients and normal people in D into the two sides of S, as follows:

\[
\exists\, \Pi : w \cdot X + b = 0 \tag{3}
\]

Making them into a linearly separable data set gives:

\[
\begin{aligned}
w \cdot D_i + b &< 0 \quad (\forall X_i = -1) \\
w \cdot D_i + b &> 0 \quad (\forall X_i = +1)
\end{aligned}
\tag{4}
\]

In some cases, it makes sense to do this, and certain practices make it an algorithm model that can distinguish normal people from patients with pancreatitis. On this basis, setting the optimal sampling frequency can maximize the utility of the whole classification method.

As also shown in Fig. 6(c), the data show that the control group of patients with other chronic diseases also has a double-peak structure, to which this method is also applicable. Compared with normal people, there are two peaks and an obvious concave arc structure in the pulse image of patients with pancreatitis (within the scope of the considered region). We design an algorithm to find the double-peak structure in the pulse image of patients with pancreatitis:

\[
\begin{aligned}
A &= \frac{1}{N} \sum_{i=0}^{N} y_{t+ni}, \qquad D_i = \sum_{i=0}^{N} \left( y_{t+ni} - A \right), \\
& n \in SF, \quad Y \in \{(y_t), (y_{t+n}), (y_{t+2n}), \ldots, (y_{t+N})\}, \\
& t = \max(Y), \quad T = \max(\complement_Y t)
\end{aligned}
\tag{5}
\]

Equation (5) was designed by comparing and analyzing the differences between the two kinds of testers. First, calculate the mean value A of the vertical coordinate height of each sampling point of every normal person in the considered region (this area lies between the two troughs to the left and right of the second wave peak, as shown in Fig. 8). The parameter D_i records the difference between each sampling point of pancreatitis patients and the normal mean value A. The parameters t and T represent the two peaks in the considered region. The D_i value of patients with pancreatitis is usually lower than that of normal people. This algorithm is used to calculate whether there are three peaks in the pulse image. The value of normal people must be higher than that of pancreatitis patients (see Fig. 9).

3. Disease classifier and results

3.1. The calculation of pulse data

Based on the above algorithm designed for support vector machine classification, the stability of each group of pulse data can be quantified. Through this algorithm, the pulse wave image of the second half of each tester is sampled at different frequencies to find the most suitable accuracy to distinguish the two kinds of testers. In order to verify the accuracy of this method, we designed 20 groups of subjects as a training set, and their pulse stability is as follows:

From Fig. 8, the values from each tester in the two multiline lists all reflect Fig. 7; the number of pulse groups varies from 1 to 15, and each pulse image in each group is calculated by Equation (1), and


Fig. 6. (a) Pulse waves of sixteen groups of normal people of four to twenty groups superposition. (b) Pulse waves of four groups of pancreatitis people of five to
twenty-five groups superposition. (c) Pulse waves of three groups of patients with appendicitis (A), acute appendicitis (AA), duodenal ulcer (DBU).
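The visual comparison in Fig. 6 rests on superimposing aligned cycles and looking at how wide the resulting bundle is. One way to put a number on that spread is the per-abscissa envelope of the superimposed cycles, sketched below. The width measure, the function names `envelope` and `band_width`, and the toy cycles are assumptions made for illustration; the paper itself quantifies stability with Equation (1).

```python
from typing import List, Tuple


def envelope(cycles: List[List[float]]) -> List[Tuple[float, float]]:
    """Per-abscissa (min, max) over aligned cycles of equal length."""
    return [(min(col), max(col)) for col in zip(*cycles)]


def band_width(cycles: List[List[float]], start: int, stop: int) -> float:
    """Mean width of the envelope over the abscissa range [start, stop)."""
    env = envelope(cycles)[start:stop]
    return sum(hi - lo for lo, hi in env) / len(env)


# Hypothetical aligned cycles: a tight bundle and a more scattered bundle.
steady = [[0, 4, 8, 5, 2], [0, 4, 8, 5, 2], [0, 4, 8, 6, 2]]
scattered = [[0, 4, 8, 5, 2], [0, 4, 8, 7, 4], [0, 4, 8, 3, 1]]
steady_width = band_width(steady, 2, 5)
scattered_width = band_width(scattered, 2, 5)
```

In this toy case the scattered bundle has the wider band over the second part of the cycle, mirroring the qualitative difference the text describes between Fig. 6(a) and Fig. 6(b).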

sampling frequency n is 40 Hz, with the starting position i located between wave peak h2 and trough t3 as shown in Fig. 2. The two types of data from normal people and patients are used for the SVM classification calculation in a certain proportion (the calculation accuracy can be adjusted by changing the starting position i and the sampling frequency n).

The following twenty sets of data are randomly selected from all samples and used as a training set for classification testing. We substitute the data of each set of randomly selected subjects into Equation (1) and train on the calculated values that reflect the stability of their pulse waveforms. To avoid bias in the calculation results caused by deliberate selection of data, the subjects we select do not have a specific number of groups, and the selected subjects are also chosen completely at random from the entire sample.

The average value obtained by this algorithm will be slightly different. This is due to the increase of abnormal values caused by the oscillation of the pulse waveform. By using the given algorithm to segment and eliminate the abnormal pulse waveforms, a more accurate value can be obtained.

Table 1 is calculated using Equation (1); its values are used to distinguish between normal people and pancreatitis patients. Table 2 shows the detailed distribution of the stability index of the pulse waveform of subject N4, and Table 1 shows the mean values of the pulse cycle images of the integration groups. These calculated pulse wave values can be used as indicators for the unitary classification algorithm. However, to find the maximum partition plane and classify with SVM, we need to add another set of indexes for the SVM model to transform into a


Fig. 7. Pictorial diagram of the extraction of pulse information (stability index, SI) for one pancreatitis patient. The blue pulse wave line marked by a black dot is the selected pulse wave line chart. The black circles on the line chart are the sampling points (the sampling frequency in the diagram is 80 Hz).

Fig. 8. The comparison of pulse lines between pancreatitis and normal persons (concave structure index, CSI).

typical convex quadratic programming problem. By observing the pulse images in Fig. 6, another obvious feature can be found that distinguishes the two types of testers.

3.2. The classification results

Equation (1) and Equation (5) yield two kinds of calculated data, and the quadratic programming problem in the SVM module is transformed into its dual problem. With the two kinds of data as a training dataset satisfying the Karush–Kuhn–Tucker (KKT) conditions, the training dataset is converted to the standard unit feature metric space.
We screened the pulse waveform data of several groups of normal people and patients with chronic diseases, selected from each group the pulse cycle data that best reflected the characteristics of the subject, labeled whether the subject was sick or not, and used these data as a training set for kernel method classification.
The above data are the recognition accuracy rates of several sets of pulse data of four pancreatitis patients under different methods. Table 3 shows the classification success rate of the above two algorithms under different division methods. To verify the classification accuracy of the SVM kernel method, we first divide the two major features extracted from the pulse image, SI and CSI, with a linear division method, and then apply SVM to each of the two indexes separately and calculate the two sets of data. The result is converted into a two-dimensional array, the support vector machine and the kernel method of the support vector machine are used for classification, and the classification effect of the experimental scheme is verified according to the accuracy of the classification result. Using multiple sets of pulse data of people with specific diseases and normal people as the SVM training set, the classification accuracy shown in Table 3 can be obtained (the positive and negative


Fig. 9. Distribution of stability index (average value of photoelectric signal samples in each group) of normal and pancreatitis samples.

Table 1
The Calculated SI of Pulse Waves of Each Tester (part of).

values of the accuracy in the table are for the training set error). Judging from the classification results obtained on the training set we selected, the accuracy of the SVM kernel method is significantly higher than that of the former methods and can exceed 95%.
In this experiment, we use classification code written in Python to process the MATLAB data set. The feature mining algorithms of this paper can serve as a breakthrough in current pulse image recognition, and their purpose is mainly to determine whether the subject has a chronic disease. We predict that the classification accuracy of the model between chronic patients and normal people is usually higher than that between normal people and other normal people. According to the N1-16 row of Table 3, if the classification accuracy of the PW data of a subject is higher or much higher than 81.5%, the subject can be judged to be at risk of chronic disease. Based on the currently selected subjects, if the standard is set at 82%, the model's recognition rate for randomly selected patients with chronic diseases is 100%.
We then take the classification success rate between normal and normal people in Table 3, 82%, as the standard: a test subject who scores higher than this value is judged to be a disease patient, and this standard is used as the basis for disease diagnosis. We randomly sampled three sets of pulse data for all subjects (30 normal people and 30 disease patients) and applied a confusion matrix to analyze the success rate of the classifier (between normal people and four specific chronic diseases). As shown in Table 4, in a single test the success rate of this classifier in identifying patients with the four chronic diseases is 83.33%, 88.89%, 66.67%, and 94.44%, and the overall accuracy is 97.22%, 96.03%, 87.03%, and 93.30%, respectively. Judging from the sample classification results of existing test subjects, for every disease except AA, an almost 100% diagnosis success rate can be achieved as long as the subject is monitored multiple times. The result shows that the classification method is feasible for disease prediction.
We have added a new PWA-based disease recognition model to the standard pulse diagnosis experiment process, which can improve the success rate of disease recognition. Whether this model can be used for other special populations (such as athletes, children, the elderly, and people with obesity) remains to be verified more extensively, and for that more experimental


Table 2
The calculation of SI of pulse waves of tester N4.
Avg. 182.2439471 125.1787536 121.2582868 119.2583642 88.64006496 126.9122154 123.9768548

The Calculation of Each Pulse Waves 243.4307764 89.4746307 147.4907703 90.3968228 73.65077466 144.7883605 130.0546737
154.4141778 203.9817782 113.7453793 80.56851777 76.10171886 104.3149293 114.8419646
163.0019556 98.768238 115.5987904 129.9065796 121.1505807 137.7865722 121.6705657
176.7413977 108.523761 101.4687908 117.2729878 70.76725107 140.1592647 113.1422266
182.6170942 115.8351541 113.8500397 106.9658541 85.4812433 104.7098793 114.8424235
114.6728744 113.6551208 155.6160473 117.9268775 77.0621336 175.0761616 125.1233579
104.980123 80.4526249 72.262149 98.6220622 94.4377 96.2729062 92.6679422
224.8037486 139.8371943 143.5160426 99.8817306 121.8542903 100.2172017 132.7198609
194.9709663 64.0655696 125.5684841 67.9219226 75.364019 135.1276061 112.124694
177.7616218 143.7572835 150.8910187 88.7445184 79.4428787 78.3096277 119.1937235
211.2481987 120.9563947 101.7888941 145.3608918 76.4314924 118.8401516 131.4446227
130.282267 110.382883 104.6147982 167.8756261 75.10691955 100.7401093 117.215563
187.442179 158.5483736 101.3721552 180.6183058 145.4863357 106.5292259 140.5061528
227.54534 134.6337038 155.0842084 180.9914472 89.97892348 194.804423 150.6289353
140.4671416 199.0331473 161.6320331 120.8065387 59.1087315 128.5514876 128.7167642
139.572658 120.9541999 132.8602114 120.8386309 85.60335796 165.2340945 122.0032128
324.1945802 – 109.7607738 112.692878 99.8527535 126.0456608 140.7098476
– – 75.5285763 – – – –

Pulse values of one of the normal subjects, some abnormal data are excluded.
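
The Avg. row of Table 2 can be checked as a per-column mean over the SI values that survive the exclusion of abnormal data. The following is a minimal sketch, not the authors' code: it assumes the second numeric column holds the 16 values copied below (the last printed row appears to leave that cell empty, which is an inference from the arithmetic), and `column_mean` is a hypothetical helper name.

```python
from statistics import fmean

# Second numeric column of Table 2 (tester N4), copied from the table.
# Assumption: this column has 16 entries; excluded cells are modelled as None.
COLUMN_2 = [
    89.4746307, 203.9817782, 98.768238, 108.523761,
    115.8351541, 113.6551208, 80.4526249, 139.8371943,
    64.0655696, 143.7572835, 120.9563947, 110.382883,
    158.5483736, 134.6337038, 199.0331473, 120.9541999,
]


def column_mean(cells):
    """Average the SI values of one column, skipping excluded (None) cells."""
    present = [value for value in cells if value is not None]
    return fmean(present)


# An excluded cell does not change the result.
print(round(column_mean(COLUMN_2 + [None]), 7))  # 125.1787536
```

The result agrees with the second entry of the Avg. row (125.1787536), which suggests that each entry of that row is the mean of the retained values in its column.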

Table 3
The division accuracy achieved by the linear and SVM algorithms for each tester, classified against normal people.
Tester        SI (linear)   CSI (linear)   SI (SVM)        CSI (SVM)       SI + CSI (SVM kernel method)
P1            0.8793        0.931          0.895 ± 0.02    0.954 ± 0.03    0.9827 ± 0.002
P2            0.7586        0.8668         0.801 ± 0.02    0.91 ± 0.05     0.9791 ± 0.003
P3            0.7414        0.6034         0.754 ± 0.01    0.62 ± 0.01     0.9112 ± 0.02
P4            0.9138        0.9483         0.912 ± 0.01    0.947 ± 0.01    0.9962 ± 0.001
A1            0.8675        0.9295         0.895 ± 0.01    0.963 ± 0.04    0.9879 ± 0.003
AA1           0.7123        0.6221         0.718 ± 0.02    0.623 ± 0.01    0.8612 ± 0.003
AA2           0.9345        0.9574         0.887 ± 0.01    0.961 ± 0.01    0.9954 ± 0.002
DBU1          0.6842        0.5891         0.648 ± 0.09    0.733 ± 0.15    0.8524 ± 0.008
DBU2          0.7284        0.6963         0.791 ± 0.05    0.786 ± 0.08    0.9258 ± 0.004
N1-16(mean)   0.598         0.624          0.652           0.759           0.8148
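
The 82% decision standard discussed in Section 3.2 can be applied directly to the last column of Table 3. The sketch below is illustrative only: the dictionary copies the kernel-method accuracies from the table, while the names `KERNEL_ACCURACY`, `THRESHOLD`, and `flag_at_risk` are hypothetical and do not come from the paper.

```python
# Kernel-method accuracies from the last column of Table 3
# (each patient's pulse data classified against normal subjects).
KERNEL_ACCURACY = {
    "P1": 0.9827, "P2": 0.9791, "P3": 0.9112, "P4": 0.9962,
    "A1": 0.9879, "AA1": 0.8612, "AA2": 0.9954,
    "DBU1": 0.8524, "DBU2": 0.9258,
}
NORMAL_BASELINE = 0.8148  # N1-16 (mean) row of Table 3
THRESHOLD = 0.82          # decision standard quoted in the text


def flag_at_risk(accuracy, threshold=THRESHOLD):
    """Flag a subject whose separability from normal subjects exceeds the threshold."""
    return accuracy > threshold


flagged = sorted(name for name, acc in KERNEL_ACCURACY.items() if flag_at_risk(acc))
recognition_rate = len(flagged) / len(KERNEL_ACCURACY)
mean_patient_accuracy = sum(KERNEL_ACCURACY.values()) / len(KERNEL_ACCURACY)

print(f"flagged {len(flagged)} of {len(KERNEL_ACCURACY)} patients")
print(f"mean patient accuracy: {mean_patient_accuracy:.4f}")
print(f"normal baseline flagged: {flag_at_risk(NORMAL_BASELINE)}")
```

With this threshold all nine listed patients are flagged and the normal baseline is not, which matches the 100% recognition rate stated in the text. The mean over the nine patients is about 0.94, consistent with the figure of more than 90% given in the conclusion.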

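
The rates quoted for Table 4 can be recomputed from its counts. This is a minimal sketch under one assumption: the rows of Table 4 are read as the real condition and the Yes/No columns as the classifier's output for each disease. The names `TABLE_4` and `rates` are hypothetical.

```python
# Counts from Table 4, per disease:
# (healthy judged Yes, healthy judged No, patient judged Yes, patient judged No)
TABLE_4 = {
    "pancreatitis": (0, 90, 15, 3),
    "appendicitis": (1, 89, 32, 4),
    "acute appendicitis": (8, 82, 12, 6),
    "duodenal ulcer": (3, 87, 17, 1),
}


def rates(false_pos, true_neg, true_pos, false_neg):
    """Return (sensitivity, overall accuracy) for one 2x2 confusion matrix."""
    sensitivity = true_pos / (true_pos + false_neg)
    total = true_pos + true_neg + false_pos + false_neg
    accuracy = (true_pos + true_neg) / total
    return sensitivity, accuracy


for disease, counts in TABLE_4.items():
    sensitivity, accuracy = rates(*counts)
    print(f"{disease}: success rate {sensitivity:.2%}, overall accuracy {accuracy:.2%}")
```

Under this reading the per-disease success rates come out as 83.33%, 88.89%, 66.67%, and 94.44%, as quoted in the text. The second series in the text corresponds to overall accuracy: 97.22% and 96.03% are reproduced exactly, while the last two recompute to 87.04% and 96.30% rather than the printed 87.03% and 93.30%.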
Table 4
Recognition success rate of multiple groups of pulse data of four diseases.
Real Condition     Disease Type
                   Pancreatitis    Appendicitis    Acute Appendicitis    Duodenal Ulcer
                   Yes    No       Yes    No       Yes    No             Yes    No
Healthy            0      90       1      89       8      82             3      87
Chronic Disease    15     3        32     4        12     6              17     1

subjects and data verification are needed. But using the data obtained by the above two algorithms, combined with the SVM kernel method for classification, the model can already achieve high accuracy. To further optimize the algorithm, change the parameter n in Equation (1) to a more suitable value and tune the SVM classification model further to obtain higher classification accuracy and better values of the two calculated indexes.
The data indicate that the method described in Fig. 4 is feasible for detecting three chronic diseases: pancreatitis, appendicitis, and duodenal ulcer. However, no matter what classification method is used, it will be greatly affected by individuals with special physiological conditions (such as children, the elderly, patients with arrhythmia, and other factors that cause abnormal heart rate). Therefore, how to optimize the algorithm so that it also maintains high accuracy for people with abnormal heart rates and special pulse waves needs further exploration.

3.3. The difficulties and limitations in implementing this application

First of all, the biggest problem of the disease detection method designed in this paper is how to detect subjects with multiple chronic diseases. We selected a group of patients confirmed to have pancreatitis, but whether test subjects with worse health status or with two chronic diseases can also be classified with this accuracy is unknown. As Fig. 6(b) shows, the pulse waveforms of the third pancreatitis patient group could come from a person with more than two diseases; this affects the accuracy of the classification algorithm and has become a factor we have to consider. What is more difficult is that our model is only a simple dichotomy: although it has a high accuracy rate for patients with specific chronic diseases, for subjects with multiple chronic diseases it is hard to detect which chronic diseases they are suffering from. Usually, patients who need a quick check do not know their own health status.
Besides, among all the factors affecting the shape of the pulse waveform image, age is a very difficult factor to overcome. As one ages, the elasticity of the blood vessels decreases substantially and the strength of the heartbeat weakens, so the pulse waveform image will be very different from the standard pulse waveform image of the human body. This greatly increases the difficulty for the disease monitoring method proposed in this paper to determine whether a subject has a certain disease. Different diseases manifest differently in the pulse waveform image, so the above two algorithms may not have the same effect on other chronic diseases. In fact, for the detection of each disease, it is necessary to design a different set of algorithms according to the pulse waveform image characteristics of that disease, in order to extract feature values that can be classified by the SVM. Although the accuracy of disease monitoring will increase as the number of algorithms increases, the detection time and the computational resources consumed will also increase. How to strike a balance between the two is an aspect that we need to consider for a long time.
Finally, the scheme designed in this paper only verifies that it is


feasible to distinguish between normal people and the several diseases we mention above. To complete the diagnosis of patients with other diseases, more testers and further experimental confirmation are still needed. We believe that in the future, following the experimental ideas of this paper, a PWA prototype machine can be designed to complete disease diagnosis and to verify whether the pulse diagnosis theory of TCM is still feasible in modern medicine.

4. Conclusion

This paper designs two algorithms through analysis and uses the SVM-KM for numerical segmentation. With the two algorithm models in this article, comparing the average 81.48% classification success rate among healthy subjects with the average diagnosis success rate of more than 90% for the nine patients with chronic disease, it can be concluded that the experimental method is effective to a certain extent. Since these two methods have limitations (the influence of individual pulse waveform characteristics on the two indexes), it is recommended to test the results of the experiment based on different principles and training data. But judging from the results, these algorithms can achieve high accuracy through the support vector machine method.
Although the classification result is based on the traditional SVM-KM method, compared with others such as the physical meaning design algorithm and the BP neural network, this mode combines feature mining and classification algorithms [38–40], and its results are more robust. If this mode is combined with more highly sensitive, information-gathering optical sensors (such as ultrasensitive pulse sensors [41]), its accuracy is expected to be higher. If this mode is applied to wearable devices equipped with such an instrument, combined with related machine learning algorithms, smart devices could even be used to monitor the condition of a wearer suffering from pancreatitis, appendicitis, hypertension, or duodenal disease, and even the physiological state of the subject, such as pregnancy, before and after meals, and before and after exercise [6,38]. If these measuring and recording instruments are fully improved and applied, combined with similar feature extraction algorithms, we will obtain a faster, more accurate, and low-cost disease monitoring method.

Funding

National Natural Science Foundation of China (61373116); Shanxi Province Science and technology overall planning and innovation project (2016ktzdgy04-01); Xianyang science and Technology Bureau project, based on multimodal deep learning of TCM “pulse” auxiliary diagnosis system research (2017k01-25-1).

Data availability

Any data is provided as a supplementary file for the paper; it is used to verify the reliability of the experimental results.

Authorship contributions

Lin Fan: Work concept design, Raw data processing, Examination of papers, Preliminary revision of the paper, Publishing assistance. JinCheng Zhang: Data collection, Drafting papers, Paper algorithm design, Experimental scheme design, Data calculation, Make important revisions to the paper, Approval of final papers to be published. Zhongmin Wang: Project sponsorship, Work concept design, Schedule determination.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

We are grateful to Zhongmin Wang for the assistance with the experiments, and to the 211 Hospital of the People’s Liberation Army for providing the pulse datasets to our institute.

References

[1] Jiang M, Lu C, Zhang C, Yang J, Tan Y, Lu AP, et al. Syndrome differentiation in modern research of traditional Chinese medicine. J Ethnopharmacol 2012;140:634–42.
[2] Li WF, Jiang JG, Chen J. Chinese medicine and its modernization demands. Arch Med Res 2008;39:246–51.
[3] Bilton K, Hammer L, Zaslawski C. Contemporary Chinese pulse diagnosis: a modern interpretation of an ancient and traditional method. J Acupunct Meridian Stud 2013;6:227–33.
[4] Korpas D, Halek J, Dolezal L. Parameters describing the pulse wave. Physiol Res 2009;58:473–9.
[5] Huang PY, Lin WC, Chiu BYC, Chang HH, Lin KP. Wrist pulse signal diagnosis using modified Gaussian models and Fuzzy C-Means classification. Med Eng Phys 2009;31:1283–9.
[6] Chen YH, Zhang L, Zhang D, Zhang DY. Study of wrist pulse signals using time domain spatial features. Comput Electr Eng 2015;45:100–7.
[7] Sun Y, Shen B, Chen Y, Xu Y. Study of wrist pulse signals using time domain spatial features. In: Zhang D, Sonka M, editors. Medical biometrics. Berlin Heidelberg: Springer; 2010. p. 334–43.
[8] Yan J, Wang Y, Xia C, Li F, Guo R. Detecting nonlinearity in wrist pulse using delay vector variance method. In: Advances in cognitive neurodynamics ICCN 2007. Netherlands: Springer; 2008. p. 867–71.
[9] Shu Jian-Jun, Sun Yuguang. Developing classification indices for Chinese pulse diagnosis. Compl Ther Med 2007;15(3):190–8.
[10] Rangaprakash D, Narayana Dutt D. Study of wrist pulse signals using time domain spatial features. Comput Electr Eng 2015;45:100–7.
[11] Chen Yinghui, Zhang Lei, Zhang David, Zhang Dongyu. Wrist pulse signal diagnosis using modified Gaussian models and Fuzzy C-Means classification. Med Eng Phys 2009;31:1283–9.
[12] Chen Yinghui, Zhang Lei, Zhang David, Zhang Dongyu. Computerized wrist pulse signal diagnosis using modified auto-regressive models. J Med Syst 2011;35:321–8.
[13] Yao Chu, Zhong Junwen, Liu Huiliang, Yuan Ma, Liu Nathaniel, Song Yu, Liang Jiaming, Shao Zhichun, Sun Yu, Dong Ying, Wang Xiaohao, Lin Liwei. Human pulse diagnosis for medical assessments using a wearable piezoelectret sensing system. Adv Funct Mater 2018;28(40):1803413.
[14] Wang Jingyi, Liu Kewei, Sun Qizhen, Ni Xiaoling, Fan Ai, Wang Senmao, Yan Zhijun, Liu Deming. Diaphragm-based optical fiber sensor for pulse wave monitoring and cardiovascular diseases diagnosis. J Biophot 2019:20190084.
[15] Zuo Wangmeng, Wang Peng, Zhang David. Comparison of three different types of wrist pulse signals by their physical meanings and diagnosis performance. IEEE J Biomed Health Inf 2016;20(1):119–27.
[16] Boutry Clementine M, Beker Levent, Kaizawa Yukitoshi, Vassos Christopher, Tran Helen, Hinckley Allison C, Pfattner Raphael, Niu Simiao, Li Junheng, Jean Claverie, Wang Zhen, Chang James, Fox Paige M, Bao Zhenan. Biodegradable and flexible arterial-pulse sensor for the wireless monitoring of blood flow. Nat Biomed Eng 2019;3:47–57.
[17] Irawan, Y., Fernando, Y., & Wahyuni, R. Detecting heart rate using pulse sensor as alternative knowing heart condition. J Appl Eng Technol Sci (JAETS), 1(1), p.30-42.
[18] Han Ouyang, Tian Jingjing, Sun Guanglong, Zou Yang, ZhuoLiu, Hu Li, Zhao Luming, Shi Bojing, Fan Yubo, Fan Yifan, Wang Zhong Lin, Zhou Li. Self-Powered pulse sensor for antidiastole of cardiovascular disease. Adv Mater 2017:1703456.
[19] Tong-tong Q, Guang-da L, Ting-hang G, Qiu-yue Z, Yu H. Summary of research on digital pulse collection and analysis technology. In: 2021 36th youth academic annual conference of Chinese association of automation (YAC); 2021. p. 243–7.
[20] Laurent S, Boutouyrie P, Asmar R, Gautier I, Laloux B, Guize L, Ducimetiere P, Benetos A. Aortic stiffness is an independent predictor of all-cause and cardiovascular mortality in hypertensive patients. Hypertension 2001;37:1236–41.
[21] Rangaprakash D, Narayana Dutt D. Analysis of wrist pulse signals using spatial features in time domain. In: IEEE international conference on communications and signal processing. Melmaruvathur; April 2014. p. 345–8 (India).
[22] Liu L, Zuo W, Zhang D, Li N, Zhang H. Combination of heterogeneous features for wrist pulse blood flow signal diagnosis via multiple kernel learning. IEEE Trans Inf Technol Biomed 2012;16(4):598–606.
[23] Rangaprakash D. Statistical analysis of wrist pulse signals obtained under different food intake conditions. In: Proceedings of IEEE international conference on communications and signal processing. Melmaruvathur (India); April 2014. p. 1410–3.
[24] Chung Yu-Feng, Hu Chung-Shing, Yeh Cheng-Chang, Luo Ching-Hsing. How to standardize the pulse-taking method of traditional Chinese medicine pulse diagnosis. Comput Biol Med 2013;43(4):342–9.


[25] Sorvoja H, Kokko VM, Myllyla R, Miettinen J. Use of EMFi as a blood pressure pulse transducer. IEEE Trans Instrum Meas 2005;54:2505–12.
[26] Nitzan M. Automatic noninvasive measurement of arterial blood pressure. Instrum Meas Mag 2011;14:32–7.
[27] Lochner Claire M, Khan Yasser, Pierre Adrien, Arias Ana C. All-organic optoelectronic sensor for pulse oximetry. Nat Commun 2014;5:5745.
[28] Yokota T, Zalar P, Kaltenbrunner M, Jinno H, Matsuhisa N, Kitanosako H, Tachibana Y, Yukita W, Koizumi M, Someya T. Ultraflexible organic photonic skin. Sci Adv 2016;2:e1501856.
[29] Xue Y, Su Y, Zhang C, Xu X, Gao Z, Wu S, Zhang Q, Wu X. Full-field wrist pulse signal acquisition and analysis by 3D digital image correlation. Opt Laser Eng 2017;98:76.
[30] Shao Xinxing, Dai Xiangjun, Chen Zhenning, He Xiaoyuan. Real-time 3D digital image correlation method and its application in human pulse monitoring. Appl Opt 2016:696–704.
[31] Zhiqiang Chuanglu Chen, Zhang Yitao, Zhang Shaolong, Hou Jiena, Zhang Haiying. A 3D wrist pulse signal acquisition system for width information of pulse wave. Sensors 2020;20(1):11.
[32] Wang Nanyue, Yu Youhua, Huang Dawei, Xu Bin, Jia Liu, Li Tongda, Xue Liyuan, Shan Zengyu, Chen Yanping, Wang Jia. Pulse diagnosis signals analysis of fatty liver disease and cirrhosis patients by using machine learning. Bioinf/Med Inf Trad Med Integr Med 2015:859192.
[33] Lee Bum Ju, Jeon Young Ju, Bae Jang-Han, Yim Mi Hong, Kim Jong Yeol. Gender differences in arterial pulse wave and anatomical properties in healthy Korean adults. Eur J Integr Med 2019;25:41–8.
[34] Weber T, Auer J, O’Rourke MF, et al. Arterial stiffness, wave reflections, and the risk of coronary artery disease. Circulation 2004;109:184–9.
[35] Gatzka CD, Kingwell BA, Cameron JD, et al. Gender differences in the timing of arterial wave reflection beyond differences in body height. J Hypertens 2001;19:2197–203.
[36] Hayward CS, Kelly RP. Gender-related differences in the central arterial pressure waveform. J Am Coll Cardiol 1997;30:1863–71.
[37] Wang Dimin, Zhang David, Luc Guangming. A robust signal preprocessing framework for wrist pulse analysis. Biomed Signal Process Control 2016;23:62–75.
[38] Chen Jianhong, Huang Huang, Hao Wenrui, Xu Jinchao. A machine learning method correlating pulse pressure wave data with pregnancy. Int J Numer Meth Biomed Eng 2020;36:e3272.
[39] Wang S, Jiang J, Lu X. Study on the classification of pulse signal based on the BP neural Network. J Biosci Med 2020;8:104–12.
[40] Yuesheng Lou, Physical meaning of pulse signal characteristics in TCM and the latest research progress of its recognition methods. 2020 2nd international symposium on the frontiers of biotechnology and bioengineering.
[41] Xu Liangxu, Zhang Zheng, Gao Fangfang, Zhao Xuan, Xun Xiaochen, Kang Zhuo, Liao Qingliang, Zhang Yue. Self-powered ultrasensitive pulse sensors for noninvasive multi-indicators cardiovascular monitoring. Nano Energy 2021;81:105614. ISSN 2211-2855.
