Application of Neural Network and Cluster Analyses To Differentiate TCM Patterns in Patients With Breast Cancer

Background and Purpose: Pattern differentiation is a critical element of the prescription process for Traditional Chinese Medicine (TCM) practitioners. Application of advanced machine learning techniques will enhance the effectiveness of TCM in clinical practice. The aim of this study is to explore the relationships between clinical features and TCM patterns in breast cancer patients.


ORIGINAL RESEARCH

published: 08 May 2020


doi: 10.3389/fphar.2020.00670

Application of Neural Network and Cluster Analyses to Differentiate TCM Patterns in Patients With Breast Cancer
Wei-Te Huang 1†, Hao-Hsiu Hung 1†, Yi-Wei Kao 2,3, Shi-Chen Ou 1, Yu-Chuan Lin 1, Wei-Zen Cheng 1, Zi-Rong Yen 4, Jian Li 2, Mingchih Chen 2, Ben-Chang Shia 3,5,6* and Sheng-Teng Huang 1,7,8,9,10,11*

1 Department of Chinese Medicine, China Medical University Hospital, Taichung, Taiwan
2 Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City, Taiwan
3 Research Center of Big Data, College of Management, Taipei Medical University, Taipei, Taiwan
4 Information Technology Office, China Medical University Hospital, Taichung, Taiwan
5 College of Management, Taipei Medical University, Taipei, Taiwan
6 Executive Master Program of Business Administration in Biotechnology, College of Management, Taipei Medical University, Taipei, Taiwan
7 School of Chinese Medicine, China Medical University, Taichung, Taiwan
8 Research Center for Traditional Chinese Medicine, Department of Medical Research, China Medical University Hospital, Taichung, Taiwan
9 Chinese Medicine Research Center, China Medical University, Taichung, Taiwan
10 Research Center for Chinese Herbal Medicine, China Medical University, Taichung, Taiwan
11 Department of Chinese Medicine, An-Nan Hospital, China Medical University, Tainan, Taiwan

Edited by: Min Ye, Peking University, China
Reviewed by: Xu Wu, Southwest Medical University, China; Gang Bai, Nankai University, China

Background and Purpose: Pattern differentiation is a critical element of the prescription process for Traditional Chinese Medicine (TCM) practitioners. Application of advanced machine learning techniques will enhance the effectiveness of TCM in clinical practice. The aim of this study is to explore the relationships between clinical features and TCM patterns in breast cancer patients.

Methods: The dataset of breast cancer patients receiving TCM treatment was recruited from a single medical center. We utilized a neural network model to standardize terminologies and address TCM pattern differentiation in breast cancer cases. Cluster analysis was applied to classify the clinical features in the breast cancer patient dataset. To evaluate the performance of the proposed method, we further compared the TCM patterns to therapeutic principles of Chinese herbal medication in Taiwan.

Results: A total of 2,738 breast cancer cases were recruited and standardized. They were divided into 5 groups according to clinical features via cluster analysis. The pattern differentiation model revealed that liver-gallbladder dampness-heat was the primary TCM pattern identified in patients. The main therapeutic goals of the top 10 Chinese herbal medicines prescribed for breast cancer patients were to clear heat, drain dampness, and detoxify. These results demonstrated that the neural network successfully identified patterns from a dataset similar to the prescriptions of TCM clinical practitioners.

Conclusion: This is the first study using machine-learning methodology to standardize and analyze TCM electronic medical records. The patterns revealed by the analyses were highly correlated with the therapeutic principles of TCM practitioners. Machine learning technology could assist TCM practitioners to comprehensively differentiate patterns and identify effective Chinese herbal medicine treatments in clinical practice.

Keywords: traditional Chinese medicine, electronic medical records, breast cancer, neural network analysis, cluster analysis, pattern differentiation

*Correspondence: Ben-Chang Shia [email protected]; Sheng-Teng Huang [email protected]; [email protected]
† These authors have contributed equally to this work
Specialty section: This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology
Received: 26 January 2020
Accepted: 23 April 2020
Published: 08 May 2020
Citation: Huang W-T, Hung H-H, Kao Y-W, Ou S-C, Lin Y-C, Cheng W-Z, Yen Z-R, Li J, Chen M, Shia B-C and Huang S-T (2020) Application of Neural Network and Cluster Analyses to Differentiate TCM Patterns in Patients With Breast Cancer. Front. Pharmacol. 11:670. doi: 10.3389/fphar.2020.00670

Frontiers in Pharmacology | www.frontiersin.org 1 May 2020 | Volume 11 | Article 670
Huang et al. TCM: Neural Network and Cluster Analyses
Keywords: traditional Chinese medicine, electronic medical records, breast cancer, neural network analysis,
cluster analysis, pattern differentiation

Abbreviations: AF, Average frequency; ANN, Artificial neural networks; AP, Amount of people; CAM, Complementary and Alternative Medicine; CMUH, China Medical University Hospital; DLTF, Depressed liver qi transforming into fire; EMR, Electronic medical records; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; KPI, Key Performance Indicators; LDSD, Liver depression and spleen deficiency; LGDH, Liver-gallbladder dampness-heat; LKYD, Liver-kidney yin deficiency; NLP, Natural language processing; PF, Primary features; QDBS, Qi deficiency with blood stasis; RDH, Retained dampness-heat; RDT, Retained dampness-toxin; SF, Secondary features; SSQD, Spleen-stomach qi deficiency; TCM, Traditional Chinese medicine; TF-IDF, Term-Frequency-Inverse Document Frequency.

INTRODUCTION

Breast cancer is the most common cancer affecting the female population globally. As an adjunct for cancer treatments, complementary and alternative medicine (CAM) is an increasingly popular option sought by patients with breast cancer (Crocetti et al., 1998; Balneaves et al., 2006; Boon et al., 2007). Meanwhile, Traditional Chinese Medicine (TCM) is an important component of CAM, and is currently widely used by breast cancer patients in the ethnic Chinese population (Chen et al., 2008). Many patients seek TCM to resolve side effects including nausea and vomiting, fatigue, paresthesia, chronic pain, constipation, and anorexia, which may result from standard Western medicine cancer treatments (Chung et al., 2016).

Despite the increased popularity of TCM, modernization in the field of TCM remains gradual (Ling and Xu, 2013). One particular limitation lies in the fact that the diagnostic and therapeutic systems of TCM depend heavily on the notion of pattern differentiation. The TCM pattern is a diagnostic summary of each individual based on four diagnostic methods: observation, listening, questioning, and pulse detection (World Health Organization. Regional Office for the Western, 2007). Until recently, inefficient data extraction methods have limited the development of automated TCM pattern differentiation. Furthermore, the combinational and highly individualized nature of TCM prescriptions in clinical practice creates challenges for researchers to successfully execute randomized control trials to verify TCM theories.

In recent decades, access to electronic medical records (EMR) and advanced machine-learning techniques have enabled the development of computational methods to enhance the field of TCM. More specifically, researchers can now automate the data mining process through natural language processing and information extraction methods. A previous study has demonstrated a framework of automatic diagnosis of TCM by analyzing raw free-text clinical records (Wang et al., 2012).

Artificial neural networks (ANN) are non-linear models that have been shown to be useful in elucidating the relationship between the input and output signals of a complex system (Zhang et al., 2018). In this study, we utilized DeepMedic software which incorporated TCM pattern data with ANN to differentiate the TCM patterns identified in individual breast cancer patients. A series of methods including cluster analysis were applied to analyze a dataset of EMR. The cluster analysis was also applied to evaluate the relationships between clinical features, referred to as symptoms and signs in TCM clinical practice, to distinguish TCM pattern differentiation. To evaluate the performance of the TCM pattern differentiation system developed for our study, we further compared the TCM patterns identified in each cluster subgroup with the top ten Chinese herbal prescriptions for Taiwanese breast cancer patients (Huang et al., 2017).

The aim of this study was to apply neural network analysis and cluster analysis to reveal patterns from an EMR dataset and to compare them with the prescriptions of TCM clinical practitioners for the treatment of patients with breast cancer in Taiwan.

MATERIALS AND METHODS

Data Acquisitions
The EMR of breast cancer patients (ICD-9 174.0–174.9) having received TCM treatment between January 01, 2003 and June 15, 2018 were collected from the China Medical University Hospital (CMUH) database. The diagnoses were based on the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). This study was approved by the Research Ethics Committee of China Medical University and Hospital, Taichung, Taiwan (CMUH107-REC2-023). All of the datasets analyzed were decoded so that the review board waived the requirement to sign informed consent from patients.

DeepMedic Neural Network Analysis
In this study, we used the DeepMedic software to standardize the terminologies of TCM, and to summarize the most likely TCM pattern in each case. The standardization process aimed to unify the polysemous or synonymous vocabulary used in the TCM diagnostic system to facilitate the neural network analysis. The standardization process was accomplished by modifying symptom vocabulary to match the thesaurus within the DeepMedic software, which contains over 20,000 symptom terminologies. Respective standard nomenclatures were applied in the standardization process of syndrome elements, TCM patterns, and treatment modalities. The DeepMedic software

can convert TCM patterns into several codes, and label the standard TCM terminologies. For each case being analyzed as input, the specific TCM pattern was identified by determining the higher-weighted code of symptoms and signs. A forward and backward propagation of the neural network, consisting of several hidden layers, was used to calculate the weightings of each code. The weighting of each pattern was based on different symptoms and signs, calculated by using the well-known heuristic equation, Term-Frequency-Inverse Document Frequency (TF-IDF), with some modifications.

TF = (the frequencies of symptom A in code B/code)

$$\text{Term frequency} = \frac{f_{t,d}}{\sum_{t' \in d} f_{t',d}}$$

$$\text{Inverse document frequency smooth} = \log\left(1 + \frac{N}{n_t}\right)$$

In the standard definition of these quantities, $f_{t,d}$ is the count of term $t$ in document $d$, $N$ is the total number of documents, and $n_t$ is the number of documents containing term $t$.

The efficacies, as well as the details of related methods, have been demonstrated in our previous study (Lin et al., 2019). The website accessing the demo version of DeepMedic software can be found at: http://bigdata-demo.deepmedic.cn/.

Cluster Analysis
In statistical methodologies, the purpose of cluster analysis is to group the classification objects according to the characteristics of the particular dataset. Study objects classified to the same group have similar characteristics, while those classified to different groups indicate that there are considerable differences in the characteristics. We used K-means clustering to divide data into groups, and the number of clusters was determined by using the smallest total within-cluster sum of squares.

Key Performance Indicators (KPI)
Each variable in the dataset of this study was recorded by binary classification of “yes/no”. Additionally, more even variables are more effective at finding similarity between each cluster. Therefore, we calculated the mean and standard deviation from all variables according to the concept of coefficient of variation. The KPI obtained from dividing the standard deviation by the mean is used for selecting variables. The statistical formula is shown below. The higher value of this statistic represents more even variables. In order to find the optimal KPI, we limited the capture frequency of the variable to more than 5%. Starting from the minimum KPI, we increased the interval by 0.01 to find the best one.

$$KPI_{item} = \frac{s}{\bar{x}} = \frac{\sqrt{\dfrac{item_{yes}}{item_{all}} \times \dfrac{item_{no}}{item_{all}}}}{\dfrac{item_{yes}}{item_{all}}} = \frac{\sqrt{item_{yes} \times item_{no}}}{item_{yes}} = \sqrt{\frac{item_{no}}{item_{yes}}}$$

The Analysis of Symptoms and Signs in Cluster Model
If a variable showed no statistically significant difference among clusters and had a frequency greater than 5%, it was determined to be a primary feature (PF). Additionally, symptoms that had significant differences in frequency but similar rankings, where the difference between the highest and the lowest ranking was not more than 10 and all frequencies of this variable in each cluster were more than 5% among clusters, were considered the primary features of breast cancer cases, since these symptoms had similar importance in each cluster. When the cluster analytical result of a KPI has the largest number of primary features, it is defined as the best KPI.

A symptom is defined as a subjective experience of a disease or physical ailment reported by a patient, while a sign is defined as any abnormal indication of disease that is identified by TCM practitioners (Dodd et al., 2001). Pulse and tongue inspections are the primary diagnostic methods applied by TCM practitioners to collect the data of clinical signs. Despite the correlation between symptoms and signs, the data collection methodologies are different; therefore, we separately collected and analyzed data of symptoms and three types of signs for subsequent TCM pattern differentiation.

Clinical signs including tongue appearance, tongue coating, and pulse were analyzed individually due to variables. The symptoms and signs were ranked according to the frequency of concurrent events. To make the high-ranking symptom and sign variables more representative, we excluded variables with a frequency of less than 5%, and the remaining variables were regarded as secondary features (SF) in each cluster.

TCM Pattern Identification With Various PF and SF
From the previous analysis, we obtained the PF and SF of each cluster in the cluster analysis with the best KPI. Each SF had different chances in the cluster due to differing frequencies. In order to analyze various possibilities, we disassembled the SF in a cluster and combined them into “Sx_n”, where “x” was the number of a cluster, and “n” was the top number of symptoms of the SF. For example, S1_5 represented the top five symptoms of the SF in cluster 1 and its frequency was judged by the fifth symptom. Finally, these were combined with the PF as “P + Sx_n”. DeepMedic software was applied to objectively analyze the general TCM pattern of all combinations. We counted the number of various types of patterns and weighted each pattern with the frequency of the last symptom in each combination to calculate the percentage of this pattern occurring in the cluster. The percentage of a pattern equals the average frequency of that pattern divided by the sum of the average frequencies of all patterns. The statistical formula is shown below.

$$Percentage_{ij} = \frac{f_{ij}}{F_i}$$

$F_i$ = Sum of average frequency of patterns in cluster $i$
$f_{ij}$ = Average frequency of pattern $j$ in cluster $i$
$i = 1, 2, \ldots, 5$
$j = 1, 2, \ldots,$ number of patterns in cluster $i$

Chinese Herbal Prescriptions in Breast Cancer Patients
TCM herbs were classified into several categories based on their usage. To prove that the study objects are compatible with the


clinical prescriptions, we analyzed the top 10 single herbs and formulas prescribed by clinical TCM practitioners in Taiwan (Huang et al., 2017). To compare the usage in frequency and dose of each herb and formula, we ranked these medications according to the value obtained by the number of person-days multiplied by average daily dose.

Overall, the architecture (see Figure 1) of this study is primarily composed of five steps, as shown below.

1. Standardize the terminologies of TCM.
2. Find the best KPI, defined as the one whose cluster analytical result has the largest number of primary features.
3. Combine primary features and secondary features into different arrangements in each cluster.
4. Identify TCM patterns of each combination in each cluster through machine-learning confirmation.
5. Compare the similarity between TCM patterns in each cluster and the therapeutic principles of Top 10 Chinese herbal prescriptions in Taiwan.

RESULTS

Data Extraction
We selected only the initial visit records of individual patients, and excluded the remaining follow-up records, which contained incomplete data. All of these records were required to include the patient's gender, age, and details concerning symptoms and signs. A total of 78,917 breast cancer patients' records were recruited, including 2,913 complete initial visit records, of which 2,738 cases contained records of the specific herbs and formulas prescribed. The flowchart of our data acquisition is shown in Figure 2.

The Standardization of Clinical Features
In the 2,738 analyzable records, the top twenty symptoms in frequency included “insomnia”, “dry mouth”, “lack of strength”, “dizziness”, “loss of appetite”, “abdominal distention”, “profuse dreaming”, “bitter taste of mouth”, “lumbago”, “back pain”, “afraid of cold”, “loose stool”, “headache”, “nausea”, “absence of thirst”, “cough”, “acid regurgitation”, “soreness”, “nocturia”, and “dry eyes”. The top five tongue appearances included “pale red tongue”, “red tongue”, “teeth-marked tongue”, “dark red tongue”, and “dry tongue”. The top five tongue coatings included “white coating”, “thin coating”, “thin and white coating”, “slimy coating”, and “thick coating”. The top five pulses included “string-like pulse”, “slippery pulse”, “fine pulse”, “weak pulse”, and “sunken pulse”. The ranking and frequency of each symptom and sign are listed in Table 1.

Cluster Analysis
The declining slope of the total within-cluster sum of squares moderated when the data was divided into five groups, indicating that it was an acceptable number of groups for the analysis of breast cancer patient records (Supplementary Figure 1).

Symptoms and Signs of PF and SF in Each Cluster
The minimum KPI for this study of breast cancer patients was 0.231, and the best one was 0.252045. The frequency ranking differences of tongue appearances, tongue coatings, and pulses in

FIGURE 1 | The analytical architecture of TCM EMR in the patients with breast cancer. The workflow (1) ~ (5) describes the process of multiple analyses to classify
TCM clinical features and patterns of breast cancer patients, compared with therapeutic principles of Chinese herbal prescriptions by TCM practitioners in Taiwan.
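
Step 2 of this workflow relies on the KPI defined in Materials and Methods, which for a binary yes/no variable is the coefficient of variation and simplifies to sqrt(item_no / item_yes). The following is a minimal Python sketch of that arithmetic, not the authors' implementation. The function names `kpi` and `kpi_closed_form` and the toy counts are assumptions introduced for illustration; the only rule taken from the text is the 5% capture-frequency filter.

```python
import math


def kpi(item_yes: int, item_no: int) -> float:
    """Coefficient of variation (standard deviation / mean) of a yes/no variable."""
    item_all = item_yes + item_no
    mean = item_yes / item_all                      # proportion of "yes" answers
    std = math.sqrt(mean * (item_no / item_all))    # population SD of a 0/1 variable
    return std / mean


def kpi_closed_form(item_yes: int, item_no: int) -> float:
    """Simplified form of the same ratio: sqrt(item_no / item_yes)."""
    return math.sqrt(item_no / item_yes)


# Hypothetical (yes, no) counts for three variables in a dataset of 1,000 records.
toy_counts = {"var_a": (500, 500), "var_b": (200, 800), "var_c": (40, 960)}

for name, (yes, no) in toy_counts.items():
    frequency = yes / (yes + no)
    kept = frequency > 0.05     # capture frequency must exceed 5%
    print(f"{name}: frequency={frequency:.1%} KPI={kpi(yes, no):.3f} kept={kept}")
```

Both functions return the same value for any counts, which is a quick check that the reconstructed chain of equalities in the KPI formula is internally consistent. The sketch does not attempt to reproduce the KPI range reported in Results.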


FIGURE 2 | Flow chart of study cases in CMUH.

TABLE 1 | The top 20 symptoms and the top five signs in breast cancer patients.

Rank | Symptom | n (%) | Rank | Sign | n (%)
1 | Insomnia | 1060 (38.7%) | | Tongue appearance |
2 | Dry mouth | 863 (31.5%) | 1 | Pale red tongue | 516 (18.8%)
3 | Lack of strength | 467 (17.1%) | 2 | Red tongue | 460 (16.8%)
4 | Dizziness | 395 (14.4%) | 3 | Teeth-marked tongue | 241 (8.8%)
5 | Loss of appetite | 377 (13.8%) | 4 | Dark red tongue | 137 (5.0%)
6 | Abdominal distention | 273 (10.0%) | 5 | Dry tongue | 120 (4.4%)
7 | Profuse dreaming | 266 (9.7%) | | |
8 | Bitter taste of mouth | 234 (8.5%) | | Tongue coating |
9 | Lumbago | 231 (8.4%) | 1 | White coating | 1163 (42.5%)
10 | Back pain | 199 (7.3%) | 2 | Thin coating | 804 (29.4%)
11 | Afraid of cold | 194 (7.1%) | 3 | Thin and white coating | 634 (23.2%)
12 | Loose stool | 192 (7.0%) | 4 | Slimy coating | 368 (13.4%)
13 | Headache | 189 (6.9%) | 5 | Thick coating | 213 (7.8%)
14 | Nausea | 181 (6.6%) | | |
15 | Absence of thirst | 175 (6.4%) | | Pulse |
16 | Cough | 175 (6.4%) | 1 | String-like pulse | 865 (31.6%)
17 | Acid regurgitation | 170 (6.2%) | 2 | Slippery pulse | 520 (19.0%)
18 | Soreness | 143 (5.2%) | 3 | Fine pulse | 443 (16.2%)
19 | Nocturia | 139 (5.1%) | 4 | Weak pulse | 369 (13.5%)
20 | Dry eyes | 136 (5.0%) | 5 | Sunken pulse | 333 (12.2%)

each cluster were evaluated. According to the proportion between symptoms and signs, if the difference of an individual tongue appearance or pulse was no more than 3 among clusters, or individual coating was no more than 5 among clusters, it would be considered a PF.

The PF of breast cancer patients included insomnia, dry mouth, lack of strength, dizziness, loss of appetite, bitter taste of mouth, abdominal distention, headache, loose stool, nausea, slippery pulse, and rapid pulse. The number of cases and SF in each cluster subgroup are listed in Table 2.

TCM Patterns of Combinations With Various PF and SF
There were 87 combinations of PF and SF. The analysis of these feature combinations is demonstrated in Supplementary Table 1. Liver-gallbladder dampness-heat (LGDH) was the TCM pattern identified as PF. The main TCM pattern and its

TABLE 2 | The PF and SF in each cluster subgroup.

Categories | PF | S1* | S2 | S3 | S4 | S5
AP | 2738 | 975 | 290 | 634 | 501 | 338
Symptom | Insomnia, dry mouth, lack of strength, dizziness, loss of appetite, bitter taste of mouth, abdominal distention, headache, loose stool, nausea | Profuse dreaming, lumbago, afraid of cold, backache, acid regurgitation, nocturia, cough, soreness | Profuse dreaming, nocturia, afraid of cold, absence of thirst, acid regurgitation | Profuse dreaming, lumbago, absence of thirst, soreness, backache, acid regurgitation, abdominal pain, cough | Lumbago, backache, soreness, profuse dreaming, dry eyes, afraid of cold, cough | Absence of thirst, cough, lumbago, backache, afraid of cold
Pulse | Slippery pulse, rapid pulse | Sunken pulse, weak pulse, fine pulse | String-like pulse, sunken pulse, weak pulse, fine pulse | String-like pulse, fine pulse, weak pulse, moderate pulse, sunken pulse, rough pulse | String-like pulse, fine pulse, weak pulse, sunken pulse, rough pulse, floating pulse | Moderate pulse, sunken pulse, fine pulse, weak pulse, fine pulse, rough pulse
Tongue appearance | | Pale red tongue | Red tongue, dry tongue | Pale red tongue, red tongue, teeth-marked tongue, dark red tongue, dry tongue, enlarged tongue | Pale red tongue, teeth-marked tongue, red tongue, dark red tongue | Pale red tongue, teeth-marked tongue, red tongue
Coating | | Slimy coating, thin coating | Slimy coating, thin coating, white coating, thick coating | Thin coating | White coating, slimy coating, thin coating, thick coating | White coating, thick coating, slimy coating

*SX, Secondary features in cluster X; PF, Primary features; SF, Secondary features; AP, Amount of people.
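
The PF and SF lists in Table 2 are the inputs to the “P + Sx_n” combinations described in Materials and Methods: for cluster x, the PF are joined with the top n SF, and the combination is weighted by the frequency of the n-th (last) SF. The Python sketch below illustrates one plausible reading of that construction. It is not the authors' code: the names `Combination` and `build_combinations` are hypothetical, the PF list is shortened, and the SF frequencies are invented placeholders because per-cluster frequencies are not given in the text.

```python
from typing import NamedTuple


class Combination(NamedTuple):
    label: str             # for example "P + S2_3"
    features: list[str]    # PF followed by the top-n SF
    weight: float          # frequency of the n-th (last) SF in the combination


def build_combinations(cluster: int, pf: list[str],
                       sf_ranked: list[tuple[str, float]]) -> list[Combination]:
    """Build P + Sx_n for n = 1..len(sf_ranked); sf_ranked is sorted by frequency."""
    combos = []
    for n in range(1, len(sf_ranked) + 1):
        top_n = sf_ranked[:n]
        combos.append(Combination(
            label=f"P + S{cluster}_{n}",
            features=pf + [name for name, _ in top_n],
            weight=top_n[-1][1],
        ))
    return combos


pf = ["insomnia", "dry mouth", "lack of strength"]   # shortened PF list
# SF names follow the S2 symptom column of Table 2; frequencies are hypothetical.
sf_cluster2 = [("profuse dreaming", 0.21), ("nocturia", 0.15), ("afraid of cold", 0.12),
               ("absence of thirst", 0.09), ("acid regurgitation", 0.07)]

for combo in build_combinations(2, pf, sf_cluster2):
    print(combo.label, combo.weight, combo.features[len(pf):])
```

Each combination would then be passed to the pattern differentiation software; that step is outside the scope of this sketch.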


TABLE 3 | Average frequency and percentage of the main TCM patterns in each cluster subgroup.

ALL | AF | % | C1* | AF | % | C2 | AF | %
LGDH | 85% | 43% | DLTF | 12% | 35% | LGDH | 70% | 70%
DLTF | 38% | 20% | RDH | 10% | 30% | DLTF | 22% | 22%
RDT | 22% | 12% | LDSD | 8% | 23% | LDSD | 9% | 9%
LKYD | 11% | 6% | SSQD | 4% | 13% | | |
LDSD | 11% | 6% | | | | | |
SSQD | 11% | 5% | | | | | |
RDH | 10% | 5% | | | | | |
QDBS | 7% | 4% | | | | | |

C3 | AF | % | C4 | AF | % | C5 | AF | %
DLTF | 71% | 74% | LGDH | 99% | 59% | LGDH | 99% | 64%
LDSD | 13% | 13% | DLTF | 33% | 19% | RDT | 22% | 15%
LKYD | 6% | 6% | LDSD | 15% | 9% | SSQD | 17% | 11%
QDBS | 6% | 6% | LKYD | 12% | 7% | LDSD | 10% | 7%
 | | | QDBS | 8% | 5% | QDBS | 6% | 4%

*CX, Cluster X; TCM, Traditional Chinese medicine; AF, Average frequency; LGDH, Liver-gallbladder dampness-heat; DLTF, Depressed liver qi transforming into fire; RDT, Retained dampness-toxin; LKYD, Liver-kidney yin deficiency; LDSD, Liver depression and spleen deficiency; SSQD, Spleen-stomach qi deficiency; RDH, Retained dampness-heat; QDBS, Qi deficiency with blood stasis.

percentage in each cluster subgroup are demonstrated in Table 3 and Figure 3. LGDH was the main TCM pattern (43%) among all feature combinations, followed by the patterns of depressed liver qi transforming into fire (DLTF) (20%) and retained dampness-toxin (RDT) (12%).

In cluster 1, DLTF was the main TCM pattern (35%), followed by retained dampness-heat (RDH) (30%), and liver depression and spleen deficiency (LDSD) (23%). LGDH was the main TCM pattern (70%) in cluster 2, followed by DLTF (22%). DLTF was the main TCM pattern (74%) in cluster 3, followed by LDSD (13%). In cluster 4, LGDH was still the main TCM pattern (59%), followed by DLTF (19%), and LDSD (9%). LGDH accounted for the main TCM pattern (64%) in cluster 5, followed by RDT (15%), and spleen-stomach qi deficiency (SSQD) (11%). For detailed definition of each pattern from WHO (World Health Organization. Regional Office for the Western, 2007), please refer to Supplementary Table 2.

The Top 10 of Chinese Herbal Prescriptions in Breast Cancer Patients
As shown in Tables 4 and 5, the top 10 of Chinese herbal prescriptions in breast cancer patients included those that could clear heat, drain dampness and detoxify (29%), harmonize the liver and spleen (19%), tonify qi (18%), nourish the heart to tranquilize (15%), activate blood and resolve stasis (12%), tonify yin (4%), clear heat and resolve phlegm (2%), and offensive purgative (1%). The components of each formula were summarized in Supplementary Table 3.

DISCUSSION

TCM combined with western medical treatment is widely used among breast cancer patients. Previous studies have revealed lower 5-year recurrence and metastasis rate, and decreased incidence of chronic hepatitis while receiving radiotherapy and/or chemotherapy, in breast cancer patients with the

FIGURE 3 | The TCM pattern distribution in each cluster subgroup. LKYD, Liver-kidney yin deficiency; LGDH, Liver-gallbladder dampness-heat; DLTF, Depressed
liver qi transforming into fire; LDSD, Liver depression and spleen deficiency; QDBS, Qi deficiency with blood stasis; SSQD, Spleen-stomach qi deficiency; RDT,
Retained dampness-toxin; RDH, Retained dampness-heat.
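
The percentages in Table 3 and Figure 3 follow the formula given in Materials and Methods, Percentage_ij = f_ij / F_i, where F_i is the sum of the average frequencies of all patterns in cluster i. The Python sketch below applies that formula to the cluster 4 column of Table 3 as a worked check. The function name `pattern_percentages` is an assumption for illustration; the small gaps between computed and printed values are presumably due to the AF values being rounded in the table, which is also an assumption.

```python
def pattern_percentages(avg_freq: dict[str, float]) -> dict[str, float]:
    """Share of each pattern: its average frequency divided by the cluster total."""
    total = sum(avg_freq.values())                          # F_i
    return {pattern: f / total for pattern, f in avg_freq.items()}   # f_ij / F_i


# Average frequencies (AF, in %) and reported percentages for cluster 4, from Table 3.
c4_af = {"LGDH": 99, "DLTF": 33, "LDSD": 15, "LKYD": 12, "QDBS": 8}
c4_reported = {"LGDH": 59, "DLTF": 19, "LDSD": 9, "LKYD": 7, "QDBS": 5}

computed = pattern_percentages(c4_af)
for pattern, share in computed.items():
    print(f"{pattern}: computed {share:.1%}, reported {c4_reported[pattern]}%")
```

For this column every computed share lies within one percentage point of the printed value, which supports reading the “%” column as a normalization of the “AF” column within each cluster.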


TABLE 4 | The top 10 of Chinese herbal prescription including single herbs and combination use of TCM (Liu et al., 2008; Huang et al., 2017).
formulae in breast cancer patients in Taiwan.
Some Chinese medicinal herbs have demonstrated effects in
Herbal prescription Total Therapeutic Effect controlling the progression, increasing the susceptibility to
consumption (g)* radiotherapy and chemotherapy, elevating immunity, and
decreasing the toxicities or side effects of cancer therapies (Yin
Single herb
Hedyotis diffusa Willd. 553153.5 Clear heat, drain dampness
et al., 2013). Based on the potential therapeutic effects of TCM,
and detoxify we explored the relationships between clinical features and TCM
Scutellaria barbata D. Don 498634 Clear heat, drain dampness patterns in breast cancer patients via the applications of machine
and detoxify learning techniques. TCM clinical records were gathered in this
Taraxacum mongolicum 442880.6 Clear heat, drain dampness
study for text analysis.
Hand.-Mazz. and detoxify
Spatholobus suberectus 277550.5 Activate blood and resolve Text analysis is a subfield of natural language processing
Dunn stasis (NLP). In the past, the lack of a widely adopted and consistently
Zizyphus jujuba Mill var. 236498.7 Nourish the yin to tranquilize implemented medical terminology limited the use of machine-
spinosa learning in medical research, especially in the field of TCM. In
Salvia miltiorrhiza Bge. 220686.4 Activate blood and resolve
this study, we used the DeepMedic software to analyze
stasis
Astragalus membranaceus 209072.8 Tonify qi unstructured electronic TCM clinical records. The software
(Fisch.) Bunge standardized and integrated key TCM terminology via the
Polygonum multiflorum 154919.8 Nourish the heart to application of an NLP system and neural network. A total of
Thunb. tranquilize 2,738 breast cancer records were standardized and divided into 5
Fritillaria thunbergii Miq. 148233 Clear heat and resolve
phlegm
subgroups via cluster analysis according to the frequency of
Rheum palmatum L. 66898.8 Offensive purgative clinical features reported in each case. Since patterns were not
directly observable, the TCM patterns were differentiated via
Formula DeepMedic software by analyzing the PF and SF in each
Jia-Wei-Xiao-Yao-San 1604612.1 Harmonize the liver and
cluster subgroup.
spleen
San-Zhong-Kui-Jian-Tang 523792.8 Clear heat, drain dampness
and detoxify
Xue-Fu-Zhu-Yu-Tang | 517921.8 | Activate blood and resolve stasis
Xiang-Sha-Liu-Jun-Zi-Tang | 517717.2 | Tonify qi
Gui-Pi-Tang | 464660 | Nourish the heart to tranquilize
Bu-Zhong-Yi-Qi-Tang | 403355.6 | Tonify qi
Suan-Zao-Ren-Tang | 402041.9 | Nourish the heart to tranquilize
Zhen-Ren-Huo-Ming-Yin | 388555.2 | Clear heat, drain dampness and detoxify
Zhi-Bai-Di-Huang-Wan | 367875 | Tonify yin
Sheng-Mai-Yin | 336522.3 | Tonify qi
*The total consumption of the herb is the number of person-days multiplied by average daily dose.

The TCM Patterns in Breast Cancer Patients
As shown in Table 3 and Figure 3, LGDH was the main TCM pattern (43%) identified in breast cancer patients, which was compatible with the analysis of PF. According to the TCM patterns including LGDH, DLTF, and RDH, the liver is the main disease location of breast cancer, while dampness and heat were the main pathological mechanisms. According to TCM theory, the liver is related to the nerve-endocrine-immune network; it is responsible for the regulation of emotion, the promotion of digestion and absorption, and the maintenance of qi and blood circulation via the nervous and endocrine systems (Liu et al., 2017). In TCM theory, “fire” is the advanced status of “heat” in severity, while “toxin” indicates faster transmission of heat and a worsening condition. Since heat and fire will damage the yin, and the depressed liver qi will impair the function of the spleen, some patients exhibit both yin and spleen qi deficiencies. Qi deficiency with blood stasis (QDBS) was also one of the SF identified in breast cancer patients, since qi deficiency will result in stagnated blood circulation. As exhibited in Table 3 and Figure 3, the frequency of the LGDH and DLTF patterns had a great impact on these cluster subgroups. The presence of some minor TCM patterns also helped to distinguish these five subgroups.

Cluster 1
The percentage of the DLTF pattern (35%) was similar to that of the RDH pattern (30%). Additionally, the percentages of the LDSD (23%) and SSQD (13%) patterns were higher than those of other cluster subgroups. This indicates that there was no dominant TCM pattern in cluster 1.

TABLE 5 | The therapeutic effects of the commonly used Chinese herbs in breast cancer patients in Taiwan.
Therapeutic effect | Total consumption (g)* | Percentage
Clear heat, drain dampness and detoxify | 2407016.1 | 29%
Harmonize the liver and spleen | 1604612.1 | 19%
Tonify qi | 1466667.9 | 18%
Nourish the heart to tranquilize | 1258120.4 | 15%
Activate blood and resolve stasis | 1016158.7 | 12%
Tonify yin | 367875 | 4%
Clear heat and resolve phlegm | 148233 | 2%
Offensive purgative | 66898.8 | 1%
*The total consumption of the herb is the number of person-days multiplied by average daily dose.
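The percentages in Table 5 appear to be each therapeutic effect's share of the summed consumption across the eight listed categories. The paper does not state the denominator explicitly, so the following Python sketch treats that as an assumption and checks it against the reported values. The names `consumption_g` and `percentage_shares` are illustrative and do not come from the study.

```python
# Total consumption (g) per therapeutic effect, copied from Table 5.
consumption_g = {
    "Clear heat, drain dampness and detoxify": 2407016.1,
    "Harmonize the liver and spleen": 1604612.1,
    "Tonify qi": 1466667.9,
    "Nourish the heart to tranquilize": 1258120.4,
    "Activate blood and resolve stasis": 1016158.7,
    "Tonify yin": 367875.0,
    "Clear heat and resolve phlegm": 148233.0,
    "Offensive purgative": 66898.8,
}


def percentage_shares(totals):
    """Return each category's share of the grand total as a whole percent.

    Assumption: the denominator is the sum of the listed categories only.
    """
    grand_total = sum(totals.values())
    return {name: round(100 * grams / grand_total) for name, grams in totals.items()}


shares = percentage_shares(consumption_g)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Under this assumption the computed shares match the Percentage column of Table 5, which suggests the column is normalized over these eight categories rather than over all prescriptions.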


Cluster 2
The percentages and frequencies of the TCM patterns in cluster 2 were similar to those of the overall cases. Compared with other patterns, the LGDH pattern was higher (70%). The secondary pattern in cluster 2 was DLTF.

Cluster 3
The percentage of the DLTF pattern was higher (74%) than in other subgroups. Patterns related to heat were the leading patterns in cluster 3. Unlike other clusters, there was no pattern related to dampness in cluster 3. Due to depressed liver qi and heat, some patients demonstrated patterns of spleen qi deficiency (13%), blood stasis (6%), and liver kidney yin deficiency (LKYD) (6%).

Cluster 4
LGDH was the dominant pattern in cluster 4, according to its high percentage (59%) and frequency (99%), followed by the DLTF pattern. Some patients demonstrated the LDSD and QDBS patterns, similar to cluster 3.

Cluster 5
LGDH was the primary pattern identified in cluster 5, with a high percentage (64%) and frequency (99%). RDT was the secondary pattern (22%) in cluster 5. Based on TCM theory, this indicates that more than one fifth of patients have a higher degree of severity and that the disease develops at a faster rate.

Overall, LGDH was the leading pattern in clusters 2, 4, and 5. Meanwhile, DLTF was the secondary pattern in clusters 2 and 4, but the leading pattern in cluster 3. RDT was the secondary pattern in cluster 5. As for the patterns of spleen stomach deficiency and of blood stasis, higher frequencies were noted in clusters 4 and 5. Of note, there was no leading pattern identified in cluster 1, indicating no clear TCM pattern based on the analysis. More clinical data and information regarding clinical features are required in the TCM pattern analysis of this cluster subgroup. Clinical TCM practitioners need to exercise vigilance when treating patients such as those in cluster 1, as these patients demonstrate no clear direction for Chinese medical treatment. As shown in Supplementary Table 1, the cluster analysis revealed that LGDH was the primary pattern exhibited by breast cancer patients in Taiwan.

Associations Between TCM Patterns and Herbal Medications
According to the DeepMedic software analysis, the liver was the main viscera associated with breast cancer patients, followed by gallbladder, spleen, stomach, and kidney. Dampness, heat, and qi stagnation were the major etiologies associated with breast cancer, followed by yin deficiency, qi deficiency, and blood stasis. As shown in Table 5, the main therapeutic goal of TCM practitioners in Taiwan for treatment of breast cancer patients was to clear heat, drain dampness, and detoxify, consistent with the patterns of LGDH, DLTF, RDT, and RDH. This result corresponds with a previous study by Zhang et al., which reported that heat-clearing and detoxifying herbs are commonly prescribed in TCM formulas for the treatment of cancer (Zhang et al., 2017). The therapeutic principle of harmonizing the liver and spleen is consistent with LDSD. The therapeutic principle of tonifying qi is consistent with the patterns of LDSD, SSQD, and QDBS. The principle of activating blood and resolving stasis is consistent with the pattern of QDBS. The principle of tonifying yin is consistent with the pattern of LKYD. The principle of clearing heat and resolving phlegm is consistent with the patterns of RDT and RDH. Collectively, the therapeutic principles of these Chinese herbal medications are suited to the needs of breast cancer patients based on their TCM patterns.

Limitations
Bias in the TCM pattern differentiation indeed exists, as it is difficult to adjust the weight based on the frequency of each clinical feature in the DeepMedic software. Moreover, selection bias may be present because the clinical cases retrieved in this study came from a single medical center. Owing to a limited number of clinical cases, it is difficult to elucidate the TCM patterns in the different stages of breast cancer. Further study is necessary to evaluate whether different TCM patterns are related to the progression of tumor growth or to the side effects of different therapeutic modalities.

CONCLUSION
This is the first study to apply a machine-learning model to standardize EMR terminology and analyze TCM patterns in breast cancer patients. With the application of neural network and cluster analyses, five primary TCM patterns were identified based on the clinical symptoms and signs reported in breast cancer patients. The therapeutic principles and prescriptions by TCM clinical practitioners focus on treating dampness, heat, and qi stagnation as the major pathologies in patients with breast cancer. In conclusion, machine learning technology could assist TCM practitioners to comprehensively differentiate patterns and identify effective Chinese herbal medicine treatments in clinical practice.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding authors.

ETHICS STATEMENT
This study was approved by the Research Ethics Committee of China Medical University and Hospital, Taichung, Taiwan (CMUH107-REC2-023). All of the datasets analyzed were decoded, so the review board waived the requirement for patients to sign informed consent.


AUTHOR CONTRIBUTIONS
W-TH and H-HH equally wrote the draft and interpreted the data. W-TH, Y-WK, S-CO, H-HH, Y-CL, and Z-RY collected and assembled the data. W-TH and Y-WK analyzed the data. JL and MC provided methodological support and rectified all of the analyzed data. B-CS and S-TH designed and conceived the study, and edited the manuscript. All of the authors approved the final manuscript.

FUNDING
This work was supported and funded by the Ministry of Science and Technology of Taiwan (MOST 108-2320-B-039-022), Health and Welfare Surcharge of Tobacco Products, China Medical University Hospital Cancer Research Center of Excellence (MOHW108-TDU-B-212-124024), China Medical University Hospital (DMR-108-007, DMR-108-009, DMR-108-044 and CRS-108-001), An-Nan Hospital, China Medical University (ANHRF-108-06 and ANHRF-108-08) and the Chinese Medicine Research Center, China Medical University, under the Higher Education Sprout Project, Ministry of Education (CMRC-CHM-1) in Taiwan.

ACKNOWLEDGMENTS
We thank DeepMedic, Inc. (Taipei, Taiwan) for sharing the software of DeepMedic for data analysis and extensive comments. The authors would like to thank James Waddell of Concise Language Services for the critical reading and revision of our manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2020.00670/full#supplementary-material

REFERENCES
Balneaves, L. G., Bottorff, J. L., Hislop, T. G., and Herbert, C. (2006). Levels of commitment: exploring complementary therapy use by women with breast cancer. J. Altern. Complement Med. 12 (5), 459–466. doi: 10.1089/acm.2006.12.459
Boon, H. S., Olatunde, F., and Zick, S. M. (2007). Trends in complementary/alternative medicine use by breast cancer survivors: comparing survey data from 1998 and 2005. BMC Womens Health 7 (4). doi: 10.1186/1472-6874-7-4
Chen, Z., Gu, K., Zheng, Y., Zheng, W., Lu, W., and Shu, X. O. (2008). The use of complementary and alternative medicine among Chinese women with breast cancer. J. Altern. Complement Med. 14 (8), 1049–1055. doi: 10.1089/acm.2008.0039
Chung, V. C., Wu, X., Lu, P., Hui, E. P., Zhang, Y., Zhang, A. L., et al. (2016). Chinese Herbal Medicine for Symptom Management in Cancer Palliative Care: Systematic Review And Meta-analysis. Med. (Baltimore) 95 (7), e2793. doi: 10.1097/MD.0000000000002793
Crocetti, E., Crotti, N., Feltrin, A., Ponton, P., Geddes, M., and Buiatti, E. (1998). The use of complementary therapies by breast cancer patients attending conventional treatment. Eur. J. Cancer 34 (3), 324–328. doi: 10.1016/s0959-8049(97)10043-0
Dodd, M., Janson, S., Facione, N., Faucett, J., Froelicher, E. S., Humphreys, J., et al. (2001). Advancing the science of symptom management. J. Adv. Nurs. 33 (5), 668–676. doi: 10.1046/j.1365-2648.2001.01697.x
Huang, K. C., Yen, H. R., Chiang, J. H., Su, Y. C., Sun, M. F., Chang, H. H., et al. (2017). Chinese Herbal Medicine as an Adjunctive Therapy Ameliorated the Incidence of Chronic Hepatitis in Patients with Breast Cancer: A Nationwide Population-Based Cohort Study. Evid. Based Complement Alternat. Med. 2017, 1052976. doi: 10.1155/2017/1052976
Lin, Y. C., Huang, W. T., Ou, S. C., Hung, H. H., Cheng, W. Z., Lin, S. S., et al. (2019). Neural network analysis of Chinese herbal medicine prescriptions for patients with colorectal cancer. Complement Ther. Med. 42, 279–285. doi: 10.1016/j.ctim.2018.12.001
Ling, S., and Xu, J. W. (2013). Model organisms and traditional chinese medicine syndrome models. Evid. Based Complement Alternat. Med. 2013, 761987. doi: 10.1155/2013/761987
Liu, S., Zhao, J., Liu, J., Sun, Z. -P., Hua, Y. -Q., Lu, D. -M., et al. (2008). Effects of Ru'ai Shuhou Recipe on 5-year recurrence rate after mastectomy in breast cancer. J. Chin. Integr. Med. 6 (10), 1000–1004. doi: 10.3736/jcim20081003
Liu, Z. W., Shu, J., Tu, J. Y., Zhang, C. H., and Hong, J. (2017). Liver in the Chinese and Western Medicine. Integr. Med. Int. 4 (1-2), 39–45. doi: 10.1159/000466694
Wang, Y., Yu, Z., Jiang, Y., Liu, Y., Chen, L., and Liu, Y. (2012). A framework and its empirical study of automatic diagnosis of traditional Chinese medicine utilizing raw free-text clinical records. J. BioMed. Inform 45 (2), 210–223. doi: 10.1016/j.jbi.2011.10.010
World Health Organization, Regional Office for the Western Pacific (2007). WHO international standard terminologies on traditional medicine in the Western Pacific Region (Manila: WHO Regional Office for the Western Pacific).
Yin, S. Y., Wei, W. C., Jian, F. Y., and Yang, N. S. (2013). Therapeutic Applications of Herbal Medicines for Cancer Patients. Evid. Based Complement. Alternat. Med. 2013, 302426. doi: 10.1155/2013/302426
Zhang, Y., Liang, Y., and He, C. (2017). Anticancer activities and mechanisms of heat-clearing and detoxicating traditional Chinese herbal medicine. Chin. Med. 12 (20). doi: 10.1186/s13020-017-0140-2
Zhang, Z., Beck, M. W., Winkler, D. A., Huang, B., Sibanda, W., Goyal, H., et al. (2018). Opening the black box of neural networks: methods for interpreting neural network models in clinical applications. Ann. Transl. Med. 6 (11), 216. doi: 10.21037/atm.2018.05.32

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Huang, Hung, Kao, Ou, Lin, Cheng, Yen, Li, Chen, Shia and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
