Computer Methods and Programs in Biomedicine: Akhan Akbulut, Egemen Ertugrul, Varol Topcu


Computer Methods and Programs in Biomedicine 163 (2018) 87–100


journal homepage: www.elsevier.com/locate/cmpb

Fetal health status prediction based on maternal clinical history using machine learning techniques

Akhan Akbulut a,b,∗, Egemen Ertugrul b, Varol Topcu b

a Department of Computer Science, North Carolina State University, Raleigh, NC 27606, USA
b Department of Computer Engineering, Istanbul Kultur University, Atakoy Campus Bakirkoy, Istanbul 34156, Turkey

Article info

Article history:
Received 18 February 2018
Revised 19 May 2018
Accepted 8 June 2018

Keywords: Machine learning; Medical diagnosis; Risk prediction; Pregnancy; Fetal health; Prognosis; m-Health

Abstract

Background and Objective: Congenital anomalies are seen in 1–3% of the population. Their probability is estimated primarily through double, triple and quad tests during pregnancy, and ultrasonographic evaluation of the fetus helps to detect and define these abnormalities. About 60–70% of the anomalies can be diagnosed via ultrasonography, while the remaining 30–40% can be diagnosed after childbirth. Medical diagnosis and prediction is a topic closely related to e-Health and machine learning. e-Health applications are critically important, especially for patients who are unable to see a doctor or any other health professional. Our objective is to help clinicians and families better predict fetal congenital anomalies, alongside the traditional pregnancy tests, using machine learning techniques and e-Health applications.

Methods: In this work, we developed a prediction system with assistive e-Health applications that both pregnant women and practitioners can use. A performance comparison (considering Accuracy, F1-Score and AUC measures) was made between 9 binary classification models (Averaged Perceptron, Boosted Decision Tree, Bayes Point Machine, Decision Forest, Decision Jungle, Locally-Deep Support Vector Machine, Logistic Regression, Neural Network, Support Vector Machine), which were trained with the clinical dataset of 96 pregnant women and used to predict fetal anomaly status based on maternal and clinical data. The dataset was obtained through a maternal questionnaire and detailed evaluations by 3 clinicians from the RadyoEmar radiodiagnostics center in Istanbul, Turkey. Our e-Health applications take pregnant women’s health status and clinical history parameters as inputs, recommend physical activities to perform during pregnancy, and inform the practitioners, and finally the patients, about possible risks of fetal anomalies as the output.

Results: The highest prediction accuracy, 89.5%, was obtained during the development tests with the Decision Forest model. In real-life testing with 16 users, the performance was 87.5%. This estimate is sufficient to give an idea of fetal health before the patient visits the physician.

Conclusions: The proposed work aims to provide assistive services to pregnant women and clinicians via an online system consisting of a mobile side for the patients, a web application side for their clinicians, and a prediction system. In addition, we showed the impact of certain clinical data parameters of pregnant women on fetal health status, statistically correlated the parameters with the existence of fetal anomalies, and provided guidelines for future research.

© 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Computer Science, North Carolina State University, Raleigh, NC 27606, USA.
E-mail addresses: [email protected] (A. Akbulut), [email protected] (E. Ertugrul), 160 0 0 [email protected] (V. Topcu).
https://doi.org/10.1016/j.cmpb.2018.06.010
0169-2607/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

The stage of gestation or pregnancy of a woman is a phase that may carry complications for both the mother and the fetus. Fetal health may be affected by the maternal adaptive changes during this period, as well as by the medical history of the mother and maternal/familial attributes. Thus, the observation of fetal health and antenatal care during this phase is crucially important for maternal and fetal health [1].

Clinicians (especially gynaecologists) wish to inform parents about the well-being of their unborn infants, and they do so based on previously studied cases and past experience. Hence, defining the possible familial (predominantly maternal) clinical data that is prone to influence fetal health would help antenatal maternal care.

However, patient-oriented health care might not be fully carried out due to some difficulties, e.g. an insufficient number of hospitals near the patient (villages in outlying rural areas) or overcrowded hospitals in cities. In Turkey, m-Health programs that provide access to health services and health information through mobile services are established on a national level [2], where mobile-cellular subscriptions cover 96.02% and Internet users cover 53.74% of the population [3]. These programs and mobile services show the supportive and critical role m-Health plays in inadequate conditions.

Our main objective is to offer both clinicians and pregnant women an assistive technology with which they can remotely predict the status of fetal health with the help of mobile services and ICT. The prediction is achieved by a machine learning system that learns from a dataset of pregnant patients, including their clinical history and data, and outputs a result with fetal health as the main criterion. In this way, the output of the system helps both sides to take precautions and raises their awareness before the birth. The proposed machine learning system uses a decision tree learning method, which is explained in detail in the methodology section.

Day by day, telehealth [4] is becoming a crucial part of the health care system. In this context, we aim to design and develop a telehealth system, or m-Health system since it supports mobile usage, to prognose the fetal health of pregnant women. We offer an alternative and supportive system for remote patients to obtain clinical services. The contributions of this paper are two-fold:

1. A user-friendly and assistive web and mobile application with a machine learning system employed to predict fetal health status.
2. A dataset, obtained from pregnant women who were accepted at a radiodiagnostics center (RadyoEmar, Bakirkoy, Istanbul), which is used to train the prediction (machine learning) system in order to predict fetal health from the clinical data and history.

In the remainder of this paper, previous work on this topic is reviewed, the methodology and technical details of the proposed model are given, feature importances and weights in the dataset are analyzed, test results from a group of volunteer patients are examined, and finally the paper is concluded.

2. Background

The popularity of machine learning is increasing in almost every domain. Its methods allow researchers to solve diagnostic and prognostic problems in a variety of medical domains [5]. Medical signal analysis, long-term patient tracking and analysis, medical image and sound processing, drug discovery, assistive robotic surgery, and automatic treatment systems or recommendation services for patients are emerging trends of new-generation healthcare applications.

Probably the most promising area of machine learning usage in the healthcare domain is medical diagnosis. The best-known implementations are decision support systems that analyze new cases using trained data sets and that deal with imperfect medical data, such as noisy or missing data. Although our proposed system is not exactly a decision support system, it is intended to be used for pre-diagnostic purposes by the end user.

The rest of this section covers some of the most prominent research projects on predicting pregnancy risks and fetal health using machine learning techniques. As an important indicator of fetal health, low birth weight (LBW) risk in pregnancies was predicted via machine learning methods by Yarlapati et al. [6]. Using a Bayes minimum error rate classifier on Indian health care data, they managed to predict the fetal status as LBW or NOT-LBW with an accuracy of 96.77%. In other research [7], ultrasound images obtained from the lumbar spine of pregnant patients are used to determine the proper needle insertion site. The interspinous region is identified using a Support Vector Machine (SVM) with a Gaussian kernel. After training with 800 images from 20 pregnant subjects, the model was tested with 640 images from a separate set of 16 pregnant patients and achieved a success rate of 95.00%.

Cheng et al. [8] inspected the effects of risk factors such as age and hematocrit over the gestational age of pregnant women. The primary contribution of this study is that the known risk factors affect the gestational hypertension group and the pre-eclampsia group differently, based on 412 pregnant women with 1874 clinical follow-up records. Predicting preterm birth risk was studied by Woolery and Grzymala-Busse [9]. They implemented an expert system for determining preterm birth risk using 18,890 subjects and 214 variables. Their proposed system was 53–88% accurate in predicting preterm delivery for 9419 patients.

In the study of predicting fetal well-being [10], antepartum cardiotocography (CTG) data was used with 8 different machine learning approaches: ANN, SVM, k-NN, RF, CART, Logistic Regression, C4.5, and RBFN. The baby’s heart rate measured from the mother’s abdomen was utilized to extract uterine contraction (UC) and fetal heart rate (FHR) signals to be used in the classifiers. The highest accuracy, 99.2%, was achieved by random forest.

Previous research on machine learning in medical diagnosis and prediction includes the C4.5 decision tree classification algorithm and the Naive Bayes Kernel algorithm, applied to predicting and presenting gestational risks [11,12] and normal or abnormal stages of pregnancy [13].

Kenny et al. [14] aimed to make a prognosis for pre-eclampsia in pregnant women. They used genetic programming to confirm the patterns of metabolites that distinguish plasma from patients with pre-eclampsia using 97 plasma samples. Smyser et al. [15] conducted a study on predicting brain maturity and neurodevelopmental outcome in infants using multivariate pattern analysis on neonatal functional MRI data. SVM estimated the birth gestational age of individual infants with 84% accuracy (p < 0.0001).

Czabanski et al. [16] focused on cardiotocography (CTG) for biophysical assessment of fetal state. In the first step, fetal heart rate (FHR) signals are classified with a fuzzy method. In the second phase, using Lagrangian Support Vector Machines (LSVM), the proposed method achieves its best results (highest CC = 92.0% and QI = 88.2%). In the study in which fetal brain development is modeled [17], whole-brain functional connectivity was inspected in 105 preterm infants. Connections were estimated with 80% accuracy using the SVM method. In this regard, results were generated for areas that could not be fully interpreted by magnetic resonance imaging (MRI) analysis. Krupa et al. [18] used empirical mode decomposition and SVM in CTG analysis of newborns. Prediction of normal or at-risk classes on 90 randomly selected records was achieved with 86% accuracy.

Ocak [19] performed a CTG analysis with a UCI dataset [20] consisting of fetal heart rate and uterine contractions and proposed a fetal health estimation model with the SVM method. The classification performance of the SVM is enhanced by using a genetic algorithm to eliminate the irrelevant features from the dataset. His proposed method achieved 99.3% and 100% accuracy in predicting fetal state as normal or pathological, respectively. Research on a different UCI dataset with 2126 fetal CTG recordings performs a full prediction of pathologic cases using a modular neural
network [21]. A traditional neural network method gives similarly high accuracy on the same dataset [22].

Table 1
Comparison of fetal health analysis studies.

Research           No. of features   Machine learning method               Size of dataset   Accuracy   Imp
Yarlapati et al.   18                Bayes minimum error rate classifier   101               96.77%     ✕
Yu et al.          N/A               Support Vector Machine                36                96.15%     ✕
Cheng et al.       14                Statistical Analysis                  412               88.1%      ✕
Woolery et al.     214               Rule Based                            9419              88%        ✓
Lakshmi et al.     17                C4.5 Decision Tree                    600               94.78%     ✓
Kenny et al.       N/A               Genetic Programming                   87                98%        ✕
Smyser et al.      214               Support Vector Machine                50                84%        ✕
Sahin et al.       21                Random Forest                         1831†             99.2%      ✕
Ocak               21                Support Vector Machine                1831†             99.7%      ✕
Krupa et al.       10                Support Vector Machine                15                81.5%      ✕
Huang et al.       23                Artificial Neural Network             2126†             97.78%     ✕
Jadhav et al.      23                Modular Neural Network                2126†             99.5%      ✕
Ball et al.        N/A               Support Vector Machine                105               80.2%      ✕
Czabanski et al.   N/A               Lagrangian Support Vector Machines    51                90.1%      ✕
Our study          23                Decision Forest                       96                89.5%      ✓

Imp: Implementation of the proposed model on a web/mobile based e-Health system. †: UCI machine learning repository datasets.
Evaluation of fetal well-being is a problem that has been examined from many different directions. Some studies focus on predicting premature birth, while others focus on brain development. The two most important elements are the chosen machine learning method and the dataset. Model development is more flexible when working with an original dataset. On the other hand, since the data collection process is laborious and time-consuming, choosing an available dataset such as those from UCI is an alternative. The sources used for the analysis are mostly physiological measurements, biochemical results, MRI images, and ultrasonography. If the model requires measurements made by medical systems, out-of-hospital use is not possible. The most important difference that distinguishes our work from the alternatives is that the proposed model is implemented on an e-Health system. Comparing our approach with others is difficult because the dataset is unique to our work and no other study uses the same feature configuration. A comparison of our study with similar research is given in Table 1. To reduce hospital dependence, we did not use data sources that require medical devices such as CTG. The greatest limitation in the development of such systems is that commodity sensors do not give results with as high accuracy as hospital devices.

3. Methodology

In this section, we explain in detail the proposed e-Health system designed for fetal health status prediction. Section 3.1 describes the system architecture. Section 3.2 describes the dataset consisting of maternal clinical data. Section 3.3 describes the machine learning algorithms evaluated and used for the prediction.

3.1. System architecture

The proposed system consists of five modules: mobile application, mobile services, web application, database and prediction (machine learning) system. The mobile application is the part where patients can interact with the UI to fill in information and communicate with the clinicians and supervisors. The mobile application sends and receives data over the mobile services, which act as its back-end.

The mobile services work with a database server to store and retrieve data, while they establish the communication between the mobile user (patient), the database and the prediction system. The web application works as an intermediary between the desktop user, the prediction system and the database server. The slight difference between the mobile services and the web application is that the web application module is specifically designed for the use of the supervisors or clinicians, whereas the mobile application is designed for the use of patients.

The patients are able to enter their clinical data using the proposed mobile application through the mobile services. The supervisors can observe the clinical data and prediction results of their corresponding patients via the web application.

The activity diagram in Appendix A shows the overall process that occurs when both the patient and the corresponding clinician are interacting within the system.

The entire architecture of the system is depicted in Fig. 1.

Fig. 1. An overview of the proposed system.

3.1.1. Mobile application

The mobile application is mainly developed as a tool for patients to enter clinical history and data, to communicate with a
supervisor (i.e. clinician) and to use the prediction system via a UI provided in the application, as shown in Fig. 2. Fig. 2a and 2b show the form, with the corresponding parameters of the dataset, that is filled out by the patient. Fig. 2c shows the exercise recommendations based on the patient’s inputs.

Fig. 2. Screenshots featuring some of the layouts of the mobile application: (a) dataset input form; (b) dataset input form (cont.); (c) exercise recommendations based on the input.

We added a minor feature, independent of the prediction system, to the mobile application: it offers the pregnant user a suitable exercise schedule considering the patient’s clinical data as well as some other factors (e.g. height and weight). Exercising is not advisable in case of asthma, heart disease or uncontrolled type 1 diabetes. Exercise can also be harmful if vaginal bleeding, spotting or a weak cervix is present. If no such conditions are present, moderate exercise of more than 30 min on most days of the week is advised [23] (unless a medical or an obstetric complication occurs). If the patient was physically active before pregnancy, she can exercise at her former level as long as she is comfortable and has her doctor’s approval. However, if the patient had no physical activity before pregnancy, she is advised to consult a health care provider about exercises such as brisk walking, swimming, indoor stationary bicycling and low-impact aerobics with a certified aerobics instructor; these exercises are relatively safer to perform than the aforementioned exercises of physically active patients and can be recommended to the users.

If the patient’s Body Mass Index (BMI) is within the overweight range, caution is always advised when exercising; thus, the BMI is calculated and categorized according to the height and weight inputs [24,25].

The mobile application is based on the Android operating system, and version 6.0 Marshmallow was selected. Android Studio was used as the IDE during the development of the mobile application [26].

3.1.2. Mobile services

The mobile services are set up on the cloud as a back-end for mobile users. The main objectives of this component are to establish a connection between the mobile user (patient), the database and the prediction system, to use database functions to store and retrieve data, and to help the patient communicate with the clinician.

For the web service realization, a RESTful API [27] was preferred over a SOAP API. RESTful services have better features in terms of system flexibility, scalability and performance compared to SOAP-based services. RESTful services also consume fewer resources (i.e., battery, processor speed, and memory) and do not include complex standards and heterogeneous operations. Hence, RESTful services are easier to consume and compose than SOAP-based services [28].

The mobile services also handle missing/noisy input data and the authentication of patient users, such as secure login and registration.

The mobile services are realized with Microsoft’s Azure Mobile Services platform.

3.1.3. Web application

The web application, serving as a back-end for the desktop user (clinician), has almost the same purpose as the mobile services. Besides establishing communication between the clinician and the patient, the web application is also designed for the clinicians/supervisors to store, observe and inspect the patients’ clinical history/data and to interact with the prediction system.

A clinician can see a patient’s clinical data and prediction result if the patient has used the mobile application to enter clinical data parameters, as shown in Appendix C and Appendix D. The prediction result is shown both in binary form (healthy or anomaly risk/anomaly) and as a scored probability (in percentage form).
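The two output forms mentioned here, a binary label and a scored probability shown as a percentage, can both be derived from a single model score. The sketch below is illustrative only: the function name, the label strings and the default threshold of 0.5 are assumptions and not the authors' implementation (the test section of the paper reports a threshold of 0.4).

```python
def label_prediction(probability, threshold=0.5):
    """Turn a scored probability into (binary label, percentage string).

    `threshold` is a hypothetical default; pass 0.4 to mirror the value
    reported in the paper's test section.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # The positive class corresponds to "anomaly risk/anomaly".
    label = "anomaly risk/anomaly" if probability >= threshold else "healthy"
    return label, f"{probability * 100:.1f}%"


if __name__ == "__main__":
    print(label_prediction(0.42, threshold=0.4))
```

Lowering the threshold trades more false alarms for fewer missed anomalies, which is a common choice when a missed positive is the costlier error.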
The web application is responsible for handling missing/noisy input data and for the secure authentication of clinicians and doctors in the system. We also designed it to be responsive, so that it is user-friendly and accessible from any device that can run a web browser.

Due to its high compatibility with Azure servers, the ASP.NET platform was preferred for the development of the web application [29].

3.1.4. Database

The primary goal of having a database in this architecture is to keep the records of the clinical dataset. Thus, the database plays a central role in obtaining the best possible results from the machine learning system. The more clinical data the database holds, the better the results the machine learning system can produce.

The database is also responsible for keeping the user credentials for authentication and data. As it is shared by both sides and platforms, it enables the web application and the mobile services to serve valid data to the users.

To achieve a well-functioning and compatible structure, Microsoft Azure SQL Database V12 was chosen for this purpose [30].

3.1.5. Prediction (machine learning) system

In order to predict fetal health with the available dataset, a machine learning system had to be set up. The Microsoft Azure ML (Machine Learning) service has many built-in features, such as a UI, state-of-the-art machine learning algorithms, and the capability of importing datasets and outputting results with ease; therefore, the Azure ML service was chosen for this purpose.

In this system, a web service is deployed with the trained model and acts as a link between the users and the prediction system. The required inputs from the users are used to predict the fetal health status via this web service. The output can be interpreted by a clinician for further inspection and maternal care of the patient concerned.

The machine learning models employed by the prediction system are explained in Section 3.3.

3.2. Clinical dataset

The dataset is a vital element that makes the machine learning algorithms work. The algorithms need data to be trained on, so that a prediction can be made for a target variable.

As far as we know, there are no open or available clinical datasets with the features that we were looking for. Thus, we decided to collect our own dataset. Currently, the training dataset consists of 96 pregnant women, aged 18–41, who were accepted at the RadyoEmar radiodiagnostics center, Bakirkoy, Istanbul, between January 17, 2015 and February 21, 2017, with the data of 97 fetuses (1 twin pregnancy, 95 single pregnancies). We believe that our dataset is generalizable and can be used in other datasets. However, it is worth noting that our dataset is comprised of clinical data of only Turkish patients, which may not be suitable for some datasets. This dataset was collected with the permission of the aforementioned radiodiagnostics center and the personal data of the patients are disclosed.

Patients were examined with a 4-D Color Doppler Ultrasonography device for detailed anatomical and hemodynamical evaluation of their fetuses (at gestational ages between 15 weeks 5 days and 26 weeks 2 days). Reports, DVD screenings and sonographic photos were also given to the patients.

According to the maternal questionnaire and detailed evaluations by 3 specialists, the features shown in Table 2 were recorded for each patient. The fetal health status corresponds to the target variable of the classification, and the training dataset contains 72 healthy and 25 unhealthy (with anomaly risk or anomaly) fetuses.

3.3. Prediction algorithms

Our main objective is to predict whether the fetal health status is normal or pathological/potentially pathological, based on the maternal clinical data collected through the questionnaire. This can be achieved by classification methods, namely two-class (binary) classification algorithms.

No two two-class (binary) classification algorithms operate in the same way. As each algorithm has its strengths and weaknesses depending on the situation, it is up to the developers of a machine learning model to choose the most suitable algorithm; this is determined by evaluating and comparing the statistical measures and results of each algorithm.

The two-class (binary) classification algorithms that were compared are:

1. Averaged Perceptron
2. Boosted Decision Tree
3. Bayes Point Machine
4. Decision Forest
5. Decision Jungle
6. Locally-Deep Support Vector Machine (SVM)
7. Logistic Regression
8. Neural Network
9. Support Vector Machine (SVM)

(Appendix B shows the proposed machine learning approach with all binary classifiers.)

80% of the clinical dataset was used for training and 20% was used for testing the chosen model. In addition, the Tune Model Hyperparameters module was used with 10-fold cross validation in order to empirically choose the best set of parameters for the specific algorithm and our dataset [31]. Tune Model Hyperparameters had the parameter sweeping mode set to entire grid and the metric for measuring performance set to F1 Score.

The flowchart of the development process with Two-Class Decision Forest is shown in Fig. 3.

Fig. 3. The flowchart diagram showing the machine learning process with two-class decision forest.
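The tuning procedure described in Section 3.3 (an entire-grid parameter sweep scored by F1 under 10-fold cross validation) can be sketched in plain Python. This is not the Azure ML Tune Model Hyperparameters module: the function names, the parameter names `n_trees` and `max_depth`, and the `score_fold` callback are hypothetical, and the caller is assumed to supply the code that trains a model on one fold and returns its F1 score.

```python
from itertools import product


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def entire_grid(param_grid):
    """Enumerate every combination in a {name: [values]} grid."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def sweep(param_grid, n_samples, score_fold, k=10):
    """Return (best_params, best_mean_score).

    score_fold(params, train_idx, val_idx) is a caller-supplied (hypothetical)
    function that trains on train_idx and returns the F1 score on val_idx.
    """
    best_params, best_score = None, float("-inf")
    for params in entire_grid(param_grid):
        scores = [score_fold(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

Because the dataset has far fewer positive than negative cases (25 versus 72), a practical implementation would usually shuffle and stratify the folds; the contiguous folds here are kept only for brevity.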
Table 2
The list of features with data types.

Feature                                                                 Data type   Min value   Max value
Maternal age (yr)                                                       Numeric     18          41
General menstrual cyclic status of mother (regular: 0, irregular: 1)    Binary      0           1
Type of pregnancy (natural: 0, in-vitro / in-vivo fertilization: 1)     Binary      0           1
Fetal age (days)                                                        Numeric     110         183
Blood serotype of mother (a)                                            Numeric     0           7
Past delivery number (history) of mother                                Numeric     0           2
Number of abortus (history)                                             Numeric     0           4
Diabetes history of mother                                              Binary      0           1
Hypertension history of mother                                          Binary      0           1
Other significant illnesses of mother (b)                               Numeric     0           4
Past surgical operations of mother                                      Binary      0           1
Consanguineous marriage status                                          Binary      0           1
Presence of disabled children                                           Binary      0           1
Presence of disabled persons in mother’s family                         Binary      0           1
Presence of disabled persons in father’s family                         Binary      0           1
Result of double test (c)                                               Numeric     0           2
Result of triple test (c)                                               Numeric     0           2
Result of quad test (c)                                                 Numeric     0           2
Drug usage during the pregnancy (d)                                     Numeric     0           8
Alcohol taking status                                                   Binary      0           1
Smoking status                                                          Binary      0           1
Any existing illnesses of mother regarding the pregnancy (e)            Numeric     0           6
Fetal health status (f)                                                 Binary      0           1

(a) 0−, 0+, A−, A+, B−, B+, AB−, AB+.
(b) None, Hypothyroidism, Hyperthyroidism, Epilepsy, Systemic lupus erythematosus (SLE).
(c) No Risk, Low Risk, Trisomy 21.
(d) None, Antibiotics, Hormone regulator, Haematological, Central Nervous System (CNS), Antihypertensive, Anti-inflammatory, Rheumatological, Gastrointestinal.
(e) None, Imminent abortion, Pneumonia, Myositis, MTHFR mutation, Respiratory tract infection (RTI), Urinary infection.
(f) Healthy, Anomaly Risk/Anomaly.

The statistical measures in the evaluation model of a trained and tested system are [32]:

• “Accuracy”, which is the ratio of correctly predicted observations to all observations.

\[ \text{Accuracy} = \frac{\text{TruePositives} + \text{TrueNegatives}}{\text{TruePositives} + \text{FalsePositives} + \text{TrueNegatives} + \text{FalseNegatives}} \tag{1} \]

• “Precision”, which is the ratio of correctly predicted positive observations to all predicted positive observations.

\[ \text{Precision} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalsePositives}} \tag{2} \]

• “Recall”, which is the ratio of correctly predicted positive events to all actual positive events. Recall is calculated as:

\[ \text{Recall} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalseNegatives}} \tag{3} \]

• “F1 Score”, which is the harmonic mean of Precision and Recall. The formula for F1 Score is as follows [33]:

\[ \text{F1 Score} = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}} \tag{4} \]

Each method that was tested has different values for each of the statistical measures in the evaluation model. These measures can be interpreted differently for each algorithm and dataset. For comparison purposes, however, we mostly focused on Accuracy, F1 Score (which combines Precision and Recall), and Area Under Curve (an attribute that makes it easier to compare with other results) [34].

With these statistical measures considered, we can conclude that three algorithms, Boosted Decision Tree, Decision Forest and Decision Jungle, are better suited than the other algorithms (see Table 3 for validation results with the corresponding True Positive, False Positive, True Negative and False Negative cases). Finally, the Two-Class Decision Forest algorithm was chosen as the prediction algorithm of our system, due to its slightly better AUC value compared to the other two algorithms, lower memory footprint compared to Boosted Decision Tree and shorter training time compared to Decision Jungle [35,36].

The Receiver Operating Characteristic (ROC) curves of the best three classification algorithms are shown in Fig. 4, with the True Positive rate on the y-axis and the False Positive rate on the x-axis of each plot (the positive class corresponds to “anomaly or anomaly risk”) [37].

4. Feature weights and ranks of the proposed model

The training dataset consists of many features and elements that influence the result of the prediction algorithm. We evaluated their effect with the help of the “Tree-based feature selection” algorithm using scikit-learn, a machine learning library for the Python language [38].

Using the tree-based estimators, we computed the feature importances shown in Fig. 5. From this graph, we can conclude that the parameters Fetal age, Age (Mother), Blood Serotype, Delivery Number (History), Any illnesses of mother regarding this pregnancy, Test results and Abortus (History) are the most dominant factors that may influence the fetal health status.

However, statements such as “Usage of alcohol or disabled persons in father’s family does not affect fetal health.” would be false and cannot be made, since in our dataset none of the fetal cases had an alcohol-consuming mother or disabled people in the father’s family. Thus, these feature importance values are
A. Akbulut et al. / Computer Methods and Programs in Biomedicine 163 (2018) 87–100 93

Table 3
Comparison between classification algorithms.

Classifier             Accuracy  Precision  Recall  F1 Score  AUC    TP  FP  TN  FN
Averaged Perceptron    0.895     1.000      0.333   0.500     0.750  1   0   15  3
Boosted Decision Tree  0.895     0.750      0.750   0.750     0.933  3   1   14  1
Bayes Point Machine    0.737     0.000      0.000   0.000     0.617  0   1   14  4
Decision Forest        0.895     0.750      0.750   0.750     0.958  3   1   14  1
Decision Jungle        0.895     0.750      0.750   0.750     0.950  3   1   14  1
Locally-Deep SVM       0.789     0.500      0.500   0.500     0.758  2   2   13  2
Logistic Regression    0.789     0.500      0.250   0.333     0.533  1   1   14  3
Neural Network         0.842     1.000      0.250   0.400     0.567  1   0   15  3
SVM                    0.789     0.500      0.250   0.333     0.517  1   1   14  3
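
As an illustration of how the measures in Table 3 relate to the confusion-matrix counts (TP, FP, TN, FN), the following minimal sketch recomputes Accuracy, Precision, Recall and F1 Score (Eq. (4)) for the Decision Forest row. The function name `classification_metrics` is hypothetical and is not part of the authors' Azure ML workflow; the zero-denominator handling is an assumption of this sketch.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Assumption of this sketch: report 0.0 when a denominator is empty,
    # e.g. for a classifier that finds no true positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = recall + precision
    f1 = 2 * (recall * precision) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Decision Forest row of Table 3: TP=3, FP=1, TN=14, FN=1.
decision_forest = classification_metrics(tp=3, fp=1, tn=14, fn=1)
# Bayes Point Machine row of Table 3: TP=0, FP=1, TN=14, FN=4.
bayes_point = classification_metrics(tp=0, fp=1, tn=14, fn=4)

print({name: round(value, 3) for name, value in decision_forest.items()})
print({name: round(value, 3) for name, value in bayes_point.items()})
```

With 19 evaluated cases, 17 correct predictions give an accuracy of 17/19, which rounds to the 0.895 reported for the three tree-based classifiers.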

Fig. 4. ROC plots of the best three classification algorithms.
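
The area under a ROC curve such as those in Fig. 4 can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC in that pairwise form; the function name `auc_from_scores` and all score values are hypothetical and are not taken from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scored probabilities, for illustration only.
anomaly_scores = [0.9, 0.6, 0.35]
healthy_scores = [0.1, 0.2, 0.4, 0.3]

auc = auc_from_scores(anomaly_scores, healthy_scores)
print(round(auc, 3))
```

Here 11 of the 12 possible pairs are ordered correctly, so the sketch yields 11/12. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.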

Fig. 5. Significance of features in the dataset.
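
The paper states that the feature importances were computed with tree-based estimators in scikit-learn [38], but it does not name the estimator or its settings. The sketch below therefore assumes `ExtraTreesClassifier` and uses synthetic data; the feature names are placeholders and the resulting values have no relation to Fig. 5.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in data (assumption): 200 samples, 3 numeric features.
rng = np.random.default_rng(0)
n_samples = 200
feature_names = ["fetal_age", "mother_age", "delivery_number"]  # placeholder names
X = rng.normal(size=(n_samples, len(feature_names)))
# The synthetic label depends almost entirely on the first column.
y = (X[:, 0] + 0.1 * rng.normal(size=n_samples) > 0).astype(int)

# Assumed estimator and settings; the paper does not specify them.
model = ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda pair: pair[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Because these importances are derived from the training data alone, a feature that never varies in the dataset receives an importance near zero regardless of its real clinical relevance.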



Fig. 6. The scored probabilities of each patient displayed on a bar-graph with colors corresponding to the ultrasonography diagnosis.
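
The paper turns each scored probability into a class label with a threshold of 0.4: scores below the threshold are labeled “Healthy” and all others “Anomaly Risk / Anomaly”. A minimal sketch of that rule follows. The function name and the 16 score values are hypothetical, chosen only so that 5 of 16 patients are flagged as in the reported test; the 14-of-16 agreement figure is taken from the paper.

```python
THRESHOLD = 0.4  # threshold stated in the paper; the default would be 0.5


def label_from_score(scored_probability, threshold=THRESHOLD):
    """Map a scored probability in [0, 1] to the two labels used in the paper."""
    if scored_probability < threshold:
        return "Healthy"
    return "Anomaly Risk / Anomaly"


# Hypothetical scored probabilities for 16 patients (illustrative only).
scores = [0.05, 0.12, 0.81, 0.33, 0.40, 0.07, 0.22, 0.65,
          0.18, 0.39, 0.55, 0.10, 0.02, 0.47, 0.29, 0.15]

labels = [label_from_score(score) for score in scores]
flagged = labels.count("Anomaly Risk / Anomaly")
flagged_share = 100 * flagged / len(scores)

# Agreement with ultrasonography as reported in the paper: 14 of 16 cases.
agreement_share = 100 * 14 / 16

print(flagged, flagged_share, agreement_share)
```

Lowering the threshold from 0.5 to 0.4 flags more borderline cases, which trades some false alarms for fewer missed anomalies.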

closely related to the quality and content of the dataset that was used and are only to be taken into account as assistive features that may affect fetal health [39]. A more diverse dataset with a large number of patients could lead to more generalizable findings; however, it is also worth mentioning that definitive medical conclusions should not be made based solely on these results.

5. Test results

For testing purposes, 16 volunteer pregnant patients were asked to use the mobile application, fill out the clinical data parameters and submit them for fetal health prediction. The patients had not been questioned before, and therefore their clinical data were not part of the training dataset. The fetal health status was predicted for each patient and the output was presented to the corresponding clinician for further inspection via our system.

Scored probability is an output generated by the classification algorithm with a value in the range of 0 to 1. We have set our own threshold at 0.4, whereas the default threshold is 0.5. This threshold is used to classify whether the fetus carries any congenital anomaly risk/anomaly or not. If the scored probability is less than 0.4, we assume that the result is “Healthy”. Otherwise, we consider the result as “Anomaly Risk / Anomaly”. Such cases should be approached with caution by the clinicians.

About 60–70% of the anomalies can be diagnosed via ultrasonography. The remaining 30–40% can be diagnosed after childbirth. The ultrasound level 2 scan (mid-pregnancy anomaly scan) report of each patient was examined by a clinician. The anomaly diagnosis of each examination was noted and assigned to one of three groups: healthy, anomaly risk and anomalous. The patients that carry anomaly risk have no present anomalies but may be diagnosed after childbirth; anomaly risk is determined by the double, triple and quad test results of the patient.

According to the ultrasonography diagnoses of the practitioners, 2 cases had an anomaly, 5 cases had anomaly risk and the remaining 9 cases had no anomaly or anomaly risk. In total, 5 patients (31.25% of 16) were classified as having fetuses with anomaly or anomaly risk, and 11 patients (68.75% of 16) were classified as having a healthy fetus by the prediction algorithm (with the threshold at 0.4), as shown in Fig. 6. When the prediction results were compared with the ultrasonographic diagnoses, the prediction had 87.5% success (14 out of 16 were accurately predicted).

6. Discussion

Based on the results, we observed that the Two-class Decision Forest classifier yields satisfactory results for feeding our proposed e-Health system in predicting the fetal health status. Although the dataset is not considered large, its current form gives sufficient results for classification and prediction. We have performed many experiments in our search for the classifier giving the highest accuracy. It is known that different classification algorithms add an overhead to the systems in which they are used. Knowing that the training times and the memory footprints of the algorithms play an important role in our decision making, we prefer to implement our prediction module in a cloud-based environment to reduce or eliminate these overheads. Hosting this service centrally in the cloud will also make it easy to adapt new applications that will be developed in the future.

Beyond doubt, the most critical component of the proposed e-Health system is the prediction module. Increasing the success of the module is possible through studies in two directions. The first approach is to enlarge the dataset so that the model is trained on more data. Alternatively, it may be preferable to adopt more complicated classification algorithms such as deep learning approaches. Deep learning approaches require high-capacity work environments because they involve more computation than shallow models. This is possible for our system since the prediction module is already running on the cloud.

The exercise section of the mobile application has enabled long-term use. After getting the preliminary information about the health status of their fetuses, the users were encouraged to do the minimum amount of sport needed for a healthy pregnancy through the exercises in this section. Those exercises are recommended for women during pregnancy and also in the postpartum period to minimize negative factors. Experts emphasize that weight loss exercises, Yoga and Pilates movements and sports that require athletic performance should be avoided during pregnancy. On the other hand, the recommended simple exercises are not associated with risks for the newborn and can lead to changes in lifestyle that imply long-term benefits [40]. According to the American College of Obstetricians and Gynecologists, physical activity in pregnancy has minimal risks and benefits most pregnant women [41]. Therefore, women with uncomplicated pregnancies are encouraged to engage in aerobic and strength-conditioning exercises before, during, and after pregnancy.

On the whole, we have developed a health system aimed at using the widespread mobile applications in the medical field. Although the system we proposed is not going to replace a medical procedure, it has been appreciated for providing a preliminary result to its users and for suggesting exercise recommendations for future periods.

7. Conclusion

Medical diagnosis and prediction is a topic that is closely related to machine learning. In machine learning, when a suitable algorithm is trained on a dataset, a certain feature can be predicted. This kind of ability can help the experts to take precautions or analyze an issue in more depth. Although the human body is very complex and medical decisions cannot be based solely on predictions, the concept of machine learning in medicine can become a great tool for doctors and clinicians.

The main contribution of this work is an assistive e-Health application for clinicians and pregnant patients that helps to predict congenital anomalies, alongside the traditional methods, using machine learning techniques. We also propose guidelines for future research on the diagnosis of fetal health using machine learning techniques.

The proposed system aims to provide services to pregnant women and clinicians via an assistive online system consisting of a mobile side for the patients, a web application side for their clinicians, a database that works collectively between the two sides and a prediction system. With the help of this system, the clinicians are able to examine each patient case in more detail according to the parameters filled in by the corresponding patient.

In the prediction system, the Two-Class Decision Forest algorithm was trained with the dataset of 96 pregnant women. We have chosen this algorithm among 9 classification algorithms in Azure ML, with respect to their Accuracy, F1 Score and Area Under Curve (AUC) measures. The prediction system was used to predict the fetal health status of a patient with the inputs corresponding to the parameters in the dataset. A prediction accuracy of 89.5% for the fetal health status was achieved during the development tests. In real-life testing with 16 users, the performance was 87.5%. This estimate gives an idea about the capabilities of our system.

As for the continuation of this project, we are planning to make the prediction system more reliable and to encrypt the communication between system components. The reliability of the prediction can be improved by increasing the number of elements in the dataset, while it is also vital to train the machine learning model with a more generalizable dataset. Obtaining new patient data that involve various features (especially the missing features, as seen in Section 4) can have a great impact on achieving a more generalizable dataset. Encryption of the communication can ensure patient privacy and confidentiality among transactions [42]. Encrypted storage of personal information is also targeted for our database. The delays and overhead that will be caused by the encryption process will also be evaluated. In addition, widely accepted standards (such as HL7) are planned to be satisfied within our communication protocols. Our other goal is to develop a mobile app for Apple platforms to support iOS-based mobile devices. This study could also benefit from clinical trials, which would provide more information about its effectiveness.

Acknowledgments

The authors thank the administration, staff, and the practitioners of the RadyoEmar Hospital, Istanbul. The dataset used in this work can be requested via email. Work on the development and extension of this dataset will continue. In this context, we invite researchers who want to support this research to contact us.

Appendix A. An activity diagram (pool) of the system



Appendix B. The diagram of the 9 classification models in Azure ML Environment



Appendix C. A screenshot taken from the clinician’s web application (Profile Page)

Appendix D. A screenshot taken from the clinician’s web application (Patient Details)

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.cmpb.2018.06.010.

References

[1] S. Blackburn, Maternal, Fetal, & Neonatal Physiology-E-Book, Elsevier Health Sciences, 2014.
[2] World Health Organization, E-health Country Profiles - Turkey, 2015, [Online; Last accessed: 5 April 2018], http://www.who.int/goe/publications/atlas/2015/tur.pdf?ua=1.
[3] International Telecommunication Union, ICT Development Index Rank, 2016, [Online; Last accessed: 5 April 2018], http://www.itu.int/net4/ITU-D/idi/2016/.
[4] J.J.A. Moran, A.V. Roudsari, The importance of telehealth for directors and other decision makers, in: ITCH, 2015, pp. 7–11.
[5] M.I. Jordan, T.M. Mitchell, Machine learning: trends, perspectives, and prospects, Science 349 (6245) (2015) 255–260.
[6] A.R. Yarlapati, S.R. Dey, S. Saha, Early prediction of LBW cases via minimum error rate classifier: a statistical machine learning approach, in: Smart Computing (SMARTCOMP), 2017 IEEE International Conference on, IEEE, 2017, pp. 1–6.
[7] S. Yu, K.K. Tan, B.L. Sng, S. Li, A.T.H. Sia, Feature extraction and classification for ultrasound images of lumbar spine with support vector machine, in: Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE, IEEE, 2014, pp. 4659–4662.
[8] W. Cheng, L. Fang, L. Yang, H. Zhao, P. Wang, J. Yan, Varying coefficient models for analyzing the effects of risk factors on pregnant women’s blood pressure, in: Machine Learning and Applications (ICMLA), 2014 13th International Conference on, IEEE, 2014, pp. 55–60.
[9] L.K. Woolery, J. Grzymala-Busse, Machine learning for an expert system to predict preterm birth risk, J. Am. Med. Inform. Assoc. 1 (6) (1994) 439–446, doi:10.1136/jamia.1994.95153433.
[10] H. Sahin, A. Subasi, Classification of the cardiotocogram data for anticipation of fetal risks using machine learning techniques, Appl. Soft Comput. 33 (2015) 231–238.
[11] B.N. Lakshmi, T.S. Indumathi, N. Ravi, Prediction based health monitoring in pregnant women, in: 2015 International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), 2015, pp. 594–598, doi:10.1109/ICATCCT.2015.7456954.
[12] B.N. Lakshmi, T.S. Indumathi, N. Ravi, A comparative study of classification algorithms for predicting gestational risks in pregnant women, in: 2015 International Conference on Computers, Communications, and Systems (ICCCS), 2015, pp. 42–46, doi:10.1109/CCOMS.2015.7562849.
[13] R. Sawant, N. Gaikwad, Hybrid prediction method for pregnancy data set, in: 2015 1st International Conference on Next Generation Computing Technologies (NGCT), 2015, pp. 918–920, doi:10.1109/NGCT.2015.7375253.
[14] L.C. Kenny, W.B. Dunn, D.I. Ellis, J. Myers, P.N. Baker, D.B. Kell, G. Consortium, et al., Novel biomarkers for pre-eclampsia detected using metabolomics and machine learning, Metabolomics 1 (3) (2005) 227.
[15] C.D. Smyser, N.U. Dosenbach, T.A. Smyser, A.Z. Snyder, C.E. Rogers, T.E. Inder, B.L. Schlaggar, J.J. Neil, Prediction of brain maturity in infants using machine-learning algorithms, NeuroImage 136 (2016) 1–9.
[16] R. Czabanski, J. Jezewski, A. Matonia, M. Jezewski, Computerized analysis of fetal heart rate signals as the predictor of neonatal acidemia, Expert Syst. Appl. 39 (15) (2012) 11846–11860.
[17] G. Ball, P. Aljabar, T. Arichi, N. Tusor, D. Cox, N. Merchant, P. Nongena, J.V. Hajnal, A.D. Edwards, S.J. Counsell, Machine-learning to characterise neonatal functional connectivity in the preterm brain, NeuroImage 124 (2016) 267–275.
[18] N. Krupa, M. Ali, E. Zahedi, S. Ahmed, F.M. Hassan, Antepartum fetal heart rate feature extraction and classification using empirical mode decomposition and support vector machine, Biomed. Eng. Online 10 (1) (2011) 6.
[19] H. Ocak, A medical decision support system based on support vector machines and the genetic algorithm for the evaluation of fetal well-being, J. Med. Syst. 37 (2) (2013) 9913.
[20] A. Frank, A. Asuncion, UCI Machine Learning Repository, University of California, School of Information and Computer Science, Irvine, CA, 2010. 213, http://archive.ics.uci.edu/ml
[21] S. Jadhav, S. Nalbalwar, A. Ghatol, Modular neural network model based foetal state classification, in: Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on, IEEE, 2011, pp. 915–917.
[22] M.-L. Huang, Y.-Y. Hsu, Fetal distress prediction using discriminant analysis, decision tree, and artificial neural network, J. Biomed. Sci. Eng. 5 (09) (2012) 526.
[23] R.R. Pate, M. Pratt, S.N. Blair, W.L. Haskell, C.A. Macera, C. Bouchard, D. Buchner, W. Ettinger, G.W. Heath, A.C. King, Physical activity and public health: a recommendation from the Centers for Disease Control and Prevention and the American College of Sports Medicine, JAMA 273 (5) (1995) 402–407, doi:10.1001/jama.1995.03520290054029.
[24] Centers for Disease Control and Prevention (CDC), About Adult BMI, 2017, [Online; Last accessed: 5 April 2018], https://www.cdc.gov/healthweight/assessing/bmi/adult_bmi/index.html.
[25] Centers for Disease Control and Prevention (CDC), Defining Adult Overweight and Obesity, 2016, [Online; Last accessed: 5 April 2018], https://www.cdc.gov/obesity/adult/defining.html.
[26] N. Smyth, Android Studio 2.2 Development Essentials-Android, 7 ed., Payload Media, Inc., 2016.
[27] L. Richardson, M. Amundsen, S. Ruby, RESTful Web APIs: Services for a Changing World, O’Reilly Media, Inc., 2013.
[28] B. Upadhyaya, Y. Zou, H. Xiao, J. Ng, A. Lau, Migration of SOAP-based services to RESTful services, in: 2011 13th IEEE International Symposium on Web Systems Evolution (WSE), 2011, pp. 105–114, doi:10.1109/WSE.2011.6081828.
[29] A. Freeman, Pro ASP.NET MVC 5 Platform, in: Pro ASP.NET MVC 5 Platform, Springer, 2014, pp. 3–8.
[30] D.G. Campbell, G. Kakivaya, N. Ellis, Extreme scale with full SQL language support in Microsoft SQLAzure, in: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, ACM, 2010, pp. 1021–1024.
[31] N.K. Brad Severtson, G. Ericson, Choose parameters to optimize your algorithms in Azure machine learning, 2017, [Online; Last accessed: 5 April 2018], https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-parameters-optimize.
[32] D.M. Powers, Evaluation: from precision, recall and f-measure to ROC, informedness, markedness and correlation, J. Mach. Learn. Technol. 2 (1) (2011) 37–63.
[33] A. de Ruiter, Performance Measures in Azure ML: Accuracy, Precision, Recall and F1 Score, 2015, [Online; Last accessed: 5 April 2018], https://blogs.msdn.microsoft.com/andreasderuiter/2015/02/09/performance-measures-in-azure-ml-accuracy-precision-recall-and-f1-score/.
[34] A.P. Bradley, The use of the area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recognit. 30 (7) (1997) 1145–1159.
[35] W.A.R. Gary Ericson, How to Choose Algorithms for Microsoft Azure Machine Learning, 2017, [Online; Last accessed: 5 April 2018], https://docs.microsoft.com/en-us/azure/machine-learning/studio/algorithm-choice.
[36] Microsoft Azure, Two-class Decision Jungle, 2016, [Online; Last accessed: 5 April 2018], https://msdn.microsoft.com/library/azure/dn905976.aspx.
[37] K. Hajian-Tilaki, Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation, Casp. J. Intern. Med. 4 (2) (2013) 627.
[38] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: machine learning in Python, J. Mach. Learn. Res. 12 (2011) 2825–2830.
[39] C. Nykjaer, N.A. Alwan, D.C. Greenwood, N.A. Simpson, A.W. Hay, K.L. White, J.E. Cade, Maternal alcohol intake prior to and during pregnancy and risk of adverse birth outcomes: evidence from a British cohort, J. Epidemiol. Community Health (2014) 542–549.
[40] S.L. Nascimento, F.G. Surita, J.G. Cecatti, Physical exercise during pregnancy: a systematic review, Curr. Opin. Obstet. Gynecol. 24 (6) (2012) 387–394.
[41] American College of Obstetricians and Gynecologists, et al., Physical activity and exercise during pregnancy and the postpartum period, Obstet. Gynecol. 126 (6) (2015) e135–142. Committee opinion no. 650.
[42] I.T. Agaku, A.O. Adisa, O.A. Ayo-Yusuf, G.N. Connolly, Concern about security and privacy, and perceived control over collection and use of health information are related to withholding of health information from healthcare providers, J. Am. Med. Inform. Assoc. (2014) 374–378.
