2020 - Medical Internet of Things Using Machine Learning
To cite this article: Kanchan Pradhan & Priyanka Chawla (2020): Medical Internet of things using
machine learning algorithms for lung cancer detection, Journal of Management Analytics, DOI:
10.1080/23270012.2020.1811789
This paper empirically evaluates several machine learning algorithms that can be adapted
for lung cancer detection linked with IoT devices. It reviews nearly 65 papers that
predict different diseases using machine learning algorithms. The analysis focuses on
the machine learning algorithms used for detecting several diseases, in order to
identify gaps for future improvement in detecting lung cancer in medical IoT. Each
technique was analyzed step by step, and the overall drawbacks are pointed out. The
paper also analyzes the type of data used for predicting the disease concerned, whether
benchmark data or manually collected data. Finally, research directions are identified
on the basis of the existing methodologies. This will help upcoming researchers to
detect cancer patients accurately at an early stage.
Keywords: disease prediction; lung cancer; machine learning algorithms; internet of
things
1. Introduction
Over the past decades, cancer research has developed continuously and to a great
extent. Multiple research works have implemented numerous models for recognizing
cancer early, before symptoms appear (Zhong & Song, 2019). With the invention of new
models in clinical areas, huge amounts of cancer data are gathered and are freely
accessible to the medical research community. However, one significant challenge
remains for physicians: the disease should be predicted accurately. The USA NLST
reported a significant 20% decrease in deaths from lung cancer, and a corresponding
resolution was issued. The Centers for Medicare and Medicaid Services have paved the
way for national lung cancer screening in the USA by providing Medicare coverage for
lung cancer screening (Al-Anni, Hou, Abdu-aljabar, & Xiang, 2017). The IoT is “a global
information society infrastructure that enables sophisticated services by connecting
(physical and virtual) things based on present and developing communication and
information technologies”. In view of its full potential, IoT is one of the most
important technological advances of the current period.
One of the most significant ways to reduce deaths due to lung cancer (Alahmari
et al., 2018; Cirujeda et al., 2016; Emaminejad et al., 2016) is its early prediction
(Alanni, Hou, Azzawi, & Xiang, 2019; Luo et al., 2019). Early detection needs an
accurate and reliable diagnosis process by which surgeons are able to differentiate
benign from malignant cancer (Li, Xiang, et al., 2018; Ma, Wang, Zou, & Yan,
2017; Wu et al., 2019). For that purpose, pathological examinations and monitoring
tests are performed. To judge whether lung cancer is present, screening tests,
consisting of smoking history, sputum examinations, physical tests, spiral CT
scans, and chest X-rays, give doctors some early information (Al-Kadi &
Watson, 2008; Park, Lee, Weiss, & Motai, 2016). It is important to know the
pathological stage of lung cancer because it can be used for predicting the diagnosis
of a patient and allows specialists to provide appropriate treatment.
Nevertheless, determining the clinical stage of lung cancer (Hawkins et al., 2014;
Kumar, Sankar, Clausi, Taylor, & Wong, 2019) generally consumes considerable time
and money for obtaining the pathology report (Okada et al., 2012; Zamani,
Rezaeieh, & Abbosh, 2015).
The key intention of IoT is to make the surroundings smarter by providing the
required data, historical or real-time, and by applying computational intelligence
automatically to take smart decisions. Multiple studies reported in the existing
contributions are based on various techniques that have the capacity to enable
early detection and prognosis (Arunkumar & Ramakrishnan, 2019; Pati, 2019). Data
mining generally comprises many approaches, such as association rule mining, NN, and
DT, and every method evaluates the information in a different way
(Yu, Ni, Dan, & Xu, 2012; Zhang, Qi, et al., 2019). The information related to lung
cancer taken from IoT devices is used for understanding and managing difficult
environments, allowing greater automation, efficiency, accuracy, wealth generation,
productivity, and better decision making (Das et al., 2019). In these environments,
a significant challenge is the timely processing of huge amounts of data to
deliver highly reliable and accurate observations and decisions so that IoT can
fulfill its promise.
The key contributions of the current survey are depicted below.
evaluation to predict lung cancer at an early stage. In 2010, Kim, Koh, and Park
(2010) proposed a novel framework based on DT for finding the growth of lung cancer
through occupational exposures. In 2014, Zięba, Tomczak, Lubicz, and Świątek (2014)
proposed a boosted SVM model for resolving imbalanced data issues. The presented
solution merged the advantages of ensemble models for uneven data with those of
cost-sensitive SVMs. Later, an oracle-based technique was presented to extract
decision rules from the boosted SVM. In 2015, Engchuan and Chan (2015) suggested a
pathway activity transformation approach for multi-class data, named AFS, which has
high classification power. In 2016, Azzawi, Hou, Xiang, and Alanni (2016) introduced
the GEP technique for predicting lung cancer from microarray data. The suggested
model used two gene selection approaches for extracting the important lung
cancer genes, and several GEP-based prediction approaches were proposed.
In 2016, Petousis, Han, Aberle, and Bui (2016) modeled a group of
DBNs and analyzed them to gain insight into how longitudinal information
can be used for assisting lung cancer monitoring decisions. In 2017, Lynch, Abdollahi
et al. (2017) applied several supervised learning methods to the SEER database for
categorizing lung cancer patients with respect to survival, including Linear
Regression, GBM, SVM, DT, and a custom ensemble. In 2019, Petousis et al. (2019)
suggested a new technique for learning a POMDP, which optimized the detection of lung
cancer by improving the specificity. The NLST data were used to train a Bayesian
Network, and inverse reinforcement learning was employed for finding the reward
function on the basis of the decisions of experts. In 2019, ALzubi et al. (2019)
suggested a WONN-MLB ensemble for lung cancer disease in big data. In the
feature selection phase, the required attributes were chosen using an integrated
Newton-Raphson's MLMR for reducing the classification time. Later, the Boosted
WONN Ensemble classification model was used to categorize patients with the
selected attributes, which enhanced the accuracy and minimized the FPR.
algorithms and for extracting the significant predictive values, a huge dataset was
utilized. Consolidated factors were created for introducing Cox-regression approaches
for heart–lung transplantation using machine learning methods such as DT, logistic
regression, and NN, the traditional predictive approaches, and common-sense-based
interaction variables. Tang, Jiang, Wu, Shen, and Yu (2009) introduced a new gene
programming approach based on equivalence and on the probability density functions of
every gene for the class of interest relative to the others. LKT-SVM was employed to
classify the disease. In 2010, Barakat, Bradley, and Barakat (2010) suggested several
machine learning and data mining approaches for the diagnosis and prognosis of
diabetes. SVMs were used to diagnose diabetes, and an additional module was used to
open the SVM black box by providing an understandable representation of the SVM
classification.
In 2016, Yin, Zeng, Chen, and Fan (2016) evaluated medical services and different
intelligent models used for IoT applications. In contrast, our paper examines the
health system from the point of view of enabling IoT-based frameworks in
medical systems and businesses by employing new strategies. In 2014, Xu, He, and
Li (2014) surveyed IoT in industry, including key enabling technologies,
major industrial IoT applications, and different application models and problems.
IoT enabled the development of the best ventures very effectively. In 2014, Boyi Xu
et al. (2014) presented an IoT framework for emergency medical services to
show how IoT data can be gathered. Experimental results showed that the approach is
workable in a distributed heterogeneous environment, with data kept in a particular
format and stored in the cloud. In 2018, Yang and Xu (2018) presented an
interdisciplinary research investigation of IoT-enabled healthcare, including methods
from computer science, engineering, data science, and behavioral science. In 2018, Li,
Xu, and Zhao (2018) discussed the 4G and 5G systems required to support IoT operation
and security. Their paper reviews the latest research on 5G IoT, key enabling
technologies, and the main research trends and challenges in 5G IoT. In 2020,
Aceto, Persico, and Pescapé (2020) described the main technologies and
paradigms related to Healthcare 4.0 and discussed their main application
scenarios. The benefits of Industry 4.0 technologies and novel cross-disciplinary
problems were also discussed. In 2010, Yuan, Li, Guan, and Xu (2010) proposed
an accurate internet traffic classification technique that sorted internet traffic
into broad application classes according to the network flow parameters obtained from
the packet headers. An optimized feature set was obtained by means of different
classifier selection strategies. In 2020, Akman, Karaman, and Kuzey (2020) used
SVM (Support Vector Machine) and Neural Networks to examine the effect of visa
policies on bilateral trade, using export data from Turkey for 2000–2014. In 2019,
Chi-Hsien and Nagasawa (2019) built a machine learning model to help reduce research
obstacles. Their study analyzed Chinese luxury consumption behavior, as
Chinese consumers contributed 33% of the worldwide luxury market in 2018
and acted as a growth engine in the luxury market. In 2019, Lu (2019)
carried out a broad study of AI and Deep Learning over the period 1961–2018. The
study gives a valuable reference to researchers and practitioners through a
multi-angle systematic analysis of AI, from the underlying mechanisms to practical
applications, from fundamental algorithms to industrial achievements, and from the
current status to future trends.
providing the surgical treatment results, and in every step, a linear kernel function was
employed for enhancing the accuracy. Memarian, Kim, Dewar, Engel, and Staba
(2015) implemented machine learning techniques, in particular a combination of
mutual-information-based feature selection and supervised learning methods on
multimodal information, for predicting the surgery results of patients who were
diagnosed with MTLE and subsequently underwent anteromedial temporal lobectomy.
Barbieri et al. (2015) presented a new model based on various machine learning
methods, in which the differences among CKD patients were explicitly considered, to
provide a general and reliable approach for predicting the response to Erythropoiesis
Stimulating Agents or iron therapy. Dai et al. (2015) aimed to predict heart-related
hospitalizations accurately and effectively on the basis of the existing medical
history. Five machine learning techniques were developed: SVM, AdaBoost, Logistic
Regression, NB, and a variation of the Likelihood Ratio Test. Every method was
trained on the training dataset and evaluated on the testing dataset.
In 2016, Lee and Kim (2016) evaluated the connection between the HW phenotype
and type 2 diabetes in Korean adults and also assessed the predictive power of
different phenotypes, including combinations of individual anthropometric measures and
TG levels. Here, LR and NB classifiers were employed and validated using a 10-fold
cross-validation approach.
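The 10-fold validation scheme mentioned above can be illustrated with a minimal sketch. This is not the procedure of Lee and Kim (2016); the function names `k_fold_indices` and `cross_validate`, and the `evaluate` callback, are hypothetical and introduced only for illustration. The sketch assumes contiguous, unshuffled folds.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, k, evaluate):
    """Call evaluate(train_idx, test_idx) once per fold and average the scores.

    `evaluate` is a hypothetical callback that trains a model on train_idx
    and returns its score on test_idx.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged score reflects the whole dataset rather than a single train/test split.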
In 2017, Chen, Hao, Hwang, Wang, and Wang (2017) streamlined machine
learning techniques to predict chronic disease outbreaks effectively in
disease-frequent communities. Luo, Ding, Liang, Cao, and Chen (2017) offered CPTL for
prioritizing miRNAs associated with disease. By merging the similarities of diseases
and miRNAs, a miRNA-disease network was constructed for predicting miRNA-disease
associations, and the relevance score of each miRNA-disease node was computed
using transduction learning. Zhang et al. (2017) introduced a fast Fourier
transformation-coupled machine learning ensemble technique to predict short-term
disease risk, providing chronic heart disease patients with suitable suggestions
regarding the need for clinical tests. Here, ANN, LS-SVM, and NB were combined
in an ensemble model. Nilashi, binIbrahim, Ahmadi, and Shahmoradi
(2017) suggested an analytical approach for predicting disease by clustering,
noise removal, and prediction methods; CART was used to generate fuzzy rules.
Kotsavasiloglou, Kostikis, Hristu-Varsakelis, and Arnaoutoglou (2017)
implemented an extensive machine learning technique to construct a system
that was able to categorize unknown subjects on the basis of their
line-drawing performance.
In 2018, Khalid and Sezerman (2018) merged structural and sequence features
to predict HIV resistance using SVM and RF methods. Jordanski,
Radovic, Milosevic, Filipovic, and Obradovic (2018) suggested a machine
learning-based technique for computing the WSS distribution, which is relevant to the
initiation and development of atherosclerosis. MLR, Gaussian Conditional Random
Fields, and MLP were presented for capturing the associations among blood density,
dynamic viscosity, velocity, the WSS distribution, and the geometric parameters of
AAA and carotid bifurcation models. Raweh, Nassef, and Badr (2018) intended to
predict cancer with a hybrid model based on feature selection and extraction
mechanisms. The developed model used a filter feature selection technique
named F-score to overcome the high-dimensionality issue, and introduced an
extraction approach that utilized the mean methylation density, symmetry among
the methylation density and the mean methylation density, and an FFT model of normal
and cancer persons for appropriate classification of cancer and reduced training
time. Sedaghat, Fathy, Modarressi, and Shojaie (2018) implemented a two-step
process for improving the outcomes of sequence-based prediction models. The first
step was based on consensus learning, and the second step used SVM in
both unary and binary modes to recognize the predicted interactions, relying on
the binding and network features of genes in the gene regulatory network. Wang
et al. (2018) employed CNN to learn features automatically from
time series of vital signs, and absolute feature embedding to encode feature
vectors efficiently from heterogeneous clinical features. The features learned by CNN
and the statistical features obtained by feature embedding were given to an MLP for
prediction. Çarklı Yavuz, Yurtay, and Ozkan (2018) offered a contribution for
predicting protein secondary structure with nature-inspired algorithms. In the first
stage, the data were trained using CSA, which was designed with respect to the living
immune system. Later, classification was done using MLP, which was inspired by the
biological nervous system.
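The F-score filter mentioned for Raweh, Nassef, and Badr (2018) is not defined in this survey. The following sketch assumes the commonly used two-class F-score, which divides the squared distances of the class means from the overall mean by the sum of the within-class sample variances. The function names `f_score` and `rank_features` are hypothetical, and labels are assumed to be 0 or 1.

```python
def f_score(values, labels):
    """Two-class F-score of one feature (assumed definition, labels in {0, 1})."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("each class needs at least two samples")
    mean_all = sum(values) / len(values)
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    # Unbiased sample variance inside each class.
    var_pos = sum((v - mean_pos) ** 2 for v in pos) / (len(pos) - 1)
    var_neg = sum((v - mean_neg) ** 2 for v in neg) / (len(neg) - 1)
    denom = var_pos + var_neg
    if denom == 0:
        return float("inf")  # classes are perfectly constant on this feature
    return ((mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2) / denom


def rank_features(rows, labels):
    """Return feature indices sorted by decreasing F-score, plus the scores."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores
```

A filter of this kind scores every feature independently of the classifier, so only the top-ranked features need to be passed on to training, which is how it reduces dimensionality.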
In 2019, Mohan, Thirumalai, and Srivastava (2019) introduced a framework for
finding the important features using machine learning approaches, thereby enhancing
the prediction accuracy for cardiovascular disease. By combining several features of
the existing classification methods, a new prediction model, HRFLM, was developed.
Fitriyani, Syafrudin, Alfian, and Rhee (2019) developed a DPM for the
early prediction of type 2 diabetes and hypertension on the basis of a person's risk
factor information. It included an iForest-based outlier detection approach to remove
outliers, SMOTETomek to balance the data distribution, and an ensemble approach
for disease prediction. Prince, Andreotti, and De Vos (2019) suggested a multisource
ensemble learning approach that combined dataset deconstruction and allowed
participants with partial information to be included during the training of machine
learning techniques, to obtain a high participant retention rate. Li et al. (2019)
recommended a non-invasive and accurate detection process for DS that minimizes the
cost of prenatal diagnosis. A cascaded machine learning algorithm was introduced for
predicting DS in three steps: pre-judgment with the iForest approach, an ensemble
approach with a voting mechanism, and the final judgment by the logistic regression
technique. Vásquez-Morales, Martínez-Monterrubio, Moreno-Ger, and Recio-García (2019)
developed a neural network-based classifier for predicting whether a person is at
risk of developing CKD. Haq et al. (2019) presented an SVM for predicting
Parkinson’s disease. To classify healthy people and people affected by Parkinson’s
disease, L1-norm SVM feature selection was employed, which provided
a new feature subset from the dataset on the basis of feature weight values. Davi
et al. (2019) offered a machine learning technique to predict the severity of dengue
fever. Here, SVM was used to determine the loci classification subset, whereas
ANN was employed to classify the patients into dengue fever and severe
dengue. Lai, Zhang, Zhang, Su, and Bin Heyat (2019) suggested an automated
mechanism to predict SCD with a high level of accuracy with the help of measurable
arrhythmic markers. The arrhythmic parameters consisted of two
conduction-repolarization and three repolarization interval ratios. The computed
markers were used for the classification of SCD and normal cases by machine learning
classifiers such as KNN, NB, DT, RF, and SVM. Yoon and Li (2019) presented a TL
approach (PTL), which leveraged other patients’ data while constructing a predictive
model for
a target patient. The special characteristic of the proposed model was its ability to
choose the patients to transfer from, thereby averting negative transfer. Ali et al.
(2019) suggested a new prediction model that used the χ2 statistical model and
prevented the model from over-fitting and under-fitting. The χ2 statistical method was
introduced to remove redundant features, while the optimal configuration of the DNN
was found with an exhaustive search. Wang et al. (2019) developed a new prediction
model for DMP_MI. The missing values were handled using an NB classifier, and the data
were normalized. Then, ADASYN was used to decrease the effect of class imbalance on
the prediction performance. Finally, RF was adopted for generating predictions and
validated with the help of evaluation indicators. Zhang, Ren, Cheng,
Wang, and Wei (2019) implemented GBDT to forecast blood pressure rates
from data gathered using the EIMO tool, which recorded both ECG and
PPG signals. The optimal parameters were chosen using cross-validation
to prevent over-fitting. Perveen, Shahbaz, Keshavjee, and Guergachi (2019) explored
the association between diabetes mellitus and the risk factors of MetS in a
non-conservative setting. The future onset of diabetes was predicted using the related
risk factors of MetS, and data sampling methods were employed to create balanced
training datasets for probing the respective performance of machine learning
algorithms. Dinh, Miertschin, Young, and Mohanty (2019) investigated data-driven
models that employed supervised machine learning algorithms for recognizing
which kind of disease patients are suffering from. Machine learning models such as
SVM, gradient boosting, logistic regression, and RF were merged into a
weighted ensemble technique, which is able to leverage the performance of the different
techniques to enhance the diagnostic accuracy. Ed-daoudy and Maalmi (2019)
introduced a new model for real-time health status prediction and analytics
based on big data methodology, which concentrated on a distributed
machine learning approach. The DT model was first transformed into a distributed,
scalable, fast, and parallel DT with the help of Spark instead of Hadoop
MapReduce, which turned out to be a limitation, and was applied to distributed
sources of different diseases for predicting the health status.
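The weighted ensemble idea described above for Dinh et al. (2019) can be sketched as a weighted average of the positive-class probabilities produced by several models. This is only an illustration under stated assumptions: the survey does not say how the weights were chosen, so the weights, the 0.5 threshold, and the function names `weighted_soft_vote` and `predict_labels` are hypothetical.

```python
def weighted_soft_vote(probabilities, weights):
    """Combine per-model positive-class probabilities by a weighted average.

    probabilities: one list of probabilities per model, all of equal length.
    weights: one non-negative weight per model (hypothetical values).
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probabilities[0])
    combined = []
    for i in range(n_samples):
        weighted_sum = sum(w * p[i] for w, p in zip(weights, probabilities))
        combined.append(weighted_sum / total)
    return combined


def predict_labels(combined, threshold=0.5):
    """Turn combined probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in combined]
```

For example, with three models weighted 2, 1, and 1, a sample scored 0.9, 0.6, and 0.3 receives a combined probability of 0.675 and is labelled positive, so the more trusted model dominates without silencing the others.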
Figure 1. Chronological review on machine learning algorithms used for various disease
predictions.
Figure 2. Use of Machine learning algorithms for different types of diseases till now.
Lin, 2011), MLSVM (Zhong et al., 2012), L1-SVMR (Choi et al., 2011), LFDA-SVM
(Chen, Liu, et al., 2011), Fuzzy SVM (Subasi, 2012), MTD-SVM (Majid et al., 2014),
and LS-SVM (Munsell et al., 2015) for predicting distinct diseases in the earlier
contributions. Similarly, NN is classified as DNN (Ali et al., 2019), CNN (Chen et al.,
2017), MLP (Raweh et al., 2018), GANN (Tong & Schierz, 2011), and CSDNN
(Wang et al., 2018), which are employed for diagnosing different diseases in different
contributions. Moreover, GBDT (Zhang, Ren, et al., 2019) is a modified form of DT, and
CVIFLR (Li et al., 2019) is a modified form of LR used for detecting diseases.
RF and Fuzzy logic are grouped into HRFLM (Mohan et al., 2019) and Fuzzy
SVM (Subasi, 2012), respectively, in order to predict distinct diseases in various
contributions. Therefore, more research is needed for predicting lung
cancer in an efficient manner with the help of improved machine learning techniques.
(1) Supervised Learning: Given an input variable (X) and an output variable (Y),
an algorithm is used to learn the mapping function from the input to the
output.
(2) Unsupervised Learning: It is the training of a model using information that is
neither classified nor labelled. This model can be used to cluster the input data
on the basis of their statistical properties.
(3) Reinforcement Learning: It is learning by interacting with a space or an
environment. A reinforcement learning agent learns from the consequences of its
actions rather than from being taught explicitly. It selects its actions on the
basis of its past experiences (exploitation) and also by new choices (exploration).
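The exploitation and exploration trade-off described in item (3) can be illustrated with an epsilon-greedy agent on a multi-armed bandit. This is a generic textbook sketch, not a method taken from the reviewed papers; the function name `epsilon_greedy_bandit`, the Gaussian reward model, and all parameter values are assumptions made for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a bandit with Gaussian rewards.

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    Returns the agent's reward estimates and how often each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a new choice at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm that looked best in past experience.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts
```

The agent is never told which arm is best; it learns this from the rewards that follow its own actions, which is the distinction from supervised learning drawn in the list above.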
A deep learning model is capable of focusing on the right features by itself, requiring
little guidance from the programmer. These models also partially solve the
dimensionality problem. The idea of DL is to build learning algorithms that mimic the
brain.
Table 1. Merits and demerits of frequently used machine learning algorithm for predicting
diseases.
NN
DT
ELM
RF
  Merits: It has the capability for solving regression and classification issues. It has
  the capacity to handle the missing values automatically.
  Demerits: They are very complex and take more time to build a DT. It is highly
  expensive because training more deep trees requires more storage space.
Adaboost
NB
DBN
KNN
Transduction Learning
  Merits: It is a well-known approach for sparse data. It has the ability to consider
  all the points along with the unlabelled points.
  Demerits: It doesn’t construct a predictive model. The device size is too large.
Transfer Learning
  Merits: With the help of a pre-trained system, training the model on the new task
  becomes fast. It requires less training time, gives better NN performance, and does
  not require a large amount of data.
  Demerits: Negative transfer is a critical problem. Data are transferred only when it
  is suitable.
Ensemble Learning
  Merits: They won’t have the problem of over-fitting. The subsets of data are also
  trained well.
  Demerits: These are computationally expensive. These models suffer from lack of
  interpretability.
Fuzzy
LR
DL comprises a collection of statistical machine learning techniques used to learn
feature hierarchies, often based on artificial neural networks. Some applications of DL
are self-driving cars, voice-controlled assistants, automatic machine translation, game
playing, etc. Deep learning skips the manual step of extracting features: images can be
fed directly to the deep learning algorithm, which predicts the objects.
Table 2 explains the comparison among various deep learning frameworks with
reference to framework, License, programming language, software support, release
date, and supporting algorithms such as CNN and RNN and DBN. In Table 2, it is
observed that C++ and Python are the programming languages most used to develop
deep learning software. Gluon (Guo et al., 2020) uses Python as its programming
language, with Python 3.3 or Jupyter Notebook as software support. It was released in
2017 and supports CNN, RNN, and DBN. The programming language C++ has
been used in frameworks such as PyTorch (Ketkar, 2017), Keras (Jakhar & Hooda,
2018), Caffe (Jia et al., 2014), MXNet (Chen et al., 2015), and TensorFlow (Abadi
et al., 2016) to increase speed. Likewise, distributed computation is becoming common
in some recently released frameworks, for example, TensorFlow, MXNet, Keras, and
Chainer (Tokui et al., 2019). The objective is to further improve the computing
efficiency of deep learning. MXNet supports several interfaces, including C++, Python,
R, Scala, Perl, MATLAB, JavaScript, and Go (Skoymind, 2017). It supports both
computation graph declaration and imperative computation declaration for architecture
design. MXNet supports data and model parallelism and follows a parameter
server scheme to support distributed computation as well. MXNet is highly useful, yet
its performance is not optimized as much as that of other state-of-the-art frameworks.
Framework | Licence | Programming language | Software support | Released year | CNN & RNN | DBN
Gluon (Guo et al., 2020) | Apache 2.0 | Python | Python 3.3 or Jupyter Notebook | 2017 | Yes | Yes
PyTorch (Ketkar, 2017) | BSD | Python, C++, Cuda | Python | 2016 | Yes | Yes
Keras (Jakhar & Hooda, 2018) | MIT Licence | Python, Java | Python | 2015 | Yes | Yes
Caffe (Jia et al., 2014) | BSD | C++ | Python and Matlab | 2015 | Yes | No
MXNet (Chen et al., 2015) | Apache 2.0 | C++ | C++, R, Scala, Perl, Python | 2015 | Yes | Yes
TensorFlow (Abadi et al., 2016) | Apache 2.0 | C++ and Python | Python, Java, C and C++ | 2015 | Yes | Yes
Chainer (Tokui et al., 2019) | MIT | Python | Python | 2015 | Yes | Yes
Deeplearning4j (Skoymind, 2017) | Apache 2.0 | Java | Java, Python, Scala | 2014 | Yes | Yes
Theano (Team et al., 2016) | BSD | Python | Python | 2008 | Yes | Yes
Table 3. Various datasets utilized for predicting diseases using machine learning algorithms.
Lung Cancer
Citations | Accuracy | Sensitivity | Specificity | Precision | F1-score | AUC | Recall | ROC | Miscellaneous
Barakat et al. (2010) | ✓ | ✓ | ✓ | - | - | ✓ | - | - | TPR, FPR
Lee et al. (2014) | - | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | -
Azzawi et al. (2016) | ✓ | ✓ | ✓ | - | - | ✓ | - | - | -
Lee and Kim (2016) | - | - | - | - | - | ✓ | - | - | -
Chen et al. (2017) | ✓ | - | - | ✓ | ✓ | - | ✓ | - | -
Luo et al. (2017) | - | - | - | ✓ | - | - | ✓ | - | FPR and TPR
Zhang et al. (2017) | ✓ | - | - | - | - | - | - | - | Work load saving and risk
✓ ✓ ✓ ✓
Dinh et al. (2019) - - - - -
Ed-daoudy and Maalmi (2019) | ✓ | ✓ | ✓ | - | - | - | - | ✓ | -
Tan et al. (2009) | ✓ | ✓ | ✓ | - | - | - | - | - | -
Akay (2009) | ✓ | ✓ | ✓ | - | - | - | - | ✓ | PPV, and NPV
Çınar et al. (2009) | ✓ | ✓ | ✓ | - | - | - | - | - | -
Anand and Suganthan (2009) | ✓ | - | - | - | - | - | - | - | -
Oztekin et al. (2009) | ✓ | ✓ | ✓ | - | - | - | - | - | -
Tang et al. (2009) | ✓ | ✓ | - | ✓ | - | - | - | - | -
Kim et al. (2010) | ✓ | - | - | - | - | - | - | - | -
Choi et al. (2011) | ✓ | - | - | - | - | - | - | - | -
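Most of the metrics listed in the table above derive from the four counts of a binary confusion matrix. The following sketch uses the standard definitions; the function names are hypothetical, labels are assumed to be 0 or 1, and a ratio with a zero denominator is reported as 0.0 by convention in this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1  # true label 1, predicted 0
    return tp, fp, tn, fn


def _ratio(num, den):
    """Safe division: returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall, TPR), specificity, precision, F1, FPR."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "fpr": _ratio(fp, fp + tn),
    }
```

Sensitivity, recall, and TPR are the same quantity under these definitions, which is why some rows of the table tick only one of them.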
Figure 5. Line representation of the best achieved accuracy during different contributions of
disease prediction.
statistical approach, LR. To obtain an unbiased assessment of the three detection
models, ten-fold cross-validation was used for the performance comparison.
The outcomes showed that DT was the best-performing classifier for predicting
the disease, with an accuracy of 93.6% on the holdout sample; ANN took
second place with an accuracy of 91.2%, and logistic regression
attained an accuracy of 89.2%. Delen (2009) developed prediction techniques for
the survivability of prostate cancer, using SVM along
with the three methods mentioned earlier. The results revealed
that SVM achieved higher accuracy than ANN and DT. Moreover,
prostate cancer survivability was examined with ANN, DT, and LR methods by
Delen and Patil (2006). Multiple methods were compared by Hoogendoorn,
Moons, Numans, and Sips (2014) on the SEER colon cancer patient dataset for predicting
the survival rate, and NNs were found to be best for predicting the survival rate.
Ensemble voting of the three best-performing classifiers, presented by Al-Bahrani,
Agrawal, and Choudhary (2013), resulted in the best prediction and AU-ROC for the
colon cancer survival rate. In some research studies, the survival of lung cancer
patients was examined by evaluating the SEER database using machine learning
algorithms, including SVM, LR (Fradkin, Muchnik, & Schneider, 2005), unsupervised
approaches (Lynch, Berkel, & Frieboes, 2017), and clustering-based techniques (Chen
et al., 2009). In Arshadi and Jurisica (2005), data classification approaches were
assessed for finding the chances that patients with definite indications develop
lung cancer. The performances of DT and NB classifiers were compared by
Dimitoglou, Adams, and Jim (2012) on lung cancer data acquired from the SEER
database; they attained approximately 90% precision in predicting the survival of
patients. Ensemble voting of five DTs and meta-classifiers, presented
by Agrawal, Misra, Narayanan, Polepeddi, and Choudhary (2011, 2012), was found
to give the best prediction of the lung cancer survival rate in terms of precision and
AU-ROC. Many challenges related to the machine learning algorithms are
associated with manual training. A significant difficulty is accurately recognizing the
nature of the data so that they can be pre-processed appropriately before being passed
to machine learning algorithms. The time and the expertise required for this job are
considerable. The research showed a lack of consistency in the
detection accuracy of machine learning techniques over classical prediction
techniques, and the present literature confirms this. Many investigations that
compared machine learning models with classical statistical models have
confirmed that their outcomes were different.
Even though multiple strategies have been used for predicting different types of
diseases, few predictive models using machine learning algorithms have been reported in
the literature for lung cancer detection with IoT integration. Hence, there is
considerable scope to implement better-performing deep learning models that might
produce better prediction outcomes. Moreover, the growing availability of adequate
historical patient data has paved the way for the development of novel deep learning
algorithms for lung cancer prediction. In addition, optimization algorithms can improve
deep learning models. GA (Tong & Schierz, 2011) is simple to implement and can find
suitable solutions within a short time. However, it has a few disadvantages: it may not
find the optimal solution to the defined problem, and its parameters are difficult to
select. The benefit of PSO (Mohebian et al., 2017) is its ability to solve complex
optimization problems, but it offers no convergence guarantee. SMO (Zięba et al., 2014)
is useful for solving the quadratic programming problems that arise in SVM training, and
it also reduces memory requirements; yet it still needs to be improved by introducing a
new variant. The ability of machine learning to solve composite tasks with dynamic
environments and knowledge has contributed to its success in prediction research,
especially for lung cancer, when enabled with novel meta-heuristic algorithms.
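To make the role of such optimizers concrete, here is a minimal sketch of a basic global-best PSO in Python. The paper does not specify how an optimizer would be coupled to a deep learning model, so the objective `toy_loss` is only a hypothetical stand-in for, say, a validation loss over two hyperparameters; the function names, parameter values, and bounds are likewise assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 20,
    n_iters: int = 100,
    inertia: float = 0.7,      # weight on the previous velocity
    cognitive: float = 1.5,    # pull toward each particle's own best position
    social: float = 1.5,       # pull toward the swarm's best position
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Basic global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

def toy_loss(x: Sequence[float]) -> float:
    """Hypothetical smooth objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

best_x, best_loss = pso_minimize(toy_loss, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

On a smooth convex objective like this one the swarm settles close to the minimum, but on the multimodal loss surfaces typical of neural networks the same procedure can stall in a local optimum, which is the limitation noted above.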
7. Conclusion
This paper has studied multiple machine learning methodologies suitable for the
detection of lung cancer in association with IoT devices. The review covered
approximately 65 papers that detect various kinds of diseases by machine learning
techniques and identified the important defects of the existing methodologies. It
concentrated on the different machine learning approaches used to detect many diseases
in order to find gaps for future enhancement of lung cancer prediction in clinical IoT.
Each method was examined, and its overall challenges were noted. For numerous
contributions, the performance metrics were specified along with their simulation
platforms. Moreover, the datasets used to predict the related diseases were examined to
determine whether they were standard benchmarks or manually gathered. Finally, research
gaps were presented on the basis of the progression of intelligent approaches, which
will help future research studies detect lung cancer patients precisely in early stages.
Disclosure statement
No potential conflict of interest was reported by the author(s).
ORCID
Priyanka Chawla http://orcid.org/0000-0002-6029-4122
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., … Ghemawat, S. (2016).
TensorFlow: Large-scale machine learning on heterogeneous distributed systems.
arXiv:1603.04467.
Aceto, G., Persico, V., & Pescapé, A. (2020). Industry 4.0 and health: internet of things, big data,
and cloud computing for healthcare 4.0. Journal of Industrial Information Integration, 18,
100129.
Agrawal, A., Misra, S., Narayanan, R., Polepeddi, L., & Choudhary, A. (2011). A lung cancer
outcome calculator using ensemble data mining on SEER data. Proceedings of the Tenth
International Workshop on Data mining in Bioinformatics, ACM.
Agrawal, A., Misra, S., Narayanan, R., Polepeddi, L., & Choudhary, A. (2012). Lung cancer
survival prediction using ensemble data mining on seer data. Scientific Programming, 20
(1), 29–42.
Akay, M. F. (2009). Support vector machines combined with feature selection for breast cancer
diagnosis. Expert Systems with Applications, 36(2), 3240–3247.
Akman, E., Karaman, A. S., & Kuzey, C. (2020). Visa trial of international trade: Evidence
from support vector machines and neural networks. Journal of Management Analytics, 7
(2), 231–252.
Al-Anni, R., Hou, J., Abdu-aljabar, R. D., & Xiang, Y. (2017). Prediction of NSCLC recurrence
from microarray data with GEP. IET Systems Biology, 11(3), 77–85.
Al-Bahrani, R., Agrawal, A., & Choudhary, A. (2013). Colon cancer survival prediction using
ensemble data mining on SEER data. 2013 IEEE International Conference on Big Data,
Silicon Valley, CA, pp. 9–16.
Al-Kadi, O. S., & Watson, D. (2008). Texture analysis of aggressive and nonaggressive lung
tumor CE CT images. IEEE Transactions on Biomedical Engineering, 55(7), 1822–1830.
Alahmari, S. S., Cherezov, D., Goldgof, D. B., Hall, L. O., Gillies, R. J., & Schabath, M. B.
(2018). Delta radiomics improves pulmonary nodule malignancy prediction in lung cancer
screening. IEEE Access, 6, 77796–77806.
Alanni, R., Hou, J., Azzawi, H., & Xiang, Y. (2019). Cancer adjuvant chemotherapy prediction
model for non-small cell lung cancer. IET Systems Biology, 13(3), 129–135.
Ali, L., Rahman, A., Khan, A., Zhou, M. I., Javeed, A., & Khan, J. A. (2019). An automated
diagnostic system for heart disease prediction based on χ 2 statistical model and optimally
configured deep neural network. IEEE Access, 7, 34938–34945.
ALzubi, J. A., Bharathikannan, B., Tanwar, S., Manikandan, R., Khanna, A., & Thaventhiran,
C. (2019). Boosted neural network ensemble classification for lung cancer disease diagnosis.
Applied Soft Computing, 80, 579–591.
Anand, A., & Suganthan, P. N. (2009). Multiclass cancer classification by support vector
machines with class-wise optimized genes and probability estimates. Journal of Theoretical
Biology, 259(3), 533–540.
Anooj, P. K. (2012). Clinical decision support system: Risk level prediction of heart disease
using weighted fuzzy rules. Journal of King Saud University – Computer and Information
Sciences, 24(1), 27–40.
Arshadi, N., & Jurisica, I. (2005). Data mining for case-based reasoning in high-dimensional
biological domains. IEEE Transactions on Knowledge and Data Engineering, 17(8), 1127–
1137.
Arunkumar, C., & Ramakrishnan, S. (2019). Prediction of cancer using customised fuzzy rough
machine learning approaches. Healthcare Technology Letters, 6(1), 13–18.
Åström, F., & Koker, R. (2011). A parallel neural network approach to prediction of Parkinson’s
disease. Expert Systems with Applications, 38(10), 12470–12474.
Azzawi, H., Hou, J., Xiang, Y., & Alanni, R. (2016). Lung cancer prediction from microarray
data by gene expression programming. IET Systems Biology, 10(5), 168–178.
Babu, G. S., & Suresh, S. (2013). Parkinson’s disease prediction using gene expression – a pro-
jection based learning meta-cognitive neural classifier approach. Expert Systems with
Applications, 40(5), 1519–1529.
Barakat, N., Bradley, A. P., & Barakat, M. N. H. (2010). Intelligible support vector machines for
diagnosis of diabetes mellitus. IEEE Transactions on Information Technology in Biomedicine,
14(4), 1114–1120.
Barbieri, C., Mari, F., Stopper, A., Gatti, E., Escandell-Montero, P., Martínez-Martínez, J. M.,
& Martín-Guerrero, J. D. (2015). A new machine learning approach for predicting the
response to anemia treatment in a large cohort of end stage renal disease patients undergoing
dialysis. Computers in Biology and Medicine, 61, 56–61.
Capriotti, E., & Altman, R. B. (2011). A new disease-specific machine learning approach for the
prediction of cancer-causing missense variants. Genomics, 98(4), 310–317.
Çarklı Yavuz, B., Yurtay, N., & Ozkan, O. (2018). Prediction of protein secondary structure
with clonal selection algorithm and multilayer perceptron. IEEE Access, 6, 45256–45261.
Chen, A. H., & Lin, C.-H. (2011). A novel support vector sampling technique to improve classi-
fication accuracy and to identify key genes of leukaemia and prostate cancers. Expert Systems
with Applications, 38(4), 3209–3219.
Chen, D., Xing, K., Henson, D., Sheng, L., Schwartz, A. M., & Cheng, X. (2009). Developing
prognostic systems of cancer patients by ensemble clustering. Journal of Biomedicine and
Biotechnology, 2009, 1–7.
Chen, H.-L., Liu, D.-Y., Yang, B., Liu, J., & Wang, G. (2011). A new hybrid method based on
local fisher discriminant analysis and support vector machines for hepatitis disease diagnosis.
Expert Systems with Applications, 38(9), 11796–11803.
Chen, H.-L., Yang, B., Liu, J., & Liu, D.-Y. (2011). A support vector machine classifier with
rough set-based feature selection for breast cancer diagnosis. Expert Systems with
Applications, 38(7), 9014–9022.
Chen, M., Hao, Y., Hwang, K., Wang, L., & Wang, L. (2017). Disease prediction by machine
learning over Big data from healthcare communities. IEEE Access, 5, 8869–8879.
Chen, T., Li, M., Li, Y., Lin, M., Wang, N., Wang, M., … Zhang, Z. (2015). MXNet: A flexible
and efficient machine learning library for heterogeneous distributed systems. CoRR
abs/1512.01274. Retrieved from http://arxiv
Chi-Hsien, K., & Nagasawa, S. (2019). Applying machine learning to market analysis: Knowing
your luxury consumer. Journal of Management Analytics, 6(4), 404–419.
Choi, H., Yeo, D., Kwon, S., & Kim, Y. (2011). Gene selection and prediction for cancer classi-
fication using support vector machines with a reject option. Computational Statistics & Data
Analysis, 55(5), 1897–1908.
Çınar, M., Engin, M., Engin, E. Z., & Ziya Ateşçi, Y. (2009). Early prostate cancer diagnosis by
using artificial neural networks and support vector machines. Expert Systems with
Applications, 36(3), 6357–6361.
Cirujeda, P., Cid, Y. D., Müller, H., Rubin, D., Aguilera, T. A., Loo, B. W., … Depeursinge, A.
(2016). A 3-D riesz-covariance texture model for prediction of nodule recurrence in lung CT.
IEEE Transactions on Medical Imaging, 35(12), 2620–2630.
Dai, W., Brisimi, T. S., Adams, W. G., Mela, T., Saligrama, V., & Paschalidis, I. (2015).
Prediction of hospitalization due to heart diseases by supervised learning methods.
International Journal of Medical Informatics, 84(3), 189–197.
Das, A., Rad, P., Choo, K. K. R., Nouhi, B., Lish, J., & Martel, J. (2019). Distributed machine
learning cloud teleophthalmology IoT for predicting AMD disease progression. Future
Generation Computer Systems, 93, 486–498.
Davi, C., Pastor, A., Oliveira, T., Neto, F. B. L., Braga-Neto, U., Bigham, A. W., … Acioli-
Santos, B. (2019). Severe dengue prognosis using human genome data and machine learning.
IEEE Transactions on Biomedical Engineering, 66(10), 2861–2868.
Delen, D. (2009). Analysis of cancer data: A data mining approach. Expert Systems, 26(1), 100–
112.
Delen, D., & Patil, N. (2006). Knowledge extraction from prostate cancer data. Proceedings of
the 39th Annual Hawaii International Conference on, vol.5.
Delen, D., Walker, G., & Kadam, A. (2005). Predicting breast cancer survivability: A compari-
son of three data mining methods. Artificial Intelligence in Medicine, 34(2), 113–127.
Dimitoglou, G., Adams, J. A., & Jim, C. M. (2012). Comparison of the C4.5 and a naive Bayes
classifier for the prediction of lung cancer survivability. Journal of Computing, 4(8), 1–9.
Dinh, A., Miertschin, S., Young, A., & Mohanty, S. D. (2019). A data-driven approach to pre-
dicting diabetes and cardiovascular disease with machine learning. BMC Medical Informatics
and Decision Making, 19(211), 1–15.
Ed-daoudy, A., & Maalmi, K. (2019). A new internet of things architecture for real-time predic-
tion of various diseases using machine learning on big data environment. Journal of Big Data,
6, 104.
Emaminejad, N., Qian, W., Guan, Y., Tan, M., Qiu, Y., Liu, H., & Zheng, B. (2016). Fusion of
quantitative image and genomic biomarkers to improve prognosis assessment of early stage
lung cancer patients. IEEE Transactions on Biomedical Engineering, 63(5), 1034–1043.
Engchuan, W., & Chan, J. H. (2015). Pathway activity transformation for multi-class classifi-
cation of lung cancer datasets. Neurocomputing, 165, 81–89.
Fan, Y. J., Yin, Y. H., Xu, L., Zeng, Y., & Wu, F. (2014). IoT-based smart rehabilitation system.
IEEE Transactions on Industrial Informatics, 10(2), 1568–1577.
Fitriyani, N. L., Syafrudin, M., Alfian, G., & Rhee, J. (2019). Development of disease prediction
model based on ensemble learning approach for diabetes and hypertension. IEEE Access, 7,
144777–144789.
Fradkin, D., Muchnik, I., & Schneider, D. (2005). Machine learning methods in the analysis of
lung cancer survival data. DIMACS Technical Report.
Guo, J., He, H., He, T., Lausen, L., Li, M., & Lin, H. (2020). GluonCV and GluonNLP: Deep
learning in computer vision and natural language processing. Journal of Machine Learning
Research, 21, 1–7.
Haq, A. U., Li, J. P., Memon, M. H., Khan, J., Malik, A., Ahmad, T., … Shahid, M. (2019).
Feature selection based on L1-norm support vector machine and effective recognition
system for Parkinson’s disease using voice recordings. IEEE Access, 7, 37718–37734.
Hawkins, S. H., Korecki, J. N., Balagurunathan, Y., Gu, Y., Kumar, V., Basu, S., … Gillies, R. J.
(2014). Predicting outcomes of nonsmall cell lung cancer using CT image features. IEEE
Access, 2, 1418–1426.
Hoogendoorn, M., Moons, L. M. G., Numans, M. E., & Sips, R.-J. (2014). Utilizing data
mining for predictive modeling of colorectal cancer using electronic medical records.
International Conference on Brain Informatics and Health BIH 2014: Brain Informatics and
Health (pp. 132–141).
Huang, Z. W., Mcwilliams, A., Lui, H., Mclean, D., Lan, S., & Zeng, H. S. (2003). Near-infra-
red Raman spectroscopy for optical diagnosis of lung cancer. International Journal of Cancer,
107(6), 1047–1052.
Jakhar, K., & Hooda, N. (2018, December). Big data deep learning framework using Keras: A case
study of Pneumonia prediction. 2018 4th International Conference on computing communi-
cation and automation (ICCCA) (pp. 1–5). IEEE.
Jemal, A., Bray, F., Center, M. M., Ferlay, J. J., Ward, E., & Forman, D. (2011). Global cancer
statistics. Cancer Journal for Clinicians, 61(2), 69–90.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., & Girshick, R. (2014). Caffe:
Convolutional architecture for fast feature embedding. Proceedings of the 22nd ACM inter-
national conference on Multimedia, (pp. 675–678).
Jordanski, M., Radovic, M., Milosevic, Z., Filipovic, N., & Obradovic, Z. (2018). Machine
learning approach for predicting wall shear distribution for abdominal aortic aneurysm
and carotid bifurcation models. IEEE Journal of Biomedical and Health Informatics, 22(2),
537–544.
Kaya, Y., & Uyar, M. (2013). A hybrid decision support system based on rough set and extreme
learning machine for diagnosis of hepatitis disease. Applied Soft Computing, 13(8), 3429–
3438.
Ketkar, N. (2017). Deep learning with python: A hands-on introduction. Berkeley, CA: Apress.
Khalid, Z., & Sezerman, O. U. (2018). Prediction of HIV drug resistance by combining sequence
and structural properties. IEEE/ACM Transactions on Computational Biology and
Bioinformatics, 15(3), 966–973.
Kim, T.-W., Koh, D.-H., & Park, C.-Y. (2010). Decision tree of occupational lung cancer using
classification and regression analysis. Safety and Health at Work, 1(2), 140–148.
Kotsavasiloglou, C., Kostikis, N., Hristu-Varsakelis, D., & Arnaoutoglou, M. (2017). Machine
learning-based classification of simple drawing movements in Parkinson’s disease. Biomedical
Signal Processing and Control, 31, 174–180.
Kumar, D., Sankar, V., Clausi, D., Taylor, G. W., & Wong, A. (2019). SISC: End-to-end inter-
pretable discovery radiomics-driven lung cancer prediction via stacked interpretable sequen-
cing cells. IEEE Access, 7, 145444–145454.
Lai, D., Zhang, Y., Zhang, X., Su, Y., & Bin Heyat, M. B. (2019). An automated strategy for
early risk identification of sudden cardiac death by using machine learning approach on mea-
surable arrhythmic risk markers. IEEE Access, 7, 94701–94716.
Lee, B. J., & Kim, J. Y. (2016). Identification of type 2 diabetes risk factors using phenotypes
consisting of anthropometry and triglycerides based on machine learning. IEEE Journal of
Biomedical and Health Informatics, 20(1), 39–46.
Lee, B. J., Ku, B., Nam, J., Pham, D. D., & Kim, J. Y. (2014). Prediction of fasting plasma
glucose status using anthropometric measures for diagnosing type 2 diabetes. IEEE Journal
of Biomedical and Health Informatics, 18(2), 555–561.
Lee, J., Keam, B., Jang, E. J., Park, M. S., Lee, J. Y., Kim, D. B., … Kim, H.-L. (2011).
Development of a predictive model for type 2 diabetes mellitus using genetic and clinical
data. Osong Public Health and Research Perspectives, 2(2), 75–82.
Li, L., Liu, W., Zhang, H., Jiang, Y., Hu, X., & Liu, R. (2019). Down syndrome prediction using
a cascaded machine learning framework designed for imbalanced and feature-correlated
data. IEEE Access, 7, 97582–97593.
Li, M., Xiang, Z., Lian, Z., Xiao, L., Zhang, J., & Wei, Z. (2018). Prediction of lung motion
from four-dimensional computer tomography (4DCT) images using Bayesian registration
and trajectory modelling. IEEE Access, 6, 2803–2811.
Li, S., Xu, L., & Zhao, S. (2018). 5G internet of things: A survey. Journal of Industrial
Information Integration, 10, 1–9.
Lu, Y. (2019). Artificial intelligence: A survey on evolution, models, applications and future
trends. Journal of Management Analytics, 6(4), 404–419.
Luo, J., Ding, P., Liang, C., Cao, B., & Chen, X. (2017). Collective prediction of disease-associ-
ated miRNAs based on transduction learning. IEEE/ACM Transactions on Computational
Biology and Bioinformatics, 14(6), 1468–1475.
Luo, Y., Shan, D. M., Ray, D., Matuszak, M., Jolly, S., Lawrence, T., … Naqa, I. E. (2019).
Development of a fully cross-validated Bayesian network approach for local control predic-
tion in lung cancer. IEEE Transactions on Radiation and Plasma Medical Sciences, 3(2), 232–
241.
Lynch, C. M., Abdollahi, B., Fuqua, J. D., de Carlo, A. R., Bartholomai, J. A., Balgemann, R.
N., … Frieboes, H. B. (2017). Prediction of lung cancer patient survival via supervised
machine learning classification techniques. International Journal of Medical Informatics,
108, 1–8.
Lynch, C. M., Berkel, V. H. V., & Frieboes, H. B. (2017). Application of unsupervised analysis
techniques to lung cancer patient data. PLoS One, 12(9), 1–18.
Ma, L., Wang, D. D., Zou, B., & Yan, H. (2017). An eigen-binding site based method for the
analysis of anti-EGFR drug resistance in lung cancer treatment. IEEE/ACM Transactions
on Computational Biology and Bioinformatics, 14(5), 1187–1194.
Majid, A., Ali, S., Iqbal, M., & Kausar, N. (2014). Prediction of human breast and colon cancers
from imbalanced data using nearest neighbor and support vector machines. Computer
Methods and Programs in Biomedicine, 113(3), 792–808.
Memarian, N., Kim, S., Dewar, S., Engel Jr, J., & Staba, R. J. (2015). Multimodal data and
machine learning for surgery outcome prediction in complicated cases of mesial temporal
lobe epilepsy. Computers in Biology and Medicine, 64, 67–78.
Mohabatkar, H., Beigi, M. M., & Esmaeili, A. (2011). Prediction of GABAA receptor proteins
using the concept of Chou’s pseudo-amino acid composition and support vector machine.
Journal of Theoretical Biology, 281(1), 18–23.
Mohan, S., Thirumalai, C., & Srivastava, G. (2019). Effective heart disease prediction using
hybrid machine learning techniques. IEEE Access, 7, 81542–81554.
Mohebian, M. R., Marateb, H. R., Mansourian, M., Angel Mañanas, M., & Mokarian, F.
(2017). A hybrid computer-aided-diagnosis system for prediction of breast cancer recurrence
Tokui, S., Okuta, R., Akiba, T., Niitani, Y., Ogawa, T., Saito, S., … Vincent, H. Y. (2019).
“Chainer: A deep learning framework for accelerating the research cycle” KDD 19,
August 4–8, 2019, Anchorage, AK, USA.
Tong, D. L., & Schierz, A. C. (2011). Hybrid genetic algorithm-neural network: Feature extrac-
tion for unpreprocessed microarray data. Artificial Intelligence in Medicine, 53(1), 47–56.
Valdés-Mas, M. A., Martín-Guerrero, J. D., Rupérez, M. J., Pastor, F., Dualde, C., Monserrat,
C., & Peris-Martínez, C. (2014). A new approach based on machine learning for predicting
corneal curvature (K1) and astigmatism in patients with keratoconus after intracorneal
ring implantation. Computer Methods and Programs in Biomedicine, 116(1), 39–47.
Vásquez-Morales, G. R., Martínez-Monterrubio, S. M., Moreno-Ger, P., & Recio-García, J. A.
(2019). Explainable prediction of chronic renal disease in the Colombian population using
neural networks and case-based reasoning. IEEE Access, 7, 152900–152910.
Wang, H., Cui, Z., Chen, Y., Avidan, M., Abdallah, A. B., & Kronzer, A. (2018). Predicting
hospital readmission via cost-sensitive deep learning. IEEE/ACM Transactions on
Computational Biology and Bioinformatics, 15(6), 1968–1978.
Wang, Q., Cao, W., Guo, J., Ren, J., Cheng, Y., & Davis, D. N. (2019). DMP_MI: An effective
diabetes mellitus classification algorithm on imbalanced data with missing values. IEEE
Access, 7, 102232–102238.
Wu, J., Lian, C., Ruan, S., Mazur, T. R., Mutic, S., Anastasio, M. A., … Li, H. (2019).
Treatment outcome prediction for cancer patients based on radiomics and belief function
theory. IEEE Transactions on Radiation and Plasma Medical Sciences, 3(2), 216–224.
Xu, L., He, W., & Li, S. (2014). Internet of things in industries: A survey. IEEE Transactions on
Industrial Informatics, 10(4), 2233–2243.
Xu, B., Xu, L., Cai, H., Xie, C., Hu, J., & Bu, F. (2014). Ubiquitous data accessing method in
IoT-based information system for emergency medical services. IEEE Transactions on
Industrial Informatics, 10(2), 1578–1586.
Yang, P., & Xu, L. (2018). The Internet of Things (IoT): Informatics methods for IoT-enabled
health care. Journal of Biomedical Informatics, 87, 154–156.
Yin, Y., Zeng, Y., Chen, X., & Fan, Y. (2016). The internet of things in healthcare: An overview.
Journal of Industrial Information Integration, 1, 3–13.
Yoon, H., & Li, J. (2019). A novel positive transfer learning approach for telemonitoring of
Parkinson’s disease. IEEE Transactions on Automation Science and Engineering, 16(1),
180–191.
Yu, H., Ni, J., Dan, Y., & Xu, S. (2012). Mining and integrating reliable decision rules for imbal-
anced cancer gene expression data sets. Tsinghua Science and Technology, 17(6), 666–673.
Yuan, R., Li, Z., Guan, X., & Xu, L. (2010). An SVM-based machine learning method for accu-
rate internet traffic classification. Information Systems Frontiers, 12(2), 149–156.
Zamani, A., Rezaeieh, S. A., & Abbosh, A. M. (2015). Lung cancer detection using frequency-
domain microwave imaging. Electronics Letters, 51(10), 740–741.
Zhang, B., Qi, S., Monkam, P., Li, C., Yang, F., Yao, Y.-D., & Qian, W. (2019). Ensemble lear-
ners of multiple deep CNNs for pulmonary nodules classification using CT images. IEEE
Access, 7, 110358–110371.
Zhang, B., Ren, J., Cheng, Y., Wang, B., & Wei, Z. (2019). Health data driven on continuous
blood pressure prediction based on gradient boosting decision tree algorithm. IEEE
Access, 7, 32423–32433.
Zhang, J., Lafta, R. L., Tao, X., Li, Y., Chen, F., Luo, Y., & Zhu, X. (2017). Coupling a fast
Fourier transformation with a machine learning ensemble model to support recommendations
for heart disease patients in a telehealth environment. IEEE Access, 5, 10674–10685.
Zhong, H., & Song, M. (2019). A fast exact functional test for directional association and cancer
biology applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics,
16(3), 818–826.
Zhong, W., Chow, R., & He, J. (2012). Clinical charge profiles prediction for patients diagnosed
with chronic diseases using multi-level support vector machine. Expert Systems with
Applications, 39(1), 1474–1483.
Zięba, M., Tomczak, J. M., Lubicz, M., & Świątek, J. (2014). Boosted SVM for extracting rules
from imbalanced data in application to prediction of the post-operative life expectancy in the
lung cancer patients. Applied Soft Computing, 14, 99–108.
Appendix
Abbreviation  Description
SVM           Support Vector Machine
LR            Logistic Regression
NB            Naïve Bayes
NN            Neural Network
LKT-SVM       SVM with Local Kernel Transform
L1-SVMR       L1-SVM with Reject option
RF            Random Forest
DT            Decision Tree
KNN           K-Nearest Neighbour
DBN           Dynamic Bayesian Network
ELM           Extreme Learning Machine
CNN           Convolutional Neural Network
MLP           Multi-Layer Perceptron
CSDNN         Cost-Sensitive Deep Neural Network
GANN          Genetic Algorithm-Neural Network
CVIFLR        Cascaded framework of Voting Isolation Forests and Logistic Regression
GBDT          Gradient Boosting Decision Tree
HRFLM         Hybrid Random Forest with Linear Model
SVST          Support Vector Sampling Technique
RS-SVM        Rough Set-based SVM
LFDA-SVM      Local Fisher Discriminant Analysis-SVM
MLSVM         Multi-Level SVM
MTD-SVM       Mega-Trend Diffusion SVM
LS-SVM        Least Squares-SVM
IoT           Internet of Things
AUC           Area Under the Curve
ROC           Receiver Operating Characteristic
NLST          National Lung Screening Trial
AFS           Analysis-of-Variance-based Feature Set
GEP           Gene Expression Programming
GBM           Gradient Boosting Machines
POMDP         Partially-Observable Markov Decision Process
WONN-MLB      Weight Optimized Neural Network with Maximum Likelihood Boosting
MLMR          Maximum Likelihood and Minimum Redundancy
FPR           False Positive Rate
BMI           Body Mass Index
LM            Levenberg–Marquardt
SCG           Scaled Conjugate Gradient
BFGS          Broyden-Fletcher-Goldfarb-Shanno
OVA-SVM       One-Versus-All SVM
RS            Rough Set
BPNN          Back-Propagation Neural Network
FTIR          Fourier Transform Infrared
PSO           Particle Swarm Optimization