The Role of Big Data Analytics in Hospital Management System

International Journal of Pure and Applied Mathematics

Volume 115 No. 7 2017, 31-35

ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
Special Issue


Dhivyalakshmi S.¹, Umamakeswari A.²
¹M.Tech Computer Science & Engineering, School of Computing, SASTRA University, Thanjavur-613401, India
²Associate Dean, School of Computing, SASTRA University, Thanjavur-613401, India
[email protected]

Abstract: Big data analytics has massive potential to improve healthcare by raising the quality of care, saving lives and lowering costs. Big data also helps organisations become more creative and better organized. Like many other industries, healthcare has adopted data analytics not only for its economic returns but also to improve patients' quality of life. Reduced re-admission rates, predictive algorithms for diagnostics and real-time monitoring of ICU conditions are some of the practical applications of big data in hospitals. This paper summarizes the existing growth of Big Data Analytics in medical institutions. It also examines the emerging role of Predictive Data Analytics (PDA), a few uses of Big Data Analytics in the medical field, the proposed generic architecture, and some security solutions.

Keywords: Encryption, Medical Domain, PDA, Kafka, Storm, NoSQL, Cloud Computing, Big Data Analytics, De-Identification, Business Intelligence, Masking.

1. Introduction

Big Data refers to enormous amounts of data that cannot be processed by traditional techniques. The processing of Big Data begins with raw data that is not grouped and is most often too large to store in the memory of a single computer. Big data can be analyzed for insights that lead to better decisions and strategic business moves. Big Data has been characterized by its three primary properties: Volume, Velocity and Variety [1]. Another important property of Big Data is Veracity. Big Data analysis can be used for actual decision making in the healthcare domain by altering the existing machine learning algorithms. Big data can be examined with the software tools that are usually used in predictive analytics in medicine, data discovery, text mining and statistical analysis. Business Intelligence software and data visualisation tools can be part of the analysis process.

In recent years, the introduction of data analytics to the large volumes of healthcare data collected on a daily basis has opened up many new opportunities and challenges in the field of medical informatics [2]. The recent adoption of Electronic Health Records (EHR) unlocks extra opportunities for data analytics, as we are able to access structured and unstructured data which is systematically gathered for each event in the healthcare system [2]. Medical information as well as experimental judgement plays a major role. This medical information can be examined with Big Data Analytics to predict patients' illnesses and to advise the appropriate medications. Compared with big data analysis in other business areas, the healthcare sector is still in its initial phases, for many reasons. Key challenges include handling the volume, velocity and variety of healthcare data [3].

This paper highlights the different implementations, methods and applications of Big Data, which play a vital role in the medical domain. It also explains the basic architecture which combines batch-based and real-time computing to improve big data computing in the medical domain.

2. Literature Survey

Ritu, Rajesh et al. [4] proposed a robust model for big healthcare data analytics. The purpose of this study is to discuss the recent developments in big data analytics in the medical application field. It describes the evolving role of predictive data analytics, with a focus on patients' quality of care across several scenarios. Further, a complete descriptive novel framework is discussed, with the aim of offering important computing-technology aids for effective patient care and diagnosis. Healthcare data has reached an unprecedented level of growth, and retrieving effective patterns is the prime factor to be considered by healthcare practitioners. Data analytical techniques such as statistical modeling, predictive analytics, artificial intelligence, data mining and machine learning are used in investigations to recover effective and well-organized patterns from structured and unstructured big data. The approach chosen for the detection of hidden patterns from big data is discussed with the
importance among the healthcare databases. The first step consists of discussing the concept of the problem domain and its importance in terms of the 4 V's (volume, velocity, variety and veracity). The second step develops how these objectives can help healthcare organizations with analytical approaches. The third step comprises assigning the tasks to a team for proper execution of the objectives. The fourth step is to deploy a big data platform (Hadoop, BigInsights, etc.) for implementation and assessment of big data. The last step discusses the stored results and their implications for future medical diagnosis. The results are discussed with healthcare practitioners and a scientific committee for validation.

Reddy, Suresh et al. [5] gave an outline of storage and retrieval procedures, the Big Data techniques used in medical clouds, the importance of and need for Big Data Analytics in the medical field and its merits, perspectives on the promising domain of predictive analytics, and the difficulties faced in the medical domain together with their remedies. They carried out experiments on clinical data using an open source web interface with the Hortonworks Data Platform. In these experiments, they examined the hospital's overall information, such as common obstacles, illnesses and clinical knowledge. The authors investigated the illnesses that affect patients and the type of hospital to which a patient should be admitted. They also investigated the types of difficulties encountered by hospitals.

Ali, Pete, Helen et al. [6] proposed a novel study to improve hip fracture care processes in a regional rehabilitation system using Business Intelligence (BI). The paper describes the methodology used and the outcomes attained, utilizing data management to ease the system's transition from an outdated manual process to an automated BI analytic solution. Providing recent, accurate system performance information through standard BI applications (such as data warehouse modelling, integration services, analysis services and reporting services) allowed members of the medical system to concentrate on the main system areas identified as having the greatest opportunities for improvement. Gartner defined Business Intelligence as a wide-ranging set of applications and technologies for collecting, storing, analyzing and distributing data. It allows authenticated users to use the data to help enterprise users make better business decisions. A BI platform contains numerous interdependent components such as External Data Sources, Big Staging Area, Multi-Dimensional Data Warehouse, ETL, Online Analytical Processing (OLAP) cubes, a semantic layer for reporting, a BI portal and data mining.

The paper gives a summary of the hip fracture care flow; a hip fracture is a serious, life-altering event for older adults. The general flow is as follows: the patient with a hip fracture is admitted to the hospital; surgery should take place within 48 hours; the patient needs to stay in the hospital for 7 days, which is the minimum target time; and finally the patient receives therapy at home. The authors described a case study to improve hip fracture care in a regional rehabilitation project using the described BI platform, including its purpose, scope, process and outcomes. The results show tangible improvements in time to surgery, length of hospital stay and access to rehabilitation.

Volker, Marc, Markus et al. [7] addressed the trend towards continuous healthcare, where health is continuously monitored by wearable and stationary devices. They also discussed recent initiatives toward personalized medicine, based on advances in molecular medicine, data management and data analytics.

Van, Lui et al. [8] proposed a novel architecture based on Hadoop, Apache Storm, Kafka and the NoSQL database Cassandra. The combination of high-throughput publish-subscribe messaging, distributed real-time computing and a data storage system can effectively analyze the huge amount of medical data arriving at high speed. The authors also described the key technologies for stream computing. Apache Kafka is a publish-subscribe messaging system that is intended to be fast, scalable and durable. Each Kafka broker is capable of handling hundreds of megabytes of reads and writes per second. Apache Storm is a free, distributed real-time computation system. It is designed to be scalable, dependable, extensible, resilient and efficient, and easy to set up and maintain. It is currently used at Twitter and in some other real-time applications. NoSQL, short for "Not Only SQL", denotes a varied and increasingly familiar group of non-relational data management systems in which the databases are not built primarily on tables and usually do not use SQL for data handling. This is convenient when working with an enormous amount of data whose nature does not require a relational model.

Cassandra is a distributed storage system for handling an enormous amount of structured data spread across numerous commodity servers. It is also capable of providing a highly available service without a single point of failure. The authors proposed a basic architecture that merges the merits of batch-based and real-time computing to improve big data computing in the medical domain. It also provides better treatment conditions and reduces cost. In the proposed architecture, big data analytics in the medical domain is divided into two parallel layers: real-time computing and batch-based computing. Stream computing integrates Kafka to enable real-time computing; on the other hand, in batch
computing, the output data are produced by the Hadoop cluster and stored in the HBase database.

Rajesh, Ritu et al. [9] proposed a novel approach that integrates Big Data and Cloud Computing for healthcare analytics. In this article the authors propose fusing the two technologies, Cloud Computing and Big Data, to bring improvements to healthcare organizations. They discuss different application domains for integrating the two technologies. Healthcare data is growing at an incredible rate, so it is hard to manage and store such big data. However, the difficulty can be addressed by keeping the data in the cloud, where it can be managed and accessed easily over the internet. For this purpose, the SaaS, IaaS, PaaS and HaaS services of cloud computing can be used. In the integrated approach proposed by the authors, doctors and specialists are encouraged to adopt, implement and use these recent and promising technologies without any formal, authorized training. This combination is expected to increase efficiency and effectiveness in patient outcomes. The approach has several advantages. It is very helpful in tracking and managing population health more efficiently and effectively. It also improves the ability to deliver preventive care.

Chrimes, Moa, Zamani and Kuo et al. [10] proposed a platform for interactive healthcare analytics and evaluated its performance in simulation. The authors established a platform called Healthcare Big Data Analytics (HBDA). A proof of concept was carried out, and the test results matched patient data representative of the hospital system. The authors cross-checked the available data, profiles and metadata against existing medical reports. The platform's performance was verified in simulation with different patient queries, with the data loaded into the Hadoop file system and accessed through different applications. The results showed that processing one billion records took 2 hours with Apache Spark. Apache Drill outperformed Spark/Zeppelin and Spark/Jupyter.

Challenges arose during the installation of Spark and Drill. In Hadoop, one of the major issues was the dependencies that are native to the system. Another major issue is that Apache Drill has no option to bind itself to a specific network interface. Apart from these disadvantages, the platform has some advantages. The HBDA platform demonstrated high performance for clinical applications. Another advantage is that SQL statements can be executed with limited resources. The platform's performance time also improved over time. Using CSV files on Hadoop has its own advantages, but it is not cost effective compared with using Spark. Drill provides a low-latency SQL engine.

Rao, Suma, Sunitha et al. [11] proposed security solutions for big data analytics in the medical domain. There are numerous privacy and security issues in big data with respect to the medical domain, which must be taken into account. The challenges arise from potential security and privacy breaches. On the other hand, the value of the data should be retained, and the data should not be readable without proper access. The challenges can be broadly categorized as those arising from legislation, those arising from governing bodies, and technical challenges. Security and privacy for medical data concern not only privacy; the norms also involve repetition, license, independence, authoritarianism, secure communication and confidentiality. Some of the basic security challenges faced by all types of databases are the properties of confidentiality, integrity and availability, which are at the heart of information security, together with encryption and architectural differences. The paper discusses four main security models: the De-Identification model, the Data-centric approach, the Walled Garden approach and the Jujutsu model.

The first model deals with anonymized customer data, which allows scientists to analyze the data more deeply without violating industry or government data privacy policy. As the data are already secured during production, constant encryption and decryption are not required. The main approaches in this model are data encryption, masking and tokenization. The second model protects the information at the information level itself. As a result, the information remains protected throughout its life cycle, from the moment it is captured by the application. In this method, the information can be decrypted, detokenized or unmasked only by authorized users, so the data is useless to attackers. The third model keeps the whole cluster within its own private network and tightly controls logical access through firewalls and access controls. The drawback of this approach is that it cannot prevent credentialed users from misusing the system, nor from viewing and altering the information stored in the cluster. The fourth and final model is named after the traditional martial art technique; it designs and implements an engine, called a vibrant reference engine, that highlights not only the sensitive information in the organization but also the vulnerabilities connected to that information.

Shankar et al. [12] discuss various applications of big data analytics in the medical domain. The aim was to review a few applications in the medical domain and their related outcomes. The first application discussed, called Diagnosis, is integrated software which helps health resorts and treatment centers to use healthcare analytics. Currently, this integrated software is used in the Seattle Children's hospital which delivers
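
The De-Identification model surveyed above names masking and tokenization as two of its main approaches. The following is a minimal sketch of both, not the method of Rao et al. [11]: the record fields, the `SECRET_KEY` value and the function names are hypothetical, and tokenization is approximated here by a keyed hash (HMAC-SHA256), which is only one of several possible ways to produce tokens.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real deployment would load it from a key store.
SECRET_KEY = b"example-key-not-for-production"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a repeatable keyed token (HMAC-SHA256, truncated)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask(value: str, visible: int = 2) -> str:
    """Keep the last `visible` characters and replace the rest with '*'."""
    if visible <= 0:
        return "*" * len(value)
    return "*" * max(len(value) - visible, 0) + value[-visible:]


def de_identify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers tokenized or masked."""
    out = dict(record)  # leave the caller's record untouched
    out["patient_id"] = tokenize(record["patient_id"])
    out["name"] = mask(record["name"], visible=0)
    out["phone"] = mask(record["phone"], visible=2)
    return out


record = {
    "patient_id": "P-1001",
    "name": "Jane Doe",
    "phone": "5551234567",
    "diagnosis": "hip fracture",
}
print(de_identify(record))
```

Because the same identifier always maps to the same token, records belonging to one patient can still be linked for analysis without exposing the identifier. A keyed hash cannot be reversed, so the detokenization by authorized users mentioned for the data-centric model would need a different design, for example a protected lookup table from tokens to original values.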


References

