Application of Data Science and Bioinformatics in Healthcare Technologies
Application of Data Science and Bioinformatics in Healthcare Technologies
net/publication/362072647
CITATIONS READS
0 507
2 authors, including:
SEE PROFILE
All content following this page was uploaded by Sreenivasa Rao Veeranki on 05 September 2022.
Manish Varshney
Department of Computer Science and Engineering, School of Engg. & Tech.,
Maharishi university of Information Technology, Lucknow, India
Email: [email protected]
Introduction
The healthcare process has gone through an important change, involved by the
triple aim of enhancing planning, low costs including better effects. Healthcare
analytics can be executed at various levels, also observing as well as ignoring
healthcare mistakes, evidence combination, and predictive survey, along with
customized modelling. New main domains of investigation may be classified
according to the organization, administration, and evaluation of health data
processes, the depiction of healthcare facts, together with the inspection as well
as clarification of basic signs and features [2]. It's reasonable to expect a great
deal of change for medical informatics research in the future, given the fluidity of
many driving factors behind advancement in information processing technologies
and their technology as well as advances in medicine and health care.
Literature review
During the current decades, scientific statisticians have proposed innovative data
summary methods impressed by the speedy increase of computing capability
including the progression in storage capacities. Instances of these are machine
learning plus deep learning networks. Multiple computational techniques endure
at the nexus of mathematical, analytical as well as computational limitations.
Statistical processes often operate strategies that gather imminent capability from
distinct, together with huge databases of information [4]. Developing complicated
computational methods can give effective prognostication types. However, it is
unclear how broadly these methods are used in different medicinal domains.
which is equipped to work the Taylor-based bird swarm algorithm. The effect of
the report reveals that a method is a hopeful strategy.
Healthcare informatics centres on the learning technology that allows the efficient
gathering of data handling technology devices to increase healthcare education
and also to promote the distribution of patient preventive care. The goal of
healthcare informatics is to secure passage to severe inmate healthcare erudition
at the exact moment moreover place it is wanted to do remedial arrangements [7].
Healthcare informatics also centres on the administration of pharmaceutical data
for investigation and training. Three papers in this Special Issue have immediate
applicability for clinical judgment planning.
Research methodology
Secondary qualitative research has been used to discover the Application of Data
Science and Bioinformatics in Healthcare Technologies. Secondary research is
depending only on information that has previously been gathered before the
research process even begins. Data collection, compilation, and analysis are all
steps in this research project. It's called desk research because it requires
synthesizing pre-existing information available on the internet, in peer-reviewed
journals, textbooks and government archives and libraries. One of the duties of a
5399
secondary researcher is to seek patterns in previous studies and then apply what
they've learned to the present study's conditions. Some researchers combine the
main and secondary methods to their work since secondary research often relies
on primary research results. For example, the researcher analyses and discovers
current research gaps before doing primary research to gather new data for his or
her topic. The researcher has chosen three previous types of research on the
Application of Data Science and Bioinformatics in Healthcare Technologies to
gather the necessary data.
Result
Health informatics combines data and information from many medical fields,
such as pre-clinical, clinical, and post-clinical research, as well as health care
administration. According to Callahan et al. (2020), Technological progress has
given researchers and scientists more freedom to work on improving public
health, healthcare, and biomedicine [5]. Healthcare informatics' primary goal is to
offer healthcare providers comprehensive health information on their patients.
This information simplifies their job of making an appropriate treatment choice at
the appropriate moment. Furthermore, a patient in a remote location may obtain
an opinion from the finest available health care experts by utilizing healthcare
informatics.
Here we have divided the whole dataset after the pre-processing of the dataset.
The divided parts are the testing part and training part i.e. 20% and 80% of the
entire dataset respectively. This distribution of data parts are plotted to observe
the dataset clearly before doing classification work. It is possible to train a
machine learning model by using a set of samples (such as images or videos, texts
or audio files, etc.) that have been labelled with appropriate and thorough labels
(classes or tags). One of the advantages of machine learning is that it may
outperform a person in terms of accuracy. This may seem paradoxical at first, but
it is necessary due to the fundamental disparities in the ways in which humans
and robots perceive and interpret data.
When it comes to training ML models, there's another term you should be familiar
with: testing data sets. Machine learning requires the use of both training and
test data sets. Training data is essential for teaching an ML algorithm, while
testing data, as the name indicates, allows you to monitor and improve the
system's performance. Keep in mind that some of the data you gather for training
5401
your algorithm will be utilized to monitor how well the training is progressing. To
put it another way, your data will be divided into two sections: a training portion
and a testing portion
As a result, if the model runs over the same dataset numerous times, it will be
repeatedly exposed to the same patterns, which necessary for an algorithm to get
adequate is training. In order to prevent this, you'll need a new set of data to train
your algorithm to look for other types of relationships. Although your testing data
set is needed for other reasons, you don't want to use it during training.
Analysis, storage, and optimization of enormous biological data are all part of
bioinformatics and analytics. According to Attwood et al. (2019), Electronic health
records (EHRs) have emerged as a result of the rise in computing competence,
allowing for the creation of a comprehensive data warehouse. It's useful for
figuring out how genetics and phenotype are connected. Large biological datasets
are studied and analyzed using a variety of computational techniques to better
understand the illness, and these predictions may be used to link health care
data [6]. New methods and algorithms for analyzing genomic and proteomic data,
utilized in a variety of disciplines such as drug development and medicine, have
been devised by researchers in this field in the last year or so.
Artificial intelligence (AI) is a cutting-edge technology that aims to take the place
of human intellect on a partial or complete basis. Molecular computation and
bioinformatics have drawn attention to it in recent studies. When compared to
other computer methods, AI algorithms provide superior DNA sequencing
performance and accuracy. Using artificial intelligence to categorize microarray
cancer data is critical, according to Shi and colleagues. According to Alloghani et
al. (2020), a thorough study of machine learning methods used in bioinformatics
for dimensionality reduction of complicated data and feature selection for the
assignment of biomarkers in raw data was published [7]. Using machine learning
to analyze and classify single nucleotide polymorphisms. AI in bioinformatics has
a variety of uses, including drug repositioning and classifying patients based on
neuroimaging data.
Discussion
When it comes to the desire for health care, it differs from the desire for most
consumer goods and services. The desire for health care is not derived directly
from the use of medicinal methodology; rather, it stems from the immediate
estimation of improved health that is delivered by healthcare. People ask for
health insurance in advance of starting a new job and then pay up when they're
healthy [8]. Longevity and personal happiness are two measures of health that
may be compared. Directly and indirectly: directly because one's health
dimension affects the pleasure in goods and relaxation and implicitly because
one's health dimension upgrades efficiency, a person receives an incentive from
personal happiness. So every new idea targeted at making healthcare better has a
good chance of benefiting from excellent open methods.
tools. Biological and genetic data are evaluated and summarized using
bioinformatics, a subject that combines many disciplines including software
engineering, computer science, statistics, informatics, and engineering to do so.
Bioinformatics is the study of molecular data to quantify clinical, imaging, and
diagnostic information for the development of personalized medicine and health
care services. By using bioinformatics, researchers may discover candidate genes
that can help them understand disease genetics, unique adaptations, and
population variances. This also includes the study of quality articulation,
biological frameworks and the knowledge of life's transformational history [9].
DNA, RNA, 3D protein structures, and biomolecular interactions may all be
reenacted and shown using it in fundamental biology. Gene therapy, genetic
engineering, gene editing, and drug discovery are all advancing at a fast pace
thanks to these new technologies. Grouping studies may provide insight into a bio
molecule's structure, functions, and different properties. Sequence recovery of
related compounds from available databases will have served as a springboard for
this approach.
There are a variety of techniques available depending on the need, and the results
such as capacity, structure, or homologues are highlighted with amazing
accuracy. Proteins, nucleotides, DNA, and RNA sequences may be compared
using the Basic Local Alignment Search Tool (BLAST). Free software called
"HMMER" is also available for identifying and analyzing homologous protein
sequences in various databases. For multiple sequence alignment, "Omega" offers
a variety of operating systems accessible as one of the tools. When compared to
other algorithms like matrix-based and consistency-based, it will give more
accurate results. The tool "Sequerome" for profiling organizations was created by
the Bioinformatics and Computational Biosciences Unit (BCBU). Protein
physicochemical characteristics may be calculated using the computer program
"ProtParam." Sequence alignment is used to identify gene models utilizing gene
production with many sources of evidence (JIGSAW).
Protein molecules begin as unstructured amino acid strings in the early stages. In
the end, it develops biological dynamics and three-dimensional (3D) structures.
The biological functionality is predicated on protein collapse into a pre-imperative
correct topology. To summarize, protein 3D structure is critical for feature
extraction and may be seen via X-ray crystallography or nuclear magnetic
resonance (NMR) [10]. Forecasting structures and simulation algorithm validity
5403
Conclusion
References
[1] Banerjee, Amit, et al. "Emerging trends in IoT and big data analytics for
biomedical and health care technologies." Handbook of data science
approaches for biomedical engineering. Academic Press, 2020. 121-152.
[2] McPadden, Jacob, et al. "Health care and precision medicine research:
analysis of a scalable data science platform." Journal of medical Internet
research 21.4 (2019): e13043.
[3] Latif, Siddique, et al. "Leveraging data science to combat covid-19: A
comprehensive review." IEEE Transactions on Artificial Intelligence 1.1 (2020):
85-103.
[4] Ehwerhemuepha, Louis, et al. "Health DataLab–a cloud computing solution
for data science and advanced analytics in healthcare with applications to
predicting multi-centre pediatric readmissions." BMC medical informatics and
decision making 20.1 (2020): 1-12.
[5] Callahan, Tiffany J., et al. "Knowledge-based biomedical data science."
Annual review of biomedical data science 3 (2020): 23-41.
[6] Attwood, Teresa K., et al. "A global perspective on evolving bioinformatics and
data science training needs." Briefings in Bioinformatics 20.2 (2019): 398-404.
[7] Alloghani, Mohamed, et al. "A systematic review on supervised and
unsupervised machine learning algorithms for data science." Supervised and
unsupervised learning for data science (2020): 3-21.
[8] Baldi, P. (2018). Deep learning in biomedical data science. Annual review of
biomedical data science, 1, 181-205.
5404
[9] Kumar, Vivek, et al. "Prediction of malignant and benign breast cancer: A
data mining approach in healthcare applications." Advances in data science
and management. Springer, Singapore, 2020. 435-442.
[10] Arellano, April Moreno, et al. "Privacy policy and technology in biomedical
data science." Annual review of biomedical data science 1 (2018): 115-129.
[11] Rinartha, K., & Suryasa, W. (2017). Comparative study for better result on
query suggestion of article searching with MySQL pattern matching and
Jaccard similarity. In 2017 5th International Conference on Cyber and IT
Service Management (CITSM) (pp. 1-4). IEEE.
[12] Rinartha, K., Suryasa, W., & Kartika, L. G. S. (2018). Comparative Analysis of
String Similarity on Dynamic Query Suggestions. In 2018 Electrical Power,
Electronics, Communications, Controls and Informatics Seminar (EECCIS) (pp.
399-404). IEEE.
[13] Rusmini, R., & Hastuti, P. (2021). Local awareness based midwifery care in
basic level service in the digital era . International Journal of Health & Medical
Sciences, 4(1), 69-73. https://doi.org/10.31295/ijhms.v4n1.1150