Named Entity Recognition Using Ensemble
The upgrade from Industry 4.0 to 5.0 provides numerous research opportunities for industrialists and
researchers. This industrial revolution has brought automation in the life science domain to its peak. In this
digitalized world, big data plays a key role in providing valuable insights through various analytical
methods. In life science, the huge volume of available textual data contains a wide spread of valuable
information. Natural language processing (NLP) plays a major role in extracting this hidden information
from big data. Within NLP, named entity recognition (NER) is one of the key factors and one of the biggest
challenges for the research community. This paper presents the high-level architecture of NER using an
ensemble learning (EL) method. The EL model contains a dictionary-based entity identifier and a
self-learning classifier. The proposed model performed well and produced high accuracy.
Keywords: Natural Language Processing, Named Entity Recognition, Conditional Random Field,
Lexicon Based Approach, Ensemble Learning
1. Introduction
Life science is one of the prominent and growing domains in the business industry. The revolution of Artificial Intelligence (AI) in the pharmacy industry has provided many research and job opportunities. Plenty of applications used in the life science industry have migrated to automation. Most of the data are in textual format, which creates the biggest challenge for researchers and industrialists. For working with textual data, NLP is one of the key techniques for extracting valuable insights.
Industries such as healthcare, education, finance, and social media contain abundant information that is difficult to handle. NLP is significant for handling those sources. This paper stresses the role of NLP strategies in the biomedical field. According to Statista, it was projected in 2019 that the healthcare industry would hold 2,134 exabytes of data in 2020. The data deluge in the healthcare industry, commonly generated by electronic healthcare records, is stored in regional languages. The stored data are in an organized structure, which makes it more difficult to retrieve the hidden information from the huge amount of text data. The digitalized information of clinical records is frequently stored in formal language. NLP is helpful for researchers and industrialists in communicating events and clinical concepts; surprisingly, it makes the information hard to search because of the lack of technologies and tools. To overcome these difficulties, the data must be properly processed with NLP techniques. Named Entity Recognition (NER) is a key NLP task for extracting the entities of interest (e.g., disease names, medication names, and lab tests) from clinical
[Figure: architecture diagram. Labels recoverable from the figure: UMLS Dictionary; Web Resources; Lex-NER; NLP Component (Tokenization, Stemming, Lemmatization, POS Tagging); Abbreviation Extractor; Entity Tagger; Generating Training Dataset; ML Model; Data Preparation; CRF Model.]
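The abstract describes a dictionary-based entity identifier paired with a self-learning classifier, and the figure labels mention an Entity Tagger, a step that generates a training dataset, and a CRF Model. The following is a minimal sketch of how such a hand-off could look, not the authors' implementation: a toy lexicon tags tokens with BIO labels, and simple per-token features are prepared in the dictionary form that CRF toolkits commonly accept. The lexicon entries, the label names (DISEASE, DRUG, LAB_TEST), the feature set, and all function names are assumptions made for illustration; the paper's system draws on the UMLS dictionary and web resources instead.

```python
import re

# Hypothetical toy lexicon (assumption); keys are lower-cased token tuples.
LEXICON = {
    ("diabetes", "mellitus"): "DISEASE",
    ("metformin",): "DRUG",
    ("blood", "glucose"): "LAB_TEST",
}
MAX_LEN = max(len(key) for key in LEXICON)


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def lexicon_tag(tokens):
    """Greedy longest-match dictionary lookup that returns BIO labels."""
    labels = ["O"] * len(tokens)
    lowered = [tok.lower() for tok in tokens]
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            entity = LEXICON.get(tuple(lowered[i:i + n]))
            if entity:
                labels[i] = "B-" + entity
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + entity
                i += n
                break
        else:
            # No lexicon entry starts at this token.
            i += 1
    return labels


def token_features(tokens, i):
    """Illustrative per-token features (assumed, not taken from the paper)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "is_digit": tok.isdigit(),
        "suffix3": tok[-3:].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def make_training_example(text):
    """Return (tokens, features, labels) for one sentence."""
    tokens = tokenize(text)
    labels = lexicon_tag(tokens)
    features = [token_features(tokens, i) for i in range(len(tokens))]
    return tokens, features, labels


if __name__ == "__main__":
    sentence = "Patient with diabetes mellitus takes Metformin; blood glucose is high."
    for token, label in zip(*make_training_example(sentence)[::2]):
        print(token, label)
```

The feature dictionaries and BIO label sequences produced this way could serve as silver-standard training data for a sequence classifier such as a CRF, which is one plausible reading of how the lexicon-based component and the learned component are combined.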