Natural language processing (NLP) is a branch of artificial intelligence (AI) that enables
computers to comprehend, generate, and manipulate human language. NLP makes it possible to interrogate data using natural language text or voice; this is also called "language in." Most consumers have probably interacted with NLP without
realizing it. For instance, NLP is the core technology behind virtual assistants, such as the
Oracle Digital Assistant (ODA), Siri, Cortana, or Alexa. When we ask questions of these
virtual assistants, NLP is what enables them to not only understand the user’s request, but to
also respond in natural language. NLP applies both to written text and speech, and can be
applied to all human languages. Other examples of tools powered by NLP include web
search, email spam filtering, automatic translation of text or speech, document
summarization, sentiment analysis, and grammar/spell checking. For example, some email
programs can automatically suggest an appropriate reply to a message based on its
content—these programs use NLP to read, analyze, and respond to your message.
There are several other terms that are roughly synonymous with NLP. Natural language
understanding (NLU) and natural language generation (NLG) refer to using computers to
understand and produce human language, respectively. NLG can provide a verbal description of what has happened; this is also called "language out," summarizing meaningful information into text using a concept known as the "grammar of graphics." In practice, NLU is often used to mean NLP: the understanding by computers of the structure and meaning of human language, which allows developers and users to interact with computers using natural sentences. Computational linguistics (CL) is the scientific
field that studies computational aspects of human language, while NLP is the engineering
discipline concerned with building computational artifacts that understand, generate, or
manipulate human language.
Research on NLP began shortly after the invention of digital computers in the 1950s, and
NLP draws on both linguistics and AI. However, the major breakthroughs of the past few
years have been powered by machine learning, which is a branch of AI that develops systems
that learn and generalize from data. Deep learning is a kind of machine learning that can learn
very complex patterns from large datasets, which means that it is ideally suited to learning the
complexities of natural language from datasets sourced from the web.
Healthcare: As healthcare systems all over the world move to electronic medical records, they are
encountering large amounts of unstructured data. NLP can be used to analyze and gain new
insights into health records.
Legal: To prepare for a case, lawyers must often spend hours examining large collections of
documents and searching for material relevant to a specific case. NLP technology can automate
the process of legal discovery, cutting down on both time and human error by sifting through large
volumes of documents.
Finance: The financial world moves extremely fast, and any competitive advantage is important.
In the financial field, traders use NLP technology to automatically mine information from
corporate documents and news releases to extract information relevant to their portfolios and
trading decisions.
Customer service: Many large companies are using virtual assistants or chatbots to help answer
basic customer inquiries and information requests (such as FAQs), passing on complex questions
to humans when necessary.
Insurance: Large insurance companies are using NLP to sift through documents and reports
related to claims, in an effort to streamline the way business gets done.
Another kind of model is used to recognize and classify entities in documents. For each word
in a document, the model predicts whether that word is part of an entity mention, and if so,
what kind of entity is involved. For example, in “XYZ Corp shares traded for $28 yesterday”,
“XYZ Corp” is a company entity, “$28” is a currency amount, and “yesterday” is a date. The
training data for entity recognition is a collection of texts, where each word is labeled with
the kinds of entities the word refers to. This kind of model, which produces a label for each
word in the input, is called a sequence labeling model.
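As a concrete illustration, here is a minimal sketch of entity recognition using the spaCy library (described later in this document). It assumes spaCy and its small English model en_core_web_sm are installed; the labels shown in the comments are the ones that model typically assigns.

import spacy

# Load a pretrained English pipeline (assumes it was installed with
# `python -m spacy download en_core_web_sm`).
nlp = spacy.load("en_core_web_sm")
doc = nlp("XYZ Corp shares traded for $28 yesterday")

# The entity recognizer groups tokens into labeled mentions,
# e.g. ORG (company), MONEY (currency amount), DATE.
for ent in doc.ents:
    print(ent.text, ent.label_)

# Underneath, this is a sequence labeling model: every token gets an
# IOB tag marking whether it begins, continues, or lies outside a mention.
for token in doc:
    print(token.text, token.ent_iob_, token.ent_type_)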
Sequence-to-sequence models are a very recent addition to the family of models used in NLP. A sequence-to-sequence (or seq2seq) model takes an entire sentence or document as
input (as in a document classifier) but it produces a sentence or some other sequence (for
example, a computer program) as output. (A document classifier only produces a single
symbol as output). Example applications of seq2seq models include machine translation,
which, for example, takes an English sentence as input and returns its French translation as
output; document summarization (where the output is a summary of the input); and semantic
parsing (where the input is a query or request in English, and the output is a computer
program implementing that request).
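As an illustrative sketch, the lines below run a pretrained English-to-French translation model through the Hugging Face transformers library (described later). The choice of task string and the fact that the library downloads a default model on first use are assumptions about the environment.

# A minimal seq2seq example: English-to-French machine translation
# with a pretrained model from the transformers library.
from transformers import pipeline

translator = pipeline("translation_en_to_fr")  # the library picks a default T5 model
result = translator("NLP applies both to written text and speech.")
print(result[0]["translation_text"])  # the French translation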
Deep learning, pretrained models, and transfer learning: Deep learning is the most
widely used kind of machine learning in NLP. In the 1980s, researchers developed neural
networks, in which a large number of primitive machine learning models are combined into a
single network: by analogy with brains, the simple machine learning models are sometimes
called “neurons.” These neurons are arranged in layers, and a deep neural network is one with
many layers. Deep learning is machine learning using deep neural network models.
Because of their complexity, generally it takes a lot of data to train a deep neural network,
and processing it takes a lot of compute power and time. Modern deep neural network NLP
models are trained on a diverse array of sources, such as all of Wikipedia and data scraped
from the web. The training data might be on the order of 10 GB or more in size, and it might
take a week or more on a high-performance cluster to train the deep neural network.
(Researchers find that training even deeper models on even larger datasets yields even higher performance, so there is currently a race to train bigger and bigger models on larger and larger datasets.)
The voracious data and compute requirements of deep neural networks would seem to
severely limit their usefulness. However, transfer learning enables a trained deep neural
network to be further trained to achieve a new task with much less training data and compute
effort. The simplest kind of transfer learning is called fine-tuning: it consists of first
training the model on a large generic dataset (for example, Wikipedia) and then further
training (“fine-tuning”) the model on a much smaller task-specific dataset that is labeled with
the actual target task. Perhaps surprisingly, the fine-tuning datasets can be extremely small,
maybe containing only hundreds or even tens of training examples, and fine-tuning requires only minutes on a single CPU. Transfer learning makes it easy to deploy deep
learning models throughout the enterprise.
There is now an entire ecosystem of providers delivering pretrained deep learning models
that are trained on different combinations of languages, datasets, and pretraining tasks. These
pretrained models can be downloaded and fine-tuned for a wide variety of different target
tasks.
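To make the fine-tuning idea concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries. The model name (distilbert-base-uncased), the three-example toy dataset, and the hyperparameters are illustrative assumptions, not a recommended recipe.

# A minimal fine-tuning sketch: a generically pretrained model is
# further trained on a tiny task-specific labeled dataset.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# Toy task-specific dataset (labels: 1 = positive, 0 = negative).
data = Dataset.from_dict({
    "text": ["great product", "terrible service", "works as expected"],
    "label": [1, 0, 1],
})

model_name = "distilbert-base-uncased"  # pretrained on large generic text corpora
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=32)

data = data.map(tokenize, batched=True)

args = TrainingArguments(output_dir="finetuned", num_train_epochs=3,
                         per_device_train_batch_size=2)
Trainer(model=model, args=args, train_dataset=data).train()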
Stop word removal: A “stop word” is a token that is ignored in later processing; stop words are typically short, frequent words such as “a,” “the,” or “an.” Bag-of-words models and search
engines often ignore stop words in order to reduce processing time and storage within the
database. Deep neural networks typically do take word order into account (that is, they are
not bag-of-words models) and do not do stop word removal because stop words can convey
subtle distinctions in meaning (for example, “the package was lost” and “a package is lost”
don’t mean the same thing, even though they are the same after stop word removal).
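A minimal sketch of stop word removal, using a small hand-picked stop word list, shows how the two example sentences above collapse to the same tokens once stop words are dropped:

# Hand-picked stop word list for illustration only; real systems use
# larger curated lists.
STOP_WORDS = {"a", "an", "the", "is", "was"}

def remove_stop_words(text):
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

print(remove_stop_words("The package was lost"))  # ['package', 'lost']
print(remove_stop_words("A package is lost"))     # ['package', 'lost']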
Part-of-speech tagging and syntactic parsing: Part-of-speech (PoS) tagging is the process
of labeling each word with its part of speech (for example, noun, verb, adjective, etc.). A
syntactic parser identifies how words combine to form phrases, clauses, and entire sentences. PoS tagging is a sequence labeling task, syntactic parsing is an extended kind of sequence labeling task, and deep neural networks are the state-of-the-art technology for both PoS
tagging and syntactic parsing. Before deep learning, PoS tagging and syntactic parsing were
essential steps in sentence understanding. However, modern deep learning NLP models
generally only benefit marginally (if at all) from PoS or syntax information, so neither PoS
tagging nor syntactic parsing are widely used in deep learning NLP.
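For reference, here is a minimal sketch of PoS tagging and dependency parsing with the spaCy library, assuming the en_core_web_sm model is installed.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The trader sold the shares yesterday")

# Each token receives a part-of-speech tag and a dependency relation
# linking it to its syntactic head.
for token in doc:
    print(token.text, token.pos_, token.dep_, token.head.text)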
NLP Programming Languages
Python:
NLP libraries and toolkits are generally available in Python, and for this reason by far
the majority of NLP projects are developed in Python. Python’s interactive development
environment makes it easy to develop and test new code.
Java and C++:
For processing large amounts of data, C++ and Java are often preferred because they can
support more efficient code.
TensorFlow and PyTorch: These are the two most popular deep learning toolkits. They are
freely available for research and commercial purposes. While they support multiple
languages, their primary language is Python. They come with large libraries of prebuilt
components, so even very sophisticated deep learning NLP models often only require
plugging these components together. They also support high-performance computing
infrastructure, such as clusters of machines with graphics processing unit (GPU) accelerators.
They have excellent documentation and tutorials.
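As a small illustration of plugging prebuilt components together, the sketch below composes standard PyTorch layers into a toy text classifier; the vocabulary size, dimensions, and layer choices are arbitrary assumptions made for brevity.

import torch
import torch.nn as nn

class TextClassifier(nn.Module):
    # Embedding, LSTM, and linear layers are all prebuilt PyTorch components.
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)       # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)   # final hidden state
        return self.out(hidden[-1])            # (batch, num_classes)

model = TextClassifier()
logits = model(torch.randint(0, 10000, (4, 20)))  # a batch of 4 fake sentences
print(logits.shape)                               # torch.Size([4, 2])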
AllenNLP: This is a library of high-level NLP components (for example, simple chatbots)
implemented in PyTorch and Python. The documentation is excellent.
HuggingFace: This company distributes hundreds of different pretrained deep learning
NLP models, as well as a plug-and-play software toolkit in TensorFlow and PyTorch that
enables developers to rapidly evaluate how well different pretrained models perform on their
specific tasks.
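As an illustration of that plug-and-play usage, the snippet below loads one pretrained sentiment model by name and scores a sentence; comparing models is largely a matter of swapping the model argument. The specific model name is an assumption about what is available on the hub.

from transformers import pipeline

# Swap the `model` argument to evaluate a different pretrained model
# on the same task-specific examples.
classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
print(classifier("The claim was processed quickly and without errors."))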
Spark NLP: Spark NLP is an open source text processing library for advanced NLP for the
Python, Java, and Scala programming languages. Its goal is to provide an application
programming interface (API) for natural language processing pipelines. It offers pretrained
neural network models, pipelines, and embeddings, as well as support for training custom
models.
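A minimal Spark NLP sketch, assuming pyspark and spark-nlp are installed; "explain_document_dl" is one of the pretrained pipelines the library distributes.

import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()  # starts a local Spark session with Spark NLP loaded
pipeline = PretrainedPipeline("explain_document_dl", lang="en")

# The pretrained pipeline tokenizes, tags, and runs entity recognition.
result = pipeline.annotate("Spark NLP annotates text at scale.")
print(result["entities"], result["pos"])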
SpaCy NLP: SpaCy is a free, open source library for advanced NLP in Python, and it is
specifically designed to help build applications that can process and understand large
volumes of text. SpaCy is known to be highly intuitive and can handle many of the tasks
needed in common NLP projects.
In summary, natural language processing is an exciting area of artificial intelligence
development that fuels a wide range of new products such as search engines, chatbots,
recommendation systems, and speech-to-text systems. As human interfaces with computers
continue to move away from buttons, forms, and domain-specific languages, the demand for natural language processing will continue to grow. For this reason, Oracle
Cloud Infrastructure is committed to providing on-premises performance with our
performance-optimized compute shapes and tools for NLP. Oracle Cloud Infrastructure
offers an array of GPU shapes that you can deploy in minutes to begin experimenting with
NLP.