AI4youngster - 6 - Topic NLP

The document provides an overview of natural language processing (NLP) including its achievements, methods, and trends. It discusses how NLP is used in industry applications like virtual assistants and machine translation. It describes the levels of linguistic knowledge involved in NLP like morphology, syntax, and semantics. Popular NLP methods are explained such as word embeddings, recurrent neural networks, encoder-decoder frameworks, attention mechanisms, and transformer models. Emerging trends in NLP involve pre-trained language models, transfer learning, and combining supervised and unsupervised learning techniques.


TABLE OF CONTENTS

1. Some achievements of NLP

2. Overview of NLP

1. Linguistic levels of description

2. Why is NLP difficult?

3. Methods in NLP

4. Trends in NLP

5. Conclusion
1. Some achievements of NLP
NLP in Industry
Communication With Machines
Virtual Assistant

● Conversational agents contain:


○ Speech recognition
○ Language analysis
○ Dialogue processing
○ Information retrieval
○ Text to speech
● Google Now, Alexa, Siri, Cortana, VAV…
Google Translate & Vietgle Translate
Named Entity Recognition

Example of NLP task: Named Entity Recognition (NER):
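Since the example figure is not reproduced here, below is a minimal NER sketch using the spaCy library; the model name en_core_web_sm and the example sentence are illustrative assumptions, not from the slides:

```python
# Minimal NER sketch with spaCy (assumes: pip install spacy
# and python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in Ha Noi in 2021.")

for ent in doc.ents:
    # ent.text is the entity span, ent.label_ its type (ORG, GPE, DATE, ...)
    print(ent.text, ent.label_)
```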


2. Overview of NLP
What is Natural Language Processing?

● Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages.
● Natural-language-generation systems convert information from computer databases
into normal-sounding human language. Natural-language-understanding systems
convert samples of human language into more formal representations that are easier
for computer programs to manipulate.
What is Natural Language Processing?

Computers use natural language as input and/or output:

● Natural language understanding (NLU): natural language → computer-internal representation
● Natural language generation (NLG): computer-internal representation → natural language
Natural language processing and computational linguistics

● Natural language processing (NLP) develops methods for solving practical problems
involving language:
○ Automatic speech recognition
○ Machine Translation
○ Sentiment Analysis
○ Information extraction from documents
● Computational linguistics (CL) focuses on using computational methods to study scientific questions about language:
○ How do we understand language?
○ How do we produce language?
○ How do we learn language?
Levels of Linguistic Knowledge
Morphology

● Morphology studies the structure of words


● Morphological derivation exhibits hierarchical structure
Example: re + vital + ize + ation (revitalization)
● The suffix usually determines the syntactic
category of the derived word
Syntax

● Syntax studies the ways words combine to form phrases and sentences

● Syntactic parsing helps identify who did what to whom, a key step in understanding a
sentence
Semantics and pragmatics

● Semantics studies the meaning of words, phrases and sentences


Ex: I have dinner in/for an hour (the choice of preposition changes the meaning)
● Pragmatics studies how we use language to do things in the world
Ex: Quy Nhơn thật là đẹp (Vietnamese: "Quy Nhơn is really beautiful")
Natural Language Processing

Applications:

● Machine Translation
● Information Retrieval
● Question Answering
● Dialogue Systems
● Information Extraction
● Summarization
● Sentiment Analysis

Core Technologies (NLP sub-problems):

● Language modeling
● Part-of-speech tagging
● Syntactic parsing
● Named-entity recognition
● Word sense disambiguation
● Semantic role labeling
● …

NLP lies at the intersection of computational linguistics and machine learning.


Why is NLP difficult?

● Ambiguity
● Sparsity
● Abstractly, most NLP applications can be viewed as prediction problems
⇒ Should be able to solve them with Machine Learning
● The label set is often the set of all possible sentences
○ Infinite (or at least astronomically large)
● Training data for supervised learning is often not available
⇒ Unsupervised/semi-supervised techniques for training from available data
● Algorithmic challenges
○ Vocabulary can be large (e.g., 50K words)
○ Data sets are often large (GB or TB)
Ambiguity

● “At last, a computer that understands you like your mother”

This slogan has (at least) three readings:

● It understands you as well as your mother understands you
● It understands (that) you like your mother
● It understands you as well as it understands your mother
Sparsity

Order words by frequency. What is the frequency of the n-th ranked word? (Zipf's law: frequency falls roughly as 1/rank.)


Sparsity

● Regardless of how large our corpus is, there will be a lot of infrequent words

● This means we need to find clever ways to estimate probabilities for things we have rarely or never seen
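To see this long tail concretely, the sketch below counts word frequencies in a plain-text corpus and reports how much of the vocabulary is rare; the file name corpus.txt is a placeholder assumption:

```python
# Sketch: word-frequency ranks illustrate sparsity (Zipf's law).
# "corpus.txt" is a placeholder for any large plain-text file.
from collections import Counter

with open("corpus.txt", encoding="utf-8") as f:
    words = f.read().lower().split()

counts = Counter(words)

# Head of the distribution: a few very frequent words.
for rank, (word, freq) in enumerate(counts.most_common(10), start=1):
    print(rank, word, freq)

# Tail of the distribution: most word types occur only once or twice.
hapaxes = sum(1 for c in counts.values() if c == 1)
print(f"{hapaxes} of {len(counts)} word types occur exactly once")
```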
Feature Engineering and Deep Learning

● Up until 2014 most state-of-the-art NLP systems were based on feature engineering +
shallow machine learning models (e.g., SVMs, CRFs)
● Designing the features of a winning NLP system requires a lot of domain-specific
knowledge.
● Deep Learning systems on the other hand rely on neural networks to automatically
learn good representations.
Feature Engineering and Deep Learning

● Deep Learning yields state-of-the-art results in most NLP tasks.


● Large amounts of training data and faster multicore GPU machines are key in the
success of deep learning.
● Neural networks and word embeddings play a key role in modern NLP models.
Deep Learning and Linguistic Concepts

● If deep learning models can learn representations automatically, are linguistic concepts (e.g., syntax, morphology) still useful?

● Some proponents of deep learning argue that such manually designed linguistic properties are not needed, and that the neural network will learn these intermediate representations (or equivalent, or better ones) on its own [Goldberg, 2016]
● Goldberg believes many of these linguistic concepts can indeed be inferred by the
network on its own if given enough data.
● However, for many other cases we do not have enough training data available for the
task we care about, and in these cases providing the network with the more explicit
general concepts can be very valuable.
History
3. Methods in NLP
NLP representation and architecture
Word2vec
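As a concrete sketch of how word embeddings are trained in practice, here is a minimal word2vec example using the gensim library; the toy corpus and hyperparameters are illustrative assumptions, not from the slides:

```python
# Minimal word2vec sketch with gensim (pip install gensim).
# The tiny corpus below is purely illustrative.
from gensim.models import Word2Vec

sentences = [
    ["nlp", "studies", "natural", "language"],
    ["machine", "translation", "is", "an", "nlp", "task"],
    ["word", "embeddings", "represent", "words", "as", "vectors"],
]

# vector_size: embedding dimension; window: context size;
# min_count=1 keeps every word (only sensible for a toy corpus).
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

print(model.wv["nlp"][:5])           # first 5 dimensions of the "nlp" vector
print(model.wv.most_similar("nlp"))  # nearest neighbours in embedding space
```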
Recurrent Neural Networks

● Neural networks that exploit the temporal nature of language!


● Also allow variable-length inputs
Sequence processing is particularly useful for some tasks!

● To capture this phenomenon computationally, we can use recurrent neural networks to perform sequence processing (see the sketch after this list):
● Syntactic parsing
● Part of speech tagging
● Language modeling
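A minimal sketch of the recurrent interface in PyTorch (sizes are illustrative assumptions): the network reads the sequence one step at a time and carries a hidden state forward, so inputs of any length are handled naturally.

```python
# Sketch: an RNN consumes a sequence step by step, carrying a hidden state.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)  # batch of 1 sequence, 5 time steps, 8-dim inputs
outputs, h_n = rnn(x)     # outputs: hidden state at every step, (1, 5, 16)
                          # h_n: final hidden state, (1, 1, 16)
print(outputs.shape, h_n.shape)
```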
Issue with RNN

● RNNs struggle to carry context over long distances.


● There’s also the issue of “vanishing gradients”
○ When small derivatives are repeatedly multiplied together, the products can become extremely small
Long Short-Term Memory Networks (LSTMs)

● Remove information no longer needed from the context, and add information likely to
be needed later
● Do this by:
○ Adding an explicit context layer to the architecture
○ This layer controls the flow of information into and out of network layers using specialized neural units called gates
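A minimal sketch of the LSTM interface in PyTorch (sizes are illustrative assumptions); note the extra cell state c_n, the explicit context that the gates write to and erase from:

```python
# Sketch: an LSTM carries an explicit gated cell (context) state
# alongside the ordinary hidden state.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)       # batch of 1, 5 time steps, 8-dim inputs
outputs, (h_n, c_n) = lstm(x)  # c_n: cell state managed by the gates
print(outputs.shape, h_n.shape, c_n.shape)
```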
Encoder-decoder Framework

● The encoder compresses the source sentence into a K-dimensional context vector
● Each word generated in the translation is conditioned on this vector

"Sequence to Sequence Learning with Neural Networks" (Sutskever et al., 2014)


Encoder-decoder Framework
● The entire input is summarized in this one single vector
● In the seq2seq model, the decoder's state depends only on the previous state and the previous output
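A minimal sketch of this bottleneck (PyTorch; sizes are illustrative assumptions): the decoder is initialized from the encoder's final state, which is all it ever sees of the source.

```python
# Sketch: in plain seq2seq, the encoder's final (h, c) is the ONLY
# channel between the source sentence and the decoder.
import torch
import torch.nn as nn

encoder = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
decoder = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

src = torch.randn(1, 10, 8)        # source sentence: 10 steps
_, (h, c) = encoder(src)           # the single fixed-size context vector

tgt = torch.randn(1, 7, 8)         # target-side inputs: 7 steps
dec_out, _ = decoder(tgt, (h, c))  # decoder conditioned only on (h, c)
print(dec_out.shape)               # (1, 7, 16)
```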
Neural machine translation
Sequence-to-sequence: the bottleneck problem
Attention

● Attention provides a solution to the bottleneck problem.


● Main idea: on each step of the decoder, focus on a particular part of the source
sequence
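A minimal sketch of one attention step (dot-product scoring; sizes are illustrative assumptions): the decoder state is compared with every encoder state, and the resulting weights form a fresh context vector at each step.

```python
# Sketch: dot-product attention for a single decoder step.
import torch
import torch.nn.functional as F

enc_outputs = torch.randn(10, 16)  # one encoder hidden state per source word
dec_state = torch.randn(16)        # current decoder hidden state

scores = enc_outputs @ dec_state   # one score per source position, (10,)
weights = F.softmax(scores, dim=0) # attention distribution over the source
context = weights @ enc_outputs    # weighted sum of encoder states, (16,)
print(weights.shape, context.shape)
```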
Seq2Seq with attention
4. Trend in NLP
Neural Machine Translation by Jointly Learning to Align and Translate

Source: Bahdanau et al., ICLR 2015, https://arxiv.org/abs/1409.0473


Attention (Bahdanau et al., 14,164 citations)

● Attention significantly improves NMT performance


○ It's very useful to allow the decoder to focus on certain parts of the source
● Attention solves the bottleneck problem
○ Attention allows the decoder to look directly at the source, bypassing the bottleneck
● Attention provides some interpretability
○ By inspecting attention distribution, we can see what the decoder was focusing on
○ We get alignment for free!
○ This is cool because we never explicitly trained an alignment system
○ The network just learned alignment by itself
Seq2seq is very flexible!

● Sequence-to-sequence is useful for more than just MT

● Many NLP tasks can be phrased as sequence-to-sequence:


○ Summarization (long text → short text)
○ Dialogue (previous utterances → next utterance)
○ Parsing (input text → output parse as sequence)
○ Code generation (natural language → Python code)
Self Attention
Transformer model
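The Transformer replaces recurrence with self-attention: every token attends to every other token. A minimal single-head sketch (dimensions and the learned Q/K/V projections are illustrative assumptions):

```python
# Sketch: scaled dot-product self-attention (single head).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 16
x = torch.randn(5, d)            # 5 tokens, d-dim embeddings

# Learned projections to queries, keys, and values.
W_q, W_k, W_v = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)
Q, K, V = W_q(x), W_k(x), W_v(x)

scores = Q @ K.T / math.sqrt(d)  # every token scores every token
attn = F.softmax(scores, dim=-1) # (5, 5) attention matrix
out = attn @ V                   # contextualized token representations
print(attn.shape, out.shape)
```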
Pre-trained language model (LM)
Trends in NLP for Technology

● Seq2Seq (Transformer)
● Transfer Learning: fine-tuning a pre-trained language model (LM) has become the de facto standard for doing transfer learning in natural language processing (see the sketch after this list)
● Combining Supervised & Unsupervised Methods
● Self-supervised learning
● Reinforcement Learning
● NLP model with interpretability
● Combine Feature Engineering and Knowledge base
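A minimal sketch of this transfer-learning recipe with the Hugging Face transformers library; the model name bert-base-uncased, the two-label task, and the example sentence are illustrative assumptions:

```python
# Sketch: fine-tuning a pre-trained LM for classification
# (pip install transformers torch; model/labels are illustrative).
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pre-trained body + new task head

inputs = tokenizer("This movie was great!", return_tensors="pt")
logits = model(**inputs).logits  # head is still untrained: random-ish scores
print(logits)

# Fine-tuning then trains the whole model (or just the head) on
# labeled task data with a small learning rate.
```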
Trends in NLP for applications

● Sentiment Analysis On Social Media


● Multilingual NLP
● Automation In NLP
● Market Intelligence Monitoring
● NLP is already in use in financial markets, where it helps to assess the market situation, track employment changes and tender-related information, extract information from large repositories, etc.
Key Applications in 2021

● Computational linguistics (i.e., modeling the human capacity for language computationally)
● Information extraction, especially “open” IE
● Question answering and chatbots (e.g., Watson, Siri, Google Now)
● Machine translation
● Summarization
● Opinion and sentiment analysis
● Social media analysis
● Fake News Recognition
5. Conclusion
Challenges

● Localization: language differs across regions and cultures (unlike images)


● Low-resource languages
● Domain-specific language.
● The breakthroughs in NLP in the last 2 years are mainly driven by BERT-like architectures built on the Transformer.
● However, these methods require significant computational resources (memory, time). Sometimes even a simple count vectorization does better than a complex BERT approach.
● Transformers are well understood by only a handful of researchers; to most practitioners they remain a black box.
Journals and Conferences in NLP

http://anthology.aclweb.org/
Conclusion

● Computational linguistics and natural language processing:

○ were originally inspired by linguistics, but are now largely applications of machine learning and statistics
● Methods in NLP are now typically based on deep learning
● Trends in NLP span both techniques and applications
THANKS FOR LISTENING!
