
Dialog System

A comprehensive understanding
Mr. T

A dialog system uses natural language understanding to identify user intent and extract parameter values from user utterances. It tracks the dialogue state and applies dialogue policies to determine the optimal system response. The system may retrieve information from backend knowledge providers to generate natural language responses or perform system actions in response to user requests.
Perception
Dialog System (architecture)

A trigger ("Hey Bot!") wakes the system; Speech Recognition converts the speech (or text input) into an utterance.
Utterance: "I will go out at weekends, what is the weather?"

Natural Language Understanding
• Domain identification
• User intent detection
• Slot filling
→ Semantic frame: Ask_weather(date=weekends)

Dialogue Management
• Dialogue state tracking
• Dialogue policy optimization
(backed by Backend Knowledge Providers)

Natural Language Generation produces the text response; Speech Synthesis speaks it.
System action: Request_location
Text response: "Where will you go?" / "Where do you want to ask for the weather this weekend?"
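As a rough sketch of how these components hand off to each other, here is a minimal Python illustration; the SemanticFrame class and policy function are hypothetical, not part of any particular framework:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticFrame:
    """NLU output for one turn, e.g. Ask_weather(date=weekends)."""
    intent: str
    slots: dict = field(default_factory=dict)

def policy(frame: SemanticFrame) -> str:
    """Toy dialogue policy: request any missing required slot."""
    if "location" not in frame.slots:
        return "Request_location"  # NLG renders this as "Where will you go?"
    return "Inform_weather"        # otherwise answer from backend knowledge

frame = SemanticFrame(intent="Ask_weather", slots={"date": "weekends"})
print(policy(frame))               # Request_location
```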
Trigger Word
Wake phrase: "Hey Bot!" / "OK Bot!"

Pipeline: wave sound → frequency domain → Convolutional Neural Network (low-level features) → Recurrent Neural Network (high-level features, pattern extraction) → classifier output: Trigger Word / Unknown.

A solution for a trigger word system.

What else…
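A minimal sketch of such a network in Keras, assuming 1-second clips already converted to 101 x 40 spectrogram features (the shapes and layer sizes here are illustrative assumptions):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(101, 40)),                        # spectrogram frames x frequency bins
    layers.Conv1D(64, kernel_size=5, activation="relu"),  # low-level features
    layers.MaxPooling1D(pool_size=2),
    layers.GRU(64),                                       # high-level temporal patterns
    layers.Dense(2, activation="softmax"),                # trigger word vs. unknown
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```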
Speech Recognition

Pipeline: speech wave → pre-processing → acoustic features → decoder → text.
The decoder combines three knowledge sources:

Acoustic Model: scores how well the audio matches candidate sounds.

Acoustic Dictionary (pronunciation model):
WORD | PRON (IPA)
vợ | v ə ˨˩ˀ
quê | w e

Language Model (n-gram scores):
N-GRAM | SCORE
vợ | 2.5
quê | 0.7

Output: "Vợ tôi ở quê rất đẹp" (Vietnamese: "My wife back home is very beautiful").

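A toy view of what the decoder does with these knowledge sources; the acoustic scores are made up, and the slide's n-gram scores are reused here as stand-in language-model weights:

```python
import math

# Hypothetical log-likelihoods from the acoustic model for one audio segment.
acoustic_score = {"vợ": -1.2, "quê": -2.3}
# The slide's n-gram scores, treated as unnormalized language-model weights.
lm_score = {"vợ": math.log(2.5), "quê": math.log(0.7)}

def decode(candidates, lm_weight=0.8):
    """Pick the word that maximizes acoustic score + weighted LM score."""
    return max(candidates,
               key=lambda w: acoustic_score[w] + lm_weight * lm_score[w])

print(decode(["vợ", "quê"]))  # "vợ" wins on both scores here
```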
Speech Synthesis
Ideas of TTS

Example input: "PG & E will file schedules on April 20". Abbreviations, symbols, and dates must be expanded before synthesis.

https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf

Can we simply look up a recording for each word and concatenate? No!
Pipeline for Text To Speech

Text Normalization
• Sentence tokenization
• Non-standard words
• Word disambiguation
• Trained by machine learning (SVM, decision trees, logistic regression)

Phonetic Analysis
• Dictionary look-up
• Name spelling
• Grapheme-to-phoneme: $\hat{P} = \arg\max_{P} P(P \mid S)$, the most likely phoneme sequence $P$ for a letter sequence $S$

Prosodic Analysis
• Prosody structure
• Prosody prominence
• Tune

Text input flows through these three stages to produce the voice output.

https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
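A sketch of the phonetic-analysis step: dictionary look-up first, with a crude letter-to-sound fallback standing in for the learned argmax over P(P|S); the lexicon entries and rules are hypothetical:

```python
# Hypothetical CMUdict-style lexicon entries for dictionary look-up.
lexicon = {"schedules": "S K EH JH UH L Z", "april": "EY P R AH L"}
vowel_sound = {"a": "AH", "e": "EH", "i": "IH", "o": "OW", "u": "UH"}

def g2p(word: str) -> str:
    """Dictionary look-up first; naive letter-to-sound rules as a fallback."""
    word = word.lower()
    if word in lexicon:
        return lexicon[word]
    # Fallback: a crude stand-in for the learned argmax_P P(P|S).
    return " ".join(vowel_sound.get(ch, ch.upper()) for ch in word)

print(g2p("April"))  # from the lexicon
print(g2p("pge"))    # from the fallback rules
```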
The state-of-the-art

Where GANs come into play.


Natural Language Understanding
Word Embedding

One-hot Encoding Vector: each word maps to a sparse vector with a single 1.
(1, 0, 0, 0, 0, 0, 0)
(0, 1, 0, 0, 0, 0, 0)
(0, 0, 1, 0, 0, 0, 0)
(0, 0, 0, 1, 0, 0, 0)
(0, 0, 0, 0, 1, 0, 0)
(0, 0, 0, 0, 0, 1, 0)
Each row is one word vector.
Prediction-based vectors (dense embeddings):
King (0.12, 0.23, 0.43)
Queen (0.14, 0.57, 0.88)
Man (0.44, 0.90, 0.11)
Woman (0.19, 0.23, 0.53)
Boy (0.12, 0.65, 0.42)
Girl (0.34, 0.44, 0.68)
Related words end up close together: king relates to man as queen relates to woman.

Frequency-based vectors (term-document matrix):

Terms \ Docs | 1 | 2 | 3 | 4
1 | 10 | 0 | 1 | 0
2 | 0 | 0 | 0 | 2
3 | 4 | 0 | 7 | 0
4 | 0 | 5 | 0 | 12

Each row is a word vector; each column is a document vector.
One-hot Encoding Vector
Corpus: "Co gai hot girl xinh dep, truoc day la mot chang trai dam my"
(Vietnamese, roughly: "The beautiful hot girl was formerly a danmei boy")

Token | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Co gai | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
hot girl | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
xinh dep | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
truoc day | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
la | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
mot | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
chang trai | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
dam my | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1

Each word gets a 1×8 vector representation; a sketch of this construction follows below.
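A quick sketch of building these vectors, which also previews the problem raised next: every pair of distinct one-hot vectors is orthogonal, so they encode no similarity.

```python
import numpy as np

# One-hot encoding of the slide's 8-token vocabulary.
vocab = ["co gai", "hot girl", "xinh dep", "truoc day",
         "la", "mot", "chang trai", "dam my"]
one_hot = {tok: np.eye(len(vocab))[i] for i, tok in enumerate(vocab)}

print(one_hot["co gai"])                        # [1. 0. 0. 0. 0. 0. 0. 0.]
print(one_hot["co gai"] @ one_hot["hot girl"])  # 0.0 -- no similarity signal
```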

What’s wrong…
Custom Encoding Vector
Hand-designed 1×5 features (glosses: nguoi = person, ban chat = trait, thoi gian = time, so dem = number, nu tinh = femininity):

Token | nguoi | ban chat | thoi gian | so dem | nu tinh
Co gai | 1 | 0 | 0 | 0 | 1
hot girl | 0.7 | 1 | 0 | 0 | 0.7
xinh dep | 0.6 | 1 | 0 | 0 | 0.5
truoc day | 0 | 0 | 1 | 1 | 0
la | 0 | 0 | 0 | 0 | 0
mot | 0 | 0 | 0 | 1 | 0
chang trai | 1 | 0 | 0 | 0 | 0
dam my | 0.7 | 1 | 0 | 0 | 0

Each word gets a 1×5 vector representation, and related words now have similar vectors (a better relationship between words than one-hot encoding gives).
Count Vector
Let us understand this using a simple example.
• D1: He is a lazy boy. She is also lazy.
• D2: Neeraj is a lazy person.
Dictionary = ['He', 'She', 'lazy', 'boy', 'Neeraj', 'person']
D = 2 (number of documents), N = 6 (number of words in the dictionary)

The count-vector matrix M is D × N, and vector("lazy") = [2, 1]:

 | He | She | lazy | boy | Neeraj | person
D1 | 1 | 1 | 2 | 1 | 0 | 0
D2 | 0 | 0 | 1 | 0 | 1 | 1
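This matrix can be reproduced with scikit-learn's CountVectorizer by pinning the vocabulary to the slide's dictionary (scikit-learn lowercases tokens by default):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["He is a lazy boy. She is also lazy.",
        "Neeraj is a lazy person."]
vectorizer = CountVectorizer(vocabulary=["he", "she", "lazy",
                                         "boy", "neeraj", "person"])
M = vectorizer.fit_transform(docs).toarray()

print(M)         # [[1 1 2 1 0 0], [0 0 1 0 1 1]] -- the D x N matrix
print(M[:, 2])   # the column for "lazy": [2 1]
```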
TF-IDF Vectorization
TF = (number of times term t appears in a document) / (number of terms in the document)

So TF(This, Document1) = 1/8 and TF(This, Document2) = 1/5.

IDF = log(N/n), where N is the number of documents and n is the number of documents in which term t appears.

So IDF(This) = log(2/2) = 0.

Let us compute IDF for the word 'Messi': IDF(Messi) = log(2/1) = 0.301.

Now compare the TF-IDF of the common word 'This' with that of 'Messi', which is clearly relevant to Document 1:
TF-IDF(This, Document1) = (1/8) × 0 = 0
TF-IDF(This, Document2) = (1/5) × 0 = 0
TF-IDF(Messi, Document1) = (4/8) × 0.301 = 0.15

TF-IDF penalizes the common word 'This' but assigns greater weight to 'Messi'.
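A short script that reproduces these numbers; the two documents are hypothetical stand-ins with the same word counts as the slide's example (8 terms with 'Messi' four times, and 5 terms):

```python
import math

def tf(term, doc):
    words = doc.lower().split()
    return words.count(term.lower()) / len(words)

def idf(term, docs):
    n = sum(1 for d in docs if term.lower() in d.lower().split())
    return math.log10(len(docs) / n)  # the slide uses base-10 logs

doc1 = "this is about messi messi messi messi goals"  # 8 terms, Messi x4
doc2 = "this is about something else"                 # 5 terms
docs = [doc1, doc2]

print(tf("This", doc1))                        # 0.125 = 1/8
print(idf("This", docs))                       # 0.0 = log(2/2)
print(tf("Messi", doc1) * idf("Messi", docs))  # ~0.15 = (4/8) * 0.301
```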


Co-Occurrence Matrix with a fixed context window

The big idea: similar words tend to occur together and will have similar contexts. For example:
"Apple is a fruit. Mango is a fruit."
Apple and mango tend to have a similar context, i.e., fruit.

Not preferred in practice: the raw matrix is vocabulary × vocabulary, so it is huge and needs dimensionality reduction before use.
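A minimal sketch of counting co-occurrences within a fixed window over the slide's two sentences:

```python
import re
from collections import Counter

def cooccurrence(corpus, window=3):
    """Count word pairs that appear within `window` positions of each other."""
    counts = Counter()
    for sentence in corpus:
        words = re.findall(r"\w+", sentence.lower())
        for i, w in enumerate(words):
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if i != j:
                    counts[(w, words[j])] += 1
    return counts

counts = cooccurrence(["Apple is a fruit.", "Mango is a fruit."])
# Both target words share the context word "fruit".
print(counts[("apple", "fruit")], counts[("mango", "fruit")])  # 1 1
```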


Prediction-based Vectors
• Continuous Bag-of-Words (CBOW) and Skip-gram models
• CBOW models P(word | context); Skip-gram models P(context | word)
• A row of the trained input weight matrix = a word vector

https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
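Training such vectors takes a few lines with gensim (assuming gensim 4.x; the toy corpus is far too small to produce meaningful embeddings):

```python
from gensim.models import Word2Vec

sentences = [["apple", "is", "a", "fruit"],
             ["mango", "is", "a", "fruit"]]

# sg=0 trains CBOW, P(word | context); sg=1 trains skip-gram, P(context | word).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

vec = model.wv["apple"]  # one row of the learned input weight matrix
print(vec.shape)         # (50,)
print(model.wv.similarity("apple", "mango"))
```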
Intent and Entities

• Intent = topic/domain. For "Go home to have the dinner", Intent = "Home_activity".
• Entities = keywords: Location ("home"), Action ("go"), Object ("dinner").
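A toy rule-based version of this step over the slide's example; real NLU replaces the keyword tables with trained classifiers and sequence taggers (the tables below are made up for illustration):

```python
INTENT_KEYWORDS = {"Home_activity": {"home", "dinner"}}
ENTITY_LEXICON = {"home": "Location", "go": "Action", "dinner": "Object"}

def parse(utterance):
    """Map an utterance to (intent, entities) with simple keyword rules."""
    tokens = utterance.lower().split()
    intent = next((name for name, kws in INTENT_KEYWORDS.items()
                   if kws & set(tokens)), None)
    entities = {t: ENTITY_LEXICON[t] for t in tokens if t in ENTITY_LEXICON}
    return intent, entities

print(parse("Go home to have the dinner"))
# ('Home_activity', {'go': 'Action', 'home': 'Location', 'dinner': 'Object'})
```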
Dialogue Management

Stateless vs. stateful: statefulness is the key.
• Follow-up questions
• Pending actions
A state-tracking sketch follows below.
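A sketch of why state matters: the tracker carries the intent and slots across turns, so a follow-up like "What about Sunday?" can be resolved against the pending request (the state structure here is a hypothetical simplification):

```python
state = {"intent": None, "slots": {}}

def track(state, intent, slots):
    """Merge one turn into the dialogue state."""
    if intent:                    # a new intent replaces the old one
        state["intent"] = intent
    state["slots"].update(slots)  # follow-ups only add or override slots
    return state

track(state, "Ask_weather", {"date": "weekends"})
track(state, None, {"date": "sunday"})  # follow-up: no new intent, new slot value
print(state)  # {'intent': 'Ask_weather', 'slots': {'date': 'sunday'}}
```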
Natural Language Generation
• Fixed response + slot filling + random choice from a pool (sketched below)

User: Do you know "I'm really quite something"?
Bot: "I'm really quite something", composed by Son Tung-MTP

• Using a neural network and a language model: not recommended.
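A minimal sketch of the fixed-response approach: pick a template at random from a pool and fill in the slots (the templates and slot names are illustrative):

```python
import random

TEMPLATES = [
    '"{song}" composed by {artist}',
    'That is "{song}", a song by {artist}',
]

def respond(slots):
    """Fixed response + slot filling + random choice from a pool."""
    return random.choice(TEMPLATES).format(**slots)

print(respond({"song": "I'm really quite something",
               "artist": "Son Tung-MTP"}))
```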
Future of End-to-End
Data-driven:
• Seq2Seq
• Reinforcement learning

https://aclweb.org/anthology/C18-3006
Tips
• Script writer
• Personality
• Control the dialogue
• APIs save time
• Label intents and entities
• Design the flow
• Expandable design
• Lots of testing
Applications

Typical backends: database, data warehouse, cloud services, web services.

Rasa: open-source conversational AI.
PRACTICAL TIME
Use case 1: Health-care

A Google Virtual Assistant front end handles speech-to-text and text-to-speech. The transcribed request goes to Dialog Management over a REST API, where NLP extracts the intent and entities. The analyzed text then drives logical functions built over the data: an Emotion Detection system, a Health-Care system, and a Recommendation system.
Use case 2: HR chatbot

The chatbot sits between HR staff and employees, supporting communication, monitoring, and training, backed by the company's knowledge resources.
THANK YOU!
