Building Chatbots in Python: Chapter 2
Alan Nichol
Co-founder and CTO, Rasa
DataCamp Building Chatbots in Python
An example
Intents
I'm hungry
Show me good pizza spots
I want to take my boyfriend out for sushi
Can also be request_booking
Entities
In [3]: pattern.findall(message)
Out[3]: ['Mary', 'Oxford', 'Google']
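The `pattern` and `message` behind this output are not shown on the slide. A minimal sketch that yields the same list, assuming a simple capitalized-word regex and an invented example message (both are assumptions, not necessarily what the course used):

```python
import re

# Assumed pattern: one uppercase letter followed by any number of lowercase letters.
pattern = re.compile(r"[A-Z][a-z]*")

# Hypothetical message, chosen to contain the three names in the output above.
message = "Mary is a friend of mine, she studied at Oxford and now works at Google"

# findall returns every non-overlapping match, left to right.
entities = pattern.findall(message)
print(entities)
```

A pattern like this also matches any capitalized word at the start of a sentence, so it is only a rough first pass at finding names.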
Let's practice!
Word vectors
Machine learning
Programs that get better at a task when exposed to more data
Example task: identifying which intent a user message belongs to
Vector representations
Word vectors
Context | Candidates
let's meet at the ___ tomorrow | office, gym, park, beach, party
I love going to the ___ to play with the dogs | beach, park
In [3]: nlp.vocab.vectors_length
Out[3]: 300
Similarity
Direction of vectors matters
"Distance" between words = angle between the vectors
Cosine similarity
1: If vectors point in the same direction
0: If they are perpendicular
-1: If they point in opposite directions
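The three cases above can be checked with a small sketch. This is a plain-Python illustration of the formula (dot product divided by the product of the vector lengths); the function name and the toy vectors are made up for the example:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction, different length: close to 1.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])

# Perpendicular vectors: 0.
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])

# Opposite directions: close to -1.
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])

print(same, perpendicular, opposite)
```

Because the lengths are divided out, only the direction of the vectors affects the score.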
.similarity()
"can" and "cat" are spelled similarly but have low similarity
but "cat" and "dog" have high similarity
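In [1]: import spacy
In [2]: nlp = spacy.load("en_core_web_md")  # assumed model name; any model with word vectors
In [3]: doc = nlp("cat")
In [4]: doc.similarity(nlp("can"))
Out[4]: 0.30165292161215396
In [5]: doc.similarity(nlp("dog"))
Out[5]: 0.80168555173294953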
Let's practice!
Supervised learning
A classifier predicts the intent label given a sentence
'Fit' the classifier by tuning it on training data
Evaluate performance on test data
Accuracy: the fraction of labels we predict correctly
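As a small illustration of the accuracy definition, the sketch below compares predicted labels with true labels. The helper name and the four example labels are invented for the illustration:

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Made-up test labels and predictions: three of the four predictions are right.
labels_test = ["atis_flight", "atis_airfare", "atis_flight", "atis_flight"]
predictions = ["atis_flight", "atis_flight", "atis_flight", "atis_flight"]

score = accuracy(predictions, labels_test)
print(score)
```

Note that a classifier that always predicts the most common label can still score well on this measure when one intent dominates the data.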
ATIS dataset
Thousands of sentences with labeled intents and entities
Collected from a real flight booking service
Intents like
atis_flight
atis_airfare
ATIS dataset II
In [1]: sentences_train[:2]
Out[1]: [
"i want to fly from boston at 838 am and arrive in denver at 1110 in the morning",
"what flights are available from pittsburgh to baltimore on thursday morning"
]
In [2]: labels_train[:2]
Out[2]: [
"atis_flight",
"atis_flight"
]
In [4]: scores = [
...: cosine_similarity(X[i,:], test_x)
...: for i in range(len(sentences_train))
...: ]
In [5]: labels_train[np.argmax(scores)]
Out[5]: 'atis_flight'
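The nearest-neighbour idea in this snippet (score the test sentence against every training sentence, then take the label of the best match) can be sketched end to end with toy data. The 3-dimensional vectors below are invented stand-ins for real sentence vectors, and `cosine_similarity` is a plain-Python helper written for this example:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy training vectors (assumed); real ones would come from a word-vector model.
X_train = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
]
labels_train = ["atis_flight", "atis_flight", "atis_airfare"]

# Toy vector for the sentence we want to classify.
test_x = [0.2, 0.8, 0.1]

# Similarity of the test vector to every training vector.
scores = [cosine_similarity(x, test_x) for x in X_train]

# Index of the most similar training example, and its label.
best = max(range(len(scores)), key=scores.__getitem__)
predicted = labels_train[best]
print(predicted)
```

Here the test vector points in nearly the same direction as the third training vector, so it receives that example's label.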
Let's practice!
Entity extraction
In [3]: doc = nlp("my friend Mary has worked at Google since 2009")
Roles
Dependency parsing
In [3]: list(shanghai.ancestors)
Out[3]: [to, flight]
In [4]: list(singapore.ancestors)
Out[4]: [from, flight]
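To show how ancestors can assign roles, here is a sketch that avoids loading a parser. The head table is written by hand to mirror the output above, and the sentence it describes ("a flight to shanghai from singapore") is an assumption, since the parsed sentence is not shown; a real parse would come from spaCy:

```python
# Hand-written head table: each word points to its syntactic parent (None marks the root).
heads = {
    "a": "flight",
    "flight": None,
    "to": "flight",
    "shanghai": "to",
    "from": "flight",
    "singapore": "from",
}

def ancestors(word):
    """Walk from a word up to the root, collecting each parent on the way."""
    chain = []
    parent = heads[word]
    while parent is not None:
        chain.append(parent)
        parent = heads[parent]
    return chain

def role(word):
    """Assign a role based on which preposition appears among the ancestors."""
    path = ancestors(word)
    if "to" in path:
        return "destination"
    if "from" in path:
        return "origin"
    return None

roles = {city: role(city) for city in ("shanghai", "singapore")}
print(roles)
```

Both cities are locations, so entity type alone cannot tell them apart; the preposition on the path to the root does.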
Shopping example
In [1]: doc = nlp("let's see that jacket in red and some blue jeans")
Let's practice!
Rasa NLU
Library for intent recognition & entity extraction
Based on spaCy, scikit-learn, & other libraries
Built-in support for chatbot-specific tasks
Interpreters
In [1]: message = "I want to book a flight to London"
In [2]: interpreter.parse(message)
Out[2]: {
"intent": {
"name": "flight_search",
"confidence": 0.9
},
"entities": [
{
"entity": "location",
"value": "London",
"start": 27,
"end": 33
}
]
}
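The parse result is a plain dictionary, so a bot can read it with ordinary indexing. The sketch below copies the output above into a literal instead of calling an interpreter; the confidence threshold is a hypothetical value chosen for illustration:

```python
message = "I want to book a flight to London"

# Parse result copied from the output above.
result = {
    "intent": {"name": "flight_search", "confidence": 0.9},
    "entities": [
        {"entity": "location", "value": "London", "start": 27, "end": 33}
    ],
}

intent_name = result["intent"]["name"]
confidence = result["intent"]["confidence"]

# "start" and "end" are character offsets into the original message.
spans = [message[e["start"]:e["end"]] for e in result["entities"]]

# Collect entities into a simple slot dictionary.
slots = {e["entity"]: e["value"] for e in result["entities"]}

# Hypothetical cut-off below which the bot would ask the user to rephrase.
CONFIDENCE_THRESHOLD = 0.7
accepted = confidence >= CONFIDENCE_THRESHOLD

print(intent_name, slots, spans, accepted)
```

Checking that the offsets slice back to the entity value is a quick sanity test when debugging training data.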
Rasa usage
# Creating a model
In [1]: from rasa_nlu.config import RasaNLUConfig
Rasa pipelines
In [1]: spacy_sklearn_pipeline = [
"nlp_spacy",
"ner_crf",
"ner_synonyms",
"intent_featurizer_spacy",
"intent_classifier_sklearn"
]
In [3]: RasaNLUConfig(
cmdline_args={"pipeline": "spacy_sklearn"}
)
Out[3]: <rasa_nlu.config.RasaNLUConfig at 0x10f60aa20>
Handling typos
In [1]: pipeline = [
...: "nlp_spacy",
...: "intent_featurizer_spacy",
...: "intent_featurizer_ngrams",
...: "intent_classifier_sklearn"
...: ]
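One reason character n-gram features can help with typos is that a misspelled word still shares many n-grams with the correct spelling. The sketch below is a standalone illustration of that idea; it is not the code of `intent_featurizer_ngrams`, and the helper names and example words are made up:

```python
def char_ngrams(word, n=3):
    """Set of all character n-grams (substrings of length n) in a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def jaccard(a, b):
    """Overlap between two sets: size of intersection divided by size of union."""
    return len(a & b) / len(a | b)

correct = char_ngrams("restaurant")
typo = char_ngrams("restarant")      # missing the "u"
unrelated = char_ngrams("flight")

# The typo keeps many trigrams of the correct word; the unrelated word keeps none.
typo_overlap = jaccard(correct, typo)
unrelated_overlap = jaccard(correct, unrelated)

print(typo_overlap, unrelated_overlap)
```

A word-level lookup would treat "restarant" as an unknown token, whereas sub-word features like these still carry a usable signal.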
Let's practice!