Generating N-Grams

N-grams are contiguous sequences of neighboring words, symbols, or other items in a document, with applications in language models, text mining, and more. They are called unigrams, bigrams, or trigrams when they combine one, two, or three words. The document also provides examples of generating n-grams using CountVectorizer and TextBlob.

Uploaded by

Vidhya B

GENERATING N-GRAMS

N-Grams

• An n-gram is a contiguous sequence of n items from a document. The items can be letters, symbols, words, or other tokens.
• In technical terms, n-grams are the neighboring sequences of items in a document.
• In scikit-learn's CountVectorizer, the ngram_range parameter is a tuple (min_n, max_n), default (1, 1), that sets the smallest and largest n to extract.
• N-grams have a wide range of applications, including language models, semantic features, spelling correction, machine translation, and text mining.
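
The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library API: the helper name `ngrams` and the sample inputs are assumptions made for this example. It slides a window of length n over any sequence, so the same function covers word-level and character-level n-grams.

```python
def ngrams(items, n):
    """Return every run of n neighboring items as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # One window starts at each index that still has n items after it.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams: the items are the words of a sentence.
words = "I am learning NLP".split()
word_bigrams = ngrams(words, 2)

# Character-level trigrams: the items are the letters of a string.
char_trigrams = ngrams("grams", 3)
```

When the sequence is shorter than n, the function returns an empty list, since no complete window fits.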

• Unigram – a single word (n = 1).
• Bigram – a sequence of two consecutive words (n = 2).
• Trigram – a sequence of three consecutive words (n = 3).
• Example: for "I am learning NLP", the bigrams are "I am", "am learning", and "learning NLP".
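
To make the link between these three names and an ngram_range-style (min_n, max_n) pair concrete, here is a small pure-Python sketch. The function name `ngrams_in_range` is hypothetical and only imitates the idea of a lower and upper bound on n; it is not how CountVectorizer is implemented.

```python
def ngrams_in_range(tokens, min_n, max_n):
    """Collect all n-grams with min_n <= n <= max_n, joined by spaces."""
    result = []
    for n in range(min_n, max_n + 1):
        # Slide a window of n tokens across the list.
        for i in range(len(tokens) - n + 1):
            result.append(" ".join(tokens[i:i + n]))
    return result


tokens = "I am learning NLP".split()
unigrams = ngrams_in_range(tokens, 1, 1)   # single words
bigrams = ngrams_in_range(tokens, 2, 2)    # pairs of consecutive words
trigrams = ngrams_in_range(tokens, 3, 3)   # triples of consecutive words
all_grams = ngrams_in_range(tokens, 1, 3)  # unigrams, bigrams, and trigrams together
```

Setting both bounds to the same value selects a single n-gram size, while a wider range such as (1, 3) combines several sizes in one result.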
Bigram Example 1
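
from sklearn.feature_extraction.text import CountVectorizer

text = ["I love my family and in my family there are ten members."]

# ngram_range=(2, 2) extracts bigrams only
vectorizer2 = CountVectorizer(ngram_range=(2, 2))
# Learn the bigram vocabulary and build the sparse count matrix
vector2 = vectorizer2.fit_transform(text)
vector2
# Mapping from each bigram to its column index; the default tokenizer
# lowercases the text and skips one-character tokens such as "I"
vectorizer2.vocabulary_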
Trigram Example 1
vectorizer4.vocabulary_
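
The line above inspects a `vectorizer4` object whose construction is not shown. The following is a minimal sketch of one way to build it, under two assumptions: that it mirrors the bigram example with `ngram_range=(3, 3)`, and that it reuses the same sentence. The names `vector4`, `vocabulary`, and `trigrams` are chosen here for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Assumption: the same sentence as in the bigram example.
text = ["I love my family and in my family there are ten members."]

# Assumption: the trigram vectorizer was created with ngram_range=(3, 3),
# which restricts the extracted features to trigrams.
vectorizer4 = CountVectorizer(ngram_range=(3, 3))
vector4 = vectorizer4.fit_transform(text)  # sparse document-term matrix

# vocabulary_ maps each trigram to its column index in vector4.
vocabulary = vectorizer4.vocabulary_
trigrams = sorted(vocabulary)
```

With the default settings, CountVectorizer lowercases the text and ignores one-character tokens, so the leading "I" does not appear in any trigram.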
Generate n-grams using TextBlob

from textblob import TextBlob

text = "I am learning NLP"
# ngrams(n) returns a list of WordList objects, one per n-gram
TextBlob(text).ngrams(1)  # unigrams
TextBlob(text).ngrams(2)  # bigrams
TextBlob(text).ngrams(3)  # trigrams
Generate n-grams using Regular Expression
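
One possible regular-expression approach, offered as a sketch rather than the method from the slides: tokenize the text with `re.findall`, then pair up neighboring tokens with `zip`. The function name `regex_ngrams`, the `\w+` pattern, and the lowercasing step are assumptions made for this example.

```python
import re


def regex_ngrams(text, n):
    """Tokenize with a regular expression, then join each run of n tokens."""
    # \w+ matches runs of letters, digits, and underscores, so punctuation is dropped.
    tokens = re.findall(r"\w+", text.lower())
    # Zipping n staggered views of the token list yields each window of n tokens.
    return [" ".join(gram) for gram in zip(*(tokens[i:] for i in range(n)))]


sentence = "I love my family and in my family there are ten members."
bigrams = regex_ngrams(sentence, 2)
trigrams = regex_ngrams(sentence, 3)
```

Unlike CountVectorizer's default tokenizer, this pattern keeps one-character tokens, so "i love" is the first bigram here.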
