
Find Frequency of Each Word in a String in Python
Text analytics often requires counting words and assigning weights to them before they are processed by various algorithms. In this article we will see how to find the frequency of each word in a given sentence, using the three approaches shown below.
Using Counter
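We can use Counter from the collections module to get the frequency of the words. We first call split() to break the line into words, pass the resulting list to Counter(), and then call most_common() to list the words from most to least frequent. The count is case-sensitive, so 'Learn' and 'learn' are treated as different words.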
Example
from collections import Counter

line_text = "Learn and practice and learn to practice"
freq = Counter(line_text.split()).most_common()
print(freq)
Running the above code gives us the following result:
[('and', 2), ('practice', 2), ('Learn', 1), ('learn', 1), ('to', 1)]
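The result above counts 'Learn' and 'learn' separately, and split() would also keep any punctuation attached to a word. If that is not wanted, one possible variation is to normalize the text before counting. The sketch below is only an illustration: the helper name word_frequencies and the regular expression are assumptions made for this example, not part of the original code.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count words case-insensitively, ignoring punctuation (illustrative helper)."""
    # Lowercase the text, then extract runs of letters, digits and apostrophes
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)

freq = word_frequencies("Learn and practice, and learn to practice.")
print(freq.most_common())
# [('learn', 2), ('and', 2), ('practice', 2), ('to', 1)]
```

Whether to fold case or strip punctuation depends on the analysis, so treat this as a starting point rather than a required step.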
Using FreqDist()
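The Natural Language Toolkit (NLTK) provides the FreqDist class. NLTK is a third-party package, so it must be installed before it can be imported. Printing a FreqDist object shows the number of distinct words (samples) and the total number of words (outcomes) in the string. Calling most_common() on it gives us the frequency of each word.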
Example
from nltk import FreqDist

text = "Learn and practice and learn to practice"
words = text.split()
fdist1 = FreqDist(words)
print(fdist1)
print(fdist1.most_common())
Running the above code gives us the following result:
<FreqDist with 5 samples and 7 outcomes>
[('and', 2), ('practice', 2), ('Learn', 1), ('learn', 1), ('to', 1)]
Using Dictionary
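In this approach we split the line into a list of words and call the list's count() method for each word to get its frequency. We then zip the words with their counts and pass the pairs to dict(), which keeps one entry per distinct word. Because count() rescans the whole list for every word, this approach is best suited to short texts.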
Example
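text = "Learn and practice and learn to practice"
words = text.split()
# Count how often each word occurs in the list
wfreq = [words.count(w) for w in words]
# Pair each word with its count; dict() keeps one entry per distinct word
print(dict(zip(words, wfreq)))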
Running the above code gives us the following result:
{'Learn': 1, 'and': 2, 'practice': 2, 'learn': 1, 'to': 1}
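For longer texts, the dictionary can also be built in a single pass instead of calling count() once per word. The following is a minimal sketch of that idea, not the code used above; the function name count_words is a hypothetical name chosen for this example.

```python
def count_words(text):
    """Build a word -> frequency dictionary in a single pass (illustrative helper)."""
    freq = {}
    for word in text.split():
        # get() returns 0 the first time a word is seen
        freq[word] = freq.get(word, 0) + 1
    return freq

print(count_words("Learn and practice and learn to practice"))
# {'Learn': 1, 'and': 2, 'practice': 2, 'learn': 1, 'to': 1}
```

The output matches the count()-based version, but each word is visited only once.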