
Create Word Cloud using Python
A "word cloud" is a visual representation of text data in which the size of each word indicates its frequency or importance within the dataset. It helps us identify the most common and important words in a text, and is often used to summarize a large body of text at a glance.
In this article, we will create a word cloud about the Python programming language, using text fetched from Wikipedia.
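To make the link between word size and frequency concrete, the sketch below counts words using only the standard library. The sample sentence and the helper name word_frequencies are illustrative assumptions; the wordcloud package does its own tokenizing and scaling internally.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercase word to its number of occurrences."""
    # Treat runs of letters and apostrophes as words; everything else separates them.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "Python is simple. Python is readable. Readable code is good code."
counts = word_frequencies(sample)

# The most frequent words are the ones a word cloud would draw largest.
for word, count in counts.most_common(3):
    print(word, count)
```

In this sample, "is" occurs most often, which is exactly the kind of filler word that Step 2 later removes.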
Modules to create a Word Cloud in Python
The following modules are required to create a word cloud in Python:
Install wordcloud
Before installing the wordcloud module, make sure that Python is installed and properly set up on your system. We can install wordcloud by running the following command in the command prompt:
pip install wordcloud
Install NumPy
We can install NumPy by running the following command in the command prompt:
pip install numpy
Install Wikipedia
We can install the wikipedia module by running the following command in the command prompt:
pip install wikipedia
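After running the install commands, it can help to confirm that the modules are visible to the interpreter that will run the script. The following is a minimal sketch using the standard library; the helper name missing_modules and the REQUIRED list are assumptions made for this illustration.

```python
import importlib.util

# Import names used by the scripts in this article. "PIL" is provided by the
# Pillow package, which pip normally installs as a dependency of wordcloud.
REQUIRED = ["wordcloud", "numpy", "wikipedia", "PIL"]


def missing_modules(names):
    """Return the names from `names` that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```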
Creating a word cloud on Python programming
The script asks for a title, looks it up on Wikipedia, and builds the word cloud from that article's text. Enter a title such as "Python programming" to reproduce this example. Following are the steps:
Step 1: Printing data from Wikipedia
Following is the code to print the data from Wikipedia:
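import numpy as np
from PIL import Image
import wikipedia
from wordcloud import WordCloud, STOPWORDS

# Ask for a search term and take the first matching article title
my_str = input("Enter the title: ")
title = wikipedia.search(my_str)[0]

# Fetch the page and print its plain-text content
page = wikipedia.page(title)
text = page.content
print(text)
The Step 1 code prints the full plain-text content of the matched Wikipedia page.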
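The lookup in Step 1 can fail in two ways: the search may return an empty list, so indexing with [0] raises an IndexError, and an ambiguous title makes wikipedia.page() raise a DisambiguationError. The sketch below shows one possible way to handle both cases. The helper names first_title and fetch_page_text are hypothetical, and passing auto_suggest=False is a choice made for this example, not something the original script does.

```python
def first_title(results):
    """Return the first search hit, or raise LookupError when there are none."""
    if not results:
        raise LookupError("No Wikipedia article matched the search term.")
    return results[0]


def fetch_page_text(query):
    """Fetch the plain-text content of the best-matching Wikipedia page."""
    # Third-party module; imported here so first_title() works without it.
    import wikipedia

    title = first_title(wikipedia.search(query))
    try:
        # auto_suggest=False requests the exact title returned by the search.
        return wikipedia.page(title, auto_suggest=False).content
    except wikipedia.exceptions.DisambiguationError as err:
        # The title is ambiguous; err.options lists the candidate pages.
        raise LookupError(
            f"'{title}' is ambiguous; try one of: {err.options[:5]}"
        ) from err
    except wikipedia.exceptions.PageError as err:
        raise LookupError(f"No page found for '{title}'.") from err
```

With these helpers, the script could call text = fetch_page_text(my_str) and report a LookupError to the user instead of stopping with a traceback.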
Step 2: Cleaning unwanted data
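Common words that carry little meaning, such as "is", "the", "are", and "with", can be excluded by passing the STOPWORDS set from the wordcloud module to WordCloud. The script also loads an image, cloud.jpg, to use as a mask that shapes the cloud; this file must exist in the directory from which the script is run. Following is the code to remove unwanted data: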
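import numpy as np
from PIL import Image
import wikipedia
from wordcloud import WordCloud, STOPWORDS

# Fetch the article text as in Step 1
my_str = input("Enter the title: ")
title = wikipedia.search(my_str)[0]
page = wikipedia.page(title)
text = page.content
print(text)

# Load the mask image that defines the shape of the cloud
background = np.array(Image.open("cloud.jpg"))

# Words in this set are left out of the cloud
stopwords = set(STOPWORDS)

# When a mask is given, WordCloud takes its size from the mask
# and ignores width and height
wc = WordCloud(background_color="white", max_words=400, mask=background,
               stopwords=stopwords, width=800, height=600)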
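To illustrate what stop-word removal does to the text, here is a small sketch that uses only the standard library. DEMO_STOPWORDS is a tiny hand-picked set assumed for this example; the STOPWORDS set shipped with wordcloud is much larger, and WordCloud applies it internally rather than through a helper like remove_stopwords.

```python
import re

# A tiny hand-picked stop-word set, for illustration only.
DEMO_STOPWORDS = {"is", "the", "are", "with", "a", "and"}


def remove_stopwords(text, stopwords):
    """Return the words of `text`, lowercased, minus those in `stopwords`."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in stopwords]


kept = remove_stopwords("Python is a language with the batteries included", DEMO_STOPWORDS)
print(kept)
```

Because the stopwords variable in Step 2 is an ordinary Python set, extra words can be added to it with stopwords.update([...]) before it is passed to WordCloud.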
Step 3: Generating word cloud
Following is the code to generate the word cloud using the generate() method and save it with to_file(). The image is written as prg.jpg to the current working directory, which is the script's own folder only if you run the script from there:
import numpy as np
from PIL import Image
import wikipedia
from wordcloud import WordCloud, STOPWORDS

my_str = input("Enter the title: ")
title = wikipedia.search(my_str)[0]
page = wikipedia.page(title)
text = page.content
print(text)

background = np.array(Image.open("cloud.jpg"))
stopwords = set(STOPWORDS)
wc = WordCloud(background_color="white", max_words=400, mask=background,
               stopwords=stopwords, width=800, height=600)

# generating wordcloud
wc.generate(text)
wc.to_file("prg.jpg")
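A relative filename such as "prg.jpg" is resolved against the current working directory. If the image should always land next to the script regardless of where it is launched from, one option is to build an absolute path first. This is a minimal sketch; the helper name output_path is hypothetical and not part of wordcloud.

```python
from pathlib import Path


def output_path(script_file, filename="prg.jpg"):
    """Return an absolute path for `filename` in the folder containing `script_file`."""
    return Path(script_file).resolve().parent / filename


# Inside a script, __file__ holds the script's own path, so one could write:
#     wc.to_file(str(output_path(__file__)))
print(output_path("wordcloud_demo.py"))
```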