Word Cloud

The document outlines the installation of necessary R packages for text analysis and visualization, including 'readxl', 'tm', and 'wordcloud'. It describes the process of importing text data, creating a text corpus, preprocessing the text, and generating a Document-Term Matrix. Finally, it details how to create a word cloud from the processed text data.

Uploaded by

Akshit Mittal

# Install necessary packages (only needed once per R installation)
install.packages("readxl")
install.packages("tm")
install.packages("wordcloud")
install.packages("topicmodels")
install.packages("ggplot2")
install.packages("treemap")
install.packages("syuzhet")

# Load necessary libraries
library(readxl)      # To read Excel files
library(tm)          # Text mining
library(wordcloud)   # Word cloud generation (also attaches RColorBrewer)
library(topicmodels) # Topic modelling
library(ggplot2)     # For plotting bar graphs

# Import data file
# The Excel file is assumed to be already imported into a data frame named
# `coding` (for example with read_excel()), with the text in the second
# column of the first sheet.
text_data <- as.character(coding[[2]]) # Convert the second column to a character vector
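The shape this step expects can be illustrated with a small stand-in. The sketch below is only an assumption about what the imported sheet looks like: `coding_demo`, its column names, and the three sentences are hypothetical and do not come from the original data.

```r
# Hypothetical stand-in for the imported sheet: an id column followed by
# a free-text column (the real data would come from the Excel file).
coding_demo <- data.frame(
  id = 1:3,
  response = c("Great product, fast delivery!",
               "Price is too high.",
               "Delivery took 10 days."),
  stringsAsFactors = FALSE
)

# [[2]] extracts the second column as a plain vector, one element per row
text_demo <- as.character(coding_demo[[2]])
str(text_demo)
```

Each element of the resulting character vector becomes one document when it is passed to `VectorSource()`.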

# Create a corpus (one document per element of text_data)
corpus <- VCorpus(VectorSource(text_data))

# Preprocess the text
corpus <- tm_map(corpus, content_transformer(tolower))      # Convert to lowercase
corpus <- tm_map(corpus, removePunctuation)                 # Remove punctuation
corpus <- tm_map(corpus, removeNumbers)                     # Remove numbers
corpus <- tm_map(corpus, removeWords, stopwords("english")) # Remove common English stopwords
corpus <- tm_map(corpus, removeWords, c("word 1", "word 2")) # Remove custom words (placeholders; replace with your own)
corpus <- tm_map(corpus, stripWhitespace)                   # Collapse extra whitespace
print(corpus[[1]]$content) # Review the first document after cleaning
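To see what the cleaning steps do to a single string, the following is a rough base R approximation. It is not how `tm` implements these transformations, it skips stopword removal, and `clean_text` is a hypothetical helper introduced only for illustration.

```r
# Base R approximation of the cleaning steps above (not the tm implementation;
# stopword removal is left out).
clean_text <- function(x) {
  x <- tolower(x)                    # lowercase
  x <- gsub("[[:punct:]]+", "", x)   # drop punctuation
  x <- gsub("[[:digit:]]+", "", x)   # drop digits
  x <- gsub("[[:space:]]+", " ", x)  # collapse runs of whitespace
  trimws(x)                          # trim ends (an extra step, not part of stripWhitespace)
}

cleaned <- clean_text("Hello, World! 123  Great   product.")
print(cleaned)
```

Removing the digits leaves a run of spaces behind, which is why the whitespace step comes last.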

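# Create a Document-Term Matrix (DTM): rows are documents, columns are terms
dtm <- DocumentTermMatrix(corpus)

# Calculate word frequencies
mat <- as.matrix(dtm) # Convert the DTM to an ordinary matrix
mat
freq <- colSums(mat)  # Named vector: total count of each term across all documents
freq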
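The frequency step can be checked by hand on a tiny matrix. In this sketch `mat_demo` and its counts are made up for illustration; it has the same layout as `as.matrix(dtm)`, with documents in rows and terms in columns.

```r
# Toy stand-in for as.matrix(dtm): 2 documents x 3 terms (made-up counts)
mat_demo <- matrix(
  c(2, 0, 1,
    1, 4, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("good", "price", "service"))
)

freq_demo <- colSums(mat_demo)                   # total count per term
top_terms <- sort(freq_demo, decreasing = TRUE)  # most frequent terms first
print(head(top_terms, 2))
```

Sorting the named vector this way is a quick check on which terms the word cloud should show when `max.words` limits the plot to the most frequent ones.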

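# Create the word cloud
set.seed(1) # Fix the random seed so the layout is reproducible
wordcloud(words = names(freq),
          freq = freq,
          min.freq = 1,          # Include terms that appear at least once
          max.words = 10,        # Plot at most the 10 most frequent terms
          random.order = FALSE,  # Place the most frequent terms at the centre
          random.color = FALSE,  # Assign colours by frequency, not at random
          rot.per = 0.2,         # Proportion of words rotated by 90 degrees
          colors = brewer.pal(4, "Dark2"))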
