Text Mining and Sentiment Assignment

The document outlines a process for text mining and sentiment analysis using R programming. It includes instructions for installing necessary packages, loading data, and performing text preprocessing steps such as removing stopwords and punctuation. Additionally, it demonstrates how to visualize word frequencies and analyze sentiment using the 'syuzhet' method.

Uploaded by

Shubham Parida

TEXT MINING AND SENTIMENT ANALYSIS

SUBMITTED BY –

# Install the required packages (only needed once per machine).
install.packages(c("tm", "wordcloud", "tidyverse", "lubridate", "tidytext",
                   "dplyr", "sentimentr", "SnowballC", "RColorBrewer",
                   "syuzhet"))
install.packages("ggplot2")

library("tm")
library("wordcloud")
library("tidytext")
library("dplyr")
library("sentimentr")
library("SnowballC")
library("syuzhet")

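# Pick the CSV file interactively; keep text columns as character vectors.
data <- read.csv(file.choose(), header = TRUE, stringsAsFactors = FALSE)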

summary(data)
length(data)
colnames(data)
head(data)
sum(is.na(data))

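# Build the corpus from the review text column, one document per review.
# (VectorSource(data) would treat each column of the data frame as a document.)
TextDoc <- Corpus(VectorSource(data$reviewText))

TextDoc
# Helper that replaces every match of `pattern` with a space.
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))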

# Replace separator characters with spaces.
TextDoc <- tm_map(TextDoc, toSpace, "/")
TextDoc <- tm_map(TextDoc, toSpace, "@")
TextDoc <- tm_map(TextDoc, toSpace, "\\|")
# Normalize case, then drop numbers, stopwords, and punctuation.
TextDoc <- tm_map(TextDoc, content_transformer(tolower))
TextDoc <- tm_map(TextDoc, removeNumbers)
TextDoc <- tm_map(TextDoc, removeWords, stopwords("english"))
# Custom stopwords specific to this analysis.
TextDoc <- tm_map(TextDoc, removeWords, c("s", "company", "team"))
TextDoc <- tm_map(TextDoc, removePunctuation)
TextDoc <- tm_map(TextDoc, stripWhitespace)
# Reduce words to their stems (e.g. "story" becomes "stori").
TextDoc <- tm_map(TextDoc, stemDocument)
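
To see roughly what the cleaning steps above do to a single piece of text, the sketch below applies base R equivalents to one string. The review text is hypothetical, and the base R calls only approximate the tm transformations (stopword removal and stemming are left out).

```r
# Hypothetical review text, not taken from the assignment's data set.
review <- "Great book/story @ 5 stars | I'd read it AGAIN!!"

clean <- gsub("[/@|]", " ", review)       # same idea as toSpace for "/", "@", "|"
clean <- tolower(clean)                   # lower-case everything
clean <- gsub("[[:digit:]]+", "", clean)  # drop numbers
clean <- gsub("[[:punct:]]+", "", clean)  # drop punctuation
clean <- gsub("\\s+", " ", clean)         # collapse runs of whitespace
clean <- trimws(clean)                    # remove leading/trailing spaces

print(clean)
```

Punctuation is deleted rather than replaced with a space, so "I'd" becomes "id"; this is the same reason "s" appears in the custom stopword list.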

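# Term-document matrix: rows are terms, columns are documents.
TextDoc_dtm <- TermDocumentMatrix(TextDoc)
dtm_m <- as.matrix(TextDoc_dtm)

# Total frequency of each term across all documents, most frequent first.
dtm_v <- sort(rowSums(dtm_m), decreasing = TRUE)
dtm_d <- data.frame(word = names(dtm_v), freq = dtm_v)

head(dtm_d, 5)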
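
The frequency table built above can be illustrated without tm. The following is a minimal base R sketch on three hypothetical, already-cleaned documents; the names `docs`, `tokens`, and `freq_df` are illustrative and do not appear in the assignment.

```r
# Hypothetical, already-cleaned documents (illustrative only).
docs <- c("book stori read", "read book like", "book one")

# Split every document into words and pool them into one vector.
tokens <- unlist(strsplit(docs, " ", fixed = TRUE))

# Count each distinct word and sort from most to least frequent.
freq <- sort(table(tokens), decreasing = TRUE)

# Same shape as dtm_d: one row per word with its total count.
freq_df <- data.frame(word = names(freq), freq = as.integer(freq),
                      stringsAsFactors = FALSE)
print(head(freq_df, 5))
```

Summing the rows of the term-document matrix gives the same totals as pooling all words and counting them, which is why the two approaches agree.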

# Bar chart of the five most frequent terms.
barplot(dtm_d[1:5, ]$freq, las = 2, names.arg = dtm_d[1:5, ]$word,
        col = "lightgreen", main = "Top 5 most frequent words",
        ylab = "Word frequencies")

# Pie chart of five terms, with counts and stems entered by hand.
x <- c(15696, 11329, 10921, 6580, 6019)
labels <- c("book", "stori", "read", "like", "one")
pie(x, labels, main = "Pie Chart of the Top 5 Most Frequent Words",
    col = rainbow(length(x)))
legend("topright", labels, cex = 0.8, fill = rainbow(length(x)))

set.seed(1234)
wordcloud(words = dtm_d$word, freq = dtm_d$freq, min.freq = 5,
max.words=100, random.order=FALSE, rot.per=0.40,
colors=brewer.pal(8, "Dark2"))

syuzhet_vector <- get_sentiment(data$reviewText, method = "syuzhet")
head(syuzhet_vector)
summary(syuzhet_vector)
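
One possible follow-up, not part of the original assignment, is to turn the numeric scores into labels by their sign. The sketch below assumes that positive scores indicate positive sentiment and negative scores indicate negative sentiment; the `scores` vector is hypothetical and stands in for `syuzhet_vector`.

```r
# Hypothetical scores standing in for syuzhet_vector (not real results).
scores <- c(1.25, -0.50, 0, 2.10, -1.75)

# Label each score by its sign; exactly zero is treated as neutral.
sentiment_label <- ifelse(scores > 0, "positive",
                          ifelse(scores < 0, "negative", "neutral"))

# Count how many documents fall into each class.
print(table(sentiment_label))
```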
