Sentiment Analysis in Java - Analyzing Multisentence Text Blocks

The article discusses methods for performing sentiment analysis on multisentence text blocks in Java using Stanford CoreNLP. It highlights the importance of calculating a single sentiment score for entire text blocks, suggesting weighted averages to account for the significance of different sentences. Two approaches are presented: one focusing on the first and last sentences in product reviews, and another that increases the weight of sentences as the story progresses.

Java Magazine

Sentiment analysis in Java: Analyzing multisentence text blocks
Yuli Vasiliev | February 18, 2022

One sentence is positive. One sentence is negative. What’s the
sentiment of the entire text block?
Sentiment analysis tells you if text conveys a positive, negative, or neutral message. When applied to a
stream of social media messages from an account or a hashtag, for example, you can determine whether
sentiment is overall favorable or unfavorable. If you examine sentiments over time, you can analyze them
for trends or attempt to correlate them against external data. Based on this analysis, you could then build
predictive models.

This is the second article in a series on performing sentiment analysis in Java by using the sentiment tool
integrated into Stanford CoreNLP, an open source library for natural language processing (NLP).

In the first article, “Perform textual sentiment analysis in Java using a deep learning model,” you learned
how to use this tool to determine the sentiment of a sentence from very negative to very positive. In
practice, however, you might often need to look at a single aggregate sentiment score for the entire text
block, rather than having a set of sentence-level sentiment scores.

Here, I will describe some approaches you can use to perform analysis on an arbitrarily sized text block,
building on the Java code presented in the first article.
Scoring a multisentence text block
When you need to deal with a long, multisentence text block (such as a tweet, an email, or a product
review), you might naturally want to have a single sentiment score for the entire text block rather than
merely receiving a list of sentiment scores for separate sentences.

One simple solution is to calculate the average sentiment score for the entire text block by adding the
sentiment scores of separate sentences and dividing by the number of sentences.

However, this approach is not perfect in most cases since different sentences within a text block can affect
the overall sentiment differently. In other words, different sentences within a block may have varying
degrees of importance when you calculate the overall sentiment.

There is no single algorithm for identifying the most-important sentences that would work equally well for
all types of texts; perhaps that is why Stanford CoreNLP does not provide a built-in option for identifying
the overall sentiment of a multisentence text block.

Fortunately, you can manually code such functionality to work best for the type of text you are dealing
with. For example, text samples of the same type usually have something in common when it comes to
identifying the most-important sentences.

Imagine you’re dealing with product reviews. The most-important statements—from the standpoint of
the overall review sentiment—typically can be found at the beginning or the end of the review. The first
statement usually expresses the main idea of the review, and the last one summarizes it. While this may
not be true for every review, a significant portion of them look exactly like that. Here is an example.

I would recommend this book for anyone who wants an introduction to natural language
processing. Just finished the book and followed the code all way. I tried the code from the
resource website. I like how it is organized. Well done.

The Stanford CoreNLP sentiment classifier would identify the above sentences as follows:

Sentence: I would recommend this book for anyone who wants an introduction to natural language processing.

Sentiment: Positive(3)

Sentence: Just finished the book and followed the code all way.

Sentiment: Neutral(2)

Sentence: I tried the code from the resource website.

Sentiment: Neutral(2)

Sentence: I like how it is organized.

Sentiment: Neutral(2)

Sentence: Well done.

Sentiment: Positive(3)

As you can see, the first and the last sentences suggest that the review is positive. However, the neutral
sentences outnumber the positive ones, so a plain arithmetic mean that gives the same weight to each
sentence does not capture the overall sentiment of the review: (3 + 2 + 2 + 2 + 3) / 5 = 2.4, which rounds
to 2 (neutral). Instead, you might want to assign more weight to the first and the last sentences, as
implemented in the example discussed below.

The weighted-average approach

Continuing with the sample Java program introduced in the first article, add the following
getReviewSentiment() method to the nlpPipeline class:

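import java.util.*;
...

public static void getReviewSentiment(String review, float weight)
{
    int sentenceSentiment;
    int reviewSentimentAverageSum = 0;
    int reviewSentimentWeightedSum = 0;
    // pipeline: the StanfordCoreNLP instance created by init() (see the first article)
    Annotation annotation = pipeline.process(review);
    List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
    int numOfSentences = sentences.size();
    // Scale the relative weight to this review's length; never let it drop below 1
    int factor = Math.round(numOfSentences*weight);
    if (factor == 0) {
        factor = 1;
    }
    int divisorLinear = numOfSentences;
    int divisorWeighted = 0;

    for (int i = 0; i < numOfSentences; i++)
    {
        Tree tree = sentences.get(i).get(SentimentAnnotatedTree.class);
        sentenceSentiment = RNNCoreAnnotations.getPredictedClass(tree);
        reviewSentimentAverageSum = reviewSentimentAverageSum + sentenceSentiment;
        if (i == 0 || i == numOfSentences - 1) {
            // The first and last sentences count "factor" times
            reviewSentimentWeightedSum = reviewSentimentWeightedSum + sentenceSentiment * factor;
            divisorWeighted += factor;
        }
        else
        {
            // Every other sentence counts once
            reviewSentimentWeightedSum = reviewSentimentWeightedSum + sentenceSentiment;
            divisorWeighted += 1;
        }
    }
    System.out.println("Number of sentences:\t\t" + numOfSentences);
    System.out.println("Adapted weighting factor:\t" + factor);
    System.out.println("Weighted average sentiment:\t" + Math.round((float) reviewSentimentWeightedSum/divisorWeighted));
    System.out.println("Linear average sentiment:\t" + Math.round((float) reviewSentimentAverageSum/divisorLinear));
}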


The getReviewSentiment() method shown above illustrates how to calculate the overall sentiment of
a review using two approaches, calculating both a weighted average and the linear average for
comparison purposes.

The method takes the text of a review as the first parameter. The second parameter is a weighting factor
to apply to the first and the last sentences when calculating the overall review sentiment, passed in as a
real number in the range [0, 1]. To fit the scale to a particular review, the method multiplies the passed
value by the number of sentences in the review and rounds the result, which gives the adapted weighting
factor. If the result rounds to 0, the factor is set to 1 so that the first and last sentences still count.
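
The arithmetic behind the adapted weighting factor can be tried without CoreNLP at all. The following is a minimal sketch, not part of the original program: the class name WeightedReviewSentiment and its helper methods are hypothetical, and the sentence scores are hard-coded from the classifier output shown earlier rather than computed. It assumes a non-empty array of scores.

```java
// Hypothetical helper class for illustration; it does not call Stanford CoreNLP.
class WeightedReviewSentiment {

    // Scales the relative weight (0..1) by the sentence count; never returns less than 1.
    static int adaptedFactor(int numOfSentences, float weight) {
        int factor = Math.round(numOfSentences * weight);
        return factor == 0 ? 1 : factor;
    }

    // Weighted average in which only the first and last sentences receive the factor.
    static float weightedAverage(int[] scores, float weight) {
        int factor = adaptedFactor(scores.length, weight);
        int sum = 0;
        int divisor = 0;
        for (int i = 0; i < scores.length; i++) {
            int w = (i == 0 || i == scores.length - 1) ? factor : 1;
            sum += scores[i] * w;
            divisor += w;
        }
        return (float) sum / divisor;
    }

    // Plain arithmetic mean, for comparison.
    static float linearAverage(int[] scores) {
        int sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return (float) sum / scores.length;
    }

    public static void main(String[] args) {
        // Sentence-level scores of the sample book review (assumed, copied from the text).
        int[] scores = {3, 2, 2, 2, 3};
        System.out.println(adaptedFactor(scores.length, 0.4f));           // 2
        System.out.println(Math.round(weightedAverage(scores, 0.4f)));    // 3
        System.out.println(Math.round(linearAverage(scores)));            // 2
    }
}
```

With five sentences and a weight of 0.4, the factor becomes 2, so the weighted sum is 18 over a divisor of 7, while the linear sum is 12 over 5.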

To test the getReviewSentiment() method, use the following code:

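public class OverallReviewSentiment
{
    public static void main(String[] args)
    {
        String text = "I would recommend this book for anyone who wants an introduction to natural language processing. Just finished the book and followed the code all way. I tried the code from the resource website. I like how it is organized. Well done.";
        nlpPipeline.init();
        nlpPipeline.getReviewSentiment(text, 0.4f);
    }
}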


This example passes in 0.4 as the weighting factor, but you should experiment with the value passed in.
The higher this value, the more importance is given to the first and last sentences in the review.

To see this approach in action, recompile the nlpPipeline class and compile the newly created
OverallReviewSentiment class. Then, run OverallReviewSentiment, as follows:

$ javac nlpPipeline.java
$ javac OverallReviewSentiment.java
$ java OverallReviewSentiment

This should produce the following results:

Number of sentences:            5
Adapted weighting factor:       2
Weighted average sentiment:     3
Linear average sentiment:       2

As you can see, the weighted average (18/7, or about 2.57, rounded to 3) gives a more relevant estimate of
the overall sentiment of the review than the linear average (12/5 = 2.4, rounded to 2) does.

Sequential increases in weight ratios

When it comes to storylike texts that cover a sequence of events spread over a time span, the importance
of a sentence to the overall sentiment often increases as the story progresses. That is, the sentences
with the most influence on the overall sentiment conveyed by the story are typically found at the end,
because they describe the most-recent episodes, conclusions, or experiences.

Consider the following tweet:

The weather in the morning was terrible. We decided to go to the cinema. Had a great time.

The sentence-level sentiment analysis of this story gives the following results:

Sentence: The weather in the morning was terrible.

Sentiment: Negative(1)

Sentence: We decided to go to the cinema.

Sentiment: Neutral(2)

Sentence: Had a great time.

Sentiment: Positive(3)

Although the tweet begins with a negative remark, the overall sentiment here is clearly positive due to the
final note about time well spent at the movies. This pattern also works for reviews where customers
describe their experience with a product much like a story, as in the following example:

I love the stories from this publisher. They are always so enjoyable. But this one disappointed
me.

Here is the sentiment analysis for it:

Sentence: I love the stories from this publisher.

Sentiment: Positive(3)

Sentence: They are always so enjoyable.

Sentiment: Positive(3)

Sentence: But this one disappointed me.

Sentiment: Negative(1)

As you can see, more comments here are positive, but the entire block has an overall negative sentiment
due to the final, disapproving remark. As in the previous example, this suggests that in a text block like
this one, later sentences should be weighted more heavily than earlier ones.

For the weights, you might use the position of each sentence in the text, counting from 1, taking
advantage of the fact that a later sentence has a higher position. In other words, the importance of a
sentence increases in proportion to its position in the text.
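
To see how position-based weights shift the result, here is a minimal sketch that works on hard-coded scores instead of calling CoreNLP. The class name StorySentimentSketch and the method indexWeightedAverage() are hypothetical names chosen for illustration, and the scores are copied from the two sample analyses above.

```java
// Hypothetical helper class for illustration; it does not call Stanford CoreNLP.
class StorySentimentSketch {

    // Weights the sentence at 1-based position i by i, so later sentences count more.
    static double indexWeightedAverage(int[] scores) {
        int weightedSum = 0;
        int divisor = 0;
        for (int i = 1; i <= scores.length; i++) {
            weightedSum += scores[i - 1] * i;
            divisor += i;
        }
        return (double) weightedSum / divisor;
    }

    public static void main(String[] args) {
        int[] tweet = {1, 2, 3};       // terrible weather, cinema, great time
        int[] publisher = {3, 3, 1};   // love, enjoyable, disappointed

        // (1*1 + 2*2 + 3*3) / (1 + 2 + 3) = 14/6, about 2.33 (the plain mean is 2.0)
        System.out.println(indexWeightedAverage(tweet));
        // (3*1 + 3*2 + 1*3) / (1 + 2 + 3) = 12/6 = 2.0 (the plain mean is about 2.33)
        System.out.println(indexWeightedAverage(publisher));
    }
}
```

In both samples the weighted average moves away from the plain mean in the direction of the final sentence, which is the behavior this weighting is meant to produce.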

A matter of scale
Another important thing to decide is the scale you’re going to use for sentiment evaluation of each
sentence, as the best solution may vary depending on the type of text blocks you’re dealing with.

To evaluate tweets, for example, you might want to employ all five levels of sentiment available with
Stanford CoreNLP: very negative, negative, neutral, positive, and very positive.

When it comes to product review analysis, you might choose only two levels of sentiment—positive and
negative—rounding all other options to one of these two. Since both the negative and the positive classes
in Stanford CoreNLP are indexed with an odd number (1 and 3, respectively), you can tune the sentiment
evaluation method discussed earlier to round the weighted average being calculated to its nearest odd
integer.
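
One way to round an average to an odd class index is sketched below. This is an illustration under stated assumptions, not a CoreNLP feature: the class OddRounding and the method nearestOddClass() are hypothetical, an average of exactly 2.0 is resolved toward the positive class, and the result is clamped to 1..3 so that an average of 4.0 does not produce the nonexistent class 5.

```java
// Hypothetical helper class for illustration; it does not call Stanford CoreNLP.
class OddRounding {

    // Maps averages in [0, 2) to 1 (negative) and averages in [2, 4] to 3 (positive).
    static int nearestOddClass(double average) {
        int odd = (int) (2 * Math.floor(average / 2) + 1);
        // Clamp to the two classes used for reviews (an assumption of this sketch).
        return Math.max(1, Math.min(3, odd));
    }

    public static void main(String[] args) {
        System.out.println(nearestOddClass(1.4));   // 1
        System.out.println(nearestOddClass(2.0));   // 3
        System.out.println(nearestOddClass(4.0));   // 3
    }
}
```

Because a tie at 2.0 goes to the positive class here, you may prefer a different tie-breaking rule if neutral-looking averages should count as negative in your data.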

To try this, add the following getStorySentiment() method to the nlpPipeline class:

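public static void getStorySentiment(String story)
{
    int sentenceSentiment;
    int reviewSentimentWeightedSum = 0;
    Annotation annotation = pipeline.process(story);
    List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
    int divisorWeighted = 0;
    // Weight each sentence by its 1-based position so that later sentences count more
    for (int i = 1; i <= sentences.size(); i++)
    {
        Tree tree = sentences.get(i-1).get(SentimentAnnotatedTree.class);
        sentenceSentiment = RNNCoreAnnotations.getPredictedClass(tree);
        reviewSentimentWeightedSum = reviewSentimentWeightedSum + sentenceSentiment * i;
        divisorWeighted += i;
    }
    // NOTE: the source line is cut off after "2*Math."; the rest of this expression is a
    // reconstruction (an assumption) that rounds the weighted average to the nearest odd integer
    System.out.println("Weighted average sentiment:\t" + (double)(2*Math.floor(((double) reviewSentimentWeightedSum/divisorWeighted)/2) + 1));
}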


Test the above method with the following code:

public class OverallStorySentiment
{
    public static void main(String[] args)
    {
        String text = "The weather in the morning was terrible. We decided to go to the cinema. Had a great time.";
        nlpPipeline.init();
        nlpPipeline.getStorySentiment(text);
    }
}


Recompile nlpPipeline and compile the newly created OverallStorySentiment class, and run
OverallStorySentiment as follows:

$ javac nlpPipeline.java
$ javac OverallStorySentiment.java
$ java OverallStorySentiment

The result should look as follows:

Weighted average sentiment: 3.0

This test uses a single sample text to test the sentiment-determining method discussed here. For an
example of how to perform such a test against a set of samples, refer back to the first article in this series.

Conclusion
This article looked at two methods of calculating the overall sentiment of a multisentence text block. Both
methods assume that different sentences within a text block can affect the overall sentiment differently.

The first method determines the sentiment of customer reviews and is based on the observation that
the most-significant comments in a product review are typically at the beginning and the end.

The second method calculates the overall sentiment by increasing the weight of each sentence as you
move from the beginning to the end of the text. This method may work fine for storylike texts where
the importance of sentences typically increases as the story progresses.

You can (and should) experiment with these and other methods to find the approach that best models the
type of text in your business case.

The final article of this series will show how to train the Stanford CoreNLP sentiment tool with your own
data to understand domain-specific phrases.

Dig deeper
Perform textual sentiment analysis in Java using a deep learning model

Natural language processing at your fingertips with OCI Language

How to program machine learning in Java with the Tribuo library

Performing sentiment analysis using Oracle Text

Yuli Vasiliev

Yuli Vasiliev is a programmer, freelance author, and consultant currently specializing in open source
development; Oracle database technologies; and, more recently, natural-language processing (NLP).

