Sentiment Analysis or Opinion Mining: Project Synopsis

This document provides an overview of a project on sentiment analysis and opinion mining. The project aims to develop a system that can automatically analyze opinions expressed in texts about various entities such as products, services, topics, individuals, organizations and events. It outlines an approach involving identifying object features that opinions are expressed about, determining the semantic orientation (positive, negative or neutral) of opinions on each feature, and grouping synonyms of features. The document defines key concepts and presents a model for feature-based opinion mining involving extracting opinion tuples of the form (opinion holder, object, feature, semantic orientation) from texts. It identifies the main technical problems as feature extraction, opinion orientation classification and synonym grouping.

Uploaded by

Harvinder Singh
Copyright
© Attribution Non-Commercial (BY-NC)

Project Synopsis

Sentiment Analysis or Opinion Mining


Team:
Project Leader:
Mr. Ravikant Verma

Team Members:
Prayas Suneja (081235)
Harvinder Singh Bhatia (081236)
Dhawal Keswani (081241)
Gagan Dwivedi (081242)

Overview:
Textual information in the world can be broadly classified into two main categories: facts and opinions. Facts are objective statements about entities and events in the world. Opinions are subjective statements that reflect people's sentiments or perceptions about those entities and events. Most existing research on text information processing has focused almost exclusively on mining and retrieving factual information, e.g., information retrieval, web search and many other text mining and natural language processing tasks. Little work was done on the processing of opinions until recently. Yet opinions are so important that whenever one needs to make a decision, one wants to hear others' opinions. This is true not only for individuals but also for organizations.

Finding opinion sources and monitoring them on the Web, however, can be a formidable task, because a large number of diverse sources exist and each source contains a huge volume of information. In many cases, opinions are hidden in long forum posts and blogs. It is very difficult for a human reader to find relevant sources, extract pertinent sentences, read them, summarize them and organize them into usable forms. An automated opinion mining and summarization system is thus needed. Opinion mining, also known as sentiment analysis, grows out of this need.

Approach:
Model of Opinion Mining
In general, opinions can be expressed on anything, e.g., a product, a service, a topic, an individual, an organization, or an event. The general term "object" is used to denote the entity that has been commented on. An object has a set of components (or parts) and a set of attributes. Each component may also have its own sub-components and its own set of attributes, and so on. Thus, the object can be hierarchically decomposed based on the part-of relationship.

Definition (object): An object O is an entity which can be a product, topic, person, event, or organization. It is associated with a pair, O: (T, A), where T is a hierarchy or taxonomy of components (or parts) and sub-components of O, and A is a set of attributes of O. Each component has its own set of sub-components and attributes.

In this hierarchy or tree, the root is the object itself. Each non-root node is a component or sub-component of the object. Each link is a part-of relationship. Each node is associated with a set of attributes. An opinion can be expressed on any node and any attribute of the node. However, a hierarchical representation is probably too complex for an ordinary user. To simplify it, the tree is flattened. The word "features" is used to represent both components and attributes. Using features for objects (especially products) is quite common in practice. Note that in this definition the object itself is also a feature, namely the root of the tree.

Let an evaluative document be d, which can be a product review, a forum post or a blog that evaluates a particular object O. In the most general case, d consists of a sequence of sentences d = (s1, s2, ..., sm).

Definition (opinion passage on a feature): The opinion passage on a feature f of the object O evaluated in d is a group of consecutive sentences in d that expresses a positive or negative opinion on f.

This means that a sequence of sentences (at least one) may together express an opinion on an object or a feature of the object. It is also possible for a single sentence to express opinions on more than one feature, e.g., "The picture quality of this camera is good, but the battery life is short."

Definition (opinion holder): The holder of a particular opinion is the person or organization that holds the opinion.

In the case of product reviews, forum postings and blogs, opinion holders are usually the authors of the posts. Opinion holders are important in news articles because such articles often explicitly state the person or organization that holds a particular opinion. For example, the opinion holder in the sentence "John expressed his disagreement on the treaty" is John.

Definition (semantic orientation of an opinion): The semantic orientation of an opinion on a feature f states whether the opinion is positive, negative or neutral.

Putting things together, a model for an object and a set of opinions on the features of the object can be defined, which is called the feature-based opinion mining model.
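The object definition and the flattening step can be illustrated with a minimal Python sketch. The class name Component, the function flatten and the camera example are hypothetical and are not part of the synopsis; they only show one way a part-of tree with attributes could be collapsed into a flat feature list whose first entry is the object itself.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A node in the part-of hierarchy T, carrying its own attribute set."""
    name: str
    attributes: list[str] = field(default_factory=list)
    parts: list["Component"] = field(default_factory=list)


def flatten(node: Component) -> list[str]:
    """Collect the node, its attributes and all sub-components as flat features."""
    features = [node.name] + list(node.attributes)
    for part in node.parts:
        features.extend(flatten(part))
    return features


# Hypothetical example object; component and attribute names are made up.
camera = Component(
    name="camera",
    attributes=["price", "weight"],
    parts=[
        Component("lens", attributes=["zoom"]),
        Component("battery", attributes=["battery life"]),
    ],
)

features = flatten(camera)
print(features)
```

After flattening, components such as "lens" and attributes such as "zoom" sit side by side in one list, which is the simplification the text describes.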

Methodology
Model of Feature-Based Opinion Mining:
An object O is represented with a finite set of features, F = {f1, f2, ..., fn}, which includes the object itself. Each feature fi ∈ F can be expressed with a finite set of words or phrases Wi, which are synonyms. That is, there is a set of corresponding synonym sets W = {W1, W2, ..., Wn} for the n features. In an evaluative document d which evaluates object O, an opinion holder j comments on a subset of the features Sj ⊆ F. For each feature fk ∈ Sj that opinion holder j comments on, he/she chooses a word or phrase from Wk to describe the feature, and then expresses a positive, negative or neutral opinion on fk. The opinion mining task is to discover all these hidden pieces of information from a given evaluative document d.
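As a rough illustration of the model, the synonym sets W can be held in a dictionary and inverted to find which features a text comments on. Everything in this Python sketch is an assumption made for illustration: the synonym entries are invented, and plain substring matching stands in for real feature extraction (it would, for example, wrongly match "power" inside "powerful").

```python
# Hypothetical synonym sets W_i, keyed by a canonical feature name f_i.
SYNONYM_SETS = {
    "picture quality": {"picture quality", "image quality", "photo quality"},
    "battery life": {"battery life", "battery", "power"},
    "camera": {"camera"},  # the object itself is also a feature
}

# Invert W into a lookup table from surface expression to feature.
EXPRESSION_TO_FEATURE = {
    expression: feature
    for feature, expressions in SYNONYM_SETS.items()
    for expression in expressions
}


def commented_features(text: str) -> set[str]:
    """Return S_j: the features whose synonyms appear in the text."""
    lowered = text.lower()
    return {
        feature
        for expression, feature in EXPRESSION_TO_FEATURE.items()
        if expression in lowered
    }


review = "The image quality is great, but the battery drains fast."
print(sorted(commented_features(review)))
```

Here "image quality" and "battery" are different surface forms chosen by the opinion holder, and both are mapped back to their canonical features.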

Mining output: Given an evaluative document d, the mining result is a set of quadruples. Each quadruple is denoted by (H, O, f, SO), where H is the opinion holder, O is the object, f is a feature of the object and SO is the semantic orientation of the opinion expressed on feature f in a sentence of d. Neutral opinions are ignored in the output as they are not usually useful.

Given a collection of evaluative documents D containing opinions on an object, three main technical problems can be identified (clearly there are more):
Problem 1: Extracting object features that have been commented on in each document d ∈ D.
Problem 2: Determining whether the opinions on the features are positive, negative or neutral.
Problem 3: Grouping synonyms of features (as different opinion holders may use different words or phrases to express the same feature).
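The mining output can be pictured as a list of (H, O, f, SO) records. The sketch below is only an assumed representation in Python: the Opinion type, the helper names and the sample records are hypothetical, and the per-feature count is an added convenience rather than something the synopsis specifies.

```python
from collections import Counter
from typing import NamedTuple


class Opinion(NamedTuple):
    holder: str       # H
    obj: str          # O
    feature: str      # f
    orientation: str  # SO: "positive", "negative" or "neutral"


def mining_output(opinions):
    """Drop neutral opinions, since the model's output ignores them."""
    return [o for o in opinions if o.orientation != "neutral"]


def summarize(opinions):
    """Count opinions per (feature, orientation) pair."""
    return Counter((o.feature, o.orientation) for o in opinions)


# Hypothetical extracted opinions.
raw = [
    Opinion("reviewer_1", "camera", "picture quality", "positive"),
    Opinion("reviewer_1", "camera", "battery life", "negative"),
    Opinion("reviewer_2", "camera", "picture quality", "positive"),
    Opinion("reviewer_2", "camera", "weight", "neutral"),
]

result = mining_output(raw)
summary = summarize(result)
print(summary[("picture quality", "positive")])
```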

Roadmap
Classifying evaluative texts at the document level or the sentence level does not tell what the opinion holder likes and dislikes. A positive document on an object does not mean that the opinion holder has positive opinions on all aspects or features of the object. Likewise, a negative document does not mean that the opinion holder dislikes everything about the object. In an evaluative document (e.g., a product review), the opinion holder typically writes about both positive and negative aspects of the object, although the general sentiment on the object may be positive or negative. To obtain such detailed aspects, it is necessary to go to the feature level. Based on the model presented earlier, three key mining tasks are:

1. Identifying object features: For instance, in the sentence "The picture quality of this camera is amazing", the object feature is "picture quality". Both supervised pattern mining and unsupervised methods have been proposed for this task. The unsupervised technique basically finds frequent nouns and noun phrases as features, which are usually genuine features. Clearly, many information extraction techniques are also applicable, e.g., conditional random fields (CRF), hidden Markov models (HMM), and many others.

2. Determining opinion orientations: This task determines whether the opinions on the features are positive, negative or neutral. In the above sentence, the opinion on the feature "picture quality" is positive. Again, many approaches are possible. A lexicon-based approach has been shown to perform quite well. It basically uses the opinion words and phrases in a sentence to determine the orientation of an opinion on a feature. A relaxation-labeling-based approach has also been proposed. Clearly, various types of supervised learning are possible approaches as well.

3. Grouping synonyms: As the same object features can be expressed with different words or phrases, this task groups those synonyms together. Not much research has been done on this topic.
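To make the lexicon-based idea in task 2 concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method of any cited work: the word lists are tiny and invented, the feature list is assumed to be already known from task 1, a sentence is split into clauses only at "but", ";" and ".", and a negation word simply flips the next opinion word.

```python
import re

# Tiny hypothetical opinion lexicon; a real one would hold far more entries.
POSITIVE = {"good", "amazing", "great", "excellent"}
NEGATIVE = {"short", "poor", "bad", "terrible"}
NEGATIONS = {"not", "never", "no"}


def clause_orientation(clause: str) -> str:
    """Sum opinion-word scores in a clause; a negation flips the next opinion word."""
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", clause.lower()):
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1 or 0
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def feature_orientations(sentence: str, features: list[str]) -> dict[str, str]:
    """Assign each feature the orientation of the clause that mentions it."""
    clauses = re.split(r",?\s+but\s+|[;.]", sentence.lower())
    result = {}
    for clause in clauses:
        for feature in features:
            if feature in clause:
                result[feature] = clause_orientation(clause)
    return result


sentence = "The picture quality of this camera is good, but the battery life is short."
print(feature_orientations(sentence, ["picture quality", "battery life"]))
```

Splitting at "but" is what lets the single example sentence yield a positive opinion on one feature and a negative opinion on the other. A fuller system would need a proper clause and negation treatment.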
