Module 6 - Social Media Analytics and Text Mining.

Uploaded by

sonia

Social Media Analytics and Text Mining

Introducing Social Media
• Social media provides a collaborative environment which can be
employed for:
•  Building relationships
•  Distributing content
•  Rating products and services
•  Engaging target audiences
Social Media Categories
• Social networking websites―These provide a Web-based platform to
users where they can create a personalized profile summarizing and
showcasing their interests; define other members as connections or
contacts; and communicate and share content with their contacts.
Examples of social networking websites include Facebook, LinkedIn,
MySpace, Hi5, and Bebo.
• Blogs―Short for ‘Web logs’, blogs represent online journals to
showcase the content organized in the reverse chronological order.
Examples of blogging sites include Blogger, WordPress, and Tumblr.
• Microblogs―These allow people to share and showcase small posts
and are suitable for quick sharing of content in a few lines of text or an
individual photo or video. Twitter is a well-known microblogging
website.
• Content communities and media sharing sites―These allow users to
organize and share different types of media content such as videos and
images. The members can also comment on the shared content.
Examples include YouTube, Pinterest, Flickr, and Instagram.
• Wiki―It represents a collective website in which the members can
create and modify content in a community-based database. In other
words, the users can modify the content of any hosted page and can
also create new pages in the website based on the wiki technology. One
of the most popular examples of Wiki websites is Wikipedia, which is
an online encyclopedia.
• Social bookmarking websites―These websites allow users to organize
and manage tags and links to other websites. Well-known examples
include Reddit, StumbleUpon, and Digg.
Introducing Key Elements of Social Media
• Collect―In order to effectively incorporate social media, the business
organizations first need to understand how to collect and leverage useful
information and market artifacts. This involves critical and careful analysis of the
information coming from various sources such as customers, competitors,
journalists, and other market influencers. Various tools such as feed readers, blog
subscriptions, and email newsletters can be employed to collect information from
various sources.
• Curate―Once the information is collected from various sources, the next step is to
effectively curate the important information to be sent to the clients and internal
stakeholders. This involves intelligent filtering and aggregation of the information
collected from various resources. This not only provides an effective insight to the
customers, but also helps in having a clear vision about the current industry
standards and market trends. Various curation tools such as Newsle, LinkedIn, and
RSS readers can be employed for this task.
• Create―After collection and curation of the collected information, the
organizations need to create valuable content objects that can provide
a focus and industry buzz to them. This is an effective marketing
strategy to create a leader position in the industry. This can be
accomplished by employing various publishing programs and sharing
routines.
• Share―A key element for implementing effective social media is
sharing of information. This involves sharing your content, information,
and ideas with others. This helps in expanding the social media
network. Various tools such as Feedly and Hootsuite help in sharing
information and content over the social media.
• Engage―The basic idea behind social media is to engage the existing
and prospective customers. The tools and routines of social media and
the regular practice of listening, curation, and sharing help executives
and sales personnel of an organization in engaging more and more
customers, stakeholders, prospective customers, journalists, and
industry influencers. Tools such as Salesforce help to connect people
from different categories. Apart from that, various mobile apps also
help in expanding the realm of reaching more and more people.
Introducing Text Mining
• Text mining or text analytics comes as a handy tool to quantitatively examine the
text generated by social media and filtered in the form of different clusters,
patterns, and trends.
• In other words, text mining represents the set of tools, techniques, and methods
applied for automatically processing natural language textual data provided in
huge amounts in the form of computer files.
• The extracted and structured content and themes are used for rapid analysis,
identification of hidden data and information, and for automatic decision-making.
Text mining tools are often based on the principles of information retrieval and
natural language processing.
• Complex linkage structure makes text mining in social networks a challenging job,
requiring the help of automated tools and sorting techniques.
• A number of text mining tools and algorithms have been developed to enable easy
extraction of information from different textual resources.
• The recent developments in statistical and data processing tools have added to the
evolution in the domain of text mining.
• Text mining employs the concepts obtained from various fields ranging from
linguistics and statistics to Information and Communication Technologies (ICT).
• Statistical pattern learning is applied to create patterns from the extracted text,
which are further examined to obtain valuable information.
• The overall process of text mining comprises retrieval of information, lexical
analysis, creation and recognition of patterns, tagging, extraction of information,
application of data mining techniques, and predictive analytics. This can be
summarized as follows: Text mining = Lexicometry + Data mining
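The lexicometry half of that summary can be sketched as a simple term count. This is only a minimal illustration in Python, not a method prescribed by these slides; the sample posts and the helper name term_frequencies are invented for the example.

```python
import re
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Lexical analysis: lowercase the text, split it into word tokens, count them."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample posts, invented for this illustration.
posts = [
    "Great phone, great battery",
    "Battery drains fast",
]

counts = term_frequencies(" ".join(posts))
# The most frequent terms are a natural input for later data mining steps.
top_terms = counts.most_common(2)
```

The resulting counts are the kind of structured data that data mining techniques such as clustering or classification can then work on.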
Application Areas
• Text mining can be applied in the following areas:
• Competitive intelligence―In order to succeed, business
organizations not only need to know about the key players in the
industry, but also about the strengths and weaknesses of their
competitors. Text mining provides factual data to organizations that
can be applied for strategic decision making.
• Community leveraging―Text mining facilitates the identification
and extraction of the information embedded in community
interaction. This information can be applied for amending marketing
strategies.
• Law enforcement―Text mining can be applied in the domain of
government intelligence for countering terrorist activities.
• Life sciences―Text mining can also be effectively applied in the
area of research and development of drugs. Bioinformatics
companies, such as PubGene, are applying biomedical text mining
combined with network visualization as an Internet service.
Key Steps in the Text Mining Process
• 1. Extracting the keyword―Any text analysis process begins by
identification of relevant and precise keyword(s) that can be applied for
specific queries. Next the content and the linkage patterns are
considered for applying keyword searches since the content related to
similar keywords is often linked. The selected keywords act as social
network nodes and play an important role while clustering the text.
• 2. Classifying and clustering the text―Various algorithms are applied
for classifying text from the source content. For this process, the nodes
are associated with labels prior to classification. After that, the
classified text is clustered on the basis of similarity. The classification
and clustering of the text are greatly influenced by the linkage
structure of data. Accurate results can be obtained by applying node
labeling and content-based classification techniques.
• 3. Identifying patterns―Trend analysis applies the principle that even
for the same content, the clusters collected at different nodes can have
different concept distributions. For this reason, the concepts at various
nodes are compared and classified accordingly in the same or different
subcollections. Obtaining desired results for a specific query involves
careful processing of the relevant document. For effective text mining,
several stages of processing need to be applied on a document, such as:


Stages of the Text Mining Process
• Text preprocessing―This involves the identification of all the unique
words in a document. Non-informative words (stop words), such as the,
and, or, and when, are filtered out from the document text before
applying word stemming. Word stemming refers to the process of
reducing inflected or derived words to their stem or base form. For
example, words such as cat, cats, catlike, and catty will all be mapped
to the same stem ‘cat’. Programs that perform stemming are called
stemmers or stemming algorithms; the two terms are used
interchangeably. Affix stemmers trim both prefixes and suffixes, such
as the suffixes ed, ly, and ing, from a given word. Popular stemming
approaches include the Brute Force algorithm and the Suffix Stripping
algorithm.
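The preprocessing stage can be sketched in Python as stop-word filtering followed by naive suffix stripping. This is a toy sketch under stated assumptions: the stop-word set, the suffix list, and the minimum stem length of three letters are all chosen for illustration (the suffix list is tuned so the cat example works) and do not come from any real stemming library.

```python
import re

# Illustrative stop words; real lists are much longer.
STOP_WORDS = {"the", "and", "or", "when", "a", "of", "is"}

# Illustrative suffixes, checked in this order (longest first).
SUFFIXES = ("like", "ing", "ed", "ly", "ty", "s")


def strip_suffix(word: str) -> str:
    """Remove the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem the remaining words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [strip_suffix(t) for t in tokens if t not in STOP_WORDS]


stems = preprocess("the cats and the catlike kitten")
```

With these toy rules, cat, cats, catlike, and catty all reduce to the stem cat. A production system would normally use an established stemmer rather than a hand-written suffix list.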

• Document representation―A document is represented by the words and terms it
contains.
• Document retrieval―This involves the retrieval of a document based on some
query. Accurate results are ensured using text indexing and accuracy measures.
Text indexing and searching capabilities can be incorporated in an application
using Lucene which is a Java library.
• Document clustering―This involves the grouping of conceptually related
documents to ensure fast retrieval. A term for a given query can be searched
faster from the well-clustered documents. Document clustering can be
implemented using the following techniques:
• Hierarchical clustering
• One-pass clustering
• Buckshot clustering
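Document retrieval and one-pass clustering can be illustrated with a short Python sketch. It is an assumption-laden toy, not the Lucene API: documents are treated as sets of words, the inverted index maps each term to document ids, and the clustering step uses Jaccard similarity against the first document of each cluster with an arbitrary threshold of 0.3. The sample documents are invented.

```python
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Represent a document as the set of its lowercase words."""
    return set(text.lower().split())


def build_index(docs: list[str]) -> dict[str, set[int]]:
    """Inverted index: term -> ids of the documents that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of terms two documents have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def one_pass_cluster(docs: list[str], threshold: float = 0.3) -> list[list[int]]:
    """Visit each document once; join the most similar cluster or start a new one."""
    clusters: list[list[int]] = []
    representatives: list[set[str]] = []  # term set of each cluster's first document
    for doc_id, text in enumerate(docs):
        terms = tokenize(text)
        best, best_sim = None, 0.0
        for i, rep in enumerate(representatives):
            sim = jaccard(terms, rep)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= threshold:
            clusters[best].append(doc_id)
        else:
            clusters.append([doc_id])
            representatives.append(terms)
    return clusters


# Hypothetical documents, invented for this illustration.
docs = [
    "battery life is short",
    "short battery life again",
    "camera lens quality",
    "great camera lens",
]
index = build_index(docs)
clusters = one_pass_cluster(docs)
```

Looking up index["battery"] returns the ids of matching documents without scanning every text, and the clusters group the battery documents apart from the camera documents.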
Text Mining Process
• Both structured and unstructured data are involved in text mining.
Unstructured data comes from reviews and summaries while the
structured data is obtained from organized spreadsheets.
• Text mining tools identify themes, patterns, and insights hidden in the
structured as well as unstructured data. Organizations employ various text
mining software packages for different data mining applications.
Text Mining Software
• The following are some commonly used text mining software packages:
• R―Used for statistical data analysis, text processing, and sentiment
analysis
• ActivePoint―Applied for natural language processing and online
catalog-based contextual search
• Attensity―Used for extraction of facts including who, what, where,
and why and then identifying people, places, and events and how
they are related
• Crossminder―Applied for cross-lingual text analytics
• Compare Suite―Used for comparing texts by keywords and
highlighting common and unique keywords
• IBM SPSS Predictive Analytics Suite―Applied for data
and text mining
• Monarch―Applied for analysis and transformation of reports into
live data
• SAS Text Miner―Provides a rich suite of text processing and
analysis tools
• Textalyzer―Used for online text analysis
• Apart from these, some other text mining tools are also available.
Sentiment Analysis
• Sentiment analysis is one of the most important components of text mining. Also
termed as opinion mining, it involves careful analysis of people’s opinions,
sentiments, attitudes, appraisals, and evaluations.
• This is accomplished by examining large amounts of unstructured data obtained
from the Internet on the basis of positive, negative, or neutral view of the end user.
• Sentiment analysis involves the analysis of the following kinds of sentences:
• Facts―Product A is better than product B.
• Opinions―I don’t like A. I think B is better in terms of durability.
• Similar to Web analysis, specific queries are applied in sentiment analysis to
retrieve and rank relevant content.
• However, sentiment analysis also differs from Web analysis in certain respects.
Sentiment analysis can determine whether the content expresses an opinion on
the topic, and also whether that opinion is positive or negative.
• Ranking in Web analysis is done on the basis of the frequency of keywords.
• On the other hand, ranking in sentiment analysis is done on the basis of polarity of
the attitude.
• With the widespread use of Web 2.0 technologies, a huge volume of opinionated
data is available on the social media.
• People using social media put their reviews and comments about products used
and also share their feedback, opinions and experiences with others in their
network.
• These reviews and feedback are utilized by organizations to improve and upgrade
their products and services, and enhance their brand equity.
• Sentiment analysis draws on other domains such as linguistics, digital technologies,
text analysis tools, artificial intelligence, and Natural Language Processing (NLP) for
identification and extraction of useful information.
• This greatly influences various domains ranging from politics and science to social
science.
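The difference between ranking by keyword frequency and ranking by polarity can be sketched in Python. This is only an illustration: the positive and negative word lists are a hypothetical mini-lexicon, the reviews are invented, and real systems use far richer scoring.

```python
# Hypothetical mini-lexicon, invented for this illustration.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def keyword_frequency(text: str, keyword: str) -> int:
    """Web-analysis style score: how often the keyword occurs."""
    return text.lower().split().count(keyword)


def polarity(text: str) -> int:
    """Sentiment style score: positive words minus negative words."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Hypothetical reviews, invented for this illustration.
reviews = [
    "phone phone phone specs listed",
    "great phone love it",
    "bad phone poor battery",
]

by_frequency = sorted(reviews, key=lambda r: keyword_frequency(r, "phone"), reverse=True)
by_polarity = sorted(reviews, key=polarity, reverse=True)
```

The first review tops the frequency ranking because it repeats the keyword, while the polarity ranking puts the favorable review first and the unfavorable one last.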
• The process of sentiment analysis begins by tagging words using Parts of Speech
(POS) labels such as subject, verb phrase, verb, noun phrase, determiner, and
preposition.
• Defined patterns are filtered to identify their sentiment orientation. For example,
‘beautiful room’ has an adjective followed by noun.
• The adjective ‘beautiful’ indicates a positive perspective about the noun ‘room’.
• At this stage, the emotional factor in the phrase is also examined and analyzed.
After that, an average sentiment orientation of all the phrases is computed and
analyzed to conclude if a product is recommended by a user.
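The adjective-plus-noun pattern and the averaging step can be sketched in Python. This is a minimal sketch under loud assumptions: the POS tags come from a tiny hand-written dictionary rather than a real tagger, the adjective scores are invented, and "recommended" is taken to mean an average orientation above zero.

```python
# Hypothetical hand-written POS tags; a real system would use a trained tagger.
POS_TAGS = {
    "beautiful": "ADJ", "friendly": "ADJ", "noisy": "ADJ",
    "room": "NOUN", "staff": "NOUN", "street": "NOUN",
    "a": "DET", "the": "DET",
}

# Hypothetical sentiment orientation of each adjective.
ADJ_SCORE = {"beautiful": 1.0, "friendly": 1.0, "noisy": -1.0}


def adjective_noun_phrases(text: str) -> list[tuple[str, str]]:
    """Find every adjective that is immediately followed by a noun."""
    words = text.lower().split()
    tags = [POS_TAGS.get(w, "OTHER") for w in words]
    return [
        (words[i], words[i + 1])
        for i in range(len(words) - 1)
        if tags[i] == "ADJ" and tags[i + 1] == "NOUN"
    ]


def average_orientation(text: str) -> float:
    """Average the adjective scores over all matched phrases (0.0 if none)."""
    phrases = adjective_noun_phrases(text)
    if not phrases:
        return 0.0
    return sum(ADJ_SCORE.get(adj, 0.0) for adj, _ in phrases) / len(phrases)


# Hypothetical review text, invented for this illustration.
review = "a beautiful room and friendly staff but a noisy street"
phrases = adjective_noun_phrases(review)
score = average_orientation(review)
recommended = score > 0  # assumed decision rule
```

Here two positive phrases outweigh one negative phrase, so the average orientation is positive and the review counts as a recommendation under the assumed rule.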
• The following parameters may be applied to classify the given text in the process of
sentiment analysis:
• Polarity, which can be positive, negative, or neutral
• Emotional states, which can be sad, angry, or happy
• Scaling system or numeric values
• Subjectivity or objectivity
• Features based on key entities such as durability of furniture, screen size of a cell
phone, and lens quality of a camera
Online Tools for Sentiment Analysis
• Topsy―It is used to measure success of a Website on Twitter. It tracks the
occurrence of given and related keywords, website name, and website URL in
tweets.
• BackTweets―This tool is applied to improve search engine ranking of a website. It
tracks tweets that link back to a website.
• Twitterfall―It locates tweets that are important for a website. It can be used to
stay in touch with the customers and consumers, and respond to their queries and
suggestions in real time.
• TweetBeep― This is used to send timely updates or alerts for the topics of
interest.
• Reachli―This tool is designed especially for Pinterest, which is a content sharing
website. It helps in tracking data and in scheduling and organizing pins (the updates
in Pinterest) in advance.
THANK YOU