Chapter 5 Predictive Analytics II: Text, Web, and Social Media Analytics

Chapter 5 discusses the significance of text analytics and text mining in extracting valuable insights from unstructured text data, which can lead to better decision-making and competitive advantages for businesses. It differentiates between text analytics, which focuses on quantitative results and pattern recognition, and text mining, which aims to discover qualitative knowledge from textual sources. The chapter also covers the process of sentiment analysis, its applications in various fields, and the steps involved in analyzing sentiments expressed in text.

Uploaded by

stanspatch

CHAPTER 5

PREDICTIVE ANALYTICS II: TEXT, WEB, AND SOCIAL MEDIA ANALYTICS

LO 1 Describe text analytics and understand the need for text mining.
Knowledge is power in today’s business world, and knowledge is derived from data and
information. Businesses that effectively and efficiently tap into their text data sources
will therefore have the knowledge they need to make better decisions, giving them a
competitive advantage over businesses that lag behind.
TEXT ANALYTICS is the automated process of translating large volumes of unstructured text
into quantitative data to uncover insights, trends, and patterns. Combined with data
visualization tools, this technique enables companies to understand the story behind the
numbers and make better decisions.
Text analytics uses many linguistic, statistical, and machine learning techniques. It
involves retrieving information from unstructured data, structuring the input text to
derive patterns and trends, and evaluating and interpreting the output data. It also
involves lexical analysis, categorisation, clustering, pattern recognition, tagging,
annotation, information extraction, link and association analysis, and visualisation.

@study_ingmadesimple Luca du Toit

Application areas of text mining:
►information extraction – text mining can identify key phrases and relationships within text
by looking for predefined objects and sequences in text by way of pattern matching
►topic tracking – based on a user profile and documents that a user views, text mining
can predict other documents of interest to the user
►summarisation – text mining can summarise a document which saves time on the part
of the reader
►categorisation – text mining can identify the main themes of a document and then
place the document into a predefined set of categories based on those themes
►clustering – text mining can group similar documents without having a predefined set of
categories
►concept linking – text mining can connect related documents by identifying their shared
concepts, thereby helping users find information that they perhaps would not have found
using traditional search methods
►question answering – text mining can find the best answer to a given question through
knowledge-driven pattern matching
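
The information extraction item above (looking for predefined objects by way of pattern
matching) can be illustrated with a minimal sketch. The sample text and the three regular
expressions are hypothetical and chosen only for illustration; they are not taken from the
textbook, and real extraction systems use far more robust patterns.

```python
import re

# Hypothetical sample text, used only to illustrate the idea.
TEXT = (
    "Acme Corp reported revenue of $4.2 million on 12 March 2021. "
    "Contact ir@acme.example for the quarterly report."
)

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Predefined patterns for the kinds of objects we want to pull out of the text.
PATTERNS = {
    "money": re.compile(r"\$\d+(?:\.\d+)?(?: (?:million|billion))?"),
    "date": re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def extract(text):
    """Return every match for each predefined pattern, keyed by its label."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


facts = extract(TEXT)
for label, matches in facts.items():
    print(label, matches)
```

Each pattern turns a piece of free text into a labelled value, which is the step that makes
unstructured text usable as structured input for later analysis.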

Many organisations are realising the importance of extracting knowledge from their
document-based data repositories through the use of text mining tools. The benefits of
text mining are most evident in areas where very large amounts of textual data are
generated, such as law (court orders), academic research (research articles), finance
(quarterly reports), medicine (discharge summaries), biology (molecular interactions),
technology (patent files), and marketing (customer comments).
LO 2 Differentiate among text analytics, text mining, and data mining.

Both text analytics and text mining aim to solve the same problem (automatically
analysing raw text data) but use different techniques. Text mining identifies relevant
information within a text and therefore provides qualitative results. Text analytics,
however, focuses on finding patterns and trends across large sets of data, resulting in
more quantitative results. Text analytics is usually used to create graphs, tables, and
other sorts of visual reports.

TEXT ANALYTICS is a broad concept that includes information retrieval, as well as
information extraction, data mining, and web mining. It is the process of converting
unstructured text data into meaningful data for analysis in order to measure customer
opinions, product reviews, and feedback, and to provide search facilities, sentiment
analysis, and entity modelling that support fact-based decision making.
TEXT MINING is primarily focused on discovering new and useful knowledge from the
textual data sources. It is the semiautomated process of identifying and extracting facts,
relationships, and patterns (useful information and knowledge) from large amounts of
unstructured data sources, that would otherwise remain buried in the mass of textual big
data.
Natural language processing (NLP) is an important component of text mining and is a
subfield of artificial intelligence and computational linguistics. It studies the problem of
understanding the natural human language with the view of converting depictions of
human language into more formal representations that are easier for computer programs
to manipulate.

Challenges associated with the implementation of NLP:
►part-of-speech tagging – It is difficult to mark up terms in a text as corresponding to a
particular part of speech (such as nouns, verbs, adjectives, or adverbs) because the part
of speech depends not only on the definition of the term but also on the context within
which it is used.
►text segmentation – Some written languages, such as Chinese, Japanese, and Thai, do
not have single-word boundaries and so the text-parsing task requires the identification of
word boundaries which is often difficult. Similar challenges in speech segmentation
emerge when analysing spoken language because sounds representing successive letters
and words blend into each other.
►word sense disambiguation – Many words have more than one meaning. Selecting the
meaning that makes the most sense can only be accomplished by taking into account
the context within which the word is used.
►syntactic ambiguity – The grammar for natural languages is ambiguous because
multiple possible sentence structures often need to be considered. Choosing the most
appropriate structure usually requires a fusion of semantic and contextual information.
►imperfect or irregular input – Foreign or regional accents and vocal impediments in
speech and typographical or grammatical errors in texts make the processing of the
language an even more difficult task.
►speech acts – A sentence can often be considered an action by the speaker. The
sentence structure alone may not contain enough information to define this action. For
example, “Can you pass the class?” requests a simple yes/no answer, whereas “Can you
pass the salt?” is a request for a physical action to be performed.
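
To make the word sense disambiguation challenge above concrete, here is a minimal sketch
that picks a sense by counting how many context words overlap with a short list of words
associated with each sense. This is a simplified overlap heuristic in the spirit of the
Lesk approach, not a method described in the textbook, and the sense inventory for "bank"
is hypothetical.

```python
# Hypothetical mini sense inventory: each sense of "bank" has a few signature words.
SENSES = {
    "bank": {
        "financial institution": {"money", "loan", "deposit", "account", "interest"},
        "river edge": {"river", "water", "shore", "fishing", "mud"},
    }
}


def disambiguate(word, sentence):
    """Pick the sense whose signature words overlap most with the sentence."""
    context = set(sentence.lower().replace(".", "").split())
    scores = {
        sense: len(signature & context)
        for sense, signature in SENSES[word].items()
    }
    # On a tie, max() returns the first sense listed in the inventory.
    return max(scores, key=scores.get)


print(disambiguate("bank", "She opened an account at the bank to deposit money."))
print(disambiguate("bank", "They sat on the bank of the river fishing."))
```

The same word receives a different sense in each sentence purely because of the words
around it, which is the point the list item makes about context.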

TEXT MINING PROCESS:

Task 1: establish the corpus – The main purpose of the first task is to collect all the
documents related to the context being studied. Once collected, the text documents,
e-mails, web pages, short notes, recordings, XML files, etc. are transformed and organised
so that they are all in the same representational form for computer processing.
Task 2: create the term-document matrix – In this task, the digitised and organised
documents (the corpus) are used to create the term-document matrix (TDM), where rows
represent the documents and columns represent the terms. The goal is to convert the list
of organised documents into a TDM where the cells are filled with the most appropriate
indices.
Task 3: extract the knowledge – Novel patterns are extracted in the context of the
specific problem being addressed.
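
Task 2 can be sketched in a few lines. The three-document corpus is hypothetical, and
filling the cells with raw term counts is an assumption made for this sketch; the notes
only say the cells hold "the most appropriate indices", which in practice may be binary
flags or weighted frequencies instead.

```python
from collections import Counter

# Hypothetical corpus, already in a common plain-text form (the result of Task 1).
corpus = [
    "the service was slow",
    "the product was great and the service was great",
    "slow delivery",
]


def build_tdm(documents):
    """Return (terms, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    terms = sorted(set().union(*counts))  # one column per distinct term
    matrix = [[c[term] for term in terms] for c in counts]  # one row per document
    return terms, matrix


terms, tdm = build_tdm(corpus)
print(terms)
for row in tdm:
    print(row)
```

Rows correspond to documents and columns to terms, matching the layout described in
Task 2. Task 3 would then run a mining method over this matrix.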

‘Text mining’ is the same as ‘data mining’ in that it has the same purpose and uses the
same processes, but with ‘text mining’ the input to the process is a collection of
unstructured data files.

DATA MINING is the process of identifying valid, novel, potentially useful, and ultimately
understandable patterns in data stored in structured databases, where the data are
organised in records structured by categorical, ordinal, or continuous variables.

Structured data is data that is standardized into a tabular format with numerous rows and
columns, making it easier to store and process for analysis and machine learning
algorithms. Structured data can include inputs such as names, addresses, and phone
numbers.

Unstructured data is data that does not have a predefined data format. It can include
text from sources like social media or product reviews, or rich media formats like video
and audio files.
LO 3 Describe sentiment analysis.

SENTIMENT refers to a settled opinion reflective of one’s feelings. It is a view or opinion that
is held or expressed.
SENTIMENT ANALYSIS deals with the automatic extraction of opinions, feelings, and
subjectivity in text. Sentiment analysis is the process of computationally identifying and
categorising opinions expressed in a piece of text, especially in order to determine
whether the writer's attitude towards a particular topic, product, etc. is positive, negative,
or neutral.

Sentiment analysis is often used by businesses to detect sentiment in social data, gauge
brand reputation, and understand customers.
Sentiment analysis applications:
►voice of the customer (VOC) – Sentiment analysis can access a company’s product
and service reviews to better understand and better manage customer complaints and
praises.
►voice of the market (VOM) – Sentiment analysis can help understand aggregate
opinions and trends of stakeholders to help companies with competitive intelligence and
product development and positioning.
►voice of the employee (VOE) – Sentiment analysis uses rich, opinionated textual data in
an effective and efficient way to listen to what employees are saying. Happy employees
empower customer experience efforts and improve customer satisfaction.
►brand management – Sentiment analysis helps shape perceptions rather than just
managing experiences.
►financial markets – Sentiment analysis can be used to gauge market movements by drawing
on social media, news, blogs, and discussion groups.
►politics – Sentiment analysis can help understand what voters are thinking and can
clarify a candidate’s position on issues. It can also help political organisations, campaigns,
and news analysts to better understand which issues and positions matter the most to
voters.
►government intelligence – Sentiment analysis can allow the automatic analysis of the
opinions that people submit about pending policy or government-regulation proposals.

Sentiment analysis process:
(1) sentiment detection / O-S polarity calculation – This step aims to differentiate
between a fact and an opinion, which may be viewed as classification of text as objective
or subjective. If the objectivity value is close to 1, then there is no opinion and the
process goes back and grabs the next text data to analyse.

(2) N-P polarity classification – This step takes the opinionated piece of text and
classifies the opinion as falling under one of two opposing sentiment polarities (positive
or negative), or locates its position on the continuum between these two polarities. This
step also involves identifying the strength of the sentiment (mildly, moderately, strongly,
or very strongly). This classification may need to be done at several levels: term, phrase,
sentence, and document level.

(3) target identification – This step aims to accurately identify the target of the expressed
sentiment, such as a person, product, or event.

(4) collection and aggregation – Once the sentiments of all text data points in the
document are identified and calculated, they are aggregated and converted to a single
sentiment measure for the whole document.
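
The four steps above can be sketched as a small lexicon-based pipeline. Everything in it is
an assumption made for illustration: the word scores, the list of targets, and the rule
that a sentence with no lexicon word is treated as objective. The textbook describes the
steps but does not prescribe this implementation.

```python
# Hypothetical tiny lexicon: word -> polarity in [-1, 1]. Real lexicons are far larger.
LEXICON = {
    "great": 0.8, "love": 0.9, "good": 0.5,
    "slow": -0.4, "terrible": -0.9, "broken": -0.7,
}
# Hypothetical targets that an opinion may be about.
TARGETS = {"battery", "screen", "delivery"}


def analyse_sentence(sentence):
    """Run steps 1-3 on one sentence; return None if no opinion word is found."""
    words = sentence.lower().strip(".!?").split()
    opinion_scores = [LEXICON[w] for w in words if w in LEXICON]
    if not opinion_scores:
        return None  # Step 1: treated as objective, so move on to the next text
    polarity = sum(opinion_scores) / len(opinion_scores)  # Step 2: N-P polarity
    target = next((w for w in words if w in TARGETS), None)  # Step 3: target
    return target, polarity


def analyse_document(text):
    """Step 4: aggregate sentence polarities into one document-level score."""
    results = [analyse_sentence(s) for s in text.split(". ") if s]
    opinions = [r for r in results if r is not None]
    if not opinions:
        return opinions, 0.0
    overall = sum(p for _, p in opinions) / len(opinions)
    return opinions, overall


review = "The battery is great. The phone weighs 180 grams. The delivery was terrible."
opinions, overall = analyse_document(review)
print(opinions)
print(round(overall, 2))
```

The middle sentence is a plain fact and is skipped in step 1, while the other two
contribute a target and a polarity each. Averaging is only one possible aggregation
rule; weighting by sentiment strength would be another.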
REFERENCES – the above summary is made using the following textbook:
R. Sharda, et al. 2018. Business Intelligence, Analytics, and Data Science: A Managerial
Perspective. Fourth Edition. Pearson.
PLEASE NOTE: I am selling the service provided in summarising this chapter and not the
intellectual property provided.
