
Text Analytics Summit 2010: The Big Questions Facing the Text Analytics Industry

In 1999, text data mining had "a name and a fair amount of hype but as yet almost no practitioners." Whatever you call it ("text analytics," "text mining," or "text data mining"), a lot has happened since. "Don't look back. Something might be gaining on you," says Satchel Paige.

Uploaded by

Sandeep Raut
Copyright
© Attribution Non-Commercial (BY-NC)

Text Analytics Summit 2010

#TAS10

The Big Questions Facing the Text Analytics Industry

Seth Grimes
@sethgrimes
>> Past, Present & Future

He who controls the present, controls the past.


He who controls the past, controls the future.
-- derived from George Orwell’s 1984
>> The (Near) Past: Lacking Use Cases

In 1999 –
“The nascent field of text data mining (TDM) has
the peculiar distinction of having a name and a
fair amount of hype but as yet almost no
practitioners.”
-- Prof. Marti A. Hearst,
“Untangling Text Data Mining”
>> So “Big Questions”…

Whatever you call it – “text analytics” ≈ “text mining” ≈ “text data mining” – a lot has happened since.

How is the industry developing?


• Solution providers.
• Customers & prospects.
• Technology & solutions.
>> What’s Past is Prologue

“Don't look back. Something might be gaining on you.”
-- Satchel Paige
>> The Present: Today’s Market

I estimate a $425 million global market in 2009.


• Up about 25% from $350 million in 2008, up in turn
40% from $250 million in 2007.
• Covers software licenses, vendor provided support and
professional services.
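
The year-over-year rates behind these estimates can be checked with a short calculation. This is a minimal sketch that uses only the rounded dollar figures quoted on this slide; the function name `growth_rate` is illustrative. Because the inputs are rounded, the computed 2008-to-2009 rate (roughly 21%) comes out a little below the "about 25%" quoted, which presumably rests on unrounded figures.

```python
def growth_rate(previous: float, current: float) -> float:
    """Return year-over-year growth as a fraction of the previous value."""
    return (current - previous) / previous


# Market estimates from the slide, in millions of US dollars (rounded).
market = {2007: 250, 2008: 350, 2009: 425}

years = sorted(market)
for prev_year, year in zip(years, years[1:]):
    rate = growth_rate(market[prev_year], market[year])
    print(f"{prev_year} -> {year}: {rate:.1%}")
```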
Hundreds of millions of dollars more in value created by:
• Universities and research centers, especially in the life sciences.
• Government, particularly for intelligence & counter-terrorism.
• OEM licensees, for listening platforms, e-discovery, etc.
• Systems integrators and consultants.
>> Applications Today

Broadly grouped --
• Intelligence and counter-terrorism.
• Life sciences.
• Content management, publishing & search.
• Customer & market intelligence.
• E-discovery.
• Enterprise feedback.
• Law enforcement.
• Risk, fraud, compliance, and investigation.
>> Today’s Text Analytics Players

BI, data mining, and analytics.


Enterprise- and specialized-application focus.
Search tools and services.
Software-tool, OEM suppliers.
Text analytics pure-plays, diverse applications.
Web services (APIs).
>> Market Trends
“The Diverse and Exploding Digital Universe,” (IDC, 2008)
Stronger than ever:
• Life sciences.
• Intelligence & counter-terrorism.
Continued steep growth:
• Media & publishing.
  - Seek to mine and to classify/process.
  - For users, semantic annotations ease navigation and boost findability.
• Customer experience.
  - Key to quality, satisfaction.
• Market intelligence including competitive intelligence.
  - Aggregates and details are both important.
New on the scene – or at least newly visible:
• Social-media monitoring, measurement, analysis.
>> Technology Initiatives

Now and near future.


• Semantic search.
Guha (IBM), McCool (Stanford), Miller (W3C): “The addition of
explicit semantics can improve [navigational and research]
search” (2003).
• Question answering.
Matthew Glotzbach, Google: “Question answering is the future of
enterprise search” (2006).
• Sentiment analysis & social-media analytics.
Bing Liu, Univ of Illinois: “The Web has dramatically changed the
way that people express their views and opinions.”
>> Technology Initiatives 2

Now and near future.


• Customer experience.
Bruce Temkin, ex-Forrester Research: “The future is clearly about
analyzing feedback in any form that your customers give it.
That’s a trend that won’t go away.”
• Text visualization.
We’re still coming to terms with the idea of actually extracting and
exploiting the information content of rich media.
• Web 3.0 & the Semantic Web.
Ronen Feldman, Bar-Ilan University and Hebrew University: “Text
analytics [is] driving the Semantic Web” (2006).
>> Search, from Keywords to Intelligence

Text analytics enables smarter search that better responds to user goals.
>> Question Answering

Text analytics (information extraction) feeds curated knowledge bases. Search is transformed from information retrieval to information access.
>> Sentiment Analysis

Two assertions:
• Human communications are inherently subjective.
• Opinion often masquerades as Fact.
>> Sentiment Analysis… & Social Media

“Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”
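
To make the definition concrete, here is a deliberately simple lexicon-counting sketch. It is not the method of the cited paper: the word lists and the `polarity` function are hypothetical, and a production system would use a far larger, curated lexicon or a trained model.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def polarity(text: str) -> str:
    """Label text positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "Great product, love the support",
    "Terrible and slow service",
    "It arrived on Tuesday",
]
for comment in comments:
    print(polarity(comment), "-", comment)
```

Plain word counting ignores context such as negation ("not good"), which is the kind of contextual polarity problem the paper's title refers to.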
>> Finding Business Value

In customer-experience initiatives, “more unsolicited, unstructured data [implies] increasing use of text analytics.”
-- Bruce Temkin, ex-Forrester Research
>> Text Visualization
>> Looking Ahead
The Semantic Web Vision

“The Semantic Web is a web of data, in some ways like a global database.” -- Tim Berners-Lee, 1998

An open architecture, coordinated through standards from the W3C (World Wide Web Consortium).
Linked Data: “exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web.”
>> Web 3.0

Web 3.0 = Web 2.0 + the Semantic Web + semantic tools. Recurring themes:
• Semantically enriched -- context sensitive -- localized.
Text analytics enables Web 3.0 and the Semantic Web.
• Automated content categorization and classification.
• Text augmentation: metadata generation, content
tagging.
• Information extraction to databases.
• Exploratory analysis and visualization.
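
As one illustration of "information extraction to databases," the sketch below pulls dollar figures and years out of free text and loads them into a table. It is only a sketch under stated assumptions: the regular expression, the `market` table layout, and the `extract_to_db` helper are all hypothetical, and the sample text reuses the market estimates quoted earlier in this deck.

```python
import re
import sqlite3

# Assumed pattern for this example: "$<number> million ... in <year>",
# with both parts inside the same sentence.
PATTERN = re.compile(r"\$(\d+)\s+million\b[^.$]*?\bin\s+(\d{4})")


def extract_to_db(text: str, conn: sqlite3.Connection) -> int:
    """Extract (year, amount) facts from text and store them; return the count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS market "
        "(year INTEGER PRIMARY KEY, usd_millions INTEGER)"
    )
    facts = [(int(year), int(amount)) for amount, year in PATTERN.findall(text)]
    conn.executemany("INSERT OR REPLACE INTO market VALUES (?, ?)", facts)
    conn.commit()
    return len(facts)


text = (
    "I estimate a $425 million global market in 2009. "
    "Up about 25% from $350 million in 2008, "
    "up in turn 40% from $250 million in 2007."
)

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
count = extract_to_db(text, conn)
rows = conn.execute(
    "SELECT year, usd_millions FROM market ORDER BY year"
).fetchall()
print(count, rows)
```

Once the facts sit in a table, ordinary SQL queries and BI tools can work with what started out as unstructured text.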
>> In Sum

Robust growth.
Consolidation and emergence.
Technical challenges.
New frontiers.

… and two days to learn more.


