welcome - text analytics summit 2010
DESCRIPTION
Welcome address slides for the Text Analytics Summit 2010, presented by conference chair Seth Grimes, Alta Plana Corporation.
TRANSCRIPT
Seth Grimes, @sethgrimes
Text Analytics Summit 2010, #TAS10
The Big Questions Facing the Text Analytics Industry
>> Past, Present & Future
He who controls the present, controls the past. He who controls the past, controls the future.
-- derived from George Orwell’s 1984
>> The (Near) Past: Lacking Use Cases
In 1999 –
“The nascent field of text data mining (TDM) has the peculiar distinction of having a name and a fair amount of hype but as yet almost no practitioners.”
-- Prof. Marti A. Hearst, “Untangling Text Data Mining”
>> So “Big Questions”…
Whatever you call it – “text analytics” ≈ “text mining” ≈ “text data mining” – a lot has happened since.
How is the industry developing?
• Solution providers.
• Customers & prospects.
• Technology & solutions.
>> What’s Past is Prologue
“Don't look back. Something might be gaining on you.”
-- Satchel Paige
>> The Present: Today’s Market
I estimate a $425 million global market in 2009.
• Up about 25% from $350 million in 2008, up in turn 40% from $250 million in 2007.
• Covers software licenses, vendor-provided support and professional services.
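The year-over-year growth implied by these estimates can be checked with a few lines of arithmetic. The sketch below uses only the three market-size figures quoted on the slide; the function name `yoy_growth` is a hypothetical helper introduced for illustration.

```python
# Market-size estimates in USD millions, as quoted on the slide.
market = {2007: 250, 2008: 350, 2009: 425}

def yoy_growth(sizes):
    """Return {year: percent growth over the previous year in `sizes`}."""
    years = sorted(sizes)
    return {
        year: (sizes[year] - sizes[prev]) / sizes[prev] * 100
        for prev, year in zip(years, years[1:])
    }

growth = yoy_growth(market)
for year, pct in growth.items():
    print(f"{year}: {pct:.1f}% over {year - 1}")
```

Run as written, this prints 40.0% for 2008 and 21.4% for 2009, so the computed 2009 figure comes out somewhat under the rounded "about 25%" on the slide.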
$(hundreds) million more value created by:
• Universities and research centers, especially in the life sciences.
• Government, particularly for intelligence & counter-terrorism.
• OEM licensees, for listening platforms, e-discovery, etc.
• Systems integrators and consultants.
>> Applications Today
Broadly grouped --
• Intelligence and counter-terrorism.
• Life sciences.
• Content management, publishing & search.
• Customer & market intelligence.
• E-discovery.
• Enterprise feedback.
• Law enforcement.
• Risk, fraud, compliance, and investigation.
>> Today’s Text Analytics Players
BI, data mining, and analytics.
Enterprise- and specialized-application focus.
Search tools and services.
Software-tool, OEM suppliers.
Text analytics pure-plays, diverse applications.
Web services (APIs).
>> Market Trends
Stronger than ever:
• Life sciences.
• Intelligence & counter-terrorism.
Continued steep growth:
• Media & publishing.
Seek to mine and to classify/process. For users, semantic annotations ease navigation and boost findability.
• Customer experience. Key to quality, satisfaction.
• Market intelligence including competitive intelligence.
Aggregates and details are both important.
New on the scene – or at least newly visible:
• Social-media monitoring, measurement, analysis.
“The Diverse and Exploding Digital Universe,” (IDC, 2008)
>> Technology Initiatives
Now and near future.
• Semantic search.
Guha (IBM), McCool (Stanford), Miller (W3C): “The addition of explicit semantics can improve [navigational and research] search” (2003).
• Question answering.
Matthew Glotzbach, Google: “Question answering is the future of enterprise search” (2006).
• Sentiment analysis & social-media analytics.
Bing Liu, Univ of Illinois: “The Web has dramatically changed the way that people express their views and opinions.”
>> Technology Initiatives 2
Now and near future.
• Customer experience.
Bruce Temkin, ex-Forrester Research: “The future is clearly about analyzing feedback in any form that your customers give it. That’s a trend that won’t go away.”
• Text visualization.
We’re still coming to terms with the idea of actually extracting and exploiting the information content of rich media.
• Web 3.0 & the Semantic Web.
Ronen Feldman, Bar-Ilan University and Hebrew University: “Text analytics [is] driving the Semantic Web” (2006).
>> Search, from Keywords to Intelligence
Text analytics enables smarter search that better responds to user goals.
>> Question Answering
Text analytics (information extraction) feeds curated knowledge bases. Search is transformed from information retrieval to information access.
>> Sentiment Analysis
Two assertions:
• Human communications are inherently subjective.
• Opinion often masquerades as Fact.
>> Sentiment Analysis… & Social Media
“Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”
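To make the definition concrete, here is a minimal lexicon-based polarity sketch. It is not the method from the paper quoted above: the word lists are a toy lexicon invented for illustration, and the only context handled is a negator directly before a sentiment word, which flips that word's sign.

```python
import re

# Toy lexicon, invented for illustration; real systems use far larger resources.
POSITIVE = {"good", "great", "love", "excellent", "helpful"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}
NEGATORS = {"not", "never", "no"}

def polarity(text):
    """Label text 'positive', 'negative', or 'neutral' by summing word scores.

    A sentiment word directly preceded by a negator has its sign flipped.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The support team was great"))
print(polarity("The search results were not good"))
print(polarity("The report arrived on Tuesday"))
```

The second example shows why context matters: "good" alone is positive, but "not good" is scored as negative. Anything beyond this one-word window, such as sarcasm or longer-range negation, is outside what the sketch handles.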
>> Finding Business Value
In customer-experience initiatives, “more unsolicited, unstructured data [implies] increasing use of text analytics.”
-- Bruce Temkin, ex-Forrester Research
>> Text Visualization
>> Looking Ahead
The Semantic Web Vision
Linked Data: “exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web.”
An open architecture, with standards coordinated by the W3C (World Wide Web Consortium).
“The Semantic Web is a web of data, in some ways like a global database.” -- Tim Berners-Lee, 1998
>> Web 3.0
Web 3.0 = Web 2.0 + the Semantic Web + semantic tools. Recurring themes:
• Semantically enriched -- context sensitive -- localized.
Text analytics enables Web 3.0 and the Semantic Web.
• Automated content categorization and classification.
• Text augmentation: metadata generation, content tagging.
• Information extraction to databases.
• Exploratory analysis and visualization.
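As one illustration of "information extraction to databases," the sketch below pulls (year, amount) facts out of free text with a regular expression and loads them into an in-memory SQLite table. The sample text paraphrases the market estimates from earlier in the deck; the pattern, the table name `market_size`, and the helper names are assumptions made for this example, and production extraction systems rely on much richer linguistic processing.

```python
import re
import sqlite3

# Sample text paraphrasing the market estimates quoted earlier in the deck.
TEXT = (
    "I estimate a $425 million global market in 2009. "
    "That is up from $350 million in 2008 and $250 million in 2007."
)

# Matches "$<amount> million ... in <year>" within a single sentence.
PATTERN = re.compile(r"\$(\d+) million[^.]*? in (\d{4})")

def extract_facts(text):
    """Return a list of (year, usd_millions) tuples found in the text."""
    return [(int(year), int(amount)) for amount, year in PATTERN.findall(text)]

def load(facts):
    """Store extracted facts in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE market_size (year INTEGER PRIMARY KEY, usd_millions INTEGER)"
    )
    conn.executemany("INSERT INTO market_size VALUES (?, ?)", facts)
    return conn

facts = extract_facts(TEXT)
conn = load(facts)
rows = conn.execute(
    "SELECT year, usd_millions FROM market_size ORDER BY year"
).fetchall()
print(rows)
```

Once the facts sit in a table, they can be queried, joined, and charted like any other structured data, which is the point of extracting them from text in the first place.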
>> In Sum
Robust growth.
Consolidation and emergence.
Technical challenges.
New frontiers.
… and two days to learn more.