Exploring the Hidden Unknown Using Self-Learning Text Analytics

Post on 19-Jul-2020

Page 1: Exploring the Hidden Unknown Using Self-Learning Text ... · How about if we have a self-learning Text analytics solution ‘Traditional’ Content Analytics. difficult & expensive

Exploring the Hidden Unknown Using Self-Learning Text Analytics

Current Approach in Data Discovery

• Activities are a form of customer engagement
• User training – limited to a few presumably relevant keywords for searching
• Most dashboards provide:
– content reach, fans, retweets, followers, …
– sentiment analysis
• These metrics disregard context

Current Approach

“Without an appropriate query, the analyzed data might be irrelevant and the end results inaccurate” infegy Blog

Search/Filtering
Noise/Irrelevancy
Missing Information
Data is not relevant / result is not accurate

Additionally, there are about 327,000,000 documents on search tips on Google

An Example of Noise [monitoring tweets on the first days of TIFF]

What’s happening when you define rules & keywords?

What do you see here?

Source: http://nexalogy.com/

Cozy restaurant

Quiet neighborhood

Beach side

Beautiful view

Full Story

The smaller version of the photo was exactly 10% of the overall photograph

Source: http://nexalogy.com/

“What you do not know is far more relevant than what you know.” Nassim Nicholas Taleb

Pain Points

• Too much information
• Labour intensive
• Time consuming
• Noisy data
• Missing information

“The costs, the shortcomings with accuracy, and the time needed to build and refine data dictionaries are frustrating at times.”

Text Analytics 2014, Alta Plana

We are turning to an unstructured world

Unstructured World

• Uber
• Airbnb
• Facebook
• etc.

Tom Goodwin, "Something interesting is happening"

Profound Changes in Approaching Data

In the past

• People knew what to collect and how to use the data before collecting it
• Data was collected in a controlled environment
• Analysis looked for causes in the data

Big data era

• Data is generated and collected without knowing what its purpose will be
• Data is noisy and irrelevant at large scale
• Analysis looks for correlations in the data

What if we had a self-learning text analytics solution?

‘Traditional’ Content Analytics

• Difficult & expensive process for cleansing, training, curation, classification, categorization
• Human bias, noise, missing data, incorrect and incomplete results

Traditional Content Analytics

• Requires no training
• Intuitive user interface
• System is self-learning
• Find hidden unknowns
• More meaningful results

Let’s Change the Paradigm

Self-Learning Text Analytics

• Ability to learn based on prior knowledge obtained from the text itself
• Understand meaning in context
• Automatic discovery of relevancy, correlations and concepts
• Cold start KB
• Root cause analysis

Deep Learning Text Analytics

• No training
• No taxonomy or dictionary is required
• Find hidden unknowns
• Qualitative analysis
• No human labeling, no Boolean queries

Automatic Discovery

• No preprocessing or cleaning
• Robust to noise
• Real-time analytics

Self Analytics Service

Applied in Various Segments

Insurance
• Eliminates preprocessing and training data
• Dramatically reduces operational expense and time of delivery
• Self-service analytics

Customer Experience
• Increasing positive customer experience (CX)
• Improving client retention
• Recommendation

Qualitative Analysis
• Open-ended questions and qualitative surveys
• Automatic categorization, eliminating manual labeling
• Enabling in-hand insights

Risk Management / Public Safety
• Early detection of evidence of fraud
• Pilot project
• Public safety
• Risk management

Retail
• Category management
• Classification
• Customer intent
• Increased sales

Case Studies

Case Study #1 [Analysing social media for the purpose of public safety]

Data source: tweets on April 15, 2013

Case Study #2 [Analyzing 1M tweets on Chrysler brand]

• Data: 1 million tweets
• Challenges:
– Noise, irrelevancy & volume
• Ram (Chrysler brand) vs. RAM (memory) vs. Ram (name)
• Compass (Chrysler brand) vs. Compass (device) vs. Compass (name of watch/group…)
• Core (Chrysler brand) vs. Core (processor)
• ….
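The ambiguity challenge above can be made concrete with a small sketch. The tweets and the cue words below are invented for illustration and are not taken from the case-study data. The scoring function is a plain hand-written heuristic, not the presenter's method. It shows that a bare keyword match accepts every sense of "ram", and that separating the senses by hand requires exactly the kind of rule list the deck describes as labour intensive.

```python
import re

# Hypothetical tweets, invented for illustration (not case-study data).
TWEETS = [
    "Just test drove the new Ram 1500, the towing power is unreal",
    "Upgraded my laptop to 16GB of RAM and it finally runs smoothly",
    "Ram said he will join us for dinner tonight",
]

# Hypothetical, hand-picked context cues for two senses of the keyword.
SENSE_CUES = {
    "vehicle": {"truck", "drove", "towing", "1500", "dealer", "pickup"},
    "memory": {"laptop", "gb", "16gb", "memory", "upgraded", "pc"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_match(tweet, keyword="ram"):
    """Keyword filter: true for any tweet containing the token."""
    return keyword in tokenize(tweet)


def guess_sense(tweet):
    """Pick the sense whose cue words overlap most with the tweet."""
    tokens = set(tokenize(tweet))
    scores = {sense: len(tokens & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


matched = [t for t in TWEETS if keyword_match(t)]
senses = [guess_sense(t) for t in matched]

for tweet, sense in zip(matched, senses):
    print(f"{sense:8s} | {tweet}")
```

All three tweets pass the keyword filter, yet only one is about the vehicle brand. The third tweet falls through to "unknown" because no cue list covers personal names, which is the missing-information problem in miniature.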

Case Study #3 [Analyzing Blogs & Forums]

• Source: watchuseek community discussions

- Discover hidden unknowns
- Beyond keyword search
- Finding correlations between topics
- Discovery of the knowledge in the data without using external resources

Finding correlations between water-resistant, watch and Stowa Marine

Beyond keyword indexing

Discovering the relationship between Bottadesign, Germany and Swiss heart

Understanding that the 600T Divingstar is a Doxa watch even without mentioning the keyword “doxa”
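One common way to surface term relationships from the text alone, without a dictionary or taxonomy, is pointwise mutual information (PMI) over document-level co-occurrence. The deck does not say which technique the product uses, so the sketch below swaps in PMI purely as an illustration. The forum posts and the `pmi_table` helper are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical forum posts, invented for illustration.
POSTS = [
    "the stowa marine is a water resistant watch",
    "my stowa marine stayed water resistant after a swim",
    "looking for a leather strap for my dress watch",
    "the leather strap on this dress watch is stiff",
]


def pmi_table(posts):
    """Return PMI (base 2) for every term pair that co-occurs in a post.

    Probabilities are document frequencies: the share of posts that
    contain a term, or both terms of a pair.
    """
    n = len(posts)
    term_df = Counter()
    pair_df = Counter()
    for post in posts:
        terms = sorted(set(post.split()))
        term_df.update(terms)
        # Terms are sorted, so each pair key is in alphabetical order.
        pair_df.update(combinations(terms, 2))
    return {
        (a, b): math.log2((count / n) / ((term_df[a] / n) * (term_df[b] / n)))
        for (a, b), count in pair_df.items()
    }


pmi = pmi_table(POSTS)
print(pmi[("marine", "stowa")])  # terms that always appear together
print(pmi[("stowa", "watch")])   # terms that co-occur less than chance
```

Pairs that appear together more often than their individual frequencies predict receive a positive score, and pairs that never co-occur are simply absent from the table. On real forum data the counts would need smoothing and a minimum-frequency cutoff, which this sketch leaves out.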

Case Study #4 [Root cause analysis by self-learning text analytics]

Case Study #5 [2012-2015 US Customer Complaints in the Financial Industry]

Finding unknown issues in financial institutions

Case Study #6 [Qualitative Survey]

• City Council's Executive Committee requested the City Manager to seek the public's input on the establishment of a casino in Toronto.

Survey Response Form

• www.Kaypok.com
• Twitter: @KaypokINC

Confidential Information – Not for Distribution

Contact Information

Razieh Niazi, Founder & [email protected]://ca.linkedin.com/pub/razieh-niazi/8/9b8/ba8
416-731-3624 (Mobile)