render telefonica
DESCRIPTION
Tutorial at the RENDER Kick-off Meeting, Telefónica
TRANSCRIPT
Kick-Off RENDER Project
Karlsruhe, October 27th, 2010
Telefónica I+D
Telefónica I+D User Modeling Analytical Models
Index
01 Telefónica Case Study
• Overview
• Data Sources
• Results
• Data Key Points
• Data Considerations
Área: Lorem ipsumRazón Social: TelefónicaTelefónica I+D User Modeling Analytical Models
02 Annex A: Twitter Analysis Examples
01 Case Study
Overview
RENDER will provide the means for Telefónica to assess incoming requests, complaints and concerns; to identify opinions, viewpoints, trends and tendencies; and to take feasible actions based on them.
Data Sources
• Call center contacts
• Web customer portal messages
• Surveys (shops & market research)
• Public forum comments
• Corporate forum comments
• Twitter entries
Data Sources
Amounts of Data
• Data in corporate channels
  › Movistar España
  › O2 UK and O2 Ireland
• Data in public channels
  › Open forums
• Twitter data collection
  › 600,000 tweets per day (1% of the total)
  › By geolocation
    › 23,000 tweets/day in the UK
    › 5,000 tweets/day in Spain
    › 900 tweets/day in Ireland
  › By topic
    › 3,300 tweets/day about O2
    › 3,200 tweets/day about Movistar
    › 800 tweets/day about Telefónica
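The "by topic" breakdown above could be produced along the following lines. This is only a minimal sketch: the slides name the brands but not the matching rules, so the regular expressions, the function name `count_by_topic` and the sample texts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical topic patterns (assumption: whole-word, case-insensitive matching).
TOPIC_PATTERNS = {
    "O2": re.compile(r"\bO2\b", re.IGNORECASE),
    "Movistar": re.compile(r"\bmovistar\b", re.IGNORECASE),
    "Telefónica": re.compile(r"\btelef[oó]nica\b", re.IGNORECASE),
}

def count_by_topic(tweets):
    """Count how many tweet texts mention each topic at least once."""
    counts = Counter()
    for text in tweets:
        for topic, pattern in TOPIC_PATTERNS.items():
            if pattern.search(text):
                counts[topic] += 1
    return counts

# Invented sample texts, not taken from the collected data.
sample = [
    "My O2 signal is gone again",
    "Movistar y Telefonica suben precios",
    "CO2 levels are rising",
]
print(count_by_topic(sample))
```

The word boundaries keep a short brand name such as "O2" from matching inside longer tokens like "CO2".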
Results
What do we want to achieve in this project?
• To apply NLP, data mining, web mining and machine learning techniques to discover and analyze in depth large streams of data from various sources, across multiple (natural) languages, and to build a comprehensive opinion model covering intensity, biases and fact coverage.
Key aspects
• Management of data sources
  › Internal data vs. external data
• Processing of data bias
  › Customers vs. potential customers
  › Inexperienced vs. advanced users
• View of segmented opinion
  › Individual opinion vs. global opinion
• Identification of subjectivity in opinions
  › Positive, negative and neutral opinions
• Knowledge of opinion geolocation (Twitter entries)
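To make the positive / negative / neutral distinction concrete, here is a toy lexicon-based classifier. It is a sketch under stated assumptions, not the method planned for the project: the word lists and the function name `polarity` are invented for illustration, and a real opinion model would also need to handle intensity, bias and multiple languages.

```python
import re

# Toy word lists, invented for illustration only.
POSITIVE = {"good", "great", "love", "fast", "helpful"}
NEGATIVE = {"bad", "terrible", "hate", "slow", "useless"}

def polarity(text):
    """Return 'positive', 'negative' or 'neutral' from a simple word count."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for comment in [
    "Great support, fast reply",
    "Terrible coverage, I hate it",
    "Where is the nearest shop?",
]:
    print(comment, "->", polarity(comment))
```

A question with no opinion words falls through to "neutral", which matches the observation that many customer messages are only questions or complaints.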
Data Key Points
Call Center | Web Customer Portal | Corporate Forums
Internal data | Internal data | Internal data
Customers | Customers | Customers
Objective / subjective | Objective / subjective | Objective / subjective
Segmentation not possible | Segmentation possible | Segmentation possible
Language identified | Language not identified | Language not identified
Localization possible | Localization possible (with user account) | Localization possible (with user account)
Offline users | Online users | Online users
Data Key Points
Surveys (shops & market research) | Public Forum | Twitter Entries
Internal data | External data | External data
Customers or potential customers | Customers or potential customers | Customers or potential customers
Objective / subjective | Objective / subjective | Objective / subjective
Segmentation possible | Segmentation not possible | Segmentation not possible
Language not identified | Language identified | Language not identified
Localization possible | Localization not possible | Localization not always possible
Offline users | Online users | Advanced online users
Data Considerations: Call Center
• The only interaction is between the customer and the CRM.
• Technical limitations due to working with recordings:
  › Speech recognition
  › User and operator on the same channel (speaker diarization)
• Formal language.
• Transcriptions contain no mistakes such as unknown words or symbols (only recognition errors).
• Data acquisition is highly difficult.
• Customers do not speak freely; it is a formal dialogue.
• The list of topics is limited; the issues are predefined.
• Most calls do not express an opinion; they are only questions and complaints.
Data Considerations: Web Customer Portal
• Text can contain errors (grammar, vocabulary…).
• Customers do not write freely; it is a formal message.
• Formal language.
• The only technical limitations will be those inherent to opinion mining.
• The only interaction is between the customer and the CRM.
• Data acquisition is of medium difficulty.
• The list of topics is limited; the issues are predefined.
• Most comments do not express an opinion; they are only questions and complaints.
Data Considerations: Forum Comments (corporate forums and public forums)
• Informal language.
• Texts can contain errors (grammar, vocabulary…).
• Customers write in complete freedom.
• Comments can express opinions.
• Public forums: interaction only between customers.
• Corporate forums: interaction between customer and company, and between customers.
• Data acquisition is of medium difficulty.
• The list of topics is unlimited; customers can open any new issue.
• The only technical limitations will be those inherent to opinion mining.
Data Considerations: Surveys (shops & market research)
• The list of topics is limited.
• Interaction only between customer and company.
• Formal language.
• Customers write in complete freedom.
• Data acquisition is of medium difficulty.
• Comments can express opinions.
• Transcriptions are free of errors and in natural language.
• The only technical limitations will be those inherent to opinion mining.
Data Considerations: Twitter Entries
• Informal language.
• Texts can contain errors (grammar, vocabulary…).
• Data acquisition is of low difficulty.
• Comments can express opinions.
• Customers write in complete freedom.
• The list of topics is unlimited; customers can open any new issue.
• Interaction between customer and company, and between customers.
• The only technical limitations will be those inherent to opinion mining.
02 Annex A: Twitter Analysis Examples
Twitter Analysis Examples
Current opinion mining projects on Twitter with no interesting results
• Twitrratr
  › "O2" cannot be searched because it has only two characters.
  › There are only 4 results for "O2 Ireland".
  › All 4 results are classified as neutral.
  › One of these comments is actually very negative!
Twitter Analysis Examples
Current opinion mining projects on Twitter with no interesting results
• Tweetfeel
  › It is possible to search for "O2", but the results are bad!
  › Sometimes a result is classified correctly.
  › Sometimes the searched word is not present.
  › The rest are badly classified or identified!
Twitter Analysis Examples
Current projects with no interesting results
• Tweetfeel
  › In this case it is possible to search for "O2 Ireland", but not as consecutive words.
  › There are only 4 results, and 3 of them are RTs (retweets).
There is still much work to do…
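Two of the problems seen in these examples, searching a phrase as consecutive words and results dominated by retweets, can be illustrated with a short sketch. The function names, the "RT @user" prefix rule and the sample tweets are assumptions made for illustration; this is not how Twitrratr or Tweetfeel work internally.

```python
import re

def is_retweet(text):
    """Treat a tweet as a retweet if it starts with an 'RT @user' prefix."""
    return re.match(r"\s*RT\s+@\w+", text) is not None

def phrase_pattern(phrase):
    """Build a regex matching the words of `phrase` consecutively, as whole words."""
    words = [re.escape(w) for w in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)

def search_original(tweets, phrase):
    """Return the tweets that contain the phrase and are not retweets."""
    pattern = phrase_pattern(phrase)
    return [t for t in tweets if pattern.search(t) and not is_retweet(t)]

# Invented sample tweets.
sample = [
    "O2 Ireland coverage is terrible today",
    "RT @someone: O2 Ireland coverage is terrible today",
    "Flying to Ireland, my O2 phone is off",
    "CO2 Ireland emissions report",
]
print(search_original(sample, "O2 Ireland"))
```

Only the first sample tweet survives: the second is a retweet, the third contains both words but not consecutively, and the fourth contains "O2" only inside "CO2".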