Big Data in Telco & Banking Analytics
Benjamin Sznajder
IBM Research – Haifa
Agenda
• What is Big Data, Why Now
• IBM’s approach
• Big Data in Banking industry
• A Telco scenario
Bytes and bytes
• Megabyte: 1 minute of MP3 music, 6 seconds of CD quality music
• Gigabyte: 7 minutes of HDTV video, 1 DVD = 4.7 Gigabytes
• Terabyte: The US Library of Congress = 160 Terabytes, Wikipedia = 6 Terabytes
• Petabyte: Google processes 24 Petabytes per day, Avatar used 1 Petabyte of storage
• Exabyte: All words ever spoken = 5 Exabytes, monthly internet traffic = 21 Exabytes
• Zettabyte: in 2008 Americans consumed 4 Zettabytes of data
• Yottabyte
We are in an Era of New Data Sources and New Volumes of Data
• 90% of the data in the world today has been created in the last two years
• 1.3 billion RFID tags in 2005; 30 billion RFID tags in 2010
• Google processes > 24 Petabytes of data in a single day
• Facebook processes 10 Terabytes of data every day
• Hadron Collider at CERN generates 40 Terabytes of data / sec
• For every session, NY Stock Exchange captures 1 Terabyte of trade information
• Twitter processes 7 Terabytes of data every day (250,000,000 tweets)
• 4.6 billion mobile phones worldwide
• 2 billion Internet users in 2011
• By 2013, annual internet traffic will reach 667 Exabytes
© 2013 IBM Corporation
Information is at the Center of a New Wave of Opportunity…
… And Organizations Need Deeper Insights
• 44x as much data and content over the coming decade: 800,000 petabytes in 2009, 35 zettabytes in 2020
• 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have
• 1 in 2 business leaders say they don’t have access to the information they need to do their jobs
• 83% of CIOs cited “Business intelligence and analytics” as part of their visionary plans to enhance competitiveness
• 60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions
• 80% of the world’s data is unstructured
Example: The Perception Gap Surrounding Social Media . . . .
• IBM 2010 CEO Study: 88 percent of CEOs said “getting closer to customers” was their top priority over the next 5 years and viewed social media as a core part of that strategy
• However, a March 2011 IBM study found that companies fail to understand what customers want from social advertising and outreach
[Chart: “Social media and social networking will increase customer advocacy?” Response categories: Agree, Neutral, Disagree. Values shown: 70%, 7%, 23%.]
“What Customers Want”, first in a two-part series, IBM Institute for Business Value, published March 2011
Source: “Capitalizing on complexity, Insights from the Global Chief Executive Officer Study,” IBM Institute for Business Value, 2010
The “BIG Data” Challenge / Opportunity
Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible
This data cannot be handled easily by traditional Warehouses and Databases.
Scalable, cost-effective, reliable, fault tolerant systems along with experience in Analytics make this possible
• Variety: Manage the complexity of multiple relational and non-relational data types and schemas
• Velocity: Streaming data and large volume data movement
• Volume: Scale from terabytes to zettabytes
Traditional and Big Data Approaches
Traditional Approach: Structured & Repeatable Analysis
• Business Users: Determine what question to ask
• IT: Structures the data to answer that question
• Examples: Monthly sales reports, Profitability analysis, Customer surveys
Big Data Approach: Iterative & Exploratory Analysis
• IT: Delivers a platform to enable creative discovery
• Business: Explores what questions could be asked
• Examples: Brand sentiment, Product strategy, Maximum asset utilization
Big Data in Action – Some Examples
Utilities
• Weather impact analysis on power generation
• Smart meter data analysis
E-Commerce
• Analyze internet behavior and buying patterns
• Digital asset piracy
Transportation
• Weather and traffic impact on logistics and fuel consumption
Call Centers
• Voice-to-text mining for customer behavior understanding
Financial Services
• Improved risk decisions
• Customer sentiment analysis
• AML (anti-money laundering)
Telecommunications
• Operations and failure analysis from device, sensor, and GPS inputs
Stock Market
• Impact of weather on securities prices
• Analyze market data at ultra-low latencies
Fraud Prevention
• Detecting multi-party fraud
• Real time fraud prevention
Agenda
• What is Big Data, Why Now
• IBM’s approach
• Big Data in Banking industry
• A Telco scenario
IBM Big Data Platform Strategy
Analytic Applications: BI / Reporting, Exploration / Visualization, Industry App, Predictive Analytics, Content Analytics
IBM Big Data Platform: Visualization & Discovery, Application Development, Systems Management, Accelerators, Storage System, Stream Computing, Data Warehouse, Information Integration & Governance
• Integrate and manage the full variety, velocity and volume of “Big Data”
• Apply advanced analytics to information in its native form
• Visualize all available data for ad-hoc analysis
• Development environment for building new analytic applications
• Support workload optimization and scheduling
• Provide for security and governance
• Integrate with enterprise software
. . . .
BigInsights Brings Hadoop to the Enterprise
• BigInsights = analytical platform for persistent Big Data
– Based on open source & IBM technologies
– Managed like a start-up . . . . Emphasis on deep customer engagements, product plan flexibility
• Distinguishing characteristics
– Built-in analytics . . . . Enhances business knowledge
– Enterprise software integration . . . . Complements and extends existing capabilities
– Production-ready platform with tooling for analysts, developers, and administrators . . . . Speeds time-to-value; simplifies development and maintenance
• IBM advantage
– Combination of software, hardware, services and advanced research
Visualize results through dashboards
• Built-in dashboards for monitoring system health, application status, distributed file system, etc.
• Easy to customize . . . . Add, group, or remove widgets for:
– BigSheets collections and charts
– Cluster/system monitoring
– HDFS monitoring
– MapReduce metrics
• Third party widgets or OpenSocial gadgets can be added to a dashboard
• Create new, custom dashboards to suit your needs!
Big Data Platform - Stream Computing
• Built to analyze data in motion
– Multiple concurrent input streams
– Massive scalability
• Process and analyze a variety of data
– Structured, unstructured content, video, audio
– Advanced analytic operators
How Streams Works
• Continuous ingestion
• Continuous analysis
Achieve scale:
• By partitioning applications into software components
• By distributing across stream-connected hardware hosts
Infrastructure provides services for:
• Scheduling analytics across hardware hosts
• Establishing streaming connectivity
Operators: Transform, Filter / Sample, Classify, Correlate, Annotate
Where appropriate, elements can be fused together for lower communication latency
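The operator idea above can be sketched in plain Python with generators. This is only an illustration of chaining filter, transform and classify steps over a stream of records, and of "fusing" two per-record steps into one call; it is not the Streams API. The record fields (`user`, `bytes`), the 500 KB threshold and all function names are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]

def filter_op(stream: Iterable[Record], predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Filter operator: pass through only records matching the predicate."""
    for rec in stream:
        if predicate(rec):
            yield rec

def transform_op(stream: Iterable[Record], fn: Callable[[Record], Record]) -> Iterator[Record]:
    """Transform operator: apply fn to every record."""
    for rec in stream:
        yield fn(rec)

def classify_op(stream: Iterable[Record], threshold_kb: float = 500) -> Iterator[Record]:
    """Classify operator: label each record by transferred volume (hypothetical rule)."""
    for rec in stream:
        label = "heavy" if rec["kb"] >= threshold_kb else "light"
        yield {**rec, "label": label}

def fuse(*fns: Callable[[Record], Record]) -> Callable[[Record], Record]:
    """Fuse several per-record functions into one, so no hop is needed between them."""
    def fused(rec: Record) -> Record:
        for fn in fns:
            rec = fn(rec)
        return rec
    return fused

def bytes_to_kb(rec: Record) -> Record:
    return {**rec, "kb": rec["bytes"] / 1024}

def annotate_source(rec: Record) -> Record:
    return {**rec, "source": "gateway"}

def run_pipeline(records: Iterable[Record]) -> List[Record]:
    stream = filter_op(iter(records), lambda r: r["bytes"] > 0)
    stream = transform_op(stream, fuse(bytes_to_kb, annotate_source))
    return list(classify_op(stream))

# Hypothetical sample records
sample = [
    {"user": "u1", "bytes": 1024 * 600},
    {"user": "u2", "bytes": 0},
    {"user": "u3", "bytes": 2048},
]
results = run_pipeline(sample)
```

Each generator pulls one record at a time, so ingestion and analysis proceed continuously rather than in batches.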
Agenda
• What is Big Data, Why Now
• IBM’s approach
• Big Data in Banking industry
• A Telco scenario
Top priority – give customers what they want…
• 89% of banking CEOs say that their top priority is to better:
– Understand
– Predict
– Give customers what they want…
• Banking analytics can help improve how banks segment, target, acquire or retain customers.
Importance of analytics within the banking industry
• According to Deloitte research, three business drivers increase the importance of analytics within the banking industry:
– Regulatory reform
– Customer profitability
– Operational efficiency
Fraud Analysis
• The Association of Certified Fraud Examiners’ 2010 Global Fraud Study found that the banking and financial services industry had the most cases across all industries, accounting for more than 16% of frauds.
• How can Big Data help here?
– Calculation of statistical parameters (e.g., averages, standard deviations, high/low values)
– Classification – to find patterns amongst data elements
– Joining diverse sources – to identify matching values (such as names, addresses, and account numbers) where they shouldn’t exist
– Duplicate testing – to identify duplicate transactions such as payments, claims, or expenses
– Etc…
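Three of the checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not a production fraud system: the z-score threshold of 2.0, the field names (`payee`, `amount`, `date`, `address`) and the sample data are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev

def flag_outliers(amounts, z_threshold=2.0):
    """Statistical parameters: return indexes of amounts far from the average."""
    mu = mean(amounts)
    sigma = pstdev(amounts)  # population standard deviation
    if sigma == 0:
        return []
    return [i for i, a in enumerate(amounts) if abs(a - mu) / sigma > z_threshold]

def find_duplicates(transactions):
    """Duplicate testing: same payee, amount and date seen more than once."""
    keys = Counter((t["payee"], t["amount"], t["date"]) for t in transactions)
    return [key for key, n in keys.items() if n > 1]

def shared_values(employees, vendors, field):
    """Joining sources: values that appear in both lists where they should not."""
    return sorted({e[field] for e in employees} & {v[field] for v in vendors})

# Hypothetical sample data
amounts = [100, 105, 98, 102, 97, 101, 99, 103, 5000]
transactions = [
    {"payee": "ACME", "amount": 250.0, "date": "2013-01-10"},
    {"payee": "ACME", "amount": 250.0, "date": "2013-01-10"},
    {"payee": "Globex", "amount": 75.5, "date": "2013-01-11"},
]
employees = [{"name": "A. Smith", "address": "1 Main St"}]
vendors = [{"name": "Smith Supplies", "address": "1 Main St"},
           {"name": "Globex", "address": "9 Park Ave"}]

outliers = flag_outliers(amounts)
duplicates = find_duplicates(transactions)
matches = shared_values(employees, vendors, "address")
```

On real data each check would run over far larger volumes, which is where a distributed platform becomes relevant; the logic per check stays this simple.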
Customer Analytics in Bank retailing
• Banks and credit unions are constantly at risk of losing customers or members…
• In order to stem the flow, they may offer their best customers:
– better rates
– waived annual fees
– prioritized treatment…
• This has a cost… You cannot afford to make such offers to every single customer.
• The success and feasibility of such strategies depend on identifying the right customer for the right action…
Banks realize the importance of Analytics…
Agenda
• What is Big Data, Why Now
• IBM’s approach
• Big Data in Banking industry
• A Telco scenario
Summary
Setup
• Communication Service Providers (CSPs) hold data that encompasses users’ browsing activity (on mobile phones and tablets) and mobile app usage
• This Usage Data can be leveraged to bring tremendous value in various scenarios:
1. New customer micro-segmentations and targeted proposition development
2. Creating new tiered data pricing plans based on data usage analysis
3. Creating new propensity models for churn reduction and services cross selling
4. Developing new models of targeted advertisement
Method
• Usage data is monitored through the analysis of mobile gateway logs
• Opaque network data is analyzed and mapped into a clear and well-defined taxonomy of domains
• Example domains of interest include:
– Arts/Entertainment/News_and_Media
– Reference/Maps/Google_Maps
– Society/Relationships/Dating/Speed_Dating/
– and many more
• For every domain, we monitor:
– number of times it is visited
– time spent
– application used
– amount of data transmitted, etc.
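The per-domain monitoring described above amounts to a group-by over gateway log records. The following is a minimal sketch, assuming each log record has already been mapped to a taxonomy domain; the record layout (`user`, `domain`, `seconds`, `bytes`, `app`) and the sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Set

@dataclass
class DomainStats:
    visits: int = 0                 # number of times the domain is visited
    seconds: float = 0.0            # time spent
    bytes_sent: int = 0             # amount of data transmitted
    apps: Set[str] = field(default_factory=set)  # applications used

def aggregate(log_records):
    """Accumulate the monitored metrics per (user, domain) pair."""
    stats = defaultdict(DomainStats)
    for rec in log_records:
        s = stats[(rec["user"], rec["domain"])]
        s.visits += 1
        s.seconds += rec["seconds"]
        s.bytes_sent += rec["bytes"]
        s.apps.add(rec["app"])
    return dict(stats)

# Hypothetical gateway log records, already mapped to taxonomy domains
logs = [
    {"user": "u1", "domain": "Reference/Maps/Google_Maps",
     "seconds": 30.0, "bytes": 2000, "app": "AndroidBrowser"},
    {"user": "u1", "domain": "Reference/Maps/Google_Maps",
     "seconds": 45.0, "bytes": 3500, "app": "MapsApp"},
    {"user": "u1", "domain": "Arts/Entertainment/News_and_Media",
     "seconds": 10.0, "bytes": 800, "app": "AndroidBrowser"},
]
per_domain = aggregate(logs)
maps_stats = per_domain[("u1", "Reference/Maps/Google_Maps")]
```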
Customer Micro-segmentation
• The goal: understanding trends and interests of specific user segments and developing targeted websites, content and apps e.g., sport, tourism…
• Communication Service Providers (CSP) use static, multi purpose, marketing segmentation of customers, which is not effective
• Segments are defined only once or twice and therefore cannot reflect a propensity change, or commercial intent
• Moreover, current segments are too broad, which leads to blanket actions that will not suit all customers
• By understanding how customers use their phones, we enable highly personalized marketing interactions
• We use Web browsing and application data to learn ad-hoc, data-driven micro-segments, each designed to perform well for a given action/offer.
– Web data is representative of customer tastes and interests and it is current, and up-to-date
URL Analysis - Extract Implicit User Profile
• Data Cleansing
• URL Analysis: for each user, report the most meaningful interests to describe her profile
• Large scale analysis
• Update user profiles
• Adaptive user segmentation: create new user segmentations by clustering similar interests
• Consume
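Clustering users by similar interests can be illustrated with a small sketch. The slides do not say which clustering method is used, so this example swaps in a simple greedy grouping by cosine similarity; the 0.8 threshold, the profile format (category to visit count) and the sample users are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse interest vectors (dicts)."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def greedy_segments(profiles, threshold=0.8):
    """Assign each user to the first segment whose seed profile is similar enough."""
    segments = []  # list of (seed_profile, member_ids)
    for user, profile in profiles.items():
        for seed, members in segments:
            if cosine(profile, seed) >= threshold:
                members.append(user)
                break
        else:
            # No similar segment found: this user seeds a new one
            segments.append((profile, [user]))
    return [members for _, members in segments]

# Hypothetical user profiles: category -> visit count
profiles = {
    "u1": {"Sports": 10, "News": 2},
    "u2": {"Sports": 8, "News": 1},
    "u3": {"Dating": 5, "Shopping": 4},
}
segments = greedy_segments(profiles)
```

Because segments are recomputed from current profiles, rerunning the grouping after profiles are updated yields segments that follow changes in user interests.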
How URLs are transformed into Concepts
Example inputs:
{docid: d1, wwpokec.azet.sk}
{docid: d2, http://news.yahoo.com/recall-news-215006441.htm}
{docid: d3, www.youtube.com}
Steps:
• URL Parsing (Types)
• Concepts (categories) Selection
– ODP: Business/Marketing_and_Advertising/News_and_Media
– Wikipedia: Product recalls
• Concepts Aggregation (Top-k concepts per user)
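The URL parsing and concept selection steps can be sketched as a host lookup with suffix fallback. This is an assumption about how such a mapping might work, not the method used in the project: the `CATEGORY_BY_HOST` table is a hypothetical stand-in for a taxonomy such as ODP, and its entries are illustrative only.

```python
from urllib.parse import urlparse

# Hypothetical host -> category table standing in for a real taxonomy lookup
CATEGORY_BY_HOST = {
    "news.yahoo.com": "Arts/Entertainment/News_and_Media",
    "maps.google.com": "Reference/Maps/Google_Maps",
    "youtube.com": "Arts/Video",
}

def host_of(url):
    """Extract a normalized host name from a URL with or without a scheme."""
    if "//" not in url:
        url = "//" + url  # urlparse only fills the host when "//" is present
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def concept_for(url):
    """Look up the full host first, then progressively shorter suffixes."""
    parts = host_of(url).split(".")
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in CATEGORY_BY_HOST:
            return CATEGORY_BY_HOST[candidate]
    return None  # unknown host: no concept selected

docs = {
    "d1": "wwpokec.azet.sk",
    "d2": "http://news.yahoo.com/recall-news-215006441.htm",
    "d3": "www.youtube.com",
}
concepts = {docid: concept_for(url) for docid, url in docs.items()}
```

Hosts with no entry, such as d1 here, return `None` and would need a fallback such as page-content analysis.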
ODP - Open Directory Project
• One of the largest collaborative efforts to manually annotate web pages
• More than 4 million web pages, into more than 590,000 categories (Tree-based taxonomy)
• RDF dump file is available to download
• Examples:
– Society/Relationships/Dating/
• Society/Relationships/Dating/Speed_Dating/
• Society/Relationships/Dating/Chats_and_Forums/
• ….
– Computers/Internet/On_the_Web/
• Computers/Internet/On_the_Web/Podcasts/
• Computers/Internet/On_the_Web/Web_Portals/
• Computers/Internet/On_the_Web/Message_Boards/
• ….
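Because ODP categories form a tree written as slash-separated paths, rolling a category up to a broader level is a matter of truncating the path. A minimal sketch, with function names chosen for this example:

```python
def roll_up(category, depth):
    """Truncate a category path to its first `depth` levels."""
    parts = [p for p in category.split("/") if p]
    return "/".join(parts[:depth])

def ancestors(category):
    """List every level of the path, from the top level down to the category itself."""
    parts = [p for p in category.split("/") if p]
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]

top_level = roll_up("Society/Relationships/Dating/Speed_Dating/", 1)
two_levels = roll_up("Society/Relationships/Dating/Speed_Dating/", 2)
chain = ancestors("Computers/Internet/On_the_Web/Podcasts/")
```

This works only for tree-shaped taxonomies; in a DAG a category can have several parents, so a path string no longer identifies a unique ancestor chain.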
Wikipedia Dump
• The largest dynamic, collaborative, free encyclopedia
• More than 4 million articles, and more than 900,000 categories (DAG-based taxonomy)
• Dump file is available to download
• Examples:
– http://en.wikipedia.org/wiki/Online_dating
Category:
• Online dating services -> Online dating for specific interests
• Intimate relationships -> Breastfeeding, Casual sex, Celibacy, Relationship counseling, Dating, Kissing, Marriage, ….
• Social software -> Mobile social software, Blog hosting services, Blog software, Bulletin board system software, Social networking services, ….
Example of User Profile
Userid       Category                           Agent type         Date        Count
012013a474b  Arts/Entertainment/News_and_Media  AndroidBrowser     2011-09-26  22
012013a474b  Arts/Radio/Internet/Directories    AndroidBrowser     2011-09-27  15
012013a474b  Reference/Maps/Google_Maps         BlackberryBrowser  2011-09-27  14
012013a474b  Arts/Entertainment/News_and_Media  AndroidBrowser     2011-09-27  13
• Top-4 categories for userid “012013a474b”, aggregated by Category, Agent type and Date, ranked by Count.
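A profile like the one above can be produced by counting events per (Category, Agent type, Date) and keeping the k largest counts. The sketch below assumes one event per categorized page visit; the event field names and the extra fifth row are hypothetical, added only to show the truncation to the top 4.

```python
from collections import Counter

def top_k(events, user_id, k=4):
    """Count a user's events per (category, agent, date) and return the k largest."""
    counts = Counter(
        (e["category"], e["agent"], e["date"])
        for e in events
        if e["user"] == user_id
    )
    return counts.most_common(k)

# Rows from the example profile, plus one hypothetical low-count row
rows = [
    ("Arts/Entertainment/News_and_Media", "AndroidBrowser", "2011-09-26", 22),
    ("Arts/Radio/Internet/Directories", "AndroidBrowser", "2011-09-27", 15),
    ("Reference/Maps/Google_Maps", "BlackberryBrowser", "2011-09-27", 14),
    ("Arts/Entertainment/News_and_Media", "AndroidBrowser", "2011-09-27", 13),
    ("Reference/Maps/Google_Maps", "AndroidBrowser", "2011-09-26", 3),
]
# Expand each row into individual visit events
events = [
    {"user": "012013a474b", "category": c, "agent": a, "date": d}
    for c, a, d, n in rows
    for _ in range(n)
]
profile = top_k(events, "012013a474b")
```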
Category browsing behaviour appears not to vary significantly with age
• Top Level browsing behaviour does not appear to vary widely by age group, though 25-34 year olds seem to concentrate a higher proportion of their browsing in the “top categories”
[Chart: % of Total URLs Browsed (axis 0% to 50%) by age group: All, 18-24, 25-34, 35-44, 45-54, 55. Legend: Google, Facebook, Apple and Itunes, YouTube, Vodafone, Twitter, BBC, SocialNetworking, VodafoneWap, Dating, GoogleMaps, Shopping, SecureBrowsing, News, Ebay, VideoStreaming, Wikipedia, Yahoo, Amazon, YahooMessenger, HTCWeather, News, MobileWAP]
Gender Differences in Browsing Behaviour
• Analysing only the top 100 browsing categories, it is possible to identify clear preferences by Male and Female customers
• Top ten categories remain the same for Men and Women, though the ordering varies slightly
• Those categories for which there are significant differences between men and women:
Male           Female
News & Media   Online Shopping
Sports         Health & Medicine
Football       Cinemas
Autotrader     Personal Finance
Adult Content
Mobile Gaming