Hadoop, Big Data and Big Analytics 2014
3 waves of Big Analytics
The Business Improvement Frameworks
Big Analytics Use Cases - The data, the analytics actions and benefits
Vic Winch, Director Big Data COE, Teradata International
Agenda
> Level-set
> Business Improvement Frameworks
> Use case examples
> Take home lessons
Level-set
Internetworking Computers: year zero
In 1982 the 1st internetworking project emerged that joined networks together. It could connect …. computers
Map of the TCP/IP test network in February 1982
The public internet of computers: year zero
In 1987 the 1st public Internet Service Provider launched. It could connect …. computers …. businesses …. people …. everything …
35 Years of the Internet, 1969-2004. Stamp of Azerbaijan, 2004
Now, the internet is changing EVERYTHING
Big Data: hype - or reality?
“Unprecedented data growth… that continues, regardless of budget constraints”
David Cappuccio, Research VP (Gartner)
“Big Data is bull***… it’s really just data.”
Harper Reed, CTO Obama For America
Big Data is All Data or Total Data: Traditional Data + New Sources of Data
“Don’t think about Big Data as a stand-alone, think about your core business problems and how to solve them by analyzing Big Data”
Former Head of Big Data, Facebook
New Data Available
Traditional Data Available
Good Practice 1 – Keep It
Keep all this new digital data
Challenges 1 – Keep It
The data are volatile – so keep it as is
There is a lot of data
Hadoop provides a cost effective solution
Big Data is also about Big Analytics: Traditional and New Analytics against all the data
“Big Data isn’t about technology. Big data is about business needs. The bottom line is use the right technology for whatever it is you need”
Ken Rudin, Facebook Analytics Chief
New Types of Analytics
Traditional Analytics
Good Practice 2 – Agility and Ease of Use
Establish an Exploration Approach
Win Fast, Fail Fast
Challenge 2 – Agility and Ease of Use
Wanted – tools that work within the expertise of the business user
[Diagram: PROCESS FLOW]
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
DATA: Fast Loading, Filtering and Processing, Online Archival
EXPLORATION: Data Discovery; Pattern Detection (Path, Graph, Time-series analysis); New Models and Model Factors
ACTION: Reports, Dashboards, Real-time Recommendations, Operational Insights, Rules Engines
ANALYTIC TOOLS: Math and Stats, Data Mining, Business Intelligence, Applications, Languages, Marketing
USERS: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Business Analysts
Teradata View of Big Analytics
[Diagram: UNIFIED DATA ARCHITECTURE]
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
DATA PLATFORM, DISCOVERY PLATFORM, INTEGRATED DATA WAREHOUSE
ACCESS, MOVE, MANAGE
ANALYTIC TOOLS: Math and Stats, Data Mining, Business Intelligence, Applications, Languages, Marketing
USERS: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Business Analysts
From transactions - to interactions: the three waves of Big Data
(1) People interacting with things: Analysis of clickstream data enables Amazon and eBay to achieve “mass customisation” of their web-sites.
(2) People interacting with people: Analysis of social / interaction data enables Amazon, Apple and LinkedIn to go social (“people who like what you like also like…”)
(3) Things interacting with things: Increasing instrumentation is now leading to the emergence and optimisation of “the Internet of Things”.
These trends are real and accelerating – but are they about “more”, or “different”?
BUSINESS IMPROVEMENT OPPORTUNITIES FRAMEWORK
Examples of use cases
“Blah, blah, blah – show me the money!”
19 8/15/2014 Teradata Confidential
Big Data Business Value Framework - Retail
> Fraud Prevention: Detect fraud patterns in sales channels (online, store)
> Service Efficiency: Analyze how failures in service impact loyalty of customers
Sales Impact
> Competitor Impact: Analysis of competitors to understand the impact of a store opening
> Customer Migration: Analyzing declines in customer segments over large timeframes
Customer Interactions
> Path Analysis: Web click path analysis, optimizing the web purchase
How long do customers spend in front of the shelf/display? Is the item available on the shelf, or are there any ‘out of stock’ items preventing a sale?
> Behaviour Targeting: Identify customer behavior interactions with items
> Customer Movements: Patterns in customer store movement and dwell time, linked to sales
> Search Term Analysis: Analyzing which search engines deliver most traffic
Marketing Effectiveness
Understand motivations for buying, starting with what item was placed first into the basket. Which items ‘ignited’ a transaction?
> First in Basket: Analyzing impact of first in basket on the overall basket spend
> Marketing Attribution: Analyzing the true effectiveness of marketing spend
> Event Triggered Activity: Detect behavioral triggers and send timely interventions
> Personalization: Dynamic online targeting of messages and content
Discovery Platform
> Low Cost to Store: To cater for the volume, provides low cost storage
> New Data Types: Can store and manage the new data types the EDW cannot
> Time to Analytics: Rapid analysis of the data once it's loaded, with fast processing
> Time to Data: Quick & cheap to load new data, structured & unstructured
Market Basket
> Product Affinity: How likely is it that certain products will be purchased together in the same “basket”?
Analyzing item price movement and its impact on basket size and affinity of items over a long duration (6 yrs). Data Set (6 years): Transaction Data, Price data
> Pricing Affinity: Analyzing item price movement and its impact on basket size and affinity of items
> Promotion Affinity Analysis: Did affinity between products increase during a promotion?
MultiChannel Customer
Identify customers shopping in both online and offline channels and the path they take
> Customer Segmentation: Identify new multi-facet customer behavior
> Sentiment Analysis: Social analysis linked to complaints; influencers and sales analysis
Social/Customer Interactions
> Social Media Signal: Identify how “social signals” can determine brand sentiment
> Social Segmentation: Identify purchased and non-purchased categories from social data
Product Affinity
• Situation and Challenge
> Driving own-label participation
> Huge challenge in analysing range effectiveness
> 3,800 new own-label SKUs
• Insight Gained, Understand
> Visualisation allows for discovery of new trends and associations across huge data sets (years of annual sales by category)
> Understanding common affinities in more detail: marketing activity can be implemented to cross-sell products/categories with a higher degree of success
> Easily identifiable trigger (gateway) products
• Actions Taken
> Review ranging strategy
• Business Benefits
> Detailed, fast insight into customers' shopping reaction to range/pricing/promotional strategy changes
> Gaining further insight into how best to drive promotional strategy for increased sales & profit
> Products identified that can drive cross-sales and higher value sales, with fast visualisation techniques (above)
• Typically the leaders are achieving +2% in X-Sells
NOTE – All data visualisations are for illustrative purposes only.
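As a rough illustration of what a product affinity measure can look like, the sketch below computes support and lift for product pairs. The slide does not say which measure was used; support and lift are a common choice and are an assumption here, as are the function name `pair_affinity` and the sample baskets.

```python
from collections import Counter
from itertools import combinations


def pair_affinity(baskets):
    """Return {(a, b): (support, lift)} for every product pair bought together.

    support = share of baskets containing both products
    lift    = support / (share containing a * share containing b);
              a lift above 1 suggests the pair sells together more than chance.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore repeat items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    result = {}
    for (a, b), together in pair_counts.items():
        support = together / n
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        result[(a, b)] = (support, lift)
    return result


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk", "jam"],
]
affinity = pair_affinity(baskets)
```

Pairs with high lift are candidates for the "gateway" and cross-sell products mentioned in the slide; in practice a minimum support threshold would be needed to filter out rare pairs.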
Web Clickstream Path Analysis
• Situation and Challenge
> Optimising the path to purchase
> Millions of site visitors, volatile multi-structured data
> Seeing the signal in the noise
> Simplifying the web-site for everyone
• Insight Gained, Understand
> “Path” visualisation allows the analyst to zoom and filter every session path. Example paths: Arrival > Checkout; events prior to item removal; Adwords search words > Checkout; cross-marketing activities > Checkout
> Understanding common paths to purchase and to non-purchase in more detail: marketing activity can be implemented to up-sell products/categories with a higher degree of success
• Actions Taken
> Review website design
> Up-sell opportunities
> Personalisation of landing pages
> Interactive call outs
> Business process optimisation
• Business Benefits
> Detailed, fast insight into customer clickstream behavior
> Gaining further insight into how best to drive promotional strategy for increased sales & profit
> Products identified that can drive up-sales and higher value sales
• Typically the leaders are achieving +2% up-sells
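A minimal sketch of the path idea, assuming clickstream events arrive as (visitor, timestamp, page) tuples and that a session ends after 30 minutes of inactivity. The timeout, the page names and the helper names `sessionise` and `paths_to` are illustrative assumptions, not taken from the slide.

```python
from collections import Counter, defaultdict

SESSION_GAP = 30 * 60  # assumed inactivity timeout, in seconds


def sessionise(events, gap=SESSION_GAP):
    """Split (visitor, timestamp, page) events into per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor, ts, page in sorted(events, key=lambda e: (e[0], e[1])):
        by_visitor[visitor].append((ts, page))

    sessions = []
    for visits in by_visitor.values():
        current = [visits[0][1]]
        for (prev_ts, _), (ts, page) in zip(visits, visits[1:]):
            if ts - prev_ts > gap:  # long pause: close the session
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions


def paths_to(sessions, goal="checkout"):
    """Count the page paths that lead from arrival to the goal page."""
    paths = Counter()
    for pages in sessions:
        if goal in pages:
            paths[tuple(pages[: pages.index(goal) + 1])] += 1
    return paths


# Hypothetical events, for illustration only.
events = [
    ("v1", 0, "home"), ("v1", 60, "product"), ("v1", 120, "checkout"),
    ("v2", 0, "search"), ("v2", 30, "product"), ("v2", 5000, "home"),
    ("v3", 10, "home"), ("v3", 70, "product"), ("v3", 130, "checkout"),
]
sessions = sessionise(events)
common_paths = paths_to(sessions)
```

Sessions that never reach the goal page are the "paths to non-purchase"; counting those the same way shows where visitors drop out.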
Big Data Business Value Framework - Finance
Marketing
Customer Experience
Fraud
Credit Risk
> Online fraud: Unusual usage of authenticated website based on context
> Path to Fraud: ID the detailed multichannel steps that precede fraud
> Fraud Networks: Find connections between related parties
> Claims Fraud: Identification of valid v fraudulent customer claims
> Abandon online purchase: Insight and action to drive follow up
> Mktg Attribution: ID the contribution of each contact to a sale
> Sales Process Improvement: ID and improve sales process effectiveness
> Path to Churn: ID the path leading to attrition
> Identify broken processes based on multi channel engagement
> Customer Sat/NPS: Understand the cause of dissatisfaction and loyalty
> Predict Complaint: ID root cause and identify opportunity to intervene and fix
> People Like Me: Affinity groupings refine people like me recommendations
> Pre default risk: Path to default via golden path analysis
> Connection risk: High risk associates via social or txn networks
> Collections analytics: Identify path to repay via collections
Operational (Banking)
> Reduction in manual Claims review: Increased productivity
> Automate Claims notification: Optimise handling and client satisfaction
> Advanced Risk & Pricing insights: Minimise adverse selection with Geospatial
> Behavioral-based Pricing with Telematics data
Operational (Insurance)
> Real Estate Pricing: Using new data and techniques to enhance risk-based price
> Call Centre Analytics: Adherence to core processes and service standards at busy times
> Sales Compliance & Mis-Selling: Detect key words that mislead client / Identify unusual sales behaviour
> Online T&C’s: Email follow up from opt-out or rapid T&C completion
> Service Efficiency: ID the paths leading to high cost service calls and rectify cause
Marketing Business Improvement Opportunity: Marketing Attribution
Company Goals and Challenges
• Customers receive lots of different marketing content as they think about a purchase
> Online, such as search results, display ads etc.
> Offline, such as eMails, TV ads, calls
• When a customer responds, how do you attribute the value of the sale to all the different media?
• Most companies attribute the value to the last touch, but this ignores the other media in the process
• We need to understand the contribution of all media
Outcome
• Better estimate the value of all media involved in the path to purchase
• Optimise media purchases (invest more in better performing media, invest less in lower performing media) to drive more sales at a lower cost per sale
• Leaders deliver a 4x improvement in contribution per media $
Data & Analytics
Data:
• Online interactions from all digital media (e.g. display ads)
• Offline interactions such as eMail, dmail, calls etc.
• Sales achieved by customer, plus the value of each sale
Analytics:
• Sessionise data to identify the unique sales events
• Path to identify all media seen before the purchase
• Attribution of the value of the sale across all seen media
• Associative Analysis to identify common ‘baskets of media’ viewed by customers before purchase
• Regression Analysis to identify the most effective bundles of media, and their value
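To make the contrast between last-touch and multi-touch attribution concrete, here is a small sketch. The slide does not name an attribution model; the equal-share ("linear") model, the function name `attribute` and the sample paths are assumptions chosen for illustration.

```python
from collections import defaultdict


def attribute(paths, model="linear"):
    """Split each sale's value across the media seen before the purchase.

    paths: list of (touchpoints, sale_value), touchpoints in time order.
    model: "last_touch" gives all credit to the final touch;
           "linear" (an assumed model) shares it equally across all touches.
    """
    credit = defaultdict(float)
    for touches, value in paths:
        if not touches:
            continue  # no media seen, nothing to attribute
        if model == "last_touch":
            credit[touches[-1]] += value
        elif model == "linear":
            share = value / len(touches)
            for medium in touches:
                credit[medium] += share
        else:
            raise ValueError(f"unknown model: {model}")
    return dict(credit)


# Hypothetical paths to purchase, for illustration only.
paths = [
    (["display_ad", "email", "search"], 90.0),
    (["tv_ad", "search"], 60.0),
    (["email"], 30.0),
]
last_touch = attribute(paths, "last_touch")
linear = attribute(paths, "linear")
```

With these sample paths, last-touch gives display and TV advertising no credit at all, while the linear split credits every medium on the path. That is the gap the slide describes.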
Fraud Business Improvement Opportunity: Online Fraud Example
• Situation and Challenge
> Fraud difficult to identify using traditional analytical techniques
> Fraudsters develop new techniques as old techniques are prevented
> Fraud is very costly, directly impacting the bottom line
• Insight Gained
> Able to bring together multiple data from all touchpoints and bank processes
> Identify complex patterns of behaviour leading to a fraud event
• Actions Taken
> New fraud patterns identified…
> Safeguards to reactively reduce attrition risk for individual customers
• Business Benefit
> Previously unidentified fraud patterns identified within the proof of concept
> $7m worth of fraud closed down before the money was lost
> Path continues to identify new types of fraud patterns quickly, when only a few real cases have been uncovered
Customer Attrition Use Case: Customer Example
• Situation and Challenge
> Bank with 40m active customers
> Attrition running at 10% per annum
> Replacing a lost customer costs $100 (conservative estimate)
> No systematic way of identifying why attrition occurs, or how to prevent it
• Insight Gained, Understand
> All journeys that lead to attrition
> Problem interactions commonly occurring in these journeys
> The most common journeys where the attrition risk is highest
• Actions Taken
> Strategies implemented to proactively reduce attrition (e.g. fee-reversals for valuable customers)
> Safeguards to reactively reduce attrition risk for individual customers
• Business Benefits
> Attrition reduced by c.10%
> Mitigates c.$40m in recruitment costs
> Improved customer satisfaction through proactive management of customer experience
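The c.$40m figure can be reproduced from the numbers on the slide, assuming "reduced by c.10%" means a relative reduction in the number of customers lost each year (that reading is an assumption; the variable names are illustrative).

```python
# Figures from the slide.
customers = 40_000_000        # active customers
attrition_pct = 10            # % of customers lost per annum
replacement_cost = 100        # $ to replace one lost customer

# Assumption: "attrition reduced by c.10%" is a relative reduction.
reduction_pct = 10

lost_per_year = customers * attrition_pct // 100         # customers lost each year
customers_kept = lost_per_year * reduction_pct // 100    # losses avoided each year
recruitment_saving = customers_kept * replacement_cost   # $ not spent replacing them

print(f"Lost per year:      {lost_per_year:,}")
print(f"Losses avoided:     {customers_kept:,}")
print(f"Recruitment saving: ${recruitment_saving:,}")
```

Under that reading, 4m customers are lost per year, 400k of those losses are avoided, and at $100 each the saving is $40m per year.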
Big Data Business Value Framework - CME
Emerging Big Data Business Value Framework: Social Network Analysis (not Social Analytics)
Marketing, CEM, Churn Management
SNA: Analyze transactions (voice, SMS and data) to detect networks, strength of relationships and influences, to enrich propensity models and trace the way customers influence each other's opinions and behaviors
Value
• Optimized upsell
• Optimized targeting
• Optimized retention
• New segments
Outputs (looks like)
• Segment list(s)
• Scored list(s)
• ID the central party in network for upsell
Enablers
Interest to
• CMO
• Retention function
• Upsell function
Inputs
• CDRs / XDRs
• Subscriber data / profile
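A minimal sketch of how a "central party" might be found from call records, assuming each CDR is reduced to a (caller, callee) pair. Relationship strength is taken as the number of contacts and centrality as the number of distinct contacts; both measures, the helper names and the sample records are assumptions, not the method used in the slide.

```python
from collections import defaultdict


def relationship_strength(cdrs):
    """Count contact events per unordered pair of subscribers."""
    strength = defaultdict(int)
    for caller, callee in cdrs:
        if caller != callee:
            strength[frozenset((caller, callee))] += 1
    return strength


def degree_centrality(strength):
    """Count distinct contacts per subscriber (simple degree centrality)."""
    neighbours = defaultdict(set)
    for pair in strength:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {subscriber: len(n) for subscriber, n in neighbours.items()}


# Hypothetical call records, for illustration only.
cdrs = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "D"), ("C", "D"), ("E", "A")]
strength = relationship_strength(cdrs)
centrality = degree_centrality(strength)
central_party = max(centrality, key=centrality.get)
```

The resulting scores could feed the scored lists named under Outputs; a production analysis would more likely use a graph library and weight by call duration or recency.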
Emerging Big Data Business Value Framework: Sentiment Analysis
CEM, Churn Management
Sentiment analysis: Determine positive/negative sentiments by mining text in social media, call center and CRM data
Value
• Tune offerings
• Manage issues quickly
• Test offerings
Outputs (looks like)
• KPIs
• Scores
• Feedback scores +/-
Enablers
Interest to
• CMO
• Retention function
• Product planning
• C-Level
Inputs
• Call center logs
• IVR logs
• Twitter API
• Facebook API (Operators page)
• Subscriber data / profile
Big Data Business Value Framework - Manufacturing
Design & Develop
Use & Service
Make & Deliver
Source & Procure
Big Data
• R&D data from Web
• Product testing data from test equipment
• Customer expectations & preferences data from Social Media
• Inbound parts tracking data from RFID
• Supply collaboration data from SCM apps
• Supplier base capabilities data from Web
• Usage, condition, environment data from the product sensors
• "Call Logs" data from CRM apps
• Part fit/removal data from RFID
• Machinery operating data from MES and SCADA
• Product tracking data from RFID
• Parts documentation from CAD & PDM apps
• Supplier machinery data from MES/SCADA
• Service work order reports from MRO apps
• Diagnosis from test equipment
• Demand & supply data from ERP
Train Engine Failure Sensor Data Analysis
Outlook: UDA leverages full potential for closed loop lifecycle management
Data Load/Refine
• Leverage Hadoop as low cost solution to store all (potentially relevant) data: operational data, PLM data, etc.
Discover
• Use Teradata Aster (e.g. decision tree algorithms) to identify failure patterns for prognosis, using sensor data (what), location data (where), usage data (when, with which passenger load), weather data (under which conditions) etc.
Operationalize
• Leverage EDW as productive environment
• Apply the identified prognosis algorithms and push into operative systems (e.g. Maximo to automatically schedule work jobs)
Take Action
• Identify trains with need for service before incident
• No defective trains on track
• Zero unplanned downtime
• Optimized field dispatch, spare part allocation etc.
Social Media Analytics for Automotive Quality
Business Problem/Business Objective
• Identify early warning signal of quality problem based on internal and external text data
• Identify key customer voice channel of a specific quality issue
Situation
• Difficult to analyze external web & social data and NHTSA data along with internal call center data
• Most customer complaints are emotional and unstructured text data
• Hope to understand the journey of the customer voice
Identify key channel for a specific problem to catch early warning signs
Identify key words for affinity analysis concerning negative and positive response
Analytics
• Sentiment Analysis: Classify responses into positive/negative/neutral feelings, and score the emotion on a scale of 0 to 100 depending on its intensity
• Car Type Classification: Classify into 77 types of cars
• Subject Classification: Analysis on NHTSA and Web data to automatically classify into 14 different codes of call center service
• Subjects Analysis: Analysis on words written on documents to identify correlation
• Early Warning Analysis: Score emotional signal based on control chart by car type
Business Outcomes
• Prove the viability of taking advantage of external and internal unstructured text data to create business value
• Faster response to risks or quality issues in a customized way for each car type
• Better understand customer voice channel
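The Early Warning Analysis item (a control chart by car type over a 0 to 100 emotion score) can be sketched as follows. The three-sigma upper limit, the use of the population standard deviation, the function name `early_warnings` and the sample scores are all assumptions; the slide gives no chart parameters.

```python
from statistics import mean, pstdev


def early_warnings(history, latest, sigmas=3.0):
    """Flag car types whose newest negative-emotion score breaks the upper control limit.

    history: {car_type: [past scores on the 0-100 scale]}
    latest:  {car_type: newest score}
    Returns {car_type: (newest score, upper control limit)} for flagged types.
    """
    flagged = {}
    for car_type, scores in history.items():
        if car_type not in latest:
            continue  # no new reading for this car type
        upper_limit = mean(scores) + sigmas * pstdev(scores)
        if latest[car_type] > upper_limit:
            flagged[car_type] = (latest[car_type], upper_limit)
    return flagged


# Hypothetical weekly scores, for illustration only.
history = {
    "model_a": [20, 22, 18, 20, 20],
    "model_b": [30, 35, 25, 30, 30],
}
latest = {"model_a": 40, "model_b": 33}
flagged = early_warnings(history, latest)
```

Keeping a separate limit per car type means a normally noisy model does not mask a sudden rise in complaints about a normally quiet one.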
Take home lessons
BIG DATA IS NOT A TECHNOLOGY
BIG DATA IS A MOVEMENT DEMANDING MORE ANALYTICS ON ALL DATA
By themselves, a big bit-bucket and some fancy Analytic technology add no value; start with a business problem, not with a technology (ours or anybody else’s).
Old business process
+
New technology
=
More Expensive old business process
The objective is not merely to gain insight –
The objective is to take action on insights
So that we change the way you do business.
Thank you for being here