learnings do the math - cio summits · big data’s recent enterprise popularity can be attributed...
Post on 12-Jun-2018
0Mu Sigma Confidential
Chicago, IL
Bangalore, India
www.mu-sigma.com
Proprietary Information
"This document and its attachments are confidential. Any unauthorized copying, disclosure or distribution of the material is strictly forbidden"
Do The Math
Embarking on Decision Sciences & Analytics
2013
Learnings
zubin.dowlaty@mu-sigma.com
Twitter: @crunchdata
Analytics Trends & Big Data
The Decision Sciences Journey: Key Learnings & Scale
Agenda
The information explosion is leading to an unprecedented ability to store, manage and analyze data; we need to run ahead..
Information Ecosystem
INFORMATION AGE
Digital data in 2011: ~1750 exabytes
Facebook: 100+ million users per day
Mobile phone subscriptions: 4.6 billion
15 million Bing recommendation records
RFID tags: 40 billion
Data Source Explosion
– Click-stream data
– Purchase intent data
– Social Media, Blogs
– Archives
– RFID & Geospatial Information
– Events
Expanding Applications
– Opinion Mining
– Social Graph Targeting
– Influencer Marketing
– Offer Optimization
– Clinical Data Analysis
– Fraud Detection
– SKU Rationalization
Technology Evolution
– Massively large databases
– Search Technologies
– Complex Event Processing
– Cloud Computing
– Software as a service
– Interactive Visualization
– In Memory Analytics
Advanced Analytical Techniques
– CART
– Adaptive Regression Splines
– Machine Learning
– Radial Basis Functions
– Support Vector Machines
– Neural Networks
– Geospatial Predictive Modeling
Three Key Macro Analytics Trends Targeting Consumption
Macro Trends: Agenda in All Three
Analytics Trends
Computational Enablement
– Scalable Analytics, Big Data & NoSQL technologies, GP-GPU, In-Memory & High Performance Computing
– Master Data Management, Cloud, and Federated Data – the end of the central EDW?
– Analytics as a Service and the Cloud
– Open Source Maturity
– Rise of Agile Self Service
– Data Model Disruption: ETL vs. ELT
Usability and Visualization
– Dashboards will Evolve into Cockpits
– Improved Interactive Data Visualization & Aesthetics
– Augmented / Virtual Reality / Natural User Interfaces
– Graph Analytics sparked by Social Data
– Discovery with Computational Geometry
– Mobile BI and the Integration of Consumer & Enterprise Devices
– Gamification: Do you want to play a game?
Intelligent Systems (“Anticipation Denotes Intelligence”)
– Operational Smart Systems are Back propagating – Pervasive Analytics
– Real Time Analytics and Event Streams
– Artificial Intelligence (AI) is not what we expected
– Don’t forget the 4th tier it’s Juicy
– Metaheuristics and the Ants
– Operational Analytics, You are the Exception
– Experiments in the Enterprise and the Quasi
– Just Say No to OLS
– Energy Informatics is Radiating
– Intelligent Search
“Big Data”, the term, is about applying new technologies to meet unfulfilled needs: ANALYTICS AT SCALE…
What is Big Data?
Message Passing Interface
Map Reduce
BIG Computation & Big Data Scaling Economically
Intensive Computational Analysis not Supported by the Traditional Data Warehouse Stack and Architecture
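To make the Map Reduce idea concrete, here is a minimal single-machine sketch of the pattern (map, shuffle, reduce) counting words. It is an illustration only: the function names and the sample documents are assumptions, and a real framework such as Hadoop would run the map and reduce steps in parallel across commodity machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input documents, for illustration only.
docs = ["big data at scale", "analytics at scale", "big analytics"]
counts = word_count(docs)
```

Because each map call sees one document and each reduce call sees one key, both phases can be spread over many machines, which is what lets the approach scale economically.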
Big Data’s recent enterprise popularity can be attributed to two key factors
Big Data Technology Evolution
Timeline: 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Phases: Early Research, Open source dev momentum, Initial success stories, Commercialization & Adoption
– Apache: Hadoop project
– Yahoo: 10K core cluster
– IBM Acquires
– Cloudera named most promising startup funded in 2009
– elastic map-reduce
– Teradata buys Aster (map-reduce DB)
– Google Trigger: Releases “Map Reduce” & “GFS” paper
– Google: Bigtable paper
– Lucene subproject
– HP Acquires Vertica
– EMC Acquires Greenplum
– 2012: Oracle, MSOFT, SAS
Technology Trigger
– Robust Open Source Technology for Parallel Computation on Commodity Hardware (Hadoop / NoSQL)
Frustration Trigger..
– Tyranny of the Data Model; cost and complexity of data integration; limited agility and impact; bureaucracy; difficulty and cost of scaling existing EDW technology
Faced with storing massive data sets from indexing the web, Google searches for & finds a way to store & retrieve the data on commodity hardware
Why should you care, is this phenomenon all marketing hype or business critical?
Data is an asset and growing by leaps and bounds
– Structured and unstructured (DARK MATTER)
– Within and Beyond the firewall
A competitive differentiator is the use of data to improve decision making thru analytics
– “Executives are Professional Gamblers”
– Need an Edge..
» We are in the information age…
Consumption of analytics differentiates
» Embedded analytics vs. ppt culture (deckware?)
The idea is linked to the concept of a digital age or digital revolution: a shift from the traditional industry that the industrial revolution brought through industrialization to an economy based on the manipulation of information, i.e., an information society (computation)
It’s all about analytics done at “SCALE”, enabling scale cost effectively.
Who Cares?
A Mindset for Enabling Decision Sciences At Scale
Dataset -> Toolset -> Skillset -> Mindset
Big Data – What is it?
Skillset
– Computer Scientist
» Map-Reduce, HDFS, HIVE, R, PIG, NoSQL, Unstructured Data
» Parallel computation & Stream Processing
– Data & Visual Explorer
» Data “Artisans” / Design Thinking
– Math / Stats / Data Mining / Econometrics / Machine Learning
Mindset
– Horizontal thinking
– Automation and Scale
» Higher utilization of machines for decisions
– Experimentation & Agility
– Learning vs Knowing
– Consumption Focused
– Algorithmic and Heuristic (Man & Machine)
– Intrapreneur, Curious, Persuasive, Soft Skills
The Big Data eco-system is a challenge to keep up with..
Mindset: Knowing vs. Learning..
The Big Data Ecosystem
“Formulating a right question is always hard, but with big data, it is an order of magnitude harder, because you are blazing the trail (not grazing on the green field).” Source: Gartner
Source: The Big Data Group llc
Other Big Data Examples of Monetization
Large insurance companies spend hundreds of millions a year on “trip charges” for new homeowner quotes
– Significantly Reduced spend with image analysis of roof types
Recommender algorithms implemented on a large e-commerce retail site
– Support cross-sell, new revenue stream and improved customer delight
Reduction in annual license and software fees, storage, and server expenses
Operational analytics improvements
– Improved customer experience thru scoring & tagging more rapidly
– Improved inventory management and customer satisfaction thru SKU level prediction and control
– Implemented at scale test and learn methods for store and SKU optimization
– Ad monetization, improved conversion due to accelerated elasticity estimation and segment tuning
Real time social media “intervention” generating sales leads
New revenue streams created by monetization of client’s transaction data
Data Hub / Lake / Reservoir / 360 / Fabric / Integration: Finally the EDW Lives?
Quantified Self – Get ready for a Wave of Sensor Data & Analytics..Tricorder? THINK BIGGER…..
http://en.wikipedia.org/wiki/Quantified_Self
– “a collaboration of users and tool makers who share an interest in self knowledge through self-tracking”
– The primary methodology of self-quantification is data collection, followed by visualization, cross-referencing and the discovery of correlations.
Wearable Computing
Star Trek: Computer, Communicator, Replicator…
Trend: Dashboards will Evolve to be More Actionable and Forward Looking
Dashboredom?
http://www.information-management.com/issues/20_7/dashboards_data_management_quality_business_intelligence-10019101-1.html
“As we move from the information age into the intelligence age”..
Usability and Visualization: Evolution
Trend: Interactive Visualization Improvements and the Importance of Aesthetics & Design
Advanced visualization will continue to increase in depth and relevance to broader audiences. Visionary vendors like Tableau, QlikTech, SAS JMP/Visual Analytics, Panopticon, and Spotfire (now Tibco) made their mark by providing significantly differentiated visualization capabilities compared with the trite bar and pie charts of most BI players' reporting tools.
The latest advances in UI technologies such as Microsoft’s Silverlight, Adobe Flex, R, Processing, and JavaScript/HTML5 via frameworks like Ext JS and D3 (http://d3js.org/), together with multi-touch devices, augur a revolution in state-of-the-art visualization capabilities …
http://bardoli.blogspot.com/2009/12/top-10-trends-for-2010-in-analytics.html
Movie: Avengers & Iron Man
http://selection.datavisualization.ch/
Usability and Visualization: Evolution
Analytics Trends & Big Data
The Decision Sciences Journey: Key Learnings & Scale
Agenda
Analytics Trends
Reference Design
Decision Support Stack for Operationalizing Analytics at Scale
Math + Business + Technology
Math + Technology
Technology
Dataset
++ Design
Weak Point for Enterprises Today in Scaling Decision Sciences
Innovation thrives at the intersections of disciplines
IT
Math
Business
Technology
Consulting
++ Applied Math
- - High Cost
IT
++ Business
++ Applied Math
One has to play in the continuum…
Culture Traits
– Learning over Knowing
– Agility
– Scale & Convergence
– Multi-disciplinary skills
– Innovativeness
– Cost Effectiveness
– Experimentation
The dynamic nature of the business creates the need for organizations to solve ill-defined shifting problems
Muddy Problem
Fuzzy Problem
Clear Problem
Solution
Run the business
Agility experiments
Discovery based
Experience with business
Change / Learn the business
Consultative approach
Semi-automated
Socializing with smaller sub groups
Solid understanding of what all happened to reach this stage
Need to design for scale
Tech based
Heuristic: Right brained, Creative, Judgment based, Experimental, More complex, Multi-input, multi-output
Algorithmic: Left brained, Routine, Process oriented, More defined, More mature, Amenable to automation
Mystery -> Heuristic -> Algorithmic -> Codification -> Toolification -> Systematization
Segments of Problems
Two different paradigms are emerging in analytics and decision sciences
Problem-Driven Approach
– Business Problem
– Hypothesis
– Data Available? Data Sufficient?
– N: Acquire Data
– Y: Analysis
– Insights & Recommendations
Discovery-Driven Approach
– Agenda-less Observation
– Data Exploration
– Pattern Anticipation
– Business Opportunity
– Empirical Validation
– Continuous Improvement
REQUIRE HARMONY: NOT JUST BALANCE AND NOT ONLY CONFLICT
The information age requires one to think in terms of the decision supply chain: Rinse and Repeat
Manufacturing Supply Chain: Source -> Produce -> Distribute -> Consume
Decision Sciences Supply Chain: Define Problem -> Gather Data -> Perform Analysis -> Get Results -> Generate Insights -> Give Recommendations -> Communicate Action -> Perform Action -> Measure Success
Example: Pipeline Analytics
…and the ability to quickly translate and consume decision sciences to drive strategic imperatives
Align Incentives
Creation
Creation of insights using data and application of Business, Math and Technology to solve business problems
Translation of business problems to analytics problems and translation back from analytics solutions to business solutions
Evangelization and consumption of insights and recommendations to make, monitor and measure decisions
Translation & consumption of decision sciences will be core to providing competitive advantage
Institutionalization of Decision Sciences
Decision Sciences is NOT an Evolution from BI & Analytics
Vertical axis: Need for Creativity (Low to High)
Horizontal axis: Need for Rigor & Process Discipline (Low to High)
Data Management
Type I
Reporting & MIS
Type II: Descriptive
Data Analysis
Type III: Inquisitive
Modeling & Forecasting
Type IV: Predictive
Insight Generation
Type V: Prescriptive
It is a SYMPHONY of multiple notes working together
Descriptive Analytics (D)
Inquisitive Analytics (I)
Predictive Analytics (P)
Prescriptive Analytics (P)
We are in the ‘Information Age’ and it's time that data and human analytics capabilities are augmented with machine intelligence: Super Man vs. Iron Man vs. Robocop..
Intelligent Systems
Intelligent Systems: Next Wave of Innovation and Change (Iron Man Suit)
Data Mining Finally Meets the Operation.. Powered by Big Data Scale & Value
Convergence of the global industrial system with the power of advanced computing, analytics, low-cost sensing and new levels of connectivity permitted by the Internet
* Source: GE (Industrial Internet – Pushing the Boundaries of Minds and Machines)
Intelligent Machines
Advanced Analytics
People At Work
The current business scenario demands automating and injecting intelligence into various enterprise touch points
Business Situation
– Touch points: Yearly Bonus, Search (Buy laptop), Webpage, Popup, Add to Cart, Checkout
– Let me buy a new laptop
– Hey! I have some good deals being provided here
– Wow! Great offers .. Since I have some more money, let me have a look
– Yeah.. I think I need these headphones too
– Yippee! 5% discount ..
Behind the Scenes
– Marketing Mix, CRM, Revenue Management
– Customer Segmentation, Propensity Score
– Attribution Methods to Optimize Media Spend
– Pricing Elasticity
– Taxonomy, Associations, Nearest Neighbor
– Optimization, Need State Personalization
– Collaborative Filtering, Recommender Systems
– Offer Optimization
Add Intelligent Analytics Everywhere You Can to Unlock The Value In Information
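As one illustration of the "behind the scenes" techniques, the sketch below shows a very small item-based collaborative filter of the kind that could suggest headphones to a laptop buyer. The baskets, customer names, and function names are all hypothetical; a production recommender would use far more data and a tuned similarity measure.

```python
import math
from collections import defaultdict

# Hypothetical purchase baskets: customer -> set of items bought.
baskets = {
    "ann": {"laptop", "headphones", "mouse"},
    "bob": {"laptop", "headphones"},
    "cat": {"laptop", "mouse"},
    "dan": {"tablet", "headphones"},
}


def item_similarity(baskets):
    """Cosine similarity between items over binary purchase vectors."""
    buyers = defaultdict(set)
    for customer, items in baskets.items():
        for item in items:
            buyers[item].add(customer)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a != b:
                overlap = len(buyers[a] & buyers[b])
                sims[a][b] = overlap / math.sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def recommend(cart, sims, top_n=2):
    """Rank items not in the cart by summed similarity to the cart items."""
    scores = defaultdict(float)
    for item in cart:
        for other, sim in sims.get(item, {}).items():
            if other not in cart:
                scores[other] += sim
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


sims = item_similarity(baskets)
suggestions = recommend({"laptop"}, sims)
```

With these toy baskets the mouse ranks ahead of the headphones for a laptop-only cart, because a larger share of mouse buyers also bought a laptop.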
Current News on Intelligent Systems
CEP Vendors Acquired June 2013: Streambase (Tibco) and Progress Apama (Software AG)
USA Today, September 16, 2012
– http://www.usatoday.com/money/business/story/2012/09/16/review-automate-this-probes-math-in-global-trading/57782600/1
Two Second Advantage: Published 2011
– Wayne Gretzky: Edge: He predicted a little better than anyone else, where the hockey puck would be.
– Enterprise 2.0: Every transaction became a bit of digital data
– Enterprise 3.0: Every EVENT can become a bit of digital data
– There will always be value in making projections, days, months, years, predictive analytics, data mining, and analytics systems on historical data – batch systems
– But in today’s world, enterprises need something more, instantaneous, proactive, predictive capability – real time systems
GE Mind and Machines: November 2012
» Industrial Internet, Intelligent Systems, and Intelligent Machines
» http://www.ge.com/docs/chapters/Industrial_Internet.pdf
» http://www.youtube.com/watch?v=SvI3Pmv-DhE
Typical Analytics Workflow
Effort split: 15% of the process, 70% of the process, 15%
Tools: Teradata, SAS, VBA Excel, Excel, CPLEX
(Workflow diagram with steps numbered 1 to 11)
– Time Period Selection
– Sister Store Selection
– Processing of codes
– Basic data pull
– Email Request: Client – Analytics
– Report
– Run the optimization process
– Set inputs in the code
– SAS: Processing of codes
– Optmodel optimization
– VBA Excel: Output as excels
– Client Report Generation (Excel)
– CPLEX optimization (SAS)
Example of Analytics Batch Process Mgmt
Success of an Analytics Shop leads to the Golden “Handcuffs” that must be broken to scale..
Business Situation
Utilizing Analytics Process Orchestration Can Help Organizations Tackle These Specific Problem Areas at Scale
– Creation and consumption challenges due to a lack of seamless technology integration
– Scaling issues due to lack of a process metaphor
– Insufficient documentation by developers leading to knowledge management challenges
– Lack of automation for hidden decisions
– Analysts unable to spend time with high value analytics due to operational issues (data pull, technology integration)
– KPIs to be monitored over time; difficult to detect exceptions automatically
Utilize “Analytics BPM”: Workflow and Business Process Management Platform to Manage Man & Machine
Batch Analytics Automation
Performance Gains Across Client Engagements
Industry: Retail, Technology, Healthcare
Engagement
– Automates reporting framework
– Optimizes search engine marketing spends
– Helps create marketing mix solution
Business Impact
– Improvement in analyst productivity
– Increased collaboration and utilization of better process management strategies
– Improved model granularity
– Flexible bidding process
– Increased click through rates by 8%
– Easier tool deployment
– Reduced development time by 15%
– Support creation of additional reports rapidly
Analytics BPM allows one to build human and machine workflows and embed them within your enterprise.
• Nexus BPM v1.5 (open source LGPL; based on JBoss jBPM)
• muFLOW v2.1 (based on Activiti.org), a Mu Sigma product
Key Features
– A modeling environment that “glues” together reusable components from multiple technologies and enables automation
– Extensible and Self Documenting
– Ability to create rich analytics vehicles that can be customized by use case
– Provision to leverage past knowledge
– Facilitates efficient process management such as modeling, automation, monitoring, and analysis
Analytics BPM
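The "glue reusable components into a workflow" idea can be sketched in a few lines. This is not the muFLOW or Nexus BPM API; the Workflow class, the step functions, and the sample figures are hypothetical, and the sketch only shows the shape of a component model with a self-documenting run log.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Workflow:
    """An ordered list of named steps; each step takes and returns a dict."""
    steps: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def add(self, name: str, func: Callable[[dict], dict]) -> "Workflow":
        self.steps.append((name, func))
        return self

    def run(self, state: dict) -> dict:
        for name, func in self.steps:
            state = func(state)
            self.log.append(name)  # the run log doubles as documentation
        return state


def pull_data(state):
    # Stand-in for a database pull; the values are made up.
    return {**state, "sales": [120, 95, 130]}


def score(state):
    return {**state, "average": sum(state["sales"]) / len(state["sales"])}


def needs_review(state):
    # Route to a human reviewer when the average misses the target.
    return {**state, "review": state["average"] < state["target"]}


flow = (
    Workflow()
    .add("pull_data", pull_data)
    .add("score", score)
    .add("needs_review", needs_review)
)
result = flow.run({"target": 110})
```

Each step is an ordinary function, so a component written for one engagement can be reused in another workflow without change.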
When should an organization think about Analytics BPM?
muFlow™ - Objective
Visual Metaphor + Component Model = Extensible & Agile
Analytics BPM
We were having trouble integrating different technology solutions within our organization. We had automated individual technologies to a significant extent, but we were facing roadblocks connecting certain disparate application systems with each other. After some research, we moved our control layer to a BPM and everything has been considerably smooth ever since.
– a banking client
Create intelligent business workflows; linking People, Systems, Information and Analytics on one platform
Example: Propensity to Click
Trend: Operational Analytics, You are the Exception
Intelligent Systems
Source: Tibco
Tip: Utilize Three Interacting Modules to Create an Intelligent Exception System
– Signal Generator
– Filter / Rules
– Root Cause Learner
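A minimal sketch of how the three modules might interact is shown below. Everything here is an assumption for illustration: the trailing-window z-score rule, the "planned_promo" tag, the sample sales series, and the class and function names are not taken from the slide or from any vendor product.

```python
from collections import Counter
from statistics import mean, pstdev


def generate_signals(series, window=5, threshold=3.0):
    """Signal generator: flag points far from the trailing-window mean."""
    signals = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        if sigma > 0 and abs(series[i] - mu) / sigma >= threshold:
            signals.append(i)
    return signals


def apply_rules(signals, context, suppressed_tags=frozenset({"planned_promo"})):
    """Filter / rules: drop signals already explained by a known event."""
    return [i for i in signals if context.get(i) not in suppressed_tags]


class RootCauseLearner:
    """Root cause learner: tally causes attached to confirmed exceptions."""

    def __init__(self):
        self.causes = Counter()

    def record(self, cause):
        self.causes[cause] += 1

    def most_likely(self):
        return self.causes.most_common(1)[0][0] if self.causes else None


# Hypothetical daily sales with a promotion spike and a later collapse.
daily_sales = [100, 102, 98, 101, 99, 100, 160, 101, 100, 99, 102, 98, 100, 40]
context = {6: "planned_promo"}  # index -> known event tag

raw = generate_signals(daily_sales)
exceptions = apply_rules(raw, context)

learner = RootCauseLearner()
learner.record("stockout")  # analyst feedback on the remaining exception
```

The generator is deliberately noisy, the rules remove what the business already knows about, and the learner accumulates feedback so that future exceptions can arrive with a suggested cause.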
Future is Now: “Top Shops” & the Realization of Intelligent Systems: The Algorithm Becomes a First Class Citizen..
Source: SAS Model Manager
Intelligent Systems
Analytic Lifecycle Management
• “Software” Parallels
• Dev, Test, Stage, Production, & Versioning
• Algorithms are first class
• Re-Use
• API Economy
Continuous Analytics performance management
• Champion/Challenger
• Live Update and Optimization & Simulation
Model Management
Algo Wars
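The Champion/Challenger idea can be sketched as a simple comparison on holdout data. This is an illustrative sketch, not how SAS Model Manager works: the two pricing models, the holdout records, the error metric, and the 5% promotion margin are all assumptions.

```python
def mean_abs_error(model, records):
    """Average absolute error of a model over (input, actual) records."""
    return sum(abs(model(x) - y) for x, y in records) / len(records)


def champion_challenger(champion, challenger, records, min_gain=0.05):
    """Promote the challenger only if its error is lower by min_gain (relative)."""
    champ_err = mean_abs_error(champion, records)
    chall_err = mean_abs_error(challenger, records)
    if chall_err < champ_err * (1 - min_gain):
        return "challenger", chall_err
    return "champion", champ_err


# Hypothetical models: each maps a price to predicted units sold.
def champion(price):
    return 100 - 2.0 * price


def challenger(price):
    return 105 - 2.5 * price


# Hypothetical holdout data: (price, actual units sold).
holdout = [(10, 80), (20, 56), (30, 31)]
winner, error = champion_challenger(champion, challenger, holdout)
```

Requiring a minimum gain before promotion keeps the production model from flipping on noise, which matters once models are versioned and deployed like software.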
Questions?