big data, physics, and the industrial internet: how modeling & analytics are making the world...
TRANSCRIPT
Imagination at work.
Matt Denesuk, Chief Data Science Officer, GE Software, October 2014
Big Data, Physics, and the Industrial Internet: How Modeling & Analytics are Making the World Work Better.
© General Electric Company, 2014. All Rights Reserved.
Contact: [email protected]
What’s this all about? Industries that are all about data & IT see outsized productivity & performance gains
• Telecom, financial services, …
Making industrials all about data & IT will transform how the world works
• Power, water, aviation, rail, mining, oil & gas, manufacturing, …
And Big Data + Physics is the enabler
Entertainment digitized
What happened when 1B people became connected?
Social marketing emerged
Communications mobilized
IT architecture virtualized
Retail & ad transformed
Consumer Internet
Industrial Internet
Brilliant Power
Brilliant Factory
Logistics Optimization
Factory Optimization
Smart Grid
Hospital Optimization
Real-time Network Planning
Intelligent Medical Devices
Connected Machines
Brilliant Hospital
Brilliant Rail Yard
Now what happens when 50B Machines become connected?
Employees increase productivity
OT is virtualized
Analytics become predictive
Machines are self-healing & automated
Monitoring and maintenance are mobilized
Shipment Visibility
Cornerstone of IoT Transformation is Software-Defined Machines (SDMs)
• Easily connect machines to Internet
• Embed apps and analytics into machines and cloud, making them intelligent and self-aware
• Change and update capabilities of machines and devices without changing hardware
• Deliver intelligence to users, providing continuously better outcomes
• Extend Industrial Internet platform via API and ecosystem
CONSUMER COMMERCIAL & INDUSTRIAL
GESoftware.com | @GESoftware | #IndustrialInternet
The Value to Customers is Huge
Efficiency and cost savings, new customer services, risk avoidance: a 1% improvement cuts $276B in waste across industries
Industry | Segment | Type of savings | Estimated value over 15 years
Aviation | Commercial | 1% fuel savings | $30B
Power | Gas-fired generation | 1% fuel savings | $66B
Healthcare | System-wide | 1% reduction in system inefficiency | $63B
Rail | Freight | 1% reduction in system inefficiency | $27B
Oil and Gas | Exploration and development | 1% reduction in capital expenditures | $90B
Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors. Source: GE estimates
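The headline figure can be checked by adding up the five illustrative estimates. This is only a minimal sketch: the dictionary name is made up for illustration, and the numbers are the slide's own 15-year estimates in billions of US dollars.

```python
# Estimated 15-year value of a 1% improvement, in billions of USD (GE estimates from the slide).
savings_usd_billions = {
    "Aviation": 30,
    "Power": 66,
    "Healthcare": 63,
    "Rail": 27,
    "Oil and Gas": 90,
}

# Sum the per-industry estimates to recover the headline number.
total = sum(savings_usd_billions.values())
print(f"Total across industries: ${total}B")
```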
Forces shaping the Industrial Internet
1 Internet of things: A living network of machines, data, and people
2 Intelligent, SW-defined machines: Increasing system intelligence through embedded software
3 Big Data & Analytics: Transforming massive amounts of data into intelligence, generating data-driven insights, and enhancing asset performance
4 Physics + Big Data: Employing deep physics & engineering models to leap-frog what’s possible with data-driven techniques
Reference Architecture: Platform for the Industrial Internet must bridge OT & IT
PaaS SaaS
Industrial Data Lake
Industrial Big Data Management Event Processing
Business Process Management
Single Record of Asset
Analytics & Modeling
Integration with ERP / CRM
Device mgmt. M2M, M2H, M2C
Insight to Action • Maintenance • SW Upgrades • Machine Control
Mobility and Collaboration
Cyber-Security & Operational Reliability
Any Machine
Any Device
Two ways of seeing a data set* (and the world)
Computer Scientist: “get the knowledge locked in the data”
• The data set is a record of everything that happened, e.g.,
– All customer transactions last month
– All friendship links between members of a social networking site
• Goal is to find interesting patterns, rules, and/or associations.
Physical Scientist: “get the knowledge”
• The data set is a partial, and often very noisy, reflection of some underlying phenomenon, e.g.,
– Emission spectra from stars
– Battery voltage varying with current, time, and temperature
• Goal is better understanding or ability to predict aspects of that phenomenon, often through a mathematical model
For certain kinds of problems, immense power in the combination
(*See D. Lambert, or R. Mahoney, e.g.)
Example: Statistical Translation
Regular Science approach
• Employ language experts to codify rules, exceptions, vocabulary mappings, etc.
• Apply transformation to user’s query.
Use of language is infinitely complex, but you can teach a computer all the rules and content.
• Costly, hard to scale
• Can translate nearly any statement (but accuracy variable)
• In theory, could be better than human.
Too expensive and difficult to deploy comprehensively
Statistical (data-driven) approach
• Gather and classify lots of translated docs (websites, UN, books, …)
• Identify & match patterns
• Map to user’s translation query.
People say the same kind of things over and over. And somebody has already translated it.
• Incrementally low cost, highly scalable.
• Limited in scope to digitized docs that have been translated before
• Limited by skill of human translators
Will flop with innovative use of language (new poetry, …)
Three basic components of Industrial Data Science
Physics/engineering-based models
• Need much less data
• Powerful, but difficult to maintain and scale
Empirical, heuristic rules & insights
• Straightforward to understand
• Captures accumulated knowledge of your experts
Data-driven techniques: machine learning, statistics, optimization, advanced visualization, …
• Often not enough data in the industrial domain
• Bias: limited to regions of parameter space traversed in normal operation
• But easiest to maintain and scale
Industrial Example: improving rule-based systems
Many equipment operators have a system something like this, with rules derived from experience and intuition.
Rule sets implemented in Analytics Engine produce alerts
Low-latency operational data
Alerts
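A baseline system of this kind can be sketched in a few lines. Everything in the sketch is an assumption made for illustration: the sensor names, the thresholds, and the `Rule` and `run_rules` helpers are hypothetical and do not describe any particular analytics engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# One low-latency sample, e.g. {"temp_c": 91.0, "vibration_mm_s": 4.2}
Reading = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Reading], bool]  # returns True when the rule should fire


def run_rules(readings: Iterable[Reading], rules: List[Rule]) -> List[dict]:
    """Evaluate every rule against every reading and collect the alerts."""
    alerts = []
    for index, reading in enumerate(readings):
        for rule in rules:
            if rule.condition(reading):
                alerts.append({"sample": index, "rule": rule.name, "reading": reading})
    return alerts


# Hypothetical expert rules with hand-picked thresholds.
RULES = [
    Rule("high_temperature", lambda r: r["temp_c"] > 90.0),
    Rule("high_vibration", lambda r: r["vibration_mm_s"] > 7.0),
]

# Made-up operational data standing in for a live stream.
stream = [
    {"temp_c": 72.0, "vibration_mm_s": 3.1},
    {"temp_c": 93.5, "vibration_mm_s": 3.4},
    {"temp_c": 88.0, "vibration_mm_s": 8.2},
]

alerts = run_rules(stream, RULES)
for alert in alerts:
    print(alert["sample"], alert["rule"])
```

The thresholds here encode experience and intuition only; nothing in this loop checks whether an alert was followed by an actual problem.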
Industrial Example: improving rule-based systems
Rule sets implemented in Analytics Engine produce alerts
Low-latency operational data
Pattern, sequence, association mining, etc.
Outcome data
Combine ML plus rule-based alerts with outcome data to produce better alerts
More actionable alerts
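One simple way to bring outcome data into the loop is to score each rule's past alerts against what actually happened, then pass through only alerts from rules that have proven reliable. The sketch below uses per-rule precision as a deliberately simple stand-in for the pattern, sequence, and association mining named on the slide. The alert history, the rule names, and the 0.5 cut-off are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rule_precision(history: List[Tuple[str, bool]]) -> Dict[str, float]:
    """history holds (rule_name, failure_followed) pairs taken from past alerts."""
    fired = defaultdict(int)
    confirmed = defaultdict(int)
    for rule_name, failure_followed in history:
        fired[rule_name] += 1
        if failure_followed:
            confirmed[rule_name] += 1
    return {name: confirmed[name] / fired[name] for name in fired}


def actionable(alerts: List[str], precision: Dict[str, float],
               min_precision: float = 0.5) -> List[str]:
    """Keep alerts from rules at or above the cut-off, most reliable first."""
    kept = [a for a in alerts if precision.get(a, 0.0) >= min_precision]
    return sorted(kept, key=lambda a: precision[a], reverse=True)


# Illustrative outcome data: did a failure follow each past alert?
history = [
    ("high_temperature", True), ("high_temperature", True),
    ("high_temperature", False), ("high_temperature", True),
    ("high_vibration", False), ("high_vibration", False),
    ("high_vibration", True), ("high_vibration", False),
    ("low_oil_pressure", True), ("low_oil_pressure", True),
]

precision = rule_precision(history)
new_alerts = ["high_vibration", "high_temperature", "low_oil_pressure"]
ranked = actionable(new_alerts, precision)
print(precision)
print(ranked)
```

A production system would more likely learn from sequences and combinations of alerts, but even this ranking shows how outcome data turns a flat alert list into a prioritized one.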
Industrial Example: improving rule-based systems
Rule sets implemented in Analytics Engine
Low-latency operational data
Outcome data
Use ML and outcome data to refine and extend rule base, providing yet further actionability, resulting in substantial improvements in operational outcomes.
Recommendation engine
Tune parameters of existing rules, and create new rules.
Actionable Recommendations
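Tuning the parameters of an existing rule can be illustrated with a single threshold rule. The sketch below assumes a rule of the form "value > threshold", made-up temperature samples with recorded outcomes, and F1 score as the selection criterion. None of these choices come from the slides; a real recommendation engine would use richer models and criteria.

```python
from typing import List, Tuple

Sample = Tuple[float, bool]  # (sensor value, did a failure follow?)


def f1_at_threshold(samples: List[Sample], threshold: float) -> float:
    """F1 score of the rule 'value > threshold' against recorded outcomes."""
    tp = sum(1 for value, failed in samples if value > threshold and failed)
    fp = sum(1 for value, failed in samples if value > threshold and not failed)
    fn = sum(1 for value, failed in samples if value <= threshold and failed)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(samples: List[Sample], candidates: List[float]) -> float:
    """Return the candidate threshold with the best F1 (first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(samples, t))


# Illustrative data: bearing temperature in degrees C and the recorded outcome.
samples = [
    (70.0, False), (75.0, False), (80.0, False), (82.0, True),
    (85.0, False), (88.0, True), (91.0, True), (95.0, True),
]

current = 90.0  # hypothetical threshold chosen by experts
candidates = [75.0, 80.0, 85.0, 90.0]
best = tune_threshold(samples, candidates)

print(f"current={current} F1={f1_at_threshold(samples, current):.2f}")
print(f"tuned={best} F1={f1_at_threshold(samples, best):.2f}")
```

The same loop could propose new rules by searching over sensors as well as thresholds, at the cost of needing more outcome data to avoid overfitting.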
Another Industrial Example: use advanced physical models to create new features for ML approaches
Sensor Data
Predicted Values and Δs
Variety of Machine Learning Techniques
Outcome data
Using as ML features the:
1. Deviations from expected physics, &
2. Inferred or hidden parameter estimates
provides much richer and effectively less noisy data, resulting in much stronger predictions and models.
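The two kinds of feature can be sketched with a toy physical model. The sketch assumes a battery whose terminal voltage drops linearly with load current (V = V_oc − I·R). The nominal constants, the synthetic readings, and the function names are illustrative assumptions, not a model taken from the slides.

```python
from typing import Dict, List

V_OPEN_CIRCUIT = 12.6  # volts, assumed nominal open-circuit voltage
R_NOMINAL = 0.05       # ohms, assumed nominal internal resistance


def expected_voltage(current_a: float) -> float:
    """Assumed physics: terminal voltage drops linearly with load current."""
    return V_OPEN_CIRCUIT - R_NOMINAL * current_a


def estimate_resistance(currents: List[float], voltages: List[float]) -> float:
    """Least-squares slope of voltage vs. current; resistance is minus the slope."""
    n = len(currents)
    mean_i = sum(currents) / n
    mean_v = sum(voltages) / n
    cov = sum((i - mean_i) * (v - mean_v) for i, v in zip(currents, voltages))
    var = sum((i - mean_i) ** 2 for i in currents)
    return -cov / var


def physics_features(currents: List[float], voltages: List[float]) -> Dict[str, float]:
    """Build ML features from deviations (deltas) and an inferred hidden parameter."""
    deltas = [v - expected_voltage(i) for i, v in zip(currents, voltages)]
    return {
        "mean_delta_v": sum(deltas) / len(deltas),
        "max_abs_delta_v": max(abs(d) for d in deltas),
        "estimated_resistance_ohm": estimate_resistance(currents, voltages),
    }


currents = [0.0, 10.0, 20.0, 30.0, 40.0]
# Synthetic readings from a degraded unit whose resistance has risen to 0.08 ohm.
voltages = [12.6 - 0.08 * i for i in currents]

features = physics_features(currents, voltages)
print(features)
```

Feeding features like these to a learner, alongside outcome data, gives it a direct view of how far the equipment has drifted from expected behavior, which raw voltage readings alone would only show indirectly.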
Climbing up the value chain toward Condition-based Performance Management and Business Optimization.
Fix it when it breaks
Predictive Maintenance (“future”)
Prescriptive recommendations (multi-channel)
Fleet/operation-wide optimization levels. Trade-offs to optimize business performance
Condition-based Maintenance (“now”)
Model-driven, Work-driven, Time-driven
Need:
• Earlier detection
• Root cause
• Scaling to more equipment types & instances
New levers for optimization across the operation or business
“Equipment health is not a given, but a variable”
Capability / Impact Ramp
Data completeness, breadth, quality
Data Science Complexity
Basic Reporting
Advanced Reporting
Anomaly Detection
Rules augmentation
Predictive analytics
Prescriptive analytics
Operational optimization
Alerts
Highly-actionable management info
High-value guidance
Sophisticated, optimized management of business operations
Broad range of deep Data Science capabilities needed
Industrial Data Science
• Optimization & Management Science: Optimizes the design & operations of complex business and physical systems, extracting more value at lower risk
• Applied Statistics: Innovates new ways of performing reliability analysis, statistical modeling of large data, biomarker discovery and financial risk management
• Computer Vision: Focuses on developing algorithms and systems for real-time video analysis
• Image Analytics: Research in algorithms and software systems that analyze & understand images to produce actionable insights
• Machine Learning: Develop scalable and cross-disciplinary machine learning & predictive capabilities to derive actionable insights from big data
• Sensor & Signal Analytics: Modeling complex system and noise processes to detect subtle deviations and estimate critical system parameters
• Physics & expert-based Modeling: Employing deep physical and engineering understanding of equipment and processes to generate normative models.
• Knowledge Discovery: Delivering data and knowledge-driven decision support via semantic technologies and big data systems research
“Industrial Data Science”
What is it?
① Outcome-oriented application of mathematical & physics-based analysis & models to real-world problems in industrial operations.
② Tools & processes needed to do that continually & at scale.
Why do we do it?
Improve the performance of industrial operations, e.g.,
• Higher equipment uptime, utilization
• Lower maintenance/shop costs, longer component life
• Fleet-level optimization & trade-offs
• Business optimization (linking to financial & customer data)
• Service / contract management
What’s needed
Combination of:
• Physical & expert modeling experience & depth
• Installed base of industrial equipment and data
• Big Data, Machine Learning, and statistical capabilities