big data, physics, and the industrial internet: how modeling & analytics are making the world...


Upload: mattdenesuk

Post on 20-Aug-2015


Imagination at work.

Matt Denesuk Chief Data Science Officer GE Software October 2014

Big Data, Physics, and the Industrial Internet: How Modeling & Analytics are Making the World Work Better.

© General Electric Company, 2014. All Rights Reserved.

Contact: [email protected]

What’s this all about?

Industries that are all about data & IT see outsized productivity & performance gains

•  Telecom, financial services, …

Making industrials all about data & IT will transform how the world works

•  Power, water, aviation, rail, mining, oil & gas, manufacturing, …

And Big Data + Physics is the enabler

What happened when 1B people became connected?

Consumer Internet

•  Entertainment digitized
•  Social marketing emerged
•  Communications mobilized
•  IT architecture virtualized
•  Retail & ad transformed

Now what happens when 50B machines become connected?

Industrial Internet

•  Employees increase productivity
•  OT is virtualized
•  Analytics become predictive
•  Machines are self-healing & automated
•  Monitoring and maintenance are mobilized

Brilliant Power

Brilliant Factory

Brilliant Hospital

Brilliant Rail Yard

Logistics Optimization

Factory Optimization

Hospital Optimization

Smart Grid

Real-time Network Planning

Intelligent Medical Devices

Connected Machines

Shipment Visibility

Cornerstone of IoT Transformation is Software-Defined Machines (SDMs)

•  Easily connect machines to the Internet
•  Embed apps and analytics into machines and cloud, making them intelligent and self-aware
•  Change and update capabilities of machines and devices without changing hardware
•  Deliver intelligence to users, providing continuously better outcomes
•  Extend Industrial Internet platform via API and ecosystem

CONSUMER COMMERCIAL & INDUSTRIAL

Example: Wind Farm in Analytics Age

(40 TB/yr per 500 MW farm)
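For a sense of scale, the 40 TB/yr figure can be converted into an average data rate. The sketch below is only a back-of-the-envelope check; it assumes decimal terabytes (10^12 bytes), a 365-day year, and that the farm size on the slide is 500 MW. None of these assumptions are stated in the deck.

```python
# Back-of-the-envelope conversion of the slide's 40 TB/yr figure.
# Assumptions (not from the slide): decimal units, 365-day year, 500 MW farm.
TB = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600

data_per_year_bytes = 40 * TB
farm_capacity_mw = 500

# Average sustained data rate for the whole farm, in bytes per second.
avg_rate_bytes_per_s = data_per_year_bytes / SECONDS_PER_YEAR

# Yearly data volume per megawatt of capacity, in gigabytes.
per_mw_gb_per_year = data_per_year_bytes / farm_capacity_mw / 10**9

print(f"average rate: {avg_rate_bytes_per_s / 1e6:.2f} MB/s")
print(f"per MW: {per_mw_gb_per_year:.0f} GB/yr")
```

Under these assumptions the farm produces a little over 1 MB/s on average, which is modest as a stream but large once accumulated and retained across a fleet.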

GESoftware.com | @GESoftware | #IndustrialInternet

The Value to Customers is Huge

Efficiency and cost savings, new customer services, risk avoidance: a 1% improvement cuts $276B in waste across industries

Industry      Segment                       Type of savings                        Estimated value over 15 years
Aviation      Commercial                    1% fuel savings                        $30B
Power         Gas-fired generation          1% fuel savings                        $66B
Healthcare    System-wide                   1% reduction in system inefficiency    $63B
Rail          Freight                       1% reduction in system inefficiency    $27B
Oil and Gas   Exploration and development   1% reduction in capital expenditures   $90B

Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors. Source: GE estimates
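The $276B headline appears to be the sum of the five estimated values listed on this slide. A quick check, using only the numbers shown:

```python
# Estimated 15-year values from the slide, in billions of US dollars.
savings_billion_usd = [66, 30, 63, 27, 90]

total = sum(savings_billion_usd)
print(f"total: ${total}B")  # total: $276B
```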


Forces shaping the Industrial Internet

1. Internet of things: A living network of machines, data, and people
2. Intelligent, SW-defined machines: Increasing system intelligence through embedded software
3. Big Data & Analytics: Transforming massive amounts of data into intelligence, generating data-driven insights, and enhancing asset performance
4. Physics + Big Data: Employing deep physics & engineering models to leap-frog what’s possible with data-driven techniques

Reference Architecture: Platform for the Industrial Internet must bridge OT & IT

PaaS SaaS

Industrial Data Lake

Industrial Big Data Management Event Processing

Business Process Management

Single Record of Asset

Analytics & Modeling

Integration with ERP / CRM

Device mgmt. M2M, M2H, M2C

Insight to Action
•  Maintenance
•  SW Upgrades
•  Machine Control

Mobility and Collaboration

Cyber-Security & Operational Reliability

Any Machine

Any Device

What do we need from Data Science?


Two ways of seeing a data set* (and the world)

Computer Scientist: “get the knowledge locked in the data”

•  The data set is a record of everything that happened, e.g.,
–  All customer transactions last month
–  All friendship links between members of a social networking site
•  Goal is to find interesting patterns, rules, and/or associations.

Physical Scientist: “get the knowledge”

•  The data set is a partial, and often very noisy, reflection of some underlying phenomenon, e.g.,
–  Emission spectra from stars
–  Battery voltage varying with current, time, and temperature
•  Goal is better understanding or ability to predict aspects of that phenomenon, often through a mathematical model

For certain kinds of problems, immense power in the combination

(*See D. Lambert, or R. Mahoney, e.g.)

Example: Statistical Translation

Regular Science approach

Use of language is infinitely complex, but you can teach a computer all the rules and content.

•  Employ language experts to codify rules, exceptions, vocabulary mappings, etc.
•  Apply transformation to user’s query.

•  Costly, hard to scale
•  Can translate nearly any statement (but accuracy variable)
•  In theory, could be better than human.

Too expensive and difficult to deploy comprehensively

Statistical (data-driven) approach

People say the same kind of things over and over. And somebody has already translated it.

•  Gather and classify lots of translated docs (websites, UN, books, …)
•  Identify & match patterns
•  Map to user’s translation query.

•  Incrementally low cost, highly scalable.
•  Limited in scope to digitized docs that have been translated before
•  Limited by skill of human translators

Will flop with innovative use of language (new poetry, …)


Three basic components of Industrial Data Science

Physics/engineering-based models
•  Need much less data
•  Powerful, but difficult to maintain and scale

Empirical, heuristic rules & insights
•  Straightforward to understand
•  Captures accumulated knowledge of your experts

Data-driven techniques: machine learning, statistics, optimization, advanced visualization, …
•  Often not enough data in the industrial domain
•  Bias: limited to regions of parameter space traversed in normal operation
•  But easiest to maintain and scale


Some Patterns


Industrial Example: improving rule-based systems

Many equipment operators have a system something like this, with rules derived from experience and intuition.

Diagram: low-latency operational data feeds rule sets implemented in an Analytics Engine, which produce alerts.
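A rule-based alerting system of this kind can be sketched in a few lines. This is a minimal illustration, not GE's Analytics Engine: the `Rule` class, the sensor names, and every threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Reading = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Reading], bool]  # True when the rule should fire


def evaluate(rules: List[Rule], reading: Reading) -> List[str]:
    """Return the names of all rules that fire for one sensor reading."""
    return [r.name for r in rules if r.condition(reading)]


# Hypothetical expert rules; sensor names and thresholds are made up.
RULES = [
    Rule("bearing_overheat", lambda r: r["bearing_temp_c"] > 95.0),
    Rule("high_vibration", lambda r: r["vibration_mm_s"] > 7.0),
    Rule("hot_and_shaking",
         lambda r: r["bearing_temp_c"] > 85.0 and r["vibration_mm_s"] > 5.0),
]

reading = {"bearing_temp_c": 90.0, "vibration_mm_s": 6.2}
alerts = evaluate(RULES, reading)
print(alerts)  # ['hot_and_shaking']
```

Keeping each rule as named data rather than inline `if` statements makes it possible to measure and adjust rules individually later.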


Industrial Example: improving rule-based systems

Combine ML plus rule-based alerts with outcome data to produce better alerts.

Diagram: low-latency operational data feeds rule sets implemented in an Analytics Engine, which produce alerts; pattern, sequence, and association mining, etc., combine those alerts with outcome data to yield more actionable alerts.
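One very simple way to join alerts with outcome data is to measure how often each rule's past alerts were followed by a real failure, then use that to filter and rank new alerts. The sketch below uses this per-rule precision as a stand-in for the pattern and association mining named on the slide; the rule names and the history are invented for illustration.

```python
from collections import defaultdict

# Hypothetical history: (rule that fired, whether a failure actually followed).
history = [
    ("bearing_overheat", True), ("bearing_overheat", True),
    ("bearing_overheat", False),
    ("high_vibration", False), ("high_vibration", False),
    ("high_vibration", True), ("high_vibration", False),
]


def rule_precision(history):
    """Fraction of past alerts per rule that were followed by a real failure."""
    fired = defaultdict(int)
    confirmed = defaultdict(int)
    for rule, failed in history:
        fired[rule] += 1
        confirmed[rule] += int(failed)
    return {rule: confirmed[rule] / fired[rule] for rule in fired}


def prioritize(alerts, precision, min_precision=0.5):
    """Keep alerts whose rule has been reliable, most reliable first."""
    kept = [a for a in alerts if precision.get(a, 0.0) >= min_precision]
    return sorted(kept, key=lambda a: precision[a], reverse=True)


precision = rule_precision(history)
print(precision)
print(prioritize(["high_vibration", "bearing_overheat"], precision))
```

The `min_precision` cutoff is an arbitrary choice here; in practice it would depend on the relative cost of a missed failure and a false alarm.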


Industrial Example: improving rule-based systems

Use ML and outcome data to refine and extend the rule base, providing yet further actionability, resulting in substantial improvements in operational outcomes.

Diagram: low-latency operational data feeds rule sets implemented in an Analytics Engine; a recommendation engine uses outcome data to tune parameters of existing rules and create new rules, yielding actionable recommendations.
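Tuning the parameters of an existing rule can be illustrated with a single temperature threshold. The sketch below searches over observed values and keeps the threshold with the best F1 score against outcome data. The data, the starting threshold, and the choice of F1 as the objective are all assumptions made for illustration, and the sketch covers only tuning, not the creation of new rules.

```python
# Hypothetical labeled history: (bearing temperature in deg C, failure followed?).
history = [
    (70.0, False), (78.0, False), (82.0, False), (84.0, True),
    (88.0, True), (91.0, False), (93.0, True), (97.0, True),
]


def f1_at(threshold, history):
    """F1 score of the rule 'alert when temperature > threshold'."""
    tp = sum(1 for t, failed in history if t > threshold and failed)
    fp = sum(1 for t, failed in history if t > threshold and not failed)
    fn = sum(1 for t, failed in history if t <= threshold and failed)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(history):
    """Try each observed value as the threshold; keep the best-scoring one."""
    candidates = sorted({t for t, _ in history})
    return max(candidates, key=lambda th: f1_at(th, history))


expert_threshold = 95.0  # assumed starting point set by intuition
tuned = tune_threshold(history)
print(f1_at(expert_threshold, history), tuned, f1_at(tuned, history))
```

This scores the tuned threshold on the same data used to choose it, so a real deployment would want a held-out period of outcome data before trusting the improvement.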


Another Industrial Example: use advanced physical models to create new features for ML approaches

Diagram: sensor data feeds physical models that output predicted values and Δs; these, together with outcome data, feed a variety of machine learning techniques.

Using as ML features the:
1. Deviations from expected physics, &
2. Inferred or hidden parameter estimates
provides much richer and effectively less noisy data, resulting in much stronger predictions and models.
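The two kinds of feature can be sketched with a toy battery, echoing the battery-voltage example from the earlier slide. The physics model (terminal voltage V = V_oc - I * R), the nominal parameters, and the sample data are all assumptions chosen for illustration; a real physical model would be far richer.

```python
from statistics import mean

# Assumed nominal parameters for a hypothetical battery (not from the slides).
V_OC = 12.6       # open-circuit voltage, volts
R_NOMINAL = 0.05  # nominal internal resistance, ohms


def predicted_voltage(current_a):
    """Simple physics model: terminal voltage under load, V = V_oc - I * R."""
    return V_OC - current_a * R_NOMINAL


def physics_features(window):
    """Turn a window of (current, measured voltage) samples into ML features.

    Expects at least one sample with positive current.
    """
    # Feature type 1: deviations of measurements from the expected physics.
    residuals = [v - predicted_voltage(i) for i, v in window]
    # Feature type 2: hidden parameter (internal resistance) implied by the data.
    r_estimates = [(V_OC - v) / i for i, v in window if i > 0]
    return {
        "mean_residual_v": mean(residuals),
        "max_abs_residual_v": max(abs(r) for r in residuals),
        "est_resistance_ohm": mean(r_estimates),
    }


# Made-up samples from a degraded cell whose true resistance is about 0.08 ohm.
window = [(10.0, 11.8), (20.0, 11.0), (40.0, 9.4)]
features = physics_features(window)
print(features)
```

The raw voltages alone vary mostly with load, whereas the estimated resistance stays near 0.08 ohm across all three samples, which is the sense in which physics-derived features are "effectively less noisy" inputs for a downstream model.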

Climbing up the value chain toward Condition-based Performance Management and Business Optimization.

•  Fix it when it breaks
•  Condition-based Maintenance (“now”): Model-driven, Work-driven, Time-driven
•  Predictive Maintenance (“future”)
•  Prescriptive recommendations (multi-channel)
•  Fleet/operation-wide optimization levels. Trade-offs to optimize business performance

Need:
•  Earlier detection
•  Root cause
•  Scaling to more equipment types & instances

New levers for optimization across the operation or business

“Equipment health is not a given, but a variable”

Capability / Impact Ramp

Axes: Data Science Complexity; data completeness, breadth, quality

Stages: Basic Reporting, Advanced Reporting, Anomaly Detection, Rules augmentation, Predictive analytics, Prescriptive analytics, Operational optimization

Outputs: Alerts; Highly-actionable management info; High-value guidance; Sophisticated, optimized management of business operations

Broad range of deep Data Science capabilities needed

Industrial Data Science

•  Optimization & Management Science: Optimizes the design & operations of complex business and physical systems, extracting more value at lower risk
•  Applied Statistics: Innovates new ways of performing reliability analysis, statistical modeling of large data, biomarker discovery and financial risk management
•  Computer Vision: Focuses on developing algorithms and systems for real-time video analysis
•  Image Analytics: Research in algorithms and software systems that analyze & understand images to produce actionable insights
•  Machine Learning: Develop scalable and cross-disciplinary machine learning & predictive capabilities to derive actionable insights from big data
•  Sensor & Signal Analytics: Modeling complex system and noise processes to detect subtle deviations and estimate critical system parameters
•  Physics & expert-based Modeling: Employing deep physical and engineering understanding of equipment and processes to generate normative models.
•  Knowledge Discovery: Delivering data and knowledge-driven decision support via semantic technologies and big data systems research


“Industrial Data Science”

What is it?
① Outcome-oriented application of mathematical & physics-based analysis & models to real-world problems in industrial operations.
② Tools & processes needed to do that continually & at scale.

Why do we do it?
Improve the performance of industrial operations, e.g.,
•  Higher equipment uptime, utilization
•  Lower maintenance/shop costs, longer component life
•  Fleet-level optimization & trade-offs
•  Business optimization (linking to financial & customer data)
•  Service / contract management

What’s needed?
Combination of:
•  Physical & expert modeling experience & depth
•  Installed base of industrial equipment and data
•  Big Data, Machine Learning, and statistical capabilities