Cambriano's Data Governor reduces the Big Data footprint and saves the planet


Upload: martyn-jones

Post on 16-Jul-2015


Category: Data & Analytics



Big Data Governor

Martyn Jones

Cambriano Energy

www.cambriano.es

© 2014 Martyn Richard Jones All rights reserved.

To begin at the beginning

• Simply stated, the best application of Big Data is in systems and methods that will significantly reduce the data footprint.

10 Big Data Commandments

I. Do not generate data that is not needed.

II. Do not store data that doesn't need to be stored.

III. Do not index data that doesn't need to be indexed.

IV. Do not replicate data that doesn't need to be replicated.

V. Do not transmit or move data that doesn't need to be transmitted or moved.

VI. Do not integrate data that doesn't need to be integrated.

VII. Do not enrich data that doesn't need to be enriched.

VIII. Do not process data that doesn't need to be processed.

IX. Do not provide access to data that doesn't need to be accessed.

X. Do not archive or back up data that doesn't need to be archived or backed up.

Why would we want to reduce the data footprint?

• Years of knowledge and experience in information management strongly suggest that more data does not necessarily lead to better data.

• The more data there is to generate, move and manage, the greater the development and administrative overheads.

• The more data we generate, store, replicate, move and transform, the bigger the data, energy and carbon footprints will become.

How can Big Data reduce Big Data?

• We can use it in profiling, to identify the data that could be useful.

• We can use it to identify immaterial, surplus and redundant data.

• We can use it to catalogue, categorise and classify certain high-volume data sources.
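As a rough illustration of profiling a data source for surplus and redundant data, the sketch below counts exact duplicate records and flags fields whose value never changes within a sample. The function name `profile`, the record layout and the sample values are assumptions made for this example, not part of the Data Governor.

```python
from collections import Counter


def profile(records):
    """Summarise a sample of records: exact duplicates and fields that never vary."""
    records = list(records)
    # Count identical records (key order does not matter).
    counts = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = sum(n - 1 for n in counts.values())
    # A field whose value never changes carries no information in this sample.
    fields = {k for r in records for k in r}
    constant = sorted(
        f for f in fields if len({r.get(f) for r in records}) == 1
    )
    return {
        "records": len(records),
        "duplicates": duplicates,
        "constant_fields": constant,
    }


# Hypothetical sample of sensor records.
sample = [
    {"device": "d1", "unit": "C", "temp": 21.0},
    {"device": "d1", "unit": "C", "temp": 21.0},
    {"device": "d2", "unit": "C", "temp": 22.5},
]
report = profile(sample)
```

Here the profile would suggest that one record in three is a duplicate and that the `unit` field is a candidate for removal from the flow.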

What can we do with the Big Data profile data?

• We can use it to audit, analyse and review the generation, storage and transmission of data.

• We can use it to parameterise data generators and filters.

• We can use it to generate 'Big-Data-by-exception' discrimination rules, and as the basis for data discrimination using directed machine-learning approaches.
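One way to picture "parameterising a filter from profile data" is to derive an exception band from historical values and release only the values that fall outside it. This is a minimal sketch under that assumption; the mean plus or minus k standard deviations rule, the names `exception_band` and `make_filter`, and the numbers are illustrative choices, not the author's method.

```python
from statistics import mean, stdev


def exception_band(history, k=3.0):
    """Derive (low, high) limits from profiled history; k is an assumed tuning knob."""
    mu, sigma = mean(history), stdev(history)
    return mu - k * sigma, mu + k * sigma


def make_filter(band):
    """Build a predicate that is true only for values outside the band."""
    low, high = band
    return lambda value: value < low or value > high


# Hypothetical profile data gathered from an earlier audit of the source.
history = [10.0, 10.2, 9.8, 10.1, 9.9]
is_exception = make_filter(exception_band(history))

# Only the values outside the profiled band are released.
released = [v for v in [10.0, 10.3, 14.0, 9.7, 5.0] if is_exception(v)]
```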

So why would we do all of this?

• We hear that Big Data represents a significant challenge.

• The best way of dealing with significant challenges is to formulate an appropriate, coherent and realisable response: a strategy.

• By addressing the data problems upstream we can attempt to turn the Big Data problem into a more manageable data problem or, alternatively, we can choose to remove the problem.

How does this work in practice?

• We can reduce the amount of data that we actually generate by removing unnecessary generation, storage and transmission of superfluous data. We can change logging, monitoring and signal data generators (applications and devices) so that they produce only concise and usable data. This requires modifications to parts of existing applications and application servers.

• We can introduce data governors as intelligent data filters and actively exclude or include data in data flows. This is particularly relevant where we are dealing with very high-volume data throughput and bandwidth, and where the release of data into the data streams is subject to rules of exception. For example, we may decide to exclude any market signal data that simply repeats the same price stated in previous data.

• We can also filter data dimensionally: by association and abstraction of discrete phases, events, facets and values, and by time, affinity and proximity.
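The market-signal example above can be sketched as a small generator that releases a tick only when its price differs from the last price seen for the same symbol. The tick layout `(symbol, price)` and the name `suppress_repeats` are assumptions for illustration.

```python
def suppress_repeats(ticks):
    """Yield only ticks whose price differs from the last price seen for that symbol."""
    last = {}
    for symbol, price in ticks:
        if last.get(symbol) != price:
            last[symbol] = price
            yield symbol, price


# Hypothetical market signal stream.
ticks = [("ABC", 10.0), ("ABC", 10.0), ("XYZ", 5.5), ("ABC", 10.1), ("XYZ", 5.5)]
forwarded = list(suppress_repeats(ticks))
```

Two of the five ticks repeat a known price and are held back; the remaining three are forwarded in their original order.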

What are the benefits?

• Making data smaller reduces the data footprint: lower cost, less operational complexity and greater focus.

• The earlier you filter data, the smaller the data footprint: lower cost, less operational complexity and greater focus.

• A smaller data footprint accelerates the processing of the data that does have potential business value: lower cost, higher value, less complexity and better focus.

In order to tame Big Data?

• We should only generate data that is required, that has value, and that has a business purpose, whether management oriented, business oriented or technical in nature.

• We should filter Big Data, early and often.

• We should store, transmit and analyse Big Data only when there is a real business imperative that prompts us to do so.

Conclusions?

• Taming Big Data is a business, management and technical imperative.

• The best approach to taming the data avalanche is to ensure there is no data avalanche. This is referred to as moving the problem upstream.

• The use of smart 'data governors' will provide a practical way to control the flow of high volumes of data.

The Big Data Governor: A brief architectural and functional overview

Martyn Jones, Creative Director, Cambriano Energy

The Big Data Governor

• The Big Data Governor is an architectural concept, a set of methods and a technology developed in Spain (EU) by Martyn Jones and associates at Cambriano Energy.

• The Big Data Governor's role is to help in the purposeful and meaningful reduction of the ever-expanding data footprint, especially as it relates to data volumes and velocity (see Gartner 3Vs).

• The reduction techniques are based on exclusion, inclusion and exception.

• Its implementation is made through a development environment that can target hardware, firmware, middleware and software forms of hosting and continuously monitored execution.
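To make the three reduction techniques concrete, here is a minimal sketch in which each record is checked against exception, exclusion and inclusion rules. The priority order (exception first, then exclusion, then inclusion), the drop-by-default behaviour, and all names and sample records are assumptions for this example; the slides do not specify them.

```python
from enum import Enum


class Decision(Enum):
    DROP = "drop"
    FORWARD = "forward"
    FORWARD_NOW = "forward_now"


def decide(record, exclude, include, exceptions):
    """Apply rule lists in an assumed priority: exception, exclusion, inclusion."""
    if any(rule(record) for rule in exceptions):
        return Decision.FORWARD_NOW
    if any(rule(record) for rule in exclude):
        return Decision.DROP
    if any(rule(record) for rule in include):
        return Decision.FORWARD
    # Assumed default: anything not explicitly included is dropped.
    return Decision.DROP


# Hypothetical rules over log-like records.
exclude = [lambda r: r["level"] == "debug"]
include = [lambda r: r["level"] in {"warn", "error"}]
exceptions = [lambda r: r.get("temp", 0) > 90]

decisions = [
    decide(r, exclude, include, exceptions)
    for r in [
        {"level": "debug", "temp": 20},
        {"level": "warn", "temp": 40},
        {"level": "debug", "temp": 95},
        {"level": "info", "temp": 30},
    ]
]
```

The third record would normally be excluded as debug output, but the exception rule overrides the exclusion because its temperature is out of range.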

Reduce the data footprint and maintain fidelity

Business As Usual: All generated data is stored and forwarded

[Diagram: stages labelled Data, Application, and Data forward, store and analyse. Flow labels: All data, All data, All data.]

Business To Be: Model 1: Only significant data is stored and forwarded

[Diagram: stages labelled Inline Data, Application, and Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data.]
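The slides do not say what the temporal data store holds. One plausible reading, sketched here purely as an assumption, is a small bounded buffer of recent records: routine records wait in the buffer and age out, and when a significant record arrives the buffered context is forwarded with it. The class name `InlineGovernor` and the `context` parameter are hypothetical.

```python
from collections import deque


class InlineGovernor:
    """Hypothetical inline filter: buffers recent records, forwards on significance."""

    def __init__(self, is_significant, context=2):
        self.is_significant = is_significant
        # Temporal data store: a bounded buffer of the most recent routine records.
        self.temporal = deque(maxlen=context)
        self.forwarded = []

    def accept(self, record):
        if self.is_significant(record):
            # Forward the buffered context first, then the significant record.
            self.forwarded.extend(self.temporal)
            self.temporal.clear()
            self.forwarded.append(record)
        else:
            # Routine data is held briefly and silently ages out of the buffer.
            self.temporal.append(record)


gov = InlineGovernor(lambda r: r > 100, context=2)
for reading in [10, 11, 12, 150, 13, 14]:
    gov.accept(reading)
```

With this design only the out-of-range reading and the two readings just before it leave the governor; everything else stays local and is eventually discarded.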

CE Data Governor – Exhibit I – IC Fab Testing

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data. Captions: Integrated Circuit Wafer Production; Testing / Probing; Storage of Test / Probe Results; Analysis of Test / Probe Results.]

This exhibit shows where the Data Governor is placed in the Integrated Circuit fabrication and testing/probing chain.

In large plants, the IC probing process generates very large volumes of data at high velocity.

Based on exception rules, the Data Governor reduces the flow of data to the centralised data store.

It also increases velocity and shortens time to analysis.

Greater speed and lower volumes mean that production show-stoppers are spotted earlier, potentially leading to significant production and recuperation cost savings.

CE Data Governor – Exhibit II – Internet of Things

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data. Captions: Internet of Things; Internet 'Thing'; Storage of IoT Data; Analysis of IoT Data.]

This exhibit shows where the Data Governor is placed in the Internet of Things data flow.

The Data Governor is embedded into an IoT device, and functions as a data exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of data to the centralised / regionalised data store.

It also increases velocity and shortens time to analysis.

Greater speed and lower volumes mean that important signals are spotted earlier, possibly leading to more effective analysis and quicker time to action.
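For a device-embedded exception engine, one common technique is a deadband: the device reports a reading only when it has moved by more than a threshold since the last reported value. The sketch below uses that technique as an assumed stand-in for an exception rule; the threshold and readings are invented for illustration.

```python
def deadband(readings, threshold):
    """Report a reading only when it moves more than `threshold` from the last report."""
    last_reported = None
    for value in readings:
        if last_reported is None or abs(value - last_reported) > threshold:
            last_reported = value
            yield value


# Hypothetical temperature readings sampled on the device.
sent = list(deadband([20.0, 20.1, 20.3, 21.0, 21.2, 19.5], threshold=0.5))
```

Half of the readings never leave the device, yet every movement larger than the threshold still reaches the data store.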

CE Data Governor – Exhibit III – Net Activity

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data. Captions: Online internet interaction; Activity and event logging; Happy Data; Analysis of Net Activity.]

This exhibit shows where the Data Governor is placed in the capture and logging of interactive internet activity.

The Data Governor acts as a virtual device written to by standard and customised log writers, and functions as a data exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of data generated by internet-browser-activity logging.

It also increases velocity and shortens time to analysis.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.
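The idea of a governor sitting behind ordinary log writers can be approximated with Python's standard `logging` module, where a `logging.Filter` attached to a handler decides which records pass. This is only an analogy under assumed rules (always pass warnings and above, pass each routine message once); the class names and log messages are hypothetical.

```python
import logging


class GovernorFilter(logging.Filter):
    """Pass warnings and above, plus the first occurrence of each routine message."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True
        key = (record.name, record.getMessage())
        if key in self.seen:
            return False  # repeated routine event: suppress it
        self.seen.add(key)
        return True


class ListHandler(logging.Handler):
    """Stand-in for the downstream store: collects forwarded messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log = logging.getLogger("net.activity")
log.setLevel(logging.INFO)
log.propagate = False
handler = ListHandler()
handler.addFilter(GovernorFilter())
log.addHandler(handler)

for _ in range(3):
    log.info("page view /home")
log.warning("login failed")
log.warning("login failed")
```

The log writers are unchanged; only the handler they write to applies the exception rules, which mirrors the "virtual device" placement described above.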

CE Data Governor – Exhibit IV – Signal Data

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data. Captions: Signal generation and transmission; Near Zero Latency transmission; Immediate Analysis of Data; Smaller Big Data.]

This exhibit shows where the Data Governor is placed in the stream of continuous signal data.

The Data Governor acts as an inline data exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of signal data.

It also increases velocity and shortens time to analysis.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.

CE Data Governor – Exhibit V – Machine Data

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor performing Filter, concentrate / summarise. Flow labels: All data, All data, Significant data. Captions: Sensor data; Sensor data (internal storage); Sensor data; Analysis of IoT Data.]

This exhibit shows where the Data Governor is placed in the stream of continuous machine-generated data.

The Data Governor acts as an inline data analysis and exception engine.

Exception data is stored locally and periodically transferred to an analysis centre.

Analysis of the totality of data of the same class and origin can be used to drive ANN* and statistical analysis, which in turn can support (for example) the automatic and semi-automatic generation of preventive maintenance rules.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to proactivity.

*Adaptive Neural Network
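The "filter, concentrate / summarise" step with local exception storage might look something like the sketch below: each full window of readings is reduced to a summary, readings above a limit are kept in a local exception store, and a periodic transfer drains that store. The class `EdgeSummariser`, the fixed window size and the single upper limit are assumptions for this example.

```python
from statistics import mean


class EdgeSummariser:
    """Hypothetical edge component: summarise each window, keep exceptions locally."""

    def __init__(self, window, limit):
        self.window, self.limit = window, limit
        self.buffer = []
        self.summaries = []   # concentrated data, one entry per full window
        self.exceptions = []  # local exception store, drained by transfer()

    def accept(self, value):
        if value > self.limit:
            self.exceptions.append(value)
        self.buffer.append(value)
        if len(self.buffer) == self.window:
            b = self.buffer
            self.summaries.append(
                {"n": len(b), "min": min(b), "max": max(b), "mean": mean(b)}
            )
            self.buffer = []

    def transfer(self):
        """Hand stored exceptions to the analysis centre and clear local storage."""
        batch, self.exceptions = self.exceptions, []
        return batch


edge = EdgeSummariser(window=4, limit=80)
for v in [70, 72, 95, 71, 69, 70, 68, 73]:
    edge.accept(v)
batch = edge.transfer()
```

Eight raw readings become two summaries and one exception value, which is the kind of reduction the exhibit describes.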

CE Data Governor – Exhibit VI – Other Applications

[Diagram: Application → Inline Data → Data forward, store and analyse, with the CE Data Governor and its temporal data store in the flow. Flow labels: All data, All data, Significant data. Captions: Trading; Plant monitoring; Sport; Climate Change.]

Designation and Exception Rules – IC Fab

• Taking our example of the IC Fab test/probe chain, a Data Governor should be able to handle a hierarchy or matrix of designation and exception.

• For example, a top-level Data Governor actor could be the Production Run actor.

• The Production Run actor could designate and assign exception rules to a Batch Analysis actor.

• In turn, the Batch Analysis actor could designate and assign exception rules to a Wafer Instance Analysis actor.

Designation, Exception Rules, Feedback – IC Fab

[Diagram: a Production Run Actor above two Batch Analysis Actors, which sit above five Wafer Instance Analysis Actors. Links are labelled Designation and Exception, and Exception Feedback.]
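The hierarchy of designation and exception feedback can be sketched as a small tree of actors: a parent creates a child and assigns it rules, and a child that detects an exception reports it to its parent, which passes it further up. The `Actor` class, its method names and the single "FAIL" rule are assumptions for illustration; the slides do not define an API.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    name: str
    parent: "Actor | None" = None
    rules: list = field(default_factory=list)     # rules designated by the parent
    received: list = field(default_factory=list)  # exception feedback from below

    def designate(self, child_name, rules):
        """Create a child actor and assign it exception rules."""
        return Actor(child_name, parent=self, rules=list(rules))

    def observe(self, value):
        """Check a value against the designated rules; report exceptions upward."""
        if self.parent is not None and any(rule(value) for rule in self.rules):
            self.parent.feedback((self.name, value))

    def feedback(self, item):
        """Record feedback from a child and pass it further up the hierarchy."""
        self.received.append(item)
        if self.parent is not None:
            self.parent.feedback(item)


run = Actor("production-run")
batch_actor = run.designate("batch-1", [lambda v: v == "FAIL"])
wafer_actor = batch_actor.designate("wafer-1", batch_actor.rules)

wafer_actor.observe("OK")    # routine result: nothing is reported
wafer_actor.observe("FAIL")  # exception: fed back to batch and production run
```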

Exception Rules – IC Fab

If the status of production.run is red
and the signal.aggregate from batch.actor is green
and the signal from wafer.actor is green
and the sensibility.status of production.run is red
Then the action of data.governor is forward.data
and the action of data.governor is immediate
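A rule of this if/then form can be evaluated by simple modus ponens: when every condition matches the current facts, the rule's actions are returned. The sketch below encodes the rule above as data; the dictionary layout, the `fire` function and the sample facts are assumptions, while the object and attribute names are taken from the rule text.

```python
# The rule from the slide, expressed as (object, attribute, expected value) conditions.
RULE = {
    "if": [
        ("production.run", "status", "red"),
        ("batch.actor", "signal.aggregate", "green"),
        ("wafer.actor", "signal", "green"),
        ("production.run", "sensibility.status", "red"),
    ],
    "then": ["forward.data", "immediate"],
}


def fire(rule, facts):
    """Modus ponens: if every condition holds in `facts`, return the rule's actions."""
    holds = all(
        facts.get(obj, {}).get(attr) == value for obj, attr, value in rule["if"]
    )
    return list(rule["then"]) if holds else []


# Hypothetical current state of the actors.
facts = {
    "production.run": {"status": "red", "sensibility.status": "red"},
    "batch.actor": {"signal.aggregate": "green"},
    "wafer.actor": {"signal": "green"},
}
actions = fire(RULE, facts)

# Changing a single fact stops the rule from firing.
no_actions = fire(RULE, {**facts, "wafer.actor": {"signal": "red"}})
```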

Testing and triggering of exception rules – IC Fab

Production ID in focus: X0635387N

[Chart: test results for Test IDs A to K plotted against an artificial timeline with ticks 1 to 12. The timeline position of each result is not recoverable from the transcript.]

Test ID   Result
A         OK
B         FAIL
C         OK
D         OK
E         OK
F         OK
G         (none shown)
H         OK
I         OK
J         FAIL
K         FAIL

If status.text of wafer.test(“B”) is FAIL
and the status.text of wafer.test(“J”) is FAIL
and the status.text of wafer.test(“K”) is FAIL
Then the action of data.governor is forward.data
and the action of data.governor is immediate
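The triggering of this rule can be illustrated by replaying test results one at a time and checking after each arrival whether tests B, J and K have all failed. The results are those listed on the slide; the arrival order is not recoverable from the transcript, so alphabetical order is assumed here, and the function names are hypothetical.

```python
# Test results as listed on the slide; G has no result shown.
results = {
    "A": "OK", "B": "FAIL", "C": "OK", "D": "OK", "E": "OK", "F": "OK",
    "G": None, "H": "OK", "I": "OK", "J": "FAIL", "K": "FAIL",
}


def triggered(seen, watched=("B", "J", "K")):
    """True when every watched test has reported FAIL."""
    return all(seen.get(test) == "FAIL" for test in watched)


def replay(events, watched=("B", "J", "K")):
    """Feed results in arrival order; return how many had arrived when the rule fired."""
    seen = {}
    for count, (test, status) in enumerate(events, start=1):
        seen[test] = status
        if triggered(seen, watched):
            return count
    return None


# Assumed arrival order: alphabetical by Test ID, skipping tests with no result.
events = [(test, status) for test, status in results.items() if status is not None]
fired_at = replay(events)
```

Under the assumed ordering the rule fires only when the result for test K arrives, at which point the governor would forward the data immediately.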

Cambriano Data Governor

[Diagram: Big Data flows through the Data Governor, which uses a Temporal Data Store, to the Target Data Store.]

• Blackboard paradigm

• Classes, objects and instances

• Exclusion, inclusion, aggregation and exception rules (modus ponens)

• Non-brittle and brittle constraints and triggers

• Scripting, pluggable components and user exits

• Quantitative analysis

• Qualitative analysis

• Data persistence, aggregation and generalisation

DW 3.0 Information Supply Framework with Data Governors

[Diagram: the DW 3.0 Information Supply Framework. Labels: Operational applications; OLTP Applications; External digital data; All digital data; Data logistics (twice); Operational Data Store; Data Warehouse; Data Marts; Analytics Data Store; Statistical Analysis; Business Intelligence; Scenarios; 'What if' analysis; MIS / Reporting; Visualisation; Publication; Data Governor; Data Governor Collector; Data Governor Manager. Legend: Primary data flow; Secondary data flow.]

Published by goodstrat.com, Martyn Richard Jones 2015, martynjones.eu. Cambriano Energy 2015, http://www.cambriano.es

Summary

[Diagram: Application / Intelligent device → Inline Data → Data forward, store and analyse, with the CE Data Governor, its temporal data store, and rules and constraints in the flow. Flow labels: All data generators, All data generated, Significant data.]

1. Data is generated, captured, created or invented.

2. It is stored to a real device or virtual device.

3. The Data Governor (in all its configurations) acts as a data discrimination and data exception manager and ensures that significant data is passed on.

4. Significant data is used for 'business purposes' and, potentially, to refine the rules of the CE Data Governor.

The CE Big Data Governor

If you want to know more about the CE Big Data Governor architecture, or wish to discuss your particular needs, please contact Martyn Jones at [email protected]

Contact: [email protected] and http://www.cambriano.es

Professional web site: http://www.martynjones.eu

Strategy blog: http://www.goodstrat.com

Direct line: +34 618 471 465