Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles
DESCRIPTION
Peter Elleby from Greenlight's presentation from our Big Data breakfast conference
TRANSCRIPT
![Page 1: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/1.jpg)
Peter Elleby
Greenlight
‘Big Data, Big Noise, Big Hope – No Miracles’
![Page 2: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/2.jpg)
Big Data, Big Noise, Big Hope – No Miracles
27/06/2013
![Page 3: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/3.jpg)
Big Data - Volume, Velocity, Variety
An American creates about 4 lb of rubbish every day.
If the rest of the world produced as much, this would be about 10M tons daily, or roughly 4B tons annually.
![Page 4: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/4.jpg)
How do you define “Big Data”?
Applications involving collections of data of a size that makes them impossible to process cost-effectively using traditional database management tools and data processing applications.
![Page 5: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/5.jpg)
Traditional Data Management and Data Processing
| | OLTP | OLAP |
|---|---|---|
| Application | Operational | Decision Support |
| Horizon | Days & Weeks | Months & Years |
| Refresh | Immediate | Periodic |
| Data Model | Entity-Relationship | Multi-Dimensional |
| Schema | Normalized | Star (de-normalized) |
| Emphasis | Update | Retrieval |
| Space | Small | Large (History) |
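
The OLTP/OLAP contrast above can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table names (`account`, `dim_month`, `fact_sales`) and the figures are hypothetical, chosen only for illustration; they do not come from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP side (hypothetical table): a normalized table, updated one row at a time.
cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
cur.execute("INSERT INTO account VALUES (1, 100.0)")
cur.execute("UPDATE account SET balance = balance - 25.0 WHERE id = 1")

# OLAP side (hypothetical tables): a star schema with one fact table
# and one dimension table, queried by aggregation over history.
cur.execute("CREATE TABLE dim_month (month_id INTEGER PRIMARY KEY, label TEXT)")
cur.execute("CREATE TABLE fact_sales (month_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO dim_month VALUES (?, ?)",
                [(1, "2013-05"), (2, "2013-06")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                [(1, 10.0), (1, 15.0), (2, 40.0)])

# Emphasis on update: read back the single row that was changed.
balance = cur.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Emphasis on retrieval: join the fact table to its dimension and aggregate.
monthly_totals = cur.execute(
    "SELECT d.label, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_month d ON d.month_id = f.month_id "
    "GROUP BY d.label ORDER BY d.label"
).fetchall()

print(balance)
print(monthly_totals)
```

The first half touches a single current row; the second half scans accumulated history, which is why the OLAP column lists retrieval as the emphasis and large space as the cost.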
![Page 6: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/6.jpg)
Core Big Data Strategies
• Distribution of Data
  • Network of Lower Cost Devices
• Compression of Data
  • Using Processing Power to Reduce Bandwidth Requirements
• Representation of Data
  • Focus on Algorithm rather than Data Model
• Change of Emphasis
  • From Completeness to Relevancy
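
As a rough sketch of the first two strategies, the snippet below spreads records across a handful of shards and compresses a payload before it would be sent over a network. The record layout, the shard count of 4, and the helper `shard_for` are assumptions made for illustration, not details from the talk.

```python
import zlib

def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard from a key using CRC32, which is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

# Hypothetical records: keyword, date, count.
records = [f"keyword-{i},2013-06-27,{i % 7}" for i in range(1000)]

# Distribution: each record goes to one of several low-cost nodes.
SHARD_COUNT = 4
shards = {n: [] for n in range(SHARD_COUNT)}
for record in records:
    key = record.split(",")[0]
    shards[shard_for(key, SHARD_COUNT)].append(record)

# Compression: spend CPU time to reduce the bytes that cross the network.
payload = "\n".join(records).encode("utf-8")
compressed = zlib.compress(payload, 6)
restored = zlib.decompress(compressed)

print(len(payload), len(compressed))
print({n: len(rows) for n, rows in shards.items()})
```

CRC32 is used here instead of Python's built-in `hash()` because string hashes are randomized per process, which would send the same key to different shards on different runs.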
![Page 7: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/7.jpg)
Big Data Application - Hydra
![Page 8: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/8.jpg)
Big Data Application Characteristics - Hydra
• Time Series Data
  • Storage of State versus Events
• Data Aggregation
  • Statistical Significance
• Dynamic Clustering
  • Ontologies of Keywords and Phrases
• Data Refinement
  • Statistical Process Control and Regression Modelling
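
Statistical process control, mentioned under data refinement, can be sketched as a simple control-limit check: points further than three standard deviations from a baseline mean are flagged. This is a generic textbook form, not a description of how Hydra works; the function names and sample numbers are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits from a baseline sample."""
    centre = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return centre - sigmas * spread, centre + sigmas * spread

def out_of_control(baseline, new_points, sigmas=3.0):
    """Return (index, value) pairs for new points outside the limits."""
    low, high = control_limits(baseline, sigmas)
    return [(i, x) for i, x in enumerate(new_points) if x < low or x > high]

# Hypothetical daily measurements for one time series.
baseline = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10]
new_points = [10, 11, 25, 9]

flagged = out_of_control(baseline, new_points)
print(flagged)
```

Only the value 25 falls outside the limits derived from the baseline, so it is the one point treated as a signal rather than noise.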
![Page 9: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/9.jpg)
Brewer's Theorem (the CAP Theorem)
The CAP theorem states that any networked shared-data system can have at most two of three desirable properties:
• consistency (C), equivalent to having a single up-to-date copy of the data
• high availability (A) of that data (for updates)
• tolerance to network partitions (P)
“sacrifice consistency to gain faster responses in a more scalable manner”
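
The trade-off can be made concrete with a toy model of two replicas. This is a deliberately simplified sketch: the `Cluster` and `Replica` classes and the "CP"/"AP" mode labels are hypothetical names for illustration, and real systems handle partitions with far more machinery.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None

class Cluster:
    """Two replicas; `partitioned` means they cannot reach each other."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, value):
        """Try to write via one replica; return True if the write is accepted."""
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: replicas stay identical, availability is lost.
                return False
            # Accept locally: the system stays available, replicas diverge.
            replica.value = value
            return True
        # No partition: the write reaches both replicas.
        replica.value = value
        other.value = value
        return True

cp = Cluster("CP")
cp.write(cp.a, 1)
cp.partitioned = True
cp_accepted = cp.write(cp.a, 2)

ap = Cluster("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap_accepted = ap.write(ap.a, 2)

print(cp_accepted, cp.a.value, cp.b.value)
print(ap_accepted, ap.a.value, ap.b.value)
```

During the partition the "CP" cluster rejects the update and both copies stay in agreement, while the "AP" cluster answers immediately and leaves the two copies temporarily different, which is the sacrifice the quotation describes.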
![Page 10: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/10.jpg)
A Practical Everyday Example
S1, S2, ..., SN
![Page 11: Peter Elleby - Big Data, Big Noise, Big Hope - No Miracles](https://reader036.vdocuments.site/reader036/viewer/2022082916/54c0f2584a79591a4c8b4575/html5/thumbnails/11.jpg)
The Takeaways
• The aims of your application determine whether you are dealing with Big Data
• The frameworks or technologies best suited to achieve your goals are determined by your application