© 2011 IBM Corporation
BMW11:
Dealing with the Massive Data Generated by
Many-Core Systems
Dr Don Grice
IBM Systems and Technology Group
Title: Dealing with the Massive Data Generated by Many Core Systems.
Abstract: Multi-core and many-core architectures are enabling computing systems that are more powerful than ever. The amount of data these systems generate is becoming an issue in several areas, including storage of results, movement of intermediate and final results, and the ability to consume the data and transform it into 'information'. As we move forward we need to develop HW and SW methods to deal with this massive data explosion. Data reduction/simplification and real-time analytics will involve more computation but may be among the most promising methods for dealing with this flood of newly generated data.
TYPES OF DATA MANIPULATION
COMPUTE INTENSIVE
DATA INTENSIVE
NETWORK INTENSIVE
Styles of Massively Parallel Workloads
• Compute Intensive (Data Generators)
  • Generative Modeling, Extreme Physics
  • C/C++, Fortran, MPI, OpenMP
  • Long Running
  • Small Input
  • Massive Output
• Data Intensive: Data in Motion (Streaming)
  • Reactive Analytics, Extreme Ingestion
  • SPL, C, Java
  • Data in Motion: High Velocity, Mixed Variety, High Volume* (*over time)
• Data Intensive (Data At Rest)
  • Hadoop/MapReduce (BigInsights)
  • Input Data, Mappers, Reducers, Output Data
  • Data at Rest*: High Volume, Mixed Variety, Low Velocity (*pre-partitioned)
• Data Intensive (Data Needs to Move)
  • Global Analytics: View of All Data Required
  • Data ‘Must be Moved’
  • Higher Velocity
  • Network is Critical
[Figure legend: = compute node]
Data Intensive (Data At Rest): Embarrassingly Parallel
Data Intensive (Data Needs to Move): Network Dependent
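As a rough illustration of the Mappers/Reducers pattern named on this slide, the following Python sketch runs a word count over pre-partitioned input. It is a minimal single-process sketch, not Hadoop or BigInsights code; the function names `map_partition`, `shuffle`, and `reduce_counts` and the sample partitions are assumptions made for illustration. The map step touches only its own partition (the embarrassingly parallel part), while the shuffle is where data would have to move between nodes (the network-dependent part).

```python
from collections import Counter, defaultdict


def map_partition(lines):
    """Mapper: count words in one pre-partitioned chunk using only local data."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def shuffle(mapped, num_reducers):
    """Route every word to exactly one reducer.

    On a real cluster this is the step that crosses the network.
    """
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for counts in mapped:
        for word, n in counts.items():
            # Deterministic routing so the example behaves the same on every run.
            idx = sum(word.encode("utf-8")) % num_reducers
            buckets[idx][word].append(n)
    return buckets


def reduce_counts(bucket):
    """Reducer: sum the partial counts collected for each word."""
    return {word: sum(ns) for word, ns in bucket.items()}


# Hypothetical input, already split across two "nodes".
partitions = [
    ["data at rest", "data in motion"],
    ["data needs to move"],
]

mapped = [map_partition(p) for p in partitions]              # independent per partition
reduced = [reduce_counts(b) for b in shuffle(mapped, 2)]     # one result per reducer

totals = {}
for part in reduced:
    totals.update(part)

print(totals["data"])  # 3
```

Because each word is routed to a single reducer, the per-reducer results can be merged without further summing.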
Data Intensive Applications (Large Data)
[Figure: data scale (Kilo, Mega, Giga, Tera, Peta, Exa) plotted against decision frequency (Occasional, Frequent, Real-time; yr, mo, wk, day, hr, min, sec, …, ms, µs). Regions: Traditional Data Warehouse and Business Intelligence; Data at Rest; Data in Motion. Annotations: "Up to 10,000 times larger" (data scale) and "Up to 10,000 times faster" (decision frequency).]
New “Big Data” Brings New Opportunities, Requires New Analytics
• Telco Promotions: 100,000 records/sec, 6B/day; 10 ms/decision; 270TB for Deep Analytics
• DeepQA: 100s GB for Deep Analytics; 3 sec/decision
• Smart Traffic: 250K GPS probes/sec; 630K segments/sec; 2 ms/decision, 4K vehicles
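To put the figures above on a common scale, the sketch below converts per-second rates to daily volumes and per-decision latencies to the number of decisions one sequential worker could make per second. The helper names are assumptions for illustration, and the calculation assumes a rate sustained over a full 24-hour day, which the slide does not state. Under that assumption 100,000 records/sec would be 8.64 billion records/day, so the quoted 6B/day may mean that the per-second figure is a peak rather than an average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_day(rate_per_sec):
    """Daily volume if the per-second rate were sustained all day."""
    return rate_per_sec * SECONDS_PER_DAY


def average_rate(total_per_day):
    """Average per-second rate implied by a daily total."""
    return total_per_day / SECONDS_PER_DAY


def decisions_per_sec(latency_ms):
    """Decisions per second for one worker handling decisions back to back."""
    return 1000.0 / latency_ms


print(per_day(100_000))                    # 8640000000
print(round(average_rate(6e9)))            # 69444
print(decisions_per_sec(10))               # 100.0  (Telco Promotions)
print(decisions_per_sec(2))                # 500.0  (Smart Traffic)
print(round(decisions_per_sec(3000), 3))   # 0.333  (DeepQA)
```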
Petascale Analytics, Appliances and Ecosystem
Deeper Insights, Faster Decisions
Smarter Planet
Big Data is the new resource. The new opportunity is Big Analytics. Every Smarter Planet solution will depend on it.
Market leadership in the Era of Analytics will be taken by the first player to deliver high volumes of easy-to-use Smarter Planet solutions.
Ultimate success will require a Petascale Analytics Appliance and a rich ecosystem of data, algorithms and skills.
Directly integrating Reactive and Deep Analytics enables feedback-driven insight optimization
[Figure: data scale (Kilo, Mega, Giga, Tera, Peta, Exa) plotted against decision frequency (Occasional, Frequent, Real-time; yr, mo, wk, day, hr, min, sec, …, ms, µs), with Traditional Data Warehouse and Business Intelligence as the starting point. Reactive Analytics: Reality, Fast Observations, Actions. Deep Analytics: History, Deep Predictions, Hypotheses. The two are linked by Integration and Feedback.]
Government and Telco industries are leading this trend
Maximum Insight Requires Combining Deep and Reactive Analytics
Watson
• IBM Research built a computer system that is able to compete with humans at the game of Jeopardy! in a Human vs. Machine contest.
• Named “Watson,” the computer is designed to rival the human mind
• Answering questions in natural language poses a grand challenge in computer science, and the Jeopardy! clue format is a great way to showcase:
  • Broad range of topics, such as history, literature, politics, popular culture and science
  • Nature of the clues, which requires analyzing subtle meaning, irony, riddles and other complexities
• Based on the science of Question Answering (QA); differs from conventional search
• Natural Language / Human Interactions
• Critical for implementing useful business applications such as:
  • Medical diagnosis
  • Customer relationship management
  • Regulatory compliance
  • Help desk support
Feb. 14 / 15 / 16
Compute Intensive Workloads (Traditional ‘HPC’)
Fundamental Issues with Large Scale HPC Compute Intensive Workloads
• Power Efficiency
• TCO
• Programmability and Scaleout
  • Frequency is Plateaued
  • More Parallelism is needed
  • Balanced BWs are required for ‘sustained’ Perf
  • Shared Memory Model vs I/O ‘Accelerator’ Model
• Availability and Reliability
  • More Circuitry is required
  • Technology Scale makes it worse
  • Design for Availability is required
• Data Management and Cost of Storing/Moving Data
  • Time Steps & Checkpoints
  • Storage Cost, Energy Cost, BW, Latency
  • Life Cycle Management
Amount of Data Generated Growing Much Faster Than BW to Store or Retrieve it
• Example: 100x improvement in Machine Performance
  • Core Frequency has Plateaued
  • 100x Performance -> >100x more cores
  • Memory per core ~ constant? -> >100x more memory
• Checkpoint Data Increase >100x
  • Plus checkpoint frequency may increase due to reliability changes
• Time Step Data Increase at least 100x? (with Performance)
• Disk and Tape BWs are basically Plateaued (~100MB/s)
• Compression Methods are not improving much
  • Only provides ~2x BW boost at most in any event
• Capacity Growing at 20-30% CGR but not BW
• Amount of Disk/Tape needs to grow >100x to match BW
• Some relief possible with Write Duty Cycle Utilization
  • Cache locally and take full interval to write it out
  • Pre-stage Reads
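The arithmetic behind these bullets can be sketched as follows. The ~100 MB/s per-device bandwidth and the 20-30% capacity growth rate come from the slide; the checkpoint size, checkpoint interval, and allowed stall time are hypothetical values chosen only to make the numbers concrete, and the function names are likewise illustrative.

```python
import math

DEVICE_BW_MB_S = 100.0  # per-drive bandwidth quoted on the slide (~100 MB/s)


def devices_needed(checkpoint_tb, write_window_s, device_bw_mb_s=DEVICE_BW_MB_S):
    """Drives required to write one checkpoint within the given time window."""
    checkpoint_mb = checkpoint_tb * 1_000_000  # decimal units: 1 TB = 1e6 MB
    required_bw_mb_s = checkpoint_mb / write_window_s
    return math.ceil(required_bw_mb_s / device_bw_mb_s)


def years_to_grow(factor, annual_rate):
    """Years of compound growth at annual_rate needed to grow by factor."""
    return math.log(factor) / math.log(1.0 + annual_rate)


# Hypothetical machine: 100 TB checkpoint taken once per hour.
checkpoint_tb = 100
interval_s = 3600
stall_s = 300  # application blocks for 5 minutes while the checkpoint is written

blocking = devices_needed(checkpoint_tb, stall_s)     # write while the job waits
drained = devices_needed(checkpoint_tb, interval_s)   # cache locally, drain over the interval
print(blocking, drained)  # 3334 278

# The same machine after a 100x increase in memory (and so checkpoint size).
print(devices_needed(100 * checkpoint_tb, stall_s),
      devices_needed(100 * checkpoint_tb, interval_s))  # 333334 27778

# Time for capacity alone to grow 100x at 20% and 30% per year.
print(round(years_to_grow(100, 0.20), 1), round(years_to_grow(100, 0.30), 1))  # 25.3 17.6
```

With these assumed values, draining a locally cached checkpoint over the full interval cuts the drive count by roughly the ratio of interval to stall time (12x here), but a 100x larger checkpoint still needs about 100x the drives in either case.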
Example of Data Volume Gap
• Example of Data Volume Gap Growing for Commercial Users
• BW Gap is even larger!
[Figure: data volume gap chart; one series is labeled 4% CAGR]
Data Centric Computing
[Figure: storage and network hierarchy. Data Set Size Increases Downward.]
‘Network’ / ‘Domain’ levels:
• Register Stack / Functional Units
• SMP Bus / OS/SMP
• High Speed Cluster Network / Cluster/‘System’
• LAN/SAN, WAN? / Multi-Cluster, Grid?
Diagram components:
• Server(s): CPU’s, Memory, I/O, FLASH
• High Speed Cluster Network
• ‘Local’ Storage Node: Disk, Tape? Flash?
• LAN/SAN.. WAN?
• ‘Remote’ Storage Node: Disk or Tape
SUMMARY
• Data Volume and BW are Exploding in many Areas
• Multicore/Many Core Compute Intensive Systems
  • Are generating more data and faster than ever before
  • Also using more Memory due to Frequency Stabilization
• Data Storage BWs are not improving much
• Balance of Compute to I/O and Storage will need to shift
• Compute Intensive Workloads will also interact with Data Intensive Workloads in Workflow environments
• Data Life Cycle Management, Prestaging and Intelligent Writing will become increasingly important as machines grow in capability
...any Questions?
Thank you...