TRANSCRIPT
Data-Intensive Computing Symposium
Data-Rich Computing: Where It's At
Phillip B. Gibbons, Intel Research Pittsburgh
Data-Intensive Computing Symposium, March 26, 2008
Some slides are borrowed from Jason Campbell, Shimin Chen, Suman Nath, and Steve Schlosser. Remaining slides are © Phillip B. Gibbons
Data set sizes across domains:
• Particle physics: Large Hadron Collider (15 PB)
• Human genomics (7,000 PB; 1 GB/person; 200 PB+ captured; 200% CAGR)
• World Wide Web (~1 PB)
• Wikipedia (10 GB; 100% CAGR)
• Internet Archive (1 PB+)
• Typical oil company (350 TB+)
• Estimated on-line RAM in Google (8 PB)
• Personal digital photos (1,000 PB+; 100% CAGR)
• 200 of London's traffic cams (8 TB/day)
• 2004 Walmart transaction DB (500 TB)
• Annual email traffic, excluding spam (300 PB+)
• Merck bio-research DB (1.5 TB/qtr)
• One day of instant messaging in 2002 (750 GB)
• Terashake earthquake model of the LA basin (1 PB)
• MIT Babytalk speech experiment (1.4 PB)
• UPMC Hospitals imaging data (500 TB/yr)
• Total digital data to be created this year: 270,000 PB (IDC)
Data-Rich Computing: Thriving in a World Awash with Data
A sampling of the projects @ Intel Research:
• Everyday Sensing & Perception (ESP): 15 MB today, 100s of GB soon
• Cardiac CT: 4 GB per 3D scan, 1000s of scans/year
• Terashake simulations: ~1 PB for the LA basin
• Object recognition: GB today, TB needed
Building ground models of Southern California (SCEC ground model)
Steve Schlosser, Michael Ryan, Dave O'Hallaron (IRP)
Goal: Sample the entire region (600 km × 300 km × 100 km deep) at 10 m resolution
6×10^4 × 3×10^4 × 1×10^4 = 18×10^12 sample points!
~1 PB of data uncompressed
Image credit: Amit Chourasia, Visualization Services, SDSC
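A quick back-of-the-envelope check of those numbers (a minimal sketch; the bytes-per-sample figure is an inference from "~1 PB", not stated in the talk):

```python
# Sample-point count for a 600 km x 300 km x 100 km volume at 10 m spacing.
dx = 10                      # meters between samples
nx = 600_000 // dx           # 6 x 10^4
ny = 300_000 // dx           # 3 x 10^4
nz = 100_000 // dx           # 1 x 10^4
points = nx * ny * nz        # 18 x 10^12 sample points
print(points)                # 18000000000000
print(1e15 / points)         # ~56 bytes per sample point to reach ~1 PB
```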
Harvard ground model
Time to build: SCEC model ~1 day; Harvard model ~6 hours
(cluster: 50 8-core blades, 8 GB memory, 300 GB disk)
Data-Rich Computing: Where It's At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues: where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing: processing & querying must be pushed out of the data center to where the sensors are at
I know where it's at, man!
Focus of this talk: the memory hierarchy and pervasive sensing topics
Memory Hierarchy (I): CMP Architecture
Shared H/W resources:
– On-chip cache
– Off-chip pin bandwidth
[Diagram: a processor chip with several cores, each with a private L1 cache, connected by an interconnect to a (distributed) shared L2 cache; main memory sits off-chip, with longer latency and lower bandwidth.]
Memory Hierarchy (II): CMPs, Memories & Disks on a LAN
[Diagram: a cluster of nodes on a LAN, each node with memory plus SSD (flash) and/or magnetic disk.]
Cluster:
– Orders of magnitude differences in latency & bandwidth among the levels
– Differing access characteristics: quirks of disk, quirks of flash, quirks of cache coherence
Moreover, can have a WAN of such clusters
Hierarchy-Savvy Parallel Algorithm Design (HI-SPADE) project
Goal: Support a hierarchy-savvy model of computation for parallel algorithm design
Hierarchy-savvy:
– Hide what can be hidden
– Expose what must be exposed
– Sweet spot between ignorant and fully aware
Support:
– Develop the compilers, runtime systems, architectural features, etc. to realize the model
– Important component: fine-grain threading
HI-SPADE project: Initial Progress
Effectively Sharing a Cache among Threads [Blelloch & Gibbons, SPAA'04]
– First thread scheduling policy (PDF) with provably good shared-cache performance (relative to sequential cache performance) for any parallel computation
– Hierarchy-savvy: automatically get good shared-cache performance from good sequential cache performance
[Diagram: a single core with its own L2 cache and main memory vs. multiple cores sharing an L2 cache and main memory.]
Example: Parallel Merging in Merge Sort
[Figure: per-access cache behavior (cache miss / cache hit / mixed) for Parallel Depth First (PDF) vs. Work Stealing (WS), with P = 8 and a shared cache of size 0.5 × (src array size + dest array size).]
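For context, here is a minimal sketch (not the paper's code) of the divide-and-conquer merge used inside a parallel merge sort; the two recursive calls touch disjoint data and disjoint output ranges, so a scheduler such as WS or PDF can run them on different cores, and the scheduling order is what determines how well they share the cache.

```python
import bisect

def dc_merge(a, b, out, lo=0):
    """Merge sorted lists a and b into out[lo : lo + len(a) + len(b)]."""
    if len(a) < len(b):
        a, b = b, a                       # make a the larger input
    if not a:
        return                            # both inputs empty
    mid = len(a) // 2
    pivot = a[mid]
    j = bisect.bisect_left(b, pivot)      # elements b[:j] precede the pivot
    out[lo + mid + j] = pivot
    # The two sub-merges are independent; a parallel scheduler could fork them.
    dc_merge(a[:mid], b[:j], out, lo)
    dc_merge(a[mid + 1:], b[j:], out, lo + mid + j + 1)

src_a, src_b = [1, 4, 7, 8], [2, 3, 9]
dest = [None] * (len(src_a) + len(src_b))
dc_merge(src_a, src_b, dest)
print(dest)                               # [1, 2, 3, 4, 7, 8, 9]
```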
HI-SPADE: Initial Progress (II)
Scheduling Threads for Constructive Cache Sharing on CMPs [Chen et al, SPAA'07]
– Exposes differences between the theory result & practice
– Provides an automatic tool to select task granularity
[Charts: Work Stealing (ws) vs. Parallel Depth First (pdf) on simulated CMPs, for LU, Merge Sort, and Hash Join.]
HI-SPADE: Initial Progress (III)
Provably Good Multicore Cache Performance for Divide-and-Conquer Algorithms [Blelloch et al, SODA'08]
– First model considering both shared & private caches
– Competing demands: share vs. don't share
– Hierarchy-savvy: the thread scheduling policy achieves provably good private-cache & shared-cache performance, for divide-and-conquer algorithms
[Diagram: a single core with an L2 cache and main memory vs. multiple cores, each with a private L1, sharing an L2 cache and main memory.]
HI-SPADE: Initial Progress (IV)
Online Maintenance of Very Large Random Samples on Flash Storage [Nath & Gibbons, submitted]
– Flash-savvy algorithm (B-File) is 3 orders of magnitude faster & more energy-efficient than previous approaches
– Well-known that random writes are slow on flash; we show that a subclass of "semi-random" writes is fast
– Springboard for a more general study of flash-savvy algorithms based on semi-random writes (in progress)
[Photo: Lexar CF card]
Progress thus far is only the tip of the iceberg: still far from our HI-SPADE goal!
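To make "semi-random writes" concrete, here is a minimal, purely illustrative sketch of the access pattern (not the B-File algorithm itself): each write goes to a randomly chosen block among a small set of open blocks, but within each block the writes are strictly sequential appends, which flash devices handle far better than fully random page writes.

```python
import random

class SemiRandomWriter:
    def __init__(self, num_open_blocks=4, pages_per_block=64):
        self.pages_per_block = pages_per_block
        # each open block tracks [block_id, next_page_to_write]
        self.open_blocks = [[bid, 0] for bid in range(num_open_blocks)]
        self.next_block_id = num_open_blocks
        self.trace = []                        # (block_id, page) pairs

    def write(self, payload):
        blk = random.choice(self.open_blocks)  # random choice of *block*...
        self.trace.append((blk[0], blk[1]))    # ...sequential within the block
        blk[1] += 1
        if blk[1] == self.pages_per_block:     # block full: open a fresh one
            blk[0], blk[1] = self.next_block_id, 0
            self.next_block_id += 1

w = SemiRandomWriter()
for i in range(200):
    w.write(i)
# Within any given block_id, the page numbers in w.trace are strictly
# increasing, even though the interleaving across blocks is random.
```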
Data-Rich Computing: Where It’s At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues:where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing: processing & querying must be pushed out of the data center to where the sensors are at
I know where it’s at, man!
Pervasive Multimedia Sensing
Rich collection of (cheap) sensors
– Cameras, microphones, RFID readers, vibration sensors, etc.
Internet-connected; potentially Internet-scale
– Tens to millions of sensor feeds over the wide area
– Pervasive broadband (wired & wireless)
Goal: Unified system for accessing, filtering, processing, querying, & reacting to sensed data
– Programmed to provide useful sensing services
Example Multimedia Sensing Services
Consumer services:
Parking Space Finder
Lost & Found / Lost pet
Watch-my-child / Watch-my-parent
Congestion avoidance
Example Multimedia Sensing Services
Health, Security, Commerce, and Science services:
• Internet-scale Sensor Observatories
• Homeland Security
• Asset/Supply Chain Tracking
• Low Atmosphere Climate Monitoring
• Epidemic Early Warning System
[Screenshot: our prototype]
Data & Query Scaling Challenges
Data scaling
– Millions of sensors
– Globally-dispersed
– High volume feeds
– Historical data
Query scaling
– May want sophisticated data processing on all sensor feeds
– May aggregate over large quantities of data, use historical data, run continuously
– Want latest data, NOW
NetRad: 100 Mb/s
IrisNet: Internet-scale Resource-intensive Sensor Network services
General-purpose architecture for wide-area sensor systems
– A worldwide sensor web
Key goal: ease of service authorship
– Provides important functionality for all services
Intel Research Pittsburgh + many CMU collaborators
– First prototype in late 2002
– In ACM Multimedia, BaseNets, CVPR, DCOSS, Distributed Computing, DSC, FAST, NSDI (2), Pervasive Computing, PODC, SenSys, SIGMOD (2), ToSN
Data & Query Scaling in IrisNet
Store sensor feeds locally
– Too much data to collect centrally
Push data processing & filtering to sensor nodes
– Reduce the raw data to derived info, in parallel, near the source
Push (distributed) queries to sensor nodes
– Data sampled >> data queried
– Tied to a particular place: queries are often local
Exploit the logical hierarchy of sensor data
– Compute answers in-network
Processing & querying must be pushed out of the data center to where the sensors are at
IrisNet's Two-Tier Architecture
Two components:
– SAs: sensor feed processing (each SA runs senselets over its attached sensors)
– OAs: distributed database (each OA holds part of an XML database)
[Diagram: sensors feed SAs running senselets; SAs send derived data to OAs; user queries arrive via the web server for the service's URL and are routed to the OAs.]
Creating a New IrisNet Service
A service author supplies:
– A senselet (a program to filter sensor data, e.g. image-processing steps) that runs on the SAs; its results are sent to an OA, which updates the DB
– Extended code (application-specific aggregation) for the OAs
– The hierarchy (an XML schema)
– A front-end; queries are written in a standard DB language
Only 500 lines of new code for Parking Space Finder, vs. 30K lines of IrisNet code
Research focus: Fault Tolerance
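To make the senselet idea above concrete, here is a purely illustrative sketch; the function names and the update format are hypothetical stand-ins, not IrisNet's actual API.

```python
import xml.etree.ElementTree as ET

def detect_spaces(frame):
    """Stand-in for the image-processing steps a real senselet would run
    (e.g. background subtraction and car detection over a camera frame)."""
    return {"1": "no", "2": "yes"}                 # space id -> in-use?

def build_update(block_id, occupancy):
    """Turn the filtered result into an XML fragment matching the service's
    hierarchy; a senselet would send this to an OA, which updates its DB."""
    block = ET.Element("Block", id=block_id)
    for space_id, in_use in occupancy.items():
        p = ET.SubElement(block, "pSpace", id=space_id)
        ET.SubElement(p, "in-use").text = in_use
    return ET.tostring(block, encoding="unicode")

frame = object()                                   # placeholder camera frame
print(build_update("1", detect_spaces(frame)))
```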
Data-Rich Computing: Where It’s At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues [HI-SPADE]: where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing [IrisNet]: processing & querying must be pushed out of the data center to where the sensors are at
I know where it’s at, man!
Backup Slides
Techniques for Privacy Protection
Cameras raise huge privacy concerns
– Londoners are used to them; Chicago saw protests
– Viewed by law enforcement vs. viewed by the public
• IrisNet goal: exploit processing at the sensor node to implement privacy policies
• A privileged senselet detects & masks faces
• All other senselets only see the masked version
Only the tip of the iceberg
Data Organized as Logical Hierarchy
<State id=“Pennysylvinia”> <County id=“Allegheny”> <City id=“Pittsburgh”> <Neighborhood id=“Oakland”>
<total-spaces>200</total-spaces> <Block id=“1”>
<GPS>…</GPS> <pSpace id=“1”> <in-use>no</in-use>
<metered>yes</metered> </pSpace>
<pSpace id=“2”> …
</pSpace> </Block> </Neighborhood>
<Neighborhood id=“Shadyside”> …
……
…
Example XML Hierarchy
IrisNet automaticallypartitions the hierarchy
among the OAs
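Queries over such a hierarchy are written in a standard XML query language. As a hedged, self-contained illustration using Python's built-in ElementTree (not IrisNet's query engine), one could find the free parking spaces in Oakland like this:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<State id="Pennsylvania">
 <County id="Allegheny">
  <City id="Pittsburgh">
   <Neighborhood id="Oakland">
    <Block id="1">
     <pSpace id="1"><in-use>no</in-use><metered>yes</metered></pSpace>
     <pSpace id="2"><in-use>yes</in-use><metered>no</metered></pSpace>
    </Block>
   </Neighborhood>
  </City>
 </County>
</State>""")

# XPath-style lookup: the Oakland neighborhood, then its parking spaces.
oakland = doc.find(".//Neighborhood[@id='Oakland']")
free = [p.get("id")
        for p in oakland.findall(".//pSpace")
        if p.findtext("in-use") == "no"]
print(free)    # ['1']
```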
In-Network Query Processing: Query-Evaluate-Gather (QEG), IrisNet's approach

Query Q: /NE/PA/Allegheny/Pittsburgh/(Oakland | Shadyside)/ rest of query
At the Pittsburgh OA:
1. Query: queries its local XML DB; discovers Shadyside data is cached, but not Oakland; does a DNS lookup to find the IP address of the Oakland OA
2. Evaluate: evaluates the result on the data it has
3. Gather: gathers the missing data by sending subquery Q' to the Oakland OA, then combines the results & returns
Q': /NE/PA/Allegheny/Pittsburgh/Oakland/ rest of query
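A minimal sketch of the QEG control flow, with made-up hosts and data (an illustration of the idea above, not IrisNet's code):

```python
OWNERS = {                                  # DNS-style lookup: part -> owning OA
    "Oakland": "oa-oakland.example.org",
    "Shadyside": "oa-pittsburgh.example.org",
}

LOCAL_DB = {"Shadyside": {"free_spaces": 12}}   # Shadyside is cached locally

def remote_query(oa_host, part):
    """Stand-in for sending the subquery Q' to another OA over the network."""
    remote = {"oa-oakland.example.org": {"Oakland": {"free_spaces": 7}}}
    return remote[oa_host][part]

def qeg(parts):
    answer, missing = {}, []
    for part in parts:                      # Query: probe the local XML DB
        if part in LOCAL_DB:
            answer[part] = LOCAL_DB[part]
        else:                               # Evaluate: note what is missing
            missing.append(part)
    for part in missing:                    # Gather: fetch from the owning OAs
        answer[part] = remote_query(OWNERS[part], part)
    return answer                           # combine results & return

print(qeg(["Oakland", "Shadyside"]))
```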