TRANSCRIPT
Data-Intensive Computing Symposium
Data-Rich Computing: Where It's At
Phillip B. Gibbons, Intel Research Pittsburgh
Data-Intensive Computing Symposium, March 26, 2008
Some slides are borrowed from Jason Campbell, Shimin Chen, Suman Nath, and Steve Schlosser. Remaining slides are © Phillip B. Gibbons
Data set sizes across domains:
• Particle physics: Large Hadron Collider (15 PB)
• Human genomics (7,000 PB; 1 GB/person; 200 PB+ captured; 200% CAGR)
• World Wide Web (~1 PB)
• Wikipedia (10 GB; 100% CAGR)
• Internet Archive (1 PB+)
• Typical oil company (350 TB+)
• Estimated on-line RAM in Google (8 PB)
• Personal digital photos (1,000 PB+; 100% CAGR)
• 200 of London's traffic cams (8 TB/day)
• 2004 Walmart transaction DB (500 TB)
• Annual email traffic, excluding spam (300 PB+)
• Merck bio-research DB (1.5 TB/qtr)
• One day of instant messaging in 2002 (750 GB)
• Terashake earthquake model of the LA basin (1 PB)
• MIT Babytalk speech experiment (1.4 PB)
• UPMC Hospitals imaging data (500 TB/yr)
• Total digital data to be created this year: 270,000 PB (IDC)
Data-Rich Computing: Thriving in a World Awash with Data
A sampling of the projects @ Intel Research:
• Everyday Sensing & Perception (ESP): 15 MB today, 100s of GB soon
• Cardiac CT: 4 GB per 3D scan, 1000s of scans/year
• Terashake simulations: ~1 PB for the LA basin
• Object recognition: GB today, TB needed
Building ground models of Southern California (SCEC ground model)
Steve Schlosser, Michael Ryan, Dave O'Hallaron (IRP)
Goal: Sample the entire region (600 km × 300 km × 100 km deep) at 10 m resolution
6×10^4 × 3×10^4 × 1×10^4 = 18×10^12 sample points!
~1 PB of data uncompressed
Image credit: Amit Chourasia, Visualization Services, SDSC
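A quick back-of-the-envelope check of those numbers (a minimal sketch; the bytes-per-sample figure is an inference from "~1 PB", not stated in the talk):

```python
# Sample-point count for a 600 km x 300 km x 100 km volume at 10 m spacing.
dx = 10                      # meters between samples
nx = 600_000 // dx           # 6 x 10^4
ny = 300_000 // dx           # 3 x 10^4
nz = 100_000 // dx           # 1 x 10^4
points = nx * ny * nz        # 18 x 10^12 sample points
print(points)                # 18000000000000
print(1e15 / points)         # ~56 bytes per sample point to reach ~1 PB
```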
Harvard ground model
Time to build: SCEC model ~1 day; Harvard model ~6 hours
(cluster: 50 8-core blades, 8 GB memory, 300 GB disk)
Data-Rich Computing: Where It's At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues: where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing: processing & querying must be pushed out of the data center to where the sensors are at
I know where it's at, man!
Focus of this talk: the memory hierarchy and pervasive sensing topics
Memory Hierarchy (I): CMP Architecture
Shared H/W resources:
– On-chip cache
– Off-chip pin bandwidth
[Diagram: a processor chip with several cores, each with a private L1 cache, connected by an interconnect to a (distributed) shared L2 cache; main memory sits off-chip, with longer latency and lower bandwidth.]
Memory Hierarchy (II): CMPs, Memories & Disks on a LAN
[Diagram: a cluster of nodes on a LAN, each node with memory plus SSD (flash) and/or magnetic disk.]
Cluster:
– Orders of magnitude differences in latency & bandwidth among the levels
– Differing access characteristics: quirks of disk, quirks of flash, quirks of cache coherence
Moreover, can have a WAN of such clusters
Hierarchy-Savvy Parallel Algorithm Design (HI-SPADE) project
Goal: Support a hierarchy-savvy model of computation for parallel algorithm design
Hierarchy-savvy:
– Hide what can be hidden
– Expose what must be exposed
– Sweet spot between ignorant and fully aware
Support:
– Develop the compilers, runtime systems, architectural features, etc. to realize the model
– Important component: fine-grain threading
HI-SPADE project: Initial Progress
Effectively Sharing a Cache among Threads [Blelloch & Gibbons, SPAA'04]
– First thread scheduling policy (PDF) with provably good shared-cache performance (relative to sequential cache performance) for any parallel computation
– Hierarchy-savvy: automatically get good shared-cache performance from good sequential cache performance
[Diagram: a single core with its own L2 cache and main memory vs. multiple cores sharing an L2 cache and main memory.]
Example: Parallel Merging in Merge Sort
[Figure: per-access cache behavior (cache miss / cache hit / mixed) for Parallel Depth First (PDF) vs. Work Stealing (WS), with P = 8 and a shared cache of size 0.5 × (src array size + dest array size).]
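For context, here is a minimal sketch (not the paper's code) of the divide-and-conquer merge used inside a parallel merge sort; the two recursive calls touch disjoint data and disjoint output ranges, so a scheduler such as WS or PDF can run them on different cores, and the scheduling order is what determines how well they share the cache.

```python
import bisect

def dc_merge(a, b, out, lo=0):
    """Merge sorted lists a and b into out[lo : lo + len(a) + len(b)]."""
    if len(a) < len(b):
        a, b = b, a                       # make a the larger input
    if not a:
        return                            # both inputs empty
    mid = len(a) // 2
    pivot = a[mid]
    j = bisect.bisect_left(b, pivot)      # elements b[:j] precede the pivot
    out[lo + mid + j] = pivot
    # The two sub-merges are independent; a parallel scheduler could fork them.
    dc_merge(a[:mid], b[:j], out, lo)
    dc_merge(a[mid + 1:], b[j:], out, lo + mid + j + 1)

src_a, src_b = [1, 4, 7, 8], [2, 3, 9]
dest = [None] * (len(src_a) + len(src_b))
dc_merge(src_a, src_b, dest)
print(dest)                               # [1, 2, 3, 4, 7, 8, 9]
```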
HI-SPADE: Initial Progress (II)
Scheduling Threads for Constructive Cache Sharing on CMPs [Chen et al, SPAA'07]
– Exposes differences between the theory result & practice
– Provides an automatic tool to select task granularity
[Charts: Work Stealing (ws) vs. Parallel Depth First (pdf) on simulated CMPs, for LU, Merge Sort, and Hash Join.]
HI-SPADE: Initial Progress (III)
Provably Good Multicore Cache Performance for Divide-and-Conquer Algorithms [Blelloch et al, SODA'08]
– First model considering both shared & private caches
– Competing demands: share vs. don't share
– Hierarchy-savvy: the thread scheduling policy achieves provably good private-cache & shared-cache performance, for divide-and-conquer algorithms
[Diagram: a single core with an L2 cache and main memory vs. multiple cores, each with a private L1, sharing an L2 cache and main memory.]
HI-SPADE: Initial Progress (IV)
Online Maintenance of Very Large Random Samples on Flash Storage [Nath & Gibbons, submitted]
– Flash-savvy algorithm (B-File) is 3 orders of magnitude faster & more energy-efficient than previous approaches
– Well-known that random writes are slow on flash; we show that a subclass of "semi-random" writes is fast
– Springboard for a more general study of flash-savvy algorithms based on semi-random writes (in progress)
[Photo: Lexar CF card]
Progress thus far is only the tip of the iceberg: still far from our HI-SPADE goal!
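To make "semi-random writes" concrete, here is a minimal, purely illustrative sketch of the access pattern (not the B-File algorithm itself): each write goes to a randomly chosen block among a small set of open blocks, but within each block the writes are strictly sequential appends, which flash devices handle far better than fully random page writes.

```python
import random

class SemiRandomWriter:
    def __init__(self, num_open_blocks=4, pages_per_block=64):
        self.pages_per_block = pages_per_block
        # each open block tracks [block_id, next_page_to_write]
        self.open_blocks = [[bid, 0] for bid in range(num_open_blocks)]
        self.next_block_id = num_open_blocks
        self.trace = []                        # (block_id, page) pairs

    def write(self, payload):
        blk = random.choice(self.open_blocks)  # random choice of *block*...
        self.trace.append((blk[0], blk[1]))    # ...sequential within the block
        blk[1] += 1
        if blk[1] == self.pages_per_block:     # block full: open a fresh one
            blk[0], blk[1] = self.next_block_id, 0
            self.next_block_id += 1

w = SemiRandomWriter()
for i in range(200):
    w.write(i)
# Within any given block_id, the page numbers in w.trace are strictly
# increasing, even though the interleaving across blocks is random.
```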
Data-Rich Computing: Where It’s At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues:where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing: processing & querying must be pushed out of the data center to where the sensors are at
I know where it’s at, man!
Pervasive Multimedia Sensing
Rich collection of (cheap) sensors
– Cameras, microphones, RFID readers, vibration sensors, etc.
Internet-connected; potentially Internet-scale
– Tens to millions of sensor feeds over the wide area
– Pervasive broadband (wired & wireless)
Goal: Unified system for accessing, filtering, processing, querying, & reacting to sensed data
– Programmed to provide useful sensing services
Example Multimedia Sensing Services
Consumer services:
Parking Space Finder
Lost & Found / Lost pet
Watch-my-child / Watch-my-parent
Congestion avoidance
Example Multimedia Sensing Services
Health, Security, Commerce, and Science services:
• Internet-scale Sensor Observatories
• Homeland Security
• Asset/Supply Chain Tracking
• Low Atmosphere Climate Monitoring
• Epidemic Early Warning System
[Screenshot: our prototype]
Data & Query Scaling Challenges
Data scaling
– Millions of sensors
– Globally-dispersed
– High volume feeds
– Historical data
Query scaling
– May want sophisticated data processing on all sensor feeds
– May aggregate over large quantities of data, use historical data, run continuously
– Want latest data, NOW
NetRad: 100 Mb/s
IrisNet: Internet-scale Resource-intensive Sensor Network services
General-purpose architecture for wide-area sensor systems
– A worldwide sensor web
Key goal: ease of service authorship
– Provides important functionality for all services
Intel Research Pittsburgh + many CMU collaborators
– First prototype in late 2002
– In ACM Multimedia, BaseNets, CVPR, DCOSS, Distributed Computing, DSC, FAST, NSDI (2), Pervasive Computing, PODC, SenSys, SIGMOD (2), ToSN
Data & Query Scaling in IrisNet
Store sensor feeds locally
– Too much data to collect centrally
Push data processing & filtering to sensor nodes
– Reduce the raw data to derived info, in parallel, near the source
Push (distributed) queries to sensor nodes
– Data sampled >> data queried
– Tied to a particular place: queries are often local
Exploit the logical hierarchy of sensor data
– Compute answers in-network
Processing & querying must be pushed out of the data center to where the sensors are at
IrisNet's Two-Tier Architecture
Two components:
– SAs: sensor feed processing (each SA runs senselets over its attached sensors)
– OAs: distributed database (each OA holds part of an XML database)
[Diagram: sensors feed SAs running senselets; SAs send derived data to OAs; user queries arrive via the web server for the service's URL and are routed to the OAs.]
Creating a New IrisNet Service
A service author supplies:
– A senselet (a program to filter sensor data, e.g. image-processing steps) that runs on the SAs; its results are sent to an OA, which updates the DB
– Extended code (application-specific aggregation) for the OAs
– The hierarchy (an XML schema)
– A front-end; queries are written in a standard DB language
Only 500 lines of new code for Parking Space Finder, vs. 30K lines of IrisNet code
Research focus: Fault Tolerance
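To make the senselet idea above concrete, here is a purely illustrative sketch; the function names and the update format are hypothetical stand-ins, not IrisNet's actual API.

```python
import xml.etree.ElementTree as ET

def detect_spaces(frame):
    """Stand-in for the image-processing steps a real senselet would run
    (e.g. background subtraction and car detection over a camera frame)."""
    return {"1": "no", "2": "yes"}                 # space id -> in-use?

def build_update(block_id, occupancy):
    """Turn the filtered result into an XML fragment matching the service's
    hierarchy; a senselet would send this to an OA, which updates its DB."""
    block = ET.Element("Block", id=block_id)
    for space_id, in_use in occupancy.items():
        p = ET.SubElement(block, "pSpace", id=space_id)
        ET.SubElement(p, "in-use").text = in_use
    return ET.tostring(block, encoding="unicode")

frame = object()                                   # placeholder camera frame
print(build_update("1", detect_spaces(frame)))
```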
Data-Rich Computing: Where It’s At
Important, interesting, exciting research area
Cluster approach: computing is co-located where the storage is at
Memory hierarchy issues [HI-SPADE]: where the (intermediate) data are at, over the course of the computation
Pervasive multimedia sensing [IrisNet]: processing & querying must be pushed out of the data center to where the sensors are at
I know where it’s at, man!
Backup Slides
Techniques for Privacy Protection
Cameras raise huge privacy concerns
– Londoners are used to them; Chicago saw protests
– Viewed by law enforcement vs. viewed by the public
• IrisNet goal: exploit processing at the sensor node to implement privacy policies
• A privileged senselet detects & masks faces
• All other senselets only see the masked version
Only the tip of the iceberg
Data Organized as Logical Hierarchy
<State id=“Pennysylvinia”> <County id=“Allegheny”> <City id=“Pittsburgh”> <Neighborhood id=“Oakland”>
<total-spaces>200</total-spaces> <Block id=“1”>
<GPS>…</GPS> <pSpace id=“1”> <in-use>no</in-use>
<metered>yes</metered> </pSpace>
<pSpace id=“2”> …
</pSpace> </Block> </Neighborhood>
<Neighborhood id=“Shadyside”> …
……
…
Example XML Hierarchy
IrisNet automaticallypartitions the hierarchy
among the OAs
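Queries over such a hierarchy are written in a standard XML query language. As a hedged, self-contained illustration using Python's built-in ElementTree (not IrisNet's query engine), one could find the free parking spaces in Oakland like this:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<State id="Pennsylvania">
 <County id="Allegheny">
  <City id="Pittsburgh">
   <Neighborhood id="Oakland">
    <Block id="1">
     <pSpace id="1"><in-use>no</in-use><metered>yes</metered></pSpace>
     <pSpace id="2"><in-use>yes</in-use><metered>no</metered></pSpace>
    </Block>
   </Neighborhood>
  </City>
 </County>
</State>""")

# XPath-style lookup: the Oakland neighborhood, then its parking spaces.
oakland = doc.find(".//Neighborhood[@id='Oakland']")
free = [p.get("id")
        for p in oakland.findall(".//pSpace")
        if p.findtext("in-use") == "no"]
print(free)    # ['1']
```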
In-Network Query Processing: Query-Evaluate-Gather (QEG), IrisNet's approach

Query Q: /NE/PA/Allegheny/Pittsburgh/(Oakland | Shadyside)/ rest of query
At the Pittsburgh OA:
1. Query: queries its local XML DB; discovers Shadyside data is cached, but not Oakland; does a DNS lookup to find the IP address of the Oakland OA
2. Evaluate: evaluates the result on the data it has
3. Gather: gathers the missing data by sending subquery Q' to the Oakland OA, then combines the results & returns
Q': /NE/PA/Allegheny/Pittsburgh/Oakland/ rest of query
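A minimal sketch of the QEG control flow, with made-up hosts and data (an illustration of the idea above, not IrisNet's code):

```python
OWNERS = {                                  # DNS-style lookup: part -> owning OA
    "Oakland": "oa-oakland.example.org",
    "Shadyside": "oa-pittsburgh.example.org",
}

LOCAL_DB = {"Shadyside": {"free_spaces": 12}}   # Shadyside is cached locally

def remote_query(oa_host, part):
    """Stand-in for sending the subquery Q' to another OA over the network."""
    remote = {"oa-oakland.example.org": {"Oakland": {"free_spaces": 7}}}
    return remote[oa_host][part]

def qeg(parts):
    answer, missing = {}, []
    for part in parts:                      # Query: probe the local XML DB
        if part in LOCAL_DB:
            answer[part] = LOCAL_DB[part]
        else:                               # Evaluate: note what is missing
            missing.append(part)
    for part in missing:                    # Gather: fetch from the owning OAs
        answer[part] = remote_query(OWNERS[part], part)
    return answer                           # combine results & return

print(qeg(["Oakland", "Shadyside"]))
```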