Static Memory Management for Efficient Mobile Sensing Applications

University of Iowa | Mobile Sensing Laboratory Static Memory Management for Efficient Mobile Sensing Applications EMSOFT 2015 Farley Lai, Daniel Schmidt, Octav Chipara Department of Computer Science

Upload: farley-lai

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Static Memory Management for Efficient Mobile Sensing Applications

University of Iowa | Mobile Sensing Laboratory

Static Memory Management for Efficient Mobile Sensing Applications

EMSOFT 2015

Farley Lai, Daniel Schmidt, Octav Chipara

Department of Computer Science

Page 2: Static Memory Management for Efficient Mobile Sensing Applications

Introduction

Emerging Mobile Sensing Applications

• A class of applications that process continuous input data streams and may produce continuous output streams

– real-time processing

– efficient resource management

[Figure: Sensing Stream Processing example with components Speech Recording, VAD, Feature Extraction, Speaker Identifier, Speaker Models, and HTTP Upload]

Page 3: Static Memory Management for Efficient Mobile Sensing Applications

The Memory Management Challenge

• Workload: stream operations on frames of samples

– e.g., windowing, splitting, or appending

– stream operations tend to be memory intensive

• Goal: implement stream operations efficiently

– reduce memory footprint

– reduce number of memory accesses

• Challenges:

– handle complex interactions between components

– avoid unnecessary memory copies

– enable data sharing between components
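To make the "avoid unnecessary memory copies" challenge concrete, here is a minimal Python sketch of a windowing operation that hands out overlapping frames as views into one sample buffer instead of copying each frame. The function name `windows` and the frame size and hop values are illustrative assumptions, not part of the presented system.

```python
from array import array

def windows(samples, size, hop):
    """Yield overlapping frames as zero-copy views into `samples`."""
    view = memoryview(samples)
    for start in range(0, len(view) - size + 1, hop):
        # Slicing a memoryview creates a new view, not a copy of the data.
        yield view[start:start + size]

samples = array("h", range(8))   # eight 16-bit samples (hypothetical input)
frames = list(windows(samples, size=4, hop=2))

# A write to the underlying buffer is visible through every frame that
# covers index 2, which shows that the frames share storage.
samples[2] = 99
```

Because the frames alias the same storage, two components can read the same samples without a copy. The analysis question is then when that storage may safely be overwritten.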

Page 4: Static Memory Management for Efficient Mobile Sensing Applications

Approaches to Memory Management

• Dynamic memory management

– specialized data structures to implement memory management

• e.g., SigSeg [Girod, et al. 2008] – linked list of buffered samples

– a level of indirection in accessing streaming data

• Static memory management

– no runtime overhead

– requires precise knowledge of the variable live ranges

• difficult to achieve in complex applications

• must be time-efficient to be included in compilers

[Girod2008] L. Girod, Y. Mei, R. Newton, S. Rost, A. Thiagarajan, H. Balakrishnan, and S. Madden, “XStream: a Signal-Oriented Data Stream Management System,” in ICDE, 2008.

Page 5: Static Memory Management for Efficient Mobile Sensing Applications

Outline

• Application model

• Static analysis

• Memory layout

• Evaluation

• Conclusions

Page 6: Static Memory Management for Efficient Mobile Sensing Applications

A Model for Stream Applications

• StreamIt – synchronous data flow (SDF) language

– application = graph of filters connected with FIFO channels

• limited memory operations: pop(), peek(), and push()

• known consumption and production rates

[Figure: Filter::work() reads the INPUT channel with pop and peek and writes the OUTPUT channel with push]
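The filter model above can be sketched in a few lines of Python. This is not StreamIt code. `Channel` and `MovingAverage` are hypothetical names, and the rates (peek 3, pop 1, push 1) are assumed for illustration. It only shows how a work function touches its channels through pop, peek, and push with fixed rates.

```python
from collections import deque

class Channel:
    """FIFO channel exposing only the three stream operations."""
    def __init__(self, items=()):
        self.q = deque(items)
    def push(self, x):
        self.q.append(x)
    def pop(self):
        return self.q.popleft()
    def peek(self, i):
        return self.q[i]          # read without consuming
    def __len__(self):
        return len(self.q)

class MovingAverage:
    PEEK, POP, PUSH = 3, 1, 1     # rates declared up front (assumed values)
    def work(self, inp, out):
        total = sum(inp.peek(i) for i in range(self.PEEK))
        out.push(total / self.PEEK)
        inp.pop()

inp = Channel([3.0, 6.0, 9.0, 12.0])
out = Channel()
f = MovingAverage()
while len(inp) >= f.PEEK:         # fire only when enough input is buffered
    f.work(inp, out)
```

Since the rates are constants, the number of elements each firing consumes and produces is known before the program runs, which is what makes a static schedule possible.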

Page 7: Static Memory Management for Efficient Mobile Sensing Applications

A Model for Stream Applications

• StreamIt – synchronous data flow language

– applications are constructed hierarchically

• pipeline of streams

• splits and joins (splitter and joiner)

– pass-by-value semantics

• naïve implementation would incur a significant number of copies

[Figure: example stream graph with Source, Duplicate splitter, LPF1, LPF2, Round-Robin joiner, Subtract, and Sink]

Page 8: Static Memory Management for Efficient Mobile Sensing Applications

Insight

• SDFs may be executed in a cyclo-static schedule

– the complete memory behavior of the program may be observed within one execution of the schedule

• Our solution: static analysis + memory layout

[Figure: example stream graph with Source, Duplicate splitter, LPF1, LPF2, RoundRobin joiner, Subtract, and Sink]

INIT PHASE: Source,3 DUP,3 LPF1,1 LPF2,1 RR,1 Sub,1 Sink

STEADY PHASE: Source,1 DUP,1 LPF1,1 LPF2,1 RR,1 Sub,1 Sink
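The insight can be checked with a token-counting simulation. The sketch below assumes peek/pop/push rates for each filter (for example, LPFs that peek 3 and pop 1) and hypothetical channel names `s` through `f`. Neither is given on the slide. It runs an init phase and one steady phase and records the peak occupancy of every channel.

```python
# name: (inputs {channel: (peek, pop)}, outputs {channel: push})
# All rates and channel names are assumptions made for this sketch.
FILTERS = {
    "Source": ({}, {"s": 1}),
    "DUP":    ({"s": (1, 1)}, {"a": 1, "b": 1}),
    "LPF1":   ({"a": (3, 1)}, {"c": 1}),
    "LPF2":   ({"b": (3, 1)}, {"d": 1}),
    "RR":     ({"c": (1, 1), "d": (1, 1)}, {"e": 2}),
    "Sub":    ({"e": (2, 2)}, {"f": 1}),
    "Sink":   ({"f": (1, 1)}, {}),
}
INIT = [("Source", 3), ("DUP", 3), ("LPF1", 1), ("LPF2", 1),
        ("RR", 1), ("Sub", 1), ("Sink", 1)]
STEADY = [("Source", 1), ("DUP", 1), ("LPF1", 1), ("LPF2", 1),
          ("RR", 1), ("Sub", 1), ("Sink", 1)]

def run(schedule, occupancy, peak):
    """Fire each filter the scheduled number of times, tracking buffer sizes."""
    for name, times in schedule:
        ins, outs = FILTERS[name]
        for _ in range(times):
            assert all(occupancy[ch] >= peek for ch, (peek, _) in ins.items()), \
                f"{name} cannot fire yet"
            for ch, (_, pop) in ins.items():
                occupancy[ch] -= pop
            for ch, push in outs.items():
                occupancy[ch] += push
                peak[ch] = max(peak[ch], occupancy[ch])

occupancy = {ch: 0 for ch in "sabcdef"}
peak = dict(occupancy)
run(INIT, occupancy, peak)
after_init = dict(occupancy)
run(STEADY, occupancy, peak)
after_steady = dict(occupancy)
```

Under these assumed rates the channel occupancy after the steady phase equals the occupancy after the init phase, so every later iteration repeats the same memory behavior and the recorded peaks bound the buffer sizes.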

Page 9: Static Memory Management for Efficient Mobile Sensing Applications

Component Analysis

• Location Sharing

– an output element is pushed from an unmodified input element

– each I/O element is associated with a pop/push index

• Temporal Sharing

– an output element reuses the input element storage

– each I/O element is associated with a live range [i, j]

• Builds on abstract interpretation

– build a Control-Flow Graph (CFG) for each filter

– abstract interpretation of memory operations
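One way to read the temporal sharing condition is as a test on live ranges. The sketch below assumes that an output element may take over an input element's storage when the input's last use comes strictly before the output's first write. The helper name `reusable_inputs` and the sample ranges are hypothetical, and the presented analysis may use a different criterion.

```python
def reusable_inputs(lin, lout):
    """For each output element, list the input elements whose storage is
    already dead when the output is first written (assumed criterion)."""
    return {
        o: [i for i, (_, in_end) in lin.items() if in_end < out_start]
        for o, (out_start, _) in lout.items()
    }

# Live ranges [first, last] in units of the filter's memory operations.
lin = {0: (0, 0), 1: (1, 3)}     # input element 1 is still read at step 3
lout = {0: (2, 4)}               # output element 0 is first written at step 2

candidates = reusable_inputs(lin, lout)
```

Here output 0 may reuse the storage of input 0, but not that of input 1, whose live range overlaps the output's.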

Page 10: Static Memory Management for Efficient Mobile Sensing Applications

Component Analysis

• Abstract interpretation of memory operations

– memory counter (MC) – relative order of operations

– indexes of current push (out) and pop (in)

– live range for each input (LIN) and output (LOUT) element

• Indexes and live ranges represented as intervals

• Subset of rules for determining live ranges:

push: (MC, out, LOUT) ⟹ LOUT[out] := LOUT[out] ⊔ MC, out++, MC++

pop: (MC, in, LIN) ⟹ LIN[in] := LIN[in] ⊔ MC, in++, MC++

join: (MC1, in1, out1), (MC2, in2, out2) ⟹ (MC = max(MC1, MC2), in = in1 ⊔ in2, out = out1 ⊔ out2)
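The three rules can be written as a small interpreter over intervals. This is a sketch under assumptions: the names `State`, `join_iv`, and `shift` are hypothetical, live-range tables are keyed by the index interval itself (following the slides' notation), and the filter body analyzed here (one pop, a conditional push, then an unconditional push) is chosen for illustration.

```python
def join_iv(a, b):
    """Interval join (⊔): smallest interval covering both; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def shift(iv):
    """Abstract increment (x++) applied to both interval bounds."""
    return (iv[0] + 1, iv[1] + 1)

class State:
    def __init__(self, mc=0, inp=(0, 0), out=(0, 0)):
        self.mc, self.inp, self.out = mc, inp, out
    def copy(self):
        return State(self.mc, self.inp, self.out)

def pop(s, lin):
    lin[s.inp] = join_iv(lin.get(s.inp), (s.mc, s.mc))
    s.inp, s.mc = shift(s.inp), s.mc + 1

def push(s, lout):
    lout[s.out] = join_iv(lout.get(s.out), (s.mc, s.mc))
    s.out, s.mc = shift(s.out), s.mc + 1

def join(s1, s2):
    return State(max(s1.mc, s2.mc),
                 join_iv(s1.inp, s2.inp),
                 join_iv(s1.out, s2.out))

# Hypothetical filter body: pop(); if (...) push(); push()
lin, lout = {}, {}
s = State()
pop(s, lin)
taken = s.copy()
push(taken, lout)        # branch that executes the conditional push
s = join(s, taken)       # merge with the branch that skips it
push(s, lout)
```

After the join, the push index is the interval [0,1] because the analysis can no longer tell whether the conditional push happened, so the final push is recorded against that whole interval.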

Page 11: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

RULE (pop): (MC, in, LIN) ⟹ LIN[in] := LIN[in] ⊔ MC, in++, MC++

STATE before: MC = 0, in = [0,0], out = [0,0]

STATE after: MC = 1, in = [1,1], out = [0,0]

Update: LIN[0] = LIN[0] ⊔ [0,0]

Live ranges: LIN[0] = [0,0]; LOUT[0] = ∅; LOUT[1] = ∅

[Figure: CFG of the example filter]

Page 12: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

RULE (push): (MC, out, LOUT) ⟹ LOUT[out] := LOUT[out] ⊔ MC, out++, MC++

STATE before: MC = 1, in = [1,1], out = [0,0]

STATE after: MC = 2, in = [1,1], out = [1,1]

Update: LOUT[0] = LOUT[0] ⊔ [1,1]

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1]; LOUT[1] = ∅

[Figure: CFG of the example filter]

Page 13: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

RULE (join): (MC1, in1, out1), (MC2, in2, out2) ⟹ (MC = max(MC1, MC2), in = in1 ⊔ in2, out = out1 ⊔ out2)

STATE on first incoming edge: MC = 1, in = [1,1], out = [0,0]

STATE on second incoming edge: MC = 2, in = [1,1], out = [1,1]

STATE after join: MC = 2, in = [1,1], out = [0,1]

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1]; LOUT[1] = ∅

[Figure: CFG of the example filter]

Page 14: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

RULE (push): (MC, out, LOUT) ⟹ LOUT[out] := LOUT[out] ⊔ MC, out++, MC++

STATE before: MC = 2, in = [1,1], out = [0,1]

STATE after: MC = 3, in = [1,1], out = [1,2]

Update: LOUT[0,1] = LOUT[0,1] ⊔ [2,2]

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1]; LOUT[0,1] = [2,2]

[Figure: CFG of the example filter]

Page 15: Static Memory Management for Efficient Mobile Sensing Applications

Whole Program Analysis

• Component analysis constructs a memory fragment

– captures live ranges for temporal reuse

– captures location sharing edges

• Whole program analysis constructs a memory graph

– stitches together memory fragments

– simulates the schedule to

• connect location sharing edges into paths and

• extend live ranges with the phase number and invocation index

• Our approach:

– analysis is precise when there is no input dependency

– otherwise, it is a sound approximation
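Connecting location sharing edges into paths can be pictured as following each edge back to the element that first held the data. The sketch below assumes edges are stored as a mapping from an output element to the unmodified input element it was pushed from, and that the edges form no cycles. The element names and the helper `resolve_roots` are hypothetical.

```python
def resolve_roots(edges):
    """Map every shared element to the element at the start of its path."""
    def root(element):
        while element in edges:       # follow edges until the origin
            element = edges[element]
        return element
    return {element: root(element) for element in edges}

# Hypothetical fragment edges: (channel, index) -> (channel, index).
edges = {
    ("Dup.out0", 0): ("Source.out", 0),
    ("Dup.out1", 0): ("Source.out", 0),
    ("RoundRobin.out", 0): ("Dup.out0", 0),
}
roots = resolve_roots(edges)
```

All three elements resolve to the same source element, so under this reading they could share one memory location instead of being copied at the splitter and joiner.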

Page 16: Static Memory Management for Efficient Mobile Sensing Applications

Memory Layout

• Empirical insights

– split-joins can be eliminated for manipulating location-shared elements

– a filter usually can reuse its input memory

• Heuristic approaches to resolving temporal reuse conflicts

[Figure: memory layouts for components A and B in three cases: No conflict, Append on Conflict (AoC), and Insert-in-Place (IP); legend: A, B, other comps, A memory, B memory]

Page 17: Static Memory Management for Efficient Mobile Sensing Applications

Evaluation

Experimental Setup

• Intel x86_64 on Mac OS X 10.10.3

– 3GHz Intel Xeon CPU E5-1680 v2

– 32KB L1 instruction + 32KB L1 data caches

– 256KB L2 + 25MB L3 caches

• StreamIt Compiler

– baseline default settings without optimizations

– enabled cache optimizations with –cacheopt

– gcc -O3 to compile generated C/C++ code

• 11 micro benchmarks from StreamIt

• 3 macro benchmarks from real MSAs

– BeepBeep [Peng, C., et al. 2007]

– MFCC and Crowd [Xu, C., et al. 2013]

Page 18: Static Memory Management for Efficient Mobile Sensing Applications

Memory Usage on Intel x86_64

– ESMS reduces both channel buffer sizes and the number of memory operations from splitters, joiners, and reordering filters

– 45% to 96% reductions, 73% reduction on average

Page 19: Static Memory Management for Efficient Mobile Sensing Applications

Speedup on Intel x86_64

– Compared with baseline StreamIt, the average speedups of AA, AoC, and IP are 3, 3.1, and 3, while the average speedup of CacheOpt is merely 1.07.

– ESMS improves the performance by eliminating unnecessary memory operations and reducing cache/memory references.

Page 20: Static Memory Management for Efficient Mobile Sensing Applications

Conclusions

• Static memory management is effective for stream languages

– whole program memory behaviors may be characterized

– both location and temporal sharing opportunities are exploited

– performance improvement due to fewer memory operations and references

• ESMS provides significant performance improvements

– 45% to 96% data size reduction

– 73% code size reduction

– 3X speedup

Page 21: Static Memory Management for Efficient Mobile Sensing Applications

Acknowledgements

• National Science Foundation (NeTs grant #1144664)

• Carver Foundation (grant #14-43555)

CSense Toolkit

Page 22: Static Memory Management for Efficient Mobile Sensing Applications

Questions?

Thank You