Efficient Diagnostic Tracing For Wireless Sensor Networks
Vinaitheerthan Sundaram, Patrick Eugster, Xiangyu Zhang
ACM SenSys 2010
2
Motivation Wireless Sensor Network (WSN) deployments
Great Duck Island, MacroScope, Volcano, SensorScope, LOFAR, VigilNet, ExScal, PermaSense
Deployment lessons
Murphy loves WSN deployments!
Low data yield reported (2% to 70%) [Beutel et al.]*
Several deployment failures went unexplained
Post-deployment diagnosis is important but challenging
* “Deployment Techniques for Sensor Networks”, J. Beutel, K. Römer, M. Ringwald, M. Woehrle
3
Our Approach: Efficient Diagnostic Tracing
We record the control-flow path
Advantages
  Many faults manifest as abnormal control-flow: split-phase faults, initialization faults, finite-state machine faults
  Basic-block-level accuracy of preemption information helps expose data races
  Effective compression of repetitive computation
Roadmap Motivation Efficient diagnostic tracing
Trace concurrency Trace control-flow Run-time compression
Implementation Evaluation Related work Conclusions
4
5
Tracing Concurrency in TinyOS
Events (“async”) are interrupts that drive the execution
Tasks run when there are no events
Tasks cannot preempt other tasks or events
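[Figure: execution timeline by interrupt level; event E1 is preempted by E2 and then resumes, task T1 is preempted by E1 and then resumes (execution order E1 E2 E1 T1 E1 T1)]
Grammar
  Trace → EUnit*
  EUnit → ID Event* $ | ε
  ID → Tid | Eid
  Event → Eid Event* $ | ε
Nested!
Trace for the timeline: E1 E2 $ $ T1 E1 $ $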
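The nesting in the trace grammar can be illustrated with a small Python sketch. The names `Tracer`, `enter`, `leave`, and `is_well_nested` are hypothetical and not taken from the talk; the sketch only assumes that an ID is written when a task or event starts and a `$` when it ends.

```python
# Hypothetical sketch: class and function names are illustrative only.
class Tracer:
    def __init__(self):
        self.trace = []   # emitted symbols, in order
        self.depth = 0    # how many units are currently active (nesting depth)

    def enter(self, unit_id):
        """Record the start of a task or event handler."""
        self.trace.append(unit_id)
        self.depth += 1

    def leave(self):
        """Record the end of the innermost running unit with '$'."""
        assert self.depth > 0, "leave() without matching enter()"
        self.trace.append("$")
        self.depth -= 1


def is_well_nested(trace):
    """Check that every ID has a matching '$' and no '$' comes too early."""
    depth = 0
    for sym in trace:
        depth += -1 if sym == "$" else 1
        if depth < 0:
            return False
    return depth == 0


t = Tracer()
t.enter("E1"); t.enter("E2"); t.leave(); t.leave()   # E2 preempts E1
t.enter("T1"); t.enter("E1"); t.leave(); t.leave()   # E1 preempts T1
print(" ".join(t.trace))  # E1 E2 $ $ T1 E1 $ $
```

Because a preempting event always finishes before the unit it interrupted, the `$` markers close in last-in, first-out order, which is why a context-free grammar suffices to describe the trace.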
Roadmap Motivation Efficient diagnostic tracing
Trace concurrency Trace control-flow
intra-procedural inter-procedural
Run-time compression Implementation Evaluation Related work Conclusions
6
7
Tracing Control-Flow: Intra-Procedural Encode n acyclic control-flow paths with an integer in [0, n-1]
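[Figure: control-flow graph of Foo() with nodes Entry, A: if (p1), B: s1, C: s2, D: if (p2), E: s3, Exit; instrumentation: id = 0 at entry, id += 2 and id += 1 on edges, Output(id) at exit]
Foo() {
A.  if (p1)
B.    s1;
    else
C.    s2;
D.  if (p2)
E.    s3;
}
Path | id
ABDE | 0
ABD  | 1
ACDE | 2
ACD  | 3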
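As a rough illustration, Foo() can be written in Python with the instrumentation made explicit. The placement of the two increments (on the A to C edge and on the D to Exit edge) is inferred from the path/id table rather than stated on the slide, and the function name `foo` and its return value are assumptions of this sketch.

```python
def foo(p1, p2):
    """Return (blocks visited, path id) for one run of the instrumented Foo()."""
    blocks = ["A"]
    path_id = 0                # id = 0 at entry
    if p1:
        blocks.append("B")     # s1
    else:
        path_id += 2           # increment assumed to sit on the A -> C edge
        blocks.append("C")     # s2
    blocks.append("D")
    if p2:
        blocks.append("E")     # s3
    else:
        path_id += 1           # increment assumed to sit on the D -> Exit edge
    return "".join(blocks), path_id   # Output(id) at exit


for p1 in (True, False):
    for p2 in (True, False):
        print(foo(p1, p2))
# ('ABDE', 0), ('ABD', 1), ('ACDE', 2), ('ACD', 3)
```

A single small integer per invocation is enough to identify which of the four acyclic paths was taken.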
8
Code Instrumentation
Ball-Larus (BL) algorithm [MICRO ’96]
Optimal encoding that uses minimal instrumentation
[1] Each node, v, is annotated with the number of paths to the exit
• Num(v) = Σ Num(w), summed over the children w of v
Divide encoding space at each node
[Figure: control-flow graph of Foo() annotated with path counts and id ranges: Entry 4 [0,3], A 4 [0,3], B 2 [0,1], C 2 [2,3], D 2 [0,1], E 1, Exit 1]
Foo()
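Step [1] can be sketched in Python as a bottom-up count over the acyclic control-flow graph. The adjacency-list representation and the name `num_paths` are assumptions made for this illustration; the graph is the Foo() example from the slides.

```python
from functools import lru_cache

# CFG of Foo() as an adjacency list (node -> successors).
CFG = {
    "Entry": ["A"],
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E", "Exit"],
    "E": ["Exit"],
    "Exit": [],
}


def num_paths(cfg):
    """Return Num(v), the number of acyclic paths from v to the exit, for every v."""
    @lru_cache(maxsize=None)
    def num(v):
        children = cfg[v]
        # The exit node has exactly one (empty) path to itself.
        return 1 if not children else sum(num(w) for w in children)

    return {v: num(v) for v in cfg}


print(num_paths(CFG))
# {'Entry': 4, 'A': 4, 'B': 2, 'C': 2, 'D': 2, 'E': 1, 'Exit': 1}
```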
9
Code Instrumentation BL algorithm
[1] Each node, v, is annotated with the number of paths to the exit
• Num(v) = Σ Num(w), summed over the children w of v
[2] At forks, annotate each edge with the sum of paths contributed by preceding edges
[Figure: control-flow graph of Foo() with path counts, id ranges, and edge increments: +0 on A to B, +2 on A to C, +0 on D to E, +1 on D to Exit]
Example: a node v with 8 paths and range [0,7] has children w, x, y with 2, 4, 2 paths; the edges to w, x, y are annotated +0, +2, +6, giving ranges [0,1], [2,5], [6,7]
Foo()
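Steps [1] and [2] can be combined in one traversal, sketched below in Python. The function name `ball_larus_increments` and the dictionary-based graph format are assumptions of this sketch, not the authors' implementation; successor order decides which edge counts as "preceding".

```python
def ball_larus_increments(cfg):
    """Return (num, inc): path count per node and increment per edge.

    cfg maps each node to an ordered list of successors and must be acyclic.
    """
    num, inc = {}, {}

    def visit(v):
        if v in num:
            return num[v]
        if not cfg[v]:
            num[v] = 1          # exit node: one path
            return 1
        total = 0
        for w in cfg[v]:
            inc[(v, w)] = total  # paths contributed by the preceding edges
            total += visit(w)
        num[v] = total
        return total

    for v in cfg:
        visit(v)
    return num, inc


foo_cfg = {
    "Entry": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"],
    "D": ["E", "Exit"], "E": ["Exit"], "Exit": [],
}
foo_num, foo_inc = ball_larus_increments(foo_cfg)
print(foo_inc[("A", "C")], foo_inc[("D", "Exit")])  # 2 1

# The v/w/x/y example: children with 2, 4, and 2 paths.
fan = {"v": ["w", "x", "y"], "exit": []}
for parent, k in (("w", 2), ("x", 4), ("y", 2)):
    leaves = [f"{parent}{i}" for i in range(k)]
    fan[parent] = leaves
    for leaf in leaves:
        fan[leaf] = ["exit"]
fan_num, fan_inc = ball_larus_increments(fan)
print(fan_num["v"], fan_inc[("v", "w")], fan_inc[("v", "x")], fan_inc[("v", "y")])  # 8 0 2 6
```

Edges with increment 0 need no instrumentation at run-time, which is where the savings come from.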
10
Tracing Control-Flow: Intra-Procedural
Intra-procedural trace
  Trace → EUnit*
  EUnit → ID (Event | Path)* $ | ε
  ID → Tid | Eid
  Path → Pid
  Event → Eid Event* $ | ε
[Figure: execution timeline by interrupt level; E1 is preempted by E2, T1 is preempted by E1]
Trace for the timeline: E1 E2 2 $ 1 $ T1 E1 0 $ 0 $
[Figure: instrumented control-flow graph of Foo(): Output(label) and id = 0 at entry, id += 2 and id += 1 on edges, Output(id) and Output(‘$’) at exit]
Trace: Foo 2 $
Foo()
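To read such a trace offline, a recorded path id has to be mapped back to the basic blocks that were executed. The Python sketch below shows one way to do this for Foo(); the increment table is the one inferred from the path/id table in the slides, and the name `decode` is hypothetical.

```python
FOO_CFG = {
    "Entry": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"],
    "D": ["E", "Exit"], "E": ["Exit"], "Exit": [],
}
# Non-zero edge increments for Foo(); every other edge adds 0.
FOO_INC = {("A", "C"): 2, ("D", "Exit"): 1}


def decode(path_id, cfg, inc, entry="Entry", exit_node="Exit"):
    """Rebuild the block sequence for a path id by replaying the increments."""
    blocks, v, remaining = [], entry, path_id
    while cfg[v]:
        # Follow the successor with the largest increment that still fits.
        candidates = [w for w in cfg[v] if inc.get((v, w), 0) <= remaining]
        w = max(candidates, key=lambda s: inc.get((v, s), 0))
        remaining -= inc.get((v, w), 0)
        if w != exit_node:
            blocks.append(w)
        v = w
    return "".join(blocks)


print(decode(2, FOO_CFG, FOO_INC))  # ACDE, i.e. the trace "Foo 2 $"
```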
Roadmap Motivation Efficient diagnostic tracing
Trace concurrency Trace control-flow
intra-procedural inter-procedural
Run-time compression Implementation Evaluation Related work Conclusions
11
12
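Tracing Control-Flow: Inter-Procedural
Allow function calls inside events and tasks
[Figure: control-flow graphs of E1() (nodes A, B: F(), C, D, E, G) and F() (nodes I, J, K, L, M), each annotated separately with BL path counts (2 paths each) and a +1 edge increment]
Functions are nested
Treat functions as events or tasks
Trace: E1 F 0 $ 1 $
E1 | F | Inter-Procedural Path
0  | 0 | 0
1  | 0 | 1
0  | 1 | 2
1  | 1 | 3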
13
Tracing Control-Flow: Inter-Procedural
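Encode callee’s control-flow within the caller’s control-flow path
Inline F() inside E1()
[Figure: control-flow graph of E1() with F() (nodes I, J, K, L, M) inlined at call site B; the combined graph has 4 paths with edge increments +2 and +1]
Trace with inlining: E1 2 $
Trace without inlining: E1 F 0 $ 1 $
Inlining is expensive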
14
Inter-Procedural Path: Context-Sensitivity
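Edge increment inside function F depends on the call site
[Figure: F() (nodes I, J, K, L, M) is called from E1() (nodes A, B: F(), C, D, E, G) and from T1() (nodes A, B: F(), C); call edges connect each call site to F(), and the path counts and increments differ per call site]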
15
Inter-Procedural Summary Analysis
n - number of paths in F()
x - number of paths to exit after the call site of F()
p - path taken in F() at run-time. Note, p ∈ [0, n-1]
[1] Compute BL individually
[2] Adjust at call site
  Annotate node with n*x (each of the n paths in the callee can be trailed by one of the x paths in the caller)
  Annotate edge with +p*x (skip the p*x inter-procedural paths, ids 0 to p*x - 1, that precede path p)
[3] Recompute BL from call site to entry
[Figure: E1() calls F() at call site S1; F() has n paths, the rest of E1() after the call has x paths, the call-site node is annotated n*x with id range [0, n*x-1], and the call edge carries +p*x]
16
Inter-Procedural Summary Analysis
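n - number of paths in F()
x - number of paths to exit after the call site of F()
p - path taken in F() at run-time
[Figure: F() (nodes I, J, K, L, M) with n = 2 and a +1 edge increment; E1() (nodes A, B: F(), C, D, E, G) with x = 2 after the call site, n*x = 4 at the call site, a +1 edge increment after the call, and +(p*2) on the call edge]
F() path | id
IJKM | 0
IJLM | 1
Inter-procedural path | id
AB 0 CDG | 0
AB 0 CEG | 1
AB 1 CDG | 2
AB 1 CEG | 3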
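The arithmetic of the summary analysis can be checked with a few lines of Python. The helper names `inter_id` and `split_id` are hypothetical, and `c` denotes the id (in [0, x-1]) of the caller path taken after the call site, a symbol introduced here for illustration; for the example, CDG is assumed to be caller path 0 and CEG caller path 1, as the table suggests.

```python
def inter_id(p, c, x):
    """Combine callee path p with caller suffix path c (0 <= c < x)."""
    return p * x + c


def split_id(path_id, x):
    """Recover (p, c) from an inter-procedural path id."""
    return divmod(path_id, x)


X = 2                                 # paths in E1() after the call site of F()
F_PATHS = {0: "IJKM", 1: "IJLM"}      # n = 2 paths inside F()
SUFFIXES = {0: "CDG", 1: "CEG"}       # assumed numbering of the caller suffixes

for path_id in range(len(F_PATHS) * X):
    p, c = split_id(path_id, X)
    print(path_id, f"AB {p} {SUFFIXES[c]}", "with F() path", F_PATHS[p])
# 0 AB 0 CDG, 1 AB 0 CEG, 2 AB 1 CDG, 3 AB 1 CEG
```

Multiplying by x instead of inlining keeps the callee's own numbering intact, so F() is analysed once and only the call edge is adjusted.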
17
Tracing Control-Flow: Inter-Procedural
Inter-procedural trace
[Figure: execution timeline by interrupt level; E1 is preempted by E2, T1 is preempted by E1]
Trace for the timeline: E1 E2 4 $ 3 $ T1 E1 4 $ 3 $
Roadmap Motivation Efficient diagnostic tracing
Trace concurrency Trace control-flow Run-time compression
Implementation Evaluation Related work Conclusions
18
19
Run-time Trace Compression
WSNs repeat the sequence of tasks or events at run-time
Pattern replacement
  Find the two most frequent patterns by profiling offline
  Online, replace patterns with symbols recursively
Run-length encoding
  Loop compression
  Run of symbols
Roadmap Motivation Efficient diagnostic tracing Implementation Evaluation Related work Conclusions
20
21
Implementation
TinyTracer - Automatically instruments the program and generates traces at run-time into flash (radio)
Multiple granularity
  Component – SurgeM
  Function – SurgeM__Timer__Fired
Tools
  CIL, TinyOS 1.x, nesC 1.3
Roadmap Motivation Efficient diagnostic tracing Implementation Evaluation
Effectiveness case studies of common faults
Efficiency overhead measurements
Related work Conclusions
22
23
Case Studies of Common Faults
Initialization faults
  EEPROM component in TinyOS 1.x (previously unknown)
Split-phase faults
  High-level data race in LEACH implementation
  Failure handling in DeferredPowerManagerM
Task queue overrun
  PermaSense [Keller et al., SenSys’09]
  CC1000 radio deadlock
State machine implementation faults
  VoltageM, EEPROM
24
Hypothetical Bug Case Study
state = 0; /* {0, 1} */
Event E1() { state = 0; post T1; }
Event E2(input) { if (input > threshold) state = 1; … }
Task T1() { if (state == 0) sendToBase(); else /* bug */ }
Notation
  true path = 1, false path = 0
  Normal input (N): E1 0 $ E2 0 $ T1 1 $
  Abnormal input (A): E1 0 $ E2 1 $ T1 0 $
Trace: NNNNNNNNNNANNN (failure at A)
From the trace, localize the failure to the path taken inside T1
Comparison with function call tracing
  Normal operation – E1 E2 T1
  Abnormal operation – E1 E2 T1
  The two call sequences are identical, so function call tracing cannot tell the runs apart
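The localization step can be mimicked offline with a short Python sketch that compares each recorded round against the normal trace. The helper names `units`, `diff`, and `calls` are hypothetical, and the sketch assumes the flat "ID path $" triples used in this case study.

```python
NORMAL = "E1 0 $ E2 0 $ T1 1 $"
ABNORMAL = "E1 0 $ E2 1 $ T1 0 $"


def units(trace):
    """Map each unit ID to its path id, assuming flat 'ID path $' triples."""
    toks = trace.split()
    return {toks[i]: int(toks[i + 1]) for i in range(0, len(toks), 3)}


def diff(reference, observed):
    """Return {unit: (expected path, observed path)} for units that differ."""
    ref, obs = units(reference), units(observed)
    return {u: (ref[u], obs[u]) for u in ref if ref[u] != obs[u]}


def calls(trace):
    """What function call tracing would record: unit IDs only."""
    return list(units(trace))


rounds = [NORMAL] * 10 + [ABNORMAL] + [NORMAL] * 3     # NNNNNNNNNNANNN
anomalies = [(i, diff(NORMAL, r)) for i, r in enumerate(rounds) if r != NORMAL]
print(anomalies)                          # [(10, {'E2': (0, 1), 'T1': (1, 0)})]
print(calls(NORMAL) == calls(ABNORMAL))   # True
```

The differing path id in T1 points at the else branch, while the call sequences alone give no hint that anything went wrong.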
25
Overhead Measurements
Metrics
  Overheads - energy, RAM, and code size
  Trace size
Benchmarks
  Standard TinyOS programs
    Blink – blinks red LED every second
    Oscilloscope – samples light every 1/8th second
    Surge – samples light every 2 seconds
    CountToLedsAndRfm – displays and broadcasts counter every 1/4th second
  LRX (Golden Gate bridge monitoring)
    A component to send large packets reliably
    Test program sends large messages every 2 seconds
    About 1000 lines of nesC code
Ran benchmarks for 30 minutes
26
Energy Overhead for Surge
• Overhead mainly caused by flash
• Inter-procedural tracing overhead is about 1.3% for 4 components
• Inter-procedural tracing consumes less energy than function-call tracing
[Figure: stacked bar chart of energy in Joules (axis 0 to 300), split into flash, cpu, and other, for the untraced base and for inter-procedural ("inter") vs. function-call ("func") tracing as components are added: 1. App, 2. (1) + LED, 3. (2) + Sensor, 4. (3) + Radio, 5. (4) + Timer; overhead labels on the bars: 46%, 72%, 6.4%, 2.4%, 1.3%, 0.9%]
27
Memory Overheads
RAM overhead
Data Structure | RAM (in bytes)
Flash Pages (16) | 256
Circular Buffer (2) | 384
Miscellaneous | 300
Program memory overhead
[Figure: bar chart of program memory for Blink, Oscil, Surge, Count, and LRX, including an "Uninstrumented" series (axis 0 to 40,000, labeled KB)]
28
Trace Size
Ran benchmarks for 30 minutes
Shows trace can be compressed well
Benchmark | Compressed trace size (bytes/second) | Compression ratio (uncompressed/compressed)
Blink | 0.2 | 21.71
Oscilloscope | 2.7 | 11.99
Surge | 0.3 | 14.11
CountToLedsRfm | 2.0 | 4.625
LRX | 51.7 | 1.74
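For a sense of scale, the table can be turned into approximate totals for the 30-minute runs. The Python sketch below only multiplies the reported figures; since the rates in the table are rounded, the results are rough estimates and not numbers reported in the talk.

```python
# (compressed bytes/second, compression ratio) as listed in the table.
ROWS = {
    "Blink": (0.2, 21.71),
    "Oscilloscope": (2.7, 11.99),
    "Surge": (0.3, 14.11),
    "CountToLedsRfm": (2.0, 4.625),
    "LRX": (51.7, 1.74),
}
DURATION_S = 30 * 60   # benchmarks ran for 30 minutes


def totals(rate, ratio, duration_s=DURATION_S):
    """Return (compressed bytes, estimated uncompressed bytes) for one run."""
    compressed = rate * duration_s
    return compressed, compressed * ratio


for name, (rate, ratio) in ROWS.items():
    compressed, uncompressed = totals(rate, ratio)
    print(f"{name}: about {compressed:.0f} bytes compressed, "
          f"{uncompressed:.0f} bytes uncompressed")
```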
Roadmap Motivation Efficient diagnostic tracing Implementation Evaluation Related work Conclusions
29
30
Related Work
Network faults
  Sympathy [Ramanathan et al., SenSys’05], PAD [Liu et al., SenSys’08], SNMS [Tolle and Culler, EWSN’05]
Logging
  NodeMD [Krunic et al., MobiSys’07], LIS [Shea et al., DATE’10], Dustminer [Khan et al., SenSys’08]
Visibility
  Marionette [Whitehouse et al., SenSys’06], Clairvoyant [Yang et al., SenSys’07], Hermes [Kothari et al., IPSN’08]
31
Conclusions
We showed the feasibility of program tracing in WSNs
Our contributions
  Novel context-free grammar execution encoding
  Efficient inter-procedural control-flow path recording
  Effective trace compression scheme
Looking ahead
  Better compression techniques
  Distributed tracing schemes
  Variable value tracing
32
Thank you
Q & A