OS II: Dependability & Trust
SWIFI-based OS Evaluations

Prof. Neeraj Suri
Stefan Winter
Dept. of Computer Science
TU Darmstadt, Germany
Dependable Embedded Systems & SW Group
www.deeds.informatik.tu-darmstadt.de

Fault Detection: Software Testing

• So far: Verification & Validation
  – Testing techniques
  – Static vs. dynamic
  – Black-box vs. white-box
• Last time: Fault Injection (FI)
  – Applications
  – Techniques
  – Some FI tools
• Today: Testing (SWIFI) of operating systems
  – WHERE: Error propagation in OSs [Johansson’05]
  – WHAT: Error selection for testing [Johansson’07]
  – WHEN: Injection trigger selection [Johansson’07]
• Next lecture: Profiling the OS extensions (state change @ runtime)

FI Recap

• Fault Injection (FI) is the process of either inserting bugs into your system or exposing your system to operational perturbations
• FI applications for dependable system development
  – Defect count estimation (fault seeding)
  – Test suite evaluation (mutation testing)
  – Security testing
  – Experimental dependability evaluations
• FI techniques
  – Physical FI
  – HW FI
  – Simulated FI
  – SWIFI

FI Recap (cont.)

• Where to apply change (location, abstraction/system level)
• What to inject (what should be injected/corrupted?)
• Which trigger to use (event, instruction, timeout, exception, …?)
• When to inject (on first/second/… trigger event)
• How often to inject (Heisen-/Bohrbugs)
• …
• What to record & interpret? For what purpose?
• How is the system loaded at the time of the injection?
  – Applications running and their load (workload)
  – System resources
  – Real, realistic, or synthetic workload

Outline for today’s lecture

• Drivers: a major dependability issue in commodity OSs
  – An error propagation view
• FI-based robustness evaluations of the kernel
  – Black box assumption
  – Fault representativeness vs. failure relevance
• Design and implementation issues of a suitable FI framework
  – Fault modeling
  – Failure modeling
  – Workloads

The problem: Drivers!

• Device drivers
  – Numerous: 250 installed (100 active) drivers in XP/Vista
  – Large & complex: 70% of Linux code base
  – Immature: every day 25 new / 100 revised versions of Vista drivers
  – Access rights: kernel mode operation in monolithic OSs
• Device drivers are the dominant cause of OS failures despite sustained testing efforts

[Figures: causes of WinXP outages; causes of Win2k outages]

The problem (cont.)

Problem statement: Driver failures lead to OS API failures

Mitigation approaches
1. Harden OS robustness
2. Improve driver reliability

The problem (cont.)

• The problem in terms of error propagation
• The effect of testing in terms of error propagation
• The effect of robustness hardening in terms of error propagation

Issues with the driver testing approach

• What if the driver is not the root cause?
• What if we cannot remove defects (e.g. commercial OSs)?

Issues with the hardening approach

• What if we cannot remove robustness vulnerabilities?
• More issues with the hardening approach in next week’s lecture...

FI-based robustness evaluations

• Fault containment wrappers are expensive
  – Additional code is an additional source of bugs
  – Runtime overhead for error checks
• Where should we add fault containment wrappers?
  – Where errors with critical effects are likely to occur
  – Where propagation is likely
  – Where critical errors propagate
• How do we know which errors propagate where?
  – Propagation analysis (cf. PROPANE)

Robustness Evaluations

[Figure: components A to F and their ranking, labelled “increasingly bad”]

Robustness Evaluations

• Experimental technique to ascertain “vulnerabilities”
  – Identify (potential) sources, error propagation & hot spots, etc.
  – Estimate their “effects” on applications
  – Component enhancement with “wrappers”
    • if (X > 100 && Y < 30) then Exception();
    • Location of wrappers
• Aspects
  – Metrics for error propagation profiles
  – Experimental analysis
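
The wrapper condition on this slide can be sketched in C as a check placed in front of a wrapped service. This is only a minimal sketch under stated assumptions: `wrapped_service`, `service_impl` and `WRAP_REJECTED` are hypothetical names, the thresholds are the ones from the slide, and since C has no exceptions the violation is reported through a status out-parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status code reported when the wrapper rejects a call. */
#define WRAP_REJECTED (-1)

/* Hypothetical stand-in for the wrapped service: returns the sum of its inputs. */
int service_impl(int x, int y) {
    return x + y;
}

/* Fault containment wrapper: applies the slide's check (X > 100 && Y < 30)
 * before forwarding the call to the real service. */
int wrapped_service(int x, int y, int *status) {
    assert(status != NULL);
    if (x > 100 && y < 30) {
        *status = WRAP_REJECTED;   /* error contained, call not forwarded */
        return 0;
    }
    *status = 0;
    return service_impl(x, y);
}
```

The extra comparison on every call is the runtime overhead mentioned earlier, which is why the placement of wrappers matters.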

System Model

[Figure: layered system model with applications on top of the operating system, which sits on top of the drivers]

Device Driver

• Model the interfaces (defined in C)
  – Export (functions provided by the driver)
  – Import (functions used by the driver)

[Figure: Driver X on top of the hardware, with exported services ds_x.1 … ds_x.m and imported services os_x.1 … os_x.n]

Metrics

• Three metrics for profiling
  1. Propagation: how errors flow through the OS
  2. Exposure: which OS services are affected
  3. Diffusion: which drivers are the sources
• Impact analysis
  – Metrics
  – Case study (WinCE)
  – Results

Service Error Permeability

1. Service Error Permeability:
  – Measures one driver’s influence on one OS service
  – Used to study service-driver relations

$$POS^{i}_{x.z} = \Pr(\text{error in } s_i \mid \text{error in } os_{x.z})$$

$$PDS^{i}_{x.y} = \Pr(\text{error in } s_i \mid \text{error in } ds_{x.y})$$

[Figure: driver D_x and OS service s_i]

OS Service Error Exposure

2. OS Service Error Exposure:
  – An application uses certain services
  – How are these services influenced by driver errors?
  – Used to compare services

$$E^{i} = \sum_{D_x} \sum_{os_{x.j}} POS^{i}_{x.j} + \sum_{D_x} \sum_{ds_{x.j}} PDS^{i}_{x.j}$$

[Figure: driver D_x and OS service s_i]

Driver Error Diffusion

3. Driver Error Diffusion:
  – Which driver affects the system the most?
  – Used to compare drivers

$$D^{x} = \sum_{s_i} \sum_{os_{x.j}} POS^{i}_{x.j} + \sum_{s_i} \sum_{ds_{x.j}} PDS^{i}_{x.j}$$

[Figure: driver D_x and OS service s_i]
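
Once permeability values have been estimated, OS Service Error Exposure and Driver Error Diffusion reduce to sums over them. The following is a minimal sketch, assuming each driver's POS and PDS values are stored as row-major matrices with one row per OS service; the type `driver_perm` and all function names are hypothetical and not taken from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Permeability values of one driver D_x (hypothetical layout):
 * pos[i * n_os + j] holds POS for OS service s_i and imported service os_x.j,
 * pds[i * n_ds + j] holds PDS for OS service s_i and exported service ds_x.j. */
typedef struct {
    const double *pos; size_t n_os;
    const double *pds; size_t n_ds;
} driver_perm;

/* Sum of row i of a row-major matrix with `cols` columns. */
double row_sum(const double *p, size_t cols, size_t i) {
    double sum = 0.0;
    for (size_t j = 0; j < cols; j++)
        sum += p[i * cols + j];
    return sum;
}

/* Driver Error Diffusion: sum over all OS services of the driver's POS and PDS values. */
double diffusion(const driver_perm *d, size_t n_services) {
    assert(d != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n_services; i++)
        total += row_sum(d->pos, d->n_os, i) + row_sum(d->pds, d->n_ds, i);
    return total;
}

/* OS Service Error Exposure of service s_i: sum over all drivers of row i. */
double exposure(const driver_perm *drivers, size_t n_drivers, size_t i) {
    assert(drivers != NULL);
    double total = 0.0;
    for (size_t x = 0; x < n_drivers; x++)
        total += row_sum(drivers[x].pos, drivers[x].n_os, i)
               + row_sum(drivers[x].pds, drivers[x].n_ds, i);
    return total;
}
```

Both metrics are sums of probabilities rather than probabilities themselves, so they are meant for ranking services and drivers and can exceed 1.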

Case Study: Windows CE

• Targeted drivers
  – Serial
  – Ethernet
• FI at interface
  – Data level errors
• Effects on OS services
  – 4 test applications

[Figure: experimental setup with host, test application, OS, manager, interceptor, target driver, and other drivers]

Error Model

• Data level errors in OS-Driver interface
  – Wrong values
  – Based on the C-type
    • Boundary
    • Special values
    • Offsets
  – Transient
  – First occurrence
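
A transient, first-occurrence injection at an interface can be pictured as an interceptor that sits in front of a service and corrupts a parameter on exactly one invocation. The sketch below is an illustration only, not the lecture's interceptor: the `interceptor` type, the single-`int` service signature and the stand-ins `echo_service` and `force_zero` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by the hypothetical intercepted service and the error model. */
typedef int (*service_fn)(int arg);

typedef struct {
    service_fn real;     /* the real service behind the interceptor */
    service_fn corrupt;  /* error model: maps the original value to the injected one */
    unsigned inject_on;  /* 1-based invocation to corrupt; 1 = first occurrence */
    unsigned calls;      /* invocations seen so far */
} interceptor;

/* Forwards one call; the parameter is corrupted on exactly one invocation,
 * which makes the injected error transient. */
int intercepted_call(interceptor *ic, int arg) {
    assert(ic != NULL && ic->real != NULL && ic->corrupt != NULL);
    ic->calls++;
    if (ic->calls == ic->inject_on)
        arg = ic->corrupt(arg);
    return ic->real(arg);
}

/* Hypothetical stand-ins so the sketch can be exercised. */
int echo_service(int arg) { return arg; }
int force_zero(int arg) { (void)arg; return 0; }
```

With `inject_on` set to 1, the first call sees the corrupted value and every later call passes through unchanged.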

Impact Analysis

• Impact ascertained via failure mode analysis
• Failure classes:
  – Class NF: No visible effect
  – Class 1: Error, no violation
  – Class 2: Error, violation
  – Class 3: OS crash/hang
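
The four failure classes form a severity order, so classifying one experiment amounts to picking the most severe effect observed. A minimal sketch, assuming each experiment yields three boolean observations; the `observation` record and its field names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { CLASS_NF = 0, CLASS_1 = 1, CLASS_2 = 2, CLASS_3 = 3 } failure_class;

/* Hypothetical observation record for one injection experiment. */
typedef struct {
    bool error_observed;  /* an error became visible at the OS service level */
    bool spec_violated;   /* the service specification was violated */
    bool os_crashed;      /* the OS crashed or hung */
} observation;

/* Returns the most severe failure class that matches the observations. */
failure_class classify(const observation *o) {
    assert(o != NULL);
    if (o->os_crashed)     return CLASS_3;
    if (o->spec_violated)  return CLASS_2;
    if (o->error_observed) return CLASS_1;
    return CLASS_NF;
}
```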

Error Model

| Error | C-Type | #cases |
|---|---|---|
| Integers | int | 7 |
| | unsigned int | 5 |
| | long | 7 |
| | unsigned long | 5 |
| | short | 7 |
| | unsigned short | 5 |
| | LARGE_INTEGER | 7 |
| Void | void * | 3 |
| Char’s | char | 7 |
| | unsigned char | 5 |
| | wchar_t | 5 |
| Boolean | bool | 1 |
| Enums | multiple | #ident’s |
| Structs | multiple | 1 |

Cases for int:

| Case # | New value |
|---|---|
| 1 | previous – 1 |
| 2 | previous + 1 |
| 3 | 1 |
| 4 | 0 |
| 5 | -1 |
| 6 | INT_MIN |
| 7 | INT_MAX |

LONG RegQueryValueEx([in] HKEY hKey,
[in] LPCWSTR lpValueName,
[in] LPDWORD lpReserved,
[out] LPDWORD lpType,
[out] LPBYTE lpData,
[in/out] LPDWORD lpcbData);

Service Error Permeability

• Ethernet driver
  – 42 imported svcs
  – 12 exported svcs
• Most Class 1
• 3 crashes (Class 3)

OS Service Error Exposure

• Serial driver
  – 50 imported svcs
  – 10 exported svcs
• Clustering of failures

Driver Error Diffusion

• Higher diffusion for Ethernet
• Most Class NF
• Failures at boot-up

| | Ethernet | Serial |
|---|---|---|
| #Experiments | 414 | 411 |
| #Injections | 228 | 187 |
| #Class NF | 330 (80%) | 377 (92%) |
| #Class 1 | 80 (19%) | 25 (7%) |
| #Class 2 | 1 | 9 |
| #Class 3 | 3 | 0 |
| Diffusion (Class 1) | 0.616 | 0.460 |
| Diffusion (Class 2) | 0.002 | 0.022 |
| Diffusion (Class 3) | 0.007 | 0 |

Error Models: “What to Inject?”

• FI’s effectiveness depends on the chosen error model being (a) representative of actual errors, and (b) effective at triggering “vulnerabilities”.
• Comparative evaluation of the “effectiveness” of different error models:
  – Fewest injections?
  – Most failures?
  – Best “coverage”?
• Propose a composite error model for enhancing FI effectiveness

Chosen Drivers & Error Models

• Error models:
  – Data-type (DT)
  – Bit-flips (BF)
  – Fuzzing (FZ)

#Injection cases per driver and error model:

| Driver | Description | DT | BF | FZ |
|---|---|---|---|---|
| cerfio_serial | Serial port | 397 | 2362 | 1410 |
| 91C111 | Ethernet | 255 | 1722 | 1050 |
| atadisk | CompactFlash | 294 | 1658 | 1035 |

Error Models – Data-Type (DT) Errors

int foo(int a, int b) {…}
int ret = foo(0x45a209f1, 0x00000000);

| Case | New value |
|---|---|
| 1 | Previous – 1 |
| 2 | Previous + 1 |
| 3 | 1 |
| 4 | 0 |
| 5 | -1 |
| 6 | INT_MIN |
| 7 | INT_MAX |

Injecting case 6 (INT_MIN = 0x80000000) into the first parameter:

int ret = foo(0x80000000, 0x00000000);

• Varied #cases depending on the data type
• Requires tracking of the types for correct injection
• Complex implementation but scales well
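
The seven data-type cases for an `int` parameter can be generated from the original value as follows. This is a minimal sketch; `dt_int_cases` is a hypothetical helper, and wrapping around at `INT_MIN`/`INT_MAX` for the two offset cases is an assumption made here to avoid signed overflow, not something the slides specify.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DT_INT_CASES 7

/* Fills out[0..6] with the data-type error values for an int parameter
 * whose original value is `previous`, in the order of the case table. */
void dt_int_cases(int previous, int out[DT_INT_CASES]) {
    assert(out != NULL);
    /* Assumption: wrap explicitly at the boundaries, since signed
     * overflow is undefined behaviour in C. */
    out[0] = (previous == INT_MIN) ? INT_MAX : previous - 1;  /* case 1 */
    out[1] = (previous == INT_MAX) ? INT_MIN : previous + 1;  /* case 2 */
    out[2] = 1;        /* case 3 */
    out[3] = 0;        /* case 4 */
    out[4] = -1;       /* case 5 */
    out[5] = INT_MIN;  /* case 6 */
    out[6] = INT_MAX;  /* case 7 */
}
```

Other C types need their own case lists, which is why this error model requires the injector to know the type of every parameter.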

Error Models – Data-Type (DT) Errors

| Data type | C-Type | #Cases |
|---|---|---|
| Integers | int | 7 |
| | unsigned int | 5 |
| | long | 7 |
| | unsigned long | 5 |
| | short | 7 |
| | unsigned short | 5 |
| | LARGE_INTEGER | 7 |
| Misc. | void * | 3 |
| | HKEY | 6 |
| | struct {…} | multiple |
| | Strings | 4 |
| Characters | char | 7 |
| | unsigned char | 5 |
| | wchar_t | 5 |
| Boolean | bool | 1 |
| Enums | | multiple cases |

Error Models – Bit-Flip (BF) Errors

int foo(int a, int b) {…}
int ret = foo(0x45a209f1, 0x00000000);

Original first parameter: 0x45a209f1 = 01000101101000100000100111110001
After flipping bit 15: 0x45a289f1 = 01000101101000101000100111110001

int ret = foo(0x45a289f1, 0x00000000);

• Typically 32 cases per parameter
• Easy to implement
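
A single bit-flip is an XOR with a one-bit mask, which is why this model is easy to implement. A minimal sketch for 32-bit parameters follows; the function names `flip_bit` and `bf_cases` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns `value` with the given bit (0 = LSB, 31 = MSB) inverted. */
uint32_t flip_bit(uint32_t value, unsigned bit) {
    assert(bit < 32);
    return value ^ (UINT32_C(1) << bit);
}

/* Writes all 32 single-bit-flip variants of `value`, one per bit position. */
void bf_cases(uint32_t value, uint32_t out[32]) {
    assert(out != NULL);
    for (unsigned bit = 0; bit < 32; bit++)
        out[bit] = flip_bit(value, bit);
}
```

Flipping bit 15 of 0x45a209f1 gives 0x45a289f1, the value used in the example call above.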

Error Models – Fuzzing (FZ) Errors

int foo(int a, int b) {…}
int ret = foo(0x45a209f1, 0x00000000);

The first parameter is replaced by a random value, here 0x17af34c2:

int ret = foo(0x17af34c2, 0x00000000);

• Selective #cases
• Simple implementation
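
Fuzzing replaces a parameter with random values, so a fuzzing injector is little more than a random number generator. The sketch below uses a seeded xorshift32 generator so that experiments are repeatable; the choice of generator and the names `xorshift32` and `fz_cases` are assumptions made for illustration, as the slides do not say how the random values are drawn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of the xorshift32 generator; the state must be non-zero. */
uint32_t xorshift32(uint32_t *state) {
    assert(state != NULL && *state != 0);
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fills out[0..n-1] with n pseudo-random replacement values for one parameter. */
void fz_cases(uint32_t seed, uint32_t *out, size_t n) {
    assert(out != NULL);
    uint32_t state = seed;
    for (size_t i = 0; i < n; i++)
        out[i] = xorshift32(&state);
}
```

The number of cases `n` is a free parameter here, which corresponds to the "selective #cases" property of this model.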

Comparison

Compare error models on:
• Number of failures
• Effectiveness
• Experimentation time
• Identifying services
• Error propagation

Failure Classes & Driver Diffusion

| Failure class | Description |
|---|---|
| No Failure | No observable effect |
| Class 1 | Error propagated, but still satisfied the OS service specification |
| Class 2 | Error propagated and violated the service specification |
| Class 3 | The OS hung or crashed |

Driver Diffusion: a measure of a driver’s ability to spread errors:

$$D^{x} = \sum_{s_i} \sum_{ds_{x.y}} PDS^{i}_{x.y}$$

Number of Failures (Class 3)

[Figure: bar chart of #C3 failures (axis 0 to 80) for the DT, BF and FZ error models on the drivers 91C111, cerfio_serial and atadisk]

Failure Classes & Driver Diffusion

Driver Diffusion (Class 3):

| Drivers | DT | BF | FZ |
|---|---|---|---|
| cerfio_serial | 1.50 | 1.05 | 1.56 |
| 91C111 | 0.73 | 0.98 | 0.69 |
| atadisk | 0.63 | 1.86 | 0.29 |

[Figure: stacked bars showing the share (0% to 100%) of No failure, Class 1, Class 2 and Class 3 outcomes for DT, BF and FZ on atadisk, 91C111 and cerfio_serial]

Experimentation Time

| Driver | Error model | Exec. time |
|---|---|---|
| cerfio_serial | DT | 5 h 15 min |
| | BF | 38 h 14 min |
| | FZ | 20 h 44 min |
| 91C111 | DT | 1 h 56 min |
| | BF | 17 h 20 min |
| | FZ | 7 h 48 min |
| atadisk | DT | 2 h 56 min |
| | BF | 20 h 51 min |
| | FZ | 11 h 55 min |

Identifying Services (Class 3)

• Which OS services can cause Class 3 failures?
• Which error model identifies most services (coverage)?
• Is some model consistently better/worse?
• Can we combine models?

Service DT BF FZ
1 X
2 X X
3 X
4 X X
5 X
6 X X
7 X X
8 X X
9 X X X
10 X X X
11 X X X
12 X
13 X
14 X X X
15 X
16 X X X
17 X
18 X

Identifying Services (Class 3 + 2)

• Which OS services can cause Class 3 failures?
• Which error model identifies most services (coverage)?
• Is some model consistently better/worse?
• Can we combine models?

Service DT BF FZ
1 O X O
2 X X O
3 X O
4 X X
5 X
6 X X
7 X X O
8 X X
9 X X X
10 X X X
11 X X X
12 O X
13 X
14 X X X
15 X
16 X X X
17 X
18 X

Bit-Flips: Sensitivity to Bit Position?

[Figure: #Services (axis 0 to 10) identified per flipped bit position, from LSB to MSB]

Bit-Flips: Bit Position Profile

[Figure: cumulative #services identified (axis 0 to 18) over bit position]

Fuzzing – Number of injections?

[Figure: diffusion (axis 0.2 to 2.0) over #Injections (1 to 15) for the drivers 91C111, cerfio_serial and atadisk]

Composite Error Model

• Let’s take the best of bit-flips and fuzzing
  – Bit-flips: bits 0-9 and 31
  – Fuzzing: 10 cases
• ~50% fewer injections
• Identifies the same service set

[Figure: bar chart of #Injections (axis 500 to 3500) per driver (cerfio_serial, 91C111, atadisk), comparing all BF & FZ cases against the composite model]
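
For one 32-bit parameter, the composite model described above yields 11 bit-flip cases (bits 0-9 and bit 31) plus 10 fuzzing cases. The following is a minimal sketch of generating these 21 values; the name `cm_cases`, the ordering of the cases and the xorshift32 generator used for the fuzzing part are assumptions, not details given in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CM_BF_CASES 11   /* flips of bits 0-9 and bit 31 */
#define CM_FZ_CASES 10   /* random replacement values */
#define CM_CASES (CM_BF_CASES + CM_FZ_CASES)

/* One xorshift32 step; an assumed generator for the fuzzing part. */
static uint32_t cm_next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fills out[] with the composite cases for `value`:
 * first the selected bit-flips, then the fuzzing values. */
void cm_cases(uint32_t value, uint32_t seed, uint32_t out[CM_CASES]) {
    assert(out != NULL && seed != 0);
    size_t n = 0;
    for (unsigned bit = 0; bit <= 9; bit++)
        out[n++] = value ^ (UINT32_C(1) << bit);
    out[n++] = value ^ (UINT32_C(1) << 31);
    uint32_t state = seed;
    for (size_t i = 0; i < CM_FZ_CASES; i++)
        out[n++] = cm_next_random(&state);
    assert(n == CM_CASES);
}
```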

Composite Error Model – Results

[Figure: stacked bars showing the share (0% to 100%) of No failure, Class 1, Class 2 and Class 3 outcomes for DT, BF, FZ and CM on atadisk, 91C111 and cerfio_serial]

Summary

Comparison across three well-established error models (data-type, bit-flips, fuzzing) plus the composite model (CM), rated on implementation, coverage and execution:

• DT: requires tracking data types; requires few experiments
• BF: found the most Class 3 failures; requires many experiments
• FZ: finds additional services
• CM: profiling gives combined BF & FZ with high coverage

Outlook: When to do the injection? More drivers, OSs, models?

On the Impact of Injection Triggers for OS Robustness Evaluation

Andréas Johansson, Neeraj Suri
Department of Computer Science, Technische Universität Darmstadt, Germany
DEEDS: Dependable Embedded Systems & SW Group, www.deeds.informatik.tu-darmstadt.de

Brendan Murphy
Microsoft Research, Cambridge, UK

Presented at ISSRE 2007

Operating System Robustness

• Operating system
  – Key operational element
  – Used in virtually all environments: robustness!
  – Drivers are a major source of failures [1] [2]

[1] Ganapathi et al., LISA’06
[2] Chou et al., SOSP’01

Operating System Robustness

• External faults: robustness
  – Drivers
  – Interfaces
• Experimental
  – Fault injection
  – Run-time
• Interface
  – OS-Driver
  – No source code
• Goal
  – Identify services with robustness issues
  – Identify drivers spreading errors

[Figure: system stack with applications, OS and drivers]

Operating System Robustness

• The issues behind FI-based OS robustness
  – Where to inject? [3]
  – What to inject? [4]
  – When to inject? [today]
• Outline
  – Problem definition
  – Call strings and call blocks
  – System and error model
  – Experimental setup and method
  – Results

[3] Johansson et al., DSN’05
[4] Johansson et al., DSN’07

Fault Injection

• Target: interface OS-Driver
• Each call is a potential injection
• Problem: too many calls
  – First-occurrence
  – Sample (uniform?)

[Figure: service invocations]

Fault Injection

• Observation: calls are not made randomly
  – Repeating sequences of calls
• Idea: select calls based on “operations”
  – Identify subsequences, select services

Call Strings & Call Blocks

• Call string
  – List of tokens (invocations) to a specific driver
• Call block
  – Subsequence of a call string
  – May be repeating
  – Corresponds to a higher level “operation”
  – Used as trigger for injection
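
Finding repeating call blocks starts with counting how often a candidate block occurs back-to-back in a call string. The sketch below assumes that each token is a single character standing for one service invocation; `count_repeats` is a hypothetical helper and not the identification method used in the study.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how many times the block s[pos .. pos+len-1] occurs back-to-back,
 * starting at pos. Returns 0 if the block does not fit into the string. */
size_t count_repeats(const char *s, size_t pos, size_t len) {
    assert(s != NULL && len > 0);
    size_t n = strlen(s);
    if (pos > n || len > n - pos)
        return 0;
    size_t count = 1;
    size_t next = pos + len;
    while (next + len <= n && memcmp(s + pos, s + next, len) == 0) {
        count++;
        next += len;
    }
    return count;
}
```

For the token string "D02775747747747732", the block "747" starting at position 6 repeats three times. A full identification step would also have to search over candidate positions and block lengths.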

System and Error Model

• Error model: bit-flips
  – Shown to be effective
  – Simple to implement
• Injection
  – Function parameter values

Experimental Process

• Execute workload
  – Record call string
• Extract call blocks
  – Select service targets (1 per call block)
• Define triggers
  – Based on tracking call blocks
• Perform injections
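
A trigger based on tracking call blocks can be sketched as a small state machine that follows the stream of invocations and fires when the target service is reached inside the chosen block. This is an illustration under stated assumptions: tokens are single characters, the `cb_trigger` type and `cb_trigger_step` function are hypothetical, and matching restarts naively on a mismatch, so blocks that overlap themselves are not handled in general.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *block;  /* call block as a token string, e.g. "747" */
    size_t target;      /* index of the target service within the block */
    size_t matched;     /* number of block tokens matched so far */
    bool fired;         /* the trigger fires at most once */
} cb_trigger;

/* Feeds one observed invocation into the tracker. Returns true if the
 * error should be injected into this invocation. */
bool cb_trigger_step(cb_trigger *t, char token) {
    assert(t != NULL && t->block != NULL && t->block[0] != '\0');
    if (t->fired)
        return false;
    size_t len = strlen(t->block);
    assert(t->target < len);
    if (token != t->block[t->matched]) {
        t->matched = 0;                /* mismatch: restart matching */
        if (token != t->block[0])
            return false;
    }
    bool hit = (t->matched == t->target);
    t->matched = (t->matched + 1) % len;
    if (hit)
        t->fired = true;
    return hit;
}
```

With block "747" and target index 2, the stream "D02775747747" fires on the ninth token, whereas a first-occurrence trigger on service 7 would already fire on the fourth.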

Injection Setup

• Target OS: Windows CE .Net
• Target HW: XScale 255

Failure Classes

| Failure class | Description |
|---|---|
| No Failure | No observable effect |
| Class 1 | Error propagated, but still satisfied the OS service specification |
| Class 2 | Error propagated and violated the service specification |
| Class 3 | The OS hung or crashed |

Selected Drivers

• Serial port driver
• Ethernet card driver

Workload/driver phases:

Serial Driver Call String and Call Blocks

Call string:

D02775(747){23}732775(747){23}23

Phases: Init, Working, Clean up

Ethernet Driver Call String and Call Blocks

Driver Profiles

• Driver invocation patterns differ
• Impact of call block injection efficiency

[Figure: invocation profiles of the serial and Ethernet drivers]

Serial Driver Results

Serial Driver Service Identification

FO δ α β1 γ1 ω1 β2 γ2 ω2
CreateThread x x x
DisableThreadLibraryCalls x x
EventModify x x
FreeLibrary x x
HalTranslateBusAddress x
InitializeCriticalSection x
InterlockedDecrement x
LoadLibrary x x
LocalAlloc x x
memcpy x x x
memset x x x
SetProcPermissions x x x
TransBusAddrToStatic x

Ethernet Driver Results

| Trigger | Serial #Injections | Serial #C3 | Ethernet #Injections | Ethernet #C3 |
|---|---|---|---|---|
| First Occ. | 2436 | 8 | 1820 | 12 |
| Call Blocks | 8408 | 13 | 2356 | 12 |

Summary

• Where, What & When?
• New timing model for interface fault injection
  – Faults in device drivers
  – Based on call strings & call blocks
• Results
  – Significant difference
  – More services
  – Driver dependent
  – Driver profiling
  – More injections (2436 vs. 8408)
  – Focus on init/clean up?

Discussion & Outlook

• Call block identification
  – Scalability?
  – New data structures (suffix trees)
• Call block selection
  – Working phase vs. initial/clean up
• Determinism & concurrency
• Workload selection
• Error models