
Page 1: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Christopher Theisen†, Kim Herzig‡, Patrick Morrison†, Brendan Murphy‡, Laurie Williams†

†North Carolina State University‡Microsoft Research, Cambridge UK

Approximating Attack Surfaces with Stack Traces

Page 2: Approximating Attack Surfaces with Stack Traces [ICSE 15]

1/17 Introduction | Methodology | Results and Discussion | Future Work | Conclusion


Page 4: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Before we start…

What is the “Attack Surface” of a system?

…easy to say, hard to define (practically).

An early approximation of the attack surface, from Manadhata [2], covers only API entry points.

The (OWASP) Attack Surface of an application is [1]:

1. …paths into and out of the application
2. the code that protects these paths
3. all valuable data used in the application
4. the code that protects that data


[1] https://www.owasp.org/index.php?title=Attack_Surface_Analysis_Cheat_Sheet&oldid=156006
[2] Manadhata, P., Wing, J., Flynn, M., & McQueen, M. (2006, October). Measuring the attack surfaces of two FTP daemons. In Proceedings of the 2nd ACM Workshop on Quality of Protection (pp. 3-10). ACM.

Page 5: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Our goal is to aid software engineers in prioritizing security efforts by approximating the attack surface of a system via stack trace analysis.

Page 6: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Proposed Solution

Stack traces represent user activity that puts the system under stress

There’s a defect of some sort; does it have security implications?

Stack traces may localize security flaws

Crashes caused by user activity
Bad input that was handled improperly, et cetera

Crashes are a DoS attack by definition; you brought the service or system down!

Hardware crashes are excluded


Page 7: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Research Questions

RQ1: How effectively can stack traces be used to approximate the attack surface of a system?

RQ2: Can the performance of vulnerability prediction be improved by limiting the prediction space to the approximated attack surface?


Page 8: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Overview

Catalog all code that appears on stack traces


Page 11: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Data Sources


[4] "Description of the Dr. Watson for Windows," Microsoft Corporation, [Online]. Available: http://support.microsoft.com/kb/308538/en-us.


Page 12: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Attack Surface Construction (RQ1)

Data source, Crash ID, binary [4000+], filename [100,000+], function [10,000,000+]

Crashes provide: binary, function

foo!foobarDeviceQueueRequest+0x68
foo!fooDeviceSetup+0x72
foo!fooAllDone+0xA8
bar!barDeviceQueueRequest+0xB6
bar!barDeviceSetup+0x08
bar!barAllDone+0xFF
center!processAction+0x1034
center!dontDoAnything+0x1030
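The frames above follow a `binary!function+0xOFFSET` shape; a minimal parsing sketch for cataloging the binaries and functions they name (hypothetical helper names, not the paper's actual tooling):

```python
import re

# Sketch: parse Windows-style stack frames of the form "binary!function+0xOFFSET"
# into (binary, function) pairs. The regex and the frame list below are
# illustrative only, not the authors' actual tooling.
FRAME_RE = re.compile(r"^(?P<binary>[^!]+)!(?P<function>[^+]+)\+0x[0-9A-Fa-f]+$")

def parse_frame(frame):
    """Return (binary, function) for one stack-trace frame, or None if unparseable."""
    m = FRAME_RE.match(frame)
    return (m.group("binary"), m.group("function")) if m else None

def catalog(traces):
    """Catalog every binary and function appearing on any stack trace."""
    binaries, functions = set(), set()
    for trace in traces:
        for frame in trace:
            parsed = parse_frame(frame)
            if parsed:
                binaries.add(parsed[0])
                functions.add(parsed)  # qualify each function by its binary
    return binaries, functions

trace = ["foo!foobarDeviceQueueRequest+0x68",
         "foo!fooDeviceSetup+0x72",
         "bar!barDeviceSetup+0x08"]
bins, funcs = catalog([trace])
print(sorted(bins))  # ['bar', 'foo']
```

Cataloging at both the binary and the function level matters later, since the paper asks whether different granularities change the approximation.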


Page 13: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Results (RQ1)

                   Fuzzing   User-Induced Crashes
%binaries            0.9%         48.4%
%vulnerabilities    14.9%         94.6%

Microsoft targets fuzzing towards high-risk modules.
We are covering the majority of vulnerabilities seen!

Targeting different crashes gets different results
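The coverage figures above boil down to set arithmetic; a hypothetical sketch of the two measures (made-up binary sets, not Microsoft's data):

```python
# Sketch: the two RQ1 coverage measures as set arithmetic.
# The binary sets below are hypothetical stand-ins, not Microsoft's data.
def coverage(on_traces, all_binaries, vulnerable):
    """Fraction of all binaries, and of vulnerable binaries, seen on stack traces."""
    pct_binaries = len(on_traces & all_binaries) / len(all_binaries)
    pct_vulnerabilities = len(on_traces & vulnerable) / len(vulnerable)
    return pct_binaries, pct_vulnerabilities

all_bins = {"foo", "bar", "center", "baz"}
crashed = {"foo", "bar"}   # binaries appearing on crash stack traces
vuln = {"foo", "baz"}      # binaries with known vulnerabilities
print(coverage(crashed, all_bins, vuln))  # (0.5, 0.5)
```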



Page 15: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Prediction Models (RQ2)

We believe that the key for [improving prediction] is by:

(1) developing new prediction techniques that deal with the “needle in the haystack” problem

(2) finding new metrics that deal with the unique characteristics of vulnerabilities and attacks.

Zimmermann et al. study [3]:

Stack traces point to where flawed code lives!


[3] T. Zimmermann, N. Nagappan and L. Williams, "Searching for a Needle in a Haystack: Predicting Security Vulnerabilities for Windows Vista," in Software Testing, Verification and Validation (ICST), 2010 Third International Conference on, 2010


Page 16: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Prediction Model Construction (RQ2)

Replicated the VPM from the Windows Vista study

Run the VPM with all files considered as possibly vulnerable

Repeat, but remove code not found on stack traces

Vulnerability Prediction Model (VPM)

29 metrics in 6 categories, computed from CODEMINE data [5]:

Churn
Dependency
Legacy
Size
Defects
Pre-release vulnerabilities


[5] J. Czerwonka, N. Nagappan, W. Schulte and B. Murphy, "CODEMINE: Building a Software Development Data Analytics Platform at Microsoft," Software, IEEE, vol. 30, no. 4, pp. 64--71, 2013.


Page 17: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Results (RQ2)

Comparing the VPM run on all files vs. just attack surface files…

Precision improved from 0.5 to 0.69

Recall improved from 0.02 to 0.05

Statistically significant improvement

Practically significant? No.
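To make the comparison concrete, a sketch computing precision and recall from confusion-matrix counts; the counts below are invented purely to reproduce the reported shape of the numbers, not taken from the study:

```python
# Sketch: precision and recall from confusion-matrix counts. The tp/fp/fn
# values are hypothetical, chosen only to mirror the reported
# 0.50 -> 0.69 precision and 0.02 -> 0.05 recall pattern.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# VPM run over all files: many false alarms, and almost all vulnerable
# files are missed (the "needle in the haystack" problem).
p_all, r_all = precision_recall(tp=10, fp=10, fn=490)
# VPM restricted to the approximated attack surface: both measures improve.
p_surf, r_surf = precision_recall(tp=25, fp=11, fn=475)
print(round(p_all, 2), round(r_all, 2))    # 0.5 0.02
print(round(p_surf, 2), round(r_surf, 2))  # 0.69 0.05
```

Even the improved recall is tiny, which is why the next slide asks whether low precision and recall are disqualifying at all.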



Page 19: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Problems with Precision [6]

Are low precision predictors unsatisfactory?

No. Low precision is fine in several situations:

When the cost of missing the target is prohibitively expensive.
When only a small fraction of the data is returned.
When there is little or no cost in checking false alarms.

This seems appropriate for security flaws!

Recall and precision like to compete, especially on highly imbalanced datasets.


[6] Tim Menzies, Alex Dekhtyar, Justin Distefano, and Jeremy Greenwald. 2007. Problems with Precision: A Response to "Comments on 'Data Mining Static Code Attributes to Learn Defect Predictors'". IEEE Trans. Softw. Eng. 33, 9 (September 2007)


Page 20: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Lessons Learned - Visualizations

[Visualization: “source” files mapped to destination files]

Page 21: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Limitations

Stack traces are a good metric for Windows 8…

Different levels of granularity? (File/Function)
Smaller projects? Open source?
Not operating systems?

Results don’t necessarily generalize

Other learners?

Oversampling and Undersampling?

What else can we do with VPM’s?


Page 22: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Future Work

What else can we do with stack traces?

Frequency of appearance
Dependencies, not the entities themselves
How many stack traces are required?
Sliding window; how does the approximation change over time?
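The sliding-window idea could be sketched as follows (hypothetical class and window size; not a tool from the paper):

```python
from collections import Counter, deque

# Sketch: track how often each function appears on crash stack traces within
# the last N crashes, so the attack-surface approximation can evolve over
# time. Class name and window size are hypothetical.
class SlidingSurface:
    def __init__(self, window=1000):
        self.window = deque(maxlen=window)  # most recent crash traces
        self.counts = Counter()             # appearance counts inside the window

    def add_crash(self, functions):
        """Record one crash (a list of functions on its stack trace)."""
        if len(self.window) == self.window.maxlen:
            # The oldest crash is about to fall out of the window.
            for fn in self.window[0]:
                self.counts[fn] -= 1
        self.window.append(functions)
        self.counts.update(functions)

    def surface(self, min_appearances=1):
        """Functions currently on the approximated attack surface."""
        return {fn for fn, c in self.counts.items() if c >= min_appearances}
```

Raising `min_appearances` would fold in the frequency-of-appearance idea from the same list.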

Additional Metrics

Visualization Plugin for IDEs

…does it actually help?

Tool Development


Page 23: Approximating Attack Surfaces with Stack Traces [ICSE 15]

Conclusion

foo!foobarDeviceQueueRequest+0x68
foo!fooDeviceSetup+0x72
foo!fooAllDone+0xA8
bar!barDeviceQueueRequest+0xB6
bar!barDeviceSetup+0x08
bar!barAllDone+0xFF
center!processAction+0x1034
center!dontDoAnything+0x1030