QRS16: NOTICE: A Framework for Non-functional Testing of Compilers
![Page 1: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/1.jpg)
NOTICE: A Framework for Non-functional Testing of Compilers
Mohamed BOUSSAA
Olivier BARAIS
Gerson SUNYE
Benoit BAUDRY
2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016)
August 1-3, 2016 - Vienna, Austria
INRIA Rennes, France
![Page 2: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/2.jpg)
Outline
1. Context
2. Motivating Example
3. NOTICE: A Framework for Non-functional Testing of Compilers
4. Performance Evaluation
5. Conclusion
2
![Page 3: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/3.jpg)
Motivation
C Compilers
Source code -> Optimizations -> Machine code
Current innovations in science and industry demand ever-increasing computing resources while placing strict requirements on system performance, power consumption, size, response, reliability, portability and design time
Satisfy the non-functional requirements for a broad range of programs and architectures
Resource constraints
3
![Page 4: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/4.jpg)
Compiler fine/auto-tuning is complex
Huge design space for optimization options (more than 150 optimizations)
• compiling a program means trading off between various objectives
• compilation time, code quality, code size, ...
Constructing a good set of optimization levels (-Ox) is hard
• conflicting objectives, complex interactions, unknown effect of some optimizations, ...
4
![Page 5: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/5.jpg)
Trying to please everyone
Program-independent universal sequences (e.g., -O1, -O2, -O3, etc.)
Based on heuristics and experience
Each optimization level allows trading off various objectives
• O1: "take your time, give it your best shot"
• O2: "optimize, and be quick about it"
• O3: "I’m feeling lucky, and have lots of time"
How efficient are predefined/universal compiler levels?
5
![Page 6: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/6.jpg)
Motivating Example
GCC 4.8.4:
- 78 optimizations
- 2^78 combinations
Speedup, memory, etc.
Resource constraints
WHY ALWAYS ME !!????
- Testing each optimization configuration is impossible
- BOSS: Clients complain about the high memory consumption
- BOSS: Is it possible to consume less CPU? We don’t have enough resources/money
- BOSS: Please, can we optimize even more?
Good luck Son !!
- Heuristics are needed
6
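To get a feel for why exhaustive testing is out of reach, the count on this slide can be worked out directly. The sketch below assumes each of the 78 optimizations is an independent on/off switch and, purely for illustration, that one configuration could be compiled and measured per second (a hypothetical rate, not a figure from the talk).

```python
# Each of the 78 GCC optimizations is treated as an independent on/off flag.
NUM_OPTIMIZATIONS = 78
combinations = 2 ** NUM_OPTIMIZATIONS

# Hypothetical evaluation rate: one configuration per second (assumption).
SECONDS_PER_YEAR = 365 * 24 * 3600
years_needed = combinations / SECONDS_PER_YEAR

print(f"{combinations:,} configurations")
print(f"about {years_needed:.2e} years at one evaluation per second")
```

Even with a far more generous evaluation rate the search space stays out of reach, which is why the talk turns to heuristics.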
![Page 7: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/7.jpg)
NOTICE: A Framework for Non-functional Testing of Compilers
https://noticegcc.wordpress.com
7
![Page 8: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/8.jpg)
Contributions
We propose:
1- Diversity-based exploration
• Novel formulation of the compiler optimization problem using Novelty Search
• Diverse optimization sequences
• Explore the large search space by considering novelty as the main objective
2- Microservice-based infrastructure
• Execute and monitor the different variants of optimized code using system containers
• Resource isolation and management
• Provide a fine-grained understanding and analysis of compiler behavior regarding optimizations
• Automatic extraction of non-functional properties relative to resource usage
• Finely auto-tuning compilers according to user (non-functional) requirements
8
![Page 9: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/9.jpg)
Diversity-based exploration
Solution representation: a bit vector over the optimization flags, e.g. 0 1 1 1 0 …
gcc -c test.c -fno-dce -fno-dse -fdce -fno-align-loops …
Step 1: Random generation
Step 2: Evaluation
• Novelty metric: calculate the distance of one solution from its k nearest neighbors in the current population and in the archive.
• Archive: saves solutions whose novelty metric value is higher than a specific novelty threshold value.
Step 3: Selection
• Tournament selection: select solutions to evolve based on novelty scores.
Step 4: Evolutionary operators
• Mutation and crossover on bit vectors such as 0 0 1 0 …, 0 1 1 1 0 … and 1 0 0 1 1 …
Go to Step 2
Best solution: solution with the best non-functional improvement
9
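The four steps on this slide can be sketched as a small loop. This is only an illustration under stated assumptions, not the NOTICE implementation: the Hamming distance, single-point crossover, bit-flip mutation, all parameter defaults, and the `evaluate` callback (a stand-in for a measured non-functional improvement) are hypothetical choices.

```python
import random


def distance(a, b):
    """Hamming distance between two bit vectors (assumed distance measure)."""
    return sum(x != y for x, y in zip(a, b))


def novelty(candidate, others, k):
    """Mean distance from `candidate` to its k nearest neighbours in `others`."""
    nearest = sorted(distance(candidate, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def tournament(population, scores, rng, size=2):
    """Pick `size` random contestants and return the most novel one."""
    contestants = rng.sample(range(len(population)), size)
    return population[max(contestants, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Single-point crossover (needs at least two bits)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def novelty_search(n_flags, evaluate, generations=20, pop_size=10, k=3,
                   threshold=1.0, seed=0):
    rng = random.Random(seed)
    # Step 1: random generation
    population = [[rng.randint(0, 1) for _ in range(n_flags)]
                  for _ in range(pop_size)]
    archive, best, best_score = [], None, float("-inf")
    for _ in range(generations):
        # Step 2: evaluation. Novelty is measured against the rest of the
        # population plus the archive; `evaluate` only tracks the best solution.
        scores = []
        for i, ind in enumerate(population):
            others = population[:i] + population[i + 1:] + archive
            scores.append(novelty(ind, others, k))
            metric = evaluate(ind)
            if metric > best_score:
                best, best_score = ind, metric
        # Archive solutions whose novelty exceeds the threshold.
        for ind, score in zip(population, scores):
            if score > threshold and ind not in archive:
                archive.append(ind)
        # Steps 3 and 4: tournament selection, then crossover and mutation.
        population = [
            mutate(crossover(tournament(population, scores, rng),
                             tournament(population, scores, rng), rng), rng)
            for _ in range(pop_size)
        ]
    return best, best_score, archive


# Toy usage: the number of enabled flags stands in for a measured improvement.
best, score, archive = novelty_search(8, evaluate=sum)
print(best, score, len(archive))
```

Note that selection is driven by novelty alone; the non-functional metric is only used to remember the best solution seen so far.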
![Page 10: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/10.jpg)
Contributions
We propose:
1- Diversity-based exploration
• Novel formulation of the compiler optimization problem using Novelty Search
• Diverse optimization sequences
• Explore the large search space by considering novelty as the main objective
2- Microservice-based infrastructure
• Execute and monitor the different variants of optimized code using lightweight system containers
• Provide a fine-grained understanding and analysis of compiler behavior regarding optimizations
• Resource isolation and management
• Automatic extraction of non-functional properties relative to resource usage
• Finely auto-tuning compilers according to user (non-functional) requirements
10
![Page 11: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/11.jpg)
NOTICE Infrastructure
1. Code Execution: compile and execute optimized code within a new container instance
2. Runtime Monitoring: gather at runtime the non-functional properties of running programs under test
3. Time-series Database: save information about resource consumption in a time-series database
4. Performance Analysis: analyze the performance and non-functional properties of programs under test
11
![Page 12: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/12.jpg)
NOTICE Infrastructure
Components shown in the architecture figure:
• Optimizations
• Component Under Test (running…)
• Monitoring Component: cgroup file systems (CPU, memory, …), monitoring records
• Back-end Database Component: time-series database
• Front-end Visualization Component: HTTP requests
12
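As an illustration of what a monitoring component can read from cgroup file systems, the sketch below parses memory and CPU usage for one container. It assumes the cgroup v2 layout, where a cgroup directory exposes `memory.current` (bytes) and `cpu.stat` (with a `usage_usec` line); the talk does not say which cgroup version or files NOTICE reads, and under cgroup v1 the file names differ. The function names are hypothetical.

```python
from pathlib import Path


def parse_cpu_usage_usec(cpu_stat_text):
    """Extract the `usage_usec` counter from the text of a cgroup v2 cpu.stat file."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value)
    raise ValueError("usage_usec not found in cpu.stat")


def read_cgroup_usage(cgroup_dir):
    """Return (memory_bytes, cpu_usec) for one cgroup v2 directory.

    `cgroup_dir` is the container's cgroup directory; its exact location
    depends on the container runtime and is left to the caller.
    """
    base = Path(cgroup_dir)
    memory_bytes = int((base / "memory.current").read_text().strip())
    cpu_usec = parse_cpu_usage_usec((base / "cpu.stat").read_text())
    return memory_bytes, cpu_usec
```

Sampling these two values periodically and pushing them to a time-series store gives the kind of monitoring records shown in the figure.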
![Page 13: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/13.jpg)
Evaluation
https://noticegcc.wordpress.com/experimental-results/
13
![Page 14: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/14.jpg)
Experimental Setup
Compiler: GCC v4.8.4
Program under test: Random C code generator
For monitoring
For storage
Optimizations
Meta-heuristics:
• Mono-objective: Novelty Search (NS), Genetic Algorithm (GA), Random Search (RS)
• Multi-objective: Novelty Search (NS-II), NSGA-II
Algorithm parameters
Evaluation metrics (over -O0):
• Speedup (S)
• Memory consumption reduction (MR)
• CPU consumption reduction (CR)
• Trade-off <execution time - memory usage>
14
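The three evaluation metrics compare a candidate optimization sequence against the -O0 baseline. The slide does not give their formulas, so the sketch below assumes the usual definitions: speedup as the ratio of baseline to optimized execution time, and MR/CR as percentage reductions relative to the baseline. Function names are hypothetical.

```python
def speedup(time_o0, time_opt):
    """Speedup (S): how many times faster the optimized build runs than -O0."""
    return time_o0 / time_opt


def reduction_percent(baseline, optimized):
    """Percentage reduction over the -O0 baseline (used for both MR and CR).

    A negative result means the optimized build consumes more than -O0.
    """
    return (baseline - optimized) / baseline * 100.0
```

For example, `speedup(10.0, 4.0)` gives 2.5, and `reduction_percent(200.0, 150.0)` gives 25.0.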
![Page 15: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/15.jpg)
Research Questions
RQ1: Mono-objective SBSE Validation.
• Training set programs + optimizations -> mono-objective search -> best sequence for a non-functional metric
RQ2: Sensitivity of input programs to optimization sequences
• Best sequence in RQ1 + unseen programs -> non-functional improvement
RQ3: Impact of speedup on resource consumption.
• Best speedup sequence in RQ1 + training set programs -> impact on resource consumption
RQ4: Trade-offs between non-functional properties.
• Input program + optimizations -> multi-objective search -> Pareto front solutions, non-functional trade-off <time-memory>
15
![Page 16: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/16.jpg)
RQ1 - Results
RQ1: Mono-objective SBSE Validation.
- Training set: 10 Csmith programs
- Average S, MR, and CR
- Comparison: Ox, RS, GA and NS
Search for best optimization sequence: training set programs + optimizations -> best sequence for a non-functional metric
Key findings for RQ1:
- Best discovered optimization sequences using mono-objective search techniques always provide better results than standard GCC optimization levels.
- Novelty Search is a good candidate to improve code in terms of non-functional properties since it is able to discover optimization combinations that outperform RS and GA.
16
![Page 17: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/17.jpg)
RQ2 - Results
RQ2: Sensitivity.
- 100 unseen Csmith programs
- O2 vs O3 vs NS
Best sequence in RQ1 + unseen programs -> non-functional improvement
Key findings for RQ2:
- It is possible to build general optimization sequences that perform better than standard optimization levels.
- Best discovered sequences in RQ1 can mostly be used to improve the memory and CPU consumption of Csmith programs. To answer RQ2, Csmith programs are sensitive to compiler optimizations.
17
![Page 18: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/18.jpg)
RQ3 - Results
RQ3: Impact of optimizations on resource consumption.
- Ox vs RS vs GA vs NS
Best speedup sequence in RQ1 + training set programs -> impact on resources (CPU & memory)
Chart labels: Memory reduction, CPU reduction, Increase of resource usage
Key findings for RQ3:
- Optimizing software performance can induce undesirable effects on system resources.
- A trade-off is needed to find a correlation between software performance and resource usage.
18
![Page 19: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/19.jpg)
RQ4 - Results
RQ4: Trade-offs between non-functional properties.
- 1 Csmith program
- Trade-off <execution time - memory usage>
Input program + optimizations -> multi-objective search (trade-off time/memory) -> Pareto front solutions
Chart legend: Pareto front NS-II (multi-objective), Pareto front NSGA-II (multi-objective), Ofast, O3, O2, O1, Best CPU reduction (mono-objective), Best memory reduction (mono-objective)
Key findings for RQ4:
- NOTICE is able to construct optimization levels that represent optimal trade-offs between non-functional properties.
- NS is more effective when it is applied for mono-objective search.
- NSGA-II performs better than our NS adaptation for multi-objective optimization. However, NS-II performs clearly better than standard GCC optimizations and previously discovered sequences in RQ1.
19
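A Pareto front for the <execution time, memory usage> trade-off keeps only the configurations that no other configuration beats on both objectives at once. The following is a minimal sketch of that filter, assuming both objectives are minimized; the function names and the sample points are made up for illustration and are not results from the talk.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (execution time, memory usage) pairs, one per configuration.
measurements = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
print(pareto_front(measurements))
```

Here (3, 3) and (4, 4) drop out because (2, 2) is better on both time and memory, while the three remaining points each win on one objective.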
![Page 20: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/20.jpg)
Conclusion
20
![Page 21: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/21.jpg)
Conclusion
Summary
• Novel formulation of the compiler optimization problem based on Novelty Search
  - Novelty Search is able to generate effective optimizations
• Tool for automatic extraction of non-functional properties of optimized code
  - Automatically extract information about memory and CPU consumption
Future directions
• Explore more trade-offs among resource usage metrics
• Evaluate NOTICE on real-world benchmarks and on other case studies (e.g., compilers, programs, etc.)
21
![Page 22: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/22.jpg)
Questions?
https://noticegcc.wordpress.com/
22
![Page 23: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/23.jpg)
Additional slides
23
![Page 24: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/24.jpg)
Tool Support
24
![Page 25: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/25.jpg)
Literature Overview
Functional Testing of Compilers
PLDI’11
PLDI’14
ICSE’16
25
![Page 26: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/26.jpg)
Literature Overview
Non-Functional Testing of Compilers
PLDI’04
CGO’08
ACSAC’08
26
![Page 27: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/27.jpg)
Prior work is insufficient
- Performance is the major concern (e.g., speedup)
- Other important non-functional properties are ignored (e.g., resource consumption properties)
- Evaluation is based on a small set of input programs (e.g., SPEC CPU benchmarks)
Testing non-functional properties poses several new challenges:
- Different cost-benefit trade-offs (e.g., speedup vs. memory or CPU usage)
- Finely auto-tuning compilers according to user (non-functional) requirements
27
![Page 28: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/28.jpg)
Problem Statement
Given a set of compiler optimization options {F1, F2, ..., Fn}, how can we find the combination that maximizes program performance beyond what the standard optimization levels achieve?
Do this efficiently, without a priori knowledge of the optimizations and their interactions
28
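One way to make the problem statement concrete is to treat a combination of options {F1, ..., Fn} as a bit vector and translate it into a GCC command line. The sketch below only builds the argument list and does not invoke the compiler. The three flag names are taken from the command shown earlier in the talk; the `-f<name>` / `-fno-<name>` mapping follows GCC's usual convention, and `gcc_command` is a hypothetical helper, not part of NOTICE.

```python
# Illustrative subset of optimization flags; the talk considers 78 of them.
FLAG_NAMES = ["dce", "dse", "align-loops"]


def gcc_command(bits, source="test.c", flag_names=FLAG_NAMES):
    """Translate a bit vector into a gcc argument list.

    Bit i set to 1 enables option F_i (-f<name>); 0 disables it (-fno-<name>).
    """
    if len(bits) != len(flag_names):
        raise ValueError("one bit is required per optimization flag")
    flags = [f"-f{name}" if bit else f"-fno-{name}"
             for bit, name in zip(bits, flag_names)]
    return ["gcc", "-c", source, *flags]


print(" ".join(gcc_command([0, 1, 0])))
```

With this encoding, searching for a good combination becomes a search over bit vectors of length n.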
![Page 29: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/29.jpg)
NSGA-II overview
• NSGA-II: Non-dominated Sorting Genetic Algorithm (K. Deb et al., ’02)
Parent population + offspring population -> non-dominated sorting (fronts F1, F2, F3, F4) -> crowding distance sorting -> population in next generation
MOEA Framework http://moeaframework.org/
29
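The two sorting stages named on this slide can be sketched in a few lines. This is a plain, unoptimized illustration assuming all objectives are minimized; it ranks points into fronts by repeated filtering rather than with the faster bookkeeping of the published algorithm, and it is not the MOEA Framework API. All names are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points):
    """Split point indices into fronts F1, F2, ... (F1 is the Pareto front)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in `front`; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        low, high = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if high == low:
            continue
        # Interior points: normalized gap between their two neighbours.
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (high - low)
    return dist


points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
fronts = non_dominated_sort(points)
print(fronts, crowding_distance(points, fronts[0]))
```

The next generation is filled front by front; when a front does not fit entirely, its members with the largest crowding distance are kept first, which preserves spread along the front.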
![Page 30: QRS16: NOTICE: A Framework for Non-functional Testing of Compilers](https://reader035.vdocuments.site/reader035/viewer/2022062902/58ee74a81a28abf0678b4619/html5/thumbnails/30.jpg)
NSGA-II overview
30