Context-Specific Middleware Specialization Techniques for Optimizing Software Product-line Architectures
Arvind S. Krishna, Aniruddha S. Gokhale, Douglas C. Schmidt
Institute for Software Integrated Systems, Dept of EECS
Vanderbilt University, Nashville, TN, USA
Venkatesh P. Ranganath, John Hatcliff
Dept of Computer and Information Sciences
Kansas State Univ, Manhattan, KS, USA
EuroSys'06, Leuven, Belgium, April 18-21, 2006
Middleware for Product Line Architectures
• Middleware factors out many reusable general-purpose & domain-specific services from traditional DRE application responsibility
• Essential for product-line architectures (PLAs)
  • e.g., Boeing Bold Stroke avionics mission computing PLA for Boeing fighter aircraft (F-15, F/A-18, AV-8B, UCAV/JUCAS)
  • DRE system with 100+ developers, 3,000+ software components, 3-5 million lines of C++
  • Used as open experimentation platform
[Figure: product-line architecture. Product variants (F-15, AV-8B, F/A-18, UCAV) are assembled from shared components (AirFrame, AP, Nav, HUD, GPS, IFF, FLIR) on a layered stack: Domain-specific Services; Common Middleware Services; Distribution Middleware; Host Infrastructure Middleware; OS & Network Protocols; Hardware (CPU, Memory, I/O)]
• However, standards-based, general-purpose, layered middleware is not yet adequate for the most demanding & mission-critical PLA-based DRE systems
• Solution: middleware specialization for PLA-based DRE systems
[Figure: the same product-line stack, with a single Specialized Middleware layer between the components and OS & Network Protocols]
Middleware Specialization Evaluation Criteria
Premise
• Application of specialization techniques should result in considerable improvements in QoS over & above horizontal general-purpose middleware optimizations
• Handcrafting specializations infeasible for large-scale DRE systems => need for tools and processes
• Specializations should have minimal impact on standards compliance (APIs)
Evaluation Criteria
• Use TAO (www.dre.vanderbilt.edu/TAO) as gold standard with several general-purpose optimizations
• Target a cumulative performance improvement of ~30 to 40% from applying all specializations
• Turning on just one/two optimizations might improve performance by ~10 to 15%
Opportunities for Middleware Specialization
• Dimension #1: Specification-imposed generality
  • Standards-based general-purpose middleware functionality is defined by specifications such as CORBA, J2EE, etc.
  • Certain functionality can be excessive for PLAs
    • e.g., layered demultiplexing, leading to unnecessary performance overhead
• Challenge: automatically remove specification-imposed generality when it's not needed
• Goal is to devise techniques that apply to any standards-compliant middleware, not just one implementation
[Figure: CORBA ORB architecture (client, object reference, IDL stubs, DII, ORB interface, IDL skeleton, DSI, object adapter, container, component/servant, services, ORB core with GIOP/IIOP/ESIOPs), with numbered callouts 1-4]
Opportunities for Middleware Specialization
• Dimension #2: Middleware framework generality
  • General-purpose middleware implementations need to work across applications that have varying functional & QoS requirements
  • Accommodate variability by providing hooks
    • e.g., for different protocol (TCP/IP, VME, SCTP, SHMIOP), concurrency (thread-pool, single-threaded, thread-per-connection) & demultiplexing strategies
  • Hooks add overhead: indirections & dynamic dispatching
  • PLAs, however, require only one alternative, e.g., one protocol
• Challenge: automatically specialize middleware frameworks to eliminate unnecessary hooks
• Goal is to devise techniques applicable to distributed systems that apply common patterns
[Figure: CORBA ORB architecture diagram]
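The cost of a hook, and what removing it looks like, can be sketched in C++. This is only an illustration under stated assumptions: the names `Protocol`, `TcpProtocol`, `GeneralTransport` and `SpecializedTransport` are hypothetical and are not taken from ACE or TAO. The general form selects the protocol at run time through a virtual interface; the specialized form assumes the product variant uses exactly one protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical protocol hook: the general-purpose framework talks to any
// protocol through a virtual interface.
struct Protocol {
  virtual ~Protocol() = default;
  virtual std::size_t send(const std::string& msg) = 0;
};

// One concrete alternative behind the hook (illustrative only).
struct TcpProtocol final : Protocol {
  std::size_t bytes_sent = 0;
  std::size_t send(const std::string& msg) override {
    bytes_sent += msg.size();
    return msg.size();
  }
};

// General-purpose form: the protocol is chosen at run time, so every send()
// goes through a pointer indirection and a dynamic dispatch.
class GeneralTransport {
 public:
  explicit GeneralTransport(std::unique_ptr<Protocol> p)
      : protocol_(std::move(p)) {
    assert(protocol_ != nullptr);
  }
  std::size_t send(const std::string& msg) { return protocol_->send(msg); }

 private:
  std::unique_ptr<Protocol> protocol_;
};

// Specialized form: the variant is assumed to use exactly one protocol, so
// the concrete type is held by value and the call can be bound statically.
class SpecializedTransport {
 public:
  std::size_t send(const std::string& msg) { return protocol_.send(msg); }

 private:
  TcpProtocol protocol_;
};
```

Both classes expose the same `send()` signature, so in this sketch the calling code would not change when the specialized form is substituted.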
Opportunities for Middleware Specialization
• Dimension #3: Platform generality
  • Middleware implementations run on different hardware/OS/compiler platforms
    • e.g., gcc 3.2 (no exceptions) on a Timesys kernel; Green Hills compiler on the VxWorks platform
  • Platforms provide certain optimizations that can be leveraged to enhance QoS
• Challenge: automatically discover PLA deployment platform characteristics to improve QoS
• Goal is to devise techniques that apply to any host infrastructure middleware (e.g., ACE or JVMs) targeting heterogeneous OS, compiler, & hardware platforms
[Figure: CORBA ORB architecture diagram]
Bold Stroke PLA Scenario
Example PLA configuration: Basic Single Processor (BasicSP), a DRE system scenario based on Boeing Bold Stroke challenge problems from DARPA PCES & MoBIES
• ACE_wrappers/TAO/CIAO/DAnCE/examples/BasicSP
• CoSMIC/examples/BasicSP
Goal: Select representative DRE system, e.g., “rate based” events for control information & operations that transmit common data
• Timer Component – Triggers periodic refresh rates
• GPS Component – Generates periodic position updates
• Airframe Component – Processes input from the GPS component & feeds to Navigation display
• Navigation Display – Displays GPS position updates
[Figure: BasicSP component pipeline TIMER (20 Hz), GPS, AIRFRAME, NAV DISP, connected by timeout & data_avail events and get_data () operations]
Identifying “Ahead of Time” System Invariants
• Specification invariance: single-method interfaces; the same operation is sent on the wire
• Framework invariance: a specific Reactor is used; a specific protocol is used
• Deployment invariance: the platform does not support native exceptions
[Figure: CORBA ORB architecture diagram and BasicSP component pipeline (TIMER 20 Hz, GPS, AIRFRAME, NAV DISP), annotated with these invariants]
Feature Oriented CUStomizer (FOCUS)
FOCUS addresses specialization challenges by building a specialization language, tool, & process to capture & automate middleware specializations
Two phases: Middleware Instrumentation Phase & Middleware Specialization Phase
Middleware Developer
• Capture specialization transformations via FOCUS specialization language
• Annotate middleware source code with specialization directives
• Create a domain-specific language (DSL) to capture middleware variability
Application Developer
• Rule selection: analyzes & determines the type of specializations applicable
• FOCUS transformation engine selects the appropriate transformations & uses the annotations to automate specializations
Implementation size
• ~1,000 Perl SLOC parser + weaver
• ~2,500 XML SLOC specialization files
• ~50 (files) annotations
[Figure: FOCUS workflow. The middleware developer provides specialization rules & annotated source (hooks); the application developer selects rules; the output is customized middleware for the product components (AP, Nav, HUD, GPS, IFF, FLIR) above OS & Network Protocols]
Specialization Experimental Setup
Goals
• Application of specialization techniques should result in considerable improvements in QoS over & above horizontal general-purpose middleware optimizations
TAO baseline
• Active demultiplexing & perfect hashing for O(1) request demultiplexing
• Buffer caching & direct collocation optimization
• Optimized configuration for different ORB components
Experiment Setup
• Pentium III 850 MHz processor, running Linux Timesys 2.4.7 kernel, 512 MB of main memory, TAO version 1.4.7 compiled with gcc 3.2.3
• Timers at the client & within ORB used to collect data
• Used Emulab testbed
[Figure: Specialized TAO Middleware]
Results for Layer-folding Specialization
Experiment
• End-to-end latency measures for:
  • General-purpose optimized TAO with active demultiplexing & perfect hashing
  • Specialized TAO with layer folding specialization enabled
• Path specialized latency measures
  • Path defined as the code path from when a request is received until the upcall is dispatched on the skeleton
• Specialization applied at the server side (can also be applied at the client side)
Results
• Average end-to-end measures improved by ~16%
• Worst-case end-to-end latency improved by ~14%
• Average path measures improved by ~40%
• Worst-case path measure improved by ~20%
• Dispersion improves by a factor of ~1.5 for both cases
Cumulative Specialization Results
Specializations applied
• Specification related: layer folding, memoization, constant propagation (ignoring endianness)
• Framework: aspect weaving (Reactor + protocol)
• Deployment: loop unrolling + emulated exceptions
Results
• Average end-to-end measures improved by ~43%
• Worst-case measures improved by ~45%
• End-to-end client-side throughput improved by ~65%
• Jitter results twice as good as general-purpose optimized TAO
• Results exceeded the hypothesis & evaluation criteria
[Figure: results plot; label: layer folding, deployment platform, memoization, constant propagation]
Evaluating FOCUS Pros & Cons
Strengths
• Provides lightweight, zero (run-time) overhead middleware specialization
• Designed to work across different languages (e.g., Java & C++)
  • KSU applying FOCUS & specializations to Java ORBs
• XML-based rule capture
  • Easy language extension, ability to add new features easily
  • If/when C++ aspect technologies mature, can transform them into aspect rules via XSLT transforms
• Execute transformations via scripting
  • Integration with QA tools; code generation from models
Drawbacks
• Doesn’t provide a full-fledged language parser, i.e., join points are identified via annotations or via regular expressions
• Need to synchronize annotations with specialization files, so modifying source code requires changes to specialization files
  • Ameliorated via distributed continuous QA; limitation exists even with aspects
• Correctness of transformations has to be validated externally, unlike AspectJ
• Need higher-level tools to validate combinations of specializations
FOCUS available in latest ACE+TAO distribution in ACE_wrappers/bin/FOCUS
Future Work
• FOCUS approach applied to middleware optimizations
• Future work will focus on identifying system level (middleware, platform, application) specializations
• Goal is to drive the specialization process to optimize systems layer-to-layer
• Capturing invariants in models and using generative technologies to drive specializations
• Other QoS parameters
[Figure labels: System Optimizations; Model-Driven Technologies; Domain-Specific Modeling Languages]
Concluding Remarks
Resolving the tension between
• Generality: Middleware is designed to be independent of particular application requirements
• Specificity: PLAs are driven by the functional & QoS requirements for each product variant (using SCV analysis)
• Domain-specific language (DSL) tools & process for automating the specializations
• Development of reusable specialization patterns
• Identifying specialization points in middleware where patterns are applicable
• Latency improvements of 45%
• www.dre.vanderbilt.edu
[Figure: product-line architecture with product variants (F-15, AV-8B, F/A-18, UCAV) on a Specialized Middleware Stack]
FOCUS Research Goals
• Domain-specific language for capturing specializations; minimize accidental complexity
• Process to support specialization with evolution
• Language agnostic; leverage annotations
• Ensure transformations (1) do not incur additional overhead & (2) do not compromise portability
• Develop a process that enables role separation
• Applying specializations improves latency by ~30 to 40%
• Contributions: The goal of this work isn’t specialization languages per se, but to quantify the improvements that come from applying various types of specializations to middleware
• Transformational approaches: Stratego-XT, DMS support a broader range of specializations
[Figure: FOCUS workflow between middleware developer (specialization rules, annotated source) & application developer (rule selection), producing customized middleware]
FOCUS Specialization Phase
Annotated middleware source:
    //File: Reactor.h
    //@@ REACTOR_SPL_INCLUDE_HOOK
    class Reactor
    {
    public:
      virtual int run_reactor_event_loop ();
      // Other public methods ....
    private:
      ACE_Reactor_Impl *reactor_impl_;
      // ......
    };
Transformed middleware source:
    //File: Reactor.h
    //@@ REACTOR_SPL_INCLUDE_HOOK
    // Code woven by FOCUS:
    #include "ace/Select_Reactor.h"
    // END Code woven by FOCUS
    class Reactor
    {
    public:
      int run_reactor_event_loop ();
      // Other public methods ....
    private:
      // Code woven by FOCUS:
      ACE_Select_Reactor *reactor_impl_;
      // End Code woven by FOCUS
      // ......
    };
Changes made by the transformation
• #include concrete header file
• Removed virtual keyword
• Replaced ACE_Reactor_Impl with ACE_Select_Reactor
• Annotations help identify join points
Notes
• These specializations don’t affect framework “business logic,” just its structure
• Existing framework is still available when developers need OO capabilities
Automating Layer Folding Specialization
Automating Specialization via FOCUS
• On a per-connection basis, determine the target dispatcher
• For subsequent requests, directly use the dispatcher to process requests
Specialization: Layer folding
Component: Demultiplexing (tao, PortableServer)
Source files {h,cpp} (loc)
• tao/GIOP_Base 1,200
• PortableServer/Servant_Base 2,000
FOCUS specialization file
• <add>, <substitute>, <comment>
• Specialization file ~250 loc
• Tag-related changes (5 locations) in TAO & PortableServer
Specialization rule:
    <add>
      <hook>TAO_DISPATCH_OPT_ADD_HOOK</hook>
      <data>
        if (!__dispatch_resolved__)
          {
            // First invocation: normal path; the dispatch is cached in the process
            __dispatch_resolved__ = 1;
          }
        else
          {
            // All later invocations take the optimized path: directly use cached dispatcher
            this->request_dispatcher__.dispatch (…);
          }
      </data>
    </add>
FOCUS enables specification of specializations in corresponding language syntax
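The per-connection caching described on this slide can be sketched as a small self-contained C++ class. This is a minimal illustration, not TAO code: `Connection`, `Dispatcher` and the lookup table are hypothetical stand-ins for the layered demultiplexing path, and the sketch assumes the invariant that every request on a connection targets the same operation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical dispatcher type: takes a request argument, returns a reply.
using Dispatcher = std::function<int(int)>;

class Connection {
 public:
  explicit Connection(std::map<std::string, Dispatcher> table)
      : table_(std::move(table)) {}

  // Handle one request. Assumes (as the specialization does) that all
  // requests on this connection name the same operation, so `key` is only
  // consulted on the first call.
  int handle(const std::string& key, int arg) {
    if (!dispatch_resolved_) {
      // First invocation: normal lookup path; cache the resolved dispatcher.
      auto it = table_.find(key);
      assert(it != table_.end());
      cached_ = it->second;
      dispatch_resolved_ = true;
      ++lookups_;
    }
    // Folded path: call the cached dispatcher directly.
    return cached_(arg);
  }

  // Number of full lookups performed so far (for illustration).
  int lookups() const { return lookups_; }

 private:
  std::map<std::string, Dispatcher> table_;
  Dispatcher cached_;
  bool dispatch_resolved_ = false;
  int lookups_ = 0;
};
```

If the single-operation assumption does not hold, the cached dispatcher would be wrong for later requests, which is why the deck treats it as an invariant identified ahead of time.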
Automating Reactor Specialization
Automating Specialization via FOCUS
• <substitute>, <add>: weave code that creates specialized Reactor
• <copy-from-source>: move implementation from Select_Reactor to base Reactor component
• <comment>: comment out unspecialized code
Specialization: Aspect weaving
Component: Reactor Framework (ACE, TAO modules)
Source files {h,cpp} (loc)
• ace/Reactor 1,500
• ace/Reactor_Base 1,800
• ace/Select_Reactor 2,500
• tao/advanced_resource 2,500
FOCUS specialization file
• <add>, <substitute>, <copy-from-source>, <comment>
• Specialization file ~700 loc
• Tag-related changes (10 locations), < 0.1% of the code within the Reactor framework
Specialization rule:
    <add>
      <hook>TAO_REACTOR_SPL_COMMENT_HOOK_END</hook>
      <data>
        ACE_NEW_RETURN (impl, ACE_Select_Reactor (….), 0);
      </data>
    </add>
Annotated middleware source:
    ACE_Reactor_Impl *
    TAO_Advanced_Resource_Factory::allocate_reactor_impl (void) const
    {
      ACE_Reactor_Impl *impl = 0;
      //@@ TAO_REACTOR_SPL_COMMENT_HOOK_START
      switch (this->reactor_type_)
        {
          …………
        }
      //@@ TAO_REACTOR_SPL_COMMENT_HOOK_END
    }
Transformations preserve hooks; this minimizes clutter in middleware source
Automating Platform Specializations
Deployment platform specialization
• Exception support:
  • Autoconf’s AC_COMPILE_IFELSE macro to check if compiler supports native exceptions
• Loop unrolling optimization
  • Autoconf’s AC_RUN_IFELSE macro to compile & run a benchmark to quantify memcpy() performance improvements
Configure check (excerpt):
    ACE_CACHE_CHECK([if ACE memcpy needs loop unrolling],
      [ace_cv_memcpy_loop_unroll],
      [AC_RUN_IFELSE([AC_LANG_SOURCE([[
        // …. Program that will run the benchmark to
        // determine if an optimization is better
• ace_cv_memcpy_loop_unroll: variable that will be checked to set a compilation flag
• AC_RUN_IFELSE: directive to run the benchmark
• AC_LANG_SOURCE: provides actual source language that will be compiled & run
This approach can be applied automatically to discover OS-specific system calls & compiler settings that maximize QoS
Specialization: Constant propagation (autoconf)
Source files {h,cpp} (loc)
• ace/CDR_Stream 4,000
• ace/Message_Block 2,000
FOCUS specialization file
• Code changes ~650
• Benchmark ~250
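The kind of benchmark program such a configure check would compile & run can be sketched as follows. This is a hypothetical illustration, not the benchmark shipped with ACE: `unrolled_copy`, `time_copy` and `prefer_unrolled` are names invented here, and a real check would need care to stop the compiler from optimizing the copies away.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hand-unrolled byte copy, four bytes per iteration (hypothetical candidate).
inline void unrolled_copy(char* dst, const char* src, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    dst[i] = src[i];
    dst[i + 1] = src[i + 1];
    dst[i + 2] = src[i + 2];
    dst[i + 3] = src[i + 3];
  }
  for (; i < n; ++i) dst[i] = src[i];  // copy the remaining 0-3 bytes
}

// Time `reps` copies of `n` bytes with the given routine, in nanoseconds.
template <typename Copy>
long long time_copy(Copy copy, std::size_t n, int reps) {
  std::vector<char> src(n, 'x'), dst(n);
  auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) copy(dst.data(), src.data(), n);
  auto stop = std::chrono::steady_clock::now();
  assert(n == 0 || dst[n - 1] == 'x');  // the copy really happened
  return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
      .count();
}

// True when the unrolled loop beat memcpy() for this size on this platform.
inline bool prefer_unrolled(std::size_t n, int reps) {
  long long t_memcpy = time_copy(
      [](char* d, const char* s, std::size_t k) { std::memcpy(d, s, k); },
      n, reps);
  long long t_unrolled = time_copy(unrolled_copy, n, reps);
  return t_unrolled < t_memcpy;
}
```

A configure-time driver could call `prefer_unrolled()` for representative message sizes and exit with status 0 or 1, which is the form of result AC_RUN_IFELSE acts on.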
Concluding Remarks
• Domain-specific language (DSL) tools & process for automating the specializations
• Development of reusable specialization patterns
• Identifying specialization points in middleware where patterns are applicable
Specification-imposed specializations
• Layer folding
• Constant propagation
• Memoization
Framework specializations
• Aspect weaving techniques
  • Bridge: Reactor
  • Template method: Protocol
  • Strategy: Messaging, Wait, Demultiplexing
Deployment platform specializations
• Unroll copy loops
• Use emulated exceptions
• Leverage zero-copy data transfer buffers
[Figure: CORBA ORB architecture diagram]
FOCUS Instrumentation Phase
Middleware Annotations
• Middleware developer annotates middleware with hooks
• Hooks represented as “meta-comments,” i.e., opaque to compilers
• Annotations help identify join points & relieve FOCUS from implementing a full-fledged language parser
Annotated middleware source:
    ACE_Reactor_Impl *
    TAO_Advanced_Resource_Factory::allocate_reactor_impl (void) const
    {
      ACE_Reactor_Impl *impl = 0;
      //@@ TAO_REACTOR_SPL_COMMENT_HOOK_START
      switch (this->reactor_type_)
        {
          …………
        }
      //@@ TAO_REACTOR_SPL_COMMENT_HOOK_END
    }
Specialization file:
    <?xml version='1.0'?>
    <transform>
      <module name="TAO/tao">
        <file name="advanced_resource.cpp">
          <comment>
            <start-hook>TAO_REACTOR_SPL_COMMENT_HOOK_START</start-hook>
            <end-hook>TAO_REACTOR_SPL_COMMENT_HOOK_END</end-hook>
          </comment>
        </file>
      </module>
      ....
    </transform>
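How a `<comment>` rule might act on annotated source can be sketched in a few lines of C++. FOCUS itself is implemented in Perl, so this is only an illustration of the idea: `apply_comment_rule` is a hypothetical function, and the sketch assumes hooks appear on their own lines as `//@@ <HOOK_NAME>` meta-comments.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Comment out every line strictly between the start and end hooks.
// The hook lines themselves are preserved so the rule can be re-applied
// or reversed later.
inline std::vector<std::string> apply_comment_rule(
    const std::vector<std::string>& lines,
    const std::string& start_hook,
    const std::string& end_hook) {
  const std::string marker = "//@@ ";
  std::vector<std::string> out;
  bool inside = false;
  for (const std::string& line : lines) {
    const bool is_start = line.find(marker + start_hook) != std::string::npos;
    const bool is_end = line.find(marker + end_hook) != std::string::npos;
    if (is_end) inside = false;  // the end hook closes the region
    out.push_back(inside && !is_start ? "// " + line : line);
    if (is_start) inside = true;  // the region opens after the start hook
  }
  assert(out.size() == lines.size());
  return out;
}
```

Because the hooks are located by plain string matching, no language parser is needed, which mirrors the trade-off stated on this slide.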
Lessons Learned (2/2)
Specializations have potential to improve task schedulability
• Specializations help high-priority tasks finish ahead of their expected completion time
  • Tasks with priority 50 finish early, increasing the time available for scheduling tasks with priorities 35 & 10
• Specializations can benefit both hard & soft real-time tasks
• Specializations increase slack in the system
Adaptation with specializations can adversely affect QoS
• Specializations do not consider any form of recovery if invariance assumptions fail
• Adaptation requires loading general-purpose code & adding checks along the request processing path, which increases jitter for DRE systems
QoS improvements are scenario specific
• All our specializations improve path latencies considerably more than end-to-end latency; the more often the specialized code path is traversed, the greater the QoS improvement
[Figure: thread pool with lanes at priorities 10, 35 & 50, serving POAs that host servants S1-S5]
Applicability of Specialization Approaches
Dimension #1: Resolving specification-imposed generality
• Applicable to standards-compliant CORBA middleware
• Layer folding specialization: CORBA demultiplexing; other layered demultiplexing approaches
• Avoiding (de)marshaling checks: middleware standards such as J2EE & .NET that target heterogeneous deployment (e.g., big-endian & little-endian hosts)
[Figure: CORBA ORB architecture diagram with numbered callouts 1-4]
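Avoiding (de)marshaling checks amounts to propagating a known constant, which a short C++ sketch can make concrete. The function names below are hypothetical and not part of ACE's CDR streams; the sketch assumes the deployment guarantees that sender and receiver share a byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverse the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// General-purpose path: every read compares the sender's byte-order flag
// (carried with the message) against the receiver's and swaps on mismatch.
inline std::uint32_t read_u32_general(const unsigned char* buf,
                                      bool sender_little_endian,
                                      bool host_little_endian) {
  assert(buf != nullptr);
  std::uint32_t v;
  std::memcpy(&v, buf, sizeof v);
  return sender_little_endian == host_little_endian ? v : swap32(v);
}

// Specialized path: with a homogeneous deployment the comparison above is a
// known constant, so both the check and the swap branch are removed.
inline std::uint32_t read_u32_same_endian(const unsigned char* buf) {
  assert(buf != nullptr);
  std::uint32_t v;
  std::memcpy(&v, buf, sizeof v);
  return v;
}
```

The specialized reader returns wrong values if a peer with the opposite byte order ever connects, so it is only safe when that invariant is established ahead of time.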
Applicability of Specialization Approaches
Dimension #2: Resolving framework generality
• Specialize design patterns
  • Reactor framework specialization: Bridge pattern
  • Protocol framework specialization: Template Method pattern
• Techniques apply to distributed systems that use these patterns (www.dre.vanderbilt.edu/POSA)
• Example alternatives: protocols (TCP/IP, VME, SCTP, SHMIOP); concurrency (thread-pool, single-threaded, thread-per-connection)
[Figure: CORBA ORB architecture diagram]
Applicability of Specialization Approaches
Dimension #3: Resolving deployment generality
• Host infrastructure middleware (ACE, JVMs) targeting heterogeneous OS, compiler & hardware characteristics
  • e.g., Green Hills compiler on the VxWorks platform; gcc 3.2 (no exceptions) on a Timesys kernel
• Systems concerned with QoS
[Figure: CORBA ORB architecture diagram]