designer’s guide - high-performance model compilation for complex behavioral … · 2009. 12....
TRANSCRIPT
High-Performance Model Compilationfor Complex Behavioral ModelsSeptember 20th, 2007
Daniel Platte
Qimonda · Daniel Platte · September 16, 2007 · Page 2 Copyright © Qimonda AG 2007 · All rights reserved.
Outline
Bottom-Up Modeling Through Symbolic AnalysisBottom-Up Modeling Through Symbolic Analysis
Simulation ResultsSimulation Results
SummarySummary
Model CompilationModel Compilation
Qimonda · Daniel Platte · September 16, 2007 · Page 3 Copyright © Qimonda AG 2007 · All rights reserved.
Import
Import of several netlist formats (Spice, Spectre, etc), Cadence Design Framework integration
Bottom-Up Modeling FlowCircuit Netlist
Network DAE
Device Models Equation SetupAutomated setup of symbolic network equationsthrough an extended MNA and symbolic devicemodels
Optimizations
Algebraic reformulation and restructuring of the DAEs to optimize simulationperformance (no impact on the accuracy)
Model Reduction
Simplified DAE
Efficient symbolic model reduction techniques fornonlinear DAEs with continuous numeric error control
Model Export
Behavioral Model
Export the DAEs to several common AHDLs incl. VHDL-AMS, Verilog-A, MAST, (Modelica)
No model reduction,100 % accuracy,
comparableperformance
Qimonda · Daniel Platte · September 16, 2007 · Page 4 Copyright © Qimonda AG 2007 · All rights reserved.
Example μA741
Qimonda · Daniel Platte · September 16, 2007 · Page 5 Copyright © Qimonda AG 2007 · All rights reserved.
Analytic Model of μA741
μA741 Dimension Sparsity
Circuit 62 90.3%
Model(MNA) 368 98.8%
No model reduction applied564 Parameters115 kB Code Size1300 Lines of Code VHDL10 additional internalvariables per BJT
26 Transistors(Gummel-Poon)
Qimonda · Daniel Platte · September 16, 2007 · Page 6 Copyright © Qimonda AG 2007 · All rights reserved.
Toda
yTa
rget
tCPU1 10 1000.1
Motivation
Circuit(Refer.) tref
Model(full) 200x tref
Model(full) <10x tref
Model(reduced) 0.2
Reduction
Reduction
Adaptation
Model(reduced) 4x tref
Speed-Up 5x
Qimonda · Daniel Platte · September 16, 2007 · Page 7 Copyright © Qimonda AG 2007 · All rights reserved.
Reasons for Performance Problems
• „Intelligence“ of built-in device models missing• Equation formulation not optimized for numerical methods• Model compilers do not efficiently handle such complex behavioral models• Restrictions by using common AHDLs
– procedural statements in VHDL-AMS– simultaneous equation sets in Verilog-A– no optimization of Jacobian matrix possible
Qimonda · Daniel Platte · September 16, 2007 · Page 8 Copyright © Qimonda AG 2007 · All rights reserved.
Outline
Bottom-Up Modeling Through Symbolic AnalysisBottom-Up Modeling Through Symbolic Analysis
Simulation ResultsSimulation Results
SummarySummary
Model CompilationModel Compilation
Qimonda · Daniel Platte · September 16, 2007 · Page 9 Copyright © Qimonda AG 2007 · All rights reserved.
Model Compilation with Analog Insydes
Device Models
Circuit Netlist
Import
Equation Setup
Network DAE
Model Reduction
Simplified DAE
Optimizations
Model Export
Behavioral Model Compiled Model
No model reduction,100 % accuracy,
comparableperformance
• Generation of compiled models forQimonda‘s inhouse-simulator Titan
• Analog Insydes is used as platform forthe model compilation
• Key features:– Efficient processing of simultaneous
DAE– Local solving of sequential equations– Data locality (cache)– Sparse handling and data structures– Direct interface to the simulator kernel
Qimonda · Daniel Platte · September 16, 2007 · Page 10 Copyright © Qimonda AG 2007 · All rights reserved.
Example – Sequential Network Equations
vd@tD − RSI$AC@tDAREA
+ V$IN@tD − V$OUT@tD
id@tD AREAikjj− −0.00333167qHBV+vd@tDL
k IBV+ikjj−1+
0.00181069qvd@tDk
y{zz IS
y{zz+ GMIN vd@tD
cj@tD AREA CJO IfBvd@tD < 0.25, 1H1.−[email protected] , 2.51926H0.3335+ 0.666 vd@tDLF
I$VGND@tD+ V$GND@tD−V$OUT@tDR0
+ C0HV$GND @tD− V$OUT @tDL 0
I$AC@tD+ I$VIN@tD 0
−I$AC@tD + −V$GND@tD+V$OUT@tDR0
+ V$OUT@tD−V$OUT2@tDRL
+ C0H−V$GND @tD +V$OUT @tDL 0
I$VOUT@tD+ −V$OUT@tD+V$OUT2@tDRL
0
I$AC@tD id@tD+ 1.15×10−8 id@tD + cj@tD vd @tDV$IN@tD VINV$OUT2@tD 0V$GND@tD 0 RC
D
VIN VGND VOUT00
RL
A K
GNDB
SimultaneousA
BK
GND
Sequential
VINVOUTVGND
Qimonda · Daniel Platte · September 16, 2007 · Page 11 Copyright © Qimonda AG 2007 · All rights reserved.
Newton‘s Method for Sequential DAEs
seq seq seq11 12
sim sim sim21 22
f / f / f ( , )f / f / f ( , )∂ ∂ ∂ ∂ Δ Δ⎡ ⎤ ⎡ ⎤⎡ ⎤⎡ ⎤ ⎡ ⎤
⋅ = ⋅ = −⎢ ⎥ ⎢ ⎥⎢ ⎥⎢ ⎥ ⎢ ⎥∂ ∂ ∂ ∂ Δ Δ⎣ ⎦ ⎣ ⎦⎣ ⎦⎣ ⎦ ⎣ ⎦
y x y xJ Jy yy x y xJ Jx x
( )( )
111 seq 12
1 122 21 11 12 sim 21 11 seq
*22 sim
f ( , )
f ( , ) f ( , )
f ( , )
−
− −
Δ = ⋅ + Δ
− ⋅ ⋅ ⋅Δ = − + ⋅ ⋅
⋅Δ = −
y J y x J x
J J J J x y x J J y x
J x y x
1. Calculate yn+1 from xn (sequential equations)2. Update J with yn+1 and xn3. Calculate Schur complement
4. Solve *22 simf ( , )⋅Δ = −J x y x
* 122 22 21 11 12
−= − ⋅ ⋅J J J J J
0
Qimonda · Daniel Platte · September 16, 2007 · Page 12 Copyright © Qimonda AG 2007 · All rights reserved.
Methods to Calculate Schur Complement
1) Symbolic Schur complement- extremely complex expressions, impossible2) Numerical calculation- matrix inversion and multiplications
necessary3) Semi-symbolic Schur complement with
roll-out- symbolically calculate the Schur
complement with references to theevaluated Jacobian matrix
- generate an efficient transformation processby elimination of the submatrix J21
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
J11
J22J21
J12
Schur Complement
122 22 21 11 12
−= − ⋅ ⋅J J J J J%
Qimonda · Daniel Platte · September 16, 2007 · Page 13 Copyright © Qimonda AG 2007 · All rights reserved.
Example Schur Complement
1 0 0 J[1,4] J[1,5]
J[2,1] 1 0 0
J[3,1] 0 1 0 J[3,5]
J[4,4]
J[5,4]
0
J[5,5]
0
0 J[4,2] J[4,3]
J[5,1] 0 J[5,3]
Structural Matrix
-J[2,1]
J[2,4] = J[2,1]*J[1,4]J[2,5] = J[2,1]*J[1,5]J[2,1] = 0
J[3,4] = J[3,1]*J[1,4]J[3,5] = J[3,5]-J[3,1]*J[1,5]J[3,1] = 0
J[4,4] = J[4,4]-J[1,4]*J[4,1]J[4,5] = J[4,5]-J[1,5]*J[4,1]
J[4,4] = J[4,4]-J[2,4]*J[4,2]J[4,5] = J[4,5]-J[2,5]*J[4,2]…
Transformation Process
J[4,1] J[4,5]
J[2,4] J[2,5]
J[3,4]0
0
0 0 0
0 0
Qimonda · Daniel Platte · September 16, 2007 · Page 14 Copyright © Qimonda AG 2007 · All rights reserved.
Optimization Strategies
Op
timize
dD
AE S
yste
m• Identification of Sequential Equations (BLT Transf.)
Automated recognition of sequential equations from generalDAE systems to reduce the number of the simult. equations
• Common Subexpression Elimination (CSE)CSE on DAE level to reduce redundancy within the functionevaluation and improve data locality
• Loop Invariant Expressions / PreloadingPreevaluation of constant and parametric expressions within theinitialization phase, recognition of parametric subexpressions
• Copy / Constant PropagationReduce memory accesses by propagation of copied or constantvalues into subsequent expressions
Qimonda · Daniel Platte · September 16, 2007 · Page 15 Copyright © Qimonda AG 2007 · All rights reserved.
Example - Optimized DAE System
vd@tD VIN− RSI$AC@tDAREA
− V$OUT@tD
SeqExpr2@tD − V$OUT@tDR0
SeqExpr3@tD V$OUT@tDRL
id@tD −AREAikjj − 0.00333167qHBV+vd@tDL
k IBV−ikjj−1+
0.00181069qvd@tDk
y{zz IS
y{zz+ GMIN vd@tD
cj@tD AREA CJO IfBvd@tD < 0.25, 1H1.−[email protected] , 2.51926H0.3335+ 0.666 vd@tDLF
SeqExpr1@tD −C0 V$OUT @tDI$VGND@tD −SeqExpr1@tD − SeqExpr2@tD−I$AC@tD − SeqExpr1@tD −SeqExpr2@tD + SeqExpr3@tD 0I$AC@tD id@tD+ 1.15×10−8 id@tD + cj@tD vd @tD Simultaneous
Sequential
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
• Dimension reduced from 8x8 to 2x2• No redundancy within DAEs (3 common
subexpressions eliminated)
Qimonda · Daniel Platte · September 16, 2007 · Page 16 Copyright © Qimonda AG 2007 · All rights reserved.
Outline
Bottom-Up Modeling Through Symbolic AnalysisBottom-Up Modeling Through Symbolic Analysis
Simulation ResultsSimulation Results
SummarySummary
Model CompilationModel Compilation
Qimonda · Daniel Platte · September 16, 2007 · Page 17 Copyright © Qimonda AG 2007 · All rights reserved.
Performance Improvement
opamp741 (ZMS)
6.85
0.57 0.50 0.34
(0.30)(0.64)
2.41
0.64
5.77
1.370.76
0
1
2
3
4
5
6
7
8
TML (sparseloading)
ZMS (sim) ZMS (seq) ZMS (opt) Circuit
CPU
-Tim
e [s
]
Loading Solving
ZMS Compiler
MUMPS Solver
Local Solving
Optimi-zations
8x Speed-Up (14x) 2x Slow-Down
6x
50 %
50 %
Qimonda · Daniel Platte · September 16, 2007 · Page 18 Copyright © Qimonda AG 2007 · All rights reserved.
Outline
Bottom-Up Modeling Through Symbolic AnalysisBottom-Up Modeling Through Symbolic Analysis
Simulation ResultsSimulation Results
SummarySummary
Model CompilationModel Compilation
Qimonda · Daniel Platte · September 16, 2007 · Page 19 Copyright © Qimonda AG 2007 · All rights reserved.
Outlook
Analog Insydes
Circuit Netlist
Equation Setup
Equation Optimization
Model Export
Netlist Import
Symbolic Device Models
Circuit Equations
new
Compiled Model
Model Reduction
Simplified Eqs.
improved
newAHDL Model
Behavioral Model
Model Import
Model Equations
ModelReduction
ModelOptimization
Translation, Device Model Compilation
missing
Qimonda · Daniel Platte · September 16, 2007 · Page 20 Copyright © Qimonda AG 2007 · All rights reserved.
Summary
• Model generation based on a symbolic analysis flow using the toolbox Analog Insydes presented
• Such complex and high-dimensional AHDL-based models have insufficient simulation performance
• New model compiler for the Titan simulator developed• Efficient processing of the generated models due to
– Local solving with semi-symbolic Schur complement– Optimization techniques on DAE level– Close interaction between model and simulator
• Performance improvement by a factor of at least 8
Thank you
The World’s LeadingCreative Memory Company
Qimonda · Daniel Platte · September 16, 2007 · Page 22 Copyright © Qimonda AG 2007 · All rights reserved.
Performance Problem
Circ
uit
sim
ulat
ion
Circuit
Equations
Waveform
Solver Sym
bolic
Ana
lysi
s
Circuit
Equations
Model
Beh
avio
ral
sim
ulat
ion
Waveforms
Solver
equivalent
identic
similar
~200x
Qimonda · Daniel Platte · September 16, 2007 · Page 23 Copyright © Qimonda AG 2007 · All rights reserved.
Dimension of the Linearized Systems
28
58
21
11
28
68
21
11
47
137
83
27
95
314
(1396)
(281)
0 200 400 600 800 1000 1200 1400 1600
mul
tiplie
rop
amp7
41cf
cam
pna
nd2
Dim [1]cir blt seq sim
Sim: All model equationssimultaneous
Seq: Sequential equationsdefined within devicemodels
Blt: With identification of sequential equations
Cir: Dimension of the MNA equations in circuitsimulation
Seq
• Enormous reduction of themodel dimension
• Only minor differencebetween optimized modeland circuit simulation
• Local solving and optimizations very efficient
MOSBSIM3v3
BJTGummel-
Poon
Qimonda · Daniel Platte · September 16, 2007 · Page 24 Copyright © Qimonda AG 2007 · All rights reserved.
Jacobian Matrix of Sequential DAEsB
ehav
iora
l Mod
elC
ircui
t
Nodal Equations
Node Potentials Currs
Voltage Equations
Model Equations
InternalQuantities
Simultaneous Variables
Port Volt.
Simult. Equations
Seq. Equations
SequentialVariables
11 12
21
1
Nodal Equations(Ports)
22 2
Typically muchlarger dimension
Very few
Solved within thesimulator kernel
Qimonda · Daniel Platte · September 16, 2007 · Page 25 Copyright © Qimonda AG 2007 · All rights reserved.
Semi-Symbolic Schur Complement
1) Transform S11 to unity matrixchanges within S12, fill-ins
2) Eliminate S21
update of S22, fill-ins
Qimonda · Daniel Platte · September 16, 2007 · Page 26 Copyright © Qimonda AG 2007 · All rights reserved.
DAE System with Sequential Structure
( )f , , , , t 0′ ′ =y y x xA system of differential algebraic equations (DAE)
mR∈ynR∈x
withsequential variables
simultaneous variables
{ }seq simf f , f=n m m
seqf : R R R R× × ⎯⎯→n m n
simf : R R R R× × ⎯⎯→
sequential equations
simultaneous equations
( )i seq,i 1 i 1 1 i 1f y y , y y , , , t− −′ ′ ′=y x xK K i 1 n= K
is of sequential structure, if
for
Qimonda · Daniel Platte · September 16, 2007 · Page 27 Copyright © Qimonda AG 2007 · All rights reserved.
Modeling DAE Systems with Seq. StructureSe
quen
tial
Equa
tions
Sim
ulta
neou
sEq
uatio
ns
Verilog-A// Analog procedural// assignment
real vd;
vd = -RS*X(I_A))/AREA-X(V_K)+V(A);
// (Indirect) branch// contribution
DAEVar V_G;
X(V_G) <+ X(V_G)+id-0.115E-7*X(id_d1)-X(I_A)+cj*X(vd_d1);
Sequ
entia
lEq
uatio
nsSi
mul
tane
ous
Equa
tions
Qimonda · Daniel Platte · September 16, 2007 · Page 28 Copyright © Qimonda AG 2007 · All rights reserved.
Modeling DAE Systems with Seq. Structure
Verilog-A VHDL-AMS// Analog procedural// assignment
real vd;
vd = -RS*X(I_A))/AREA-X(V_K)+V(A);
-- Simultaneous-- procedural statement
vd:=-RS*I_A/AREA-V_K+A;
-- or: Function
function vd (I_A,…) is
…
// (Indirect) branch// contribution
DAEVar V_G;
X(V_G) <+ X(V_G)+id-0.115E-7*X(id_d1)-X(I_A)+cj*X(vd_d1);
-- Simple simultaneous-- statement
QUANTITY V_G : Voltage;I_A-ID-0.115E-7*ID_D1-CJ*VD_D1 == 0.0 Tolerance “Current”;
Sequ
entia
lEq
uatio
nsSi
mul
tane
ous
Equa
tions
Qimonda · Daniel Platte · September 16, 2007 · Page 29 Copyright © Qimonda AG 2007 · All rights reserved.
Block Lower-Triangular Matrix
Sim. Vars.
Sim
. Eq
s.S
eq. E
quat
ions
Seq. Variables
11 12
21 22
1
1
11
11
11
11
Qimonda · Daniel Platte · September 16, 2007 · Page 30 Copyright © Qimonda AG 2007 · All rights reserved.
4801
12002
153722
30889
7874
16895
275135
57285
0 50000 100000 150000 200000 250000 300000
mul
tiplie
rcf
cam
p
EvalCost [1]blt/cse seq
Qimonda · Daniel Platte · September 16, 2007 · Page 31 Copyright © Qimonda AG 2007 · All rights reserved.
Speed-Up for TML Models
opamp741 (TML)
49.36
64.79
6.85
0.64
27.840.70
0
10
20
30
40
50
60
70
80
TML (default) TML (sparse solver) TML (sparse loading)
CP
U-T
ime
[s]
Loading Solving
Conver-gence
Sparse Solver
Sparse Loading
10x Speed-Up
Qimonda · Daniel Platte · September 16, 2007 · Page 32 Copyright © Qimonda AG 2007 · All rights reserved.
Sequential Equations (Definition)
F(x , y ) 0=
1 1
2 2 1
3 3 1 2
n 1 1 n 1
x G (y)x G (y, x )x G (y, x , x )
x G (y, x , , x )−
===
=L
K
m simultaneous equationsm simultaneous variables
n sequential equationsn sequential variables x
y
0)),(()(~ == yyxFyF
• Reduced dimension of Jacobian matrix• Iterative solution for less variables mmmnmn ×→+×+ )()(
Advantages:
sequential variables as function of simultaneous variables y
)( yxx =
Qimonda · Daniel Platte · September 16, 2007 · Page 33 Copyright © Qimonda AG 2007 · All rights reserved.
Recognition of Sequential Equations
Identification
1 2 4 6 8 10 11
12
4
6
8
1011
1 2 4 6 8 10 11
12
4
6
8
10118x8
Variables
Equations
1 2 4 6 8 10 11
12
4
6
8
1011
1 2 4 6 8 10 11
12
4
6
8
1011
Seq
3x3
Variables
Equations
Qimonda · Daniel Platte · September 16, 2007 · Page 34 Copyright © Qimonda AG 2007 · All rights reserved.
Simulator Comparison for opamp741
opamp741
426.6
35.0
36.7
192.3
448.5
30.9
32.9
230.8
277.3
27.5
17.3
148.7
0.0 125.0 250.0 375.0 500.0
Thetys
Rhea
Dione
Titan
Sim
ulat
orSlow Down [1]absolute per time step per iteration
opamp741
3556
5470
2098
2665
2117
4479
3753
4856
1701
1618
1559
1764
1569
1748
1669
1391
0 2000 4000 6000
circuit
model
circuit
model
circuit
model
circuit
model
Thet
ysTh
etys
Rhe
aR
hea
Dio
neD
ione
Tita
nTi
tan
Sim
ulat
or
N [1]iterations time steps