Hybrid modelling of a large communication system
May 6, 2015
Ilia Greenblat
Using open-source tools and verilog simulator
The system
[Figure: the system hierarchy, from ASIC + FPGA to linecard to box to many boxes]
The Compass Networks system is a collection of several boxes. The boxes send packets among themselves and to the outside world.
The architecture starts with the ASIC/FPGA RTL design and continues into the linecard and the box.
The challenge is to predict the whole system’s behaviour based on individual ASIC/FPGA architectural choices.
CPUs running software are a minor part of the system and are therefore modelled as hardware functionality.
Why not just simulation?
• System behaviour and performance are determined not only by the makeup of an individual chip or FPGA, but also by the interaction among the many nodes built from them.
• Performance can only be predicted by running and examining variations in topology and load scenarios. A correct ASIC design can then be based on lessons learned and validated by modelling.
• Using the ASIC/FPGA RTL code is not practical, for two reasons:
  • the RTL design itself depends on the modelling, and
  • simulation runtimes would be prohibitive.
modelling phases
[Figure: the modelling phases: building the model (“hardware”), defining excitation profiles, big data analysis]
Chasing the black swans: reaching the corner cases.
Building the model is only part of the story. Much work goes into defining interesting patterns and, after that, into analysing the gigabytes of resulting data.
What is the aim?
1. what-ifs and confidence in algorithms.
2. find reasonable memory sizes.
3. rarity/normality of certain events.
4. optimal settings of certain parameters.
5. unlike verification, it is not pass/fail.
6. overdesign / underdesign.
7. validate assumptions.
8. validate topologies / hw configs.
Modelling vs simulation
[Figure: rtl vs model]
model: unlike RTL, the model just computes the delay of a packet over the line, based on the length of the packet. The model usually moves “pointers” to actual packet objects.
• Modelling is a simulation of a system in which key components are abstractions and, instead of clocks, we have timed events.
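As a rough illustration of timed events replacing clocks, here is a minimal Python sketch. It assumes a single line with a fixed rate; the names `LINE_RATE_BPS`, `line_delay_ns` and `run` are hypothetical and are not taken from the original model.

```python
import heapq

LINE_RATE_BPS = 10e9  # assumed line rate (10 Gb/s), for illustration only


def line_delay_ns(length_bytes, rate_bps=LINE_RATE_BPS):
    """Time to serialise a packet onto the line, from its length alone."""
    return length_bytes * 8 / rate_bps * 1e9


def run(packets):
    """packets: list of (send_time_ns, packet_id, length_bytes).

    Returns (arrival_time_ns, packet_id) pairs in arrival order. Only the
    packet id (a "pointer") travels through the event queue.
    """
    events = []
    line_free_at = 0.0
    for send_time, pkt_id, length in sorted(packets):
        start = max(send_time, line_free_at)  # wait while the line is busy
        line_free_at = start + line_delay_ns(length)
        heapq.heappush(events, (line_free_at, pkt_id))
    arrivals = []
    while events:
        arrivals.append(heapq.heappop(events))  # earliest event first
    return arrivals
```

No clock ticks between events: simulated time jumps directly from one arrival to the next, which is where the runtime saving over RTL comes from.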
modelling: the two aspects
simulation: the simulation aspect deals with modelling of delays, activation of agents, triggering of events, and correctly shuffling the data around.
algorithms: the algorithm aspect deals with decision making inside the hardware and software agents, for example inspecting a packet to decide on routing.
modelling: our approach
We split the work: simulation is done by Verilog and algorithms by Python.
Verilog moves “pointers”; Python has access to the packet objects associated with these pointers.
[Figure: Verilog and Python connected through VPI.so]
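One way to picture the pointer split is a handle table on the Python side. The sketch below is an assumption about how such a table could look, not the authors' code: `PacketStore`, `route`, and the packet fields `dst` and `length` are all hypothetical.

```python
class PacketStore:
    """Maps small integer handles to full packet objects."""

    def __init__(self):
        self._packets = {}
        self._next_handle = 1  # 0 is reserved to mean "no packet"

    def put(self, packet):
        """Store a packet and return the handle that Verilog would carry."""
        handle = self._next_handle
        self._next_handle += 1
        self._packets[handle] = packet
        return handle

    def get(self, handle):
        return self._packets[handle]

    def drop(self, handle):
        """Forget a packet once it has left the model."""
        return self._packets.pop(handle)


store = PacketStore()


def route(handle, table):
    """Algorithm side: inspect the packet behind a handle, pick an output port."""
    packet = store.get(handle)
    return table.get(packet["dst"], 0)  # port 0 is an assumed default


handle = store.put({"dst": "boxB", "length": 64})
port = route(handle, {"boxB": 3})
```

The simulator only ever sees the integer handle, so the width of the "pointer" bus stays fixed no matter how rich the packet object becomes.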
alternatives
Our approach lets us use open-source tools (Verilog included), which removes the need to obtain and manage licenses and lets us start an unlimited number of parallel runs.
Other tools offer a nicer GUI, but it quickly becomes a liability when the system scales up.
Other tools do some parts well, but none does everything we need: expressing all levels of the design.
If the algorithm part of the model is written in C++, it becomes much less productive than using Python. Python has enough expressive power that building models is quick and hassle-free.
Also, we are not testing TCP/IP or router stack protocols, which are written in C/C++; rather, we model IP behaviour in Python.
Arena
OMNeT++
OPNET
Simulink
SystemVerilog
SystemVision (Mentor)
a word on vpi connection
On the Verilog side:
    reg clk;
    initial $python("initial()");
    always begin
        clk = 0;
        #10;
        $python("posedge()");
        clk = 1;
        #10;
        $python("negedge()");
    end
On the Python side (verilog.py file):
    import veri  # all verilog connections are in the veri lib
    cycles = 0  # just count cycles
    def initial():  # put startup code here
        print('put here initial code, clean, open files ...')
    def posedge():
        global cycles
        cycles = cycles + 1
    def negedge():
        if cycles > 1000:
            veri.finish()  # exit from simulator
        Req = veri.peek('top.dut.request')
        if Req == '1':
            print('requesting at %d cycle' % cycles)
            veri.force('top.dut.ack', '1')
[Figure: the two sides connect through veri_vpi.so]
“hardware” hierarchy of models
- the zoo (building blocks): fifos, buffers, round-robins, …; functionality and statistics, written in Verilog and Python.
- asic/fpga data flow model: usually schematics, translated to Verilog.
- linecard: Verilog code that reflects the actual connectivity of board components.
- system.v (topology): generated by a config script; a netlist of the test topology we want to examine.
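A config script that emits the top-level netlist could be as small as the sketch below. This is only an illustration under stated assumptions: the function `make_system_v`, the module name `box`, the port names `outN`/`inN`, and the 32-bit link width are all hypothetical.

```python
def make_system_v(n_boxes, links):
    """Return Verilog text for a top-level netlist.

    n_boxes: number of box instances.
    links:   list of (src_box, dst_box) pairs, one wire bundle per link.
    The module name `box` and its port names are assumptions for illustration.
    """
    lines = ["module system;"]
    for i, (src, dst) in enumerate(links):
        lines.append("  wire [31:0] link%d;  // box%d -> box%d" % (i, src, dst))
    for b in range(n_boxes):
        outs = [i for i, (src, _) in enumerate(links) if src == b]
        ins = [i for i, (_, dst) in enumerate(links) if dst == b]
        ports = [".out%d(link%d)" % (k, i) for k, i in enumerate(outs)]
        ports += [".in%d(link%d)" % (k, i) for k, i in enumerate(ins)]
        lines.append("  box box%d (%s);" % (b, ", ".join(ports)))
    lines.append("endmodule")
    return "\n".join(lines)


# A three-box ring: 0 -> 1 -> 2 -> 0
ring = make_system_v(3, [(0, 1), (1, 2), (2, 0)])
```

Generating the netlist from a short description makes it cheap to try many topologies, since each variation is one changed list rather than a hand-edited file.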
building blocks: the Zoo
The ZOO: queue, buffer, roundrobins, splitters, pipe, flowMeter, source, sink, red lights, queue system, reorders, ….
Each member includes:
- simple members may include only Verilog.
- more complex members may include a call to a function.
- those that require “memory” include the definition and an instance of a class.
- complex members may have very long Python code.
Each member of the zoo also has an icon for use in schematics.
Icon examples: flowMeter, 125GBS pipe.
3 flavors:
- functional (queue, pipe, roundrobin),
- specialised (decision maker),
- alerts & monitors (flow meter, red light, timeouts).
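A zoo member that needs “memory” might pair a class definition with one instance per block. The following is a minimal sketch of such a member, assuming a bounded FIFO of packet handles; `ZooQueue`, its fields, and the `queues` registry are hypothetical names.

```python
from collections import deque


class ZooQueue:
    """A bounded FIFO of packet handles that also tracks simple statistics."""

    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.items = deque()
        self.pushed = 0
        self.dropped = 0
        self.max_occupancy = 0

    def push(self, handle):
        """Return True if accepted, False if the queue is full (drop)."""
        if len(self.items) >= self.depth:
            self.dropped += 1
            return False
        self.items.append(handle)
        self.pushed += 1
        self.max_occupancy = max(self.max_occupancy, len(self.items))
        return True

    def pop(self):
        """Return the oldest handle, or None when empty."""
        return self.items.popleft() if self.items else None

    def stats(self):
        return {"name": self.name, "occupancy": len(self.items),
                "max_occupancy": self.max_occupancy,
                "pushed": self.pushed, "dropped": self.dropped}


# One instance per queue in the model (assumed lookup by instance name).
queues = {"ingress0": ZooQueue("ingress0", depth=2)}
```

Keeping the counters inside the member means every instance reports the same statistics without extra wiring in the model.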
asic/fpga models
Schematics are a convenient way to maintain data flow models at the ASIC/FPGA level.
Schematics are exported to model-compatible Verilog. All components come from the zoo.
Higher levels (linecard, box, whole system) are not suitable for schematics.
Scenario: driving the simulation
scenario:
- configure each source with traffic generation patterns and their evolution over time.
- configure the variables controlling the system behaviour.
- control the amount of randomisation and noise.
- control the occurrence of faults.
- modelling facilitates (and requires) global synchronisation of events, to lead the system through interesting conditions.
chasing the black swans.
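The scenario items above can be sketched in a few lines of Python. This is an illustration under assumptions, not the original scenario format: `make_source`, `fault_times`, the phase tuples, and the rate unit (packets per microsecond) are hypothetical.

```python
import random


def make_source(name, phases, noise=0.0, seed=0):
    """Return rate_at(t): the packet rate of one source at time t.

    phases: list of (start_time, packets_per_us), sorted by start_time.
    noise:  relative random jitter, e.g. 0.1 for about +/-10 percent.
    A fixed seed keeps a run reproducible.
    """
    rng = random.Random("%s-%d" % (name, seed))

    def rate_at(t):
        rate = 0.0  # silent before the first phase starts
        for start, value in phases:
            if t >= start:
                rate = value
        return rate * (1.0 + rng.uniform(-noise, noise))

    return rate_at


def fault_times(n_faults, horizon, seed=0):
    """Pick n_faults random times between 0 and horizon for fault injection."""
    rng = random.Random(seed)
    return sorted(rng.uniform(0, horizon) for _ in range(n_faults))
```

Seeding every random stream explicitly is what lets a rare corner case, once found, be replayed exactly.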
analysing the run
logfile: a big data problem. Several logfiles document, step by step, the events that happen in the system. It is pretty hard to extract useful information from them.
red flags, alerts, errors: “assertions” inside the model monitor and alert on predefined bad conditions or undefined behaviour. Debug usually starts from those.
“spice” waves: each building block periodically reports its vital stats, such as the amount of traffic passing, the occupancy of memory modules, or the split between kinds of traffic. Together, all this data is displayed like “spice” waves.
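Turning periodic stat reports into wave-like time series could look like the sketch below. The log line format is an assumption made for this example (the slides do not show the real format), and `parse_stats` and `red_flags` are hypothetical helpers.

```python
from collections import defaultdict


def parse_stats(lines):
    """Turn periodic stat lines into per-signal time series.

    Assumed line format (hypothetical): "<time> <block> <stat> <value>".
    Lines that do not match are skipped.
    """
    waves = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        try:
            t, value = float(fields[0]), float(fields[3])
        except ValueError:
            continue
        waves["%s.%s" % (fields[1], fields[2])].append((t, value))
    return dict(waves)


def red_flags(waves, limits):
    """Return (signal, time, value) for every sample above its limit."""
    return [(name, t, v)
            for name, series in sorted(waves.items())
            for t, v in series
            if name in limits and v > limits[name]]
```

Each key of the returned dictionary is one trace, so the result can be handed to any plotting or waveform tool.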
Big data: analysing the results
The challenge:
- discern between bugs in the model and real problems.
- coverage (which interesting combinations occurred).
- which limits were crossed.
- why some strange behaviour occurred.
- what size of fifos/buffers/queues is optimal (size vs back pressure).
- what is the chance of bad events, e.g. losing packets with the current settings.
- what is the optimal setting of the “knobs” (configuration registers).
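The size question can be explored with a sweep over one knob. The sketch below uses a deliberately simple toy traffic model, which is an assumption and not the traffic used in the original work; `drop_fraction`, `sweep`, and all parameter values are hypothetical.

```python
import random


def drop_fraction(depth, n_slots=20000, burst_prob=0.3, burst_len=3, seed=1):
    """Fraction of packets dropped by a FIFO of the given depth.

    Toy traffic (an assumption): in each time slot a burst of burst_len
    packets arrives with probability burst_prob, and one packet is served.
    """
    rng = random.Random(seed)
    occupancy = arrived = dropped = 0
    for _ in range(n_slots):
        if rng.random() < burst_prob:
            for _ in range(burst_len):
                arrived += 1
                if occupancy < depth:
                    occupancy += 1
                else:
                    dropped += 1
        if occupancy:
            occupancy -= 1  # serve one packet per slot
    return dropped / float(arrived) if arrived else 0.0


def sweep(depths, **kwargs):
    """Same traffic (same seed) for every depth, so results are comparable."""
    return {d: drop_fraction(d, **kwargs) for d in depths}
```

Calling `sweep([1, 4, 16, 64])` gives one drop fraction per depth; because every depth sees identical arrivals, differences between the results come from the buffer size alone.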
waves look something like this
The wave application lets us correlate various events in time and understand the causality: what triggered what.
some points
• Proving that the model is correct (and that it keeps up with modifications) is a daunting task, so we must settle for “rising confidence”.
• The hybrid approach proved to be the simplest, the easiest to debug, and the easiest to keep up with design changes. It also speaks the designers’ language.
• In some cases it is possible to replace a model abstraction with actual RTL, e.g. the load balancer.
• The Verilog-Python connection replaces SystemVerilog verification in some projects.
• Consider using free and self-developed tools. This makes it possible to develop the flow from anywhere.
• Our flow doesn't lock us into a “legacy” grip.
• The flow is modular: it is easy to replace parts with better alternatives.