arm system technology mentor tech forum v6
TRANSCRIPT
-
8/8/2019 ARM System Technology Mentor Tech Forum v6
Effective System Design with ARM System IP
Mentor Technical Forum 2009
Serge Poublan
Product Marketing Manager
ARM
-
Higher level of integration
WiFi
Bluetooth
Camera
Platform OS Graphic 13 days standby
H.264
MP3
Flash 9
128 MB DDR
Skype
-
World-class market-proven technology
20+ processors for every application
200+ silicon partners
500+ licenses
15 billion units shipped
Processors are evolving, e.g. MP (x1-4)
[Figure: ARM processor roadmap by architecture version (ARMv4, ARMv5, ARMv6, ARMv7 Cortex). Processors shown: ARM7TDMI(S), ARM920T, ARM922T, SC100, ARM7EJ-S, ARM926EJ-S, ARM946E-S, ARM966E-S, ARM968E-S, ARM1026EJ-S, SC200, ARM1136J(F)-S, ARM1156T2(F)-S, ARM1176JZ(F)-S, ARM11 MPCore, Cortex-M0, Cortex-M1, Cortex-M3, SC300, Cortex-R4, Cortex-R4F, Cortex-A8, Cortex-A9.]
-
ARM Mali GPU - Scalable Performance to over 1G Pixel/s
[Figure: Mali GPU range (Mali-55, Mali-200, Mali-400 MP) plotted by visual complexity against screen resolution. Use cases shown: Flash Lite, mobile gaming, web browsing, Java gaming, 3D navigation, next generation navigation, Flash 10, TV HD UI, video post processing, HD video post processing, 2D/3D presentations, HD 3D gaming, console 3D gaming.]
-
Higher Mobile Device Resolution
[Figure: mobile display resolution roadmap, 2007 to 2013. Resolutions shown: QVGA (320x240), VGA (640x480), WVGA (800x480), WSVGA (1024x600), WXGA (1280x800), 1080p30 (1920x1080), 1080p60 (1920x1080).]
Requirements of the next generation mobile platform
- Bandwidth requirements are increasing simply to refresh the display
- This ignores fill rate, input vertex data and texture bandwidth
Display Refresh Bandwidth
Mode                         MB/s
1080p60, 1920x1080, 60fps    475
1080p30, 1920x1080, 30fps    237
720p, 1280x720, 30fps        105
WVGA, 800x480, 30fps         44
VGA, 640x480, 30fps          35
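The display refresh figures above can be reproduced with a short calculation. This sketch assumes 4 bytes per pixel (32-bit colour) and 1 MB = 2^20 bytes. Neither assumption is stated on the slide, but together they match the listed values after rounding. The function name and the mode table are illustrative.

```python
def refresh_bandwidth_mb(width, height, fps, bytes_per_pixel=4):
    """Bandwidth needed just to refresh the display, in MB/s (1 MB = 2**20 bytes).

    bytes_per_pixel=4 is an assumption (32-bit colour), not a value from the slide.
    """
    return width * height * fps * bytes_per_pixel / 2**20


# Display modes taken from the table: (width, height, frames per second)
MODES = {
    "1080p60": (1920, 1080, 60),
    "1080p30": (1920, 1080, 30),
    "720p": (1280, 720, 30),
    "WVGA": (800, 480, 30),
    "VGA": (640, 480, 30),
}

for name, (width, height, fps) in MODES.items():
    print(f"{name:8s} {refresh_bandwidth_mb(width, height, fps):6.0f} MB/s")
```

As the slide notes, this covers display refresh only. Fill rate, vertex data and texture traffic come on top of it.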
-
Example SoC Mobile Platform
[Block diagram: CPU, L2CC, L2 Cache, Media Engine, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, AMBA Interconnect, Dynamic Memory Controller, Static Memory Ctrl, LPDDR2 SDRAM, NAND Flash, and peripherals UART0, UART1, SPI, WDT, RTC, Timer0, Timer1, GPIO. Other labels: 64 or 128. Legend: latency requirement, bandwidth requirement.]
-
Example SoC Mobile Platform
[Block diagram: CPU, L2CC, L2 Cache, Media Engine, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, AMBA Interconnect, Dynamic Memory Controller, Static Memory Ctrl, LPDDR2 SDRAM, NAND Flash. Other labels: 64 or 128. The on-chip infrastructure is labelled "Digital Highway".]
-
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
-
AMBA Ecosystem:
The on-chip infrastructure is critical to system performance
- Increased focus on processor memory performance
- Different types of processors have different requirements
ARM has grown the AMBA architecture ecosystem to help accelerate SoC design:
- 70+ Connected Community partners have AMBA compatible products
- 10+ AMBA specification downloads a day
"the de facto standard is of course the ARM bus architecture, AMBA."
Ron Wilson, EETimes
-
Design to Minimise Latency
Each path must be designed to minimise the inherent pipeline latency
- Next generation AXI Interconnect halves the interconnect latency
- Masters which issue multiple AXI requests effectively hide latency
PrimeCell Cache Controllers
- Trade an increase in minimum latency for dramatically reduced average latency
[Figure: round trip memory latency along the path Processor sub-system -> AXI Interconnect -> Dynamic Mem Ctrl -> DDR2 PHY -> DDR2 SDRAM. Contributions shown: address format and arbitration, DDR2 SDRAM CAS latency, de-skew and capture, data FIFO and bus interface.]
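The cache controller trade-off on this slide (higher minimum latency, lower average latency) can be illustrated with a simple weighted average. All cycle counts and the hit rate below are hypothetical values chosen for illustration, not figures from the presentation.

```python
def average_latency(hit_rate, l2_hit_cycles, l2_lookup_cycles, memory_cycles):
    """Average access latency with an L2 cache in front of external memory.

    A miss pays the L2 lookup on top of the memory round trip, which is why
    the minimum latency to external memory goes up when a cache is added.
    """
    miss_cycles = l2_lookup_cycles + memory_cycles
    return hit_rate * l2_hit_cycles + (1.0 - hit_rate) * miss_cycles


# Hypothetical numbers: 60-cycle round trip to DRAM, 10-cycle L2 hit,
# 4 extra cycles of lookup on a miss, 80% L2 hit rate.
without_l2 = 60
with_l2 = average_latency(hit_rate=0.8, l2_hit_cycles=10,
                          l2_lookup_cycles=4, memory_cycles=60)
print(f"without L2: {without_l2} cycles, with L2: {with_l2:.1f} cycles on average")
```

With these assumed values a miss costs 64 cycles instead of 60, yet the average falls to roughly a third of the uncached figure.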
-
Design to Maximise Throughput
Effective on-chip Quality of Service depends on the co-operation of the interconnect and memory controller
- Support for multiple outstanding requests
- Make the best use of memory pages by scanning the list of requests
- Control the order of queued transactions to:
  - Meet maximum latency targets
  - Ensure throughput-dependent processors are well serviced
  - Provide low latency paths
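A minimal sketch of the reordering idea on this slide: scan the queue, prefer requests that hit an already open memory page, but never let a request wait beyond a latency bound. This is an illustrative policy only, not the algorithm of any ARM memory controller. The `Request` fields, `pick_next` and `max_wait` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Request:
    master: str  # name of the requesting master
    bank: int    # target DRAM bank
    row: int     # target row (page) within the bank
    age: int     # cycles this request has waited in the queue


def pick_next(queue, open_rows, max_wait):
    """Choose the next request to issue from a non-empty queue.

    open_rows maps bank -> currently open row; max_wait is the age at which
    a request is served regardless of page state.
    """
    oldest_first = sorted(queue, key=lambda r: r.age, reverse=True)
    # 1. Latency target: a request that has waited too long goes first.
    if oldest_first[0].age >= max_wait:
        return oldest_first[0]
    # 2. Page hit: prefer the oldest request to an already open row.
    for req in oldest_first:
        if open_rows.get(req.bank) == req.row:
            return req
    # 3. Otherwise fall back to the oldest request.
    return oldest_first[0]


queue = [
    Request("cpu", bank=0, row=7, age=3),
    Request("lcd", bank=1, row=2, age=12),
    Request("video", bank=0, row=5, age=6),
]
open_rows = {0: 7, 1: 9}
print(pick_next(queue, open_rows, max_wait=20).master)  # page hit is preferred
print(pick_next(queue, open_rows, max_wait=10).master)  # overdue request is preferred
```

Supporting multiple outstanding requests matters here because the scheduler can only reorder what is already queued.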
-
ARM Level 2 Cache Controllers
[Block diagram: CPU, L2 Cache, Media Engine, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, AMBA Interconnect, Dynamic Memory Controller, Static Memory Ctrl, LPDDR2 SDRAM, NAND Flash. Other labels: 64 or 128, Digital Highway.]
-
L2CC Increases Processor Performance
MPEG4 Decode on ARM1136J-S
- Benchmark: MPEG4 decode
- System: ARM PrimeXsys Platform for ARM1136J-S
- CPU: 400MHz ARM1136J-S, 16K I & D caches
- Memory: 100MHz 32-bit SDRAM
- L2 cache: L210 128K unified L2 cache
[Chart: relative performance (axis 0 to 2.2) for No L2, 128K L2, 256K L2 and 512K L2. Annotated gains: +74%, +102%, +104%.]
Web Page Render Time as a function of L2 Cache Size
- Benchmark: Linux + Mozilla (5 HTML pages from I-Bench looped 4 times)
- CPU: Cortex-A8 (speed, L1 cache), L2 part of Cortex-A8
- Results may vary for system configuration and web content
[Chart: speed-up compared to 0K L2 (axis 0.0 to 4.0) against L2 cache size (0, 128, 256, 512 KB). Series: First Time, Subsequent.]
-
L2CC Increases System Performance
Reduced system power consumption
- An external memory access takes ~10x more energy than an on-chip access
- External memory accesses are reduced with an L2 cache
Enables use of a lower-power and lower-cost memory sub-system
- E.g. 16-bit instead of 32-bit external interface
- Or LPDDR instead of DDR2
Reduced on-chip traffic & contention
- Only cache misses are propagated to the interconnect
- Improves overall system performance
- Provides more bandwidth to other SoC components
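The power argument on this slide can be made concrete with a rough estimate. The sketch uses the slide's ~10x ratio between external and on-chip access energy. The hit rate is a hypothetical value, and the model ignores the cost of the L2 lookup on a miss.

```python
def relative_memory_energy(l2_hit_rate, external_cost=10.0, on_chip_cost=1.0):
    """Average energy per access, in units of one on-chip access.

    external_cost=10.0 reflects the slide's "~10x" figure; the hit rate
    passed in is an assumption of the caller.
    """
    return l2_hit_rate * on_chip_cost + (1.0 - l2_hit_rate) * external_cost


no_l2 = relative_memory_energy(0.0)    # every access goes off-chip
with_l2 = relative_memory_energy(0.8)  # hypothetical 80% L2 hit rate
print(f"no L2: {no_l2:.1f}, with L2: {with_l2:.1f} (relative energy per access)")
```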
-
ARM AMBA Interconnect
[Block diagram: Cortex A8, L2CC, Media Engine, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, Dynamic Memory Controller, Static Memory Ctrl, LPDDR2 SDRAM, NAND Flash. The interconnect (Digital Highway) is labelled NIC-301. Other labels: 64 or 128.]
-
AMBA Interconnect (NIC-301)
Low latency communication for ARM CPUs
High bandwidth for ARM Graphics and Video
Supporting: AXI, AHB & APB
Data widths from 32- to 128-bit
Supporting both synchronous & GALS implementations
Quality of service
Configurable through AMBA Designer
For minimum area & maximum frequency
-
Optimise your Interconnect Topology
Use properties of the traffic to influence the topology
High connectivity and increasing numbers of IP cores do not scale with a single interconnect
[Figure: two interconnect topologies, each connecting a Cortex A9, real-time masters and low bandwidth peripherals to RAM, SMC and DMC. Frequency labels: Freq F and F x 2.5.]
-
Topology Optimisation with ARM Interconnect
[Block diagram: Cortex, Neon, L2CC, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, Dynamic Memory Controller, Static Memory Ctrl, LPDDR2 SDRAM, NAND Flash. Interconnect labels: NIC-301 200MHz, NIC-301 400MHz, Low Latency Interconnect. Other labels: 64 or 128.]
-
ARM Memory Controllers
[Block diagram: Cortex, Neon, L2CC, Video Engine, Graphic Engine, DMA Engine, LCD Ctrl, Interrupt Controller, DMC-34x, SMC-35x, LPDDR2 SDRAM, NAND Flash, Low Latency Interconnect. Other labels: 64 or 128.]
-
ARM Memory Controllers
Synthesizable, Configurable soft cores
Wide range of memory types, silicon processes & target applications
AXI Dynamic Memory Controllers for SDR, DDR, LPDDR, DDR2 and LPDDR2 (DMC-34x)
- Over 20 licensees to date
AXI Static Memory Controllers for NOR Flash, NAND Flash and SRAM (SMC-35x)
- Over 40 licensees to date
AHB Memory Controllers for Dynamic and Static Memories (PL24x)
- Over 60 licensees to date
-
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
-
What is AMBA Designer?
[Figure: AMBA Designer steps: Topology, Configure, Cross-configure, Stitch & Check. This slide shows the Cross-configure step.]
-
What is AMBA Designer?
[Figure: AMBA Designer steps: Topology, Configure, Cross-configure, Stitch & Check. This slide shows the Stitch & Check step.]
Stitch & Check
(Export as individual signals)
Interface checking on:
- Signal widths
- Signal direction
- Interface properties
- Valid response types
- Interleave depth
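To give a feel for what static interface checking involves, here is a small sketch covering two of the listed checks (signal direction and signal width) for a point-to-point link. It is not AMBA Designer's implementation or data model. `AxiPort` and `check_link` are hypothetical names, and a real interconnect may legitimately bridge different widths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxiPort:
    name: str
    role: str        # "master" or "slave"
    data_width: int  # data bus width in bits


def check_link(a, b):
    """Return a list of mismatches found when connecting port a to port b."""
    problems = []
    # Direction: a link needs exactly one master and one slave.
    if {a.role, b.role} != {"master", "slave"}:
        problems.append(f"direction: {a.name} is {a.role}, {b.name} is {b.role}")
    # Width: in this simplified model both ends must use the same data width.
    if a.data_width != b.data_width:
        problems.append(f"data width: {a.data_width} vs {b.data_width} bits")
    return problems


cpu = AxiPort("cpu_m0", "master", 64)
dma = AxiPort("dma_m0", "master", 32)
dmc = AxiPort("dmc_s0", "slave", 64)

print(check_link(cpu, dmc))  # compatible: empty list
print(check_link(dma, dmc))  # width mismatch
print(check_link(cpu, dma))  # two masters and a width mismatch
```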
-
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
-
AVIP Features for RTL Simulation
Functional Verification
For Verification Engineers, AVIP is a set of SystemVerilog modules that enable faster and higher quality verification of AXI based IP.
Performance Exploration
For SoC architects, HW and Verification Engineers. AXI based SoC performance can be explored and verified.
[Figure: IEEE 1800 SystemVerilog testbench around the UUT (block or sub-system) and its AXI master and AXI slave interfaces. Components shown: AXI Master, AXI Slave, AXI Monitor, User VIP, User IP, Directed Vectors, Prof. Data.]
-
AVIP Features for RTL Simulation
Protocol Checkers
OVL and SVA assertion libraries provided for AXI protocol checking.
AXI Protocol Coverage
Channel level, transaction level and sequence level pre-defined coverage points for AXI protocol coverage.
[Figure: IEEE 1800 SystemVerilog testbench around the UUT (block or sub-system) and its AXI master and AXI slave interfaces. Components shown: AXI Master, AXI Slave, AXI Monitor, User VIP, User IP.]
-
AMBA Designer + AVIP: RTL Design Flow
To optimise interconnect and memory architecture ARM recommends the following flow:
Configuration
- Set the correct parameters and check the components
Integration
- Assemble the sub-system and statically check the design
Simulation
- Run test scenarios to check usage modes
Analysis
- Check results and loop back
[Figure: flow loop Configuration -> Integration -> Simulation -> Analysis]
-
Fabric Design Tools: What is AVIP?
It enables System Exploration at RTL level
TTT = Time to tweak = 20s
TTS = Time to simulate = 5 mins
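To put the two figures above in perspective, a quick calculation shows how many tweak-and-simulate iterations fit in a working day. The 8-hour day, and the assumption that tweaking and simulating run back to back, are illustrative choices and not part of the slide.

```python
def iterations_per_day(tweak_s, simulate_s, working_hours=8):
    """Whole tweak+simulate cycles that fit in one working day."""
    cycle_s = tweak_s + simulate_s
    return (working_hours * 3600) // cycle_s


# TTT = 20 s and TTS = 5 min, as quoted on the slide
print(iterations_per_day(tweak_s=20, simulate_s=5 * 60))
```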
-
System Exploration Methods
RTL simulation: AVIP, User VIP, industry standards VIP
- Block-level, internal bus, RTL simulation
Spreadsheet analysis
- SoC, static
Acceleration/Emulation: VIP, Logic Tiles, SW
- SoC, real stimulus, external I/F
Silicon/Applications
- Real-time behavior
-
Iteration time vs Realism
Method                              Cycle time   Behaviour modelled
Spreadsheet (static analysis)       mins/hrs     Mathematical formula, not dynamic
AVIP (internal bus simulation)      mins/hrs     Statistical or recorded traffic profiles
SoC + s/w (emulation/proto)         days/wks     Adding S/W, external I/F with realistic scenarios
Silicon + Appl (CoreSight)          mths/yrs     Observe actual behaviour
[Figure axes: cycle time (LOW to HIGH) against realistic behaviour (LOW to HIGH)]
AVIP: the iteration time of a spreadsheet with the accuracy approaching RTL simulation
-
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
-
Improve the Performance of Your SoC
Analyzing real silicon performance enables you to confidently improve the next design
- If you want to find out how a car really performs, drive it
CoreSight Design Kit & Performance Profiling
- Provide accurate, real-time telemetry from your system
- Essential tools for delivering system performance improvements
Your SoC may be optimized, but is the software?
- ARM Profiler analyzes system performance, enabling optimization via Profile Driven Compilation
-
CoreSight Debug & Trace
The Debug & Trace Architecture for the Digital World
Open Standard available on www.arm.com
Optimise software productivity on your multi-core SoC
- SW Debug
- SW Performance Optimisation
- SoC Performance Optimisation
Visibility and trace of the whole SoC
ARM trace and performance sources (ETM, PTM, Interconnect)
Leverage CoreSight architecture for YOUR IP