

Page 1: Keystone Real Time Trace Workshop - Texas …processors.wiki.ti.com/images/archive/5/56/20130410160031...Keystone Real Time Trace Workshop March 2013 ... TI Confidential – NDA Restrictions

TI Confidential – NDA Restrictions

Keystone Real Time Trace Workshop March 2013

Vikas Varshney Imtaz Ali Engineering Manager – Debug and Emulation Engineering Manager – Analysis Tools


Agenda

• Tools roadmap overview (15 minutes)

• Multicore debug and trace features (15 minutes)

• Workshops (120 minutes)

– Setup overview

– WS1: Getting started with DSP trace

– WS2: Customize DSP trace for data tracing

– WS3: Hotspot analysis with function profiling, stalls, and cache analysis

– WS4: Getting started with non-intrusive system trace (STM) SoC profiling

– WS5: Customize SoC profiling for DDR and MSMC bandwidth and latency analysis

– WS6: DDR bandwidth analysis and latency by interfacing embedded APIs (cToolsLib)

– WS7: Command line trace decode interface for offline decode

– WS8: Getting started with Cortex A program execution trace and function profiling

• SerDes based trace receiver – preview (15 minutes)

• Q & A (15 minutes)


Roadmaps


CCS Keystone support roadmap

[Figure: CCS release roadmap on a quarterly timeline (4Q’12, 1Q’13, 2Q’13, 3Q’13, 4Q’13), with one row for CCS v5 and one row for CCS v6. Legend: Production, Early Adopter, In Development, Future.]

• CCS v5 releases shown: 5.3.0, 5.4.0, 5.5.0 (release dates shown: 26 Nov’12, 15 Apr’13, Aug’13)

• CCS v6 milestones shown: 6.0.0 Beta, CCSv6 EA, CCSv6 GA (release windows shown: Apr’13 to Jun’13, Jul’13 to Oct’13, Dec’13)

Feature callouts in the figure:

• Eclipse 3.8, GUI Composer, System Analyzer 1.1, XDS2xx support, enabler for Keystone2

• Windows 8 host support, Trace Usability Improvements I, CGT Optimizer Assistant, Keystone2 simulators

• Trace Analyzer: Code Coverage, Statistical Function Profiling

• Edison / Lamarr support, enhancements to System and Trace Analysis, CGT C6000 v8, smaller downloadable images

• Eclipse v4.3, Cortex A15 hypervisor debug support, Trace Usability Improvements II

• Code Coverage, Statistical Function Profiler, Context Aware Function Profiler

• CGT C6000 v8 Beta, new installer framework (InstallBuilder)

• Eclipse v4.3 (pre-release version), partial / incremental implementation of new features and enhancements, bug fixes


XDS roadmap

[Figure: XDS emulator roadmap on a timeline (Y10, 1Q11, 2Q11, 3Q11, 4Q11, 1H12, 2H12, Future). Legend: Plan/Future, Production, Early Adopter/Beta, In Development.]

• XDS100 v2 (< $100): entry level JTAG; USB 2.0; TI 14 and CTI 20 native; open HW reference design

• XDS100 v3 (cJTAG) (< $150): entry level JTAG & cJTAG; USB 2.0; TI 14 and CTI 20 native; open HW reference design

• XDS200 (~$299): performance JTAG & cJTAG at low cost; ARM SWD and SWO support; USB 2.0 and optional ENET; low cost embedded emulator option; bi-directional GPIOs for instrumentation; TI, MIPI, and ARM connector options; 3P EPK licensed

• XDS510 ($1.5K): performance JTAG; USB 2.0; TI 14 and CTI 20 native; 3P EPK licensed

• XDS560 ($3K): high performance JTAG; PCI and USB 2.0; TI 14 and CTI 20 native; 3P EPK licensed

• XDS560 v2 STM ($1.5K): high performance JTAG & cJTAG; USB 2.0 and ENET; STM receiver; 4 pin @ 100 MHz; 128 MB storage; MIPI 60 native; 3P EPK licensed

• XDS560 v2 derivatives (~$1K)

• XDS560T Trace Receiver ($10K): external DSP trace receiver; USB 2.0; 10 pin @ 167 MHz DDR; 64 MB trace storage buffer; auto skew & jitter calibration; TI 60 native; discontinued

• XDS Pro Trace Receiver ($3.5K): multi-channel ARM, DSP, and STM; USB 2.0 and ENET; 32 pin @ 250 MHz DDR; 2 GB trace storage buffer; auto skew & jitter calibration; MIPI 60 native / add-on cable to any XDS; 3P EPK licensed

• Future: on-chip
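The pin counts and clock rates quoted for the trace receivers give a rough upper bound on raw capture bandwidth. The sketch below works through that arithmetic. It assumes one bit per pin per clock edge (two edges per cycle for DDR, one otherwise) and ignores protocol overhead. The function name and the single-data-rate treatment of the XDS560 v2 STM are illustrative assumptions, not statements from the slides.

```python
def raw_trace_bandwidth_bps(pins: int, clock_hz: float, ddr: bool) -> float:
    """Upper bound on raw trace-port bandwidth in bits per second.

    Assumes one bit per pin per clock edge; DDR captures on both edges.
    """
    edges_per_cycle = 2 if ddr else 1
    return pins * clock_hz * edges_per_cycle


# Pin counts and clock rates as quoted on the roadmap slide.
receivers = {
    "XDS560T": raw_trace_bandwidth_bps(10, 167e6, ddr=True),
    "XDS560 v2 STM": raw_trace_bandwidth_bps(4, 100e6, ddr=False),  # DDR not stated; assumed SDR
    "XDS Pro Trace": raw_trace_bandwidth_bps(32, 250e6, ddr=True),
}

for name, bps in receivers.items():
    print(f"{name}: {bps / 1e9:.2f} Gbit/s raw")
```

Under these assumptions the Pro Trace port peaks at 16 Gbit/s (2 GB/s), so its 2 GB buffer would hold roughly one second of trace at the full raw rate; real trace streams are compressed and bursty, so actual capture windows are typically much longer.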


Multicore debug and trace features


Keystone debug and trace – key goals

• Efficiency: multicore debug IP within SoC

• Consistency: across all KeyStone devices

• Product Life Cycle: development and deployment

• Eco System Enablement: 3P framework and low cost tools


Keystone debug & trace strategy

Multicore Interactions

• Debugging cores independently or as a group

• Synchronization and execution correlation

• Runtime visibility in data flows spanning over multiple cores

Application Optimization

• CPU loading and load balancing

• Actual application execution sequence

• Cache and CPU stalls impact

Optimize Data Flow in the System

• Bus transactions visibility

• Interface throughput and bottlenecks characterization


Keystone-1 (debug & trace view)

[Block diagram: Keystone-1 debug and trace architecture. Labels recoverable from the figure:]

• CTools Debug SS: 1149.1 (JTAG), ICEpick, STM, ETB, Trace Pin Export, STM Pin Export

• DSP SS (C66x DSP): JTAG, Trace, AET, ETB, SW Msg

• ARM SS (Cortex A8): JTAG, SW Msg, HW BP/WP, CTM, ETM Trace, PMU, ETB, Trace Funnel, Replicator, TPIU

• Other blocks: XTI, Switch Fabric, Peripherals, DDR, L2 Memory, CP_Tracer (three instances shown)

• C66x DSP cores: Shannon 8, Nyquist 4, Appleton 4

• Cortex A8 cores: Shannon NA, Nyquist NA, Appleton 1


Keystone-2 (debug & trace view)

[Block diagram: Keystone-2 debug and trace architecture. Labels recoverable from the figure:]

• CTools Debug SS: 1149.1 (JTAG), ICEpick, STM, TBR, Trace Pin Export, STM Pin Export

• DSP SS (C66x DSP): JTAG, Trace, AET, ETB, SW Msg

• Tetris SS (Cortex A15): JTAG, SW Msg, HW BP/WP, CTM, PTM Trace, ARM STM, PMU, TBR, Trace Funnel, Replicator, TPIU

• Other blocks: XTI, Switch Fabric, Peripherals, DDR, L2 Memory, CP_Tracer (three instances shown)

• C66x DSP cores: Kepler 8

• Cortex A15 cores: Kepler 4


Debug capabilities summary

| Feature | Shannon | Nyquist | Appleton | Kepler |
|---|---|---|---|---|
| JTAG debug | √ | √ | √ | √ |
| CoreSight ARM Debug for Cortex | -- | -- | √ | √ |
| DSP AET – HWBP, WP, & sequencer | √ | √ | √ | √ |
| DSP Trace – PC, data, & events | √ | √ | √ | √ |
| Cortex HWBP & WP | -- | -- | √ | √ |
| Cortex Performance Measurement Units (PMU) | -- | -- | √ | √ |
| Cortex Processor Trace – PC & timing | -- | -- | √ | √ |
| Cortex Processor Trace – Data | -- | -- | √ | -- |
| STM SW messages | √ | √ | √ | √ |
| STM CP Tracers | 17 | 16 | 18 | 32 |
| On-Chip Trace Buffer – DSP Trace | 4 KB | 4 KB | 4 KB | 4 KB |
| On-Chip Trace Buffer – STM | 32 KB | 32 KB | 32 KB | 32 KB (TBR) |
| On-Chip Trace Buffer – Cortex Trace | -- | -- | 32 KB | 16 KB (TBR) |
| Trace Export | 20 pins | 20 pins | 32 pins (TPIU) | 32 pins (TPIU) |


Embedded debug and analysis

• cToolsLib – enables embedded debug, trace setup, and analysis use cases

• Easy access to debug capabilities via simple C APIs

• Very low latency and small footprint (on the order of a few KB)

• Easy OS integration

• Integrated with MCSDK

• Easy import and data visualization via CCS

Field Deployed Debug and Trace

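| API | Shannon | Nyquist | Appleton | Kepler |
|---|---|---|---|---|
| AETLib | √ | √ | √ | √ |
| ETBLib (with DMA draining support) | √ | √ | √ | √ |
| DSPTraceLib | √ | √ | √ | √ |
| ETMLib | -- | -- | √ | √ |
| STMLib | √ | √ | √ | √ |
| STM Linux driver | -- | -- | √ | √ |
| CPTLib | √ | √ | √ | √ |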


Workshops


MCSDK Image processing demo overview

[Figure: data flow of the image processing demo. Recoverable content:]

• Input image: bitmap image

• Core0 (master core) processing: file read, RGB extract and slicing

• Core0-3 (slave core) processing, one RGB slice per core (Slice 0 on Core 0, Slice 1 on Core 1, ..., Slice 3 on Core 3): RGB to Y (luma), IMGLIB Sobel filter (image gradient), IMGLIB threshold (image edge)

• Core0 (master core) processing: combine edge slices and create bitmap image, file write

• Output image: bitmap image

• This application shows the implementation of an image processing system using a simple multicore framework. It runs TI image processing kernels (a.k.a. imagelib) on multiple cores to process an input image from the host PC (e.g., edge detection).

• For more details on MCSDK Image processing demo, please refer to: http://processors.wiki.ti.com/index.php/MCSDK_Image_Processing_Demonstration_Guide
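The slice-and-process flow described above can be sketched in a few lines of plain Python. This is only an illustration of the data flow (slice, RGB to luma, Sobel, threshold, combine), not the MCSDK or IMGLIB code: the luma weights, the threshold level, the function names, and the choice to leave slice border rows at zero are all assumptions of this sketch, and the "cores" are processed sequentially.

```python
def rgb_to_luma(rows):
    """Convert rows of (r, g, b) pixels to 8-bit luma (BT.601 weights, assumed)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in rows]


def sobel_3x3(img):
    """Gradient magnitude |Gx| + |Gy| clipped to 255; border pixels stay 0."""
    if not img:
        return []
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def threshold(img, level):
    """Binarize: pixels above level become 255, all others 0."""
    return [[255 if p > level else 0 for p in row] for row in img]


def split_rows(rows, n):
    """Split rows into n contiguous slices of near-equal height."""
    base, extra = divmod(len(rows), n)
    slices, start = [], 0
    for i in range(n):
        end = start + base + (1 if i < extra else 0)
        slices.append(rows[start:end])
        start = end
    return slices


def edge_detect(rgb_rows, n_cores=4, level=128):
    slices = split_rows(rgb_rows, n_cores)                    # master: slicing
    edges = [threshold(sobel_3x3(rgb_to_luma(s)), level)      # slaves: one slice each
             for s in slices]
    return [row for part in edges for row in part]            # master: combine


# Small test image: left half black, right half white, 12 rows by 6 columns.
image = [[(0, 0, 0)] * 3 + [(255, 255, 255)] * 3 for _ in range(12)]
edges = edge_detect(image, n_cores=4)
```

Because each slice is filtered independently here, the first and last row of every slice stay at zero; a real implementation would need overlapping rows or a border policy to avoid visible seams between slices.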


Setup and Installation overview

• Hardware Setup

– XDS560v2 Pro Trace

– Nyquist (C6670) EVM

– Appleton (C6614) EVM (only required for ARM trace workshops)

– Ethernet cable connected between Nyquist EVM and host PC

• Software Installation

– Code Composer Studio v5.4

– BIOS-MCSDK v02.01.02.06 (+ patch 02.01.02.06P01) or newer


Setup and Installation overview

1. Download the following from the links specified on the previous page

i. CCS5.4.0.000xx_win32

ii. bios_mcsdk_02_01_02_06_setupwin32.exe (or newer)

iii. bios_mcsdk_02_01_02_06_patch01_setupwin32.exe (if using v2_01_02_06)

2. Install Code Composer Studio

3. Install BIOS MCSDK into c:\ti folder

4. Install BIOS MCSDK Patch (if required) into c:\ti folder

5. Start Code Composer Studio

6. Select a workspace when requested

7. Wait until CCS Add Discovered Products window comes up

8. Select OK

9. If warning pop-up, select OK

10. Say Yes to restarting CCS when requested

11. After CCS restarts, if requested to add other versions of NDK, select Cancel

12. Close TI Resource Explorer window


Steps to import and build the demo

1. From the CCS main menu, select Project -> Import Existing CCS Eclipse Project

2. Browse to folder C:\ti\mcsdk_2_01_02_06\demos\image_processing\ipc\evmc6670l

3. Select OK

4. Select the following projects

i. Image_processing_evmc6670l_slave

ii. Image_processing_evmc6670l_master

iii. Image_processing_evmc6670l_total_bandwidth_master

5. Select Finish

6. Right click in Project Explorer on each project imported and select Build Project


Steps to run the demo (i)

• To set up the demo in static IP mode, SW9 position 2 should be OFF. Other DIP switch settings, listed from position 1 to position 4:

– SW3: OFF, ON, ON, ON

– SW4: ON, ON, ON, ON

– SW5: ON, ON, ON, ON

– SW6: OFF, ON, ON, ON

– SW9: ON, OFF, ON, ON

• Connect the Nyquist EVM to the host PC using an Ethernet cable.

• Connect the XDS560v2 Pro Trace to the EVM and to the PC (via USB). Power up the EVM and the XDS560v2.

• Change the host PC network settings to use static IP address 192.168.2.101.

– Go to “Control Panel” -> “Network and Sharing Center” -> “Change Adapter Settings”

– Right click on “Local Area Network” and change the “Properties”

– If you prefer to use DHCP, IP address is shown in CCS console after the target is run.
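In static IP mode the host PC must sit on the same subnet as the EVM, which the demo steps later address as 192.168.2.100. A quick sanity check can be sketched with Python's standard `ipaddress` module; the /24 prefix (netmask 255.255.255.0) is an assumption, since the slides do not state a netmask.

```python
import ipaddress

EVM_IP = "192.168.2.100"    # EVM address used later to open the demo web page
HOST_IP = "192.168.2.101"   # static address assigned to the host PC
PREFIX_LEN = 24             # assumption: netmask 255.255.255.0


def same_subnet(host: str, target: str, prefix_len: int) -> bool:
    """Return True if target lies in the network that host/prefix_len belongs to."""
    network = ipaddress.ip_interface(f"{host}/{prefix_len}").network
    return ipaddress.ip_address(target) in network


print(same_subnet(HOST_IP, EVM_IP, PREFIX_LEN))
```

If the check fails for the address actually configured on the adapter, the browser will not be able to reach the EVM web page without additional routing.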


Steps to run the demo (ii)

• In CCS, set up a C6670 target configuration with the XDS560v2 Pro Trace USB connection

– Go to File -> New -> Target Configuration File

– Type file name as C6670_XDS560v2 and click Finish

– Now select Connection as Spectrum Digital XDSPRO USB Emulator

– Type C6670 in the Device field; device names will be filtered; select/check TMS320C6670

– Click on Target Configuration from Advanced Setup (RHS)

– Now select C66x_0 and include <CCS_INSTALL>\ccsv5\ccs_base\emulation\boards\evmc6670l\gel\evm6670l.gel from “initialization script” box (RHS). Click on Save

– Select View->Target Configurations to see a list of all configuration files

– Select the one you just created (C6670_XDS560v2.ccxml) under User Defined

– Launch the debug session by selecting the Launch Selected Configuration in the context menu

• Group all the 4 C66x cores into one single group


Steps to run the demo (iii)

• Connect the cores

• Load image_processing_evmc6670l_master.out (mcsdk_2_01_02_06 \ demos \ image_processing \ ipc \ evmc6670l \ master \ no_instrumentation \ Debug) on core 0.

• Load image_processing_evmc6670l_slave.out (mcsdk_2_01_02_06 \ demos \ image_processing \ ipc \ evmc6670l \ slave \ no_instrumentation \ Debug) on cores 1, 2, and 3.

• Run all the cores from the CCS debug view


Steps to run the demo (iv)


• Open a web browser and type in 192.168.2.100 (EVM’s IP address) in the address box. One can see the following interactive webpage:

• Select Number of Cores as “Four”

• Browse and provide the path to a bitmap image evmc6678l_1920x1080_5_93MB.bmp (available at: mcsdk_2_01_02_06 \ demos \image_processing \ images)

• Click on Process


Steps to run the demo (v)

The image processing demo output includes details such as processing time and the output image.


WS1 Getting started with DSP trace


WS1 – Tracing program execution

1. If the Image Processing Demo is not already running, complete steps i to iii of Steps to Run Demo as described in Workshop Setup

2. Select C66xx_0 in the debugger

3. From the menu, select Tools -> Hardware Trace Analyzer -> PC Trace to start PC Trace

Known Issue: The first time this menu is selected, it may take a few seconds to open.

Note: The list of analyses available depends on the selected core(s) and their state. The configuration dialog shows a description of the default configuration.

4. Click the clear button to clear any saved/cached settings from a previous run

5. Set Trace Range = End at Address

6. Set End Address = convert_rgb_to_y

7. Select Start to open the Trace Viewer


WS1 – Tracing program execution

8. The Trace Viewer status shows that the buffer has already wrapped; the collected data is displayed only when recording ends

9. In this case recording ends either when convert_rgb_to_y is executed or when C66xx_0 is halted

10. Ignore the warning in the view for now. The clock frequency will be obtained when data collection stops

11. Complete step iv of Steps to Run Demo (as described in Workshop Setup) to process the image

12. Wait for Trace Viewer to show all collected data
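The "End at Address" range combined with a wrapping buffer behaves like a circular buffer that is frozen when the end address executes: older records are overwritten, and what remains is the history leading up to the end point. A minimal sketch of that idea follows; the capacity in records and the symbol names other than convert_rgb_to_y and process_rgb are hypothetical (the real on-chip buffer holds compressed trace, not one record per symbol).

```python
from collections import deque


def trace_until(pc_stream, end_symbol, capacity):
    """Record executed symbols into a circular buffer and stop at end_symbol.

    Returns the surviving records (oldest first) and whether the buffer wrapped.
    """
    buffer = deque(maxlen=capacity)  # oldest records are dropped once full
    wrapped = False
    for symbol in pc_stream:
        if len(buffer) == capacity:
            wrapped = True           # this append overwrites the oldest record
        buffer.append(symbol)
        if symbol == end_symbol:
            break                    # end condition: recording stops here
    return list(buffer), wrapped


# Hypothetical execution stream: a long idle phase, then the code of interest.
stream = ["idle_loop"] * 10 + ["process_rgb", "log_event", "convert_rgb_to_y", "later_code"]
records, wrapped = trace_until(stream, "convert_rgb_to_y", capacity=4)
print(records, wrapped)
```

This is why the viewer can report a wrapped buffer long before any data is shown: the interesting records only become final once the end address is reached or the core is halted.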


WS1 – Analyze trace result

13. Grab and drag column borders to resize as needed

14. Grab and drag column headers to reposition columns as required

15. Trace Viewer shows Program Addresses executed leading up to convert_rgb_to_y

Use this button to auto-fit all column width

Grab column edge and drag to resize column width

Grab column header and drag to move column


WS1 – View source code

1. In Trace Viewer, click on the record before convert_rgb_to_y

2. From Trace Viewer right-click-context-menu select Trace Viewer -> View Source Code

3. The file mcip_core.c opens at line 116, showing the source code corresponding to the program address in the selected record

4. Scroll down in Trace Viewer to the record containing convert_rgb_to_y and notice the function convert_rgb_to_y is highlighted in the source file


WS1 – Function execution graph

1. From Trace Viewer toolbar, select Analyze -> Function Execution Graph

2. Click on the + next to Function on y-axis to expand graph

3. Double-click on graph title to expand graph to full-screen

4. Click multiple times on the Zoom out button in the graph toolbar to see entire execution


WS1 – Function execution graph

5. Grab y-axis with mouse and drag to see more of the name of the functions

6. Place the mouse just below the x-axis and select the last part of the graph to zoom into the selected region

7. (Optional) From Function Execution Graph toolbar select Display Properties

8. (Optional) In the properties view, click on State/Event Categories tab, uncheck Visibility of functions that are not of interest, select OK. This will fit more of the graph in view


WS1 – Function execution graph

9. Note that the graph shows which function is executing, not function entry/exit

10. From the latter part of the graph, observe process_rgb making some UIA logging calls and then calling convert_rgb_to_y

11. Double-click on the graph title to collapse the full-screen view

12. Click anywhere in the graph to automatically scroll the Trace Viewer to the same cycle position

13. Click anywhere in the Trace Viewer to scroll the graph to the same cycle position

14. Click the Enable/Disable Grouping button on the graph toolbar to disable grouping

15. Now click anywhere in the graph and note that the Trace Viewer is no longer scrolled

Graph callouts: Function Name, Cycle count, Running Function, Sort Functions, Expand & Collapse, Enable/Disable Grouping


WS1 – Function execution graph

16. Click the measurement marker button in the Function Execution Graph toolbar, then click at the beginning of an instance of process_rgb in the graph. This inserts measurement marker 1 (X1)

17. Repeat the previous step, but this time click at the end of the same instance of process_rgb. This inserts measurement marker 2 (X2)

18. Look at the top left corner of the graph to see the number of cycles between X1 and X2

19. While holding the Shift key, use the mouse to select and drag X2. Notice the change in the number at the top left corner of the graph

20. Double-click on X2 to remove it

21. From the context menu select Remove All Measurement Marks to remove the remaining markers (in this case only X1)
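The cycle count reported between markers X1 and X2 converts to elapsed time once the core clock is known. A small sketch of that conversion; the marker positions are made-up values and the 1 GHz clock is an assumption, so substitute the frequency the Trace Viewer reports when data collection stops.

```python
def cycles_between(x1_cycle: int, x2_cycle: int) -> int:
    """Cycle distance between two measurement markers, regardless of order."""
    return abs(x2_cycle - x1_cycle)


def cycles_to_microseconds(cycles: int, cpu_hz: float) -> float:
    """Convert a CPU cycle count to microseconds at the given clock rate."""
    return cycles / cpu_hz * 1e6


CPU_HZ = 1.0e9  # assumption: 1 GHz core clock
delta = cycles_between(1_250_000, 1_740_000)  # hypothetical X1 and X2 positions
elapsed_us = cycles_to_microseconds(delta, CPU_HZ)
print(f"{delta} cycles = {elapsed_us:.1f} us")
```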


WS1 – Program address graph

1. From the Trace Viewer toolbar, select Analyze -> Program Address vs. Cycle

2. Click the graph zoom out button multiple times to see the entire range of program addresses executed

3. (Optional) While holding the ALT key, use the mouse to zoom into a selected region

4. (Optional) Use the zoom reset button on the graph toolbar to restore the original zoom


WS1: What did we learn?

• DSP trace provides real-time tracing of program execution

• Analyses are available in Trace Viewer to process the collected data

• Function Execution Graph provides a bird’s eye view of program execution

• Function Execution Graph can be used to measure the number of cycles between operations

• Program Address Graph shows what program addresses are executed

• Views have numerous features to help navigate the large volume of data that may be collected


WS2 Customize DSP trace for data tracing


WS2 – Tracing data access

1. Skip this step if continuing from WS1 or if the Image Processing Demo is already running. Otherwise, complete steps i to iii of Steps to Run Demo (as described in Workshop Setup)

2. Select C66xx_0 in the debugger

3. Open PC Trace Analysis from Tools -> Hardware Trace Analyzer -> PC Trace

4. If PC Trace was already running (from WS1), select Close PC Trace in the “Resource already in Use!” dialog that pops up (only one trace analysis can run on a CPU at any time)

5. Note that configuration from previous run is restored

6. In the Hardware Trace Configuration dialog select Advanced Settings


WS2 – Tracing data access

7. In the Advanced Properties dialog, note that this analysis has three trace “jobs”: a receiver (in this case ETB) and two trigger jobs, one to start trace and the other to end trace (when the program address at convert_rgb_to_y is executed)

8. Select PC Trace in the left column

9. In the right column, expand the Properties tree to the What to Trace properties and enable tracing of Write Data and Read Data

10. Select OK

11. Select Start in Hardware Trace Analysis Configuration


WS2 – Tracing data access

12. Complete step iv of Steps to Run Demo to process image

13. Wait for Trace Viewer to update with collected data

14. Note that Trace Viewer does not show Data Read and Data Write columns by default

15. In trace viewer toolbar click on Column Settings button

16. In Column Settings dialog, enable visibility of Read Data and Write Data

17. Select OK to exit the dialog

18. In Trace Viewer resize and reposition columns as required

19. Scroll through Trace Viewer to see what data was read/written


WS2 – Saving configuration

1. Click on the Analysis Properties button in the Trace Viewer toolbar. This reopens the configuration dialog, where properties can be modified and re-applied to the analysis

– We will not modify any properties; instead we will save the current configuration for future reuse

2. Press the Save button at the bottom of the configuration dialog

3. In Save Configuration dialog enter My Data Trace for Analysis Name then press Save

4. Press Cancel to exit Hardware Trace Analysis Configuration dialog

5. Close the Trace Viewer

6. Go to Tools -> Hardware Trace Analyzer -> Saved Analysis and note that My Data Trace is now available for reuse


WS2 – Sharing configuration

7. Create a c:\temp folder on your hard disk

8. Select Tools -> Hardware Trace Analyzer -> Saved Analysis -> My Data Trace

9. Click Export Analysis button at bottom of configuration dialog

10. Browse to c:\temp folder, select Save

11. Click on Delete button at bottom of configuration dialog to delete this saved analysis

12. Go to Tools -> Hardware Trace Analyzer. Note that Saved Analysis no longer exists

13. Select Tools -> Hardware Trace Analyzer -> Import Analysis …

14. Browse to c:\temp, select File Name My Data Trace.zip, click Open

15. Go to Tools -> Hardware Trace Analyzer. Note that Saved Analysis now exists and contains My Data Trace


WS2: What did we learn?

• Trace can be used to monitor what data addresses and values are accessed

• Trace configuration can be further customized using Advanced Settings

• Configurations can be saved for reuse

• Saved configurations can be exported/imported


WS3 Hotspot analysis with function, stall and cache profiler


WS3 – Running function profiler

1. If the Image Processing Demo is not running, complete steps i to iii of Steps to Run Demo

2. Select C66xx_0 in the debugger

3. Open Function Profiler from Tools -> Hardware Trace Analyzer -> Function Profiling

4. Click the reset button to reset to original settings

5. Change Transport/Receiver Type to Pro Trace with Buffer Size 1MB


WS3 – Running function profiler

6. Click Data Collection Settings to expand

7. Select Start and Stop at Address for Trace Range. Note: reaching the Stop Address does not end the trace; it only pauses collection until the Start Address is encountered again

8. Set Start Address = IMG_sobel_3x3_8 and End Address = MultiProc_self

9. Select Start

10. Complete step iv of Steps to Run Demo

11. Wait for the demo to complete

12. Press Stop in the Trace Viewer toolbar

13. Note that the Trace Viewer and Exclusive Function Profiler are processing data

Known Issue: The next time this use case is run, End Address will equal Start Address. End Address will have to be modified explicitly.
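The "Start and Stop at Address" range acts as a gate on the trace stream: collection switches on at the start address, off at the stop address, and on again the next time the start address executes. A minimal sketch of that gating logic; whether the stop record itself is captured, and the symbol names other than IMG_sobel_3x3_8 and MultiProc_self, are assumptions of the sketch.

```python
def gated_trace(pc_stream, start_symbol, stop_symbol):
    """Collect records only between start_symbol and stop_symbol (inclusive).

    Reaching stop_symbol pauses collection; it resumes at the next start_symbol.
    """
    collecting = False
    collected = []
    for symbol in pc_stream:
        if symbol == start_symbol:
            collecting = True
        if collecting:
            collected.append(symbol)
        if symbol == stop_symbol:
            collecting = False
    return collected


stream = [
    "main", "IMG_sobel_3x3_8", "kernel_body", "MultiProc_self",
    "other_work", "IMG_sobel_3x3_8", "kernel_body", "MultiProc_self", "tail",
]
windowed = gated_trace(stream, "IMG_sobel_3x3_8", "MultiProc_self")
degenerate = gated_trace(stream, "IMG_sobel_3x3_8", "IMG_sobel_3x3_8")
print(windowed)
print(degenerate)
```

In this model, setting the stop address equal to the start address collapses each window to a single record, which suggests why the End Address has to be corrected by hand when the known issue resets it.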


WS3 – Analyzing function profile results

1. Wait for Trace Viewer and Exclusive Function Profiler processing to complete

2. Resize the column widths of the Exclusive Function Profiler view as needed

3. Click on the CPU Cycle Total column header twice to sort the data in descending order

4. Note that assembly routines are shown as unknown_<address of first symbol above>_<address of first symbol below – 1>_<first symbol above>. Explicit names can be provided via an XML file specified in the preferences

5. Note that two functions, IMG_thr_le2min_8() and IMG_sobel_3x3_8(), take ~99% of the time
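The sorted view described above amounts to summing cycles per function and ranking the totals. A rough sketch of that aggregation, assuming the trace has already been reduced to (function, cycles) intervals; the interval data and cycle counts are invented for illustration and are not measurements from the demo.

```python
from collections import defaultdict


def exclusive_profile(samples):
    """Aggregate (function_name, cycles) intervals into a sorted profile.

    Returns rows of (name, total_cycles, percent_of_total), largest first.
    """
    totals = defaultdict(int)
    for name, cycles in samples:
        totals[name] += cycles
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows = [(name, cycles, 100.0 * cycles / grand_total) for name, cycles in totals.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


samples = [  # hypothetical cycle counts, for illustration only
    ("IMG_sobel_3x3_8", 400_000),
    ("IMG_thr_le2min_8", 700_000),
    ("MultiProc_self", 5_000),
    ("IMG_sobel_3x3_8", 290_000),
    ("IMG_thr_le2min_8", 600_000),
    ("process_rgb", 5_000),
]
profile = exclusive_profile(samples)
for name, cycles, percent in profile:
    print(f"{name:20s} {cycles:>10d} {percent:6.2f}%")
```

Ranking by exclusive cycles is what makes hotspots such as the two IMGLIB kernels stand out immediately at the top of the table.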


WS3 – Analyzing function profile results

6. Scroll to the right in the Exclusive Function Profiler table

7. Observe that ~92% (1.2M cycles) of IMG_thr_le2min_8() time was a result of pipeline stalls
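As a quick sanity check on step 7, the two rounded figures quoted on the slide imply an approximate total cycle count for IMG_thr_le2min_8(). This is only back-of-the-envelope arithmetic on rounded numbers; the variable names are illustrative.

```python
stall_cycles = 1_200_000   # ~1.2M stall cycles (rounded figure from the slide)
stall_share = 0.92         # ~92% of the function's time

# If stalls are 92% of the total, the total is stalls / 0.92.
implied_total = stall_cycles / stall_share
implied_non_stall = implied_total - stall_cycles

print(f"implied total cycles:     {implied_total:,.0f}")
print(f"implied non-stall cycles: {implied_non_stall:,.0f}")
```

In other words, only on the order of a hundred thousand cycles of this function are spent doing useful work, which is what motivates the stall and cache analyses that follow.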

WS3 – Running stall profiler

1. Select Tools -> Hardware Trace Analyzer -> Function Profiling (C66xx_0) -> Close Session to close the currently running Function Profiler

2. Open Stall Profiler from Tools -> Hardware Trace Analyzer -> Stall Profiling

WS3 – Running stall profiler

3. Click to reset to original settings

4. Change Transport/Receiver Type to Pro Trace with Buffer Size 1MB

5. Click on Advanced Settings to set up the Start/Stop condition (Start/Stop support will be added to the configuration dialog in the next release)

WS3 – Running stall profiler

6. Select Pipeline Stall Analysis trigger in left column of Advanced Properties dialog

7. Expand Properties tree in left column and change Actions to Start Trace and Location to IMG_sobel_3x3_8

8. Expand Global Category to see what events are collected by default

9. Click the icon in the left margin to add another trigger (the default name can be changed)

10. Select Trigger2 in the left column and change Actions to End Trace and Location to MultiProc_self. Click somewhere else in property view to allow symbol to be evaluated

WS3 – Running stall profiler

11. Select OK in Advanced Properties dialog

12. Select Start in Hardware Trace Analysis Configuration dialog

13. Complete step iv of Steps to Run Demo

14. Wait for Demo to complete

15. Press Stop in Trace Viewer toolbar

16. Wait for Trace Viewer and Stall Cycle Profiler processing to complete

WS3 – Analyzing stall profiler results

1. Observe that ~1M of the stall cycles for IMG_thr_le2min_8() result from L1D read misses

WS3 – Running cache analysis

1. Open Cache Analyzer from Tools -> Hardware Trace Analyzer -> Cache Analyzer

2. Close Stall Profiler when requested

3. Click to reset to original settings

4. Change Transport/Receiver Type to Pro Trace with Buffer Size 1MB

5. Expand Data Collection Settings and select L1D Cache Miss Analysis

6. Click on Advanced Settings

WS3 – Running cache analysis

7. Select the L1D Cache Miss Analysis trigger in the left column

8. Expand the Properties tree and change Actions to Start Trace and Location to IMG_sobel_3x3_8

9. Expand Global Category to see what events are collected by default

10. Click the icon in the left margin to add another trigger

11. Select Trigger2 in the left column and change Actions to End Trace and Location to MultiProc_self

WS3 – Running cache analysis

12. Select OK in Advanced Properties dialog

13. Select Start in Hardware Trace Analysis Configuration dialog

14. Complete step iv of Steps to Run Demo

15. Wait for Demo to complete

16. Press Stop in Trace Viewer toolbar

17. Wait for Trace Viewer and Cache Event Profiler processing to complete

WS3 – Analyzing cache results

1. Observe that the 1.1M L1D Read Miss cycles of IMG_thr_le2min_8() result from 16320 cache misses
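Dividing the two figures above gives a rough average cost per L1D read miss. The inputs are the rounded numbers from the slide, so the result is only an estimate; the variable names are illustrative.

```python
l1d_read_miss_cycles = 1_100_000   # ~1.1M stall cycles attributed to L1D read misses
l1d_read_misses = 16_320           # number of L1D read misses reported

# Average stall cost of a single miss, in CPU cycles.
cycles_per_miss = l1d_read_miss_cycles / l1d_read_misses
print(f"average stall per L1D read miss: {cycles_per_miss:.1f} cycles")
```

An average in the high tens of cycles per miss suggests the data is being fetched from memory well beyond L1D, which is consistent with the DDR and MSMC analyses in the later workshops.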

WS3 – Using files to view data across analysis

1. Open Function Profiler from Tools -> Hardware Trace Analyzer -> Function Profiling

2. Close Cache Analyzer when requested

3. Set End Address = MultiProc_self. This works around the known issue where End Address is not remembered.

4. Select Start

5. Complete step iv of Steps to Run Demo

6. Wait for Demo to complete

7. Press Stop in Trace Viewer toolbar

8. Wait for Trace Viewer and Exclusive Function Profiler processing to complete

9. Create a c:\temp folder on your hard disk if not already present

10. Select Save in Trace Viewer toolbar

11. Browse to c:\temp, Specify File Name mytrace and select Save

WS3 – Using files to view data across analysis

12. Click Start in Trace Viewer toolbar to restart tracing

13. Repeat steps 5 to 8 above to profile the application again

14. Select Open File from Tools -> Hardware Trace Analyzer -> Open File

15. Browse to c:\temp, select File Name mytrace.tdf and select Open

16. In Trace Viewer – mytrace.tdf select Analyze -> Exclusive Function Profiling

17. The current profile result can now be visually compared with the saved result

WS3 – Exporting data

1. In the Exclusive Function Profile – C66xx_0 view, right-click and select Data -> Export All … from the context menu to export all records

2. (Optional) In the Export Data Dialog Add/Remove columns to export

3. (Optional) In the Export Data Dialog use the Move button to rearrange order in which columns are to be exported

4. Browse to c:\temp folder, specify File Name myexporttrace and select Save, then select OK to export all records

5. Data is exported in CSV format which can be consumed by CCS and other tools such as Excel
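Because the export is plain CSV, it can also be post-processed with a short script. The sketch below is a minimal example under stated assumptions: the "CPU Cycle Total" column name is taken from the profiler view described earlier, while the "Function" column name and the sample rows are invented for illustration and may not match a real export.

```python
import csv
import io

# Stand-in for c:\temp\myexporttrace.csv; the rows and the "Function"
# column name are invented for illustration.
sample = io.StringIO(
    "Function,CPU Cycle Total\n"
    "func_a,900\n"
    "func_b,100\n"
)


def cycle_shares(handle, name_col="Function", cycles_col="CPU Cycle Total"):
    """Return (name, fraction of total cycles) pairs, largest first."""
    rows = list(csv.DictReader(handle))
    total = sum(int(r[cycles_col]) for r in rows)
    shares = [(r[name_col], int(r[cycles_col]) / total) for r in rows]
    return sorted(shares, key=lambda item: item[1], reverse=True)


shares = cycle_shares(sample)
for name, share in shares:
    print(f"{name}: {share:.1%}")
```

To use it on a real export, pass an open file handle instead of the in-memory sample and adjust the column names to whatever the exported header row contains.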

WS3 – Importing data

1. Select Open File from Tools -> Hardware Trace Analyzer -> Open File

2. At the bottom-right corner of the Open Trace File dialog, select CSV trace data file (*.csv)

3. Browse to c:\temp, select File Name myexporttrace.csv, and select Open

4. The data from the CSV file is now visible in the Trace Viewer

WS3 – Using analysis dashboard

1. Open the Analysis Dashboard from Tools -> Hardware Trace Analyzer -> Analysis Dashboard

2. Observe features of Dashboard shown below

3. Select Remove All to remove all running analyses

Dashboard features (callouts from the slide screenshot):

• List of all running analyses

• Run additional analysis

• Delete selected analysis

• Delete all analyses

• Expand/collapse all nodes

• Enable/disable analysis (disabling frees up all hardware resources)

• Open configuration dialog

• Data source of the analysis

• Click to collapse/expand

• Double-click to open/select a view

WS3- What did we learn?

• DSP Trace can be used to profile hotspots in application

• Hotspots can further be analyzed using stall and cache profiling

• Data can be saved to binary file to use for comparison with future results or to share with others

• Data can be exported/imported via CSV file

• Analysis Dashboard provides access to all analysis

WS4 Getting started with non-intrusive system trace (STM) SoC profiling

WS4- Setup memory throughput analysis (i)

1. Continue from WS3

2. Go to Tools -> Hardware Trace Analyzer -> Memory Throughput and Access Analysis

3. Select Transport Type: Pro Trace, Buffer Type: Stop-on-full, Buffer Size: 64 MB, and Number of Pins: 4 pin. Then go to Advanced Settings

WS4- Setup memory throughput analysis (ii)

4. By default, DDR3 memory throughput will be captured.

5. For DDR3, under Transaction Master, enable only C66x_0 (core 0) and disable all other masters.

6. Click OK.

7. Click Start to set up the trace.

8. Run all the cores from CCS, if not already running. Then run the demo by following steps iv and v of Steps to Run Demo.

WS4- Analysis view

9. Click Stop in the Trace Viewer tab

10. DDR CP tracer messages are captured in the Trace Viewer tab

WS4- View core0 DDR3 bandwidth utilization

11. Select the Memory Throughput – CSSTM_0 tab, select DDR:CPU, and zoom to the portion of the graph where the image is being processed

WS4- View system DDR3 bandwidth utilization

12. Now select DDR:All Bus Masters

WS4- View system DDR3 Latency

13. Select the Minimum Average Latency – CSSTM_0 tab and zoom to the portion of the graph where the image is being processed

WS4- What did we learn?

• On the MCSDK image processing demo, we compared Core0’s DDR3 bandwidth usage with the complete system’s DDR3 bandwidth usage.

• On the MCSDK image processing demo, we captured the system’s DDR3 latency.

• We were able to set up and run DDR3 memory performance and access analysis for the MCSDK image processing demo.

WS5 Customize STM SoC profiling for DDR and MSMC bandwidth and latency analysis

WS5- Setup (i)

1. Continue from WS4

2. Click Analysis Properties to bring up the setup configuration dialog box

3. Go to Advanced Settings

WS5- Setup (ii)

4. By default, DDR3 memory throughput will be captured.

5. Add a custom trigger for capturing MSMC memory throughput.

6. In Properties, set Transaction Monitor to MSMC_0 and Average Access Size/Rate to false.

7. Click OK.

8. Click Start to set up the trace.

9. Run all the cores from CCS, if not already running. Then run the demo by following steps iv and v of Steps to Run Demo.

WS5- Analysis view

10. Click Stop in the Trace Viewer tab

11. DDR and MSMC CP tracer messages are captured in the Trace Viewer tab

WS5- View core0 DDR3 bandwidth utilization

12. Select the Memory Throughput – CSSTM_0 tab, select DDR:CPU, and zoom to the portion of the graph where the image is being processed

WS5- View system DDR3 bandwidth utilization

13. Now select DDR:All Bus Masters:

WS5- View cores MSMC_0 bandwidth utilization

14. Now select MSMC_0:CPU

WS5- View system MSMC_0 bandwidth utilization

15. Now select MSMC_0:All Bus Masters

WS5- What did we learn?

• How to customize the non-intrusive SoC profiling (memory performance and access analysis) job to add MSMC memory bandwidth measurement to the default DDR memory bandwidth job.

• On the MCSDK image processing demo, we compared Core0’s DDR3 bandwidth usage with the complete system’s DDR3 bandwidth usage.

• On the MCSDK image processing demo, we compared the MSMC (bank0) bandwidth usage of all cores (0-3) with the complete system’s MSMC (bank0) bandwidth usage.

• As with MSMC_0, bandwidth can be measured at any other memory end point (core0 L2, core1 L2, ...).

WS6 DDR bandwidth and latency analysis by interfacing embedded APIs (cToolsLib)

cToolsLib Software

Layers shown in the slide diagram, from the hardware up:

• Keystone Debug and Trace HW

• AETLib, DSPTraceLib, ETBLib, CPTLib, STMLib

• Ctools Use-Case Library (Ctools_UCLib)

– CP Tracer profiling: system bandwidth, system latency, master bandwidth, total bandwidth, event profiling

– PC Trace: trace capture on an exception; start and stop PC + timing trace

– Memory watch: capture a list of unintended accesses to a particular memory range; raise an exception on the first unintended access

– Statistical profiling

• Application SW

• The cToolsLib package is a collection of libraries that provide access to Keystone debug and trace features via software APIs.

• cToolsLib information and APIs are available at http://processors.wiki.ti.com/index.php/CToolsLib

• MCSDK also comes with the cToolsLib package and a Use Case library that encapsulates cToolsLib APIs into high-level use cases for easy integration (c:\ti\ctoolslib_1_0_0_2)

WS6- Setup (i)

1. cToolsLib instrumentation APIs are embedded for the master core (core0). This workshop uses the image processing demo with cToolsLib instrumentation.

2. Power cycle the C6670 EVM.

3. Follow all the steps in slide Steps to Run Demo (ii)

4. Connect the cores

5. Load image_processing_evmc6670l_total_bandwidth_master.out (mcsdk_2_01_02_06\demos\image_processing\ipc\evmc6670l\master\total_bandwidth\Debug) on core0.

6. Load image_processing_evmc6670l_slave.out (mcsdk_2_01_02_06\demos\image_processing\ipc\evmc6670l\slave\no_instrumentation\Debug) on cores 1, 2, and 3.

WS6- Setup (ii)

6. Go to Tools -> Hardware Trace Analyzer -> Custom System Trace and select Transport Type: Pro Trace, Buffer Type: Stop-on-full, Buffer Size: 64 MB, and Number of Pins: 4 pin

7. Click Start; a Trace Viewer – CSSTM_0 tab opens.

8. Run all the cores, then run the demo by following steps iv and v of Steps to Run Demo

WS6- Analysis view

9. Click Stop in the Trace Viewer tab

10. DDR CP tracer messages are captured in the Trace Viewer tab

WS6- View core0 DDR3 bandwidth utilization

11. From the Trace Viewer tab, select Analyze -> Memory Throughput

WS6- View system DDR3 bandwidth utilization

12. Now select DDR:All Bus Masters:

WS6- View system DDR3 Latency

13. From the Trace Viewer tab, select Analyze -> Minimum Average Latency

WS6- What did we learn?

• How to use cToolsLib embedded APIs to perform non-intrusive system trace (STM) SoC profiling.

• On the MCSDK image processing demo, we compared Core0’s DDR3 bandwidth usage with the complete system’s DDR3 bandwidth usage.

• On the MCSDK image processing demo, we captured the system’s DDR3 latency.

• Using cToolsLib embedded APIs, we were able to set up and run DDR3 memory performance and access analysis for the MCSDK image processing demo.

WS7 Command line trace decode interface for offline decode

WS7- Invoking command line decoder

1. This workshop uses TD (http://processors.wiki.ti.com/index.php/TD), a command line decoder that converts hardware trace into a human-readable format.

2. Open a command line shell and change directory to <CCS>\ccsv5\ccs_base\emulation\analysis\bin

3. Use the trace file (TDF) saved in WS3 for offline decode

4. Type the following command

td -procid 66x -bin C:\temp\mytrace.tdf -app C:/ti/MCSDK_2_01_02_06/demos/image_processing/ipc/evm6670l/image_processing_evmc6670l_master.out -rcvr Pro -format CSV_NO_TPOS_QUOTE -columns "Program Address,Cycles,Trace Status" -timestamp abs -output mytrace.csv

Notes:

Without the -output option, the output is written to stdout.

If you do not specify the -columns option, all columns are output. Help is available with the -help option.
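When the decoder is run repeatedly (for example over several saved TDF files), it can help to assemble the argument list in a script. The sketch below only rearranges the options that appear in the command in step 4, assuming the dashes there are plain hyphens; the helper name build_td_command and its parameter names are hypothetical, and nothing here is part of the TD tool itself.

```python
def build_td_command(tdf_path, app_path, columns, output=None,
                     procid="66x", receiver="Pro",
                     fmt="CSV_NO_TPOS_QUOTE", timestamp="abs"):
    """Assemble a td argument list from the options used in this workshop."""
    cmd = [
        "td",
        "-procid", procid,
        "-bin", tdf_path,
        "-app", app_path,
        "-rcvr", receiver,
        "-format", fmt,
        "-columns", ",".join(columns),   # one comma-separated argument
        "-timestamp", timestamp,
    ]
    if output is not None:
        cmd += ["-output", output]       # omit to let td write to stdout
    return cmd


cmd = build_td_command(
    r"C:\temp\mytrace.tdf",
    "C:/ti/MCSDK_2_01_02_06/demos/image_processing/ipc/evm6670l/"
    "image_processing_evmc6670l_master.out",
    ["Program Address", "Cycles", "Trace Status"],
    output="mytrace.csv",
)
print(cmd)
```

Passing the list to subprocess.run(cmd, check=True) from the td directory would then invoke the decoder; using a list rather than a single string avoids quoting problems with the column names, which contain spaces.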

WS7- Visualizing the output

1. The slide shows the output from the command line decoder opened in MS Excel. It includes all the information.

2. If needed, the output CSV file can also be imported into CCS for further analysis.

WS7- Further processing the output

1. The output of the tool can be piped to other general-purpose post-processing tools for custom command line processing.

td -procid 66x -bin C:\temp\mytrace.tdf -app C:/ti/MCSDK_2_01_02_06/demos/image_processing/ipc/evm6670l/image_processing_evmc6670l_master.out -rcvr Pro -format CSV_NO_TPOS_QUOTE -columns "Program Address,Cycles,Trace Status" -timestamp delta | grep "Pipeline stall"
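On hosts without grep, the same filtering can be done in a few lines of script. The sketch below mimics a fixed-string match on "Pipeline stall"; the sample rows are invented to resemble the three requested columns, and real decoder output may be laid out differently.

```python
def filter_lines(lines, needle="Pipeline stall"):
    """Yield only the lines that contain needle, like a fixed-string grep."""
    for line in lines:
        if needle in line:
            yield line.rstrip("\n")


# Invented sample rows (Program Address, Cycles, Trace Status);
# real td output may differ.
sample_output = [
    "0x00801000,12,Pipeline stall\n",
    "0x00801004,1,\n",
    "0x00801008,40,Pipeline stall\n",
]

matches = list(filter_lines(sample_output))
print(len(matches), "matching rows")
for row in matches:
    print(row)
```

To use it in a pipeline, iterate over sys.stdin instead of the sample list and pipe the decoder output into the script.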

WS7- What did we learn?

• How to invoke the command line decoder for offline decode of trace captured elsewhere.

• The output can also be imported into CCS for analysis.

• The output can be stored in a text file or piped to other tools for additional processing.

WS8 Function profiling using Cortex A program trace

WS8- Setup (i)

1. In CCS, set up a C6614 target configuration with an XDS560v2 Pro Trace USB connection

– Go to File -> New -> Target Configuration File

– Type the file name C6614_XDS560v2 and click Finish

– Select Spectrum Digital XDS560V2 STM USB Emulator as the Connection

– Type C6614 in the Device field to filter the device names, then check TMS320C6614

– Click on Target Configuration from Advanced Setup (right-hand side)

– Select View -> Target Configurations to see a list of all configuration files.

– Select the one you just created (C6614_XDS560v2.ccxml) under User Defined.

– Launch the debug session by selecting Launch Selected Configuration in the context menu.

2. Connect Cortex A8 core

WS8- Setup (ii)

3. Download the Cortex A8 demo example (A8.zip) and unzip the files on your PC.

4. Load modem.out to the Cortex A8 from the downloaded example.

5. From the menu, select Tools -> Hardware Trace Analyzer -> PC Trace to start PC Trace

6. Click Start to set up the trace and open Trace Viewer.

WS8- Tracing program execution

1. Now run the application from the Debug view (Resume/F8) and then halt or suspend the execution

2. Program trace shows up in the Trace Viewer

WS8- Profiling and execution flow graph

1. Function profiling can be run on the program execution trace by selecting Analyze -> Exclusive Function Profiler

2. The Function Execution Graph can be launched by selecting Analyze -> Function Execution Graph

WS8- What did we learn?

• Can use Cortex A8 ETM trace to get real-time trace of program execution

• Analysis is available in Trace Viewer to run on collected data

• Profiling analysis provides summary of executed functions and cycles spent

• Function Execution Graph provides a bird’s eye view of program execution

• Can use Function Execution Graph to measure the number of cycles between operations

SerDes based trace receiver - preview

XDS SerDes Trace Integration Kit (TRIK)

• A low-cost, high-speed trace receiver built from an off-the-shelf Keystone EVM

– Uses available SERDES functional interfaces (e.g. Hyperlink)

– EVM DDR memory as trace buffer (e.g. 512 MB with Nyquist EVM)

– Comes with deployable TRIK embedded software (source and binary form)

• Supports simultaneous trace export from multiple cores (DSP, ARM, and STM).

• Seamless CCS integration.

[Diagram labels: Target SoC; Hyperlink cable; EVM as XDS SERDES trace receiver (Trace receiver DSP, DDR3 Memory); Ethernet cable; Host PC (running CCS); XDS100 or XDS200 (JTAG over USB or Ethernet)]

XDS SerDes Trace Integration Kit (TRIK)

• Non-intrusive to target DDR memory (in terms of both memory and read/write bandwidth).

• Needs dedicated target DMA resources, with very little impact on target TeraNet data movement.

[Diagram labels, target side: Target Device which is generating Trace samples, C6670 (Nyquist); CPU and Trace Sub-System with CORE 0 to CORE 3 and STM, each with its own ETB (CORE ETB 0 to CORE ETB 3, STM ETB); DMA Controller and TeraNet (Chip Infrastructure); DDR3 System Memory; Hyperlink]

[Diagram labels, receiver side: EVM as a low-cost Trace Receiver with required amount of DDR3 memory, C6670 (Nyquist) or low-cost C6657 (Gauss); Hyperlink; DMA Controller and TeraNet (Chip Infrastructure); DDR3 System Memory (2GB); Ethernet, PCIe, or USB to the Host Computer (for decoding and display)]

[Diagram labels, link and legend: Hyperlink Cable, 4 lanes @ 12.5 Gbps per lane (50 Gbps); Trace Fill Path; Trace Drain Path]

XDS SerDes Trace Integration Kit (TRIK)

[Diagram labels, Target DSP (Trace generator): ETB0, ETB1, ETB2, ETB7, STM ETB; EDMA; MSMC memory holding ETB0 Buffer, ETB1 Buffer, ETB2 Buffer, ETB7 Buffer, and STM ETB Buffer (each circular); Hyperlink Master, Hyperlink Slave, Hyperlink PHY]

[Diagram labels, Trace Receiver DSP: Hyperlink Master, Hyperlink Slave, Hyperlink PHY; DDR3 memory holding ETB0 Buffer, ETB1 Buffer, ETB2 Buffer, ETB7 Buffer, and STM ETB Buffer (each circular or stop on full)]

[Diagram labels, transfers: Hyperlink push; Hyperlink pull]

Q & A
