lecture 7. amba

Post on 04-Feb-2016

DESCRIPTION

COMP427 Embedded Systems. Lecture 7. AMBA. Prof. Taeweon Suh, Computer Science Education, Korea University. AMBA: Advanced Microcontroller Bus Architecture, an on-chip bus protocol from ARM.

TRANSCRIPT

  • Lecture 7. AMBA
    Prof. Taeweon Suh
    Computer Science & Engineering
    Korea University
    COMP427 Embedded Systems

    Korea Univ

    AMBA
      - Advanced Microcontroller Bus Architecture
      - On-chip bus protocol from ARM
      - On-chip interconnect specification for the connection and management of functional blocks, including processors and peripheral devices
      - Introduced in 1996
      - AMBA is a registered trademark of ARM Limited
      - AMBA is an open standard
      Source: Wikipedia

    AMBA History (source: Wikipedia)

    AMBA (1996)
      - ASB
      - APB

    AMBA 2 (1999)
      - AHB: widely used on ARM7, ARM9 and ARM Cortex-M based designs
      - ASB
      - APB2 (or APB)

    AMBA 3 (2003)
      - AXI3 (or AXI v1.0): widely used on ARM Cortex-A processors, including Cortex-A9
      - AHB-Lite v1.0
      - APB3 v1.0
      - ATB v1.0

    AMBA 4 (2010)
      - ACE: widely used on the latest ARM Cortex-A processors, including Cortex-A7 and Cortex-A15
      - ACE-Lite
      - AXI4
      - AXI4-Lite
      - AXI-Stream v1.0
      - ATB v1.1
      - APB4 v2.0

    Abbreviations
      - ACE: AXI Coherency Extensions
      - AXI: Advanced eXtensible Interface
      - AHB: Advanced High-performance Bus
      - ASB: Advanced System Bus
      - APB: Advanced Peripheral Bus
      - ATB: Advanced Trace Bus

    ASB (figure; source: AMBA Specification V2.0)

    ASB (figure: Hardware Devices 0 to 5 attached to a shared ASB)

    AHB (figure; source: AMBA Specification V2.0)

    AHB with 3 Masters and 4 Slaves (figure; source: AMBA Specification V2.0)
      - The H prefix indicates AHB signals

    AHB Basic Transfer Example with Wait (timing diagram; source: AMBA Specification V2.0)
      - HREADY source: slave
      - The diagram shows both write data and read data

    AHB Burst Transfer Example (timing diagram; source: AMBA Specification V2.0)
      - HREADY source: slave

    AHB Split Transaction (source: AMBA Specification V2.0)
      - If a slave decides that it may take a number of cycles to obtain and provide data, it gives a SPLIT transfer response
      - The arbiter then grants use of the bus to other masters
      - HRESP: transfer response from the slave (OKAY, ERROR, RETRY, or SPLIT)
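
    The SPLIT behaviour above can be sketched as a toy arbiter model. This is only an illustration, not the AHB arbiter itself: the class and method names (`ToyArbiter`, `grant`, `on_response`, `unsplit`) are made up here, and the fixed lowest-number-first priority is an assumption, since the slide does not name an arbitration scheme.

```python
from enum import Enum


class HResp(Enum):
    """The four AHB transfer responses named on the slide."""
    OKAY = 0
    ERROR = 1
    RETRY = 2
    SPLIT = 3


class ToyArbiter:
    """Hypothetical arbiter: the lowest-numbered eligible master wins."""

    def __init__(self):
        self.masked = set()  # masters waiting on a split transfer

    def grant(self, requesting):
        """Return the granted master number, or None if nobody is eligible."""
        eligible = [m for m in sorted(requesting) if m not in self.masked]
        return eligible[0] if eligible else None

    def on_response(self, master, resp):
        """A SPLIT response takes the master out of arbitration."""
        if resp is HResp.SPLIT:
            self.masked.add(master)

    def unsplit(self, master):
        """The slave is ready again, so the master may compete for the bus."""
        self.masked.discard(master)


arbiter = ToyArbiter()
print(arbiter.grant({0, 1}))             # master 0 wins first
arbiter.on_response(0, HResp.SPLIT)      # slave splits master 0's transfer
print(arbiter.grant({0, 1}))             # bus goes to master 1 meanwhile
arbiter.unsplit(0)                       # slave can now complete the transfer
print(arbiter.grant({0, 1}))             # master 0 is eligible again
```

    The point of the sketch is that a slow slave does not hold the bus: the split master simply drops out of arbitration until the slave is ready.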

    APB Write/Read (timing diagrams; source: AMBA Specification V2.0)

    AXI v1.0 (source: AMBA AXI Specification V1.0)
      - The AMBA AXI protocol is targeted at high-performance, high-frequency system designs

    AXI key features
      - Separate address/control and data phases
      - Support for unaligned data transfers using byte strobes
      - Separate read and write data channels to enable low-cost Direct Memory Access (DMA)
      - Ability to issue multiple outstanding addresses
      - Out-of-order transaction completion
      - Easy addition of register stages to provide timing closure

    5 Independent Channels

    Read address channel and write address channel
      - Variable-length burst: 1 to 16 data transfers
      - Burst with a transfer size of 8 to 1024 bits (1B to 128B)

    Read data channel
      - Conveys data and any read response information
      - Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide

    Write data channel
      - Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide

    Write response channel
      - Conveys write response information
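
    To make the burst limits above concrete, the following sketch computes how many bytes one burst moves. The function name `burst_bytes` and its parameter names are invented for this example; only the ranges (1 to 16 transfers, 1 to 128 bytes per transfer) come from the slide, and the power-of-two restriction reflects the listed widths.

```python
# Transfer sizes in bytes: 8 to 1024 bits, in powers of two.
VALID_SIZES = [1 << n for n in range(8)]  # 1, 2, 4, ..., 128


def burst_bytes(length, size_bytes):
    """Return the total bytes moved by one burst (illustrative helper)."""
    if not 1 <= length <= 16:
        raise ValueError("burst length must be 1 to 16 transfers")
    if size_bytes not in VALID_SIZES:
        raise ValueError("transfer size must be a power of two, 1 to 128 bytes")
    return length * size_bytes


print(burst_bytes(4, 4))     # four 32-bit transfers
print(burst_bytes(16, 128))  # the largest burst these limits allow
```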

    AXI Read Operation (timing diagram; source: AMBA AXI Specification V1.0)
      - Channels shown: read address channel, read data channel
      - RREADY: driven by the master to indicate that it can accept the read data and response information
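
    RREADY is one half of a two-way handshake: in AXI, a transfer on a channel takes place only in a clock cycle where the source's VALID and the destination's READY are both high. The sketch below applies that rule to a per-cycle trace. The function name and the sample traces are made up for illustration.

```python
def transfer_cycles(valid, ready):
    """Return the cycle numbers in which VALID and READY are both high."""
    return [cycle for cycle, (v, r) in enumerate(zip(valid, ready)) if v and r]


# Hypothetical read data channel trace, one entry per clock cycle.
rvalid = [0, 1, 1, 1, 0, 1]  # slave has data to return
rready = [1, 0, 1, 1, 1, 1]  # master can accept data

print(transfer_cycles(rvalid, rready))  # prints [2, 3, 5]
```

    In cycle 1 the slave offers data but the master is not ready, so the slave holds the data until cycle 2.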

    AXI Write Operation (timing diagram; source: AMBA AXI Specification V1.0)
      - Channels shown: write address channel, write data channel, write response channel
      - WVALID source: master
      - WREADY source: slave
      - BVALID source: slave
      - BREADY source: master

    Out-of-order Completion (source: AMBA AXI Specification V1.0)
      - AXI gives an ID tag to every transaction
      - Transactions with the same ID are completed in order
      - Transactions with different IDs can be completed out of order

    ID Signals (figure; source: AMBA AXI Specification V1.0)
      - Channels shown: write address channel, write data channel, write response channel, read address channel, read data channel

    Out-of-order Completion (source: AMBA AXI Specification V1.0)

    Out-of-order transactions can improve system performance in 2 ways
      - Fast-responding slaves can respond in advance of earlier transactions with slower slaves
      - Complex slaves can return data out of order: a data item for a later access might be available before the data for an earlier access

    If a master requires that transactions complete in the same order that they are issued, they must all have the same ID tag

    Out-of-order completion is not a required feature
      - Simple masters and slaves can process one transaction at a time, in the order they are issued
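
    The ordering rule can be expressed as a small check: a completion order is acceptable if, for each ID tag, the transactions finish in the order they were issued. This is a sketch of the rule only; the function name and the `(id_tag, name)` tuples are invented for the example.

```python
from collections import defaultdict


def is_legal_completion(issued, completed):
    """Check same-ID ordering for lists of (id_tag, name) tuples."""
    if sorted(issued) != sorted(completed):
        return False  # something was dropped, duplicated, or invented

    issued_by_id = defaultdict(list)
    done_by_id = defaultdict(list)
    for tag, name in issued:
        issued_by_id[tag].append(name)
    for tag, name in completed:
        done_by_id[tag].append(name)

    # Within each ID the order must match; across IDs any interleaving is fine.
    return issued_by_id == done_by_id


issued = [(0, "A"), (1, "B"), (0, "C")]
print(is_legal_completion(issued, [(1, "B"), (0, "A"), (0, "C")]))  # True
print(is_legal_completion(issued, [(0, "C"), (1, "B"), (0, "A")]))  # False
```

    In the first case B overtakes A, which is allowed because they carry different IDs. In the second, C overtakes A although both use ID 0.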

    Addition of Register Slices (source: AMBA AXI Specification V1.0)
      - AXI enables the insertion of a register slice in any channel, at the cost of an additional cycle of latency
      - This is a trade-off between latency and maximum frequency

    It can be advantageous to use
      - A direct and fast connection between a processor and high-performance memory
      - Simple register slices to isolate a longer path to less performance-critical peripherals

    Backup Slides

    A Computer System (figure: CPU, North Bridge, South Bridge, main memory (DDR2), FSB (Front-Side Bus), DMI (Direct Media Interface), hard disk, USB, PCIe card, I/O devices, graphics card)

    A Typical I/O System Schematic (Simplified) (figure: CPU core, cache, memory controller, main memory, memory bus and I/O bus, I/O controllers for disks, graphics card and network, interrupts)

    I/O Interconnection

    A bus is a shared communication link
      - A single set of wires used to connect multiple components
      - Composed of an address bus, a data bus, and a control bus (read/write)

    Advantages
      - Versatile: new devices can be added easily and can be moved between computer systems that use the same bus standard
      - Low cost: a single set of wires is shared in multiple ways

    Disadvantages
      - Communication bottleneck: bus bandwidth limits the maximum I/O throughput

    The maximum bus speed is largely limited by
      - The length of the bus
      - The number of devices on the bus

    I/O Interconnection (Cont.)
      - I/O devices and their interconnection contribute significantly to the performance of a computer system
      - Traditionally, parallel shared wires have been used to connect I/O devices
      - As the clock frequency for communicating with I/O devices increases, parallel shared wires suffer from clock skew and interference among wires
      - The industry has transitioned from parallel shared buses to high-speed serial point-to-point interconnections

    Types of Buses

    Processor-memory bus
      - Front Side Bus (FSB), a proprietary bus
      - Replaced by QPI (QuickPath Interconnect) at Intel
      - Replaced by HyperTransport at AMD
      - Short and high speed
      - Matched to the memory system to maximize the memory-processor bandwidth
      - Optimized for cache block transfers

    Backplane (backbone) bus
      - Industry standard, e.g., PCI Express
      - Allows processor, memory and I/O devices to coexist on a single bus
      - Used as an intermediary bus connecting I/O buses to the processor-memory bus

    I/O bus
      - Industry standard, e.g., SATA, USB, FireWire
      - Usually longer and slower
      - Needs to accommodate a wide range of I/O devices

    (Figure: the CPU, North Bridge, South Bridge, main memory (DDR2), FSB, DMI, hard disk, USB and graphics card diagram, with links labeled as processor-memory bus, backplane bus, and I/O bus)

    How Does the CPU Access I/O Devices?
      - All I/O devices implement registers, which software uses to control the devices
      - For programming, the question is then where and how to write to or read from those registers
      - There are 2 ways to access I/O devices: memory-mapped I/O and I/O-mapped I/O

    Memory-mapped I/O
      - The I/O device is mapped into the memory space
      - The CPU generates a memory transaction to access the I/O device
      - In MIPS, use the lw or sw instructions
      - In x86, use the mov instruction

    (Figure: a 4GB address space from 0x0 to 0xFFFF_FFFF (4GB-1), containing main memory (1GB) ending at 0x3FFF_FFFF (1GB-1) and three I/O devices)
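
    Memory-mapped I/O amounts to address decoding: the same kind of load or store reaches either memory or a device depending on the address. The sketch below models that. The 1GB main-memory range comes from the figure; the device names and their address windows are hypothetical, chosen only for illustration.

```python
MAIN_MEMORY_END = 0x3FFF_FFFF    # 1GB-1, from the figure
ADDRESS_SPACE_END = 0xFFFF_FFFF  # 4GB-1, from the figure

# Hypothetical device windows: (first address, last address, name).
IO_WINDOWS = [
    (0x4000_0000, 0x4000_0FFF, "uart"),
    (0x4000_1000, 0x4000_1FFF, "timer"),
]


def decode(address):
    """Return which target a memory transaction at this address selects."""
    if not 0 <= address <= ADDRESS_SPACE_END:
        raise ValueError("address outside the 4GB space")
    if address <= MAIN_MEMORY_END:
        return "main memory"
    for first, last, name in IO_WINDOWS:
        if first <= address <= last:
            return name
    return "unmapped"


print(decode(0x0000_1000))  # main memory
print(decode(0x4000_0004))  # uart
print(decode(0x8000_0000))  # unmapped
```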

    How Does the CPU Access I/O Devices? (Cont.)

    I/O-mapped I/O
      - I/O devices are mapped into a separate I/O space
      - The CPU generates an I/O transaction to access the I/O device
      - In x86, the in and out instructions access I/O devices
      - In x86, the I/O space is 64KB

    Hardware support is needed to differentiate memory space from I/O space
      - ISA support: in x86, the mov instruction generates a memory transaction, and the in and out instructions generate I/O transactions
      - A physical pin on the processor indicates the transaction type (memory or I/O); for example, the pin is driven to 1 for a memory transaction or 0 for an I/O transaction

    (Figure: an I/O space from 0x0 to 0xFFFF (64KB-1), containing three I/O devices)
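
    With a separate I/O space, the same numeric address can name two different things, and the transaction-type pin resolves which one is meant. The sketch below models that selection. The function name `route` is invented, and the pin polarity follows the slide's example (1 for memory, 0 for I/O).

```python
IO_SPACE_SIZE = 64 * 1024  # x86 I/O space, from the slide


def route(address, mem_pin):
    """Return (space, address) for a transaction, based on the type pin."""
    if mem_pin == 1:
        return ("memory", address)
    if mem_pin == 0:
        if not 0 <= address < IO_SPACE_SIZE:
            raise ValueError("I/O address must fit in the 64KB I/O space")
        return ("io", address)
    raise ValueError("pin must be 0 or 1")


# The same address selects different targets depending on the pin.
print(route(0x03F8, 1))  # ('memory', 1016)
print(route(0x03F8, 0))  # ('io', 1016)
```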

    How Does I/O Communicate with the CPU?

    Polling
      - The CPU periodically checks the status of I/O devices to determine whether they need service
      - The CPU is totally in control
      - Can waste a lot of CPU time because of the speed difference between the CPU and I/O devices

    Interrupt
      - The I/O device issues an interrupt to indicate that it needs attention
      - An I/O interrupt is asynchronous with respect to instruction execution
      - It is not associated with any instruction, so it doesn't prevent any instruction from completing
      - The interrupt can be handled at a convenient point in the pipeline
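
    The cost of polling can be seen in a toy model: every status read that finds the device not ready is wasted CPU work. The sketch below counts those reads. `FakeDevice` and `poll_until_ready` are hypothetical names, and the device is a stand-in that becomes ready after a fixed number of status reads.

```python
class FakeDevice:
    """Stand-in device whose status register reads 1 once it is ready."""

    def __init__(self, ready_after):
        self._reads_left = ready_after

    def read_status(self):
        if self._reads_left > 0:
            self._reads_left -= 1
            return 0  # not ready yet
        return 1      # ready for service


def poll_until_ready(device, max_polls=1000):
    """Poll the status register; return how many polls found it not ready."""
    for wasted in range(max_polls):
        if device.read_status() == 1:
            return wasted
    raise TimeoutError("device never became ready")


print(poll_until_ready(FakeDevice(ready_after=3)))  # prints 3
```

    With an interrupt, those wasted reads disappear: the CPU does other work until the device signals it.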

    DMA (Direct Memory Access)

    Typically, moving data from one place to another involves CPU instructions
      - Load (lw) from a location (e.g., memory in an I/O device)
      - Store (sw) to another location (e.g., main memory)
      - Moving a large chunk of data with CPU instructions could take a large fraction of CPU time

    DMA can transfer large blocks of data directly to or from memory without involving the processor
      - The processor initiates the DMA transfer by supplying the source address, the destination address, and the number of bytes to transfer
      - The DMA controller manages the entire transfer (possibly thousands of bytes in length), arbitrating for the bus
      - When the DMA transfer is complete, the DMA controller interrupts the processor to report completion

    There may be multiple DMA devices in one system
      - The processor and the DMA controllers contend for bus cycles and for memory
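
    The three DMA steps above (program the transfer, let the controller copy, receive a completion interrupt) can be sketched in software. This is a simplified model under stated assumptions: `dma_transfer` is a made-up function, memory is a plain `bytearray`, the interrupt is modeled as a callback, and bus arbitration and overlapping regions are not modeled.

```python
def dma_transfer(memory, src, dst, count, on_complete):
    """Copy count bytes from src to dst, then signal completion."""
    if src + count > len(memory) or dst + count > len(memory):
        raise ValueError("transfer runs past the end of memory")
    for offset in range(count):  # the controller moves the data, not the CPU
        memory[dst + offset] = memory[src + offset]
    on_complete(count)           # stands in for the completion interrupt


memory = bytearray(range(16))
completed = []

# The "processor" supplies source, destination, and byte count.
dma_transfer(memory, src=0, dst=8, count=4, on_complete=completed.append)

print(list(memory[8:12]))  # [0, 1, 2, 3]
print(completed)           # [4]
```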

    Note: a register slice can be inserted in any channel because each AXI channel transfers information in one direction only.