ELCT 1003:High Speed Electronic Circuit
Lecture 17: Power Distribution Network in High Speed Integrated Circuit (Continued)
Dr. Mohamed Abd El Ghany, Department of Electronics and Electrical Engineering
Synchronous NoC Architecture
The clock is distributed to all components of the switch.
The Write and Full signals are used to control the operation of the Synchronous input and output FIFO.
Global Clock Distribution Network for Synchronous BFT Architecture
Block to block distances are increasing
– Delay of the global wiring is increasing dramatically
*National Technology Roadmap for Semiconductors: Semiconductor Industry Association, 2007
The clock signal is distributed to all IP blocks and switches using an H-tree.
When the number of IP blocks increases, the complexity of the clock distribution increases.
Clock network dissipates a significant portion of the total power consumed by a Synchronous circuit.
The clock distribution becomes a difficult task.
[Figure panels: 16 IPs, 64 IPs, 256 IPs]
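As a rough illustration of how the clock tree grows with the number of IPs, the sketch below assumes an idealized, symmetric H-tree over a square die, with one leaf per IP and a number of IPs that is a power of 4. The function names and the unit die side are illustrative assumptions and do not come from the lecture.

```python
import math

def h_tree_levels(num_ips):
    """Number of nested H levels needed to reach num_ips leaves (4 leaves per H)."""
    levels = round(math.log(num_ips, 4))
    if 4 ** levels != num_ips:
        raise ValueError("this sketch assumes the number of IPs is a power of 4")
    return levels

def h_tree_wire_length(num_ips, die_side):
    """Total wire length of an idealized H-tree spanning a square die.

    Level k (k = 0, 1, ...) contains 4**k H shapes, each built from three
    segments of length die_side / 2**(k + 1).
    """
    total = 0.0
    for k in range(h_tree_levels(num_ips)):
        segment = die_side / 2 ** (k + 1)
        total += 4 ** k * 3 * segment
    return total

# Wire length in units of the die side for the three system sizes on the slide.
for n in (16, 64, 256):
    print(n, h_tree_levels(n), h_tree_wire_length(n, die_side=1.0))
```

Under these assumptions the total wire length is 4.5, 10.5, and 22.5 times the die side for 16, 64, and 256 IPs, so each added level roughly doubles the wire that the clock has to drive.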
Global Clock Distribution Network for Synchronous CLICHÉ Architecture
Block to block distances are increasing
– Delay of the global wiring is increasing dramatically
*National Technology Roadmap for Semiconductors: Semiconductor Industry Association, 2007
Asynchronous Architecture
Communication between the NoC switch and the Synchronous units uses a pausable-clock mechanism called SAS (Synchronous-to-Asynchronous and Asynchronous-to-Synchronous interfaces)*.
A programmable local clock generator is implemented within each unit to generate a variable frequency.
Implementing NoC architectures using Globally Asynchronous Locally Synchronous (GALS) techniques has been suggested as a potential solution to the problem of clock skew in Synchronous NoC architectures.
* Krstic, M.; Grass, E.; Gurkaynak, F.K.; Vivet, P., “Globally Asynchronous, Locally Synchronous Circuits: Overview and Outlook”, IEEE Design & Test of Computers, vol. 24, no. 5, pp. 430-441, Sept.-Oct. 2007
FIFO Interfaces
Synchronous FIFO interfaces
The synchronous put interface is controlled by CLKput.
The synchronous get interface is controlled by CLKget.
validget is asserted during a get operation.
Asynchronous FIFO interfaces
The asynchronous interfaces are not synchronized to a clock signal.
The asynchronous put interface does not have a full output; instead, it simply withholds putack until the FIFO becomes non-full.
The asynchronous get interface withholds getack until the FIFO becomes non-empty.
Mixed-timing FIFO
Each FIFO is constructed as a circular array of identical cells, communicating with the two external interfaces (put and get) on common data buses.
Two tokens control the input and output behavior of the FIFO: a put token is used to enqueue data items, and a get token is used to dequeue data items.
The synchronous interfaces have two additional types of components: detectors, which compute the current state of the FIFO, and external controllers, which conditionally pass requests for data operations to the cell array.
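The circular-array organization can be sketched in software as a ring of cells with two circulating tokens. This is a minimal behavioral model under stated assumptions: the class and attribute names are hypothetical, timing is ignored, and the detectors and external controllers of the synchronous interfaces are left out.

```python
class TokenRingFifo:
    """Circular array of identical cells with one put token and one get token."""

    def __init__(self, num_cells):
        self.data = [None] * num_cells
        self.valid = [False] * num_cells  # per-cell state: True means the cell is full
        self.put_tok = 0                  # index of the cell holding the put token
        self.get_tok = 0                  # index of the cell holding the get token

    def put(self, item):
        """Enqueue into the cell holding the put token; return False if it is full."""
        i = self.put_tok
        if self.valid[i]:
            return False
        self.data[i] = item
        self.valid[i] = True
        self.put_tok = (i + 1) % len(self.data)  # pass the put token to the next cell
        return True

    def get(self):
        """Dequeue from the cell holding the get token; return None if it is empty."""
        i = self.get_tok
        if not self.valid[i]:
            return None
        item = self.data[i]
        self.data[i] = None
        self.valid[i] = False
        self.get_tok = (i + 1) % len(self.data)  # pass the get token to the next cell
        return item
```

Because only the token holder may act on each interface, the put and get sides never touch the same cell unless the FIFO is completely full or completely empty.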
[Figure panels: Synchronous-Synchronous FIFO, Asynchronous-Asynchronous FIFO]
T. Chelcea and S. M. Nowick, “Robust Interfaces for Mixed-Timing Systems,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Aug. 2004.
FIFO Protocols
Synchronous-Synchronous FIFO
Synchronous put protocol
When the FIFO receives a request on putreq and a data item on putdata, the data item is enqueued at the start of the next clock cycle.
If the FIFO becomes full, the full signal is asserted before the next clock cycle, and the put interface is prevented from any further operation.
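The put protocol above can be approximated with a cycle-level model in which each call stands for one rising edge of CLKput. The class and method names are hypothetical, and the model only captures the behavior stated on the slide (enqueue at the clock edge, full blocks further puts); it is not a description of the actual circuit.

```python
class SyncPutInterface:
    """Cycle-level sketch of a synchronous put interface with a full flag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.full = False

    def clk_put_edge(self, putreq, putdata=None):
        """Model one rising edge of CLKput; return True if the item was enqueued."""
        # A request is honored only while full is deasserted.
        accepted = bool(putreq) and not self.full
        if accepted:
            self.items.append(putdata)
        # full is updated so that it is visible before the next clock edge.
        self.full = len(self.items) >= self.capacity
        return accepted
```

For example, with `capacity=2` the third consecutive request is rejected because full was asserted after the second enqueue.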
FIFO Protocols
Synchronous-Synchronous FIFO
Synchronous get protocol
A synchronous get operation is enabled by a request on getreq, asserted immediately after the positive edge of CLKget.
By the end of the clock cycle, a data item is placed on getdata together with its validity bit (validget).
If the FIFO becomes empty, empty is also asserted, and the get interface is stalled until the FIFO becomes non-empty.
Following a get request, validget and empty can indicate three outcomes:
1) data item dequeued, more data items available (validget=1, empty=0);
2) data item dequeued, FIFO has become empty (validget=1, empty=1);
3) FIFO empty, no data item dequeued (validget=0, empty=1).
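The three outcomes can be reproduced with a small behavioral sketch. The function name `clk_get_cycle` and the use of a Python `deque` as the storage are assumptions made for illustration; each call stands for one CLKget cycle in which getreq is asserted.

```python
from collections import deque

# Meaning of each (validget, empty) pair, as listed in the protocol description.
OUTCOMES = {
    (1, 0): "data item dequeued, more data items available",
    (1, 1): "data item dequeued, FIFO has become empty",
    (0, 1): "FIFO empty, no data item dequeued",
}

def clk_get_cycle(items):
    """Model one CLKget cycle with getreq asserted.

    Returns (getdata, validget, empty).
    """
    if items:
        getdata = items.popleft()
        return getdata, 1, int(len(items) == 0)
    return None, 0, 1

fifo = deque(["a", "b"])
for _ in range(3):
    getdata, validget, empty = clk_get_cycle(fifo)
    print(getdata, validget, empty, OUTCOMES[(validget, empty)])
```

Three consecutive requests on a FIFO holding two items walk through outcomes 1, 2, and 3 in that order.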
FIFO Protocols
Asynchronous-Asynchronous FIFO
Asynchronous put protocol
The asynchronous interfaces use four-phase handshaking.
The sender starts a put operation by placing a data item on putdata and requesting the FIFO to enqueue it on putreq.
The completion of the enqueue is indicated by asserting putack.
The two control wires are then reset to the idle state, first putreq and then putack.
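The ordering of events in one four-phase put can be written out as a short trace generator. This is only a sequential sketch: the function name and the string format of the trace are hypothetical, and real hardware would have the sender and the FIFO reacting to each other's wires concurrently.

```python
def four_phase_put(fifo_items, data):
    """Run one four-phase put handshake and return the ordered signal events."""
    wires = {"putreq": 0, "putack": 0}
    trace = []

    def drive(name, value):
        wires[name] = value
        trace.append(f"{name}={value}")

    trace.append(f"putdata={data!r}")  # sender places the data item on putdata
    drive("putreq", 1)                 # phase 1: sender requests the enqueue
    fifo_items.append(data)            # FIFO stores the data item
    drive("putack", 1)                 # phase 2: FIFO signals completion
    drive("putreq", 0)                 # phase 3: sender returns putreq to idle
    drive("putack", 0)                 # phase 4: FIFO returns putack to idle
    assert wires == {"putreq": 0, "putack": 0}  # both wires are idle again
    return trace
```

Calling `four_phase_put([], 7)` yields the sequence putdata, putreq high, putack high, putreq low, putack low, which matches the reset order given above.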
FIFO Protocols
Asynchronous-Asynchronous FIFO
Asynchronous get protocol
The receiver starts a get operation by requesting the FIFO on getreq to dequeue a data item.
The FIFO places the data item on getdata and indicates on getack that the data item can be read by the receiver domain.
The two control wires are then reset to the idle state, first getreq and then getack.
Asynchronous-asynchronous FIFO cell
Asynchronous-Asynchronous FIFO
An asynchronous cell consists of:
Obtain Put Token (OPT) controller
Obtain Get Token (OGT) controller
Data validity (DV) controller
Two asynchronous C-elements
Asynchronous-asynchronous FIFO cell
Asynchronous-Asynchronous FIFO
The cell enqueues a data item as follows:
After two transitions on we1, the put token is in the cell (ptok=1). When the environment requests a put operation (putreq=1), we is asserted. This event causes several operations in parallel: the state of the cell is changed to full by DVas, register REG is enabled to latch data, and the cell starts to send the put token to the left cell and to reset OPT (ptok=0).
When putreq is deasserted, we is then deasserted. This event completes the passing of the put token to the left cell. The cell is now prepared to start another put operation once the data in REG is dequeued.
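The enqueue sequence can be mimicked with a two-step model of a single cell, one step for each edge of putreq. The class and method names are hypothetical. The model also assumes that we is asserted only when the cell both holds the put token and is empty, which is an interpretation of the description and not something the slide states explicitly.

```python
class AsyncCell:
    """Behavioral sketch of the put side of one asynchronous FIFO cell."""

    def __init__(self):
        self.ptok = 0    # 1 while the cell holds the put token
        self.full = 0    # data validity state of the cell
        self.reg = None  # contents of register REG
        self.we = 0      # write-enable of this cell

    def putreq_rise(self, putdata):
        """putreq goes high: assert we if the cell may accept the item."""
        if not (self.ptok and not self.full):  # ASSUMED enabling condition
            return False
        self.we = 1
        self.full = 1        # cell state changes to full
        self.reg = putdata   # REG latches the data item
        self.ptok = 0        # OPT is reset
        return True

    def putreq_fall(self, left_cell):
        """putreq goes low: deassert we and complete the token passing."""
        if self.we:
            self.we = 0
            left_cell.ptok = 1  # two transitions on we hand the token to the left cell
```

The left cell receives the token only after we has gone high and then low, which corresponds to the "two transitions on we1" that the left cell observes on its neighbor's write-enable.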
Asynchronous-asynchronous FIFO cell
Asynchronous-Asynchronous FIFO
The cell dequeues a stored data item as follows:
After two transitions on re1, the get token is in the cell (gtok=1). The register then immediately outputs its data onto the global get data bus, even before the get interface requests it. When the environment does request a get operation (getreq=1), re is asserted. This event causes the cell to acknowledge the get operation and to start sending the get token to the left cell. When getreq is deasserted, re is deasserted. This event causes several operations in parallel: the cell completes the four-phase handshake on the get interface, it completes sending the get token, OGT is reset (gtok=0), and the data validity controller changes the state of the cell to “empty” (valid=0).
Asynchronous-Synchronous FIFO cell
The cell consists of the asynchronous put part, the synchronous get part, and a new data validity controller (DVas).
DVas accepts we and re as inputs. Its outputs are ei (indicating the cell is empty, allowing the next put operation) and fi (indicating the cell is full; used by the empty detector).
Synchronous-Asynchronous FIFO cell
The cell consists of the synchronous put part, the asynchronous get part, and a new data validity controller (DVas).
DVas accepts we and re as inputs. Its outputs are ei (indicating the cell is empty; used by the full detector) and fi (indicating the cell is full, allowing the next get operation).
Power Dissipation
Using the leakage power reduction technique proposed in [1], [2], the dynamic power dissipation can be considered as an efficient metric to compare the power dissipation in Synchronous and Asynchronous designs.
[Figure: IP blocks and switches (S) connected by an inter-switch link with repeaters]
The activity factor of the data transfer between two switches
The activity factor of the interswitch link
[1] Y. Thonnart, E. Beigné, A. Valentian, P. Vivet, “Power Reduction of Asynchronous Logic Circuits using Activity Detection,” IEEE Transactions on VLSI Systems, pp. 893-906, July 2009
[2] M. A. Abd Elghany, M. A. El-Moursy, D. Korzec and M. Ismail, “Power Efficient Networks on Chip,” Proceedings of the IEEE International Conference on Electronics, Circuits, and Systems, 2009
Power Consumption of the Interswitch Links
[Figure: interswitch links broken down into data transfer signals, control signals, clock signal, and request/acknowledgment signals]
Power Dissipation of Asynchronous and Synchronous Designs
[Figure labels: Syn_switch, Asyn_switch]
In BFT, the switch has six ports.
The Asynchronous design is more power efficient when
is greater than zero.
Power Dissipation of Asynchronous and Synchronous Designs
The Asynchronous design is more power efficient when
is greater than zero.
Power Dissipation of Asynchronous and Synchronous Designs
Given the BFT architecture, a system of 16 IPs, a chip size of 20 mm × 20 mm, the 90 nm technology node, and a clock frequency of 200 MHz:
- Draw the global clock distribution network
- Calculate the total power dissipation of the global clock distribution network
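One possible way to set up the power estimate is sketched below. It uses the standard dynamic power relation P = C · Vdd² · f with an activity factor of 1 for the clock, and an idealized two-level H-tree for the 16 leaves (one H of three 10 mm segments plus four H's of three 5 mm segments, giving 30 mm + 60 mm = 90 mm of wire). The wire capacitance per millimetre and the supply voltage are assumed placeholder values, not figures from the lecture, and repeaters, leaf loads, and short-circuit power are ignored.

```python
def clock_network_power(wire_length_mm, cap_per_mm_f, vdd_v, freq_hz):
    """Dynamic power of a clock net: P = C * Vdd^2 * f (activity factor of 1)."""
    total_cap_f = wire_length_mm * cap_per_mm_f
    return total_cap_f * vdd_v ** 2 * freq_hz

# Idealized H-tree for 16 leaves on a 20 mm die: 4.5 * 20 mm = 90 mm of wire.
wire_mm = 4.5 * 20.0

# ASSUMED values, not from the lecture: 0.2 pF/mm wire capacitance, Vdd = 1.0 V.
power_w = clock_network_power(
    wire_mm, cap_per_mm_f=0.2e-12, vdd_v=1.0, freq_hz=200e6
)
print(f"{power_w * 1e3:.2f} mW")
```

With these placeholder values the wire-only estimate comes to about 3.6 mW; substituting the capacitance and supply voltage given for the 90 nm node in the course material would give the intended answer.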