
    Improvement NOC Reliability by NOC-7MR

    (NOC-septuple-Modular-Redundancy)

    Reza Kourdy

    Department of Computer Engineering, Islamic Azad University,

    Khorramabad Branch, Iran

    Mohammad Reza Nouri Rad

    Department of Computer Engineering, Islamic Azad University,

    Khorramabad Branch, Iran

    Abstract: A new chip design paradigm called Network on Chip (NoC) offers a promising architectural choice for future systems on chips. NoC architectures offer packet-switched communication among functional cores on the chip. They also apply concepts from computer networks and organize on-chip communication among cores in layers similar to the OSI reference model. A fault-tolerant network on chip (FT-NoC) system with a redundant architecture for reliable applications is proposed. Applying different types of redundancy on chip increases the reliability, efficiency and effectiveness of the NoC and, more broadly, of the aircraft control system itself.

    Index Terms: integrated circuit (IC), Systems-on-chip (SoC), Network on Chip (NoC), fault tolerance, very large scale integration (VLSI) and deep submicron domain (DSM).

    1 INTRODUCTION

    The International Technology Roadmap for Semiconductors (ITRS) predicts that before the end of this decade a single System on Chip (SoC) could embed 4 billion transistors using 50 nm technology, operating at 10 GHz (Benini and De Micheli, 2002). These advancements raise problems in the communication and interconnection infrastructure between the components inside the chip; hence new architectures and scalable design approaches are needed. To cope with the growing needs of the interconnection infrastructure, the Network on Chip (NoC) concept has been introduced, which benefits computer systems by providing higher levels of performance and reliability. Such computers are now a mandatory component in the design of automatic control units for aviation systems.

    A Network on Chip system is a collection of computational resources that are connected through a network inside the chip and communicate using packets. The communication architecture consists of interconnected switches, each connected to a resource, which can be a processor core, a memory block, or custom-designed hardware; such a resource is generally called an Intellectual Property (IP) block (Ning, et al. 2007). For avionics and aircraft control systems, IP blocks can be sensors, analogue-to-digital converters (ADC), etc. One of the main advantages of NoC systems is the separation of computation and communication. The communication units are Network Interfaces (NI) and switches. An NI acts as the middle layer: it transforms streams of bits from the computational resources into packets before sending them to a router or switch, and vice versa.
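The role of the network interface described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the `Packet` fields, the function names and the 4-byte payload size are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    # All field names are hypothetical; the paper does not define a packet format.
    src: int        # id of the sending resource
    dst: int        # id of the receiving resource
    seq: int        # position of this packet in the original stream
    payload: bytes  # slice of the original byte stream


def packetize(data: bytes, src: int, dst: int, payload_size: int = 4) -> List[Packet]:
    """Split a byte stream into fixed-size packets (the last one may be shorter)."""
    return [
        Packet(src, dst, seq, data[offset:offset + payload_size])
        for seq, offset in enumerate(range(0, len(data), payload_size))
    ]


def depacketize(packets: List[Packet]) -> bytes:
    """Rebuild the byte stream, tolerating out-of-order arrival."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))
```

The sequence number lets the receiving NI restore the original order even if the network delivers packets out of order, which is one reason the "vice versa" direction needs more than simple concatenation.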

    2 NOC TOPOLOGY

    Network topology defines the placement and interconnection of nodes inside the NoC area and determines the bandwidth and latency of a network (Salminen et al. 2008). The most common topologies are the 2D Mesh and Torus, owing to their grid-type shapes and regular structure (Hu et al. 2008). These are the most appropriate topologies for a two-dimensional layout on a chip when an application-specific topology is not considered.

    These topologies are selected on the basis of the routing hop count, the redundancy overhead in number of links in case of link failure, link lengths, energy consumption over the links and switches, and finally area usage over the silicon surface. The Torus topology introduces long wires (link redundancy) between the outermost nodes to complete the shape of the topology. Employing long wires in very large scale integration (VLSI) and deep submicron domain (DSM) systems increases the capacitance among wires, influences the inductance of links, and results in crosstalk over links. The other promising topology for FT-NoC is the application-specific architecture.
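Two of the selection criteria above, routing hop count and number of links, can be compared for the 2D Mesh and Torus with a short Python sketch. The function names are hypothetical, nodes are assumed to be addressed by (x, y) grid coordinates with minimal routing, and the link count assumes more than two nodes per dimension so that each wraparound wire is a distinct link.

```python
def mesh_hops(a, b):
    """Minimal hop count between nodes a and b in a 2D mesh."""
    (ax, ay), (bx, by) = a, b
    return abs(ax - bx) + abs(ay - by)


def torus_hops(a, b, cols, rows):
    """Minimal hop count in a 2D torus: each dimension may wrap around."""
    (ax, ay), (bx, by) = a, b
    dx, dy = abs(ax - bx), abs(ay - by)
    return min(dx, cols - dx) + min(dy, rows - dy)


def link_count(cols, rows, torus=False):
    """Number of bidirectional links; assumes cols > 2 and rows > 2 for a torus."""
    links = rows * (cols - 1) + cols * (rows - 1)
    if torus:
        links += rows + cols  # one long wraparound wire per row and per column
    return links
```

For a 4 x 4 network, the corner-to-corner distance drops from 6 hops in the mesh to 2 hops in the torus, at the cost of 8 additional long wires (32 links instead of 24), which is the trade-off the paragraph above describes.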

    Generally speaking, interconnection networks are difficult to design. The vast design space leaves the system architect with many difficult design decisions to make, all of which affect each other. Furthermore, the suitability of a specific NoC design depends on which of power, performance and area has the highest priority in the tradeoff, and all three depend on the typical traffic pattern over the network. Therefore, a good NoC design is often application specific.

    JOURNAL OF COMPUTING, VOLUME 4, ISSUE 5, MAY 2012, ISSN 2151-9617

    https://sites.google.com/site/journalofcomputing

    WWW.JOURNALOFCOMPUTING.ORG 103

    While most interconnection networks have the same basic components, there are a number of decisions that must be addressed before the actual hardware can be designed. Most importantly, we must focus on selecting a combination of topology, routing algorithm, and flow control.

    The network topology dictates the arrangement of the nodes of the NoC and the links that connect them. Direct network topologies are ones in which each node has a dedicated router. Some examples include the more basic ring, star, and fully connected networks, and the basic crossbar circuit. More common direct networks are those belonging to the family of 2D and 3D meshes, tori, hypercubes, and cube-connected cycles networks. Alternatively, in an indirect network a node is either a router or a processing element. A message passed between processing elements goes through one or more router nodes before arriving at its destination processing element. The (fat) tree, butterfly, and Clos networks are all common indirect networks.

    There are many options, but in this work we will focus on 2D and 3D mesh and torus networks, the flattened butterfly, and fully connected networks. A fully connected network is not pictured, but is one in which every node is directly connected to every other node. Each of these topologies can have a concentration of 1, where each processing element has its own dedicated NoC router, or higher than 1, where multiple processing elements are grouped with a single router. We consider the concentration of the network to be the total number of processing elements divided by the number of routers. These topologies represent a good tradeoff among the network metrics in which we are interested. They are the most incrementally expandable networks, and they maintain high bisection widths and have good path diversity at reasonable cost. The flattened butterfly in particular has a low network diameter and, since it is based on a Clos network, has path diversity, unlike traditional butterfly networks [1].
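The concentration definition and two of the metrics named above (network diameter and bisection width) can be written down as a small Python sketch. The function names are hypothetical, the mesh and torus are assumed to be square k x k arrays with even k, and each bidirectional link is counted once.

```python
def concentration(num_processing_elements, num_routers):
    """Processing elements per router, as defined in the text."""
    return num_processing_elements / num_routers


def diameter(topology, k):
    """Longest minimal path, in hops, for a k x k mesh or torus,
    or for a fully connected network of k * k routers."""
    if topology == "mesh":
        return 2 * (k - 1)
    if topology == "torus":
        return 2 * (k // 2)
    if topology == "fully_connected":
        return 1
    raise ValueError(f"unknown topology: {topology}")


def bisection_width(topology, k):
    """Links cut when the network is split into two equal halves (k even)."""
    if topology == "mesh":
        return k
    if topology == "torus":
        return 2 * k  # wraparound links double the cut
    if topology == "fully_connected":
        half = (k * k) // 2
        return half * half  # every node in one half links to every node in the other
    raise ValueError(f"unknown topology: {topology}")
```

For example, 64 processing elements grouped onto a 4 x 4 array of routers give a concentration of 4. The fully connected network has the best diameter and bisection width but needs a number of links that grows quadratically with the router count, which is why its cost is rarely reasonable for large networks.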

    3 USES OF NOCS

    There are several ways interconnection networks can improve communication performance in chips. NoCs can be used as a direct replacement for top-level interconnect in complex ASICs. Instead of consuming routing resources with dedicated wires for every cross-chip signal, and designing around the issues mentioned earlier, ASICs can be partitioned into a regular array of tiles. Dedicated wires can then be used for local interconnections that fall completely within a tile, and global interconnections that cross tile boundaries are replaced with a centralized interconnection network; this concept is referred to as packet-routed tiles.

    Through carefully developed wire delay and power consumption models, it may be possible to arrive at an exact tile size for a given technology where the power consumption is minimized; this is the point at which, as tile size shrinks, the incremental power savings from moving local dedicated-wire communication to the interconnection network no longer outweigh the overhead of the additional router hardware needed [2]. By multiplexing global signals over the shared links of an interconnection network, reasonable power savings are possible.

    At present, dedicated wires or bus architectures are predominantly used to allow communication between high-level modules in SoCs and multi-core processing chips. Though there is no clear-cut difference between some hierarchical buses being implemented today and interconnection networks, the growing number of blocks that must communicate on-chip, and the demands on the performance of that communication, are quickly making present common interconnection fabrics obsolete. The view in the microprocessor industry is that the parallel processing trend in processor architectures will likely continue. The industry-standard buses of today will not be able to provide the degree of connectivity needed by the many-core systems of the future, especially with the bandwidth to memory that will be required [3]. As an indication of this, consider the state of the art in floating-point processor performance: the 65 nm, 80-tile NoC processor at Intel performs over 1.0 TeraFLOPS at 4.27 GHz while consuming 97 W. The 8 × 10 2D mesh NoC has one floating-point core, with two pipelined floating-point multiply-accumulate units, and one router per node. The packet-switched network has a bisection bandwidth of 2 Tb/s. The fully functioning chip uses over 100 million transistors in 275 mm² [4].

    Similarly, the increasing complexity of SoC designs will also require NoC-based communication to accommodate the amount of traffic efficiently. The bus protocols typically used in SoC designs place restrictive limits on the number of clients that can use them to communicate.

    Connecting each SoC module to a router to form an NoC node brings uniformity to global interconnect through well-controlled electrical parameters, simplifying chip timing. This can save power, while also facilitating the design of higher-performance circuits with lower latency and higher bandwidth. Additionally, design becomes very modular, to the point where creating new chips is a matter of swapping in and out the functional units of the nodes. Given standardized interfaces, many of the functional units will be reusable from one design to the next, as will the interconnection network, drastically cutting down the time and cost of the redesign and verification of low-level blocks [5].

    4 FAULT TOLERANCE AND REDUNDANCY

    Fault tolerance is a technique that enables the building of systems that maintain the expected service despite the presence of errors caused by hardware faults within the system itself. The use of redundancy increases the reliability of the system but also affects its performance and increases the application costs (power usage and area consumption). A balanced trade-off among these factors must therefore be found to achieve both high performance and a high level of fault tolerance.
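The reliability side of this trade-off can be made concrete with the standard N-modular-redundancy estimate, sketched below in Python. This formula is not given in the paper; it assumes that the N module copies fail independently, that each copy works with the same probability `r`, and that the majority voter itself never fails.

```python
from math import comb


def nmr_reliability(r, n=7):
    """Probability that a majority of n independent modules work,
    each with reliability r (perfect voter assumed)."""
    majority = n // 2 + 1
    return sum(comb(n, k) * r**k * (1 - r)**(n - k) for k in range(majority, n + 1))
```

Under these assumptions, modules with r = 0.9 give roughly 0.972 for triple redundancy and roughly 0.997 for septuple redundancy, while the area and power cost grows to about seven times that of a single module plus the voter. The gain only exists for r above 0.5: with r = 0.4 the septuple system is less reliable than one module.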


    4.1 Fault Classification

    Faults are classified into three major groups: design faults, manufacturing faults, and operational faults (Weaver and Austin, 2001). Operational faults, based on their frequency and probability of occurrence, are divided into permanent, intermittent, and transient (Ali et al. 2007). They may also be caused by different environmental, operational, and technological processes (De Micheli and Benini, 2006). A major concern in fault-tolerant NoC design is the tolerance and redundancy of the system against permanent and transient faults that arise during operation. Transient faults or malfunctions occur regularly and can be tolerated even at the instruction level (Schagaev, 2008). A permanent fault, by contrast, arises for example when some area of the chip experiences an internal failure with permanent effect. However, neither type of fault can easily be correlated with any specific operational, environmental or technological condition.

    4.2 Redundancy Classification

    Redundancy in computer systems can be classified in terms of time, information and structure (Schagaev and Zalewski, 2001). Any of these redundancy types can be applied to system hardware or system software to protect the system against various types of faults and to increase the reliability of the system. Information redundancy can be realized by introducing coding techniques for parity checks into the data stream and packets. A frequent type of structural redundancy is the implementation of redundant hardware that processes the same data simultaneously on several channels and compares the outcomes.
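Both mechanisms named above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, a single even-parity bit stands in for "coding techniques for parity check", and a strict majority vote stands in for "comparing the outcomes" of redundant channels.

```python
from collections import Counter


def add_even_parity(bits):
    """Information redundancy: append one bit so the number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """True if the word (data bits plus parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0


def majority_vote(outputs):
    """Structural redundancy: return the value produced by more than half
    of the redundant channels, or raise if no such value exists."""
    value, votes = Counter(outputs).most_common(1)[0]
    if votes <= len(outputs) // 2:
        raise ValueError("no majority among redundant channels")
    return value
```

A single parity bit detects any odd number of flipped bits but cannot correct them, whereas a seven-channel majority vote masks up to three faulty channels outright, for example `majority_vote([42, 42, 7, 42, 9, 42, 13])` still yields 42.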

    4.3 Fault Tolerance in NoC using Redundancy

    There are several potential forms of fault tolerance implementation in NoC systems. The separation of the communication and computational infrastructure of NoC systems, one of their core concepts, provides inherent solutions to reliability problems among different components and areas of a system. For information redundancy, information may be prioritized into three classes according to the attention the network infrastructure must give to the safety and integrity of the data: latency-critical data, data streams, and miscellaneous information (Bjerregaard and Mahadevan, 2006). Each class has its own type of coding technique for parity checks.

    The most common faults in the structure of the system are noise, technology delays, and fabrication faults in the manufacture of NoC integrated circuits (IC) (De Micheli and Benini, 2006). A self-calibrating method has been proposed to tolerate gate delay (Worm et al. 2005). For noise, packet encoding and redundant transmission of information have been introduced. Inserting extra links and wires would tolerate manufacturing faults but would compromise the performance and energy consumption of a NoC IC.

    5 SIMULATION DETAILS

    A network simulator is used to evaluate the concept for a typical communications scenario that must support several classes of traffic with a range of QoS requirements. We use the Network Simulator ns-2 [6], [7], which has been used extensively in research on the design and evaluation of public-domain computer networks, to evaluate various design options for the NoC architecture, including the design of the router, the communication protocol, and the routing algorithms. ns-2 is an open-source, object-oriented, discrete-event-driven network simulator written in C++ and OTcl. It is a very common and widely used tool for simulating small and large area networks [8].

    In this study, we have modeled our NoC architecture concepts with ns-2 [9]. This tool has also been applied to evaluate various design options for NoC architectures [10], including the design of routers, communication protocols, etc.
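To make "discrete event driven" concrete, the following Python sketch shows the core loop of such a simulator in miniature. It is not ns-2 code and does not use the ns-2 API: the function name, the tuple layout and the fixed per-hop `link_delay` are assumptions, and link contention and buffering are deliberately not modeled.

```python
import heapq
from itertools import count


def simulate(packets, link_delay=1.0):
    """Minimal discrete-event loop.

    packets: list of (inject_time, packet_id, hop_count) tuples.
    Returns a dict mapping packet_id to its arrival time.
    """
    tie_breaker = count()  # keeps heap ordering stable for equal timestamps
    queue = [(t, next(tie_breaker), pid, hops) for t, pid, hops in packets]
    heapq.heapify(queue)
    arrivals = {}
    while queue:
        # Always process the earliest pending event next.
        now, _, pid, hops_left = heapq.heappop(queue)
        if hops_left == 0:
            arrivals[pid] = now
        else:
            # Schedule the packet's arrival at the next router.
            heapq.heappush(queue, (now + link_delay, next(tie_breaker), pid, hops_left - 1))
    return arrivals
```

Simulated time advances only when an event is taken from the queue, so idle periods cost nothing to simulate; a full simulator such as ns-2 adds queues, link bandwidth and protocol agents on top of the same scheduling idea.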

    6 SIMULATION RESULTS

    In this section, we present the simulation of NOC-7MR (NOC-septuple-Modular-Redundancy) and examine the capability and flexibility of ns-2 for NoC redundancy simulations. Mapping an application, which is described by a parameterized task graph, onto a NoC is a key research problem in NoC design. The mesh topology has been used in a variety of interconnection network applications, especially for NoC design. However, the septuple-modular-redundancy network has not yet been studied as the underlying topology for NoCs. Figures 1 to 6 show different views of the 7MR-NOC.

    Fig. 1. First view of the 7MR-NOC topology


    Fig. 2. Second view of the 7MR-NOC topology

    Fig. 3. Third view of the 7MR-NOC topology

    Fig. 4. Fourth view of the 7MR-NOC topology

    Fig. 5. Fifth view of the 7MR-NOC topology

    Fig. 6. Sixth view of the 7MR-NOC topology


    REFERENCES

    [1] J. Kim, J. Balfour, and W. Dally, "Flattened Butterfly Topology for On-Chip Networks," in Proc. 40th Annual IEEE/ACM International Symposium on Microarchitecture MICRO 2007, J. Balfour, Ed., 2007, pp. 172–182.

    [2] S. Heo and K. Asanovic, "Replacing global wires with an on-chip network: a power analysis," in Proc. International Symposium on Low Power Electronics and Design ISLPED 05, K. Asanovic, Ed., 2005, pp. 369–374.

    [3] S. Borkar, "Thousand Core Chips - A Technology Perspective," in Proc. 44th ACM/IEEE Design Automation Conference DAC 07, 2007, pp. 746–749.

    [4] S. R. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar, "An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS," IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 29–41, 2008.

    [5] W. Dally and B. Towles, "Route packets, not wires: on-chip interconnection networks," in Proc. Design Automation Conference, B. Towles, Ed., 2001, pp. 684–689.

    [6] LBNL Network Simulator, http://www-nrg.ee.lbl.gov/ns/

    [7] The network simulator - ns-2, available at http://www.isi.edu/nsnam/ns/

    [8] M. Ali, M. Welzl, A. Adnan, F. Nadeem, "Using the NS-2 Network Simulator for Evaluating Network on Chips (NoC)".

    [9] www.isi.edu/nsnam/ns

    [10] R. Lemaire, F. Clermidy, Y. Durand, D. Lattard, and A. Jerraya, "Performance Evaluation of a NoC-Based Design for MC-CDMA Telecommunications Using NS-2," in The 16th IEEE International Workshop on Rapid System Prototyping, Jun. 2005, pp. 24–30.

    Reza Kourdy received his B.Sc. degree in Computer Engineering and his M.Sc. degree in Computer Architecture, both from Azad University of Arak, Iran, in 2002 and 2007, respectively. His research interests include Network-on-Chip architecture and fault tolerance.

    Mohammad Reza Nouri Rad received his B.Sc. degree in Computer Engineering (Software) from Azad University of Najafabad, Iran, in 2001, and his M.Sc. degree in Computer Software from Azad University of Arak, Iran, in 2010. His research interests include Network-on-Chip architecture and network security. He has served on the program committees of the following conferences: WICT 2011, CSNT 2011, CICN 2011, SocProS 2011, CSNT 2012, CICN 2012, and BIC-TA 2012.
