
Universidade Federal do Rio Grande do Norte
Centro de Ciências Exatas e da Terra
Departamento de Informática e Matemática Aplicada
Bachelor in Computer Science

Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search

Gustavo Alves Bezerra

Natal-RN

June 2019

Gustavo Alves Bezerra

Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search

Undergraduate thesis submitted to the Departamento de Informática e Matemática Aplicada of the Centro de Ciências Exatas e da Terra of the Universidade Federal do Rio Grande do Norte as a partial requirement for obtaining the bachelor’s degree in Computer Science.

Advisor

PhD Monica Magalhães Pereira

Universidade Federal do Rio Grande do Norte – UFRN
Departamento de Informática e Matemática Aplicada – DIMAp

Natal-RN

June 2019

Bezerra, Gustavo Alves. Generation of application specific fault tolerant irregular NoC topologies using tabu search / Gustavo Alves Bezerra. - 2019. 119f.: il.

Monografia (Bacharelado em Ciência da Computação) - Universidade Federal do Rio Grande do Norte, Centro de Ciências Exatas e da Terra, Departamento de Informática e Matemática Aplicada. Natal, 2019. Orientadora: Monica Magalhães Pereira. Coorientadora: Sílvia Maria Diniz Monteiro Maia.

1. Computação - Monografia. 2. Redes em chip - Monografia. 3. Topologias irregulares - Monografia. 4. Aplicação específica - Monografia. 5. Tolerância a falhas - Monografia. 6. Busca Tabu - Monografia. I. Pereira, Monica Magalhães. II. Maia, Sílvia Maria Diniz Monteiro. III. Título.

RN/UF/CCET CDU 004

Universidade Federal do Rio Grande do Norte - UFRN
Sistema de Bibliotecas - SISBI

Catalogação de Publicação na Fonte. UFRN - Biblioteca Setorial Prof. Ronaldo Xavier de Arruda - CCET

Elaborado por Joseneide Ferreira Dantas - CRB-15/324

Undergraduate thesis under the title Generation of Application Specific Fault Tolerant

Irregular NoC Topologies Using Tabu Search presented by Gustavo Alves Bezerra and

accepted by the Departamento de Informática e Matemática Aplicada of the Centro de

Ciências Exatas e da Terra of the Universidade Federal do Rio Grande do Norte, being

approved by all members of the examining board specified below:

PhD Monica Magalhães Pereira
Advisor
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte

PhD Sílvia Maria Diniz Monteiro Maia
Co-advisor
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte

PhD Márcio Eduardo Kreutz
Departamento de Informática e Matemática Aplicada
Universidade Federal do Rio Grande do Norte

Natal-RN, June 2019.

To my family and friends that supported me throughout this journey.

Acknowledgements

It would be impossible to conceive this work without the support provided by the professors and UFRN’s Programa de Educação Tutorial - Ciência da Computação. Thus, special thanks to Monica Magalhães Pereira, Sílvia Maria Diniz Monteiro Maia, and Umberto Souza Da Costa.

Thanks to my family for all the love and support, and for withstanding all the difficulties encountered: Erbena Sales Alves Bezerra, José Guilardo Gonçalves Bezerra, and Juliana Alves Bezerra. In addition, thanks to Iria de Fátima Bezerra Pinho for the long-distance support.

Thanks to Breno “Blinn” Viana “Phong”, “Deba” Emili Costa, Felipe “Barba-lho”, Jhonattan “Johnson” Cabral, “Pratíxia” Pontes Cruz, Raul “Dalinda” Silva, “Showzivan” Medeiros da Silva Gois, and Vitor “God”eiro for all the discussions, conversations, and memes, and for making the last semesters of the Computer Science course one of the most memorable times of my life.

Thanks to Giorgio Brito, “Juhauare” Jales, Larissa “Lucy”ano, Misa Uehara, and Paola Gessy for being present at some key moments, helping me keep my sanity. Thanks to Joel Felipe, Vitor “God”eiro (again), and Vitor Greati for directly and indirectly inspiring me to focus and persist in my studies.

Last but not least, thanks to “the dudes” Victor “Polar” Santos and Yuri “Kbelo” Messias for being present since 2010, and for all the games, CiViKs, defeats, achievements, and coffees shared.

“But now that it’s over

I’ll see you the next time

Remember the future is yours”

Nektar,

remember the future

Generation of Application-Specific and Fault-Tolerant Irregular Topologies Using Tabu Search

Author: Gustavo Alves Bezerra

Advisor: Dr. Monica Magalhães Pereira

Resumo

Networks on Chip (NoCs) were proposed to improve computer performance. The first topologies suggested tended to have a regular structure, aiming at flexibility: reasonable performance for diverse applications and multiple paths between routers. Regular topologies perform worse than topologies generated for specific applications, which are usually irregular. On the other hand, irregular topologies may have low flexibility. In the billion-transistor era, circuit components are more susceptible to faults, whether caused by radiation, electromagnetic interference, or similar effects. Because of the production cost of such circuits, it is desirable to increase their durability (lifespan), performance, and flexibility. Durability can be obtained by adding fault tolerance to a circuit. Therefore, by adding redundant components (routers and links) to a NoC, its durability and flexibility (alternative paths) may improve, although energy consumption worsens. This work proposes the generation of irregular topologies using Tabu Search, thereby producing intermediate topologies: flexible compared with most irregular NoCs (having some degree of fault tolerance and alternative paths between routers), yet achieving high performance for specific applications compared with regular NoCs.

Keywords: Networks on Chip, Irregular Topologies, Application-Specific, Fault Tolerance, Tabu Search.

Generation of Application Specific Fault Tolerant Irregular NoC Topologies Using Tabu Search

Author: Gustavo Alves Bezerra

Advisor: Monica Magalhães Pereira, PhD

Abstract

Network on Chip (NoC) was proposed to enhance computer performance. The topologies initially conceived tended to have a regular structure, aiming at flexibility: reasonable performance for different applications and multiple paths between routers. Regular topologies underperform compared to topologies generated for specific applications, which are often irregular. On the other hand, irregular topologies may lack flexibility. In the billion-transistor era, circuit components are more susceptible to faults, whether caused by radiation, electromagnetic interference, or similar effects. Due to the cost of producing such circuits, it is desirable to increase their durability (lifespan), performance, and flexibility. Durability may be achieved by adding fault tolerance to the circuit. Therefore, by adding redundant components (e.g. routers or links) to an irregular NoC, it may be possible to increase its durability and flexibility (multiple communication paths), though energy consumption may be impaired. This work proposes the generation of irregular topologies using Tabu Search, thus producing intermediate topologies: flexible compared to most irregular ones (with some fault resistance), yet achieving high application-specific performance compared to regular NoCs.

Keywords: Network on Chip, Irregular Topologies, Application-Specific, Fault-Tolerance, Tabu Search.

List of Figures

1 Graph examples.
2 Example of regular NoC topologies.
3 Examples of irregular NoC topologies.
4 Examples of areas isolated in NoCs after faults.
5 A Task Graph.
6 Example of Task Graph edge conversion.
7 Example of unfeasible solution – UNF
8 Example of Delete Edges Until Epsilon.
9 Example of Add Edges Until Epsilon.
10 Example of making an unfeasible solution feasible.
11 Tabu List examples.
12 Examples of valid edges’ node swapping process
13 Example of invalid edge node swapping operation.
14 Example of spin operation with a minimum degree node.
15 Examples of default scenario.
16 Example of successful spin operation with a maximum degree node.
17 Examples of unsuccessful spin operation with a maximum degree node.
18 Example of successful double spin operation.
19 Example of Fault Injection Algorithm.
20 Chosen TGs
21 Influence of tabuListSize arguments for Latency Estimation in chosen TGs’ generated solutions.
22 Influence of terminationCrit arguments for Latency Estimation in chosen TGs’ generated solutions.
23 Overall median latency for all benchmarked TGs.
24 Box plots of chosen TGs solutions’ latency estimation.
25 Examples of AP2TG solutions.
26 Examples of MPEGTG solutions.
27 Fault injection on median chosen TGs solutions.
28 SAP2TG,15 behaviour during fault injection.
29 SMPEGTG,19 behaviour during fault injection.
30 Fault injection on median solutions with median ε of the chosen TGs.
31 Influence of tabuListSize on AP1TG solutions.
35 Influence of tabuListSize on INTEGRALTG solutions.
36 Influence of tabuListSize on MPEGTG solutions.
37 Influence of tabuListSize on MWDTG solutions.
38 Influence of tabuListSize on VOPDTG solutions.
39 Influence of terminationCriterion on AP1TG solutions.
43 Influence of terminationCriterion on INTEGRALTG solutions.
44 Influence of terminationCriterion on MPEGTG solutions.
45 Influence of terminationCriterion on MWDTG solutions.
46 Influence of terminationCriterion on VOPDTG solutions.
47 Fitness (latency estimation) box plots of AP1TG generated solutions.
51 Fitness (latency estimation) box plots of INTEGRALTG generated solutions.
52 Fitness (latency estimation) box plots of MPEGTG generated solutions.
53 Fitness (latency estimation) box plots of MWDTG generated solutions.
54 Fitness (latency estimation) box plots of VOPDTG generated solutions.
55 Fault injection on median AP1TG solutions.
59 Fault injection on median INTEGRALTG solutions.
60 Fault injection on median MPEGTG solutions.
61 Fault injection on median MWDTG solutions.
62 Fault injection on median VOPDTG solutions.
63 Median ε AP1 with median fitness.
67 Median ε INTEGRAL with median fitness.
68 Median ε MPEG with median fitness.
69 Median ε MWD with median fitness.
70 Median ε VOPD with median fitness.
71 Median ε AP1 solution with median fitness after 10% fault injection.
75 Median ε INTEGRAL solution with median fitness after 10% fault injection.
76 Median ε MPEG solution with median fitness after 10% fault injection.
77 Median ε MWD solution with median fitness after 10% fault injection.
78 Median ε VOPD solution with median fitness after 10% fault injection.
95 Fault injection on median SAP1TG,15 solution.
99 Fault injection on median SINTEGRALTG,15 solution.
100 Fault injection on median SMPEGTG,19 solution.
101 Fault injection on median SMWDTG,18 solution.
102 Fault injection on median SVOPDTG,19 solution.
103 AP1TG
104 AP2TG
105 AP3TG
106 AP4TG
107 INTEGRALTG
108 MPEGTG
109 MWDTG
110 VOPDTG

List of abbreviations and initials

NoC – Network on Chip

QAP – Quadratic Assignment Problem

TG – Task Graph

MPSoC – Multi-Processor System-on-Chip

CVRP – Classical Vehicle Routing Problem

SEA – Set of Edges to Add

List of Symbols

∅ – Empty Set

∪ – Set Union

¬ – Logical negation

∀ – For all

∈ – In

∧ – Logical conjunction

ε – A fixed number of edges

← – Attribution

∉ – Not in

∨ – Logical disjunction

∃ – Exists

⊆ – Is contained in

⊈ – Is not contained in

S^C – The complement of set S

∩ – Set intersection

ℕ – Natural numbers set

List of Algorithms

1 Tabu Search Algorithm Skeleton.
2 Methodology.
3 Generate Initial Solution Graph.
4 Fit Solution’s number of edges to ε.
5 Deletes edge with largest degree incident nodes possible.
6 Adds edge between the two nodes with smallest possible.
7 Make a Solution Feasible.
8 Implemented Tabu Search.
9 Fitness Function implementation.
10 Neighbourhood Search
11 Special Neighbourhood Search deletions
12 Special Neighbourhood Search Additions
13 Swaps the nodes incident to two distinct edges.
14 Spin edge.
15 Spin edge incident to one maximum degree node
16 Double spin
17 Fault Injection

Contents

1 Introduction
2 Theoretical Framework
  2.1 Graph Theory
  2.2 Quadratic Assignment Problem Function
  2.3 Network On Chip
    2.3.1 Topologies
    2.3.2 Fault Tolerance
  2.4 Task Graphs
  2.5 Metaheuristics
    2.5.1 Tabu Search
3 Related Works
  3.1 Performance
  3.2 Fault Tolerance
4 Methodology
  4.1 Definitions and Assumptions
    4.1.1 Solution Representation
    4.1.2 Feasible Solution
  4.2 Initial Topology Generation
    4.2.1 Fitting to Epsilon
    4.2.2 Making Feasible
      4.2.2.1 Deleting Edges - DEL_EDGE()
      4.2.2.2 Adding Edges - ADD_EDGE()
      4.2.2.3 Make Feasible Algorithm
  4.3 Best Solution Search – Tabu Search
    4.3.1 Fitness Function
    4.3.2 Tabu List
    4.3.3 Neighbourhood Search
      4.3.3.1 Delete Edge Between Two Minimum Degree Nodes
      4.3.3.2 Delete Edge Incident to One Minimum Degree Node
      4.3.3.3 Default Scenario
      4.3.3.4 Add Edge Incident to One Maximum Degree Node
      4.3.3.5 Add Edge Between Two Maximum Degree Nodes
  4.4 Fault Injection
5 Results
  5.1 Performance
  5.2 Latency Estimation After Fault Injection
6 Concluding Remarks
References
Appendix A -- Influence of tabuListSize Arguments
Appendix B -- Influence of terminationCriterion Arguments
Appendix C -- Latency Box Plots
Appendix D -- Fault Injection in Median Fitness Solutions
Appendix E -- Examples of Median Epsilon Solutions
Appendix F -- Median Epsilon Solutions After 10% Fault Injection
Appendix G -- Median Epsilon Solutions After 20% Fault Injection
Appendix H -- Median Epsilon Solutions After 30% Fault Injection
Appendix I -- Detailed Fault Injection in Some Median Fitness Solutions
Annex A -- Mesquita’s Work TGs

1 Introduction

Technological advances in computers, especially in transistors, have led to an increase in the number of components that fit on a single chip. Consequently, the need to improve communication between chip components has also increased (WANG et al., 2013). There are several ways to achieve this. It is possible to reduce the size of circuit components, thereby shortening the physical distance between them; component density then increases alongside the computational power available in a fixed area (SCHALLER, 1997). It is also possible to increase the number of bits transmitted per second by raising the clock frequency or the number of channels for parallel communication (STALLINGS, 2003; PATTERSON; HENNESSY, 2013). Alongside these techniques, changing the communication protocol may also decrease latency. Network on Chip (NoC) is one such example.

Bits were traditionally transmitted between computer components via a communication bus (STALLINGS, 2003; PATTERSON; HENNESSY, 2013). This solution was satisfactory in the early Computer Age. However, the number of components per chip and the communication demand between components rose over the decades. As applications’ complexity increased, bus communication proved to lack flexibility and efficiency. The idea of the NoC was conceived to solve this problem.

NoCs take advantage of a well-established field of Computer Science: Computer Networks (HEMANI et al., 2000). This branch has been evolving since the 1960s (KUROSE; ROSS, 2013). Its theory and systems have become so sophisticated that the initially limited networks evolved into a worldwide network employing several security techniques (KUROSE; ROSS, 2013).

In order to improve chips by introducing network features, new components must be added to them: routers. Routers are responsible for transmitting information received from one component to another (ZEFERINO; SUSIN, 2003). Additionally, they determine the course a message takes through the network. As a result, the flexibility of a chip is greatly increased (ZEFERINO; SUSIN, 2003). The chip’s area, however, also increases (MORAES et al., 2004). Furthermore, because a router needs processing time to determine the appropriate action, communication overhead is also affected (BEIGNÉ et al., 2005).

Initially, NoCs tended to be regularly structured. A few examples are Mesh-2D, Torus, and Honeycomb NoCs (ZEFERINO; SUSIN, 2003; HEMANI et al., 2000). A regular router distribution tends to offer greater flexibility than an irregular one: reasonable performance for different applications and multiple paths between routers. In exchange, efficiency for specific applications is sacrificed (ASCIA; CATANIA; PALESI, 2004).

On the other hand, irregular NoCs attempt to improve application-specific efficiency and network performance compared to regular topologies (CHOUDHARY; GAUR; LAXMI, 2011; CHOUDHARY et al., 2010). Regular NoCs may also become irregular during the circuit’s lifespan due to faults in either routers or links (CHOUDHARY; GAUR; LAXMI, 2011).

Efforts are being made to generate irregular topologies with lower energy consumption (JAIN; CHOUDHARY; SINGH, 2014). In addition, classical routing algorithms such as XY (ZEFERINO; SUSIN, 2003) tend to perform unsatisfactorily when applied to irregular NoCs (RODRIGO et al., 2011). Moreover, such approaches are guaranteed to be deadlock-free only for regular NoCs (RODRIGO et al., 2011). Therefore, numerous routing algorithms are being conceived to improve network performance while remaining deadlock-free (MILFONT et al., 2017; GABIS; KOUDIL, 2016; LEE; PARIKH; BERTACCO, 2015). Some of these focus not only on deadlock freedom, but also on fault tolerance, congestion management, and livelock freedom.

Like any circuit, NoCs are susceptible to faults. The operation of a chip may be severely compromised depending on the fault location (CHANG et al., 2011). Moreover, it is desirable to increase the lifespan of such circuits due to their fabrication cost. A longer lifespan may be achieved by introducing redundant components into the NoC (WANG et al., 2013; MESQUITA, 2016; SHAH; KANNIGANTI; SOUMYA, 2017), although care must be taken not to end up with a regular topology. During the generation of a fault-tolerant irregular topology, constant monitoring is necessary to avoid increasing energy consumption. Otherwise, the main advantages of irregular topologies would be lost.

In this scenario, this work focuses on generating fault-tolerant irregular NoC topologies. The aim is to obtain long-lasting and efficient circuits for specific applications. Nevertheless, the circuits should remain suitable for multiple applications. The generated topologies are evaluated regarding fault tolerance capacity and latency.

2 Theoretical Framework

The purpose of this work is to generate task graph based irregular NoCs targeting low

latency and reliability via metaheuristics. Some topics require a more solid background and

are thus explored in this section: NoC, Task Graph, Metaheuristics and Fault Tolerance.

2.1 Graph Theory

A Graph is defined as a tuple $G(V, E)$ of a set of vertices $V$ and a set of edges $E$ (WILSON, 1979). $V$ is often represented as a set of integers. There are multiple representations for $E$, but in any representation an edge connects two nodes. Throughout this work, $G_V$ means “the set containing G’s vertices” and $G_E$ means “the set containing G’s edges”. A Graph may be weighted or unweighted, directed or undirected.

Figure 1: Graph examples. (a) $G_0$; (b) $G_1$.

In a weighted graph, a value is associated with each edge (Graph $G_0$, Figure 1a). Unweighted edges, by contrast, have no value associated with them ($G_1$, Figure 1b). A directed unweighted graph may represent edges as tuples, because edge $(0, 1) \neq (1, 0)$. Undirected unweighted graphs, on the other hand, may represent edges as sets, since $\{0, 1\} = \{1, 0\}$ ($G_1$). Hence, self-loops would be represented as a set of one element ($\{0, 0\} = \{0\}$). For directed weighted graphs ($G_0$), edges may be represented as a triple, i.e. $(v_1, v_2, w)$, while undirected weighted edges may be represented as a tuple of a set and a value, i.e. $(\{v_1, v_2\}, w)$. In this work, directed weighted graphs are used to represent task graphs, while undirected unweighted graphs represent solutions or NoCs. Since no self-loops are allowed for solutions, every edge of an undirected unweighted graph has exactly two elements: $\forall e \in G_E\,(|e| = 2)$.

Another concept used throughout this work is the Null Graph. The Null Graph is the only graph containing 0 nodes and, consequently, no edges. Specifically, the undirected unweighted Null Graph will represent an invalid solution or graph; in other words, a tuple of empty sets, i.e. (∅, ∅).

2.2 Quadratic Assignment Problem Function

The Quadratic Assignment Problem (QAP) is an NP-hard problem that arises when assigning facilities to locations. The problem is defined by Bokhari (BOKHARI, 1981) and adapted to the current work as follows. The following are given: an affinity measure between two objects $i$ and $j$ (in this work, the edge weight $w_{ij}$ between nodes $i$ and $j$); $n$ locations (in this work, $n = |TG_V|$, where $TG$ is a Task Graph); the distance $dist_{st}$ between locations $s$ and $t$ in a Graph $G$ (in this work, the number of hops in $G$’s shortest path from node $s$ to node $t$, calculated by Dijkstra’s Algorithm (DIJKSTRA, 1959)); and a function that maps objects to locations (in this work, the identity function, i.e. $i = s$ and $j = t$). The goal is then to minimise the function

$$\sum_{ij} w_{ij}\, dist_{st}. \tag{2.1}$$

It is important to highlight that, in the current work, $dist_{ij} = 1$ for $TG$. However, for $G$, $dist_{ij} = dist_{st} \geq 1$.

Throughout this work, the QAP Function will be used as the Tabu Search fitness function for latency estimation. This is possible because the smaller the QAP function value, the lower the overall latency in a network is expected to be.
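As a rough illustration of how the QAP function can be evaluated, the sketch below sums $w_{ij}$ times the hop distance between $i$ and $j$ in the NoC graph, using the identity mapping. It is not the implementation used in this work: the names `hop_distances` and `qap_cost` are hypothetical, breadth-first search is swapped in for Dijkstra's Algorithm (both yield the same hop counts on an unweighted graph), and returning infinity for unreachable pairs is an assumption made here.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Return the hop count from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def qap_cost(task_edges, noc_edges, nodes):
    """Sum w_ij * dist_ij over task-graph edges, with the identity mapping.

    task_edges: iterable of (i, j, w) triples (directed, weighted).
    noc_edges: iterable of two-element sets (undirected, unweighted).
    Returns float('inf') if some communicating pair is disconnected
    (an assumption of this sketch).
    """
    adjacency = {v: set() for v in nodes}
    for edge in noc_edges:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)

    cache = {}  # hop distances already computed, keyed by source node
    total = 0
    for i, j, w in task_edges:
        if i not in cache:
            cache[i] = hop_distances(adjacency, i)
        if j not in cache[i]:
            return float("inf")
        total += w * cache[i][j]
    return total


# Usage: three tasks on a line topology 0 - 1 - 2.
tasks = [(0, 1, 5), (0, 2, 2), (1, 2, 3)]
line = [frozenset({0, 1}), frozenset({1, 2})]
print(qap_cost(tasks, line, {0, 1, 2}))  # 5*1 + 2*2 + 3*1 = 12
```

Adding the link {0, 2} to the line topology shortens the path between nodes 0 and 2 to one hop and lowers the cost to 10, which shows how extra links trade area for estimated latency.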

2.3 Network On Chip

The main goal of a NoC is to improve communication between the components of a chip, especially when compared to a traditional communication bus (YESIL; TOSUN; OZTURK, 2016). NoCs also provide a more scalable communication method than traditional ones (YESIL; TOSUN; OZTURK, 2016).

A NoC consists of two major components: routers and links (ZEFERINO; SUSIN, 2003). Routers are responsible for transmitting information (packets) to each other through the links (SOTERIOU et al., 2009). Packets may pass through multiple links and routers before reaching their destination. A router’s behaviour is described by routing algorithms, which define the path to be travelled by the packets (SOTERIOU et al., 2009). Four core NoC features describe message transfers: routing algorithms, switching, flow control, and arbitration.

Routing algorithms describe the path to be taken by a packet. According to Zeferino, a routing algorithm affects a NoC’s connectivity, deadlock and livelock freedom, adaptability, and fault tolerance (ZEFERINO; SUSIN, 2003). Connectivity is the capacity to send packets from and to any core. Deadlock and livelock freedom guarantee that all packets will arrive at their destinations. Adaptability is related to flexibility: the capacity to adapt to different topologies. Fault-tolerant routing algorithms attempt to guarantee connectivity even when the NoC has faulty components (ZEFERINO, 2003).

Switching describes how packets are transferred from the input to the output of a router. Some switching methods are circuit switching, store-and-forward, and wormhole. Circuit switching reserves a path until the entire message is transmitted. In store-and-forward switching, packets carry a header with information about their destination and are stored in a buffer at every router until the next hop is decided. Wormhole switching divides packets into flits; if a flit’s output path is free, the flit is not stored in a buffer but transmitted directly to the communication channel (ZEFERINO, 2003).

Flow control describes what is done with packets that are unable to acquire some resource. This may happen, for example, if numerous packets are travelling through the NoC, overloading it. Depending on the flow control, a packet may be discarded, temporarily stored, or have its route changed (ZEFERINO, 2003).

Arbiters, in turn, are responsible for redirecting packets arriving on a router’s input paths. Arbitration is needed when a router simultaneously receives multiple packets competing for the same output path; the arbiter then decides which packets get access to the resource first. Arbiters may be centralised (one per router) or distributed (one per path). Some examples of arbitration mechanisms are round-robin, first-come-first-served, and least recently served (ZEFERINO, 2003).
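To make the round-robin mechanism mentioned above concrete, the following minimal sketch models an arbiter that grants a contested output to the requesting input ports in rotating order. The class name `RoundRobinArbiter` and its interface are assumptions made for illustration; this work does not implement or prescribe a particular arbiter.

```python
class RoundRobinArbiter:
    """Grant one of several requesting input ports in rotating order.

    Hypothetical illustration; not part of the method proposed in this work.
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        # Start so that port 0 has the highest priority on the first grant.
        self.last_granted = num_ports - 1

    def grant(self, requests):
        """Return the granted port index, or None if nobody requests.

        requests: a set of port indices competing for the same output.
        """
        # Scan ports starting right after the most recently granted one.
        for offset in range(1, self.num_ports + 1):
            port = (self.last_granted + offset) % self.num_ports
            if port in requests:
                self.last_granted = port
                return port
        return None


# Usage: ports 0 and 2 keep competing for the same output path.
arbiter = RoundRobinArbiter(4)
grants = [arbiter.grant({0, 2}) for _ in range(3)]
print(grants)  # [0, 2, 0]
```

Because the search always starts after the last granted port, no requesting port can be starved by another one.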

The next two NoC aspects, topologies and fault tolerance, are the focus of this work and are explored below.

2.3.1 Topologies

NoC components may be distributed on a chip regularly or irregularly. Regular topologies tend to be used for general-purpose applications and have reduced design time (SRINIVASAN; CHATHA; KONJEVOD, 2006). Mesh (Figure 2a), Torus (Figure 2b), Ring (Figure 2c), and Honeycomb (Figure 2d) are examples of regular NoC topologies (ZEFERINO; SUSIN, 2003; HEMANI et al., 2000; BONONI; CONCER, 2006). Routing algorithms for regular NoCs are often simple, since they rely on the regular distribution of resources. Some examples of routing algorithms for regular NoCs are XY (DEHYADGARI et al., 2005) and DyXY (LI; ZENG; JONE, 2006).

Figure 2: Example of regular NoC topologies. (a) Mesh; (b) Torus topology; (c) Ring topology; (d) Honeycomb topology.
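The simplicity of routing on regular topologies can be illustrated with XY routing on a 2D mesh: a packet first travels along the X axis until it reaches the destination column, then along the Y axis. The sketch below is illustrative only. The function name `xy_route` and the use of (x, y) coordinates to identify routers are assumptions, and a fault-free mesh is assumed.

```python
def xy_route(source, destination):
    """Return the list of (x, y) routers visited from source to destination.

    Dimension-order (XY) routing on a fault-free 2D mesh: move along X
    until the column matches, then along Y until the row matches.
    """
    x, y = source
    dest_x, dest_y = destination
    path = [(x, y)]
    while x != dest_x:
        x += 1 if dest_x > x else -1
        path.append((x, y))
    while y != dest_y:
        y += 1 if dest_y > y else -1
        path.append((x, y))
    return path


# Usage: route from router (0, 0) to router (2, 1).
print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The route depends only on router coordinates, which is exactly the assumption that stops holding once the topology is irregular or a link on the path has failed.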

On the other hand, irregular topologies tend to be tailored for specific-purpose applic-

ations (CHOUDHARY; GAUR; LAXMI, 2011). Notwithstanding, irregular topologies may be

obtained from regular NoCs for which one or more components have a permanent failure

(ZHANG et al., 2009). Therefore, the study of fault-tolerance in irregular NoCs is interesting

even for the regular topology scenario. Irregular topologies can potentially improve area,

energy consumption, and performance if compared to regular ones (SRINIVASAN; CHATHA;

KONJEVOD, 2006). However, their routing algorithms cannot depend on the components' regular distribution and are thus not as simple. Even so, multiple algorithms have been developed and benchmarked considering high-performance improvements (MILFONT et al., 2017). Graphs IG0, IG1, and IG2 (Figure 3) are examples of irregular NoCs.

Figure 3: Examples of irregular NoC topologies: (a) IG0, (b) IG1, (c) IG2.

There are several ways to generate specific-purpose irregular NoCs. For example,

Srinivasan, Chatha, and Konjevod proposed to use slicing tree and linear programming

(SRINIVASAN; CHATHA; KONJEVOD, 2005). Pinto, Carloni, and Sangiovanni-Vincentelli

applied a heuristic to a previously proposed Constraint-Driven Communication Synthesis

(PINTO; CARLONI; SANGIOVANNI-VINCENTELLI, 2003). Ho and Pinkston’s work is based

on a recursive bisection technique (HO; PINKSTON, 2003). Metaheuristics are also commonly used for generation; a few examples are the works of (KREUTZ et al., 2005), (NEEB; WEHN, 2008), (MESQUITA, 2016), and (CHOUDHARY et al., 2010).

2.3.2 Fault Tolerance

Faults may occur in both regular and irregular topologies, and may affect links, routers, or even cores (AZAD et al., 2016). There are two types of faults: transient and permanent. Transient faults may be the result of noise or interference (MILFONT et al., 2017). They are hard to correct, but do not compromise the behaviour of the circuit for a long period (MILFONT et al., 2017). In contrast, permanent faults may happen due to physical damage or fabrication problems (MILFONT et al., 2017). NoCs are jeopardised by permanent faults, and many works focus on dealing with them (AZAD et al., 2016), increasing the circuit's lifespan.

Faults may turn a topology unfeasible, e.g. by creating two isolated (mutually unreachable) areas, which is clearly an undesirable scenario (CHANG et al., 2011). Some examples are illustrated by Graphs DISCG0, DISCG1, DISCG2, and DISCG3 (Figures 4a, 4b, 4c, and 4d, respectively). In Figures 4a and 4c, the dotted nodes represent faulty routers, while in Figures 4b and 4d, the dotted lines represent faulty links. Graphs DISCG0 and DISCG1 illustrate faults turning regular NoCs unfeasible. Similarly, DISCG2 and DISCG3 represent disconnected NoCs after the failures.

Figure 4: Examples of areas isolated in NoCs after faults: (a) DISCG0, (b) DISCG1, (c) DISCG2, (d) DISCG3.

There are two approaches to amortise the impacts of a fault: architecture level, and

system and application level approaches (AZAD et al., 2016). Architecture level approaches

tackle fault-tolerance by adding redundant components, whether routers, links, or cores

(AZAD et al., 2016; CHANG et al., 2011; ZHANG et al., 2009). System and application level

approaches tackle the problem by adding software flexibility, e.g. routing algorithms (AZAD

et al., 2016).

2.4 Task Graphs

A Task Graph (TG) describes an application subdivided into tasks. Tasks may depend

on each other. A TG is commonly modelled as a directed graph, where the vertices and edges represent tasks and the dependencies between them, respectively. Edges are often weighted, possibly representing communication cost or duration. Figure 5 illustrates a TG

generated with Task Graphs For Free (DICK; RHODES; WOLF, 1998). In this TG, task 3

depends on task 2, and the communication cost from 2 to 3 is 18.
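As an illustration only, such a TG can be held in a small weighted directed structure. The sketch below is written in Python for brevity and is not the thesis implementation; the dictionary layout and the helper names `depends_on` and `communication_cost` are assumptions. Only the edge from task 2 to task 3 with cost 18 is taken from the text.

```python
# Task Graph as a weighted directed graph: (source, target) -> communication cost.
# Assumption: a plain dictionary is enough for illustration purposes.
task_graph = {(2, 3): 18}  # task 3 depends on task 2, with cost 18


def depends_on(tg, task, other):
    """Return True if `task` depends on `other`, i.e. the edge other -> task exists."""
    return (other, task) in tg


def communication_cost(tg, src, dst):
    """Return the weight of the edge src -> dst, or 0 if the edge is absent."""
    return tg.get((src, dst), 0)
```

Because the graph is directed, `depends_on(task_graph, 3, 2)` holds while `depends_on(task_graph, 2, 3)` does not.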

Figure 5: A Task Graph.

In the Multi-Processor System-on-Chip (MPSoC) context, NoCs may be used for MPSoC design, while TG tasks are mapped to MPSoC cores. There is not necessarily a

bijection between TG and NoC edges. Mapping a TG to a NoC falls into the QAP category

(BOKHARI, 1981; ROCHA, 2017). Irregular NoC topologies may be generated according to

TGs using metaheuristics, such as Simulated Annealing (NEEB; WEHN, 2008), and Genetic

Algorithm (CHOUDHARY et al., 2010; MESQUITA, 2016).

2.5 Metaheuristics

The core of Computer Science is to model problems mathematically so that their solutions can be calculated by a computer. Problems may be classified into various categories according to different aspects, e.g. running time and memory usage. Regarding running time, there exist problems known to be efficiently solvable on a computer (P class), i.e.

problems that require a polynomial number of operations. On the other hand, among

other characteristics, the NP class consists of decision problems for which a solution can

be verified in polynomial time (CORMEN et al., 2009). Some examples of NP problems

are the decision versions of the Quadratic Assignment Problem (BOKHARI, 1981), the

Classical Vehicle Routing Problem (CVRP) (GENDREAU; POTVIN et al., 2010), and the

Vertex Cover Problem (GAREY; JOHNSON; STOCKMEYER, 1974).

Although P ⊆ NP, it is unknown whether P = NP. Thus, there are NP problems for which no polynomial-time algorithm is known. Nevertheless, it is desirable to find efficient solutions even for these problems. There are techniques capable of finding the best solution (exact algorithms), for instance exhaustive search and branch-and-bound. Exhaustive search visits all the solutions looking for the optimal one. Branch-and-bound visits some solutions while pruning part of the search space. Pruning occurs only if it can be mathematically proved that no solution in the pruned space is better than the current best one (BALAS; TOTH, 1983). However, the computational time for non-small problem instances is often unfeasible.

Depending on the desired results, some non-optimal solution may be sufficient (solutions different from the best one). The CVRP is an example of such a problem because it may be desirable to obtain the best solution possible in a limited period of time (GENDREAU; POTVIN, 2005). These solutions are denominated local optima, in contrast to the global optimum. Thus, strategies for finding local optima while still searching for the global optimum were developed. Metaheuristics are an example of such strategies: they are controlled local searches capable of finding multiple local optima, and are often analogous to some natural phenomenon (GENDREAU; POTVIN et al., 2010). Some metaheuristic examples are

Simulated Annealing, Tabu Search, Genetic Algorithms, Memetic Algorithms, et cetera.

Hybridisation is also possible.

Metaheuristics are used for generating irregular topologies because this generation is an NP-hard problem, a variation of the Steiner Tree Problem (RAVI et al., 2001; MESQUITA, 2016).

2.5.1 Tabu Search

The main idea of the Tabu Search is to find local optima, while escaping from recently

found solutions (GENDREAU; POTVIN, 2005). One solution can be obtained from another

by performing a neighbourhood step operation on the Search Space using the Neighbour-

hood Structure (GENDREAU; POTVIN, 2005). A neighbourhood step operation depends on

how the problem is modelled, and on how a movement is defined (GENDREAU; POTVIN,

2005). A movement may be described as swapping, removing, or adding Neighbourhood

Structure elements, et cetera.

One of the key factors of Tabu Search is to prevent the search from visiting a solution

multiple times. This is achieved with a short term memory – called Tabu List – that stores

recently performed neighbourhood movements (GENDREAU; POTVIN, 2005). Essentially,

if a Neighbour Solution contains a Tabu Movement, it is not considered in the Search

Space. Tabu Lists are often implemented as circular queues and their size depends on the problem and the performed experiments (GENDREAU; POTVIN, 2005). In this case, the union operation (insertion) may remove the oldest inserted element. A Tabu List may be too restrictive or too permissive, depending on how the problem is modelled or even on the desired results.
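The circular-queue behaviour can be sketched in a few lines. This is a minimal Python illustration, not the thesis code; the class name `TabuList` and its methods are assumptions. It relies on `collections.deque` with `maxlen`, which discards the oldest element when a new one is appended to a full queue.

```python
from collections import deque


class TabuList:
    """Fixed-size short-term memory of recently performed movements."""

    def __init__(self, size):
        # A bounded deque drops its oldest entry automatically when full.
        self._moves = deque(maxlen=size)

    def add(self, move):
        self._moves.append(move)

    def __contains__(self, move):
        return move in self._moves

    def __len__(self):
        return len(self._moves)
```

With a size of 2, adding three movements in a row leaves only the last two in the list, so the first one becomes available again.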

The Tabu Search skeleton is described in Algorithm 1. It is the same algorithm as the one described by Gendreau and Potvin (GENDREAU; POTVIN, 2005), but with a different notation. It is important to emphasise some points also highlighted by the authors. The termination criterion depends on the problem, though it is usually defined as a number of iterations. The fitness() function is used to rank and evaluate different solutions. The selectBestNeighbour() function returns the best neighbour of the current solution considering movements not in the Tabu List. Gendreau and Potvin also state that an Aspiration Criterion may be necessary while searching the neighbourhood (GENDREAU; POTVIN, 2005). For instance, if the fitness of a solution containing a Tabu Movement is better than that of the best solution found, the movement should be performed even though it is tabu.

Algorithm 1 Tabu Search Algorithm Skeleton.
 1: function Tabu Search skeleton
 2:     S ← S0    ▷ creates initial solution and sets current solution
 3:     BS ← S0    ▷ sets the initial solution as the best found
 4:     TL ← ∅    ▷ initialises Tabu List
 5:     while ¬ termination criterion do
 6:         S ← selectBestNeighbour(S, TL)
 7:         if fitness(S) < fitness(BS) then
 8:             BS ← S    ▷ saves current solution as the best
 9:         end if
10:         TL ← TL ∪ {performedMovement}
11:     end while
12: end function
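The skeleton can be exercised on a toy problem. The following Python sketch is an illustration under stated assumptions and not the implementation used in this work: the function signature, the representation of a movement as the first element of a `(move, solution)` pair, and the inline aspiration test are all hypothetical choices. The toy problem minimises (x − 7)² over the integers, and a movement is identified by the value it moves to.

```python
from collections import deque


def tabu_search(initial, neighbours, fitness, max_iterations, tabu_size):
    """Minimise `fitness`; `neighbours(s)` returns (move, solution) pairs."""
    current = best = initial
    tabu = deque(maxlen=tabu_size)  # circular queue of recent movements
    for _ in range(max_iterations):  # termination: fixed number of iterations
        candidates = [
            (move, sol)
            for move, sol in neighbours(current)
            # Aspiration: a tabu movement is allowed if it beats the best solution.
            if move not in tabu or fitness(sol) < fitness(best)
        ]
        if not candidates:
            break
        move, current = min(candidates, key=lambda ms: fitness(ms[1]))
        if fitness(current) < fitness(best):
            best = current
        tabu.append(move)
    return best


# Toy usage: one-step moves on the integer line, starting from 0.
result = tabu_search(
    0,
    lambda x: [(x + 1, x + 1), (x - 1, x - 1)],
    lambda x: (x - 7) ** 2,
    max_iterations=20,
    tabu_size=3,
)
```

After reaching 7, the search is pushed away from it because recently visited values are tabu, yet the best solution found is preserved in `best`.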

3 Related Works

There are several works in the literature regarding irregular NoC topologies. Two

major areas are of interest: performance and fault tolerance. The performance area focuses on improvement by generating topologies; routing algorithms; physical simulations; etc.

On the other hand, fault-tolerance embraces topics such as maintenance of a regular NoC

using spare routers and virtual topologies; topology reconfiguration; built-in router self-

diagnosis; detection and handling of transient and permanent faults; routing algorithms,

fault tolerance on routers and links; et cetera (SALMINEN; KULMALA; HAMALAINEN, 2008;

RADETZKI et al., 2013).

3.1 Performance

Two possible ways to enhance performance are to improve the routing algorithms, and

to generate application-specific topologies. Two works about routing algorithms and four

about topology generation are henceforth mentioned: routing table minimisation (MOTA et

al., 2016), fault-tolerant enhanced odd-even XY routing algorithm (ABEDNEZHAD; ALAVI,

2017), design of irregular topologies for heterogeneous NoCs (NEEB; WEHN, 2008), lin-

ear programming (SRINIVASAN; CHATHA; KONJEVOD, 2006), ant lion optimisation (VEN-

KATARAMAN; KUMAR, 2019), and the genetic algorithm (MESQUITA, 2016).

There are multiple works about routing algorithms for irregular NoCs. Most of the works focus on ensuring deadlock-free algorithms, and the method used to guarantee it directly affects performance. In addition, routing algorithms are often applied to irregular NoCs to increase their lifespan. For instance, (MOTA et al., 2016) focuses on reducing the size of the routing table to improve performance. As another example, (ABEDNEZHAD; ALAVI, 2017) uses a hybrid approach to obtain a fault-tolerant, deadlock-free routing algorithm. Their solution acts as an XY routing algorithm by default and uses the enhanced odd-even model when a faulty link is found.

The work of Neeb and Wehn uses Simulated Annealing to map tasks to a bidirectional

chain topology; then, edges are added to it by a greedy algorithm (NEEB; WEHN, 2008).

The obtained graphs are compared to mesh, torus, and spidergon topologies.

(SRINIVASAN; CHATHA; KONJEVOD, 2006) generate application-specific NoCs by using Linear Programming. Their objective is to minimise power consumption while maximising performance. Thus, the physical size of the components and the distance between them are considered throughout the process.

Venkataraman and Kumar also proposed to decrease power consumption for applic-

ation specific topologies. Their work uses an ant lion optimisation technique to generate

the topologies (VENKATARAMAN; KUMAR, 2019). In addition, redesigning the router ar-

chitecture helped to improve the obtained results (VENKATARAMAN; KUMAR, 2019).

The Genetic Algorithm proposed in (MESQUITA, 2016) generates irregular topologies

from a 2D-Mesh population. The population is submitted to mutations, where links circuit

components may be removed. The implemented algorithm uses single-point crossover.

Due to the nature of the problem, however, single-point crossover may not contribute to the exploration of multiple neighbourhoods. In addition, improved results may be achieved if the initial population also contains different individuals, such as Torus and Honeycomb.

However, for a few works, it is necessary to review some implementation details in order to explore a wider range of solutions. In addition, the presented works focus solely on performance, while fault tolerance is mentioned as a desired feature for future projects. In contrast, the proposed work focuses on generating topologies that are simultaneously efficient and fault-tolerant.

3.2 Fault Tolerance

Two usual ways to add fault tolerance to NoCs are through routing algorithms, or com-

ponent redundancy. For component redundancy, one may simply duplicate the resources

(links, routers or PEs); create alternative paths inside routers; etc.

The proposed work focuses on topology link redundancy, ensuring alternative paths between two routers. Six complementary works are thus highlighted: a lightweight fault-tolerant mechanism (KOIBUCHI et al., 2008), fault-isolation circuits (LIN et al., 2009), a fault-tolerant honeycomb model (YANG et al., 2016), De Bruijn's algorithm (HOSSEINABADY et al., 2007), bio-inspired algorithms (BECKER; KRÖMKER; SZCZERBICKA, 2015), and the Poorest Neighbour approach (SHAH; KANNIGANTI; SOUMYA, 2017).

The lightweight fault-tolerant mechanism achieves fault tolerance by adding redundant components (KOIBUCHI et al., 2008). However, the work's premise is to duplicate simple components, since they are less susceptible to failure (KOIBUCHI et al., 2008). In summary, it prevents failures on routers by adding alternative paths bypassing the crossbar (KOIBUCHI et al., 2008). The work of Lin et al. uses a similar strategy (LIN et al., 2009).

In the fault-tolerant honeycomb model (YANG et al., 2016), tolerance is achieved by

adding one extra input/output link per processing element. This is achieved by adding

a spare router in the centre of each hexagon. The spare router is therefore connected

to the six processing elements. Hence, this approach handles faults in links and routers.

A message will move around a faulty link by passing through the spare router, though

there is an overhead increase. If a router fails, the corresponding processing element would

normally become inaccessible. The honeycomb model solves this problem by connecting

two routers to a processing element. Thus, a new hexagon is simulated by nearby spare

routers. In the proposed work, it was decided not to use this technique since a honeycomb

model tends to occupy larger areas.

(SHAH; KANNIGANTI; SOUMYA, 2017) state that De Bruijn's graph is widely applied in Bioinformatics. De Bruijn's algorithm uses mathematical formulae to determine if two nodes should be connected (HOSSEINABADY et al., 2007). The binary version of the algorithm focuses on associating every possible node to four edges, with only a few exceptions (HOSSEINABADY et al., 2007). De Bruijn's algorithm achieves 100% fault tolerance for links (SHAH; KANNIGANTI; SOUMYA, 2017). However, this approach is unfeasible because nonplanar graphs may be generated.

The work developed by (BECKER; KRÖMKER; SZCZERBICKA, 2015) seems promising since it evaluates heuristics and bio-inspired algorithms to generate fault-tolerant graphs. However, the work is superficial and lacks core information, such as algorithm details.

The Poorest Neighbour Algorithm (SHAH; KANNIGANTI; SOUMYA, 2017) is a determ-

inistic algorithm that adds link fault tolerance to a NoC given its application graph.

Compared to De Bruijn's algorithm, the generated topology considerably reduces the

number of necessary links. Additionally, the authors claim that the Poorest Neighbour

achieves 100% link fault-tolerance.

While simulating the algorithm provided by the authors, problems were found. The algorithm was tested on only slightly more than forty graphs. Of these graphs, just a small subset represents applications for which fault tolerance could not be added manually and effortlessly. In addition, the algorithm has three different implementations not mentioned in the paper and, given the same graph, these implementations may produce different outputs. However, the most compromising problem is that building a NoC directly from an outputted graph may be unfeasible. This is due to the algorithm's nature: routers are simulated with unlimited ports, and links are never removed from the topology, only inserted.

Some of the presented works tackle the problem of adding fault tolerance, focusing either on regular or on irregular topologies. The works that generate fault-tolerant topologies need to be enhanced to consider necessary limitations (such as the port limit per router). Without such limitations, inapplicable (nonplanar) graphs are more likely to be obtained. On the other hand, the proposed work focuses on generating irregular topologies for which the routers have a maximum of four ports. In addition, a limit for the number of links is required. Together, these restrictions increase the odds of obtaining planar solutions.

4 Methodology

Irregular topologies outperform regular topologies for specific applications, since their structure tends to be more similar to some TG. Adding fault tolerance is desirable, and

in many cases, essential to increase the lifespan of a NoC, whether its topology is regular

or irregular. Thus, the proposed work focuses on generating high-performance irregular

NoC topologies with link redundancy to increase fault tolerance.

The proposal is to generate irregular NoC topologies with redundant links for fault

tolerance. The topologies generated are evaluated in order to estimate how latency can

be affected by the approach.

To generate the topologies, software was implemented in C++. The choice of a

high level abstraction design tool was made based on the fact that this work attempts to

explore different topologies through heuristic algorithms.

Therefore, latency will be estimated by the QAP function, i.e. number of hops weighted

by the TG weight, as detailed in section 2.2. For a given TG, the number of routers is

fixed; and the number of links is used to classify different topologies. It is also desirable to

evaluate if there exists a (number of links) limit for significant performance improvements.
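To make the latency estimate concrete, the sketch below computes a hop count weighted by the TG weight. It is a Python illustration under explicit assumptions and not the C++ code of this work: task i is assumed to be placed on router i, hops are assumed to follow shortest paths (found by breadth-first search), and the names `hop_counts` and `estimated_latency` are hypothetical. The exact formulation of Section 2.2 may differ.

```python
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: minimum number of hops from `source` to each reachable router."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def estimated_latency(tg_edges, links, n_routers):
    """Sum of weight * hops over all TG edges; infinite if a pair is disconnected."""
    adjacency = {r: set() for r in range(n_routers)}
    for a, b in links:  # links are bidirectional
        adjacency[a].add(b)
        adjacency[b].add(a)
    total = 0
    for src, dst, weight in tg_edges:
        hops = hop_counts(adjacency, src).get(dst)
        if hops is None:
            return float("inf")
        total += weight * hops
    return total
```

For a four-router ring and the TG edges (0, 2, 5) and (0, 1, 3), the estimate is 5·2 + 3·1 = 13.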

Nevertheless, topologies for efficient circuits with long lifespan are expected to be

outputted. The proposed algorithm has three main stages: initial topology generation,

best solution search and fault injection. These stages are listed in Algorithm 2.

Algorithm 2 Methodology.
1: function main(graph: TG; int: ε, tabuListSize, terminationCriterion)
2:     S ← GENERATE_INITIAL_SOLUTION(TG, ε)
3:     S ← TABU_SEARCH(TG, S, ε, tabuListSize, terminationCriterion)
4:     FAULT_INJECTION(S, TG)
5: end function

The first step randomly generates an initial feasible solution given a TG, and an ε

value. The second step uses Tabu Search and the QAP function to generate a local op-

timal solution. The third step stresses the generated topology multiple times by randomly

choosing links to fail.

4.1 Definitions and Assumptions

There is a set of definitions and assumptions that will be used throughout the remaining sections. The topics to be discussed are solution representation and feasible solution.

4.1.1 Solution Representation

Any NoC (consequently any solution) can be represented as a graph. Thus, there

is a bijection between nodes (vertices) of a graph and routers of a topology, and edges

of a graph and links of a topology. These terms are hereafter used interchangeably. For

the current problem, a TG is read as an adjacency, while a solution is represented as a

triangular adjacency matrix with no main diagonal.

Due to problem restrictions, the links on solutions are bidirectional. Thus, solutions

can be represented as symmetric adjacency matrices. Since self-loops are not allowed, it is

not necessary to store the matrix’s main diagonal. Hence, in order to save memory, only

the elements below the main diagonal are stored. For example, the graph represented by

the matrix on the left would be stored as the matrix on the right,

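$$
\begin{bmatrix}
0 & 10 & 0 & 7 & 5 & 0 \\
10 & 0 & 1 & 0 & 2 & 0 \\
0 & 1 & 0 & 4 & 0 & 0 \\
7 & 0 & 4 & 0 & 2 & 2 \\
5 & 2 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 & 0
\end{bmatrix}
\qquad
\begin{bmatrix}
10 & & & & \\
0 & 1 & & & \\
7 & 0 & 4 & & \\
5 & 2 & 0 & 2 & \\
0 & 0 & 0 & 2 & 0
\end{bmatrix}. \tag{4.1}
$$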

Thus, by Arithmetic Progression, instead of storing $|V|^2$ edges, only

$$
\frac{n(a_1 + a_n)}{2} = \frac{|V|\,(0 + |V| - 1)}{2} \tag{4.2}
$$

$$
= \frac{|V|^2 - |V|}{2} \tag{4.3}
$$

edges are stored.

Throughout the Tabu Search, the only information used for solution edges is whether

they exist or not. Thus, the solution is stored as a boolean matrix with no main diagonal.

For instance, the previous graph would be represented as the following matrix:

$$
\begin{bmatrix}
1 & & & & \\
0 & 1 & & & \\
1 & 0 & 1 & & \\
1 & 1 & 0 & 1 & \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}. \tag{4.4}
$$

This matrix representation was implemented to represent graphs. However, the defin-

itions of Section 2.1 will be used in the pseudocodes. Henceforth, a solution will be rep-

resented as an undirected and unweighted graph.
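The triangular storage can be illustrated with a flat array and an index formula. The sketch below is a Python illustration, not the C++ implementation of this work; the class name `TriangularBoolMatrix` and the row-major layout of the lower triangle are assumptions. For i > j, the pair (i, j) is assumed to live at position i(i − 1)/2 + j.

```python
class TriangularBoolMatrix:
    """Undirected, unweighted graph stored as the strict lower triangle of its adjacency matrix."""

    def __init__(self, n):
        self.n = n
        self.cells = [False] * (n * (n - 1) // 2)  # (|V|^2 - |V|) / 2 entries

    def _index(self, i, j):
        if i == j:
            raise ValueError("self-loops are not stored")
        if i < j:  # symmetric matrix: always address the lower triangle
            i, j = j, i
        return i * (i - 1) // 2 + j

    def set_edge(self, i, j, present=True):
        self.cells[self._index(i, j)] = present

    def has_edge(self, i, j):
        return self.cells[self._index(i, j)]


# The six-vertex example graph: one entry per non-zero weight below the diagonal.
example = TriangularBoolMatrix(6)
for i, j in [(1, 0), (2, 1), (3, 0), (3, 2), (4, 0), (4, 1), (4, 3), (5, 3)]:
    example.set_edge(i, j)
```

Six vertices require 15 cells instead of 36, and `has_edge(0, 1)` and `has_edge(1, 0)` read the same cell.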

4.1.2 Feasible Solution

Henceforth, a solution is considered feasible if it meets the following restriction. Assume that the corresponding solution graph is S(V, E), and that degree(v) is a function that returns the degree of vertex v, i.e. the number of edges incident to it. Thus, the restriction is described by

$$
\forall v \, (v \in S_V \rightarrow 2 \le \mathrm{degree}(v) \le 4). \tag{4.5}
$$

This restriction guarantees that the routers have a standard design, with a maximum

of four output ports. This router architecture is commonly found in regular NoCs, such

as Mesh-2D, and Torus (JANTSCH; TENHUNEN et al., 2003). Thus, the obtained solutions

are more likely to have lower power consumption, and simpler design if compared to the

solutions generated by the Poorest Neighbour (SHAH; KANNIGANTI; SOUMYA, 2017), and

De Bruijn’s algorithm (HOSSEINABADY et al., 2007). In addition, this restriction aims to

increase reliability through redundancy since there are at least two edges through which

it is possible to reach a node, i.e. at least one alternative path. Therefore, link redundancy

is achieved; potentially improving reliability.

It is important to highlight that Equation 4.5 is not sufficient to guarantee graph

planarity. Therefore, non-planar graphs may be generated by the Algorithm and classified

as feasible solutions. In other words, some solutions may not be used in 2D-MPSoC design.
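The degree restriction is straightforward to check. The following Python sketch is only an illustration; the function names `degrees` and `is_feasible` and the edge-list input format are assumptions. As discussed above, it checks the degree bounds only, not planarity or connectivity.

```python
def degrees(n_vertices, edges):
    """Return the degree of every vertex 0..n_vertices-1 of an undirected graph."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_feasible(n_vertices, edges):
    """True if every vertex has between two and four incident edges."""
    return all(2 <= d <= 4 for d in degrees(n_vertices, edges))
```

A four-vertex ring is feasible, whereas a simple path is not, because its end vertices have degree 1.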

4.2 Initial Topology Generation

In order to generate an initial feasible solution for a given combination of TG and

ε, there are two not necessarily distinct possible scenarios. First, the number of edges in

the original TG may be different from the ε value; thus it is necessary to remove or add

edges until both values match. Second, the TG may not be feasible; thus, it is necessary

to move some existent edges until the condition of Equation 4.5 is met.

For some values of ε, no feasible solution is possible. Thus, it is necessary to assert its value before initiating the process of generating an initial topology. This condition can be easily verified using the Handshaking Lemma (WILSON, 1979),

$$
\sum_{v \in V} \mathrm{degree}(v) = 2|E|. \tag{4.6}
$$

Since Equation 4.5 bounds every degree between 2 and 4, the sum of the degrees lies between 2|V| and 4|V|. Therefore, with |E| = ε, a solution satisfying Equation 4.5 can exist only if

$$
2 \le \frac{2\varepsilon}{|V|} \le 4. \tag{4.7}
$$
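The check on ε is a one-line test. In this Python sketch (an illustration, with the hypothetical name `epsilon_is_valid`), the inequality is multiplied by |V| so that only integer arithmetic is needed. Note that the test is a necessary condition only: it does not guarantee that a particular graph with ε edges is feasible.

```python
def epsilon_is_valid(n_vertices, epsilon):
    """Check 2 <= 2*epsilon/|V| <= 4 without division (|V| must be positive)."""
    return n_vertices > 0 and 2 * n_vertices <= 2 * epsilon <= 4 * n_vertices
```

For six vertices, the accepted values of ε range from 6 to 12.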

In summary, the algorithm for generating the initial topology (Algorithm 3) is divided into four steps: asserting that ε is valid, converting the directed edges TG_E to undirected ones, fitting |E| to ε, and making the solution feasible. The last two steps are discussed in Sections 4.2.1 and 4.2.2, respectively.

Algorithm 3 Generate Initial Solution Graph.
 1: function GENERATE_INITIAL_SOLUTION(graph: TG, int: ε)
 2:     if 2 ≤ 2ε/|TG_V| ≤ 4 then
 3:         S_E ← {{v1, v2} | ∃e ∃w (e ∈ TG_E ∧ (e = (v1, v2, w) ∨ e = (v2, v1, w)))}
 4:         S ← (TG_V, S_E)
 5:         S ← FIT_TO_EPSILON(S, ε)
 6:         S ← MAKE_FEASIBLE(S)
 7:         return S
 8:     end if
 9:     return (∅, ∅)
10: end function

The solution is represented as an undirected, unweighted graph. Therefore, the conversion process (Algorithm 3, line 3) "removes" the direction and weight of each edge before adding it to the graph. This process can be visualised in Figure 6. Suppose that Graph TG = GISEC0

(Figure 6a). Then, both edges (5, 2, 6) – i.e. edge from node 5 to 2 with weight 6 –, and

(2, 5, 2) would be converted to {2, 5}; thus, two edges were collapsed to one. In addition,

edge (1, 0, 4) would be converted to {0, 1}. After all conversions are performed, the res-

ulting Graph S is represented in Figure 6b (Graph GISEC1). Note that the obtained

Graph is undirected and unweighted.

Figure 6: Example of Task Graph edge conversion: (a) GISEC0, (b) GISEC1.
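The conversion amounts to a set comprehension. The sketch below is a Python illustration, not the code of this work; the function name `to_undirected` and the use of `frozenset` for unordered pairs are assumptions. Only the three edges named in the text are used as input.

```python
def to_undirected(tg_edges):
    """Drop direction and weight: (u, v, w) becomes the unordered pair {u, v}."""
    return {frozenset((u, v)) for u, v, _weight in tg_edges}


# Edges named in the text: (5, 2, 6) and (2, 5, 2) collapse into {2, 5}.
converted = to_undirected([(5, 2, 6), (2, 5, 2), (1, 0, 4)])
```

Because the result is a set, opposite directed edges between the same pair of nodes are stored once.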

4.2.1 Fitting to Epsilon

This stage guarantees that the initial solution will have a number of edges corresponding to the epsilon restriction parameter (ε = |E|). For example, if ε < |E|, then edges

need to be removed from the graph. On the other hand, if ε > |E|, then edges will be

added to the graph. In order to explore a wider range of solutions in multiple executions,

these edges are randomly deleted and inserted, as detailed in Algorithm 4.

Algorithm 4 Fit Solution's number of edges to ε.
1: function FIT_TO_EPSILON(graph: S, int: ε)
2:     while ε < |S_E| do
3:         S_E ← S_E − {random(S_E)}
4:     end while
5:     while ε > |S_E| do
6:         S_E ← S_E ∪ {random(S_E^C)}
7:     end while
8:     return S
9: end function
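Algorithm 4 can be sketched as follows. This Python version is an illustration under assumptions: edges are `frozenset` pairs, the complement edge set is enumerated with `itertools.combinations`, and the optional `rng` parameter is a hypothetical addition that makes runs reproducible.

```python
import random
from itertools import combinations


def fit_to_epsilon(vertices, edges, epsilon, rng=random):
    """Randomly delete or insert edges until the graph has exactly `epsilon` edges."""
    edges = set(edges)
    while len(edges) > epsilon:
        # Sorting gives a reproducible order before the random pick.
        edges.remove(rng.choice(sorted(edges, key=sorted)))
    # Complement edge set: every vertex pair that is not yet linked.
    absent = [frozenset(p) for p in combinations(sorted(vertices), 2)
              if frozenset(p) not in edges]
    rng.shuffle(absent)
    while len(edges) < epsilon and absent:
        edges.add(absent.pop())
    return edges
```

When edges are added, the original edges are all kept; when edges are removed, the result is a subset of the original set.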

4.2.2 Making Feasible

This stage guarantees that the initial solution is feasible. For instance, Graph UNF

does not represent a feasible solution because nodes 4 and 7 have degree 5, and nodes

0 and 2 have degree 1 (Figure 7). In addition, the condition stated in Equation 4.7

would be true for Graph UNF if ε = 13. Therefore, it could have been outputted

from the FIT_TO_EPSILON function (Section 4.2.1). In order to properly under-

stand the MAKE_FEASIBLE() function, it is necessary to comprehend the behaviour of

DEL_EDGE() and ADD_EDGE() functions.

Figure 7: Example of unfeasible solution – UNF .

4.2.2.1 Deleting Edges - DEL_EDGE()

To select an edge to be deleted, this function simply selects the node with the largest degree (ldn). Then, it selects the neighbour of ldn with the largest degree (ldneigh). Afterwards, it removes the edge between these two nodes. If multiple vertices have the largest degree, any of them can be chosen randomly. The algorithm's pseudocode is described in Algorithm 5.

Algorithm 5 Deletes edge with largest degree incident nodes possible.
1: function DEL_EDGE(graph: S)
2:     ldn ← random(argmax(degrees(S_V)))    ▷ largest degree node
3:     edges ← {e | e ∈ S_E ∧ ldn ∈ e}    ▷ edges incident to ldn
4:     neighs ← {v | v ∈ e ∧ v ≠ ldn ∧ e ∈ edges}    ▷ nodes adjacent to ldn
5:     ldneigh ← random(argmax(degrees(neighs)))    ▷ largest degree neighbour
6:     edge ← {ldn, ldneigh}
7:     S_E ← S_E − {edge}
8:     return S
9: end function
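A possible rendering of Algorithm 5 is sketched below in Python. It is illustrative only: edges are assumed to be `frozenset` pairs, the graph is assumed to contain at least one edge, and returning the deleted edge alongside the new edge set is a convenience not present in the pseudocode.

```python
import random


def del_edge(vertices, edges, rng=random):
    """Remove the edge between a largest-degree node and its largest-degree neighbour."""
    deg = {v: 0 for v in vertices}
    for e in edges:
        for v in e:
            deg[v] += 1
    top = max(deg.values())
    ldn = rng.choice(sorted(v for v in vertices if deg[v] == top))
    neighs = sorted(v for e in edges if ldn in e for v in e if v != ldn)
    top_neigh = max(deg[v] for v in neighs)
    ldneigh = rng.choice([v for v in neighs if deg[v] == top_neigh])
    removed = frozenset((ldn, ldneigh))
    return edges - {removed}, removed
```

In a graph where nodes 0 and 1 both have the largest degree and are adjacent, the edge {0, 1} is removed whichever of the two is drawn first.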

To illustrate this procedure, suppose that S = DUE0 (Figure 8a). In Graph DUE0, degree(v) = 4 for every v ∈ {0, 1, 3, 5}. Hence, any of these nodes can be selected. Let ldn = 0; since degree(1) = degree(5) = 4, either of two edges can be deleted: {0, 1} or {0, 5}. This scenario is illustrated by Graph DUE1 (Figure 8b), where removable edges are dotted. If {0, 1} is chosen, Graph DUE2 (Figure 8c) is obtained. If the function were called for Graph DUE2 (S = DUE2), edge {3, 5} would be deleted. The resulting Graph is illustrated in Figure 8d, where the removed edge is dotted.

Figure 8: Example of Delete Edges Until Epsilon: (a) DUE0, (b) DUE1, (c) DUE2, (d) DUE3.

4.2.2.2 Adding Edges - ADD_EDGE()

This function simply selects the nodes with smallest degrees (sdn, and sdn2), and

adds an edge between them. The Algorithm expects a set of prohibited edges, a “Tabu

List” named TL. If the Algorithm attempts to add edge {sdn, sdn2}, and it already exists

({sdn, sdn2} ∈ SE), or it is tabu ({sdn, sdn2} ∈ TL); it is necessary to change vertex

sdn2 to a previously unvisited smallest degree node. The process is repeated until an edge

can be inserted into the Graph. This Algorithm’s behaviour is detailed in Algorithm 6.

The random function is necessary because there may exist multiple smallest degree nodes

in a Graph.

For example, suppose that S = AUE0 (Figure 9a), and TL = ∅. The possible values for sdn are 1, 2, 3, 4, or 5. Suppose that sdn = 3. Then, the possible values for sdn2 are 1, 2, 4, or 5. If sdn2 = 4, the edge {sdn, sdn2} cannot be added to the Graph since {3, 4} ∈ S_E. This scenario is illustrated by Graph AUE1 (Figure 9b), where the dashed edges are addable, and the dotted edge cannot be inserted. If sdn2 = 5, then the Graph S obtained is represented by AUE2 (Figure 9c). If the function is called for AUE2 and TL = {{1, 2}}, the possible values for sdn and sdn2 are sdn = random({1, 2, 4}) and sdn2 = random({1, 2, 4} − {sdn}). In other words, one random edge in the set {{1, 2}, {1, 4}, {2, 4}} will be inserted. However, {1, 2} ∈ TL, thus it cannot be inserted into the Graph. Graph AUE3 (Figure 9d) illustrates this scenario, where dashed links can be added, while the dotted link cannot.

Algorithm 6 Adds edge between the two nodes with smallest degrees possible.
 1: function ADD_EDGE(graph: S, tabulist: TL)
 2:     sdn ← random(argmin(degrees(S_V)))    ▷ smallest degree node
 3:     V ← {sdn}    ▷ set of visited vertices
 4:     repeat
 5:         sdn2 ← random(argmin(degrees(S_V − V)))
 6:         edge ← {sdn, sdn2}
 7:         V ← V ∪ {sdn2}
 8:     until edge ∉ S_E ∧ edge ∉ TL
 9:     S_E ← S_E ∪ {edge}
10:     return S
11: end function
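Algorithm 6 can be sketched in the same style. The Python code below is an illustration, not the implementation of this work; edges are assumed to be `frozenset` pairs, and the `ValueError` raised when every candidate is exhausted is an added safeguard that the pseudocode does not specify.

```python
import random


def add_edge(vertices, edges, tabu=frozenset(), rng=random):
    """Link a smallest-degree node to the smallest-degree node it may be joined with."""
    deg = {v: 0 for v in vertices}
    for e in edges:
        for v in e:
            deg[v] += 1
    low = min(deg.values())
    sdn = rng.choice(sorted(v for v in vertices if deg[v] == low))
    remaining = set(vertices) - {sdn}  # vertices not visited yet
    while remaining:
        low2 = min(deg[v] for v in remaining)
        sdn2 = rng.choice(sorted(v for v in remaining if deg[v] == low2))
        remaining.discard(sdn2)
        edge = frozenset((sdn, sdn2))
        if edge not in edges and edge not in tabu:
            return edges | {edge}, edge
    raise ValueError("no edge can be added from the chosen node")  # assumption
```

If the first candidate edge is tabu or already present, the next smallest-degree unvisited node is tried, as in the repeat loop of the pseudocode.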

Figure 9: Example of Add Edges Until Epsilon: (a) AUE0, (b) AUE1, (c) AUE2, (d) AUE3.

4.2.2.3 Make Feasible Algorithm

The algorithm presented in this section swaps edges’ positions until the obtained

solution is feasible (Equation 4.5). The algorithm consists of deleting edges from the

nodes with the largest degrees and adding them between the nodes with smallest degrees

until the solution is feasible. The last deleted edge cannot be added in the same iteration

or an infinite loop may occur. There is a chance that a disconnected graph is generated

during this process. The MAKE_FEASIBLE() function is detailed in Algorithm 7.

Algorithm 7 Make a Solution Feasible.
1: function MAKE_FEASIBLE(graph: S)
2:     while min(degrees(S_V)) < 2 ∨ max(degrees(S_V)) > 4 do
3:         S2 ← DEL_EDGE(S)
4:         TL ← S_E − S2_E    ▷ identifies deleted edge and creates Tabu List
5:         S ← ADD_EDGE(S2, TL)
6:     end while
7:     return S
8: end function

Suppose, for example, that S = MFG0 (Figure 10a). Then, during the first iteration, either edge {0, 3} or {0, 5} is deleted, and the result is stored in S2. If {0, 5} is randomly chosen, then S2 = MFG1 (Figure 10b), where the deleted edge is dotted. Afterwards, the Algorithm randomly adds one of the edges {1, 4}, {2, 4}, or {5, 4}. If edge {2, 4} is chosen, then Graph MFG2 (Figure 10c) is obtained. The removed and inserted edges are respectively drawn as dotted and dashed lines in the Figure. After this step, the Algorithm would stop since MFG2 is a feasible solution.

Figure 10: Example of making an unfeasible solution feasible: (a) MFG0, (b) MFG1, (c) MFG2.

4.3 Best Solution Search – Tabu Search

Given an initial feasible solution, a local search (Tabu Search) is performed to optimise

the QAP function (Section 2.2). The Tabu Search skeleton was discussed in Section 2.5.1.

The Tabu Search per se is not complicated. However, as stated by Gendreau and Potvin, it is necessary to properly analyse the problem in hand in order to efficiently represent a solution and the neighbourhood step. The Tabu List contains recently deleted edges, preventing them from being added again to the solution in the next few iterations. The major efforts of the proposed Tabu Search focus on the Neighbourhood Search steps to explore as many feasible solutions as possible. Section 4.3.2 details the Tabu List implementation,

while Section 4.3.3 describes the Neighbourhood Search Algorithms. The behaviour of the

proposed Tabu Search is described by Algorithm 8.

Algorithm 8 Implemented Tabu Search.
1: function TABU_SEARCH(graph: TG, S0; int: ε, tabuListSize, termCrit)
2:   S ← S0
3:   BS ← S0    ▷ sets the initial solution as the best found
4:   TL ← ∅    ▷ empty circular queue
5:   count ← 0
6:   while count < termCrit do
7:     NS ← NEIGHBOURHOOD_SEARCH(S, ∅, n)    ▷ set of n random neighbours
8:     BN ← SELECT_BEST_NEIGHBOUR(NS, TG)
9:     if FITNESS(BN, TG) ≥ FITNESS(BS, TG) then
10:      REMOVE_TABU_SOLUTIONS(NS, TL)
11:      if NS = ∅ then
12:        NS ← NEIGHBOURHOOD_SEARCH(S, TL, n)
13:        if NS = ∅ then
14:          return BS    ▷ no non-tabu neighbour obtainable
15:        end if
16:      end if
17:      BN ← SELECT_BEST_NEIGHBOUR(NS, TG)
18:    end if
19:    count ← count + 1
20:    S ← BN
21:    if FITNESS(S, TG) < FITNESS(BS, TG) then
22:      BS ← S    ▷ saves current solution as the best
23:      count ← 0
24:    end if
25:    if |TL| = tabuListSize then
26:      REMOVE_OLDEST_ELEMENT(TL)
27:    end if
28:    TL ← TL ∪ {S_performedDels}
29:  end while
30:  return BS
31: end function

The termination criterion (termCrit) is defined in number of iterations with no best

solution improvement. Each iteration begins with a Neighbourhood Search (NS), return-

ing n solutions, independent of being tabu or not. In the implemented code, n = ε.

Afterwards, it is verified if the best neighbour is better than the best solution – the FIT-

NESS() function is defined as the QAP stated in section 2.2. Since NS may contain tabu

neighbours, this condition simulates an Aspiration Criterion. If the condition is not met,

tabu neighbours are removed from NS, and the best solution from the remaining ones

is chosen as the current solution. However, this set may be empty, i.e. only tabu neigh-

bours were generated. If this is the case, then a maximum of n non-tabu neighbours are

generated. The best solution of this new set is then chosen as the current solution. If the

new set is empty, then a non-tabu neighbour of S does not exist, and the Tabu Search

returns the best solution found even though the termination criterion is not yet met. Such

a scenario may happen if the Tabu List size is too large, i.e. too restrictive. The deletions

performed during the Neighbourhood Search to obtain BN (Best Neighbour) are added to

the Tabu List.
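Lines 25 to 28 of Algorithm 8 describe a fixed-size circular queue. The sketch below is one way to obtain that behaviour in Python, assuming each move is stored as a frozen set of undirected edges. The helper names `make_tabu_list` and `push_move` are hypothetical, and `collections.deque` with `maxlen` is a substitution for whatever queue the implemented code uses.

```python
from collections import deque


def make_tabu_list(tabu_list_size):
    """Create an empty tabu list that holds at most tabu_list_size moves."""
    return deque(maxlen=tabu_list_size)


def push_move(tabu_list, performed_dels):
    """Store the edges deleted in one move as a single tabu element.

    When the deque is full, appending discards the oldest element,
    which mirrors REMOVE_OLDEST_ELEMENT followed by the union in line 28.
    """
    element = frozenset(frozenset(edge) for edge in performed_dels)
    tabu_list.append(element)
    return element
```

With `tabu_list_size = 2`, pushing three moves leaves only the two most recent ones in the list.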

4.3.1 Fitness Function

The fitness function is defined as the QAP function; the better the solution, the smaller its fitness value. The behaviour of the function is described by Algorithm 9. Dijkstra’s Algorithm is used to calculate the shortest path between two nodes in the solution S. Then, the number of edges (hops) in that path is multiplied by the respective communication cost in the TG. This process is repeated for all edges in the TG, and the products are summed.

Algorithm 9 Fitness Function implementation.
1: function FITNESS(graph: S, TG)
2:   sum ← 0
3:   for all (v1, v2, w) ∈ TGE do
4:     sum ← sum + w · |SHORTEST_PATH(S, v1, v2)|
5:   end for
6:   return sum
7: end function
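The fitness computation can be sketched in Python as follows. This is an illustration under stated assumptions, not the implemented code: the topology is a list of undirected 2-tuples, the TG is a list of `(v1, v2, w)` triples with positive weights, and a breadth-first search replaces Dijkstra's Algorithm, which yields the same hop counts because every link has unit length. Unreachable pairs give an infinite fitness, matching the treatment of disconnected graphs in this chapter.

```python
import math
from collections import deque


def hops(adj, src, dst):
    """Number of edges on a shortest path, or math.inf if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


def fitness(topology_edges, task_edges):
    """Sum of w * hops(v1, v2) over all TG edges (weights assumed > 0)."""
    adj = {}
    for u, v in topology_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return sum(w * hops(adj, v1, v2) for v1, v2, w in task_edges)
```

On a four-node ring, a TG edge of weight 10 between opposite nodes contributes 20, while a TG edge of weight 5 between neighbouring nodes contributes 5.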

4.3.2 Tabu List

The Tabu List stores all the edges deleted in a single movement. Multiple edges may

be removed in a single movement, thus, a Tabu List stores sets of edges (a set of sets

of edges). For the current proposal, implementing the Tabu List simply as a set of edges

would be too restrictive, while storing all edges deleted and added in a single movement

would be a loose restriction.

Unless the Aspiration Criterion is being evaluated, the elements of the Tabu List will

neither be present in the current solution, nor in the generated neighbours. Formally, a

graph S is considered Tabu if, for a Tabu List TL,

∃te (te ∈ TL ∧ te ⊆ SE). (4.8)

For instance, if a Tabu List element contains two edges, then they cannot be simultan-

eously in the graph.

As a concrete example, suppose that Graph TLG0 corresponds to some iteration’s

current solution (Figure 11a). Also, assume that

TL = { {{5, 2}}, {{5, 0}, {4, 1}}, {{3, 0}, {5, 3}} } (4.9)

is the Tabu List for the same iteration. Then, using the movements described in Section

4.3.3, it is possible to generate neighbours Graphs TLG1, TLG2, and TLG3 (Figures 11b,

11c, and 11d, respectively).

(a) TLG0

(b) TLG1

(c) TLG2

(d) TLG3

Figure 11: Tabu List examples.

In Figure 11, dotted lines represent tabu edges, while the dashed line is part of a tabu element te, though te ⊈ SE. TLG1 and TLG2 will not be visited since they correspond to Tabu Solutions. TLG1 is Tabu because it contains edge {5, 2}, i.e. {{5, 2}} ⊆ TLG1E. TLG2 is Tabu because it contains both edges {5, 0} and {4, 1}, i.e. {{5, 0}, {4, 1}} ⊆ TLG2E. On the contrary, TLG3 is not Tabu because, although it contains edge {3, 0}, it does not contain edge {5, 3}, i.e. {{3, 0}, {5, 3}} ⊈ TLG3E.
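The tabu test of Equation 4.8 reduces to a subset check. The sketch below uses the Tabu List of Equation 4.9; the edge sets passed to `is_tabu` in the usage note are hypothetical stand-ins, since the full edge sets of TLG1, TLG2, and TLG3 are only given in Figure 11. The helper names `edge` and `is_tabu` are illustrative.

```python
def edge(u, v):
    """Undirected edge as a hashable, order-independent pair."""
    return frozenset((u, v))


def is_tabu(solution_edges, tabu_list):
    """Equation 4.8: some tabu element is entirely contained in the graph."""
    return any(te <= solution_edges for te in tabu_list)


# Tabu List of Equation 4.9: each element is a set of edges.
TL = [
    {edge(5, 2)},
    {edge(5, 0), edge(4, 1)},
    {edge(3, 0), edge(5, 3)},
]
```

A graph containing {5, 2} is tabu, a graph containing both {5, 0} and {4, 1} is tabu, and a graph containing {3, 0} but not {5, 3} is not tabu.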

4.3.3 Neighbourhood Search

Although Gendreau and Potvin state that visiting unfeasible solutions may contribute to finding the global optimum, this scenario is not desirable in the proposed work due to the large number of unfeasible solutions. For example, the simplest scenario is to generate ring topologies¹ – topologies for which all nodes have degree 2. Considering a TG with 10 vertices, a ring topology would have 10 edges as well. There are (|V|^2 − |V|)/2 = 45 possible edges (Section 4.1.1). Then, by simple Combinatorics, there are 45!/(10!(45 − 10)!) ≈ 3.19 · 10^9 possible graphs with 10 edges. However, there are only 10!/10 ≈ 3.6 · 10^5 possible ring topologies, i.e. approximately 0.0113% of the total combinations to be explored. Consequently, allowing Tabu Search to explore unfeasible solutions in this scenario is impracticable.
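The counting argument above can be reproduced with a few lines of Python. The variable names are illustrative, and the ring count follows the 10!/10 expression used in the text.

```python
from math import comb, factorial

n_vertices = 10

# (|V|^2 - |V|) / 2 possible undirected edges.
possible_edges = (n_vertices**2 - n_vertices) // 2

# Ways of choosing 10 of those edges.
graphs_with_10_edges = comb(possible_edges, n_vertices)

# Ring topologies as counted in the text: 10! / 10.
rings_as_counted = factorial(n_vertices) // n_vertices

# Fraction of 10-edge graphs that are rings.
share = rings_as_counted / graphs_with_10_edges
print(possible_edges, graphs_with_10_edges, rings_as_counted, f"{share:.4%}")
```

The exact values are 45 possible edges, 3,190,187,286 graphs with 10 edges, and 362,880 rings, so the share is roughly 0.011%.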

Therefore, the neighbourhood step focuses on visiting solely feasible solutions. This process is described by Algorithm 10. Basically, it randomly selects an existing edge to be deleted (edel ∈ SE), and randomly selects a non-existent edge to be added (eadd ∈ SCE, where SCE denotes the edges absent from S). Once the values of edel or eadd are chosen, the Algorithm attempts to generate a non-tabu solution using these edges, adding it to the set of obtained neighbours NS. If this is not possible, the edges are added to a set of non-selectable edges – Tdel for edel, and Tadd for eadd. Then, not yet explored edges are selected (edel ∉ Tdel ∨ eadd ∉ Tadd). The process is repeated until n neighbours are generated (|NS| = n) or all combinations of edel and eadd are explored (Tdel = SE). It is important to highlight that in the implemented code, n = ε.

The condition in lines 8 and 19 guarantees that the obtained neighbour is valid. It must neither be the Null Graph, N ≠ (∅, ∅), nor an already generated neighbour, N ∉ NS, nor a Tabu Neighbour after the special operations are performed, ∄te (te ∈ TL ∧ te ⊆ NE).

¹ Even though ring topologies are regular, they are the minimum possible fault tolerant topology conceivable. Therefore, they are still eligible study objects. Notwithstanding, for a given TG, it may be interesting to analyse what would be the solution (smallest latency) for the worst case scenario (minimum fault tolerance).

Algorithm 10 Neighbourhood Search
1: function NEIGHBOURHOOD_SEARCH(graph: S, tabulist: TL, int: n)
2:   Tdel ← ∅    ▷ tabu edges to del
3:   NS ← ∅    ▷ set of generated neighbours
4:   while |Tdel| < |SE| ∧ |NS| < n do
5:     edel ← random(SE − Tdel)
6:     N ← SPECIAL_DELS(S, edel, TL)
7:     if NE ≠ SE − {edel} then    ▷ some special deletion was performed
8:       if N ≠ (∅, ∅) ∧ N ∉ NS ∧ ∄te (te ∈ TL ∧ te ⊆ NE) then
9:         NS ← NS ∪ {N}    ▷ non-tabu neighbour generated
10:      else
11:        Tdel ← Tdel ∪ {edel}    ▷ edel cannot be deleted
12:      end if
13:      continue
14:    end if
15:    Tadd ← {edel}    ▷ tabu edges to add
16:    while |Tadd| < |NCE| do
17:      eadd ← random(NCE − Tadd)
18:      N ← SPECIAL_ADDS(N, eadd, TL ∪ {edel})
19:      if N ≠ (∅, ∅) ∧ N ∉ NS ∧ ∄te (te ∈ TL ∧ te ⊆ NE) then
20:        NS ← NS ∪ {N}
21:        break
22:      end if
23:      Tadd ← Tadd ∪ {eadd}
24:    end while
25:    if Tadd = NCE then    ▷ if no possible edge can be added
26:      Tdel ← Tdel ∪ {edel}    ▷ then edel cannot be deleted
27:    end if
28:  end while
29:  return NS
30: end function

Throughout Algorithm 10, it is possible to generate unfeasible solutions by either deleting or adding edges. Thus, five possible scenarios may arise depending on the chosen edges:

1. Deleting an edge between two nodes with minimum degree (Section 4.3.3.1);

2. Deleting an edge between two nodes for which one has minimum degree (Section

4.3.3.2);

3. Default scenario (Section 4.3.3.3);

4. Adding an edge between two nodes for which one has maximum degree (Section

4.3.3.4);

5. Adding an edge between two nodes with maximum degree (Section 4.3.3.5).

The first two scenarios are handled by Algorithm 11, invoked by the Neighbourhood Search in line 6. Similarly, the last two scenarios are handled by Algorithm 12, invoked by the Neighbourhood Search in line 18. The default scenario is implicitly handled by both Algorithms 11 and 12.
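The five scenarios listed above depend only on the degrees of the two end nodes of the chosen edge. The following Python sketch classifies a deletion or an addition accordingly. It assumes a feasible input graph and a dictionary of node degrees taken before the operation; the function names are hypothetical, and the returned labels reuse the operation names introduced in Sections 4.3.3.1 to 4.3.3.5.

```python
# Degree bounds taken from the feasibility condition of Algorithm 7.
MIN_DEGREE, MAX_DEGREE = 2, 4


def deletion_scenario(degree, e_del):
    """Classify deleting e_del; `degree` holds degrees BEFORE the deletion."""
    below_min = sum(1 for v in e_del if degree[v] - 1 < MIN_DEGREE)
    return {2: "swap edges' nodes", 1: "spin edge", 0: "default"}[below_min]


def addition_scenario(degree, e_add):
    """Classify adding e_add; `degree` holds degrees BEFORE the addition."""
    at_max = sum(1 for v in e_add if degree[v] >= MAX_DEGREE)
    return {2: "double spin", 1: "spin max degree", 0: "default"}[at_max]
```

For example, with degrees {0: 2, 1: 2, 2: 3, 3: 4, 4: 4, 5: 3}, deleting (0, 1) requires a swap, deleting (0, 2) requires a spin, and adding (3, 4) requires a double spin.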

Algorithm 11 is responsible for deleting a randomly selected edge from the Graph – edel. If no special operation is needed, the Graph (SV, SE − {edel}) is returned and it is necessary to select an edge to be added – Algorithm 10 line 7. On the other hand, if a special operation is performed, a distinct Graph is returned. If the special operations cannot be performed (probably due to the Tabu List), the Null Graph is returned. Therefore, it is necessary to verify whether N ≠ (∅, ∅) in the condition of Algorithm 10’s line 8. To prevent edel from being reinserted into the Graph by special deletions or additions, it is temporarily added to the Tabu List – Algorithm 11 lines 6 and 9, and Algorithm 10 line 18.

Algorithm 11 Special Neighbourhood Search deletions
1: function SPECIAL_DELS(graph: S, edge: edel, tabulist: TL)
2:   SE ← SE − {edel}
3:   performedDels ← {edel}
4:   if ∃v (v ∈ SV ∧ degree(v) < 2) then    ▷ if true, then special deletion is needed
5:     if ∀v (v ∈ edel → degree(v) < 2) then
6:       return SWAP_EDGES_NODES(S, edel, TL ∪ {edel})
7:     else
8:       {centre} ← {v | v ∈ SV ∧ v ∈ edel ∧ degree(v) < 2}
9:       return SPIN_EDGE(S, edel, centre, TL ∪ {edel})
10:    end if
11:  end if
12:  return S    ▷ no special deletion is needed
13: end function

Algorithm 12 is responsible for adding a randomly selected non-existent edge to the Graph – eadd. Before inserting eadd into the Graph, some special operations may be needed: spin max degree (Section 4.3.3.4), or double spin (Section 4.3.3.5). If these special operations cannot be performed (probably due to the Tabu List), the Null Graph is returned. This situation is verified by Algorithm 10’s line 19. If no special operation is necessary, or if it is successfully performed, eadd is added to the Graph. In addition, to prevent eadd from being inserted during the special operations, it is temporarily added to the Tabu List.

Algorithm 12 Special Neighbourhood Search Additions
1: function SPECIAL_ADDS(graph: S, edge: eadd, tabulist: TL)
2:   if ∀v (v ∈ eadd → degree(v) = 4) then
3:     S ← DOUBLE_SPIN(S, eadd, TL ∪ {eadd})
4:   else if ∃v (v ∈ SV ∧ v ∈ eadd ∧ degree(v) = 4) then
5:     {mdn} ← {v | v ∈ SV ∧ v ∈ eadd ∧ degree(v) = 4}    ▷ Max Degree Node
6:     S ← SPIN_MAX_DEGREE(S, eadd, mdn, TL ∪ {eadd})
7:   end if
8:   if S = (∅, ∅) then
9:     return (∅, ∅)
10:  end if
11:  SE ← SE ∪ {eadd}
12:  return S
13: end function

4.3.3.1 Delete Edge Between Two Minimum Degree Nodes

This scenario arises in situations similar to attempting to remove edge {1, 2} (dotted line) from the graph SWG0 (Figure 12a). The resulting solution would be unfeasible because, for both nodes 1 and 2, degree(1) = degree(2) = 1 < 2.

In order to keep the solution feasible, both nodes need to have degree 2. This can be

done by swapping a node incident to one edge with a node incident to another edge. The

swapping process is detailed in Algorithm 13. It searches all valid swap movements until

a non-tabu solution is found. If no movement is possible, the Null Graph is returned.

Algorithm 13 Swaps the nodes incident to two distinct edges.
1: function SWAP_EDGES_NODES(graph: S; edge: edel, tabulist: TL)
2:   VS ← all valid swap movements
3:   while VS ≠ ∅ do
4:     e2del, SEA ← random(VS)
5:     SE ← (SE − {e2del}) ∪ SEA    ▷ does swap
6:     if ∄te (te ∈ TL ∧ te ⊆ SE) then    ▷ if not tabu
7:       performedDels ← performedDels ∪ {e2del}
8:       return S
9:     end if
10:    SE ← (SE − SEA) ∪ {e2del}    ▷ undoes swap
11:    VS ← VS − {(e2del, SEA)}
12:  end while
13:  return (∅, ∅)    ▷ no movement is possible
14: end function

A swap is represented as a tuple. The first element is an existing edge, and the second is a set of two non-existent edges. A swap tuple is thus represented as (e2del, SEA), where SEA = {eadd, e2add} stands for Set of Edges to Add. Given a deleted edge edel, a swap tuple has the following restrictions.

1. e2del ∩ edel = ∅;

2. eadd ∩ edel ≠ ∅ ∧ eadd ∩ e2del ≠ ∅;

3. e2add ∩ edel ≠ ∅ ∧ e2add ∩ e2del ≠ ∅.

If these restrictions are not met, an unfeasible solution would be generated.

Suppose it is desired to perform a swap operation on Graph SWG0. For this Graph, edel = {1, 2}. The dashed lines in Graph SWG1 indicate possible values for e2del (Figure 12b). Suppose that e2del = {4, 5}. Then, it is possible to obtain two Graphs after the swap operation successfully finishes: SWG2 (Figure 12c), and SWG3 (Figure 12d). In these figures, the edge corresponding to edel is dotted, while e2del is dashed. Both edges were deleted. In addition, the SEA edges are represented as dash-dotted lines and were added to the Graphs. It is interesting to note that Graph SWG3 is feasible, though undesirable, since it is disconnected. The Tabu Search will move away from disconnected graphs by evaluating their fitness to infinity, e.g. FITNESS(SWG3) = ∞.

(a) SWG0

(b) SWG1

(c) SWG2

(d) SWG3

Figure 12: Examples of valid edges’ node swapping process

The established restrictions prevent unfeasible solutions from being generated. For instance, if restriction 1 was loosened, it would be possible that edel = {0, 1} and e2del = {0, 5}, as in Graph SWG4 (Figure 13a). If a swap operation was then performed, either Graph SWG4 would be re-obtained, or the unfeasible Graph SWG5 would be generated (Figure 13b). If either restriction 2 or 3 was loosened instead, it would be possible to obtain Graph SWG6 (Figure 13c) from SWG1 – edel = {1, 2}, e2del = {4, 5}, and SEA = { {2, 5}, {0, 3} }. This solution is unfeasible since degree(1) = degree(4) < 2.

(a) SWG4

(b) SWG5

(c) SWG6

Figure 13: Example of invalid edge node swapping operation.

4.3.3.2 Delete Edge Incident to One Minimum Degree Node

This scenario arises in situations similar to the previous one. The core difference is that only one of the selected edge’s incident nodes has minimum degree. That is, after deleting that edge, the respective node degree would be 1. For example, this situation would be caused by removing edge {0, 1} from Graph MINSG0 (Figure 14a) because degree(0) = 2, which is the minimum acceptable.

This problem can be simply solved by, after deleting the edge, adding a random edge incident to the node with minimum degree. This process is hereafter called spin edge. The unchanged node will be named spin centre (or simply centre), while the remaining nodes will be referred to as targets (a set of vertices). The node that was originally incident to the deleted edge will be named original target, and original target ∉ targets. The process of spinning an edge is described by Algorithm 14. Its only restriction is that the vertex centre ∈ e. It is important to recall that edge e was previously deleted from the Graph, and e ∈ TL (Algorithm 11 – lines 2 and 9, respectively).

Algorithm 14 Spin edge.
1: function SPIN_EDGE(graph: S; edge: e, vertex: centre, tabulist: TL)
2:   targets ← SV − e
3:   while targets ≠ ∅ do
4:     target ← random(targets)
5:     targets ← targets − {target}
6:     eadd ← {centre, target}
7:     if eadd ∉ SE ∧ degree(target) < 4 then
8:       SE ← SE ∪ {eadd}    ▷ spins
9:       if ∄te (te ∈ TL ∧ te ⊆ SE) then    ▷ if non-tabu
10:        return S
11:      end if
12:      SE ← SE − {eadd}    ▷ unspins
13:    end if
14:  end while
15:  return (∅, ∅)    ▷ no possible spin
16: end function

Until all targets are explored, the Algorithm selects a random target and attempts to add an edge between it and the spin centre. If this edge is already in the Graph, or the target has maximum degree, it cannot be added (line 7). If the Graph obtained by adding this edge is not tabu, the solution is returned; otherwise, the addition is undone. If there is no possible spin, the Null Graph is returned.

For example, deleting edge {0, 1} from Graph MINSG0 would trigger the scenario represented by MINSG1 (Figure 14b), where the dotted edge was deleted, and any of the dashed lines may be added ({0, 2}, {0, 3}, or {0, 4}). It is not possible to spin {0, 1} to {0, 5} because it is already in the graph.

Algorithm 14 also verifies whether the target’s degree is the maximum possible, preventing the edge spin in this case. For instance, consider the extreme scenario described by Graph MINSG2 (Figure 14c). If edge {0, 1} was removed, there would be no possible spin because {v | v ≠ 0 ∧ v ≠ 1 ∧ degree(v) = 4} = MINSG2V − {0, 1}. In this case, the algorithm would return the Null Graph.

(a) MINSG0

(b) MINSG1

(c) MINSG2

Figure 14: Example of spin operation with a minimum degree node.
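The spin operation can be sketched in Python as follows. This is an illustrative reading of Algorithm 14 under a few assumptions: edges are frozensets collected in a set, the deleted edge `e` has already been removed from `edges`, a new edge set is returned instead of mutating the graph, and `None` stands in for the Null Graph. The names `spin_edge` and `is_tabu` are hypothetical.

```python
import random


def is_tabu(edges, tabu_list):
    """True when some tabu element is entirely contained in `edges`."""
    return any(te <= edges for te in tabu_list)


def spin_edge(vertices, edges, e, centre, tabu_list, max_degree=4, rng=random):
    """Try to attach `centre` to a random target; None if no spin is possible."""
    degree = {v: 0 for v in vertices}
    for existing in edges:
        for v in existing:
            degree[v] += 1
    targets = [v for v in vertices if v not in e]  # excludes the original target
    rng.shuffle(targets)
    for target in targets:
        e_add = frozenset((centre, target))
        if e_add in edges or degree[target] >= max_degree:
            continue  # edge already present, or target at maximum degree
        candidate = edges | {e_add}
        if not is_tabu(candidate, tabu_list):
            return candidate
    return None
```

In a six-node ring from which {0, 1} was deleted, spinning around centre 0 can only produce {0, 2}, {0, 3}, or {0, 4}, because {0, 5} is already in the graph.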

4.3.3.3 Default Scenario

The default scenario is the simplest one. No feasibility condition is violated when deleting or adding edges. Thus, no special operation is needed to guarantee feasibility. It is implicitly performed by the Neighbourhood Search – Algorithm 11, line 2, and Algorithm 12, line 11. Therefore, the focus of this Section is solely to illustrate such scenarios.

In the Graphs represented in Figure 15, dotted lines represent deleted edges, while dashed lines are added edges. Therefore, considering Graph DG0 (Figure 15a), if edel = {1, 5} and eadd = {2, 4}, the resulting Graph is DG1 (Figure 15b). Analogously, if edel = {1, 5} and eadd = {1, 4}, the Graph DG2 is obtained (Figure 15c). In this case, if edel had not been previously deleted, an unfeasible solution would have been generated because the degree of node 1 would be 5.

(a) DG0

(b) DG1

(c) DG2

Figure 15: Examples of default scenario.

4.3.3.4 Add Edge Incident to One Maximum Degree Node

This scenario is similar to deleting an edge incident to one minimum degree node. The core difference is that the maximum degree node will not be the spin centre, but rather its adjacent node. This case occurs, for example, if an attempt is made to add edge {2, 5} to Graph MAXSG0 (Figure 16a).

Algorithm 15 describes how to tackle this problem. Essentially, an already existing edge incident to the maximum degree node will be spun. The algorithm expects to receive the maximum degree node (mdn) such that mdn ∈ eadd. The Algorithm will attempt to spin all edges incident to mdn (TES). One of these edges is randomly selected (edel), and the spin centre is set to the node adjacent to the mdn. A temporary tabu list containing the selected edge is created (TLS). Tabu or invalid solutions obtained after the spinning process will also be added to TLS. A copy of the current solution is created removing edel, and SPIN_EDGE is called. A union operation is performed between TL and TLS to prevent edel from being added. After the spin process, it is verified whether it is possible to add eadd to the Graph, i.e. whether no node incident to eadd has degree 4. If that is the case, eadd is inserted, and it is verified whether the new solution is tabu. If it is not tabu, then a valid solution was found. If either of the previous two conditions is not met, the obtained solution is added to TLS to prevent the spin process from generating it again. Eventually, if no solution is found, SC = (∅, ∅), and the process is repeated for a previously unselected edge from TES. If it is not possible to perform the max spin operation, the Null Graph is returned.

Algorithm 15 Spin edge incident to one maximum degree node
1: function SPIN_MAX_DEGREE(graph: S, edge: eadd, vertex: mdn, tabulist: TL)
2:   TES ← {e | e ∈ SE ∧ mdn ∈ e}    ▷ target edges to spin
3:   while TES ≠ ∅ do
4:     edel ← random(TES)
5:     {centre} ← edel − {mdn}    ▷ node adjacent to max degree node
6:     TLS ← {edel}    ▷ tabu list of spins
7:     repeat
8:       SC ← (SV, SE − {edel})    ▷ Solution Copy with removed edge to spin
9:       SC ← SPIN_EDGE(SC, edel, centre, TL ∪ TLS)
10:      if ∄v (v ∈ SCV ∧ v ∈ eadd ∧ degree(v) = 4) then
11:        SCE ← SCE ∪ {eadd}
12:        if ∄te (te ∈ TL ∧ te ⊆ SCE) then    ▷ if not tabu
13:          performedDels ← performedDels ∪ {edel}
14:          return SC
15:        end if
16:      end if
17:      TLS ← TLS ∪ {SCE − {eadd}}
18:    until SC = (∅, ∅)
19:    TES ← TES − {edel}
20:  end while
21:  return (∅, ∅)    ▷ max spin operation not possible
22: end function

Before executing Algorithm 15, some edge was deleted from the graph (Algorithm 11, line 2). Both this edge and eadd were previously added to TL in order to prevent them from being inserted into the Graph during the spinning process – Algorithm 10, line 18, and Algorithm 12, line 6.

As an example of this process, suppose that S = MAXSG0 and eadd = {2, 5}; consequently, mdn = 5. Therefore, it would be necessary to spin either {0, 5}, {1, 5}, {3, 5}, or {4, 5}. This scenario is illustrated by Graph MAXSG1 (Figure 16b), where eadd is the dashed line and the potential edges to spin are dotted. If edge {0, 5} is chosen to be spun, then it is possible to add edge {0, 3} to the graph, resulting in MAXSG2 (Figure 16c).

(a) MAXSG0

(b) MAXSG1

(c) MAXSG2

Figure 16: Example of successful spin operation with a maximum degree node.

There are several conditions to assert feasibility during or after the spin. If eadd is incident to a node of degree 4 after the spin, that solution cannot be visited. This scenario could be triggered, for example, from Graph MAXSG1 if edge {0, 5} was spun to {0, 2} instead of {0, 3}. As illustrated by Graph MAXSG3 in Figure 17a, the spun edge {0, 2} (dashed) prohibits the originally chosen eadd = {2, 5} (dash-dotted) from being inserted. A problem may also happen even if a non-Tabu solution is obtained after the spin, because it may become tabu after adding eadd to the graph. For instance, if TL = {{{0, 3}, {2, 5}}, {{0, 4}}}, the Graph generated after spinning edge {0, 5} to {0, 3} is not tabu – Graph MAXSG4 in Figure 17b considering only the solid lines. However, after inserting eadd (dotted line) into the Graph, a tabu solution would be obtained because {{0, 3}, {2, 5}} ∈ TL ∧ {{0, 3}, {2, 5}} ⊆ SCE.

(a) MAXSG3

(b) MAXSG4

Figure 17: Examples of unsuccessful spin operation with a maximum degree node.

4.3.3.5 Add Edge Between Two Maximum Degree Nodes

It may be necessary to spin two edges if an edge incident to two nodes of maximum

degree is chosen to be added. Such a scenario would occur by attempting to insert edge

eadd = {0, 3} (dashed line) to the Graph DSG0 (Figure 18a). Also, suppose that edge

{1, 2} (dotted) was previously removed – Algorithm 10, line 15.

The behaviour of the double spin is described in Algorithm 16. Basically, it consists of spinning one edge incident to each vertex of eadd. The vertices of eadd are randomly assigned to v1 and v2. An empty set of visited solutions is created – VS. The SPIN_MAX_DEGREE() function is called. Note that ∅ is passed as the argument for eadd; therefore, only a random edge incident to v1 will be spun. The edge deleted during the call is identified as edel, and temporarily added to the Tabu List TL to guarantee that it will not be inserted during the second SPIN_MAX_DEGREE() call. If the solution obtained is valid, it is returned, and both edges deleted throughout the Algorithm are added to performedDels. If the Null Graph is obtained from the second spin, then it is not possible to obtain a solution from SC. Hence, SC is added to VS in order to be considered as a tabu solution in the next iteration. This process is repeated until all possible solutions are explored, returning the Null Graph if that is the case. Algorithm 16 does not verify whether the obtained solutions are tabu because this is implicitly done by Algorithm 15. In addition, eadd is not added to TL since this was previously performed – Algorithm 12, line 3.

As an example of this Algorithm, suppose that, for Graph DSG0, v1 = {0}, and v2 = {3}. Then, after the first spin, it would only be possible to obtain Graph DSG1 (Figure 18b) by spinning {0, 5} (dotted) to {5, 2} (dashed). Analogously, for the second spin, the only obtainable solution is to spin edge {3, 4} to {4, 1}, as illustrated in Graph DSG2 (Figure 18c). The dotted edges ({0, 5} and {3, 4}) were spun during the Algorithm, while the dashed edges ({2, 5} and {4, 1}) were added. The dash-dotted line ({0, 3}) will be added afterwards by Algorithm 12 in line 11.

Algorithm 16 Double spin
1: function DOUBLE_SPIN(graph: S, edge: eadd, tabulist: TL)
2:   v1 ← {random(eadd)}
3:   {v2} ← eadd − {v1}
4:   VS ← ∅
5:   repeat
6:     SC ← SPIN_MAX_DEGREE(S, ∅, v1, TL ∪ VS)
7:     {edel} ← SE − SCE
8:     if SC ≠ (∅, ∅) then
9:       SC2 ← SPIN_MAX_DEGREE(SC, ∅, v2, TL ∪ {edel})
10:      if SC2 ≠ (∅, ∅) then
11:        {e2del} ← SCE − SC2E
12:        performedDels ← performedDels ∪ {edel} ∪ {e2del}
13:        return SC2
14:      end if
15:    end if
16:    VS ← VS ∪ {SC}
17:  until SC = (∅, ∅)
18:  return (∅, ∅)
19: end function

(a) DSG0

(b) DSG1

(c) DSG2

Figure 18: Example of successful double spin operation.

4.4 Fault Injection

The fault injection is a step performed after the topology generation to evaluate the reliability of each solution.

Given the best solution from the previous algorithm step, this stage stresses the provided solution by randomly removing links from the topology. Each random removal simulates a faulty link between two routers, which cannot be used. After the links are removed, the fitness function is recomputed. The experiments were executed thirty times for each of five scenarios: removing 10%, 15%, 20%, 25%, and 30% of the total links. The process is detailed in Algorithm 17.

Algorithm 17 Fault Injection
1: function FAULT_INJECTION(graph: BS, TG)
2:   for all perc ∈ {0.1, 0.15, 0.2, 0.25, 0.3} do
3:     qtd ← |BSE| − ⌊|BSE| · perc⌋
4:     for i ← 1 to 30 do
5:       S ← BS
6:       while |SE| > qtd do
7:         e ← random(SE)
8:         SE ← SE − {e}
9:       end while
10:      write S
11:      write FITNESS(S, TG)
12:    end for
13:  end for
14: end function

The Algorithm receives the Best Solution calculated by the Tabu Search, and the Task Graph in order to compute the fitness function. Depending on the number of edges in the Graph, the number of failed links may not be a natural number. Therefore, the floor function is used.

For example, consider Graph FI0 (Figure 19a), and assume that perc = 0.2. Since |FI0E| = 12, qtd = 12 − ⌊12 · 0.2⌋ = 10. Therefore, two edges will be removed from the Graph. Graph FI1 is an example of an obtainable Graph in this scenario (Figure 19b), where edges {0, 4} and {1, 3} (dotted lines) were randomly deleted. It is expected that Graph FI1 has lower performance than FI0, since there are fewer possible paths. Analogously, if BS = FI2 (Figure 19c) and perc = 0.3, then qtd = 5. Therefore, it would be possible that edges {0, 5} and {4, 5} (dotted lines) were randomly selected, thus obtaining Graph FI3 (Figure 19d). Note that FI3 is a disconnected graph. Therefore, its fitness would be evaluated to infinity – FITNESS(FI3, TG) = ∞. In other words, a router (corresponding to node 5) would be isolated from the system. As a consequence, a processor that is directly connected to this node would also be isolated from the system.

(a) FI0

(b) FI1

(c) FI2

(d) FI3

(e) FI4

Figure 19: Example of Fault Injection Algorithm.

However, if edge {0, 3} was chosen to be deleted instead of {4, 5}, the resulting graph

would not be disconnected (Figure 19e).
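The link-removal step of Algorithm 17 can be sketched in Python as follows. The sketch assumes edges stored as 2-tuples and draws the surviving links in one call to `random.sample`, which gives the same distribution as removing random edges one at a time. The names `remaining_links` and `inject_faults` are illustrative, not taken from the implemented code.

```python
import math
import random


def remaining_links(n_links, perc):
    """qtd of Algorithm 17: links left after removing floor(n_links * perc)."""
    return n_links - math.floor(n_links * perc)


def inject_faults(edges, perc, rng=random):
    """Return a random subset of `edges` that survives the fault injection."""
    edges = list(edges)
    return set(rng.sample(edges, remaining_links(len(edges), perc)))
```

With 12 links and perc = 0.2, 10 links remain, and with 7 links and perc = 0.3, 5 links remain, matching the qtd values of the examples above.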

5 Results

Throughout this Chapter, the results of the obtained latency estimations are dis-

cussed. The latency is estimated via the QAP Function (Section 2.2). Several graphs were

generated, and studied during the analyses whenever possible. However, for a more de-

tailed analysis a subset of all graphs is chosen – median latency graphs. For instance, fault

injection analysis is made on this subset; also taking the obtained latency estimation into

consideration.

The TGs used to benchmark the Algorithm are the same discussed in Mesquita’s work (MESQUITA, 2016). For the sake of clarity, two of the eight TGs will be the focus of the forthcoming discussion, and are hereafter referred to as chosen TGs – AP2TG (Figure 20a), and MPEGTG (Figure 20b). However, all eight TGs are illustrated in Annex A.

(a) AP2TG

(b) MPEGTG

Figure 20: Chosen TGs

For each TG, the remaining Algorithm 2 arguments were all possible combinations

between the following set elements.

• tabuListSize ∈ {1, |V |/2, |V |};

• terminationCriterion ∈ {100, 250, 500};

• ε ∈ {|TGV |, |TGV |+ 1, . . . , 2|TGV | − 1, 2|TGV |}.

Algorithm 2 was executed 30 times for each combination.

Figures 21 and 22 respectively illustrate the influence of parameters tabuListSize and terminationCriterion for the chosen TGs. The x-axes correspond to the argument values, while the y-axes illustrate the latency estimation for each outputted solution. Each x-axis value has an associated box plot. Several values can be inferred from a box plot. The median value corresponds to the line inside the box (approximately 1200 for the first box plot of Figure 21a). The first quartile is described by the line that delimits the bottom of the box (a value slightly smaller than 1200 for the first box plot of Figure 21a). Analogously, the third quartile is described by the line that delimits the top of the box (approximately 1300 for the first box plot of Figure 21a). The box plot also describes the maximum and minimum values in the distribution. For instance, the maximum value for the first box plot of Figure 21a is approximately 1500, while the minimum value equals the first quartile. In the third box plot of Figure 21a, the maximum value is approximately 1600, while the minimum value is a bit smaller than the first quartile. The dots in the box plots represent outliers, i.e. values that are not in the range of the expected distribution. For example, the outliers of the first box plot of Figure 21a are in the [1750, 1900] range.

(a) Influence on AP2TG solutions.

(b) Influence on MPEGTG solutions.

Figure 21: Influence of tabuListSize arguments for Latency Estimation in chosen TGs’ generated solutions.

(a) Influence on AP2TG solutions.

(b) Influence on MPEGTG solutions.

Figure 22: Influence of terminationCrit arguments for Latency Estimation in chosen TGs’ generated solutions.

Practically no change is observable between the box plots. The minimum, median, and first and third quartiles are very similar between box plots of the same Figure. Only the maximum value and the outliers change. However, no overall latency estimation improvement is observable. Therefore, it is possible to conclude that the influence of parameters tabuListSize and terminationCriterion is minimal. The influence of the same parameters for all TGs is compiled in Appendices A and B, and the same behaviour is noticeable.

Therefore, throughout this Chapter, the discussion will focus on the ε parameter.

There are two interesting analyses for the current work – a study regarding the gen-

erated topologies’ performance, and a study regarding the reliability after fault injection.

Thus, this chapter is subdivided into two Sections, one for each analysis. Throughout this

chapter, let STG,ε denote a solution for Task Graph TG with ε links. Also, to save space,

let FITNESS(STG,ε) denote FITNESS(STG,ε, TG).

5.1 Performance

In order to perform an overall analysis of the obtained topologies for a given TG,

the median latency estimation between all executions with the same ε value was se-

lected, i.e. arg(median(FITNESS(STG,ε))). Figure 23a compiles median latency for TGs

AP1TG, AP4TG, and INTEGRALTG, while Figure 23b illustrates the median latency

for TGs AP2TG, AP3TG, MPEGTG, MWDTG, and VOPDTG. For instance, the median latency for VOPDTG obtained solutions with 15 to 26 links is 3420. There are two

Figures in order to improve the visualisation.
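The selection described above (for each ε, keep the execution whose latency estimation is the median) could be sketched as follows. The tuple layout, the function name, and the choice of the lower median when the number of executions is even are assumptions made for illustration. The text does not specify how ties are resolved.

```python
from collections import defaultdict


def median_solution_per_epsilon(runs):
    """Map each epsilon to the run whose fitness is the (lower) median.

    `runs` is an iterable of (epsilon, fitness, solution) tuples; this
    layout is hypothetical and not taken from the thesis.
    """
    by_eps = defaultdict(list)
    for eps, fitness, solution in runs:
        by_eps[eps].append((fitness, solution))

    chosen = {}
    for eps, results in by_eps.items():
        results.sort(key=lambda r: r[0])
        # Lower median: with an even number of runs, take the smaller
        # of the two middle values so that a real solution is returned.
        chosen[eps] = results[(len(results) - 1) // 2]
    return chosen


# Hypothetical executions, three per epsilon value.
runs = [
    (13, 2100, "run-a"), (13, 1900, "run-b"), (13, 2500, "run-c"),
    (14, 1600, "run-d"), (14, 1500, "run-e"), (14, 1700, "run-f"),
]
medians = median_solution_per_epsilon(runs)
print(medians[13], medians[14])
```

Returning an actual run, instead of an interpolated median value, keeps the topology that produced the median available for later analysis.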

It is possible to conclude that the FITNESS(STG,ε) function behaves similarly to the rational function f(x) = 1/x. Note that if the obtained solution was disconnected for some ε, FITNESS(STG,ε) = ∞. In other words, some message would not arrive at its destination. This behaviour is similar to lim x→0 f(x) = ∞.

(a) Median latency estimation for AP1TG, AP4TG, and INTEGRALTG.

(b) Median latency estimation for AP2TG, AP3TG, MPEGTG, MWDTG, and VOPDTG.

Figure 23: Overall median latency for all benchmarked TGs.

From Figure 23, it is also possible to infer that as the number of links in the topology increases, the latency improvement gained from additional links decreases. For instance, let

ε < ψ < ϕ, where ε, ψ, ϕ ∈ N; (5.1)

be possible ε values. Then,

FITNESS(STG,ε) − FITNESS(STG,ψ) ≥ FITNESS(STG,ψ) − FITNESS(STG,ϕ). (5.2)

As an example, for MPEGTG in Figure 23b,

FITNESS(SMPEGTG,13) − FITNESS(SMPEGTG,14) ≈ 500, (5.3)

FITNESS(SMPEGTG,14) − FITNESS(SMPEGTG,15) ≈ 250, (5.4)

500 ≥ 250. (5.5)

There are some cases in which this difference is zero. For example, in Figure 23b,

FITNESS(SVOPDTG,21) − FITNESS(SVOPDTG,22) = 0. (5.6)

Such a scenario may happen when the number of links in the solution is greater than or equal to the number of edges in the TG (ε ≥ |TGE|). This scenario may also occur if no

improvement is possible due to the feasibility restrictions. An example of this situation is

MPEGTG, since degree(1) > 4.
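The diminishing-improvement property of Equation 5.2 can be checked mechanically on a table of median latencies. The sketch below compares only consecutive ε values, as the MPEGTG example does. The absolute latencies are hypothetical and were chosen only so that the differences reproduce the approximate 500, 250 and 0 quoted above.

```python
def improvements(median_fitness):
    """Latency gained by each additional link: FITNESS(eps) - FITNESS(next eps)."""
    eps_values = sorted(median_fitness)
    return [
        median_fitness[a] - median_fitness[b]
        for a, b in zip(eps_values, eps_values[1:])
    ]


def has_diminishing_returns(median_fitness):
    """True if every improvement is at least as large as the next one."""
    gains = improvements(median_fitness)
    return all(earlier >= later for earlier, later in zip(gains, gains[1:]))


# Hypothetical median latencies indexed by the number of links.
example = {13: 2750, 14: 2250, 15: 2000, 16: 2000}
print(improvements(example))
print(has_diminishing_returns(example))
```

A zero in the list of improvements corresponds to the plateau described for Equation 5.6.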

It is interesting to analyse how latency estimation varies between different generated topologies, thus detailing the information provided by Figure 23. To accomplish this task, a latency estimation box plot is calculated for each ε value of a given TG. Then,

all the obtained box plots are compiled into a single plot. The plots corresponding to

AP2TG, and MPEGTG are illustrated by Figures 24a and 24b, respectively. The x-axes

correspond to the number of links in the topology (ε), while the y-axes represent the

latency estimation. The plots for all eight TGs are gathered in Appendix C.

(a) Box plots of AP2TG solutions’ latency estimation.

(b) Box plots of MPEGTG solutions’ latency estimation.

Figure 24: Box plots of chosen TGs solutions’ latency estimation.

From Figure 24a, it is possible to verify the same rational function behaviour previously mentioned. In addition, the difference between the maximum outlier and the minimum latency estimation tends to decrease as ε increases. Therefore, choosing whether or not to connect two routers is more relevant for topologies with fewer links. Also, there exists an ε value beyond which no major latency improvements are observable, namely 16, and arguably 15 or 14. Hence, it may be possible to obtain an efficient and fault-tolerant MPSoC with reduced energy consumption. Notwithstanding, after some point, the box plots practically turn into lines. This may happen when ε is large enough to generate a fault-tolerant NoC, and yet capable of almost matching the original TG. For AP2TG, this value is 16, although |AP2TGE| = 19. However, given the feasibility restriction, 1 edge from node 2 and 2 edges from node 5 would be removed, leaving 16. In other words, this value should be calculated using the following formula,

convergence ε = ∑_{v ∈ TG_V} min(degree(v), 4). (5.7)
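The AP2TG count given before Equation 5.7 (19 edges, minus 1 at node 2 and 2 at node 5) can be reproduced with a short sketch. It follows the arithmetic of that prose example and is not an implementation of the formula itself. The degree table is hypothetical: only the degrees of nodes 2 and 5 are implied by the text, and the remaining values were picked so that the degrees sum to 2 × 19. The sketch also assumes that no removed edge joins two vertices that both exceed the limit.

```python
MAX_DEGREE = 4  # feasibility restriction quoted in the text


def links_after_degree_cap(degrees, edge_count, max_degree=MAX_DEGREE):
    """Edges left once every vertex drops its edges beyond `max_degree`.

    Assumption: each excess edge is counted once, i.e. no removed edge
    is shared by two vertices that both exceed the cap.
    """
    excess = sum(max(0, d - max_degree) for d in degrees.values())
    return edge_count - excess


# Hypothetical degree table for a 10-vertex TG with 19 edges.
degrees = {1: 4, 2: 5, 3: 4, 4: 3, 5: 6, 6: 4, 7: 3, 8: 3, 9: 3, 10: 3}
print(links_after_degree_cap(degrees, edge_count=19))
```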

Similar conclusions may be inferred from Figure 24b. However, the outliers in this graph are more unstable than in the remaining ones. There are a couple of probable reasons for this behaviour. First, it may be a consequence of MPEGTG’s description. Second, it may also be caused by an unlucky initial solution, in which case a much larger number of iterations would be necessary for the Tabu Search to converge. In addition, all the box plots

for which ε ≥ 16 turn into a line. Note that, given the TG description, convergence ε = 8.

This is the reason why all box plots have a small difference between their maximum and

minimum.

To illustrate some generated topologies, the NoCs with median fitness for both

ε = |TG_V| + round((2|TG_V| − |TG_V|) / 4), (5.8)

ε = 2|TG_V| − round((2|TG_V| − |TG_V|) / 4) (5.9)

were selected. Namely, the solutions SAP2TG,13 (Figure 25a), SAP2TG,17 (Figure 25b), SMPEGTG,16 (Figure 26a), and SMPEGTG,23 (Figure 26b), which correspond to the median fitness for the respective ε values. It is interesting to note that TG edges with greater weights were more likely to be present in the resulting topologies.

(a) Median fitness SAP2TG,13.

(b) Median fitness SAP2TG,17.

Figure 25: Examples of AP2TG solutions.

(a) Median fitness SMPEGTG,16.

(b) Median fitness SMPEGTG,23.

Figure 26: Examples of MPEGTG solutions.

5.2 Latency Estimation After Fault Injection

Recall that Algorithm 17 randomly removes 10, 15, 20, 25, and 30 percent of the best

solution’s links to simulate faults. To analyse the impact of faulty links on performance, a

selection procedure similar to the performance analysis was executed. In other words, the

solutions selected for analysis have median fitness value for each ε. Only five ε values are chosen for each analysis in order to improve readability: the minimum value (ε = |TGV|), the maximum value (ε = 2|TGV|), and three intermediary values such that all values are as evenly spaced as possible. Since 30 fault injections were performed for each percentage, the

topology corresponding to the median fitness after the injection is chosen to be analysed.
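One way to pick five ε values between |TGV| and 2|TGV| that are as evenly spaced as possible is sketched below. The function name is hypothetical, and rounding halves upwards is an assumption: the text does not say how non-integer positions are rounded, so the intermediary values may differ from the ones actually analysed when the spacing is not an integer.

```python
import math


def analysed_epsilons(vertex_count, how_many=5):
    """Evenly spaced link counts from |V| to 2|V|, both included."""
    low, high = vertex_count, 2 * vertex_count
    step = (high - low) / (how_many - 1)
    # Assumption: halves are rounded upwards (floor(x + 0.5)).
    return [math.floor(low + i * step + 0.5) for i in range(how_many)]


print(analysed_epsilons(12))  # [12, 15, 18, 21, 24]
```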

The fault injection latency estimation results for AP2TG and MPEGTG are illustrated in Figures 27a and 27b, respectively. Each line in the Figures corresponds to the

median solution with the specified number of links before fault injection. The x-axes cor-

respond to the fault injection percentage. The y-axes illustrate the proportion by which

the latency estimation has worsened, e.g. the median latency of SAP2TG,20 after 20% fault

injection is approximately 24% larger than the latency of SAP2TG,20 before the fault injection. Also, the maximum y-axis value is ∞, represented by Inf, which means that the topology became disconnected after the fault injection. The fault-injection graphs for all eight TGs

are compiled in Appendix D.
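The fault injection experiment can be illustrated with a small, self-contained sketch. Algorithm 17 and the FITNESS function are not reproduced in this chapter, so everything below is an assumption made for illustration: the stand-in latency is the sum of weight × hop count over the TG edges (with positive weights), the number of removed links is rounded to the nearest integer, and all names are hypothetical.

```python
import math
import random
from collections import deque


def hops(links, source, target):
    """Breadth-first hop count over undirected links; inf if unreachable."""
    if source == target:
        return 0
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


def latency(links, task_graph):
    """Stand-in fitness (assumption): sum of weight * hops per TG edge."""
    return sum(w * hops(links, s, t) for (s, t), w in task_graph.items())


def inject_faults(links, percentage, rng):
    """Remove the given percentage of links, chosen uniformly at random."""
    faulty = rng.sample(sorted(links), round(len(links) * percentage / 100))
    return set(links) - set(faulty)


def overhead(links, faulty_links, task_graph):
    """Latency after the faults divided by latency before; inf if disconnected."""
    return latency(faulty_links, task_graph) / latency(links, task_graph)


# Hypothetical 4-router ring and a 2-edge task graph with positive weights.
ring = {(1, 2), (2, 3), (3, 4), (4, 1)}
task_graph = {(1, 2): 10, (1, 3): 5}

print(overhead(ring, ring - {(1, 2)}, task_graph))          # 2.0
print(overhead(ring, ring - {(1, 2), (4, 1)}, task_graph))  # inf
print(len(inject_faults(ring, 25, random.Random(0))))       # 3
```

In this toy ring, losing link (1, 2) forces the heaviest communication onto a three-hop detour and doubles the latency, while losing both links of router 1 disconnects it, which corresponds to the Inf value on the plots.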

For the AP2TG application, it is possible to verify that the performance after fault

injection behaved as expected. Solutions with fewer links are more susceptible to becoming disconnected, and less resistant to a larger number of faults. In addition, the latency overhead tends to increase as the number of faults increases. Notwithstanding, the solutions with ε ≥ 15 did not become disconnected even after a 30% fault injection. It is possible to observe that, in some scenarios, some solutions with fewer links had smaller latency overheads than solutions with more links. For instance, considering 10, 15, 20, and 30% fault injection,

FITNESS(SAP2TG,18) ≤ FITNESS(SAP2TG,20). (5.10)

This behaviour probably occurred because of the links removed. In other words, the links

randomly deleted from SAP2TG,20 were probably responsible for larger communications

(larger edge weight) than the links randomly removed from SAP2TG,18.

(a) Fault injection on median AP2TG solutions.

(b) Fault injection on median MPEGTG solutions.

Figure 27: Fault injection on median chosen TGs solutions.

From Figure 27b, similar conclusions are inferable. The latency overhead tends to increase as the number of faulty links increases. Solutions with fewer links are more likely to become disconnected. Notwithstanding, there are also scenarios in which solutions with fewer links resist fault injection better than solutions with more links. Note, however, that the latency overhead for the same solution is more unstable than in the results of Figure 27a. For instance, SMPEGTG,20 resisted 30% fault injection better than 15, 20, and 25% fault injection, while SMPEGTG,23’s latency overhead alternately increases

and decreases. The TG description (Figure 20b) alongside the feasibility restriction may

have a large influence on this effect. Note that there are eight vertices adjacent to node 1.

Thus, if any edge incident to node 1 is removed, the latency may dramatically increase,

e.g. (1, 5). On the other hand, if links with smaller weights are chosen to be deleted, the

latency overhead will not be large, e.g. (1, 2).

To illustrate some generated topologies after the fault injection, consider the solution

represented in Figure 28a. It corresponds to the median FITNESS(SAP2TG,15) solution.

Figures 28b, 28c, and 28d represent the median fitness topologies after 10%, 20%, and

30% fault injection, respectively. The dotted lines are the faulty links. Analogously, Figure

29a corresponds to the median FITNESS(SMPEGTG,19) solution. Figures 29b, 29c, and

29d represent the median fitness topologies after 10%, 20%, and 30% fault injection,

respectively. The dotted lines are the faulty links. The median fitness topologies for all

eight TGs can be found in Appendix E. In addition, the graphs for these topologies after 10, 20, and 30% fault injection are available in Appendices F, G, and H, respectively.

(a) Median fitness SAP2TG,15 solution.

(b) Median fitness SAP2TG,15 solution with 10% fault injection.

(c) Median fitness SAP2TG,15 solution with 20% fault injection.

(d) Median fitness SAP2TG,15 solution with 30% fault injection.

Figure 28: SAP2TG,15 behaviour during fault injection.

(a) Median fitness SMPEGTG,19 solution.

(b) Median fitness SMPEGTG,19 solution with 10% fault injection.

(c) Median fitness SMPEGTG,19 solution with 20% fault injection.

(d) Median fitness SMPEGTG,19 solution with 30% fault injection.

Figure 29: SMPEGTG,19 behaviour during fault injection.

Nevertheless, it may be desirable to analyse how some provided topologies behave

for all fault injections. The topologies corresponding to Figures 28a and 29a will be the

objects of study. All the corresponding fitness values were gathered and categorised into

the corresponding percentages (10, 15, 20, 25, or 30%). The box plots of Figures 30a and

30b are thus obtained. The x-axes correspond to the fault injection percentage. The y-axes

correspond to the latency overhead, similarly to Figure 27. It is important to emphasise

that values smaller than 1 in the y-axis correspond to disconnected solutions (red dashed

line). Similar plots for the remaining TG solutions are available in Appendix I.

From Figure 30a, it is inferable that the latency overhead worsens as the fault injection

percentage increases, as expected. In addition, the box plots’ range increases alongside the

percentage. Therefore, the latency overhead becomes more dependent on the faulty links

as the fault injection percentage increases. Disconnected NoCs are considered outliers

until 30% fault injection. In addition, the minimum box plot value for 10, 20, and 25% fault injection is 1. Therefore, depending on the faulty links, the obtained topology may be capable of achieving the same latency estimation as before the fault injection. Such a scenario may occur either if a “spare link” was removed, or if the number of hops through the alternative path equals the number of hops through the original path.

(a) Fault injection on median SAP2TG,15 solution.

(b) Fault injection on median SMPEGTG,19 solution.

Figure 30: Fault injection on median solutions with median ε of the chosen TGs.

Figure 30b illustrates the fault injection behaviour on median solution SMPEGTG,19.

Unlike the previous Figure, no significant variation in the box plots is observable, e.g.

the 20% box plot is more similar to the 30% than to the 25% box plot. As previously

discussed, this is probably a consequence of the MPEGTG description (Figure 20b).

In other words, removing a link incident to node 1 may greatly increase the latency

estimation. Notwithstanding, disconnected graphs were never classified as outliers.

6 Concluding Remarks

This work proposed the generation of fault-tolerant irregular NoC topologies. The

importance of this topic involves the fabrication of supercomputers, which need to be as

fast, flexible, and durable as possible.

Moreover, the advances in NoC topologies are independent of the cores’ progress.

Therefore, a more suitable NoC may be capable of extracting the full potential of the

cores by diminishing the communication overhead. On the other hand, faster processing

elements may explore the NoC resources better by requesting data transmission more

frequently.

The obtained topologies were generated using Tabu Search. Throughout the process,

multiple operations were performed to guarantee minimal fault resistance. The resulting

topologies were benchmarked and shown to be efficient for their specific applications. In addition, topologies with a few more links than the minimum acceptable (ring topology) withstood random failures of 10 to 30% of their links. Therefore, these topologies may also have lower power consumption than regular NoCs, while still achieving high performance and fault resistance.

Some features may be added to the proposed algorithm to augment its benefits. For instance, combining the Tabu Search with an Evolutionary Algorithm (i.e. a Memetic Algorithm) may explore a wider range of solutions, thus increasing the odds of finding the global optimum. It may also be necessary to add graph planarity restrictions to the algorithm to guarantee that the generated topologies can be used for 2D-MPSoC design. Future versions of the algorithm may also measure fault tolerance throughout the process, though multi-objective approaches may be necessary. Finally, further studies may synthesise the generated topologies in order to properly benchmark their performance.

References

ABEDNEZHAD, D.; ALAVI, S. E. A new irregular fault-tolerant routing algorithm in network-on-chip. International Journal of Computer Science and Network Security (IJCSNS), International Journal of Computer Science and Network Security, v. 17, n. 4, p. 166, 2017.

ASCIA, G.; CATANIA, V.; PALESI, M. Multi-objective mapping for mesh-based noc architectures. In: ACM. Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. [S.l.], 2004. p. 182–187.

AZAD, S. P. et al. Holistic approach for fault-tolerant network-on-chip based many-core systems. arXiv preprint arXiv:1601.07089, 2016.

BALAS, E.; TOTH, P. Branch and bound methods for the traveling salesman problem. [S.l.], 1983.

BECKER, M.; KRÖMKER, M.; SZCZERBICKA, H. Evaluating heuristic optimization, bio-inspired and graph-theoretic algorithms for the generation of fault-tolerant graphs with minimal costs. In: Information Science and Applications. [S.l.]: Springer, 2015. p. 1033–1041.

BEIGNÉ, E. et al. An asynchronous noc architecture providing low latency service and its multi-level design framework. In: IEEE. Asynchronous Circuits and Systems, 2005. ASYNC 2005. Proceedings. 11th IEEE International Symposium on. [S.l.], 2005. p. 54–63.

BOKHARI, S. H. On the mapping problem. IEEE Transactions on Computers, IEEE,n. 3, p. 207–214, 1981.

BONONI, L.; CONCER, N. Simulation and analysis of network on chip architectures: ring, spidergon and 2d mesh. In: EUROPEAN DESIGN AND AUTOMATION ASSOCIATION. Proceedings of the conference on Design, automation and test in Europe: Designers’ forum. [S.l.], 2006. p. 154–159.

CHANG, Y.-C. et al. On the design and analysis of fault tolerant noc architecture using spare routers. In: IEEE PRESS. Proceedings of the 16th Asia and South Pacific Design Automation Conference. [S.l.], 2011. p. 431–436.

CHOUDHARY, N.; GAUR, M.; LAXMI, V. Irregular noc simulation framework: Irnirgam. In: IEEE. Emerging Trends in Networks and Computer Communications (ETNCC), 2011 International Conference on. [S.l.], 2011. p. 1–5.

CHOUDHARY, N. et al. Genetic algorithm based topology generation for application specific network-on-chip. In: IEEE. Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on. [S.l.], 2010. p. 3156–3159.

CORMEN, T. H. et al. Introduction to algorithms. [S.l.]: MIT press, 2009.

DEHYADGARI, M. et al. Evaluation of pseudo adaptive xy routing using an object oriented model for noc. In: IEEE. 2005 International Conference on Microelectronics. [S.l.], 2005. p. 5–pp.

DICK, R. P.; RHODES, D. L.; WOLF, W. Tgff: task graphs for free. In: IEEE. Proceedings of the Sixth International Workshop on Hardware/Software Codesign. (CODES/CASHE’98). [S.l.], 1998. p. 97–101.

DIJKSTRA, E. Dijkstra’s algorithm. Dutch scientist Dr. Edsger Dijkstra network algorithm: http://en.wikipedia.org/wiki/Dijkstra’s_algorithm, 1959.

GABIS, A. B.; KOUDIL, M. Noc routing protocols–objective-based classification. Journal of Systems Architecture, Elsevier, v. 66, p. 14–32, 2016.

GAREY, M. R.; JOHNSON, D. S.; STOCKMEYER, L. Some simplified np-complete problems. In: ACM. Proceedings of the sixth annual ACM symposium on Theory of computing. [S.l.], 1974. p. 47–63.

GENDREAU, M.; POTVIN, J.-Y. Tabu search. In: Search methodologies. [S.l.]: Springer, 2005. p. 165–186.

GENDREAU, M.; POTVIN, J.-Y. et al. Handbook of metaheuristics. [S.l.]: Springer, 2010.

HEMANI, A. et al. Network on chip: An architecture for billion transistor era. In: Proceeding of the IEEE NorChip Conference. [S.l.: s.n.], 2000. v. 31, p. 11.

HO, W. H.; PINKSTON, T. M. A methodology for designing efficient on-chip interconnects on well-behaved communication patterns. In: IEEE. The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. [S.l.], 2003. p. 377–388.

HOSSEINABADY, M. et al. Reliable network-on-chip based on generalized de bruijn graph. In: IEEE. 2007 IEEE International High Level Design Validation and Test Workshop. [S.l.], 2007. p. 3–10.

JAIN, K.; CHOUDHARY, N.; SINGH, D. Energy efficient branch and bound based on-chip irregular network design. Global Journal of Computer Science and Technology, 2014.

JANTSCH, A.; TENHUNEN, H. et al. Networks on chip. [S.l.]: Springer, 2003.

KOIBUCHI, M. et al. A lightweight fault-tolerant mechanism for network-on-chip. In: IEEE COMPUTER SOCIETY. Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip. [S.l.], 2008. p. 13–22.

KREUTZ, M. et al. Energy and latency evaluation of noc topologies. In: IEEE. 2005 IEEE International Symposium on Circuits and Systems. [S.l.], 2005. p. 5866–5869.

KUROSE, J. F.; ROSS, K. W. Computer networking: a top-down approach: international edition. [S.l.]: Pearson Higher Ed, 2013.

LEE, D.; PARIKH, R.; BERTACCO, V. Highly fault-tolerant noc routing with application-aware congestion management. In: ACM. Proceedings of the 9th International Symposium on Networks-on-Chip. [S.l.], 2015. p. 10.

LI, M.; ZENG, Q.-A.; JONE, W.-B. Dyxy: a proximity congestion-aware deadlock-free dynamic routing method for network on chip. In: ACM. Proceedings of the 43rd annual Design Automation Conference. [S.l.], 2006. p. 849–852.

LIN, S.-Y. et al. Fault-tolerant router with built-in self-test/self-diagnosis and fault-isolation circuits for 2d-mesh based chip multiprocessor systems. In: IEEE. 2009 International Symposium on VLSI Design, Automation and Test. [S.l.], 2009. p. 72–75.

MESQUITA, J. W. d. Exploração de espaço de projeto para geração de redes em chip de topologias irregulares otimizadas: a rede UTNoC. Dissertação (Mestrado) – Universidade Federal do Rio Grande do Norte, 2016.

MILFONT, R. et al. Analysis of routing algorithms generation for irregular noc topologies. In: IEEE. Test Symposium (LATS), 2017 18th IEEE Latin American. [S.l.], 2017. p. 1–5.

MORAES, F. et al. Hermes: an infrastructure for low area overhead packet-switching networks on chip. INTEGRATION, the VLSI journal, Elsevier, v. 38, n. 1, p. 69–93, 2004.

MOTA, R. G. et al. Efficient routing table minimization for fault-tolerant irregular network-on-chip. In: IEEE. 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS). [S.l.], 2016. p. 632–635.

NEEB, C.; WEHN, N. Designing efficient irregular networks for heterogeneous systems-on-chip. Journal of Systems architecture, Elsevier, v. 54, n. 3-4, p. 384–396, 2008.

PATTERSON, D. A.; HENNESSY, J. L. Computer Organization and Design MIPS Edition: The Hardware/Software Interface. [S.l.]: Newnes, 2013.

PINTO, A.; CARLONI, L. P.; SANGIOVANNI-VINCENTELLI, A. L. Efficient synthesis of networks on chip. In: IEEE. Proceedings 21st International Conference on Computer Design. [S.l.], 2003. p. 146–150.

RADETZKI, M. et al. Methods for fault tolerance in networks-on-chip. ACM Computing Surveys (CSUR), ACM, v. 46, n. 1, p. 8, 2013.

RAVI, R. et al. Approximation algorithms for degree-constrained minimum-cost network-design problems. Algorithmica, Springer, v. 31, n. 1, p. 58–78, 2001.

ROCHA, H. M. G. d. A. O Problema do Mapeamento: Heurísticas de mapeamento de tarefas em MPSoCs baseados em NoC. Monografia (B.S. thesis) – Universidade Federal do Rio Grande do Norte, 2017.

RODRIGO, S. et al. Cost-efficient on-chip routing implementations for cmp and mpsoc systems. IEEE transactions on computer-aided design of integrated circuits and systems, IEEE, v. 30, n. 4, p. 534–547, 2011.

SALMINEN, E.; KULMALA, A.; HAMALAINEN, T. D. Survey of network-on-chip proposals. white paper, OCP-IP, Citeseer, v. 1, p. 13, 2008.

SCHALLER, R. R. Moore’s law: past, present and future. IEEE spectrum, IEEE, v. 34, n. 6, p. 52–59, 1997.

SHAH, P.; KANNIGANTI, A.; SOUMYA, J. Fault-tolerant application specific network-on-chip design. In: IEEE. Embedded Computing and System Design (ISED), 2017 7th International Symposium on. [S.l.], 2017. p. 1–5.

SOTERIOU, V. et al. A high-throughput distributed shared-buffer noc router. IEEE Computer Architecture Letters, IEEE, v. 8, n. 1, p. 21–24, 2009.

SRINIVASAN, K.; CHATHA, K. S.; KONJEVOD, G. An automated technique for topology and route generation of application specific on-chip interconnection networks. In: IEEE COMPUTER SOCIETY. Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design. [S.l.], 2005. p. 231–237.

SRINIVASAN, K.; CHATHA, K. S.; KONJEVOD, G. Linear-programming-based techniques for synthesis of network-on-chip architectures. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, IEEE, v. 14, n. 4, p. 407–420, 2006.

STALLINGS, W. Computer organization and architecture: designing for performance. [S.l.]: Pearson Education India, 2003.

VENKATARAMAN, N.; KUMAR, R. Design and analysis of application specific network on chip for reliable custom topology. Computer Networks, Elsevier, 2019.

WANG, C. et al. An efficient topology reconfiguration algorithm for noc based multi-processor arrays. In: IEEE. High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on. [S.l.], 2013. p. 873–880.

WILSON, R. J. Introduction to graph theory. [S.l.]: Pearson Education India, 1979.

YANG, P. et al. A fault tolerance noc topology and adaptive routing algorithm. In: IEEE. Embedded Software and Systems (ICESS), 2016 13th International Conference on. [S.l.], 2016. p. 42–47.

YESIL, S.; TOSUN, S.; OZTURK, O. Fpga implementation of a fault-tolerant application-specific noc design. In: IEEE. 2016 International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS). [S.l.], 2016. p. 1–6.

ZEFERINO, C. A. Redes-em-chip: arquiteturas e modelos para avaliação de área e desempenho. 2003.

ZEFERINO, C. A.; SUSIN, A. A. Socin: a parametric and scalable network-on-chip. In: IEEE. Integrated Circuits and Systems Design, 2003. SBCCI 2003. Proceedings. 16th Symposium on. [S.l.], 2003. p. 169–174.

ZHANG, L. et al. On topology reconfiguration for defect-tolerant noc-based homogeneous manycore systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, IEEE, v. 17, n. 9, p. 1173–1186, 2009.

APPENDIX A -- Influence of tabuListSize Arguments

Figure 31: Influence of tabuListSize on AP1TG solutions.

Figure 35: Influence of tabuListSize on INTEGRALTG solutions.

Figure 36: Influence of tabuListSize on MPEGTG solutions.

Figure 37: Influence of tabuListSize on MWDTG solutions.

Figure 38: Influence of tabuListSize on VOPDTG solutions.

APPENDIX B -- Influence of terminationCriterion Arguments

Figure 39: Influence of terminationCriterion on AP1TG solutions.

Figure 43: Influence of terminationCriterion on INTEGRALTG solutions.

Figure 44: Influence of terminationCriterion on MPEGTG solutions.

Figure 45: Influence of terminationCriterion on MWDTG solutions.

Figure 46: Influence of terminationCriterion on VOPDTG solutions.

APPENDIX C -- Latency Box Plots

Figure 47: Fitness (latency estimation) box plots of AP1TG generated solutions.

Figure 51: Fitness (latency estimation) box plots of INTEGRALTG generated solutions.

Figure 52: Fitness (latency estimation) box plots of MPEGTG generated solutions.

Figure 53: Fitness (latency estimation) box plots of MWDTG generated solutions.

Figure 54: Fitness (latency estimation) box plots of VOPDTG generated solutions.

APPENDIX D -- Fault Injection in Median Fitness Solutions

Figure 55: Fault injection on median AP1TG solutions.

Figure 59: Fault injection on median INTEGRALTG solutions.

Figure 60: Fault injection on median MPEGTG solutions.

Figure 61: Fault injection on median MWDTG solutions.

Figure 62: Fault injection on median VOPDTG solutions.

APPENDIX E -- Examples of Median Epsilon Solutions

Figure 63: Median ε AP1 with median fitness.


Figure 67: Median ε INTEGRAL with median fitness.

Figure 68: Median ε MPEG with median fitness.

Figure 69: Median ε MWD with median fitness.

Figure 70: Median ε VOPD with median fitness.

APPENDIX F -- Median Epsilon Solutions After 10% Fault Injection

Figure 71: Median ε AP1 solution with median fitness after 10% fault injection.


Figure 75: Median ε INTEGRAL solution with median fitness after 10% fault injection.

Figure 76: Median ε MPEG solution with median fitness after 10% fault injection.

Figure 77: Median ε MWD solution with median fitness after 10% fault injection.

Figure 78: Median ε VOPD solution with median fitness after 10% fault injection.

APPENDIX G -- Median Epsilon Solutions After 20% Fault Injection

APPENDIX H -- Median Epsilon Solutions After 30% Fault Injection

APPENDIX I -- Detailed Fault Injection in Some Median Fitness Solutions

Figure 95: Fault injection on median SAP1TG,15 solution.

Figure 99: Fault injection on median SINTEGRALTG,15 solution.

Figure 100: Fault injection on median SMPEGTG,19 solution.

Figure 101: Fault injection on median SMWDTG,18 solution.

Figure 102: Fault injection on median SVOPDTG,19 solution.

ANNEX A -- Mesquita’s Work TGs

Figure 103: AP1TG

Figure 104: AP2TG

Figure 105: AP3TG

Figure 106: AP4TG

Figure 107: INTEGRALTG

Figure 108: MPEGTG

Figure 109: MWDTG

Figure 110: VOPDTG
