
Modelling Robustness Part 1: Prologue

Fabrice Saffre

© British Telecommunications plc, 2002

A few words about modelling

There are basically two techniques:

- Analysis (mathematical and/or numerical).

- Simulation (typically Monte Carlo).

Most of the time, they complement each other nicely, so you’re likely to need both (independently of your preferences!)

They both require abstracting the problem at hand to some extent (i.e. getting rid of “insignificant” details).


A few words about… me

I got my PhD modelling swarming behaviour of social spiders (no kidding!)

So by any practical definition, I’m a biologist, not a computer scientist.

My job now involves developing biologically-inspired algorithms to:

- model topological robustness.

- manage dynamic response to changing conditions

(in a network security context).


The context

Network resilience to attack/failure is a growing concern.

There are several routes toward improving robustness:

- Increase ability to withstand damage.

- Find new ways to limit damage.

The first step, however, is to find a suitable measurement of network state (with respect to dependability/QoS).

Part 2: “Complex networks”


What is a “complex network”?

You tell me...

A fancy but (let’s face it) meaningless expression...

A (poor) designation for something very real and very widespread, combining elements of graph theory with self-organisation and complex systems.

But then what are “complex systems”?


Where do you find them?

Everywhere...

In physics and chemistry (crystals, reaction chains...)

In physiology and morphology (neural nets, cellular interactions in the embryo...)

In ecology (food webs)

In sociology (“small worlds”, collaboration networks...)

In technology (power grids, telecom...)


What we mean when talking about “complex networks”

A collection of nodes and links...

featuring global invariant properties (diameter, clustering coefficient etc)...

even though it is produced using local probabilistic connection rules.


Building a hierarchy with the “preferential attachment” rule (1/4)

A scale free network can be generated if vertices are sequentially added and connected to the existing structure (Barabasi et al., 1999).

To obtain the desired architecture, each new vertex needs only select its connection on the basis of the current degree distribution within the network.


This can be done simply by attributing to each existing node i a probability of being selected P_i that depends on its degree k_i:

$$P_i = \frac{k_i^{\alpha}}{\sum_{j=1}^{n} k_j^{\alpha}}$$

Repeating the connection process for all vertices using this expression (with α = 1) is enough to generate a realistic network.




What’s meant by “realistic” is that the resulting topology appears to be similar to that of real networks like the Internet.

Faloutsos et al. (1999) have found the degree distribution profile of the Internet to obey a power law.

Because of the built-in amplification process, the algorithm of Barabasi et al. (1999) also generates such a distribution (with the appropriate slope for α = 1).



In other words: a very basic growth algorithm can be used to generate a variety of plausible network models on which to run numerical experiments.
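The growth rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `grow_preferential`, the two-node starting network and the single link per new vertex are assumptions, and `alpha` is an assumed name for the exponent applied to node degree.

```python
import random

def grow_preferential(n, alpha=1.0, seed=None):
    """Grow a network one vertex at a time; return {node: set of neighbours}."""
    rng = random.Random(seed)
    # Assumption: start from two connected vertices so every degree is non-zero.
    adj = {0: {1}, 1: {0}}
    for new in range(2, n):
        existing = list(adj)
        # Node i is weighted by k_i ** alpha; random.choices normalises
        # the weights by their sum, which gives the selection probability.
        weights = [len(adj[i]) ** alpha for i in existing]
        target = rng.choices(existing, weights=weights, k=1)[0]
        adj[new] = {target}
        adj[target].add(new)
    return adj

if __name__ == "__main__":
    net = grow_preferential(1000, alpha=1.0, seed=1)
    degrees = sorted((len(nb) for nb in net.values()), reverse=True)
    print("five highest degrees:", degrees[:5])
```

Because every new vertex attaches to the existing structure, the result is a tree and is connected by construction. With `alpha=1` a few early vertices would be expected to collect far more links than the rest.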


However...

As usual, things are not that simple...

Global network cohesion (i.e. a path exists between any 2 nodes) is only guaranteed as long as vertices are added one at a time.

The sequential aspect of the connection process is also responsible for the emergence of hierarchy if attractiveness (P_i) grows linearly (α = 1) with node degree (k_i).


The “delayed attachment” rule (1/2)

Connections are initiated when all nodes are already present.

For each link, the origin is chosen at random and the target by preferential attachment.
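A possible sketch of this rule follows. The slides do not say how nodes with no links yet are handled, so the weight `(k_i + 1) ** alpha` is an assumption made here to keep every node selectable; `delayed_attachment`, `n_links`, `alpha` and `largest_component_fraction` are hypothetical names.

```python
import random

def delayed_attachment(n, n_links, alpha=1.0, seed=None):
    """Create all n nodes (n >= 2) first, then add n_links links one by one."""
    rng = random.Random(seed)
    nodes = list(range(n))
    adj = {i: set() for i in nodes}
    for _ in range(n_links):
        origin = rng.choice(nodes)  # origin chosen uniformly at random
        # Assumption: weight (k_i + 1) ** alpha so that isolated nodes can
        # still be targeted; the origin gets weight 0 to avoid self-loops.
        weights = [0.0 if i == origin else (len(adj[i]) + 1) ** alpha
                   for i in nodes]
        target = rng.choices(nodes, weights=weights, k=1)[0]
        # A repeated (origin, target) pair leaves the graph unchanged.
        adj[origin].add(target)
        adj[target].add(origin)
    return adj

def largest_component_fraction(adj):
    """Fraction of all nodes that belong to the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best / len(adj)

if __name__ == "__main__":
    net = delayed_attachment(1000, 1000, seed=1)
    print("largest component fraction:", largest_component_fraction(net))
```

With about one link per node, some nodes are typically left outside the largest component, so the printed fraction would be expected to fall below 1.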


The “delayed attachment” rule (2/2)

This generates a very different topology.

Global cohesion is lost (not all nodes are within the giant component).

The hierarchy (degree distribution profile) is modified...

because the advantage to elder vertices is lost.


Why consider “delayed attachment”?

Because there are no “hidden” (implicit) non-linear effects (α = 1 does not generate a power law).

But also, more importantly, because it is a better model for highly dynamic architectures like ad hoc networks.

Continuous re-mapping of graph topology also limits (removes?) the advantage to elder vertices.


What is robustness (in networks)?

The ability to sustain accidental damage and remain operational.

The ability to sustain intentional damage and remain operational.

The ability to survive topological changes, which can often be treated as a special form of accidental damage (nodes move out of range, a router is overloaded...)


How do we measure it?

Most authors tend to use a (reductive) practical definition:

“Being robust is being able to maintain most surviving nodes within the giant component after having sustained damage.”

Accordingly, network robustness is simply inversely proportional to the rate of decay of the giant component’s relative size, as a function of damage extent.
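This practical definition translates into a simple Monte Carlo experiment: remove nodes at random and track the largest component. The sketch below is only an illustration under stated assumptions: the relative size S is normalised by the initial number of nodes, and `largest_component_size`, `decay_curve` and `mean_decay` are hypothetical helper names.

```python
import random

def largest_component_size(adj, alive):
    """Size of the largest connected component among the nodes in `alive`."""
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def decay_curve(adj, steps=10, seed=None):
    """Remove nodes in random order; return [(fraction removed, S), ...]."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    n = len(order)
    alive = set(order)
    curve = [(0.0, largest_component_size(adj, alive) / n)]
    removed = 0
    for step in range(1, steps + 1):
        target = round(step * n / steps)
        while removed < target:
            alive.discard(order[removed])
            removed += 1
        # Assumption: S is normalised by the initial number of nodes n.
        curve.append((removed / n, largest_component_size(adj, alive) / n))
    return curve

def mean_decay(adj, runs=20, steps=10, seed=0):
    """Average S at each damage level over several independent runs."""
    curves = [decay_curve(adj, steps, seed + r) for r in range(runs)]
    return [(curves[0][i][0], sum(c[i][1] for c in curves) / runs)
            for i in range(steps + 1)]

if __name__ == "__main__":
    ring = {i: {(i - 1) % 50, (i + 1) % 50} for i in range(50)}
    for removed, size in mean_decay(ring, runs=50, steps=5):
        print(f"{removed:.1f} removed -> mean S = {size:.3f}")
```

The faster the mean S falls as the removed fraction grows, the less robust the topology is under this definition.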


However (again)...

This is obviously a very “unrefined” view.

Maintaining network cohesion (keeping nodes within the giant component) is necessary but not sufficient...

because structural changes can cause congestion or routing failure, which can in turn prevent normal operation of the presumably “intact” network.

But we’ll have to live with that for now...


What we knew (1/2)

Scale free networks are very robust to accidental node failure...

because the “strong” hierarchy makes it unlikely that random events hit those few high degree nodes which are responsible for global cohesion (Albert et al., 2000).

As a result, the average size of the giant component decays gracefully with cumulative node failure (up to one point...)


What we knew (2/2)

“Graceful” doesn’t mean “slow”, only that catastrophic events are rare (the relative size of the giant component doesn’t drop faster than that of the network as a whole).

Initiating more than one connection per node is an obvious way of increasing robustness (multiple paths available between graph regions).


What we found (1/5)

The decay of the giant component’s average size can be approximated using a simple non-linear expression (whatever the connection rules):

$$S = \frac{X}{X + e^{\beta x}}$$

Where X and β are constants, while x is a function of the fraction of nodes that have been removed.


What we found (2/5)

Sequential addition of nodes, with variable α (A) and average degree (B).


What we found (3/5)

Delayed attachment, with variable α (A) and average degree (B).


What we found (4/5)

The constants β and X vary as a function of average degree, which could become the basis for a predictive tool (?)


What we found (5/5)

But being able to estimate the decay of the average size of the giant component isn’t necessarily useful...

There are regions of the parameter space for which the distribution is strongly bimodal and the average is virtually never observed!


Conclusions

A huge variety of potential architectures can be simulated simply by applying different local connection rules.

The study of “complex networks” provides tools for a quantitative description of those architectures’ properties.

But we are still a long way from an efficient and robust network design...

Especially if the topology is to be dynamic!

Part 3: RAn (Robustness Analyser)


What RAn can do for you:

Simulate cumulative node failure and plot the evolution of the largest component’s relative size for any given topology.

Conduct basic statistical analysis of the numerical data.

Compute a set of global variables summarising the network’s behaviour under stress.

In a matter of seconds (for N up to 10^4).


What RAn cannot (yet) do:

Take into account the additional effects topological changes can have beyond affecting the relative size of the largest component.

This unfortunately includes the formation of bottlenecks due to re-routing of traffic through surviving nodes.

But we are working on it...


Summary:

RAn is a lightweight, easy-to-use, network robustness analyser.

Its primary purpose is to quickly obtain a rough evaluation of (and comparison between) alternative topologies.

It is therefore a powerful tool in the early stages of network design (or audit), but is meant to be used in conjunction with other, more specific simulators.


Example: robust “small worlds”?

Most people belong to a highly clustered social network: “I know most of my friends’ friends...”

But many have a few acquaintances outside the dense local mesh: “I have never met some of my friends’ friends...”

Hence the very popular “small world” effect.


“Small worlds” have well-known interesting properties:

Their diameter grows as a logarithmic function of their size...

Even when rewiring probability (proportion of “long-range” connections) is relatively low.

They are notoriously difficult to navigate: possible routing problems for otherwise very appealing network applications!


But how robust are they?


Obviously, it depends on several factors:

How far do short-range links go (~ how many people are there in a “local” cluster)?

- At one end of the spectrum is the “not-so-small world”, that is: a simple ring (local connections limited to 1 hop, no long-range links) - fairly brittle!

- At the other end is the fully connected mesh (everybody knows everybody) - unbreakable!

In between (i.e. “true” SW networks) a key question seems to be: what is the proportion of long-range links?
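The spectrum just described can be explored with a small generator. The sketch below builds a ring lattice and rewires links at random, in the style of the Watts and Strogatz construction (an assumption: the slides do not state which procedure was used). `small_world`, `hops` and `p` are hypothetical names, and `n` is assumed to be larger than `2 * hops`.

```python
import random

def small_world(n, hops=2, p=0.01, seed=None):
    """Ring of n nodes linked to neighbours up to `hops` steps away,
    with each link rewired to a random long-range target with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Short-range links: hops=1 gives the simple ring.
    for i in range(n):
        for d in range(1, hops + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    # Rewiring: replace link (i, j) by (i, new) with probability p.
    for i in range(n):
        for d in range(1, hops + 1):
            j = (i + d) % n
            if j in adj[i] and rng.random() < p:
                candidates = [c for c in range(n)
                              if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

if __name__ == "__main__":
    ring = small_world(1000, hops=1, p=0.0)        # the "not-so-small world"
    classic = small_world(1000, hops=2, p=0.01, seed=1)
    print(sum(len(nb) for nb in classic.values()) // 2, "links")
```

Rewiring moves links rather than adding them, so the average degree stays fixed while the proportion of long-range links is varied through `p`.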


Benchmark: the basic ring

Best fit: Xc~ 0.0035


Basic ring + “2 hops” connections (no long-range links...)

Xc~ 0.064


“Classic small world” (rewiring probability = 1%)

Xc~ 0.21


Xc~ 0.37 (best fit)


Xc~ 0.43


[Figure: evolution of Xc as calculated by RAn, plotted against rewiring probability (%). Fitted curve: y = 0.1756 x^0.3921, R² = 0.9816.]

[Figure: evolution of β plotted against rewiring probability (%) (more puzzling!). Maximum?]
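The power-law fit reported for Xc can be evaluated directly. The sketch below assumes that x is the rewiring probability expressed in percent (as on the plot's axis) and that the fit reads y = 0.1756 x^0.3921; `xc_fit` is a hypothetical helper name.

```python
def xc_fit(rewiring_percent, a=0.1756, b=0.3921):
    """Fitted value of Xc for a rewiring probability given in percent."""
    return a * rewiring_percent ** b

if __name__ == "__main__":
    for percent in (1, 5, 10, 20):
        print(f"{percent:>2}% -> Xc = {xc_fit(percent):.3f}")
```

At a rewiring probability of 10% the fit gives an Xc of roughly 0.43. The exponent below 1 means that each additional percent of long-range links yields a smaller gain than the previous one.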


“Equivalent” scale free network (same size and average degree)

Xc~ 0.61


“Equivalent” scale free network under attack

Xc~ 0.44 (best fit)


Conclusions

There are indications that a scale free network featuring extensive redundancy (2 connections per node) is more robust to node failure than a “small world”...

But if under attack, the behaviour of the scale free architecture appears remarkably similar to that of its counterpart...

Which is understandable considering that a “small world” cannot really be “attacked”!
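One way to see why is to compare removal orders. The sketch below assumes that "attack" means removing nodes in decreasing order of their initial degree, a common convention that the slides do not spell out; `removal_order` and `largest_after` are hypothetical names.

```python
import random

def removal_order(adj, attack=False, seed=None):
    """Order in which nodes are removed: random failure or degree-based attack."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    if attack:
        # Assumption: an attack removes nodes in decreasing order of their
        # initial degree; the stable sort keeps ties in random order.
        order.sort(key=lambda node: len(adj[node]), reverse=True)
    return order

def largest_after(adj, order, m):
    """Largest component size once the first m nodes of `order` are removed."""
    alive = set(order[m:])
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

if __name__ == "__main__":
    # A star is an extreme hierarchy: one hub holds everything together.
    star = {0: set(range(1, 10)), **{i: {0} for i in range(1, 10)}}
    hit = removal_order(star, attack=True, seed=0)
    print("star after 1 targeted removal:", largest_after(star, hit, 1))
    # In a ring every node has the same degree, so there is nothing to target.
    ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
    print(removal_order(ring, True, 5) == removal_order(ring, False, 5))
```

When all degrees are equal, as in the ring, the attack order is just a random order, which is the sense in which a topology without hubs offers no preferred target.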


Practical demonstration(s)...
