next up previous
Next: Main Result Up: Designing Computer Networks to Previous: Introduction.

Graph Theory Background

A computer network will be modeled as an undirected graph consisting of a set of vertices corresponding to the host computers that are the nodes of the network, and a set of edges corresponding to (bi-directional) communication links that connect the nodes. The failure and recovery characteristics of the components of a network are complex. We will assume that nodes and links usually function properly, with a small probability that each node is non-functioning at a randomly chosen instant, while each link is non-functioning with small probability . In practice, these probabilities may vary from node to node and from link to link, but we assume standard components so that is constant for all nodes, and similarly for links. We will also suppose that the functioning of components is stochastically independent.

In what follows, we are concerned with the MTBF (Mean Time Between Failures) for a network, where the failure being considered is network partitioning. To deal with the MTBF concept, we quantize time by breaking it into relatively small discrete segments. The time out of service for a non-functioning node or link before a repair is made is, on the average, made up of a small number of such segments. Now in the stochastic model of component failure, we define a component to be non-functioning during a time segment if it is non-functioning at any instant during that segment. This has the effect of slightly expanding the average time out of service for any component, and therefore increasing the probabilities and by a small amount. The advantage is that when we show that the probability of network partitioning during a time segment is extremely unlikely, we are simultaneously showing that the mean time between partitionings is a large multiple of a time segment, and therefore that the network is reliable.

A network is partitioned if and only if the subnetwork consisting of functioning nodes and links is disconnected. In other words, there exist two functioning nodes where no path connecting the two nodes exists through functioning nodes and links of the network.

The key concepts for analyzing resilience to network partitioning are multiple connectivity of graphs and related results. These concepts are now introduced along with some simple examples.

For example, a 1-connected network is simply a nonempty connected network. A network consisting of a simple loop, with edges passing from one node to the next until all nodes are visited and the path returns to the original node, is easily seen to be 2-connected; this is known as a loop network.

The relationship between d-connectivity and containers is given by Menger's Theorem [1, Chapter 9,] as follows:

This kind of result can be used to estimate the probability of partition. If a network is d-connected, then each pair of nodes is joined by d disjoint paths. All of these paths must simultaneously have a component failure before the two nodes they connect will be partitioned. However, knowing that there are d disjoint paths is not quite enough to give a reasonable estimate of the probability of a partition. The problem is that Menger's Theorem says nothing about how long the paths are. For example, consider a simple loop network having n nodes. As we already noted, this network is 2-connected. However, a container of width 2 for a pair of adjacent nodes consists of one path of length 1 and a second path of length n-1. So this container has length n-1. Thus nodes that appear to be close can actually be very far apart with respect to containers of width greater than 1.

The concept of d-diameter was introduced in [4] to measure the efficiency of routing and fault tolerance of networks. In this paper we show that this concept can also be used to measure the partition resilience of a network.



next up previous
Next: Main Result Up: Designing Computer Networks to Previous: Introduction.



Kenneth Baclawski
Wed Nov 1 21:08:51 EST 1995