A Fault-tolerant Routing Strategy for Gaussian Cube Using Gaussian Tree

Similar documents
On Embeddings of Hamiltonian Paths and Cycles in Extended Fibonacci Cubes

Tight Bounds on the Diameter of Gaussian Cubes

Hybrid Fault diagnosis capability analysis of Hypercubes under the PMC model and MM model

On Detecting Multiple Faults in Baseline Interconnection Networks

Robust Network Codes for Unicast Connections: A Case Study

The Complexity of a Reliable Distributed System

Pancycles and Hamiltonian-Connectedness of the Hierarchical Cubic Network

Routing Algorithms. CS60002: Distributed Systems. Pallab Dasgupta Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur

Distributed Systems Byzantine Agreement

An Algebraic View of the Relation between Largest Common Subtrees and Smallest Common Supertrees

AGREEMENT PROBLEMS (1) Agreement problems arise in many practical applications:

Fault-Tolerant Consensus

Node-to-set disjoint-path routing in perfect hierarchical hypercubes

Active Measurement for Multiple Link Failures Diagnosis in IP Networks

cs/ee/ids 143 Communication Networks

Reliable Broadcast for Broadcast Busses

Conquering Edge Faults in a Butterfly with Automorphisms

Hypercube Network Fault Tolerance: A Probabilistic Approach

Generalized Fibonacci cubes

HYPERCUBE ALGORITHMS FOR IMAGE PROCESSING AND PATTERN RECOGNITION SANJAY RANKA SARTAJ SAHNI Sanjay Ranka and Sartaj Sahni

Edge-pancyclicity of Möbius cubes

Part 2: Random Routing and Load Balancing

Utilising public information in Network Coding

Data Gathering and Personalized Broadcasting in Radio Grids with Interferences

Algorithms for Collective Communication. Design and Analysis of Parallel Algorithms

Network Algorithms and Complexity (NTUA-MPLA) Reliable Broadcast. Aris Pagourtzis, Giorgos Panagiotakos, Dimitris Sakavalas

Generalized DCell Structure for Load-Balanced Data Center Networks

An Ins t Ins an t t an Primer

Gossip Latin Square and The Meet-All Gossipers Problem

Data Gathering and Personalized Broadcasting in Radio Grids with Interferences

Mutually Independent Hamiltonian Paths in Star Networks

CORRECTNESS OF A GOSSIP BASED MEMBERSHIP PROTOCOL BY (ANDRÉ ALLAVENA, ALAN DEMERS, JOHN E. HOPCROFT ) PRATIK TIMALSENA UNIVERSITY OF OSLO

STUDY OF PERMUTATION MATRICES BASED LDPC CODE CONSTRUCTION

Embed Longest Rings onto Star Graphs with Vertex Faults

Information in Aloha Networks

Combinational Techniques for Reliability Modeling

Burst Scheduling Based on Time-slotting and Fragmentation in WDM Optical Burst Switched Networks

Symmetric Rendezvous in Graphs: Deterministic Approaches

All Good (Bad) Words Consisting of 5 Blocks

CHAPTER 3 LOW DENSITY PARITY CHECK CODES

High Performance Computing

Theoretical Computer Science. Fault-free longest paths in star networks with conditional link faults

Algebraic Properties and Panconnectivity of Folded Hypercubes

Distributed Algorithms of Finding the Unique Minimum Distance Dominating Set in Directed Split-Stars

MAXIMAL VERTEX-CONNECTIVITY OF S n,k

Tight Bounds on the Ratio of Network Diameter to Average Internode Distance. Behrooz Parhami University of California, Santa Barbara

PARALLEL computing is important for speeding up. Embedding Two Edge-Disjoint Hamiltonian Cycles and Two Equal Node-Disjoint Cycles into Twisted Cubes

Spanning Paths in Infinite Planar Graphs

Weakly Secure Data Exchange with Generalized Reed Solomon Codes

Biswapped Networks and Their Topological Properties

2 Introduction to Network Theory

Searching for Black Holes in Subways

Some mathematical properties of Cayley digraphs with applications to interconnection network design

Failure detectors Introduction CHAPTER

On the minimum neighborhood of independent sets in the n-cube

arxiv: v1 [math.co] 27 Apr 2016

Structuring Unreliable Radio Networks

Partial cubes: structures, characterizations, and constructions

A note on network reliability

Continuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks

AN EFFICIENT GOLAY CODEC FOR MIL-STD A AND FED-STD-1045 ERIC E. JOHNSON NMSU-ECE FEBRUARY 1991

Degradable Agreement in the Presence of. Byzantine Faults. Nitin H. Vaidya. Technical Report #

PATH BUNDLES ON n-cubes

Explicit Constructions of Memoryless Crosstalk Avoidance Codes via C-transform

Disjoint Path Covers in Cubes of Connected Graphs

Geometric Steiner Trees

PARALLEL computing is important for speeding up

Research Article On Counting and Embedding a Subclass of Height-Balanced Trees

CS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding

Construction of Vertex-Disjoint Paths in Alternating Group Networks

Notation. Bounds on Speedup. Parallel Processing. CS575 Parallel Processing

Hyper-Panconnectedness of the Locally Twisted Cube

Maximizable Routing Metrics

Generation of High Quality Non-Robust Tests for Path Delay Faults

Representation of Correlated Sources into Graphs for Transmission over Broadcast Channels

arxiv: v1 [cs.gr] 25 Nov 2015

ON THE EDGE-HYPER-HAMILTONIAN LACEABILITY OF BALANCED HYPERCUBES

Efficient Bit-Channel Reliability Computation for Multi-Mode Polar Code Encoders and Decoders

Minimal Delay Traffic Grooming for WDM Optical Star Networks

Asymptotics, asynchrony, and asymmetry in distributed consensus

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016)

Distributed Systems Principles and Paradigms. Chapter 06: Synchronization

Computer Engineering Department, Ege University 35100, Bornova Izmir, Turkey

R(4,5) = 25. Stanis law P. Radziszowski JGT 19 (1995)

k-protected VERTICES IN BINARY SEARCH TREES

Outline. EECS Components and Design Techniques for Digital Systems. Lec 18 Error Coding. In the real world. Our beautiful digital world.

Tutorial on Mathematical Induction

Improving the Height of Independent Spanning Trees on Folded Hyper-Stars

Constructing Independent Spanning Trees on Locally Twisted Cubes in Parallel

EMBEDDED PATHS AND CYCLES IN FAULTY HYPERCUBES

Lecture 4 : Introduction to Low-density Parity-check Codes

Stochastic Enumeration Method for Counting Trees

The shortest path tour problem: problem definition, modeling, and optimization

Sum-networks from undirected graphs: Construction and capacity analysis

Santa Claus Schedules Jobs on Unrelated Machines

FAIRNESS FOR DISTRIBUTED ALGORITHMS

Lower Bounds for Achieving Synchronous Early Stopping Consensus with Orderly Crash Failures

Bipanconnectivity of Cartesian product networks

Maximum Flow Problem (Ford and Fulkerson, 1956)

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit

Transcription:

A Fault-tolerant Routing Strategy for Gaussian Cube Using Gaussian Tree Peter, K K, Loh 1 and Zhang, Xinhua 1 School of Computer Engineering, Nanyang Technological University, Singapore Dept of Computer Science and Engineering, Shanghai Jiao Tong Univ, PRChina E-mail: askkloh@ntuedusg Abstract Gaussian Cubes (GCs are a family of interconnection topologies in which the interconnection density and algorithmic efficiency are linked by a common parameter, the variation of which can scale routing performance according to traffic loads without changing the routing algorithm However, there is no existing fault-tolerant routing strategy for GCs as well as node/link diluted cubes In this paper, the void is filled for GC with an algorithm based on a new topology: Gaussian Tree (GT With a many-to-one mapping, the original problem is converted into routing in GT, which is found to be more definite and predictable A new approach to categorizing faulty components is presented to overcome the problem of low node availability and the maximum number of faults tolerable is given The algorithm is livelock free and generates deadlock-free routes, which are at most F hops longer than the optimal route in a fault-free setting, if F faults are encountered Finally, simulation is done to show the algorithm s performance, demonstrating its contribution to making GC a more fault-tolerant topology 1 Introduction Gaussian Cubes (GCs are a family of interconnection networks parameterized by a modulus M and a dimension n [1] Their desirable scalability makes possible generalized analysis of interconnection cost, routing performance, and reliability Besides, communication primitives such as unicasting, multicasting, broadcasting /gathering [7] can also be done rather efficiently in all GCs [1] However, although research achievements abound in routing in binary hypercubes, there are no existing fault-tolerant routing strategies for GCs or for node/link dilution cubes One of the difficulties lies in the low network node availability (maximum number of faulty neighbors of a node that can be tolerated without disconnecting the node from the networ Therefore, if the topology is fixed, a new method has to be employed to tackle this intrinsic problem In this paper, we present a new routing algorithm based on a new topology called Gaussian Tree (GT In GC (, GT is dependent only on and divides all the nodes in GC ( into classes according to their least significant bits So the original problem is converted into first routing in GT (ie between different classes and then routing in one such class The former is facilitated by the definite and predictable routing in trees while the latter is actually routing in ordinary binary hypercube Faults encountered in different stages of this divide-and-conquer strategy lead to a new categorization of faulty components, enabling analysis of routing strategy in the presence of far more faults than the network node availability Methodologically speaking, this approach also opens window to a brand-new way of analyzing network reliability, which is especially valuable for incomplete networks The characteristics of our routing strategy for GC ( encompass: 1 Incurs message overhead of only O(n The computation complexity for intermediate nodes is at most O ( ( n log 3 Guarantees a message path length not exceeding F longer than the optimal path found in a fault-free setting, provided the distribution of faulty components in the network satisfies the precondition of Theorem 3 and 5 n 4 Each node requires at most + 1 rounds of fault status exchange with its neighbors 5 Each node maintains and updates at most F n-bit node addresses, where F is the number of faults related to nodes whose least significant bits are same as the current node 6 Generates deadlock-free and livelock-free routes 7 The number of faulty components tolerable is presented in Figure 4 and Theorem 5 This paper is organized as follows Preliminaries are given in Section to provide an equivalent definition of

GC that facilitates the following discussion Section 3 defines GT The routing algorithm for the fault-free GC is described in Section 4 separately to make the subsequent section clearer In Section 5, the fault-tolerant routing strategy that deals with all categories of faults is studied Then in Section 6, simulation results are presented to demonstrate the performance of our algorithm The whole paper is concluded by Section 7 where some suggestions for further work are given Preliminary The binary Gaussian Cube [1] is denoted by GC ( M, where n (network dimension and M (modulus 1 It has n nodes that are identified with unique n-bit labels A link connects two nodes p and q if the following conditions are both true: th 1 The labels of p and q differ in the c bit for some c, c (n 1 p and q are in the congruence class [c] M, where M = min { c, M The congruence class of c modulo M, [c] M, is the set { km + c k Z, where Z represents the set of integer According to the definition above, if node p = an 1an ac a1a ( a i {, 1 for i [, n ] is connected with q = an 1an ac a1a, then there must exist k1 and k Z, such that: an 1an ac a1a = k 1 M + c (1 an 1an ac a1a = k M + c ( ( (1 and take absolute value on both sides, we get: c = k1 k M ' (3 Therefore, M must be the power of Since M =min { c, M, M must also be the power of if M < c Otherwise, as M = min { c, M we can assume M < n-1 without loss of generality Consequently, there will be no link spanning in any dimension c > log M Effectively, n 1 log M the network is separated into disconnected subnetworks, with each combination of the first n 1 log M bits representing one such subnetwork n log M Formally, GC ( M = U G i= i Each subnetwork G i is composed of <V i, E i >, where V i = { an log M ai ab log M b j b log M, an log M ai a i b j {,1, j = E i = {( v1, v E v1 Vi, v Vi, where E is the set of edges in the original network n Obviously, for, [, log M i j and i j, Vi IV j = Ф, Ei I E j = Φ So routing can be done within the subnetwork G i if the source and destination both belong to G i, or fails otherwise Furthermore, as G i is isomorphic to GC ( log M +1, log M, this situation is covered in the following case, where M is power of So henceforth, we assume M is power of (Theorem 1 In GC (, node p = anan ac a ( a i {, 1 for i [, n ] has a link in dimension c ( c [ 1, n ] if and only if: ac ac a = c if c [ 1, ] where x % y represents the modulus of x divided by y, like in C/C++ And each node has a link in dimension Proof: For dimension, since for any M 1, M = min {, M = 1 Any integer p and q must be in the congruence class [] M = [] 1 So each node has a link in dimension Two other cases should be considered to prove Theorem 1 (Case I c (, n (Necessary According to Equation (1, anan a a1a = k 1 M +c Thus, an an a + 1a + a a a = k1 + c Take the modulus of on both sides and due to the fact that a a a, we obtain < 1a a c% If a 1a a = c% a = (Sufficien a a, then a a a1a c can be wholly divided by n n ana n ac a1a c Define k1 = Z and if a c =, k = k1 + c otherwise k = k1 c The a a n n a a a c 1 = k 1 + c = k1 min( = k 1 M + c and an 1an ac a1a = k + c c = k min(, + c = k M + c In other words, according to the original definitio an 1an ac a1a has a link in dimension c (Case II c [ 1, ] The proof is similar to case I g 3 Gaussian Tree 1a a c % a = if c (, n According to Theorem 1, we can see that whether a packet can be forwarded through dimension c at node p is entirely irrelevant to an an a, regardless of whether c > or not So the last bits in nodes address c,

are of more importance We define a Gaussian Graph based on these bits (Definition 1 : Gaussian Graph We call the undirected graph G n ( n Gaussian Graph if it is composed of <V n, E n >, where: V n = { an 1an a1a a i {,1, for i [, n ] E n = {( an 1an ac a1a, an 1an ac a1a c = or c [1, n-1] and ac 1ac a1a = c Figure 1 demonstrates the topology of G, G 3, and G 4 They can be generated easily by adding edges, according to the definition of E n, to the original graph which is composed only of nodes G : G 3 : G 4 : 1 11 1 1 11 11 1 111 1 11 11 111 1111 111 11 1 11 111 11 1 1 1 11 111 11 Figure 1: Gaussian Graphs: (a G, (b G 3, (c G4 (Lemma 1: Tree s Equivalent Definition Suppose graph G has n vertices and e edges G is a tree if and only if G is connected and e = n [] (Theorem G n is a tree ( n Proof: Step 1 G n is connected This is evident from the following algorithm PC It can find a route from s to d in G n, when s and d are given (Algorithm 1 Path Construction Algorithm (PC path PC( s = sn 1sn s1s, d = d n 1d n d1d (c (a (b { Let c be the dimension corresponding to the leftmost 1 in R = s d; // means bitwise exclusive OR if (c== return (s, d; // s and d are neighbors, suppose ac ac a = c ( a i {,1, i [, c ] path = ( sn 1 sca c a, sn 1 sca c a ; path1 = PC( sn 1sn s1s, sn 1 sca c a ; path = PC( sn 1 sca c a, d n 1d n d ; return path1 path path; // Here, stands for the concatenation operation Eg, PC (111,1111 = PC (111, 111 (111,1111 PC (1111, 1111 The recursion must be able to terminate within depth n because the leftmost 1 moves at least one bit rightward after one recursio until it reaches dimension when the source and destination will be neighbors As links are not found step by step from source to destinatio it requires a sort to re-order the final link set n Step There are nodes in G n (Obvious n Step 3 There are only edges in G n We denote the number of links spanning in dimension i as E n (i ( i [, n ] E n ( = n A node has a link on dimension 1 if and only if its rightmost bit is 1 Such links only connect nodes in the form of ( an 1an x 1, an 1an x 1 So E n (1 = n A link spanning in dimension can only connect node pairs in the form of: ( an 1an x1, an 1an x1 So E n ( = n 3 Likewise, it is easy to prove that E n (i = n 1 n 1 n i Thus E n = n ( n i n = i= i= Combine 1-3 and apply Lemma 1, it can be concluded that G n is a tree g From now o we denote G n as T n and name it as Gaussian Tree (GT to emphasize this property We denote the node k in T n as T n ( In PC Algorithm, since the path will not go to one node more than once and we are routing in a tree, the resultant route must be optimal Besides, as the algorithm finds the path link by lin the spatial and computational complexities are dependent on the diameter of T n (maximum distance between node pairs, denoted as D ( T n Figure shows that D ( T n is O (n Thus, the time and space complexity for running Path Construction Algorithm is O ( D( Tn + D ( Tn log D( Tn = O ( n log n The second term is for the sorting

Diameter 15 1 5 Diameter ~ Dimension 5 8 11 14 17 3 Diameter Figure Diameter of T n versus n (Dimension The existence of Gaussian Tree is crucial for our algorithm because, for each source and destination pair in a tree, there is a set of nodes, which the packet must cover in its journey, and which can be calculated at the source This makes routing much more definite and predictable 4 Routing in fault-free Gaussian Cube (Definition k-ending Class In GC (, for k [, ], we call the following set EC(, k-ending class: EC(, = { anan a a a1a ai {,1, i [, n ], a a = k EC(, is abbreviated as EC( when the GC ( is given According to Theorem 1, if link ( 1,v v spans in dimension c, then v 1,v EC( c% EC ( corresponds to T ( in Gaussian Tree T Let the dimensions in [, n ] on which each node of EC ( has a link comprise set Dim(,, then Dim(, = [, n 1] I[ k] When the GC ( is give Dim(, is also abbreviated as Dim( Suppose the source is s and the destination is d Denote R=s d (Exclusive OR If there is a 1 in R and its dimension c is no less than, then the path from s to d must cover at least one node x, such that x EC ( c% Viewed in T, that means the path must begin from T ( s %, end at ( % T d and must pass all nodes in S = { T ( k % k, & k R, where dimension n & stands for bitwise AND operation Since the problem has now been mapped to a tree, with the starting and ending nodes as well as the intermediate nodes give it is simpler to find an optimal route We can use Algorithm 1 to find a route from T (s to T (d in T, when, sand d are given Secondly, we introduce an algorithm for arranging multi-destination routing from a tree root Several nodes belonging to the tree need to be visited and then the packet must go back to the root It is easy to find that as long as the following principle is met, the path generated must be optimal: if the packet is currently at node it can never backtrack to the parent unless no destination still exists in the subtree of p (Algorithm Closed-Traverse Algorithm in tree (CT Suppose we are at the r root r = r r r1 r where r i {,1 for all branch i [, ] We are to point for visit D = { d 1, d,, dn d b i whose members are all i nodes in the tree and finally go back to r The prototype of the algorithm is d j CT ( r, D We first pick up randomly one d D and d i use Algorithm 1 to find a route L from r to d Then for each d i D, if d i is d covered by L, we only need Figure 3 Example for to record that fact But if it CT algorithm is not covered, we must find a node in L at which the packet must branch away from L For example, in Figure 3, the bold line represents L, and to reach d i, the route must branch at b i However, to calculate b i, we need not find the complete path from r to d i The following function FindBP(L, r, d i is enough Suppose CheckIn (v,l returns whether v is covered by L Point FindBP(L, r, d i // FindBP(route, source, destination { Let c be the dimension corresponding to the leftmost 1 in R = s d i ; if (c== return r; // b i = r suppose ac ac a = c ( a i {,1, i [, c ] ; ( 1,v v = ( r rc aca, r rc aca ; if (CheckIn( v 1, L &&!CheckIn( v, L return v 1 ; if (CheckIn( v 1, L && CheckIn( v, L return FindBP(L, v, d i ; if (!CheckIn( v 1, L &&!CheckIn( v, L return FindBP(L, r, v ; //!CheckIn( v 1, L && CheckIn( v, L is impossible As a node in L might serve as branch point for more than one destination in D, we use a table to record it We denote the mapping as B ( For example, in Fig 3, b i is the branch point for d i and d j, so B ( b i = { d i, d j

After all members in D are processed and table B ( is obtained, we begin to go from r to d by following L Once we arrive at a node p where B ( p Φ, run this algorithm again by calling CT ( B( p After that, we proceed along L, until d is reached Then go back to r in a reverse direction of L Since this is a distributed algorithm, CT is not recursive as it appears here g Finally, we present the complete routing algorithm for fault-free Gaussian Cube (Algorithm 3 Fault Free GC Routing (FFGCR The input of FFGCR is: n and for GC (, binary source s = sn 1sn s, destination d = d n 1d n d FFGCR (, s = sn 1sn s1s, d = dn 1dn d1d { // map the problem from GC ( to T R = s d; s = ( % ( % i P={ i [, n ] R &, D={ T ( x% x P ; L = PC(s, d ; // L T Build table B ( for all nodes n L with FindBP( n = s ; while (n Null // traverse using the least significant { // dimensions in GC( if ( B (n Φ call CT( B(n to traverse all nodes in B(n and go back to n ; if ( n D go through all preferred dimensions c [, n ] I[ x] n = getnext(l, n; // get next node in L // getnext(l, d = Null It can be easily deduced that the message overhead is O (n and the computation complexity is O ((n log 5 Fault-tolerant routing strategy in GC To overcome the problem of low node availability, we categorize faulty components (Definition 3 A-category (lin fault If a link error occurs at a dimension c, it is called A-category (lin fault (Definition 4 B-category fault If all link failures incurred by an error are in dimensions less than, then the error is called B-category fault B-category faults can be both link error and node error as long as that node has no incident link spanning in a dimension c A link error is either A or B-category (Definition 5 C-category (node fault If a node error implies break down of links in dimensions both smaller and no smaller than, it is called C-category (node fault A node error is either B or C- category (Definition 6 k-ending-t-equivalent Class In k-ending class EC(,, for n Dim ( k t [, 1] (see Definition for the meaning of Dim(, we call the set EEC(, k-ending-tequivalent class: EEC(, = { a n a a a EC( a, bits in dimensions other than [, ] U Dim ( comprise value t k-ending-t-equivalent Graph GEEC ( a, is defined as a subgraph of GC( whose vertex set is EEC(, with original edges connecting vertex in EEC(, (Theorem 3 If only A-category faults exist in GC (, and in all GEEC(, ( k [, ], t [,, the number of faulty component is less than n k N ( = + 1 δ ( ( δ ( k, = k <?1:, there is a fault-tolerant and cycle-free routing strategy for any source and destination pair Proof Obviously, GEEC ( a, is a binary hypercube embedded in GC ( Let source be s and destination be d Let p = s d Denote: P = i { i [, n ] p AND, D = { x % x P, I ={ EC( x x D As there are only A-category faults, traversing through links spanning in the least significant dimensions is always successful So it is guaranteed that for any member EC ( in I, a packet can reach at least one node in EC ( Suppose a packet reaches EC( n Dim( I by arriving at node x and x EEC(, The k ( if k, k +, k +, k + 3,, k + max(, n k bits of x and d are x' = x x1 x Dim ( and d ' = dd1 d Dim ( respectively Then we can focus on routing in binary hypercube GEEC(, from x ' to d ', which is guaranteed by the precondition of the theorem and FTCR in [4] or strategies in [5][6] which ensure a packet to be sent from any non-faulty source to any non-faulty destination in a deadlock-free fashio as long as the number of faulty links is less than the dimension of the binary hypercube After all the bits in

dimensions [ k] I [, n ] are set to be same as d, use links spanning in the last dimensions to go to another member in I Finally destination d is reached g Suppose there are only A-category faults and F A- category faults are encountered, then the resultant route is at most F longer than the optimal route found in a fault free setting We can also conclude that in GC (, the maximum number of faulty links tolerable is: 1 n t T( GC( = k max( tk, k= Figure 4 demonstrates the trend of log T( GC( versus when < 5 log ( T ( GC( ~ n 3 5 15 1 5 n k where t k = + 1 δ ( 4 5 6 7 8 9 1 11 1 13 14 15 16 17 18 19 1 3 4 5 6 7 8 9 3 Dimension (n alpha=1 alpha= alpha=3 alpha=4 Figure 4 log ( T ( GC( ~ n (dimension (Definition 7 Exchanged Hypercube The Exchanged Hypercube is defined as EH ( s, = ( V, E ( s 1, t 1, where V = { as abt bc ai, b j, c {,1 for i [,s], j [,t] E = {( v1, v V V v1 v = 1 or v 1 [ s + t : t + 1] = v [ s + t : t + 1], H ( v1[ t :1], v[ t :1] = 1, v1[] = v[] = 1 or v1 [ t :1] = v[ t :1], H ( v1[ s + t : t + 1], v[ s + t : t + 1] = 1, v 1 [ ] = v[] = Here, v [ x : y] represents the bit pattern of v between dimension y and x inclusive H (x, y stands for the Hamming distance between x and y In EH ( s,, the -ending nodes together with the links connecting in between comprise t s-dimension binary hypercubes (denoted as B s ( EH ( s, collectively For t any k [,, we denote as B s ( EH ( s,, the binary hypercube whose nodes are composed of the following set: V s ( EH ( s,, = { a a b b t b = s 1 t b ai, b j {,1, i [, s, j [, If x Vs ( EH ( s,, and x [ s + t : t + 1] = we denote such nodes as V s ( EH ( s,, p Likewise, the 1-ending nodes together with the links s connecting in between comprise t-dimension binary hypercubes (denoted as B t ( EH ( s, collectively For s any l [,, we denote as B t ( EH ( s,, l the binary hypercube whose nodes are composed of the following set: V EH( s,, l { a a b a s a = t ( = s 1 t b l, a, b {,1 i [, s, j [, If x V ( EH ( s,, l i and j x [ t :1] = q, we denote x as ( EH ( s,, l, q So V s ( EH ( s,, p[ s + t :1] = V t ( EH ( s,, [ s + t :1] Suppose there are F s faulty components in B s ( EH ( s,, and F t faulty components in B t ( EH ( s, Let E ( EH ( s, = {( v 1, v EH ( s, v1 v = 1 Suppose there are F faulty links in E( EH ( s, \{( v1, v EH ( s, v 1 or v is faulty We have: (Theorem 4 If F s + F < s and F t + F < t, there is a deadlock-free and livelock-free algorithm that can deliver messages from a nonfaulty source r to a nonfaulty destination d in no more than H ( r, d + ( F s + Ft + hops This theorem is evident from the following algorithm: (Algorithm 4 Fault-tolerant Routing in EH ( s, (FREH (Case I Suppose r = B s ( EH ( s, t, k, l 1 and d = B t ( EH ( s, t, l, k 1 Since F s + F < s, it is affordable to communicate within each B s ( EH ( s,, in the initialization phase, so that each node in it knows and records the set of nodes in B s ( EH ( s,, whose link in E ( EH ( s, (ie in dimension is faulty In one case, if r finds that B s ( EH ( s, t, k, l s link in dimension is non-faulty, it sends the packet within B s ( EH ( s,, k to B s ( EH ( s, t, k, l This is guaranteed to succeed as was proved in Theorem 3 After that, B s ( EH ( s, t, k, l sends the packet to B t ( EH ( s, t, l, k via the link in dimension Finally, the packet is sent in B t ( EH ( s, t, l to B t ( EH ( s, t, l, k 1, which is guaranteed by F t + F < t In the other case, if by looking up its local table, r finds that the -dimension link of B s ( EH ( s, t, k, l is faulty, then there must be a nonfaulty neighbor of r whose -dimension link is also nonfaulty This is guaranteed by V s t

F s + F < s Denote it as B s ( EH ( s, t, k, l So the packet is sent to B s ( EH ( s, t, k, l, which in tur sends the packet to B t ( EH ( s, t, l, k Now there must be a nonfaulty neighbor of B t ( EH ( s, t, l, k in B t ( EH ( s, t, l whose -dimension link is also nonfaulty If there is such a neighbor in preferred dimensio then use it Otherwise, use the spare dimension and mask it so that it will not be used again After going back to B s ( EH ( s,, the process above repeats and finally the packet reaches d Due to the use of mask for dimensions in [1, t], the route is livelock free Deadlock-freeness is still guaranteed Since faulty components might cause the use of a spare dimensio which brings about for and pro between B t ( EH ( s, and B s ( EH ( s,, the number of hops is bounded by H ( r, d + ( F s + Ft (Case II If r = B t ( EH ( s, t, l, k 1 and d = B s ( EH ( s, t, k, l 1, as EH ( s, is isomorphic to EH ( t, s, the algorithm is the same as case I (Case III Suppose r = B s ( EH ( s, t, k, l and d = B s ( EH ( s, t, k 1, l 1 If k 1 = k, then it is routing in s- dimension binary hypercube Otherwise, the packet is sent to B t ( EH ( s,, k via the -dimension link of r or one of its neighbors in B s ( EH ( s,, k Then the problem is the same as in case I But now, the number of hops is bounded by H ( r, d + ( F s + Ft + because of the extra hops in dimension (Case IV Suppose s = B t ( EH ( s, t, l, k and d = B t ( EH ( s, t, l 1, k 1 ( l1 l This case is handled in the same way as in case III g Come back to GC ( Suppose T ( p and (q are neighbors in T For each k [, n Dim( p Dim( q ], define graph G(, = < V (,, E(, >, where V (, is the set of nodes in GC( whose bits in dimensions other than Dim (p U Dim (q U [, ] comprise k in value and whose rightmost bits represent p or q E(, is the subset of links in GC( which connect nodes in V (, The links in E(, that span between B t ( G(, and B s ( G(, comprise E ( G(, If the last bits are viewed as dimension that can take value only in { q, then G(, is effectively isomorphic to Exchanged Cube EH ( Dim( p, Dim ( q (or EH ( Dim (q, Dim (p Suppose there T are e t (, faulty components in B t ( G(, and e s (, faulty components in B s ( G(, The number of faulty links in E ( G(, \{( v1, v EH ( s, v 1 or v is faulty is denoted as e (, (Theorem 5 In GC (, for all T ( p and T (q which are neighbors in T, if e s (, + e (, k < Dim ( p and e t (, + e (, < n Dim( p Dim( q Dim ( q, for all k [, ], then there is a fault-tolerant and cycle-free routing strategy for any source and destination pair Proof (Outline The algorithm used in Theorem 3 fails only when links spanning in dimension [, 1] are broken With our discussion about the fault-tolerant routing in Exchanged Cube, such a problem is solved as long as the fault number satisfies the precondition of Theorem 5 g 6 Simulation results A software simulator is constructed to imitate the behavior of the real networ and thus test the performance of our algorithm The assumptions are: (1 source and destination nodes must be nonfaulty, ( Eager readership is employed where packet service rate is faster than packet arrival rate, (3 a faulty node makes all of its incident links faulty, (4 a node knows the status of its links to neighboring nodes and B or C category faults related to nodes which have the same least significant bits as the node itself The performance of routing algorithm is measured by two metrics: average latency and throughput Average latency is defined as LP/DP, where LP is the total latency of all packets that have reached destination while DP is the number of such packets Throughput is defined as DP/PT, where PT is the total processing time taken by all nodes We use its logarithm with base for clearer comparison Fig 5 and Fig 6 demonstrate the result of fault-free GC ( M (n [6, 14], M {1,, 4 From Figure 5, it can be observed that the average latency of Gaussian Cube increases as the networks dimension increases from 6 to 14 As the network size increases, the diameter of the hypercube also increases A packet to be transmitted has to take a longer path to reach its destinatio resulting in a higher average latency Furthermore, as M increases, the network average latency also increases This is due to the dilution of links with increasing M The influence of M on the average latency is even more significant than network dimension In Fig 6, it is demonstrated that the throughput of all

networks is increasing as the dimension is increased from 6 to over 14 This is due to the parallelism of the networks and the increase in the number of nodes that can generate and route packets in the networ is faster than the time complexity of O( ( n log where = log M By increasing the network size, the number of link is also increasing at a higher rate than the node number This in turn increases the total allowable packets in the network With parallelism, more packets will reach destination in a given duration Figure 7 and 8 demonstrate the result for GC(1 ( n [5,13], with the comparison between two situations: fault free and one faulty node in presence It is Throughput(pkts/s Latency(us/pk Throughput(pkt/s 8 6 4 3 5 15 1 5 Throughput ~ Dimension 6 7 8 9 1 11 1 13 14 Dimension M=1 M= M=4 Fig 6 Throughput ~ Dimension Ave Latency ~ Dimension 3 5 15 1 5 5 6 7 8 9 1 11 1 13 Dimension No Fault One Fault Fig 7 Fault s influence on Ave Latency 3 5 15 1 5 Ave Latency ~ Dimension 6 7 8 9 1 11 1 13 14 Dimension M=1 M= M=4 Fig 5 Average Latency versus Dimension Throughput ~ Dimension 5 6 7 8 9 111113 Dimension No Fault One Fault Fig 8 Fault s influence on Throughput clear that when the number of faults increases, the trend of average latency is to increase while the throughput is to decrease This is because when more faults appear, the packet is more likely to use spare dimensions which makes the final route longer Thus, the latency increases and throughput decreases 7 Discussion and conclusion In this paper, a new effective fault-tolerant routing strategy is proposed for Gaussian Cubes Gaussian Tree is introduced to facilitate the algorithm and helps to make routing more definite The routing strategy ensures livelock freeness and generates deadlock-free routes with the length no more than F longer than the optimal route found in the fault-free setting The space and computation complexity as well as message overhead size are all reasonable Although the Gaussian Cube is very sparsely connected, which partially caused the current nonexistence of fault tolerant routing strategies for it, our algorithm can still tolerate a satisfactory number of faults, with careful analysis of their location and influence Some of our results can still be improved upon For example, although the idea of categorizing the faulty components is useful in Gaussian Cube and possibly in other node/link diluted cubes, the exact way of categorization might vary due to different topologies A new unified metric needs to be designed to measure the fault-tolerance ability of interconnection networks so that it is fair despite their different routing algorithms and different methods of fault categorization 8 References [1] Hsu, W J, Chung, M J, and Hu, Z, "Gaussian Networks For Scalable Distributed Systems", The Computer Journal, Vol 39, No 5, 1996, pp 417-46 [] Hsu, W J, Chung, M J and Hu, Z A New Gaussian networks and Their Applications Int l Symp Parallel and Distributed Supercomputing, Japa 1995 [3] Douglas B West, Introduction to Graph Theory - Second edition Chapter NJ: Prentice Hall, 1 [4] Peter K K Loh, H Schröder, W J Hsu, Fault-tolerant routing on complete Josephus Cubes Proc 6 th Australian Conf Computer systems architecture, IEEE Computer Society Press Queensland, Australia, 1, pp 95-14 [5] Wu, J, Reliable Unicasting in Faulty Hypercubes Using Safety Levels, IEEE Transactions on Computers, Vol 46, No, February 1997, pp 41-47 [6] La Youra An Adaptive Fault-Tolerant Routing Algorithm for Hypercube Multicomputers, IEEE Transactions on Parallel and Distributed Systems, Vol 6, No 11, November 1995, pp 1147-115 [7] D P Bertsekas and J N Tsitsiklis, Parallel and Distributed Computation: Numerical Methods Englewood Cliffs, NJ: Prentice-Hall, 1989, ch 1, pp 7 68