Algorithms for computing Maximally Redundant Trees for IP/LDP Fast-Reroute
Juniper Networks
10 Technology Park Drive Westford MA `01886` USA akatlas@juniper.net
Ericsson
Konyves Kalman krt 11 Budapest Hungary `1097` Gabor.Sandor.Enyedi@ericsson.com
Ericsson
Konyves Kalman krt 11 Budapest Hungary `1097` Andras.Csaszar@ericsson.com
Routing
Routing Area Working Group
A complete solution for IP and LDP Fast-Reroute using Maximally Redundant Trees is presented in . This document describes an algorithm that can be used to compute the necessary Maximally Redundant Trees and the associated next-hops.
MRT Fast-Reroute requires that packets can be forwarded not only on the shortest-path tree, but also on two Maximally Redundant Trees (MRTs), referred to as the Blue MRT and the Red MRT. A router which experiences a local failure must also have pre-determined which alternate to use. This document describes how to compute these three things, along with the algorithm design decisions and rationale. The algorithms are based on those presented in and expanded in .
Just as packets routed on a hop-by-hop basis require that each router compute a shortest-path tree which is consistent, it is necessary for each router to compute the Blue MRT and Red MRT in a consistent fashion. This is the motivation for the detail in this document.
As is the case today, a router's FIB will contain primary next-hops for the current shortest-path tree for forwarding traffic. In addition, a router's FIB will contain primary next-hops for the Blue MRT for forwarding received traffic on the Blue MRT and primary next-hops for the Red MRT for forwarding received traffic on the Red MRT.
The alternate next-hops that a point-of-local-repair (PLR) selects need not be consistent, but loops must be prevented. To reduce congestion, it is possible for multiple alternate next-hops to be selected; in the context of MRT alternates, each of those alternate next-hops would be equal-cost paths. This document provides an algorithm for selecting an appropriate MRT alternate for consideration. Other alternates, e.g. LFAs that are downstream paths, may be preferred when available; that decision-making is not captured in this document.
Algorithms for computing MRTs can handle arbitrary network topologies where the whole network graph is not 2-connected, as in , as well as the easier case where the network graph is 2-connected (). Each MRT is a spanning tree. The pair of MRTs provides two paths from every node X to the root of the MRTs. Those paths share the minimum number of nodes and the minimum number of links.
Each such shared node is a cut-vertex. Any shared links are cut-links.
There are five key concepts that are critical for understanding the algorithms for computing MRTs. The first is the idea of partially ordering the nodes in a network graph with regard to each other and to the GADAG root. The second is the idea of finding an ear of nodes and adding them in the correct direction. The third is the idea of a Low-Point value and how it can be used to identify cut-vertices and to find a second path towards the root. The fourth is the idea that a non-2-connected graph is made up of blocks, where a block is a 2-connected cluster, a cut-edge or an isolated node. The fifth is the idea of a local-root for each node; this is used to compute ADAGs in each block.
Given any two nodes X and Y in a graph, a particular total order means that either X < Y or X > Y in that total order. An example would be a graph where the nodes are ranked based upon their IP loopback addresses. In a partial order, there may be some nodes for which it can't be determined whether X << Y or X >> Y. A partial order can be captured in a directed graph, as shown in . In a graphical representation, a link directed from X to Y indicates that X is a neighbor of Y in the network graph and X << Y.
To compute MRTs, it is very useful to have the root of the MRTs be at the very bottom and the very top of the partial ordering. This means that from any node X, one can pick nodes higher in the order until the root is reached. Similarly, from any node X, one can pick nodes lower in the order until the root is reached. For instance, in , from G the higher nodes picked can be traced by following the directed links and are H, D, E and R. Similarly, from G the lower nodes picked can be traced by reversing the directed links and are F, B, A, and R. A graph that represents this modified partial order is no longer a DAG; it is termed an Almost DAG (ADAG) because if the links directed to the root were removed, it would be a DAG.
Most importantly, if a node Y >> X, then Y can only appear on the increasing path from X to the root and never on the decreasing path. Similarly, if a node Z << X, then Z can only appear on the decreasing path from X to the root and never on the increasing path. Additionally, when following the increasing paths, it is possible to pick multiple higher nodes and still have the certainty that those paths will be disjoint from the decreasing paths. For instance, node B in the previous example has multiple possibilities to forward packets along an increasing path: it can forward packets to either C or F.
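The increasing and decreasing walks described above can be sketched in a few lines. This is only an illustration: the edge set below is an assumption, chosen so that the walks from G match the ones named in the text, and it is not taken from the referenced figure. The helper names `reverse` and `walk_to_root` are hypothetical.

```python
# Hypothetical ADAG: a link X -> Y means X << Y. The edges are an assumption
# chosen to reproduce the walks from G described in the text.
ADAG = {
    "R": ["A"],
    "A": ["B"],
    "B": ["C", "F"],
    "C": ["D"],
    "D": ["E"],
    "E": ["R"],
    "F": ["G"],
    "G": ["H"],
    "H": ["D"],
}
ROOT = "R"


def reverse(graph):
    """Return the graph with every directed link reversed."""
    rev = {node: [] for node in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def walk_to_root(graph, start, root=ROOT):
    """Follow the first listed link from each node until the root is reached."""
    path = []
    node = start
    while True:
        node = graph[node][0]
        path.append(node)
        if node == root:
            return path


increasing = walk_to_root(ADAG, "G")           # follow links: higher nodes
decreasing = walk_to_root(reverse(ADAG), "G")  # reverse links: lower nodes

# Apart from the root, the two walks share no node.
shared = set(increasing[:-1]) & set(decreasing[:-1])
print(increasing, decreasing, shared)
```

Picking the first listed link is an arbitrary choice made for brevity; any higher (or lower) neighbor would keep the two walks disjoint, which is the property the text relies on.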
A basic way of computing a spanning tree on a network graph is to run a depth-first-search, such as given in . This tree has the important property that if there is a link (x, n), then either n is a DFS ancestor of x or n is a DFS descendant of x. In other words, either n is on the path from the root to x or x is on the path from the root to n.
Given a node x, one can compute the minimal DFS number of the neighbours of x, i.e. min( D(w) if (x,w) is a link). This gives the highest attachment point neighbouring x. Of greater interest, though, is the highest attachment point reachable from x and x's descendants. This is what is determined by computing the Low-Point value, as given in Algorithm and illustrated on a graph in .
From the low-point value and lowpoint parent, there are two very useful things which motivate our computation. First, if there is a child c of x such that L(c) >= D(x), then there are no paths in the network graph that go from c or its descendants to an ancestor of x - and therefore x is a cut-vertex. This is useful because it allows identification of the cut-vertices and thus the blocks. As seen in , even if L(x) < D(x), there may be a block that contains both the root and a DFS-child of a node while other DFS-children might be in different blocks. In this example, C's child D is in the same block as R while F is not.
Second, by repeatedly following the path given by lowpoint_parent, there is a path from x back to an ancestor of x that does not use the link [x, x.dfs_parent] in either direction. The full path need not be taken, but this gives a way of finding an initial cycle and then ears.
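A minimal sketch of the DFS numbering, low-point computation and cut-vertex test described above follows. It is not the document's own algorithm listing: the function name `dfs_lowpoint`, the adjacency-list representation and the sample topology are assumptions made for illustration. The sketch also assumes a simple graph with no parallel links, and it treats the DFS root as a cut-vertex only when it has more than one DFS child, which is the usual refinement of the L(c) >= D(x) test.

```python
def dfs_lowpoint(adj, root):
    """Return (D, L, cut_vertices) for an undirected graph.

    adj maps each node to an ordered list of neighbours. D is the DFS
    number, L the low-point value. Assumes no parallel links.
    """
    D, L, parent = {}, {}, {}
    cut = set()
    counter = 0

    def visit(x):
        nonlocal counter
        D[x] = L[x] = counter
        counter += 1
        children = 0
        for w in adj[x]:
            if w not in D:
                # Tree link: descend, then inherit the child's low-point.
                parent[w] = x
                children += 1
                visit(w)
                L[x] = min(L[x], L[w])
                # No path from w's subtree reaches above x.
                if x != root and L[w] >= D[x]:
                    cut.add(x)
            elif w != parent.get(x):
                # Non-tree link to an already numbered node.
                L[x] = min(L[x], D[w])
        if x == root and children > 1:
            cut.add(x)

    visit(root)
    return D, L, cut


# Hypothetical topology: a triangle R-A-B with a chain B-C-D hanging off it.
adj = {
    "R": ["A", "B"],
    "A": ["R", "B"],
    "B": ["A", "R", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
D, L, cut = dfs_lowpoint(adj, "R")
print(D, L, sorted(cut))
```

The recursion is adequate for small examples; a production implementation would likely use an explicit stack and the interface ordering required for consistency across routers.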
A key idea for the MRT algorithm is that any non-2-connected graph is made up of blocks (e.g. 2-connected clusters, cut-links, and/or isolated nodes). To compute GADAGs and thus MRTs, computation is done in each block to compute ADAGs or Redundant Trees and then those ADAGs or Redundant Trees are combined into a GADAG or MRT.
Consider the example depicted in (a). In this figure, a special graph is presented, showing us all the ways 2-connected clusters can be connected. It has four blocks: block 1 contains R, A, B, C, D, E, block 2 contains C, F, G, H, I, J, block 3 contains K, L, M, N, O, P, and block 4 is a cut-edge containing H and K. As can be observed, the first two blocks have one common node (node C) and blocks 2 and 3 do not have any common node, but they are connected through a cut-edge that is block 4. No two blocks can have more than one common node, since two blocks with at least 2 common nodes would qualify as a single 2-connected cluster.
Moreover, observe that if we want to get from one block to another, we must use a cut-vertex (the cut-vertices in this graph are C, H, K), regardless of the path selected, so we can say that all the paths from block 3 along the MRTs rooted at R will cross K first. This observation means that if we want to find a pair of MRTs rooted at R, then we need to build up a pair of RTs in block 3 with K as a root. Similarly, we need to find another one in block 2 with C as a root, and finally, we need the last one in block 1 with R as a root. When all the trees are selected, we can simply combine them; when a block is a cut-edge (as in block 4), that cut-edge is added in the same direction to both of the trees. The resulting trees are depicted in (b) and (c).
Similarly, to create a GADAG it is sufficient to compute ADAGs in each block and connect them. It is necessary, therefore, to identify the cut-vertices, the blocks and the appropriate local-root to use for each block.
Each node in a network graph has a local-root, which is the cut-vertex (or root) in the same block that is closest to the root. The local-root is used to determine whether two nodes share a common block. There are two different ways of computing the local-root for each node. The stand-alone method is given in and better illustrates the concept. It is used in the second option for computing a GADAG using SPFs. The other method is used in the first option for computing a GADAG using Low-Point inheritance and the essence of it is given in .
Once the local-roots are known, two nodes X and Y are in a common block if and only if one of the following three conditions applies.
Y's local-root is X's local-root: They are in the same block and neither is the cut-vertex closest to the root.
Y's local-root is X: X is the cut-vertex closest to the root for Y's block.
Y is X's local-root: Y is the cut-vertex closest to the root for X's block.
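The three conditions can be expressed directly as a predicate. The sketch below assumes the local-roots have already been computed and are held in a dictionary; the function name `in_common_block` and the `local_root` table are illustrative. The table is derived by hand from the four-block example described earlier (blocks rooted at R, C, H and K respectively), with the GADAG root given no local-root, which is an assumption of this sketch.

```python
def in_common_block(x, y, local_root):
    """True if nodes x and y share a block, given each node's local-root."""
    return (
        local_root[y] == local_root[x]  # same block, neither is its closest cut-vertex
        or local_root[y] == x           # x is the local-root of y's block
        or local_root[x] == y           # y is the local-root of x's block
    )


# Local-roots derived by hand from the four-block example (an assumption):
# block 1 is rooted at R, block 2 at C, block 4 (cut-edge H-K) at H,
# and block 3 at K. The GADAG root R has no local-root here.
local_root = {"R": None}
local_root.update({n: "R" for n in "ABCDE"})
local_root.update({n: "C" for n in "FGHIJ"})
local_root["K"] = "H"
local_root.update({n: "K" for n in "LMNOP"})

print(in_common_block("A", "D", local_root))  # both in block 1
print(in_common_block("C", "F", local_root))  # C is the local-root of block 2
print(in_common_block("F", "K", local_root))  # block 2 versus blocks 3 and 4
```

Note that a cut-vertex such as C passes the test both with nodes of block 1 (through the first condition) and with nodes of block 2 (through the second or third), which reflects its membership in both blocks.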
This algorithm computes one GADAG that is then used by a router to determine its blue MRT and red MRT next-hops to all destinations. Finally, based upon that information, alternates are selected for each next-hop to each destination. The different parts of this algorithm are described below. These work on a network graph after, for instance, its interfaces are ordered as per .
1. Select the root to use for the GADAG. [See .]
2. Initialize all interfaces to UNDIRECTED. [See .]
3. Compute the DFS value, e.g. D(x), and lowpoint value, L(x). [See .]
4. Construct the GADAG. [See for Option 1 using Lowpoint Inheritance and for Option 2 using SPFs.]
5. Assign directions to all interfaces that are still UNDIRECTED. [See .]
6. From the computing router x, compute the next-hops for the blue MRT and red MRT. [See .]
7. Identify alternates for each next-hop to each destination by determining which one of the blue MRT and the red MRT the computing router x should select. [See .]
To ensure consistency in computation, it is necessary that all routers order interfaces identically. This is necessary for the DFS, where the selection order of the interfaces to explore results in different trees, and for computing the GADAG, where the selection order of the interfaces to use to form ears can result in different GADAGs. The recommended ordering between two interfaces from the same router x is given in .
The precise mechanism by which routers advertise a priority for the GADAG root is not described in this document. Nor is the algorithm for selecting routers based upon priority described in this document. A network may be partitioned or there may be islands of routers that support MRT fast-reroute. Therefore, the root selected for use in a GADAG must be consistent only across each connected island of MRT fast-reroute support. Before beginning computation, the network graph is reduced to contain only the set of routers that support a compatible MRT fast-reroute. The selection of a GADAG root is done among only those routers in the same MRT fast-reroute island as the computing router x. Additionally, only routers that are not marked as unusable or overloaded (e.g. ISIS overload or ) are eligible for selection as root.
Before running the algorithm, there is the standard type of initialization to be done, such as clearing any computed DFS-values, lowpoint-values, DFS-parents, lowpoint-parents, any MRT-computed next-hops, and flags associated with the algorithm. It is assumed that a regular SPF computation has been run so that the primary next-hops from the computing router to each destination are known. This is required for determining alternates at the last step. Initially, all interfaces must be initialized to UNDIRECTED. Whether they are OUTGOING, INCOMING or both is determined when the GADAG is constructed and augmented. It is possible that some links and nodes will be marked as unusable, whether because of configuration, overload, or due to a transient cause such as . In the algorithm description, it is assumed that such links and nodes will not be explored or used, and no further discussion is given of this restriction.