Review: Delayed Internet Routing Convergence

From: Robert Danek <rdanek_at_sympatico.ca>
Date: Wed, 11 Oct 2006 18:09:38 -0400

Paper: Delayed Internet Routing Convergence

Name: Robert Danek.
Course: CS2209, Fall '06.

    This paper discusses the effects of convergence delay on packet loss
and latency in end-to-end Internet paths. The time immediately after a
router change, due to a failure or an addition of a node, to the time
that a consistent view of the new network topology is re-established, is
referred to as the convergence delay. The authors point out that a
widely held myth is that the Internet backbone can react to router
failures fairly rapidly, and that convergence delay on the internet is
minimal. However, this is not actually the case based on the results
presented in this paper.

    The results of the paper are based on an experiment conducted over a
2 year period. During this time the authors injected a quarter million
faults into a diverse number of places in the network, and then measured
the impact using both end-to-end measurements and tracking routing table
changes in backbone routers. Based on these experiments, the authors
discovered that convergence delay could take upwards of tens of minutes,
but with an average of 3 minutes.

    To understand these results, the authors begin by discussing the
slow convergence of distance vector (DV) routing algorithms. The problem
with DV algorithms is that routing loops can form due to insufficient
information being transmitted between nodes. The protocol used for
inter-domain routing, Border Gateway Protocol (BGP), is a type of DV
algorithm. It solves some of the problems with general DV algorithms,
such as count-to-infinity, but it has other problems associated with it.

    The authors also present a BGP Convergence Model in which they
establish theoretical upper and lower bounds on convergence. Assuming
unbounded message delay, the upper bound they present shows that the the
number of computational states explored during convergence grows
factorially with the number of nodes. The lower bound, which assumes
bounded message delay, is Omega((n-3)*30) seconds.

    I thought this was a good paper. In addition to detailing their
empirical results, and providing a theoretical model, the authors went
on to suggest some reasons for the poor convergence delays that they
observed. They also recommended specific changes to ASPath loop
detection to improve the BGP routing protocol.
Received on Wed Oct 11 2006 - 18:09:45 EDT

This archive was generated by hypermail 2.2.0 : Wed Oct 11 2006 - 19:14:38 EDT