Review: Delayed Internet Routing Convergence

From: Fareha <fareha_at_eecg.toronto.edu>
Date: Wed, 11 Oct 2006 20:58:57 -0400

This paper examines the latency in the Internet after a topological
change, due to the convergence properties of inter-domain routing. They
describe several unexpected properties of convergence and claim that the
delay on inter-domain route convergence is due almost entirely to the
unforeseen interaction of protocol timers with specific router vendor
implementation decisions.
A common misbelief was that the Internet backbone infrastructure
supports rapid restoration and rerouting in the event of individual link
or router failures (on the order of 30secs). However, the authors
collected experimental data over a 2 year period (by injecting faults
across the Internet) to show that the average in Internet inter-domain
path failovers is 3 minutes, and in some cases the routing table
oscillations lasted up to 15 minutes. Furthermore, this delay will grow
linearly with the addition of new autonomous systems to the Internet in
the best case, and exponentially in the worst.
The authors briefly decribed the BGP in the context of thier research,
defining a failover to be the implicit withdrawal and replacement of a
route with one having a different ASPath. They then describe their
experimental setup, in which they injected a quarter million faults into
geographically and topologically diverse peering session and then
measured their impact through both routing and end-to-end measurements.
The paper uses the experimemtal results and the BGP specs to present a
simplified model of the BGP convergence process. Each AS is modelled as
a single node, the network is a full mesh, and BGP is a single linear,
global queue. The authors first consider BGP as it is, without bounds on
the delay of update propagation or processing, in order to estimate an
upperbound on convergence. They show that the upper-bound on theoretical
computation on the number of router stated and control messages
exchanged during the process of BGP convergence is factorial with
respect to the number of autonomous systems in the Internet (n). The
authors then assume bounded delays on BGP message propagation and show
that the lower bound on convergence is linear with n.
Furthermore, the paper suggested changes to vendor BGP implementation,
which if deployed would significantly improve Internet latencies,
reducing the lower bound to a constant. However, despite these changes,
BGP path changes will still trigger temporary oscillations. Although
these can be reduced through additional changes, these will just
complicate the protocol and increase router overhead.
The paper describes the steps taken by the authors in a comprehensive
manner, providing measurement results for quite a long period. However,
they do not propose any worthwhile solutions to the problem that they
explored in such detail.
Received on Wed Oct 11 2006 - 20:59:04 EDT

This archive was generated by hypermail 2.2.0 : Wed Oct 11 2006 - 22:21:39 EDT