Delayed Internet Routing Convergence

From: Stefan Saroiu <stefan_at_cs.toronto.edu>
Date: Thu, 12 Oct 2006 07:26:24 -0400

Re-posting on Vladan's behalf...

--Stefan

---------- Forwarded message ----------
Date: Thu, 12 Oct 2006 03:13:54 -0400
From: Vladan D <vladandjeric_at_gmail.com>
To: stefan_at_cs.toronto.edu
Subject: Delayed Internet Routing Convergence

This paper is the result of a 2 year study of inter-domain routers on the
Internet. The authors have observed that there is significantly more
latency in converging inter-domain routers' views of network topology than
previously assumed. The upper bound is higher by an order of magnitude,
with oscillations sometimes lasting as long as 15 minutes. The effect of
this delay is a substantial increase in latency, loss of connectivity, and
packet loss. The paper shows that many of the common beliefs about failover
are incorrect, and proposes more accurate bounds on delay, as well as
improvements.

The authors used passive data collection and fault injection on Internet
routers to study the effects of inter-domain routing faults. Four types of
(BGP) routing changes were studied: explicit new route, explicit route
withdrawal, implicit route replacement with shorter or longer path. Several
conclusions are drawn about the existing situation:

-- A more accurate upper bound for convergence delay is shown to be O(n!),
where n is number of AS's
-- The lower bound is Ù((n-3)*30)
-- The average delay in Internet failover is 3 minutes
-- Most of the delay in restoring paths comes from the interaction of
configurable router protocol timers and specific implementation decisions
during delayed BGP convergence
-- Internet path failover can increase packet loss by factor of 30 and
latency by factor of 4
-- The injection of a single routing event triggers the communication of
multiple route announcements and withdrawals from each AS

The authors created a greatly simplified model of the network and came to
several important conclusions:

-- Introducing a MinRouteAdvert timer imposes monotonically increasing path
metrics for successive routing updates and therefore achieves global
synchronization (routers eliminate k-1 and shorter length paths before
announcing k length paths)
-- Further improvements could be achieved by performing loop detection at
both sender and receiver, convergence could happen within a single thirty
second round
-- Eliminating the concept of rounds is not trivial

These conclusions are backed by simulations on the simplified model.
Received on Thu Oct 12 2006 - 07:26:36 EDT

This archive was generated by hypermail 2.2.0 : Thu Oct 12 2006 - 09:31:02 EDT