Review - One-hop Source Routing

From: Ivan Hernandez <ivanxx_at_gmail.com>
Date: Sun, 15 Oct 2006 18:30:01 -0400

Review of Improving the Reliability of Internet Paths with One-hop
Source Routing
by Ivan Hernández

The paper presents a simple and scalable solution to recover from
Internet path failures using one-hop source routing. The idea is
straightforward: when a communication failure occurs, the source
attempts to reach the destination through intermediary nodes; these
intermediaries are expected to be reachable through paths that are not
affected by the failure. This approach exploits the fact that, in
general, several paths exist between any two hosts on the
Internet. The solution requires no knowledge of network state and adds
no routing-message overhead.

The authors configured a set of PlanetLab hosts to act as
intermediaries between a source computer and a destination
computer. They tested their idea with the following experiment:
following a failure, the source sent probe messages toward the
destination through the intermediaries; each intermediary then probed
the destination and returned the result. The source could then
communicate with the destination through any intermediary that
reported success. One observation from this experiment is that a large
fraction of failures are recoverable through a large number of
intermediaries.

The prototype of the solution is called Scalable One-hop Source
Routing (SOSR). From the previous description, we can see that only a
source node and several intermediary nodes are needed. There is no
destination-side component; the solution is completely transparent to
the destination. In the prototype, the source node does not probe all
intermediaries; instead, it selects four at random. The authors
justify this simple strategy, and the choice of four intermediaries,
with experimental results.
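The random-4 recovery strategy can be sketched in a few lines. The
following is only an illustrative simulation, not the authors' code:
the function names and the reachability table are my own inventions,
and real SOSR probes the intermediaries over the network rather than
consulting a local table.

```python
import random

def probe_via(intermediary, destination, reachable):
    """Simulated probe: report whether this intermediary can reach
    the destination (driven here by a toy reachability table)."""
    return reachable.get((intermediary, destination), False)

def recover(intermediaries, destination, reachable, k=4):
    """SOSR-style recovery sketch: on a path failure, pick k
    intermediaries at random and return the first one whose probe
    of the destination succeeds, or None if none succeed."""
    candidates = random.sample(intermediaries, min(k, len(intermediaries)))
    for node in candidates:
        if probe_via(node, destination, reachable):
            return node
    return None

if __name__ == "__main__":
    nodes = ["A", "B", "C", "D", "E", "F"]
    # Suppose only C and E still have a working path to the destination.
    table = {("C", "dst"): True, ("E", "dst"): True}
    print("recovered via:", recover(nodes, "dst", table))
```

Note that with only four random probes, recovery can fail even when
some working intermediary exists; the paper's results suggest four is
enough in practice because most recoverable failures can be bypassed
through many different intermediaries.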

The solution works best at recovering from failures in the core,
because the core offers many paths that may not be affected by the
failure, and those paths can be used to reach intermediaries.
Conversely, the closer the failure is to the source or destination,
the more intermediaries it renders ineffective. The authors highlight
that popular servers experience few last-hop failures. This is easy to
explain: such servers can afford multi-homing and server-redundancy
mechanisms. The remaining failures are distributed uniformly between
the middle/core and the src/dst side.

I liked the paper. The solution is simple and scalable. Nevertheless,
there are several points I want to discuss. First, the paper does not
describe when and how the sender returns to normal operation -- that
is, stops communicating through the intermediary. Second, deploying
this solution requires intermediary nodes that are "well distributed"
across the Internet, but the authors do not provide any information
about this. Finally, in a real deployment, how much traffic can an
intermediary handle, and who is going to provide the intermediaries?
Received on Sun Oct 15 2006 - 18:30:11 EDT
