Review: One-hop Source Routing

From: Di Niu <dniu_at_eecg.toronto.edu>
Date: Mon, 16 Oct 2006 21:37:16 -0400

Review: Improving the Reliability of Internet Paths with One-hop
Source Routing

Reviewer: Di Niu

This paper proposes one-hop source routing, a scalable approach to
recovering from Internet path failures. It first characterizes
Internet path failures through extensive trace measurements. Based
on these trace data, it then justifies the effectiveness of one-hop
source routing and evaluates the so-called "random-k" policy along
with two more complex alternatives. The paper concludes that
random-4 is an effective recovery policy that introduces little
overhead and scales well. The authors also implemented SOSR, a
one-hop source routing prototype, and evaluated it in the context
of a web-browsing application against popular servers. The paper is
interesting in that it is grounded in solid and extensive trace
measurements. However, in my opinion, the evaluation of the
random-k schemes in the later parts still needs improvement.

From the trace data, the paper finds that popular servers see
relatively few last-hop failures. It is therefore possible to use
one-hop source routing to improve end-to-end availability for
server paths, since it targets non-last-hop failures. Broadband
hosts, however, will not benefit from one-hop routing as much as
servers do. As far as the frequency of failures is concerned, a
power-law pattern exists: a small number of paths experience a very
large number of failures. During a loss incident, the destination
was probed indirectly through each of 39 intermediaries, and it was
found that 66% of all failures to servers are potentially
recoverable through at least one intermediary, compared with 39% of
broadband failures. According to the paper, one-hop routing could
therefore be an effective tool for recovering from failures in the
Internet core.

The paper sounds great thus far. Nonetheless, the evaluation of the
proposed policies is comparatively weak. First, the paper does not
offer a sound evaluation of the overhead incurred by the random-k
policy. Upon a failure, the random-k policy will choose k
intermediaries at random and "send packets through all k
intermediaries in parallel..." This means that as k increases, more
duplicate packets are sent during the recovery phase. As pointed
out in Section 2, "only 22% of server paths, 12% of broadband
paths, and 15% of random paths were failure-free: most paths in
each destination set experienced at least one failure". Failures
are in fact frequent, so when duplicates are sent we can expect a
substantial amount of overhead, which can cause congestion.
Although the relationship between the eagerness of random-4 and the
amount of overhead was studied, a more careful study of the
relationship between k and overhead is still needed.
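
To make the overhead concern concrete, below is a minimal sketch
(my own Python illustration, not the authors' SOSR code) of the
random-k recovery step as the paper describes it: on a detected
failure, k intermediaries are chosen at random and the packet is
duplicated through all of them, so recovery traffic grows linearly
with k. The intermediary list and the send_via helper are
hypothetical placeholders.

  import random

  # Hypothetical candidate set; the study probed through 39
  # intermediary nodes, so 39 placeholder addresses are used here.
  CANDIDATE_INTERMEDIARIES = ["10.0.0.%d" % i for i in range(1, 40)]

  def send_via(intermediary, packet):
      """Placeholder for tunneling a packet through a one-hop
      intermediary (in SOSR this would be actual forwarding)."""
      print("forwarding %r via %s" % (packet, intermediary))

  def random_k_recover(packet, k=4):
      """On a detected path failure, duplicate the packet through k
      randomly chosen intermediaries (conceptually in parallel), so
      the duplicated recovery traffic grows linearly with k."""
      chosen = random.sample(CANDIDATE_INTERMEDIARIES, k)
      for node in chosen:
          send_via(node, packet)
      return chosen

  if __name__ == "__main__":
      random_k_recover(b"GET / HTTP/1.1", k=4)

With random-4, every recovery attempt sends four duplicates of the
lost traffic, and the paper does not quantify how this per-recovery
cost accumulates as k grows or as failures become frequent.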

Second, in Section 3.4, the authors studied the effectiveness of
random-k for different failure locations. It was found that
random-k recovers poorly from near-source and last-hop failures.
However, only popular servers were considered here, which leaves
open the question of how random-k behaves for broadband hosts.
Moreover, it is unclear to what extent popular servers are
representative of destinations in the Internet: as mentioned at the
beginning, among the 3153 Internet destinations only 982 are
servers, which is not a majority. In addition, the overheads of
history-k and BGP-paths-k are not evaluated in the paper, which
makes the argument that these schemes are not scalable
unconvincing. Ironically, further evaluation is also needed to
determine whether random-k itself is scalable.

Generally speaking, the paper is clearly presented and contains
substantial work, including data analysis, protocol evaluation and
implementation. The weaknesses mentioned above do not eclipse its
merits.