This paper proposes a simple and effective approach to recovering
from Internet path failures, called one-hop source routing: when the
default path fails, the source attempts to recover by routing
indirectly through a small set of randomly chosen intermediaries.
The paper's contribution is threefold. First, the authors analyze the
characteristics of Internet path failures, measuring their frequency,
location, and duration, and then assess the potential benefit of
one-hop source routing in recovering from those failures. Overall,
they find that most Internet paths work well: most paths experienced
only a handful of failures, and most saw less than 15 minutes of
downtime over the week-long trace. But failures do occur, and they
are widely distributed across paths and portions of the network.
The failure characteristics have mixed implications for the potential
effectiveness of one-hop source routing. Since server path failures
are rarely on the last hop, there should be plenty of opportunity to
route around them. The second contribution of the paper is therefore
to show that, for failures that can be addressed through alternative
routing, a simple and scalable technique, one-hop source routing, can
achieve close to the maximum available benefit with very low
overhead. The authors model and analyze it in detail. A failed path
is recovered by routing indirectly through one of k randomly chosen
intermediaries, a policy they call random-k. From the data analysis,
they conclude that random-4 strikes a reasonable tradeoff between
effort and probability of success: it copes well with most failure
locations, recovering from 89% of middle_core failures and 72% of
near-destination failures. Moreover, there is a benefit to recovering
early, so random-4 should be invoked after observing just a single
packet drop.
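
To make the policy concrete, here is a minimal sketch (my own
illustration in Python, not the authors' code; send_direct and
send_via are hypothetical callables standing in for whatever
forwarding primitives a real deployment would provide) of how a
random-k sender might behave:

import random

def send_with_random_k(packet, dest, intermediaries, send_direct, send_via,
                       k=4, attempts=4):
    """Sketch of the random-k recovery policy.

    send_direct(packet, dest) -> bool and send_via(hop, packet, dest) -> bool
    are assumed callables. The paper's prototype may contact intermediaries
    in parallel; this sketch tries them sequentially for simplicity.
    """
    # Try the default Internet path first.
    if send_direct(packet, dest):
        return True
    # A single observed drop is enough to trigger recovery.
    for _ in range(attempts):
        # Pick k intermediaries uniformly at random (random-4 => k = 4).
        for hop in random.sample(intermediaries, min(k, len(intermediaries))):
            if send_via(hop, packet, dest):
                return True
    # Give up on intermediary-based recovery after the allotted attempts.
    return False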
After that, the authors implement and deploy a prototype one-hop
source routing infrastructure on PlanetLab, demonstrating that
one-hop source routing is easy to implement. This is the third
contribution of the paper.
The paper is well written, with a clear presentation, and the work is
clearly original. The main contribution is one-hop source routing,
which adds negligible overhead and achieves close to the maximum
benefit available to indirect routing schemes, without the need for
path monitoring, history, or a priori knowledge of any kind. The
paper is solid and detailed, especially the large-scale measurement
study of Internet path failures. However, there are still some
problems.
Does this approach make sense? According to the data analysis in the
paper, Internet paths generally work well and only a few experience
long-lasting failures. Moreover, broadband path failures are often on
the last hop, where there is little opportunity for alternative
routing. One-hop source routing only solves the recovery problem for
non-last-hop failures, which means it addresses only a small part of
the problem.
Moreover, the authors do not analyze the deployment cost. Although
they demonstrate that one-hop source routing has low overhead,
existing protocols and routers would need to be modified to deploy
it, which is a significant cost.
Another weakness, in my view, is that the authors do not analyze the
approach theoretically, which would make the results more convincing.
Why choose random-4? Why allow four random-4 attempts before giving
up on intermediary-based recovery? Is the overhead really low? Is it
really efficient? All of these conclusions are drawn from data the
authors collected over a single week, which makes me doubt whether
the data is reliable enough to support them. The paper clearly lacks
theoretical analysis.
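
To illustrate the kind of back-of-envelope analysis I have in mind
(my own rough model, not from the paper): if a fraction p of
candidate intermediaries can route around a given failure, and the
choices are treated as independent, then the probability that at
least one of k random picks succeeds is 1 - (1 - p)^k. One could then
check whether the measured recovery rates are consistent with such a
model:

def random_k_success_probability(p, k):
    """Probability that at least one of k randomly chosen intermediaries
    bypasses the failure, assuming each works independently with
    probability p (an illustrative simplification, not the paper's model)."""
    return 1.0 - (1.0 - p) ** k

# For example, if half of all intermediaries can bypass a failure,
# random-4 succeeds with probability 1 - 0.5**4 = 0.9375.
print(random_k_success_probability(0.5, 4))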