Paper Review: An Analysis of Internet Content Delivery Systems
This paper examines content delivery systems, in particular four
such systems: HTTP, Akamai, Kazaa and Gnutella.  The study is based
on a trace collected at the University of Washington.
Based on this data, the authors attempt to quantify the growth of
peer-to-peer traffic, the differences in the content being delivered,  
the impact of symmetric data communications and the scalability of  
these delivery networks.  They also examine the potential benefits of  
caching content in these systems.
The paper thoroughly describes its trace methodology.  I
believe this is an important step in trace-based measurement studies.   
Although it would be difficult to fully replicate this experiment,  
this section provides key insights and could greatly aid future  
research in similar or parallel fields.  Furthermore, the data and
resulting analysis are more credible given that the methodology is
fully known.
The paper then provides a high-level analysis of the data and makes
the following key observations:
- The four delivery systems in the study accounted for 57% of total  
TCP traffic.
- Since 1999, HTML traffic decreased by 43% while video file traffic  
increased by a staggering 400%.
Next the paper presents a more detailed analysis of the  
characteristics of content delivery systems.  The first observation  
was that the median object size in P2P systems was significantly  
larger than in WWW and Akamai, suggesting a major difference
in the type of content being downloaded.  Another trend observed
in this paper is that a small number of users account for a
major proportion of traffic.  Also, P2P traffic generates few requests
with large transfers.  The opposite is true for WWW traffic.  In
Kazaa, the load was not as evenly spread as one would think.
This is a very interesting result and suggests there is indeed room
for improving P2P file-sharing systems.  Finally, the service
required by P2P users is orders of magnitude larger than that
required by WWW users, signaling an immediate need for more bandwidth
and provisioning in ISPs as these systems continue to grow in popularity.
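
As a rough illustration of the skew described above, the following
Python sketch computes the share of total bytes attributable to the
heaviest users.  The function name, the user labels and the byte
counts are all made up for illustration; none of them come from the
paper's trace.

```python
import math

def top_user_share(bytes_by_user, fraction):
    """Share of total bytes transferred by the heaviest `fraction` of users.

    bytes_by_user: dict mapping a user id to bytes transferred (hypothetical data).
    fraction: value in (0, 1], e.g. 0.25 for the top quarter of users.
    """
    totals = sorted(bytes_by_user.values(), reverse=True)
    total = sum(totals)
    if total == 0:
        return 0.0
    # Always count at least one user, and round the group size up.
    k = max(1, math.ceil(len(totals) * fraction))
    return sum(totals[:k]) / total

# Synthetic example: one heavy user among four.
usage = {"u1": 900, "u2": 50, "u3": 30, "u4": 20}
print(top_user_share(usage, 0.25))
```

Evaluating this for a range of fractions gives the points of a
cumulative distribution similar in spirit to the CDF plots in the
paper.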
The final section examines the potential benefits of caching in  
content delivery systems.  The analysis shows that in the Akamai case,  
local web proxies could entirely remove the need for separate content  
delivery networks due to the static nature of the content.  Similarly,  
for Kazaa, a cache which maintained the top 300 objects would result  
in a hit rate of 38%.  It is concluded that a P2P cache would provide  
benefit even in small populations.  The results are therefore  
promising.  P2P traffic consumes an overwhelming amount of bandwidth  
on the Internet and thus any savings provided by a caching system  
could have a major impact.
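
To make the top-N caching idea concrete, here is a minimal sketch of
how a request hit rate could be estimated from a trace, assuming an
idealized cache that is preloaded with the N most requested objects.
This is not the paper's methodology (which may, for example, weight
by bytes rather than by requests); the function name and the sample
trace are hypothetical.

```python
from collections import Counter

def top_n_hit_rate(requests, n):
    """Fraction of requests served by a static cache of the n most popular objects.

    requests: list of object ids, one entry per request (hypothetical trace).
    Assumes the cache is preloaded, so even the first request for a
    cached object counts as a hit.
    """
    if not requests:
        return 0.0
    counts = Counter(requests)
    hits = sum(count for _, count in counts.most_common(n))
    return hits / len(requests)

# Synthetic trace: "a" and "b" are the two most requested objects.
trace = ["a", "b", "a", "c", "a", "b", "d"]
print(top_n_hit_rate(trace, 2))
```

Running this over a real trace with n = 300 would give a number
comparable in kind to the 38% figure quoted above, subject to the
simplifying assumptions stated.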
The use of graphs in this paper was extremely useful and complemented
the analysis well.  Using raw throughput graphs along with CDF plots
aided the delivery of the material.  In measurement-based papers, the
analysis section is perhaps the most crucial, and this paper is an
example of a paper with a strong analytical component.
One thing that is evident from this paper is that the characteristics  
of Internet traffic are evolving.  Although WWW traffic once  
dominated, P2P traffic is now the clear majority.  Killer applications  
have the potential to significantly impact Internet traffic  
characteristics.  It is, therefore, vital to undertake such studies  
not only to understand the characteristics of the Internet but also to
understand the constantly changing nature of the traffic and,
by extension, of its applications.
-- Nadeem Abji
Received on Sat Nov 18 2006 - 17:58:16 EST