An Analysis of Internet Content Delivery Systems

From: <nadeem.abji_at_utoronto.ca>
Date: Sat, 18 Nov 2006 17:57:54 -0500

Paper Review: An Analysis of Internet Content Delivery Systems

This paper examines content delivery systems and, in particular, four
such systems: HTTP, Akamai, Kazaa, and Gnutella. The study is based
on a network trace collected at the University of Washington.

Based on their data, the authors attempt to quantify the growth of
peer-to-peer traffic, the differences in the content being delivered,
the impact of symmetric data communication, and the scalability of
these delivery networks. They also examine the potential benefits of
caching content in these systems.

The paper then thoroughly describes its trace methodology. I
believe this is an important step in trace-based measurement studies.
Although it would be difficult to fully replicate this experiment,
this section provides key insights and could greatly aid future
research in similar or parallel fields. Furthermore, the data and
resulting analysis are more credible given that the methodology is
fully known.

The paper first provides a high-level analysis of the data and makes
the following key observations:

- The four delivery systems in the study accounted for 57% of total
TCP traffic.
- Since 1999, HTML traffic decreased by 43% while video file traffic
increased by a staggering 400%.

Next, the paper presents a more detailed analysis of the
characteristics of content delivery systems. The first observation
is that the median object size in P2P systems is significantly
larger than in WWW and Akamai traffic, suggesting a major difference
in the type of content being downloaded. Another trend observed in
the paper is that a small number of users account for a major
proportion of traffic. P2P traffic also consists of few requests
with large transfers, whereas the opposite is true for WWW traffic.
In Kazaa, the load was not as evenly distributed as one might
expect. This is a very interesting result and suggests there is
indeed room for improving P2P file-sharing systems. Finally, the
service demanded by P2P users is orders of magnitude larger than
that of WWW users, signaling an immediate need for more bandwidth
provisioning at ISPs as these systems continue to grow in popularity.
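The observation that a few users generate most of the traffic can be
illustrated with a small sketch. The numbers below are not from the
paper's trace; they assume a synthetic Zipf-like (power-law)
distribution of per-user bytes, with a made-up user count and exponent,
purely to show how such concentration is measured:

```python
# Hypothetical sketch: how much traffic do the top 10% of users generate
# if per-user byte counts follow a Zipf-like distribution? The user
# count and exponent are illustrative assumptions, not trace values.

NUM_USERS = 1000
ALPHA = 1.0  # assumed Zipf exponent

# User at rank r contributes bytes proportional to 1 / r**ALPHA.
bytes_per_user = [1.0 / (r ** ALPHA) for r in range(1, NUM_USERS + 1)]

total = sum(bytes_per_user)
top_10pct = sum(bytes_per_user[:NUM_USERS // 10])

print(f"Top 10% of users account for {100 * top_10pct / total:.1f}% of bytes")
```

Under these assumed parameters the top decile of users carries well over
half the bytes, mirroring the skew the paper reports from its real trace.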

The final section examines the potential benefits of caching in
content delivery systems. The analysis shows that, in the Akamai
case, local web proxies could entirely remove the need for a separate
content delivery network due to the static nature of the content.
Similarly, for Kazaa, a cache maintaining the top 300 objects would
achieve a hit rate of 38%. The authors conclude that a P2P cache
would provide benefit even for small populations. These results are
promising: P2P traffic consumes an overwhelming amount of bandwidth
on the Internet, so any savings provided by a caching system could
have a major impact.
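The top-N caching argument can be sketched in a few lines. This is not
the paper's simulation; it assumes requests follow a Zipf-like
popularity distribution with a made-up object count and exponent, and
simply computes the probability mass captured by pinning the N most
popular objects:

```python
# Hedged model of a top-N object cache: if request popularity is
# Zipf-like, the hit rate of a cache holding the N most popular objects
# equals the fraction of request probability mass those N objects carry.
# NUM_OBJECTS and ALPHA are illustrative assumptions, not trace values.

NUM_OBJECTS = 100_000
CACHE_SIZE = 300
ALPHA = 1.0  # assumed Zipf exponent

weights = [1.0 / (r ** ALPHA) for r in range(1, NUM_OBJECTS + 1)]
total = sum(weights)
hit_rate = sum(weights[:CACHE_SIZE]) / total

print(f"Top-{CACHE_SIZE} cache hit rate under this model: {hit_rate:.1%}")
```

The exact hit rate depends heavily on the assumed exponent and object
population, which is why trace-driven studies like this one are needed
to ground such estimates in real workloads.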

The use of graphs in this paper was extremely effective and
complemented the analysis well. Raw throughput graphs alongside CDF
plots aided the delivery of the material. In measurement-based
papers, the analysis section is perhaps the most crucial, and this
paper is an example of one with a strong analytical component.
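For readers unfamiliar with such plots, an empirical CDF of the kind
praised above is straightforward to compute. The transfer sizes below
are invented for demonstration and have no connection to the paper's
data:

```python
# Illustrative sketch: computing an empirical CDF of per-request
# transfer sizes. The sample values are hypothetical, chosen only to
# show the construction: sort the samples, then assign each the
# fraction of observations at or below it.

sizes = [12, 3, 40, 7, 7, 100, 3, 55]  # hypothetical transfer sizes (KB)

xs = sorted(sizes)
cdf = [(x, (i + 1) / len(xs)) for i, x in enumerate(xs)]

for x, p in cdf:
    print(f"P(size <= {x} KB) = {p:.2f}")
```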

One thing that is evident from this paper is that the characteristics
of Internet traffic are evolving. Although WWW traffic once
dominated, P2P traffic is now the clear majority. Killer applications
have the potential to significantly alter Internet traffic
characteristics. It is therefore vital to undertake such studies, not
only to understand the characteristics of the Internet, but also to
track the constantly changing nature of its traffic and, in turn, of
its applications.

-- Nadeem Abji
Received on Sat Nov 18 2006 - 17:58:16 EST
