Globally Distributed Content Delivery Review

From: Troy Ronda <ronda_REMOVE_THIS_FROM_EMAIL_FIRST_at_cs.toronto.edu>
Date: Thu, 6 Oct 2005 09:16:17 -0400

Globally Distributed Content Delivery
Review by: Troy Ronda

Popular web sites suffer when requests overwhelm their infrastructure. Site
operators do not want the extra expense of maintaining excess capacity for
flash crowds, but they also do not want to lose revenue when the system crashes.
Content delivery systems, such as Akamai, own nodes across the network edge.
They will cache content for web sites so that site operators have on-demand
capacity without paying for excess. Akamai also allows dynamic content by
caching the non-dynamic fragments of a page but requesting the dynamic
fragment from the origin server. Deploying nodes across multiple ISPs
(multi-homing) reduces the chance that a site will be inaccessible to
users. Akamai uses a dynamic DNS system, which allows an Akamai hostname to
be resolved based on service, user location, and load balancing. This
mapping system uses BGP information, such as the number of hops between ASes,
and live network statistics to determine network topology. The system also
monitors nodes so that load will be balanced both by reassignment (DNS
updates) and by making overloaded nodes inaccessible. Performance for
non-cacheable objects is improved by splitting the TCP connection between the
origin server and the client: edge nodes can maintain more persistent,
higher-bandwidth connections than typical end-users can. Content delivery
faces many technical hurdles, but users benefit from a large distributed
network that can react much more quickly than a single-homed cluster.
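The mapping behavior described above might be sketched roughly as follows. This is my own minimal illustration, not Akamai's actual algorithm: the node names, fields, and weighted score combining distance and load are all invented for the sketch.

```python
# Sketch of a dynamic-DNS-style mapping decision: pick an edge node for a
# client by combining a network-distance estimate (e.g. BGP AS hops) with
# current load, and skip nodes that monitoring has marked down.
# All names and weights here are hypothetical.

from dataclasses import dataclass

@dataclass
class EdgeNode:
    name: str
    as_hops: int       # distance estimate from the client's resolver
    load: float        # current utilization, 0.0 (idle) to 1.0 (saturated)
    alive: bool = True # set False by the monitoring system

def resolve(nodes, hop_weight=1.0, load_weight=5.0):
    """Return the best healthy node; mimics a DNS answer that balances
    proximity against load."""
    candidates = [n for n in nodes if n.alive and n.load < 1.0]
    if not candidates:
        raise RuntimeError("no healthy edge node available")
    return min(candidates,
               key=lambda n: hop_weight * n.as_hops + load_weight * n.load)

nodes = [
    EdgeNode("edge-tor", as_hops=1, load=0.9),              # close but busy
    EdgeNode("edge-nyc", as_hops=2, load=0.2),              # farther, idle
    EdgeNode("edge-lon", as_hops=6, load=0.1, alive=False), # marked down
]
print(resolve(nodes).name)  # the nearby-but-busy node loses to the idle one
```

Note how both balancing mechanisms from the paper appear: reassignment (the score steers new lookups away from loaded nodes) and removal (dead nodes are never returned at all).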

I think cooperation between site operators is a good way to handle flash
crowds. Instead of one site holding excess capacity, all sites can absorb a
small extra load when it is distributed. Also, each site acts as a backup to
all the others, in the sense that each probably has a different ISP
connection (multi-homing). Akamai provides this ability as a service, which
eliminates the need for cooperation among individual site operators,
although it may be cheaper for the operators to form a coalition. Akamai
also provides the ability to cache static fragments of a dynamic page and
request the rest from the origin server. This gives maximum flexibility to
sites that may wish to keep, for example, their customer database hidden from
Akamai. Since Akamai has a globally distributed network of nodes, users
automatically get directed to the closest server, which improves the user
experience because content is delivered faster. Akamai also allows more
caching than traditional web proxy caches allow. They give examples like
authorization, content invalidation, and dynamic content assembly.
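The dynamic-content assembly just mentioned can be sketched like this. The placeholder syntax, cache layout, and function names are invented for illustration; the real mechanism (akin to edge-side includes) is much richer.

```python
# Sketch of edge-side page assembly: static fragments are served from the
# edge cache, while any fragment not in the cache (the dynamic part) is
# fetched from the origin server on each request. Hypothetical example.

edge_cache = {
    "header": "<header>Shop</header>",
    "footer": "<footer>(c) 2005</footer>",
}

def fetch_from_origin(fragment_id):
    # Stand-in for a request back to the origin server, which keeps
    # sensitive data (e.g. the customer database) hidden from the edge.
    return "<div>cart: 3 items</div>" if fragment_id == "cart" else ""

def assemble(template):
    """Replace {{frag:...}} placeholders: cached fragments come from the
    edge, uncached (dynamic) ones trigger an origin fetch."""
    out = []
    for part in template.split("{{frag:"):
        if "}}" in part:
            frag_id, rest = part.split("}}", 1)
            out.append(edge_cache.get(frag_id) or fetch_from_origin(frag_id))
            out.append(rest)
        else:
            out.append(part)  # text before the first placeholder
    return "".join(out)

page = assemble("{{frag:header}}{{frag:cart}}{{frag:footer}}")
```

Here only the cart fragment costs an origin round trip; the header and footer are served entirely from the edge.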

I almost see services like Akamai as a giant hack, like NAT. Although
extremely useful, they would be even better if this capability were built
into a protocol rather than relying on DNS. This may be unrealistic, but it
is worth trying. I also
wish there could be more cooperation among geographically separated sites to
accomplish the same at little extra cost, as discussed earlier. I would also
like to see some numbers on how long it takes for content to be propagated
across all nodes. This is most important for content like "breaking news" or
fixing an incorrect price on a commerce web site. Obviously, objects like
shopping carts are very difficult to distribute in this system; they could
still cause bottlenecks at the origin. There also seems to be a security
concern: if a malicious user were to target Akamai nodes, they could take
down many popular web sites at once.