review: Zipf-like

From: Jing Su <jingsu_REMOVE_THIS_FROM_EMAIL_FIRST_at_cs.toronto.edu>
Date: Thu, 13 Oct 2005 10:39:46 -0400

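Using many webserver traces, this paper argues that the request
distribution is Zipf-like: the probability of a request for the i-th
most popular document is k/(i^a), where k is a normalising constant and
0 < a < 1. The value of a is specific to site characteristics. Judging
from the sampling of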
institutions from which the trace data were gathered, it is reasonable
to assume that most requests were for static content pages. However,
the distribution is still helpful today, since many dynamic pages are
built from static components (mostly images).
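
To make the k/(i^a) form concrete, here is a minimal Python sketch of
the model. The function names, and any values of n and a you pass in,
are my own illustrative assumptions and are not taken from the paper.
It computes the per-rank request probabilities and the hit ratio of an
idealised cache that always holds the most popular documents.

```python
def zipf_like_probs(n, a):
    """Request probability for documents ranked 1..n, with P(i) = k / i**a."""
    weights = [1.0 / (i ** a) for i in range(1, n + 1)]
    k = 1.0 / sum(weights)  # normalising constant so the probabilities sum to 1
    return [k * w for w in weights]


def ideal_hit_ratio(probs, cache_size):
    """Hit ratio of a hypothetical cache pinned to the top cache_size documents.

    Assumes independent requests, as in the simplified model.
    """
    return sum(probs[:cache_size])


if __name__ == "__main__":
    # Illustrative values only: 10,000 documents, exponent a = 0.8.
    probs = zipf_like_probs(10_000, 0.8)
    for size in (10, 100, 1_000):
        print(size, round(ideal_hit_ratio(probs, size), 3))
```

Because a < 1, the tail of the distribution is heavy, so under this
model the hit ratio keeps growing slowly as the cache gets larger
instead of saturating early.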

I don't think that their simplified model, which assumes independent
requests, is a big drawback. After all, natural languages exhibit the
same behaviour: word frequencies follow Zipf's law even though
successive words are clearly not independent. Since a hot document
would likely make its component documents hot as well, we can
abstractly think of the distribution as representing independent pages.

In terms of caching and eviction policies, the reality is that no cache
is ever really big enough. So we get misses and have to push stuff out.
Big deal. The only worry is getting ping-pong behaviour due to boundary
conditions. A practical hack such as a preferential second-chance list
seems sufficient to stop undesirable worst-case behaviour.
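
As a rough illustration of the second-chance idea, here is a minimal
Python sketch. It is plain second-chance (FIFO with a reference bit),
which is only one possible reading of the hack above; the
"preferential" part is not modelled, and the class and method names are
hypothetical.

```python
from collections import OrderedDict


class SecondChanceCache:
    """FIFO cache with a reference bit per entry (second-chance eviction)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> referenced bit, oldest first

    def access(self, key):
        """Return True on a hit; on a miss, insert key and return False."""
        if key in self.entries:
            self.entries[key] = True  # set the bit; queue position is unchanged
            return True
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[key] = False
        return False

    def _evict(self):
        """Evict the oldest entry whose reference bit is clear."""
        while True:
            key, referenced = self.entries.popitem(last=False)
            if not referenced:
                return key
            # Second chance: clear the bit and move the entry to the tail.
            self.entries[key] = False
```

A document that was hit since it entered the queue survives one
eviction pass, so a recently reused page sitting at the eviction
boundary is not pushed out immediately by a one-off request.
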
Received on Thu Oct 13 2005 - 10:39:54 EDT