This paper investigates P2P workload using the Kazaa service as its
study sample. The authors find that the P2P workload does not follow the
Zipf model because Kazaa objects are immutable and clients fetch the
same object at most once.
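To make the fetch-at-most-once argument concrete, here is a toy simulation. It is only a sketch under my own assumptions, not the authors' model: every client draws requests from the same Zipf distribution (exponent 1), and in fetch-at-most-once mode a repeated draw is simply skipped. The function names and parameter values are made up for illustration.

```python
import random
from collections import Counter

def zipf_weights(n_objects, alpha=1.0):
    # Unnormalized Zipf weights: the object of rank r gets weight 1 / r**alpha.
    return [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]

def simulate(n_clients, n_objects, requests_per_client, fetch_at_most_once, seed=0):
    # Returns the number of fetches per object, indexed by intrinsic Zipf rank.
    rng = random.Random(seed)
    weights = zipf_weights(n_objects)
    objects = list(range(n_objects))
    counts = Counter()
    for _ in range(n_clients):
        fetched = set()  # objects this client already holds
        for _ in range(requests_per_client):
            obj = rng.choices(objects, weights=weights)[0]
            if fetch_at_most_once and obj in fetched:
                continue  # immutable object already downloaded: no new fetch
            fetched.add(obj)
            counts[obj] += 1
    return [counts[obj] for obj in objects]

# Hypothetical sizes, chosen only to keep the run short.
repeat = simulate(200, 100, 50, fetch_at_most_once=False)
once = simulate(200, 100, 50, fetch_at_most_once=True)
print("repeat fetches allowed:", repeat[:5])
print("fetch at most once:    ", once[:5])
```

With repeats allowed, the most popular objects collect many fetches per client. With fetch-at-most-once, no object can be fetched more often than there are clients, so the head of the curve is capped while the tail is mostly unaffected.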
P2P file sharing systems usually have the characteristic that the same
content is wrapped in different objects from different distributors. So,
users are unlikely to download such content from different objects again
and again. I think this factor contributes to the flat head of the Kazaa
popularity distribution curve: there are more objects, and the requests
for a given piece of content are spread across them. Unless the authors
take the time to examine each object shared in Kazaa, it is hard to
identify which objects carry the same content.
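A small arithmetic sketch of what I mean, with invented request counts and the simplifying assumption that requests for a piece of content are split evenly over its copies (real splits would be uneven):

```python
def split_across_objects(content_requests, copies_per_content):
    # Spread each content's requests as evenly as possible over its copies,
    # then return the per-object counts sorted from most to least requested.
    per_object = []
    for requests in content_requests:
        base, extra = divmod(requests, copies_per_content)
        for i in range(copies_per_content):
            per_object.append(base + (1 if i < extra else 0))
    return sorted(per_object, reverse=True)

# Hypothetical per-content request counts, roughly Zipf-shaped.
content_requests = [1000, 500, 333, 250]

by_content = split_across_objects(content_requests, 1)
by_object = split_across_objects(content_requests, 4)
print(by_content)
print(by_object)
```

Measured per content, the top item is far ahead of the rest. Measured per object, the top of the ranking becomes a run of equally popular objects, which looks like a flattened head even though the underlying interest in the content has not changed.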
This paper starts from the assumption that the Kazaa workload is
Zipf-like and then explains why the experimental results do not follow
the Zipf distribution. But why is Zipf THE model to compare with in the
first place? What benefits would P2P file sharing systems have if they
followed the Zipf distribution? For normal web traffic, the Zipf
distribution is important because it renders the web caches less
effective. I am not sure what it implies for P2P file sharing systems
whether or not they follow the Zipf distribution.
Received on Mon Nov 28 2005 - 10:38:11 EST