This paper argues that providing high availability for large amounts of
data across a highly dynamic set of peers would exceed available
resources. The key observation is that, despite growing amounts of disk
space and processing power, available upload bandwidth has not scaled
sufficiently to keep up with demand.
I felt the strongest result of their calculations is that two million
cable-modem users at 40% availability would serve only as much
bandwidth as 2000 high-availability universities. This goes against the
conventional assumption that a sufficient number of peers would
outstrip the pipes of large "centralized" systems.
However, their calculations assume that home users have gigabytes of
newly created content to share. This is not the case: the vast majority
of data on home systems is shared data that is already highly replicated
(e.g. popular songs and movies). Thus the problem isn't to constantly
churn all data equally, but to provide sufficient replication for
unpopular data in the tail of the Zipf popularity distribution. Popular
data takes care of itself until it becomes unpopular.
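To make that point concrete, here is a small simulation sketch (the Zipf
exponent, catalogue size, and cache sizes are my own assumptions, not
figures from the paper): if each peer simply caches what it requests
under Zipf-distributed popularity, popular items end up replicated on
many peers automatically, while tail items have few or zero copies.

    import random

    NUM_ITEMS = 10_000        # assumed catalogue size
    NUM_PEERS = 5_000         # assumed peer population
    REQUESTS_PER_PEER = 20    # assumed cache size per peer
    ZIPF_S = 1.0              # assumed Zipf exponent

    # Popularity of item k is proportional to 1 / k^s.
    items = range(NUM_ITEMS)
    weights = [1.0 / (k + 1) ** ZIPF_S for k in items]

    replicas = [0] * NUM_ITEMS
    for _ in range(NUM_PEERS):
        cached = set(random.choices(items, weights=weights,
                                    k=REQUESTS_PER_PEER))
        for item in cached:
            replicas[item] += 1

    print("replicas of most popular item:", replicas[0])
    print("items with zero replicas:", sum(r == 0 for r in replicas))

Under these assumptions the head of the distribution is replicated
thousands of times for free, while a large fraction of the tail has no
copies at all, which is where a storage system actually has to spend
its upload bandwidth.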