REVIEW: Understanding Availability from Nilton Bila on 2005-11-14

From: Nilton Bila <nilton_REMOVE_THIS_FROM_EMAIL_FIRST_at_cs.toronto.edu>
Date: Mon, 14 Nov 2005 10:29:01 -0500

REVIEW: Understanding Availability

SUMMARY:
The paper discusses availability measurements in peer-to-peer systems. It
performs measurements of availability in the Overnet network and concludes
that assumptions made in other peer-to-peer availability measurement
studies lead to biased and erroneous results. For the study, the authors
performed availability measurements on 2400 hosts over a period of 15
days. The conclusions are that: identifying hosts by IP address
underestimates availability and overestimates the number of hosts in a
network, as a result of IP aliasing; the availability of hosts changes
over time; availability also follows a diurnal pattern, which has
implications for systems that actively replicate hosts' content on joins
and leaves; host availability is generally independent across hosts; and
hosts depart and arrive frequently.

KEY STRENGTHS:
In support of its arguments, the paper presents empirical evidence
comparing availability measurements based on a unique identifier with
those based on host IPs. Over a period of 7 days, 1468 hosts used 5867
IPs, a ratio of 1:4. Even within one day, the authors found that nearly
40% of probed hosts used multiple IPs. This is evidence that previous
studies which identified hosts by IP have underestimated availability: by
uniquely identifying hosts it was found that 50% of them have availability
of 0.3 or less, but had the authors identified hosts by IP instead, this
number would be reduced to 0.7.
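
The aliasing effect can be illustrated with a minimal sketch. The probe
log, its field layout, and the function name below are hypothetical and
not taken from the paper; the sketch only assumes that each probe records
a round number, a unique host identifier, the IP seen, and whether the
host responded.

```python
from collections import defaultdict

def availability_by_key(probes, key_index, total_rounds):
    """Fraction of probe rounds in which each key responded at least once."""
    rounds_seen = defaultdict(set)
    for record in probes:
        responded = record[3]
        if responded:
            rounds_seen[record[key_index]].add(record[0])
    return {key: len(rounds) / total_rounds for key, rounds in rounds_seen.items()}

# Hypothetical log: (round, host_id, ip, responded).
# One host answers every round but shows up under a new IP each time.
probes = [
    (0, "h1", "10.0.0.1", True),
    (1, "h1", "10.0.0.2", True),
    (2, "h1", "10.0.0.3", True),
    (3, "h1", "10.0.0.4", True),
]

by_id = availability_by_key(probes, key_index=1, total_rounds=4)
by_ip = availability_by_key(probes, key_index=2, total_rounds=4)
```

Keyed by identifier, the log contains one host that answered every round.
Keyed by IP, the same log contains four apparent hosts, each answering in
one round out of four, so the host count is inflated and the per-host
availability is deflated.
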
The paper also provides evidence that the availability distribution
depends on the measurement period: shorter measurement periods estimate
higher availability than longer ones. It thus recommends that
availability distributions be reported together with the duration of the
measurements.
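
A small sketch of why the measurement duration matters. It assumes one
possible mechanism, a host that responds for a while and then leaves for
good; the response sequence and function name are hypothetical and not
taken from the paper.

```python
def availability(responses, window):
    """Fraction of the first `window` probes that were answered."""
    observed = responses[:window]
    return sum(observed) / len(observed)

# Hypothetical host: answers the first 4 probes, then never returns.
responses = [True] * 4 + [False] * 12

short_window = availability(responses, 4)
long_window = availability(responses, 16)
```

Under this assumption the same host has availability 1.0 over the short
window and 0.25 over the long one, which is why a distribution quoted
without its duration is hard to interpret.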

WEAKNESSES:
The availability measurements are relative to one particular client, the
authors'. They capture how often the client received a response from the
target host, not how often the target host was actually available. This
subtle difference implies that if a host's offline periods fall between
the client's probes, the measurement will overestimate availability. An
absolute availability measurement would have to be measured as
uptime / experiment time. This inaccuracy can be seen, for example, in
the paper's observation that 40% of hosts change IPs within a day:
according to the authors' metric these hosts can potentially have 100%
availability even though they go offline before changing IPs.
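
The gap between the two metrics can be sketched as follows. The online
intervals, probe schedule, and function names are hypothetical and chosen
only to illustrate the point; they are not data from the paper.

```python
def true_availability(online_intervals, experiment_time):
    """Uptime divided by experiment time."""
    uptime = sum(end - start for start, end in online_intervals)
    return uptime / experiment_time

def probed_availability(online_intervals, probe_times):
    """Fraction of probes that fall inside an online interval."""
    def is_online(t):
        return any(start <= t < end for start, end in online_intervals)
    return sum(is_online(t) for t in probe_times) / len(probe_times)

# Hypothetical host over a 100-minute experiment, probed every 20 minutes.
# It goes offline briefly between probes (e.g. while changing IPs).
online_intervals = [(0, 12), (18, 32), (38, 52), (58, 72), (78, 92)]
probe_times = [0, 20, 40, 60, 80]

absolute = true_availability(online_intervals, experiment_time=100)
measured = probed_availability(online_intervals, probe_times)
```

Here every probe is answered, so the probe-based metric reports 1.0,
while the host was actually up for 68 of the 100 minutes.
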
Received on Mon Nov 14 2005 - 10:29:16 EST