Nearest neighbor methods for density estimation
$$p(x) \approx \frac{K}{N V}$$
where K is the number of points in the region, V is the volume of the region, and N is the total number of points.
Vary the size of a hyper-sphere around each test point so that exactly K training datapoints fall inside the hyper-sphere.
Does this give a fair estimate of the density?
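The hyper-sphere procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_density`, the use of NumPy, and the Euclidean distance are all assumptions made for the example. It grows the sphere until it reaches the K-th nearest training point, then divides K by the total number of points times the sphere's volume.

```python
import math
import numpy as np

def knn_density(x, train, k):
    """Estimate the density at x as k / (n * volume).

    `volume` is that of the smallest hyper-sphere centred on x
    that contains k training points (Euclidean distance assumed).
    """
    train = np.asarray(train, dtype=float)   # shape (n, d)
    x = np.asarray(x, dtype=float)           # shape (d,)
    n, d = train.shape

    # Radius = distance from x to its k-th nearest training point.
    dists = np.linalg.norm(train - x, axis=1)
    radius = np.sort(dists)[k - 1]

    # Volume of a d-dimensional ball of that radius.
    volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d
    return k / (n * volume)

# Four 1-D training points; the 2 nearest to 0.5 lie within radius 0.5,
# so the "sphere" is an interval of length 1.
print(knn_density([0.5], [[0.0], [1.0], [2.0], [3.0]], k=2))
```

Note that the radius, and hence the estimate, changes abruptly as the test point moves, and the estimate is undefined when the K-th neighbor sits exactly on the test point (radius zero). Evaluating the function on a grid is one way to explore the question above.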
Nearest neighbors is usually used for classification or
regression:
For regression, average the target values of the K nearest neighbors.
For classification, pick the class with the most votes.
How should we break ties?
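Both uses can be sketched as follows. The helper names (`k_nearest`, `knn_regress`, `knn_classify`) and the Euclidean distance are assumptions for illustration. The tie-breaking rule shown here, returning the tied class that owns the single closest neighbor, is only one possible choice; others include picking at random among the tied classes or using a different K.

```python
from collections import Counter
import numpy as np

def k_nearest(x, train_x, k):
    """Indices of the k training points closest to x, nearest first."""
    diffs = np.asarray(train_x, dtype=float) - np.asarray(x, dtype=float)
    dists = np.linalg.norm(diffs, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def knn_regress(x, train_x, train_y, k):
    """Average the target values of the k nearest neighbors."""
    idx = k_nearest(x, train_x, k)
    return float(np.mean(np.asarray(train_y, dtype=float)[idx]))

def knn_classify(x, train_x, train_y, k):
    """Majority vote among the k nearest neighbors.

    Assumed tie-break: among classes with the most votes,
    return the one whose member is closest to x.
    """
    idx = k_nearest(x, train_x, k)
    labels = [train_y[i] for i in idx]        # ordered nearest first
    counts = Counter(labels)
    top = max(counts.values())
    tied = {label for label, votes in counts.items() if votes == top}
    for label in labels:                      # first hit is the closest
        if label in tied:
            return label

xs = [[0.0], [1.0], [2.0], [3.0]]
print(knn_regress([0.9], xs, [0.0, 1.0, 2.0, 3.0], k=2))
print(knn_classify([0.1], xs, ["a", "b", "b", "a"], k=2))  # one vote each: a tie
print(knn_classify([0.1], xs, ["a", "b", "b", "a"], k=3))
```

With two classes, choosing an odd K avoids vote ties altogether, but with three or more classes a tie-breaking rule is still needed.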