Some good and bad properties of histograms as density estimators
There is no need to fit a model to the data.
We just compute some very simple statistics (the
number of datapoints in each bin) and store them.
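
As a minimal sketch of what "compute and store the counts" might look like in one dimension, the snippet below uses NumPy. The function name `histogram_density`, the range limits, and the synthetic Gaussian sample are illustrative assumptions, not part of the original material.

```python
import numpy as np

def histogram_density(data, num_bins, low, high):
    """Return (density, edges) for a 1-D histogram density estimate.

    Hypothetical helper: the only "statistics" stored are the bin counts.
    """
    counts, edges = np.histogram(data, bins=num_bins, range=(low, high))
    widths = np.diff(edges)
    # Divide by (number of in-range points * bin width) so the
    # piecewise-constant density integrates to 1 over [low, high].
    density = counts / (counts.sum() * widths)
    return density, edges

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)          # assumed toy data
density, edges = histogram_density(samples, num_bins=20, low=-4.0, high=4.0)
total_mass = float((density * np.diff(edges)).sum())
print(total_mass)
```

No parameters are fitted here: the estimate is fully determined by the bin edges and the counts.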
The number of bins is exponential in the dimensionality
of the data space (k bins per dimension gives k^d bins in
d dimensions). So high-dimensional data is tricky:
We must either use big bins or get lots of zero counts
(or adapt the local bin width to the density).
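
The exponential growth can be illustrated with a small experiment. This is a sketch under assumed settings (1000 uniform points, 10 bins per axis); the helper `empty_bin_fraction` is hypothetical and exists only for this illustration.

```python
import numpy as np

def empty_bin_fraction(num_points, dim, bins_per_axis, rng):
    """Return (total_bins, fraction_of_empty_bins) for uniform data in [0, 1)^dim."""
    data = rng.uniform(size=(num_points, dim))
    # Map each coordinate to an integer bin index along its axis.
    idx = np.minimum((data * bins_per_axis).astype(int), bins_per_axis - 1)
    # Each distinct row of indices is one occupied bin.
    occupied = len(np.unique(idx, axis=0))
    total = bins_per_axis ** dim
    return total, 1.0 - occupied / total

rng = np.random.default_rng(0)
for dim in (1, 2, 3, 6):
    total, empty = empty_bin_fraction(1000, dim, 10, rng)
    print(f"dim={dim}: {total} bins, fraction empty = {empty:.3f}")
```

With a fixed number of datapoints, at most that many bins can be non-empty, so once k^d far exceeds the sample size almost every count is zero.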
The density has silly discontinuities at the bin
boundaries.
We must be able to do better by some kind of
smoothing.
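
One common form of smoothing is to replace the hard bin membership with a smooth kernel centred on each datapoint. The sketch below uses a Gaussian kernel as an assumed example; the source does not say which smoothing method it has in mind, and the function name `gaussian_kde_1d` and the bandwidth value are illustrative choices.

```python
import numpy as np

def gaussian_kde_1d(x, data, bandwidth):
    """Evaluate a Gaussian kernel density estimate at the points x.

    Hypothetical helper: averages one Gaussian bump per datapoint,
    so the resulting density has no jumps.
    """
    z = (x[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
data = rng.normal(size=200)              # assumed toy data
grid = np.linspace(-8.0, 8.0, 2001)
smooth_density = gaussian_kde_1d(grid, data, bandwidth=0.3)
dx = grid[1] - grid[0]
approx_mass = float(smooth_density.sum() * dx)   # Riemann-sum check
print(approx_mass)
```

The bandwidth plays the role that bin width plays for a histogram: larger values smooth more, smaller values follow the data more closely.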