Using Pairs of Data-Points to Define Splits for Decision Trees
  Geoffrey E. Hinton and Michael Revow
  Department of Computer Science
  University of Toronto 
  Conventional binary classification trees such as CART either split
  the data using axis-aligned hyperplanes or they perform a computationally expensive search
  in the continuous space of hyperplanes with unrestricted orientations. We show that the
  limitations of the former can be overcome without resorting to the latter. For every pair
  of training data-points, there is one hyperplane that is orthogonal to the line joining
  the data-points and bisects this line. Such hyperplanes are plausible candidates for
  splits. In a comparison on a suite of 12 datasets we found that this method of generating
  candidate splits outperformed the standard methods, particularly when the training sets
  were small. 
  
  Download  [Postscript] [pdf]
  Advances in Neural Information Processing Systems 8. D.S.
  Touretzky, M.C. Mozer and M.E. Hasselmo. MIT Press. 
  [home page]  [publications]