THE 1NN-1 METHOD

Prediction using one nearest neighbor of the test case

Radford Neal, 25 May 1996

The 1nn-1 method uses the simple procedure of predicting that the
target in a test case will be the same as the target in the closest
training case, with closeness measured by Euclidean distance between
the vectors of inputs.  If there is a tie among two or more training
cases for closeness to the test case, it is broken by randomly
selecting one of the tied training cases.

This method can be used for targets of any type (but currently only a
single target is allowed).  The method is probably not very good for
regression tasks, however.  It's not really very good for
classification, either, but is thought to be at least a bit more
competitive in this context.

The 1nn-1 procedure is sensitive to the way in which inputs are
encoded, but not to the way that targets are encoded.  The standard
method is to use the default DELVE encodings for the inputs.  The
target can be encoded in any way that gives a single field (eg, copy,
0-up, etc.), but the target must NOT be encoded as multiple fields.
In particular, the target cannot be encoded as 1-of-n, which is the
DELVE default for non-binary categorical targets.

The method produces only guesses, not predictive distributions.  It
therefore cannot be used with the L or Q loss functions.

The method is implemented using the 1nn-1 program, as described below:

PROGRAM TO MAKE PREDICTIONS USING THE 1-NEAREST-NEIGHBOR METHOD.

Usage: 1nn-1 instance

Reads cases from train.n, where n is the instance number given as the
argument.  This training data should be encoded numerically, except
perhaps for the target (of which there must be only one, the last item
on the line).  Also reads a file of inputs for test cases from test.n,
and produces a file cguess.n, containing the predictions for the test
cases found by the 1-nearest-neighbor method.
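The procedure described above (closest training case by Euclidean
distance, random tie-breaking, guess its target) can be sketched as
follows.  This is an illustrative Python sketch, not the actual 1nn-1
program; the function name predict_1nn and the in-memory data layout
are assumptions made for the example.

```python
import random

def predict_1nn(train_cases, test_inputs, rng=None):
    """Guess targets for test cases by the 1-nearest-neighbor rule.

    train_cases: list of (input_vector, target) pairs.
    test_inputs: list of input vectors.
    Ties in distance are broken by selecting randomly among the
    closest training cases.
    """
    if rng is None:
        rng = random.Random(0)

    def sq_dist(u, v):
        # Squared Euclidean distance; squaring preserves the ordering,
        # so the square root can be skipped.
        return sum((a - b) ** 2 for a, b in zip(u, v))

    guesses = []
    for x in test_inputs:
        best = min(sq_dist(x, u) for u, _ in train_cases)
        # Exact floating-point equality suffices for this sketch;
        # collect all training cases tied at the minimum distance.
        tied = [t for u, t in train_cases if sq_dist(x, u) == best]
        guesses.append(rng.choice(tied))
    return guesses

train = [((0.0, 0.0), "a"), ((1.0, 1.0), "b")]
print(predict_1nn(train, [(0.1, 0.0), (0.9, 1.0)]))  # → ['a', 'b']
```

Note that the guess for a test case exactly midway between two
training cases with different targets depends on the random
tie-break, which is why the guesses are not a deterministic function
of the data alone.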