Three approaches to classification
Use discriminant functions directly without probabilities:
Convert the input vector into one or more real values
so that a simple operation (like thresholding) can be
applied to get the class.
The real values should be chosen to maximize the usable
information they carry about the class label.
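
As a minimal sketch of the discriminant-function approach, the code below maps a 2-D input to one real value with a linear function and thresholds it at zero. The weights `w`, bias `b`, and the names `discriminant` and `classify` are illustrative assumptions, not values or names from the lecture; in practice the weights would be learned from data.

```python
def discriminant(x, w, b):
    """Convert the input vector into one real value: y = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x, w, b, threshold=0.0):
    """Apply a simple operation (thresholding) to the real value to get the class."""
    return 1 if discriminant(x, w, b) >= threshold else 0


# Hypothetical hand-picked weights: class 1 when the first feature exceeds the second.
w = [1.0, -1.0]
b = 0.0

label_a = classify([2.0, 1.0], w, b)  # discriminant value is 1.0
label_b = classify([1.0, 2.0], w, b)  # discriminant value is -1.0
```

No probabilities appear anywhere: the class comes straight from the sign of the real value.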
Infer conditional class probabilities:
Compute the conditional probability of each class.
Then make a decision that minimizes some loss function.
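
The decision step can be sketched as follows, assuming the conditional class probabilities have already been computed by some model. The function name `min_expected_loss_decision` and the loss matrices are hypothetical, chosen only to show how the same probabilities can lead to different decisions under different losses.

```python
def min_expected_loss_decision(posteriors, loss):
    """Pick the decision with the smallest expected loss.

    posteriors[k] is p(class k | x).
    loss[k][j] is the cost of deciding j when the true class is k.
    """
    n_classes = len(posteriors)
    n_decisions = len(loss[0])
    expected = [
        sum(posteriors[k] * loss[k][j] for k in range(n_classes))
        for j in range(n_decisions)
    ]
    best = min(range(n_decisions), key=lambda j: expected[j])
    return best, expected


posteriors = [0.7, 0.3]  # assumed output of some probabilistic model

# Zero-one loss: every mistake costs the same, so the most probable class wins.
zero_one = [[0, 1],
            [1, 0]]
decision_symmetric, _ = min_expected_loss_decision(posteriors, zero_one)

# Asymmetric loss: missing class 1 is ten times worse than a false alarm.
asymmetric = [[0, 1],
              [10, 0]]
decision_asymmetric, _ = min_expected_loss_decision(posteriors, asymmetric)
```

With the zero-one loss the decision is class 0, while the asymmetric loss flips it to class 1 even though class 1 is less probable.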
Compare the probability of the input under separate,
class-specific, generative models.
E.g. fit a multivariate Gaussian to the input vectors of
each class and see which Gaussian makes a test
data vector most probable. (Is this the best bet?)
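
The Gaussian example can be sketched in a few lines, restricted to 2-D inputs so that the covariance inverse has a closed form. The toy data, the function names, and the use of maximum-likelihood estimates are assumptions made for illustration.

```python
import math


def fit_gaussian_2d(points):
    """Maximum-likelihood mean and covariance (sxx, sxy, syy) for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def log_density_2d(x, mean, cov):
    """Log of the 2-D Gaussian density at x (assumes a non-singular covariance)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form d^T Sigma^{-1} d using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * quad - 0.5 * math.log(det) - math.log(2 * math.pi)


def classify(x, models):
    """Return the class whose Gaussian makes x most probable."""
    return max(models, key=lambda label: log_density_2d(x, *models[label]))


# Toy training data (made up): one Gaussian is fitted per class.
train = {
    0: [(0, 0), (1, 0), (0, 1), (1, 1)],
    1: [(4, 4), (5, 4), (4, 5), (5, 5)],
}
models = {label: fit_gaussian_2d(pts) for label, pts in train.items()}

pred_near_origin = classify((0.8, 0.2), models)
pred_far = classify((4.2, 4.9), models)
```

This compares likelihoods only. Weighting each likelihood by a class prior would turn the comparison into one between posterior probabilities, which may be what the closing question is pointing at.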