• If there are only two classes, we typically use a single real-valued output that has target values of 1 for the “positive” class and 0 (or sometimes -1) for the other class.
    – For probabilistic class labels, the target value can then be the probability of the positive class, and the output of the model can also represent the probability the model gives to the positive class.
• If there are N classes, we often use a vector of N target values containing a single 1 for the correct class and zeros elsewhere.
    – For probabilistic labels, we can then use a vector of class probabilities as the target vector.
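
The target encodings described above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`binary_target`, `one_hot`, `soft_target`) and the tolerance `tol` are hypothetical choices made for this example and do not come from the original material.

```python
def binary_target(is_positive, negative_value=0.0):
    """Target for a single real-valued output in the two-class case.

    Returns 1 for the "positive" class and negative_value
    (0, or sometimes -1) for the other class.
    """
    return 1.0 if is_positive else negative_value


def one_hot(class_index, num_classes):
    """Vector of N target values: a single 1 for the correct class, zeros elsewhere."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index must be in [0, num_classes)")
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def soft_target(class_probs, tol=1e-9):
    """Probabilistic label: a vector of class probabilities used as the target.

    Assumption for this sketch: the probabilities must be non-negative
    and sum to 1 within the tolerance tol.
    """
    if any(p < 0 for p in class_probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(class_probs) - 1.0) > tol:
        raise ValueError("probabilities must sum to 1")
    return list(class_probs)


# Two classes: hard labels, then a probabilistic label for the positive class.
hard_positive = binary_target(True)           # 1.0
hard_negative = binary_target(False, -1.0)    # -1.0 when using the -1 convention
prob_positive = 0.8                           # target is P(positive class)

# N classes: hard label as a one-hot vector, soft label as class probabilities.
hard_vector = one_hot(2, 4)
soft_vector = soft_target([0.7, 0.2, 0.1])
```

Note that the one-hot vector is just the special case of a class-probability vector in which all of the probability mass sits on the correct class.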