A special case of softmax for two classes
So the logistic is just the two-class special case of the
softmax, written so that it avoids redundant parameters:
Adding the same constant to both z1 and z0
has no effect on the output probabilities.
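
A short derivation may make both points concrete. It assumes the usual two-class softmax over the logits z1 and z0; the constant c and the symbol \sigma for the logistic function are notation introduced here for illustration.

```latex
\begin{align*}
y_1 &= \frac{e^{z_1}}{e^{z_1} + e^{z_0}}
     = \frac{1}{1 + e^{-(z_1 - z_0)}}
     = \sigma(z_1 - z_0), \\[4pt]
\frac{e^{z_1 + c}}{e^{z_1 + c} + e^{z_0 + c}}
    &= \frac{e^{c}\, e^{z_1}}{e^{c}\,\bigl(e^{z_1} + e^{z_0}\bigr)}
     = y_1 .
\end{align*}
```

Only the difference z1 - z0 matters, so a single input to the logistic carries the same information as the two softmax logits.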
The softmax is over-parameterized
because the probabilities must add to 1.
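
The claims above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `softmax` and `logistic` and the example logit values are illustrative choices, not part of the original material.

```python
import numpy as np

def softmax(z):
    """Softmax over a 1-D array of logits."""
    z = np.asarray(z, dtype=float)
    # Subtracting the max is safe precisely because adding a constant
    # to every logit leaves the softmax output unchanged.
    e = np.exp(z - z.max())
    return e / e.sum()

def logistic(x):
    """Logistic (sigmoid) function of a single input."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative logits for class 0 and class 1 (arbitrary values).
z0, z1 = 0.3, 1.7

y = softmax([z0, z1])
y_shifted = softmax([z0 + 5.0, z1 + 5.0])  # same constant added to both

# Two-class softmax equals the logistic of the logit difference.
print(np.isclose(y[1], logistic(z1 - z0)))  # True
# Shifting both logits by the same constant changes nothing.
print(np.allclose(y, y_shifted))            # True
# The probabilities add to 1, so one of them is redundant.
print(np.isclose(y.sum(), 1.0))             # True
```

Because y[0] is always 1 - y[1], the two-logit softmax has one more degree of freedom than it needs; the logistic removes it by working directly with z1 - z0.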