What to do if there is no separating plane
Use a much bigger set of features.
This looks as if it would make the computation
hopelessly slow, but in the next lecture we will
see how to use the “kernel” trick to make the
computation fast even with huge numbers of
features.
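A minimal sketch of why a bigger feature set can help, using a hand-made XOR-style toy dataset. The data, the `expand` mapping, and the hand-picked weights are illustrative assumptions, not taken from the lecture. No plane separates the classes in the original two features, but adding the product feature x1*x2 makes them separable.

```python
from itertools import product

# Toy XOR-style data (assumed for illustration): label is the sign of x1*x2.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]


def expand(x):
    """Map (x1, x2) to the larger feature vector (x1, x2, x1*x2)."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def separates(w, b, xs, ys):
    """True if y * (w.x + b) > 0 for every example, i.e. the plane separates them."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0
        for x, y in zip(xs, ys)
    )


# Search a small grid of planes in the original 2-D space: none works.
grid = range(-2, 3)
any_plane_2d = any(
    separates((w1, w2), b, points, labels)
    for w1, w2, b in product(grid, grid, grid)
)

# In the expanded space a plane chosen by hand separates the data.
expanded = [expand(p) for p in points]
is_separable_expanded = separates((0, 0, 1), 0, expanded, labels)
```

The grid search only samples a few planes, but the conclusion holds in general: the two positive examples force b > 0 while the two negative examples force b < 0, so no plane in the original space can work.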
Extend the definition of maximum margin to
allow non-separating planes.
This can be done by using “slack” variables.
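One common way to write the slack-variable idea is the soft-margin objective below. The notation is an assumption rather than the lecture's own: w and b are the weights and bias of the plane, y_i is the label (+1 or -1) of example x_i, xi_i is the slack for example i, and C is a constant that trades margin width against total slack.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + C \sum_{i} \xi_i
\qquad \text{subject to} \qquad
  y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1 - \xi_i,
  \quad \xi_i \ge 0 \quad \text{for all } i .
```

An example with slack 0 lies on or outside the margin, one with slack between 0 and 1 lies inside the margin but on the correct side, and one with slack greater than 1 is misclassified.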