What to do if there is no separating plane
Use a much bigger set of features.
This looks as if it would make the computation
hopelessly slow, but in the next part of the
lecture we will see how to use the “kernel”
trick to make the computation fast even with
huge numbers of features.
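
As a small illustration of why a bigger feature set can help, the sketch below uses four XOR-style points that no line in the original two-dimensional space separates. The data, the added product feature x1 * x2, the helper name `expand`, and the hand-picked weights are all assumptions chosen for illustration; they are not taken from the lecture.

```python
import numpy as np

# Four XOR-style points: no line in the (x1, x2) plane separates the two classes.
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

def expand(X):
    """Append the product x1 * x2 as a third feature (an illustrative choice)."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])

Phi = expand(X)

# Hand-picked plane in the expanded space: it looks only at the new feature.
w = np.array([0.0, 0.0, -1.0])
b = 0.0

# A point is on the correct side when y_i * (w . phi_i + b) is positive.
margins = y * (Phi @ w + b)
separable = bool(np.all(margins > 0))
print(margins, separable)
```

Here a single extra feature is enough. In general, many more features may be needed, which is where the cost concern above comes from.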
Extend the definition of maximum margin to
allow non-separating planes.
This can be done by using “slack” variables.
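
One common way to write the slack idea down is sketched below, assuming the standard soft-margin form: each point gets a slack xi_i = max(0, 1 - y_i (w . x_i + b)), and the objective adds a penalty C times the sum of the slacks to the usual margin term. The penalty weight `C`, the function names, and the toy data are assumptions for illustration, and the lecture's own definition may differ in detail.

```python
import numpy as np

def slacks(w, b, X, y):
    """Slack per point: 0 if the margin constraint holds, else the shortfall."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))

def soft_margin_objective(w, b, X, y, C=1.0):
    """0.5 * ||w||^2 plus C times the total slack (C is an assumed penalty weight)."""
    return 0.5 * float(w @ w) + C * float(slacks(w, b, X, y).sum())

# 1-D toy data that no threshold separates: the last point is labelled +1
# but sits between the two negative points.
X = np.array([[-2.0], [-1.0], [2.0], [-1.5]])
y = np.array([-1, -1, 1, 1])

w = np.array([1.0])
b = 0.0

xi = slacks(w, b, X, y)                      # only the last point needs slack
obj = soft_margin_objective(w, b, X, y, C=1.0)
print(xi, obj)
```

A larger `C` penalizes margin violations more heavily, while a smaller `C` tolerates them in exchange for a wider margin.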