===========================================================================
CSC 363H                Lecture Summary for Week 12               Spring 2007
===========================================================================

-----------------
Self-reducibility
-----------------

Problem of deciding language A sometimes called "decision problem": given
input x, solution = yes/no answer.  But many problems are more naturally
"search problems": given input x, find solution y.

Examples:
 -  Given prop. formula F, find satisfying assignment, if one exists.
 -  Given graph G, integer k, find a clique of size k in G, if one exists.
 -  Given graph G, find a Ham. path in G, if one exists.
 -  Given set of numbers S, target t, find subset of S whose sum equals t,
    if one exists.
 -  etc.

Many languages come from natural search problems.  Clearly, efficient
solution to search problem would give efficient solution to corresponding
decision problem.  So proof that decision problem is NP-hard implies that
search problem is "NP-hard" as well (in some generalized sense of NP-hard),
and does not have efficient solution unless P = NP.

But exactly how much more difficult are search problems?

Perhaps surprisingly, many are only polynomially more difficult than
corresponding decision problem, in the following sense: any efficient
solution to the decision problem can be used to solve the search problem
efficiently.  This is called "self-reducibility".

Example 1:  CLIQUE-SEARCH
    Input:  Undirected graph G, positive integer k.
    Output:  A clique of size k in G, if one exists; special value NIL if
    there is no such clique in G.

 -  Idea:  For each vertex in turn, remove it iff resulting graph still
    contains a k-clique.

 -  Details:  Assume we have an algorithm CL(G,k) that returns true iff G
    contains a clique of size k.  We construct an algorithm to solve
    CLIQUE-SEARCH as follows.

    CLS(G,k):
        if not CL(G,k):  return NIL  # no k-clique in G
        for each vertex v in V:
            # remove v and its incident edges
            V' = V - {v};  E' = E - { (u,v) : u in V }
            # check if there is still a k-clique
            if CL(G'=(V',E'),k):
                # v not required for k-clique, leave it out
                V = V';  E = E'
        return V

 -  Correctness:  CL(G=(V,E),k) remains true at every step, so at the end
    G still contains some k-clique C.  Any vertex v outside C was removed
    when examined:  the graph at that point still contained all of C, so
    CL was true without v.  Hence the final V is exactly C, and the value
    returned is a k-clique of G.

 -  Runtime:  Each vertex of G examined once, and one call to CL for each
    one, plus linear amount of additional work (removing edges).  Total is
    O(n*t(n,m) + n*m) where t(n,m) is runtime of CL on graphs with n
    vertices and m edges; this is polytime if t(n,m) is polytime.

 -  Exercise:  What happens if G contains more than one k-clique?

 -  Another way of doing it (positive, building the clique one vertex at a
    time):

    CLS(G,k):
        if not CL(G,k):  return NIL  # no k-clique in G
        if k == 1:  return {v} for any vertex v in V
        for each vertex v in V:
            # look at the subgraph induced by the neighbours of v
            V' = N(v);  E' = { (u,w) in E : u,w in N(v) }
            # v is in a k-clique iff N(v) contains a (k-1)-clique
            if CL(G'=(V',E'),k-1):
                return {v} U CLS(G',k-1)
            # v is not part of any k-clique, remove it
            V = V - {v};  E = E - { (u,v) : u in V }
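
 -  Sketch in Python:  a minimal, hedged illustration of both CLIQUE-SEARCH
    ideas above.  Assumptions not in the notes:  edges are frozensets of two
    vertices, vertices are sortable, and has_clique is a brute-force
    (exponential) stand-in for the assumed polytime decision algorithm CL.
    The second function builds the clique iteratively rather than
    recursively and does not bother removing vertices that fail the test.

```python
from itertools import combinations


def has_clique(vertices, edges, k):
    """Brute-force stand-in for the decision oracle CL(G,k); NOT polytime."""
    return any(
        all(frozenset((u, w)) in edges for u, w in combinations(c, 2))
        for c in combinations(sorted(vertices), k)
    )


def clique_search_remove(vertices, edges, k, oracle=has_clique):
    """Vertex-removal idea: drop v whenever a k-clique survives without it."""
    vertices, edges = set(vertices), set(edges)
    if not oracle(vertices, edges, k):
        return None  # plays the role of NIL
    for v in sorted(vertices):
        rest = vertices - {v}
        rest_edges = {e for e in edges if v not in e}
        if oracle(rest, rest_edges, k):
            # v is not required for a k-clique, leave it out
            vertices, edges = rest, rest_edges
    return vertices


def clique_search_grow(vertices, edges, k, oracle=has_clique):
    """Neighbourhood idea: keep v iff N(v) contains a (k-1)-clique."""
    vertices, edges = set(vertices), set(edges)
    if not oracle(vertices, edges, k):
        return None
    clique = set()
    while k > 0:
        for v in sorted(vertices):
            nbrs = {u for u in vertices if frozenset((u, v)) in edges}
            nbr_edges = {e for e in edges if e <= nbrs}
            if oracle(nbrs, nbr_edges, k - 1):
                # v is in some k-clique; continue inside its neighbourhood
                clique.add(v)
                vertices, edges, k = nbrs, nbr_edges, k - 1
                break
    return clique


# Triangle 1-2-3 with a tail 3-4-5 attached.
V = {1, 2, 3, 4, 5}
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]}
print(clique_search_remove(V, E, 3))
print(clique_search_grow(V, E, 3))
```

    Both functions take the oracle as a parameter, so any other decision
    procedure with the same signature could be plugged in; each makes at
    most polynomially many oracle calls (at most n+1 for the first, at most
    k*n+1 for the second).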
        

General technique to prove self-reducibility:
 -  assume algorithm to solve decision problem,
 -  write algorithm to solve search problem by making calls to decision
    problem algorithm (possibly many calls on many different inputs),
 -  make sure that search problem algorithm runs in polytime if decision
    problem algorithm does -- argue at most polynomially many calls to
    subroutine are made and at most polytime spent outside those calls.

Example 2:  HAMPATH-SEARCH
    Input:  Graph G, vertices s,t.
    Output:  A Ham. path in G from s to t, if one exists (NIL otherwise).

 -  Idea 1:  For each vertex in turn, remove it iff resulting graph still
    contains a Ham. path.

    Problem:  Every vertex must be in the path anyway, and this does not
    say where to put each vertex (which edges to use to travel through this
    vertex).

 -  Idea 2:  Remove s and its edges.  Then consider each neighbour of s
    (must keep track of them separately), find one that has a Ham. path to
    t and remove it and its edges.  Repeat until t is reached.

 -  Idea 3:  For each edge in turn, remove it iff resulting graph still
    contains a Ham. path -- same as for CLIQUE above, except considering
    edges one-by-one instead of vertices.

    Ideas 2 and 3 both work, although their runtime is slightly different
    and the last one is simpler.
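
 -  Sketch of Idea 3 in Python:  a hedged illustration, not a definitive
    implementation.  Assumptions not in the notes:  G is undirected, s != t,
    edges are frozensets of two vertices, and has_hampath is a brute-force
    (exponential) stand-in for the assumed polytime decision algorithm.
    Once every removable edge is gone, the surviving edges are exactly the
    edges of one Ham. path, so the path can be read off by walking from s.

```python
from itertools import permutations


def has_hampath(vertices, edges, s, t):
    """Brute-force stand-in for the decision oracle; NOT polytime."""
    inner = [v for v in vertices if v not in (s, t)]
    for mid in permutations(inner):
        path = (s, *mid, t)
        if all(frozenset(p) in edges for p in zip(path, path[1:])):
            return True
    return False


def hampath_search(vertices, edges, s, t, oracle=has_hampath):
    """Remove each edge iff a Ham. path from s to t survives without it."""
    edges = set(edges)
    if not oracle(vertices, edges, s, t):
        return None  # plays the role of NIL
    for e in sorted(edges, key=sorted):
        if oracle(vertices, edges - {e}, s, t):
            edges.remove(e)  # e is not required, leave it out
    # The surviving edges form a single path; walk along it from s to t.
    path, prev = [s], None
    while path[-1] != t:
        cur = path[-1]
        nxt = next(w for e in edges if cur in e
                   for w in e if w != cur and w != prev)
        prev = cur
        path.append(nxt)
    return path


V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (1, 3), (2, 4)]}
print(hampath_search(V, E, 1, 4))
```

    The loop makes one oracle call per edge, so m+1 calls in total, plus
    polynomial work outside the calls.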

Example 3:  VERTEX-COVER-SEARCH
    Input:  Graph G, integer k.
    Output:  A vertex cover of size k, if one exists (NIL otherwise).

 -  Idea 1:  Remove vertices one-by-one as long as resulting graph still
    contains a vertex cover of size k.

 -  Problem:  If G contains a VC of size k, then G-v (remove v and all
    incident edges) also contains a VC of size k, whether or not v is in
    the cover (unless k=n, trivial to solve)!

 -  Idea 2:  Check if G-v contains a VC of size (k-1) -- if v is in no VC
    of size k, then the answer should be no.

        VCS(G,k):
            if not VC(G,k):  return NIL
            C = {}  # the vertices in a vertex cover of G
            while k > 0:
                pick an unmarked vertex v in V
                if VC(G-v, k-1):  # v CAN be in the cover
                    C = C U {v};  G = G - v;  k = k - 1
                else:  mark v  # so it doesn't get picked again
            return C

    Why does this work?  If k is not the minimum VC size, the algorithm
    will pick every vertex it tries until k gets down to k' (size of a
    minimum VC of what remains of G).  After that, it picks only vertices
    that belong to some minimum VC.  In other words, the algorithm "adds
    extra vertices" to a minimum vertex cover at the start instead of at
    the end.

    There is an even simpler way to see this.  At every iteration of the
    algorithm, we know G contains a VC of size k.  If G-v contains
    a VC of size k-1, then adding v to that VC yields a VC of size k, so
    it's OK to pick v (even though it might not be necessary).  If G-v does
    not contain a VC of size k-1, then v cannot be in any VC of size k, so
    it's not OK to pick v.  Therefore, the call VC(G-v, k-1) tells us
    whether or not we *can* put v in our VC, and that is enough (we don't
    need to know whether or not we *must* put v in the VC).
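
    Sketch of VCS in Python:  a hedged illustration of the algorithm above.
    Assumptions not in the notes:  "size k" means exactly k vertices, edges
    are frozensets of two vertices, vertices are sortable, and has_vc is a
    brute-force (exponential) stand-in for the assumed polytime decision
    algorithm VC.

```python
from itertools import combinations


def has_vc(vertices, edges, k):
    """Brute-force stand-in for the decision oracle VC(G,k); NOT polytime."""
    return any(
        all(e & set(c) for e in edges)  # every edge touches the candidate
        for c in combinations(sorted(vertices), k)
    )


def vc_search(vertices, edges, k, oracle=has_vc):
    """Pick v whenever G-v has a VC of size k-1, i.e. whenever v CAN be used."""
    vertices, edges = set(vertices), set(edges)
    if not oracle(vertices, edges, k):
        return None  # plays the role of NIL
    cover, marked = set(), set()
    while k > 0:
        v = min(vertices - marked)  # pick an unmarked vertex
        rest = vertices - {v}
        rest_edges = {e for e in edges if v not in e}
        if oracle(rest, rest_edges, k - 1):
            cover.add(v)
            vertices, edges, k = rest, rest_edges, k - 1
        else:
            marked.add(v)  # so it doesn't get picked again
    return cover


# Path graph 1-2-3-4; its minimum vertex cover has size 2.
V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}
print(vc_search(V, E, 2))
print(vc_search(V, E, 3))  # k above the minimum: extra vertices come first
```

    Each vertex is tried at most once (it is either picked and removed, or
    marked), so at most n+1 oracle calls are made.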